Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-absolute image path support #1224

Merged
merged 2 commits into from
Jul 12, 2021
Merged

Non-absolute image path support #1224

merged 2 commits into from
Jul 12, 2021

Conversation

hungcs
Copy link
Contributor

@hungcs hungcs commented Jul 10, 2021

Code Pull Requests

Please provide the following:

  • Provides support for non-absolute image paths and adds naive image inference (if specified via preprocessing arguments).
  • Images are inferred using the formula min(avg([width | height]), 256) sampling the first 100 images. Checking the sizes of all images seems rather complicated and involves format-specific logic that may break if header data is malformed.
  • is_http check passes the url directly to imread because fsspec can error trying to retrieve image info, which may be related to the jpeg/header issue above.
  • Sample dataset with http image paths: insuranceLite.csv

Copy link
Collaborator

@tgaddair tgaddair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@tgaddair tgaddair merged commit 60b12d4 into master Jul 12, 2021
@tgaddair tgaddair deleted the fsspec-images branch July 12, 2021 19:00
sample_images = [ImageFeatureMixin.read_image(
ImageFeatureMixin.get_image_from_path(src_path, img)) for img in input_feature_col[:sample_size]]

height_avg = min(
Copy link
Collaborator

@w4nderlust w4nderlust Jul 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ImageFeatureMixin.MIN_IMAGE_INFER_DIM should be MAX not MIN, as we want to make sure the inferred image size is not above a certain amount.

Also, having a default ImageFeatureMixin.MAX_IMAGE_INFER_DIM = 256 is ok, but maybe that could be a preprocessing parameter too.

@w4nderlust
Copy link
Collaborator

On more thing that would be great is to have a test for remote images (you could use the images in the Ludwig readme for this purpose)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants