Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added option to optionally compute channel statistics #66

Closed
wants to merge 1 commit into from

Conversation

lossyrob
Copy link
Contributor

@lossyrob lossyrob commented Jul 7, 2017

If you want to use the generators for tasks that do not need to normalize or un-normalize the images, the requirement to compute channel statistics is burdensome (having to download the correct precomputed channel statistics or the imagery). This PR gives an option that allows for the channel statics to not be computed if the tasks do not need them.

@lossyrob lossyrob requested a review from lewfish July 7, 2017 17:09
@lewfish
Copy link
Contributor

lewfish commented Jul 7, 2017

I can see that there are times when we don't actually need the images to be normalized, or to even iterate over images, but why is it a burden to download the channel_stats file if that happens automatically anyway and is a tiny file?

@lossyrob
Copy link
Contributor Author

lossyrob commented Jul 7, 2017

It's unclear what the codepath is, but when I don't set this option, my current notebook tries to calculate from the images.

The idea of the burden of downloading the file was before I realized the data generator did that automatically. If the DL happened in a separate step, then it wouldn't have to download the image tiffs either. I think this speaks a bit to that the data generator holds a lot of responsibility, and if it were split out into a file index generator and something that actually loaded and normalized the images, I could use the former class only and avoid any unnecessary downloads.

@lewfish
Copy link
Contributor

lewfish commented Jul 7, 2017

I think this speaks a bit to that the data generator holds a lot of responsibility, and if it were split out into a file index generator and something that actually loaded and normalized the images, I could use the former class only and avoid any unnecessary downloads.

True, it could use refactoring. So, since it happens automatically do you still need the functionality in this PR?

@lossyrob
Copy link
Contributor Author

lossyrob commented Jul 7, 2017

The problem was with setting the proper environment variables so that the Jupyter notebook can use the right S3 path; also when downloads fail it continues as if it had succeeded.

These parts of the code need general refactoring, like you mentioned; this would be a band-aide where more work is needed, so I'll close this out.

@lossyrob lossyrob closed this Jul 7, 2017
@lewfish lewfish deleted the re/avoid-unneeded-channel-stats branch July 25, 2017 14:18
@lewfish lewfish restored the re/avoid-unneeded-channel-stats branch July 25, 2017 14:19
@lewfish lewfish deleted the re/avoid-unneeded-channel-stats branch August 30, 2017 17:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants