This repository has been archived by the owner on May 7, 2020. It is now read-only.

Add "preload" option to Polyps912 dataset #3

Merged: 4 commits into fvisin:master on Jul 4, 2017

Conversation

@lamblin (Contributor) commented May 26, 2017

The total amount of memory is < 1 GB, I think, so preloading should be reasonable. It is still off by default.
I'm not sure what to add as a test.
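
For readers unfamiliar with the feature, the general idea behind such a flag is sketched below. This is illustrative only: the class, attribute, and method names are invented for the example and are not the PR's actual implementation.

# Standalone sketch of the idea behind a `preload` flag
# (names invented for illustration; not the PR's code).
import os

from skimage import io


class PreloadableDataset(object):
    def __init__(self, image_path, filenames, preload=False):
        self.image_path = image_path
        self.filenames = filenames
        self.preload = preload
        if self.preload:
            # Read every image once, up front, and keep it in memory.
            self._cache = {name: self._read(name) for name in self.filenames}

    def _read(self, img_name):
        # Same kind of call as in the PR: read one .bmp image from disk.
        return io.imread(os.path.join(self.image_path, img_name + '.bmp'))

    def load_sequence(self, sequence):
        # Serve images from memory when preloaded, otherwise hit the disk.
        if self.preload:
            return [self._cache[name] for name in sequence]
        return [self._read(name) for name in sequence]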

@fvisin (Owner) left a comment

Thank you for the PR, I like the idea of having a preload option to keep the dataset in memory.

# Build a filename -> index map and read each image from disk once.
image_name_to_idx = {}
for idx, img_name in enumerate(self.filenames):
    image_name_to_idx[img_name] = idx
    img = io.imread(os.path.join(self.image_path, img_name + ".bmp"))
@fvisin (Owner) commented on this snippet:

Can you move the code that loads images and masks (L119-L127) into a separate function _load_image(image_batch, mask_batch, filename_batch, img_name, prefix=None) and call that function from here and from load_sequence? This way the common code doesn't get duplicated.

As a more general step, I think it might be worthwhile to do the same for all the datasets and have a preload flag in parallel_loader.py, but I don't know if you feel like going that far with this PR or not.
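
For illustration, the suggested helper could take roughly the following shape. It is a sketch only: the io.imread call, the ".bmp" extension, and the argument names come from the snippets in this PR, while the mask handling and the use of prefix as a directory are assumptions.

# Sketch of the suggested shared helper (mask path and `prefix` semantics
# are assumptions, not the PR's actual code).
import os

from skimage import io


def _load_image(image_batch, mask_batch, filename_batch, img_name, prefix=None):
    prefix = prefix if prefix is not None else '.'
    # Load one image and its segmentation mask from disk.
    img = io.imread(os.path.join(prefix, 'images', img_name + '.bmp'))
    mask = io.imread(os.path.join(prefix, 'masks', img_name + '.bmp'))
    # Add the pair (and its filename) to the running minibatch lists.
    image_batch.append(img)
    mask_batch.append(mask)
    filename_batch.append(img_name)
    return image_batch, mask_batch, filename_batch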

@lamblin (Contributor, Author) replied:

Sure, I'll refactor the loading code for that dataset.
I'm not familiar enough with the rest of the code to have a good sense of what can be refactored in parallel_loader.py, and what needs to be done in each dataset, and unfortunately I will not have time to dive into that soon.

@lamblin (Contributor, Author) commented May 31, 2017

Updated.

@fvisin (Owner) left a comment

Apart from a small thing, LGTM. Once you are done, can you please run the test at the end of the file (I know, unfortunately it's not a very formal test) just to double-check that it still works? Once that's out of the way we can merge!

Thank you again for the PR!

# Add to minibatch
image_batch.append(img)
mask_batch.append(mask)
filename_batch.append(img_name)
@fvisin (Owner) commented on this snippet:

You will still need to append img_name to filename_batch.

@lamblin (Contributor, Author) commented Jun 20, 2017

Sorry about the delay, I'll update.

@lamblin (Contributor, Author) commented Jun 20, 2017

After the last fixes, running the test:

$ python dataset_loaders/images/polyps912.py
N classes: 3
Void label: [2]
Train n_images: 547, batch_size: 10, n_batches: 55
Validation n_images: 183, batch_size: 1, n_batches: 183
Test n_images: 182, batch_size: 1, n_batches: 182
Minibatch 0 time: 4.83353710175 (4.83353710175)
Minibatch 1 time: 0.475756883621 (5.30929398537)
Minibatch 2 time: 0.469623088837 (5.7789170742)
[...]
Minibatch 54 time: 0.298352003098 (29.8576231003)

If I change the test to use preload=True:

Minibatch 0 time: 0.1208589077 (0.1208589077)
Minibatch 1 time: 0.115332841873 (0.244853019714)
Minibatch 2 time: 0.114579200745 (0.359432220459)
[...]
Minibatch 54 time: 0.0719630718231 (5.77815103531)

Should I also commit that switch to test with preload=True? Or even test both in the same function?
I guess I could time preloading as well.

@fvisin (Owner) commented Jun 22, 2017

Thanks for updating the PR!

> Should I also commit that switch to test with preload=True?

Sure, why not? Thanks.

> I guess I could time preloading as well.

The best test here would be to wrap the _load_image function in the test with some code to make sure it's called only len(self.filenames) times. If you have some time to give it a try it would be great, but if not just commit the switch and we can merge.

Thanks for the PR! :)
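
A call-counting wrapper along those lines might look roughly like the sketch below. Everything here is an assumption made for illustration: the Polyps912 class name and constructor arguments, the nbatches/next() iteration API, and the placement of _load_image as a module-level function.

# Sketch of a call-counting check for _load_image (all names assumed).
import functools

import dataset_loaders.images.polyps912 as polyps912


def count_calls(fn):
    # Wrap fn so that every call increments wrapper.n_calls.
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        wrapper.n_calls += 1
        return fn(*args, **kwargs)
    wrapper.n_calls = 0
    return wrapper


original = polyps912._load_image
polyps912._load_image = count_calls(original)
try:
    train = polyps912.Polyps912(which_set='train', preload=True)
    for _ in range(train.nbatches):
        train.next()
    # With preload=True, each file should be read exactly once.
    assert polyps912._load_image.n_calls == len(train.filenames)
finally:
    polyps912._load_image = original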

@fvisin force-pushed the master branch 2 times, most recently from c6a8d70 to ff0bbfe on June 22, 2017 18:30
@lamblin (Contributor, Author) commented Jun 22, 2017

I added call counts for _load_image, and they seem to be correct.
I tried locally with max_epochs = 2 as well.

@fvisin merged commit 29451cb into fvisin:master on Jul 4, 2017
@fvisin (Owner) commented Jul 4, 2017

Thank you for the test, everything LGTM: merged!
