-
Notifications
You must be signed in to change notification settings - Fork 353
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ImageFolderDataset
#1232
Add ImageFolderDataset
#1232
Conversation
Note that I added an We could remove the options until they are implemented. Let me know your thoughts :) /edit: welp, looks like I have some tests to fix tomorrow. |
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #1232 +/- ##
==========================================
- Coverage 84.41% 84.40% -0.02%
==========================================
Files 549 550 +1
Lines 61952 62174 +222
==========================================
+ Hits 52295 52475 +180
- Misses 9657 9699 +42 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very cool and complete. I have a few comments to access before merging, but great job! 👏
/// | ||
/// Taken from [burn-dataset](https://github.com/tracel-ai/burn/blob/main/burn-dataset/src/vision/downloader.rs). | ||
#[tokio::main(flavor = "current_thread")] | ||
pub async fn download_file_as_bytes(url: &str, message: &str) -> Vec<u8> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to be duplicated from the mnist
dataset implementation right? Maybe we can actually add cifar to burn-dataset
and reuse the setup from mnist?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's why I linked to it. That download function can be used for any other future dataset we might want to add, I just wasn't sure if we wanted to add CIFAR-10 explicitly so I put it as an example. And the downloader module is private so I copied the code over with a reference.
Btw the original CIFAR-10 source is in bytes form (I used a mirror with the folder structure to illustrate the ImageFolderDataset
which is more common), closer to MNIST.
TrainingConfig::new(SgdConfig::new().with_momentum(Some(MomentumConfig { | ||
momentum: 0.9, | ||
dampening: 0., | ||
nesterov: false, | ||
}))), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The question is: is Adam better in this case 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually haven't fiddled with the training recipe too much for this example 😄 a bunch of things could probably be changed to further optimize the results, I just wanted to keep it simple.
Lmk if you want any changes to be applied to the training example.
Thanks for the feedback! Wasn't clear if you wanted to add CIFAR-10 to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, we could add CIFAR to the burn-dataset crates, but since you mentioned you used the folder version as an example, maybe it's best to just keep it in the example dir. Maybe we can add another version more similar to MNIST in the crate.
Checklist
run-checks all
script has been executed.Related Issues/PRs
Closes #1132
Changes
ImageFolderDataset
to load images from disk (following a folder structure)custom-image-dataset
example that downloads CIFAR-10 and trains a simple CNN model using theImageFolderdataset
Testing
Added a couple of unit tests and an example.