Example refactor #28
fl4health/utils/load_data.py
Outdated
    return train_loader, validation_loader, num_examples


def load_cifar10_data(data_dir: Path, batch_size: int) -> Tuple[DataLoader, DataLoader, Dict[str, int]]:
Suggestion: we could accept a sampler object here and subsample CIFAR-10 as well.
I like Fatemeh's suggestion here. You could add an Optional[LabelBasedSampler] argument that defaults to None to both the MNIST and CIFAR loading functions.
I have updated the function to include a sampler argument.
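For readers following the thread, here is a minimal sketch of the pattern being discussed: a loading function that takes an optional sampler and skips subsampling when it is None. Everything in it is an assumption made for illustration, not the code from this PR: the LabelBasedSampler interface (a single subsample method), the toy KeepLabelsSampler, the load_examples function, the plain-list stand-in for DataLoader, the "train_set"/"validation_set" keys, and the choice to subsample both splits.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Protocol, Sequence, Tuple

# Stand-in for one image and its class label: (features, label).
Example = Tuple[List[float], int]
Batches = List[List[Example]]


class LabelBasedSampler(Protocol):
    """Hypothetical interface; the real class in fl4health may differ."""

    def subsample(self, data: Sequence[Example]) -> List[Example]: ...


@dataclass
class KeepLabelsSampler:
    """Toy sampler (illustrative only) that keeps examples with allowed labels."""

    labels: FrozenSet[int]

    def subsample(self, data: Sequence[Example]) -> List[Example]:
        return [example for example in data if example[1] in self.labels]


def make_batches(data: Sequence[Example], batch_size: int) -> Batches:
    # Plain-list substitute for a DataLoader, so the sketch needs no torch.
    return [list(data[i : i + batch_size]) for i in range(0, len(data), batch_size)]


def load_examples(
    train: Sequence[Example],
    validation: Sequence[Example],
    batch_size: int,
    sampler: Optional[LabelBasedSampler] = None,
) -> Tuple[Batches, Batches, Dict[str, int]]:
    # With the default sampler=None, the data passes through unchanged,
    # so existing callers keep their current behaviour.
    if sampler is not None:
        train = sampler.subsample(train)
        validation = sampler.subsample(validation)
    num_examples = {"train_set": len(train), "validation_set": len(validation)}
    return make_batches(train, batch_size), make_batches(validation, batch_size), num_examples
```

Because the new argument is last and defaults to None, call sites that pass only the data and batch size would not need to change; a caller that wants a label-restricted subset passes a sampler explicitly, for example `load_examples(train, validation, 32, KeepLabelsSampler(frozenset({0, 1})))`.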
emersodb
left a comment
Looks good to go from my perspective. I'd double check with Fatemeh that this is what she had in mind.
fatemetkl
left a comment
Looks good!
PR Type
Refactoring
Short Description
For the examples that use CIFAR-10, I moved the data-loading code into utils/load_data.py, which already holds the code for loading MNIST, to reduce code duplication.
I considered writing a single function for loading datasets in general, but that would require the client code to supply dataset-specific information, which I thought might not be desirable.
Tests Added
...