3 double drop #12
I think it'd be good to whiteboard around design here so I'll finish up my review for now and we can discuss in person!
I've overhauled the code in line with our discussions. What isn't handled in the current code is:
Nice, I've added a few comments (some of which may just be me misunderstanding what's going on).
Structure looks good.
A couple of minor comments.
Further, could you check whether you can iterate through the resulting dataloaders and whether they behave as expected? I get an error trying to loop through `dmpair.A.train_dataloader()`. It would be good to test looping through and counting the number of items. This should probably be a separate test, as otherwise we're just testing the inds, which are separate.
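A minimal sketch of the kind of counting test suggested here. `count_items` is a hypothetical helper, not something in this PR, and it assumes each batch is an `(inputs, labels)` pair, which is what a dataloader over an image classification dataset typically yields. A plain list of batches stands in for the real dataloader so the sketch runs without any dependencies.

```python
def count_items(loader):
    """Iterate a loader to exhaustion and return the number of samples seen.

    Assumes every batch is an (inputs, labels) pair of equal length.
    """
    total = 0
    for inputs, labels in loader:
        # A mismatch here would point at a broken collate or sampler.
        assert len(inputs) == len(labels)
        total += len(inputs)
    return total


# Stand-in for a real dataloader: 10 samples in batches of 4, 4 and 2.
fake_loader = [
    ([0] * 4, [1] * 4),
    ([0] * 4, [1] * 4),
    ([0] * 2, [1] * 2),
]

n_items = count_items(fake_loader)
```

In a real test the stand-in would be replaced by the dataloader under test, and the count compared against the number of indices kept after dropping.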
I've added several updates to the PR:
I haven't addressed the |
Not gone through everything in full but LGTM - I just have some minor comments on the formatting/structure of the new splitting function.
As discussed on Slack, let's add the `setup` override even if it's just a copy+paste of the transform code in the parent. In addition, could you add some comments and logging to clarify that we will not be doing late loading and that the data setup is happening in the initialisation of the class?
I suggest just using `logging` with a logger set up at module level, `logger = logging.getLogger(__name__)` (as per https://docs.python.org/3/howto/logging.html#advanced-logging-tutorial), and a warning level message in `__init__` like `logger.warning('sensible message here')`.
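A rough sketch of that logging suggestion, under stated assumptions: `EagerDataModule` is a hypothetical stand-in for the data module in this PR (it does not subclass the real parent), and the warning text is only a placeholder. Only the module-level logger and the `logger.warning` call come from the comment above.

```python
import logging

# Module-level logger, as recommended in the Python logging HOWTO.
logger = logging.getLogger(__name__)


class EagerDataModule:
    """Hypothetical data module that prepares its data in __init__."""

    def __init__(self, data):
        # Warn up front that there is no late loading.
        logger.warning(
            "Data setup happens in __init__; setup() does not do late loading."
        )
        self.data = list(data)

    def setup(self, stage=None):
        # Kept as an explicit override so the behaviour is visible here
        # rather than inherited; the data was already prepared in __init__.
        logger.debug("setup(stage=%s) called; data already prepared.", stage)


dm = EagerDataModule(range(5))
dm.setup()
```

With the default logging configuration the warning is printed to stderr when the object is constructed, while the debug message stays silent unless the level is lowered.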
Looks good, nice work. Very minor comment re. moving the logging statement. Happy for it to be merged after.
🚀
These commits do 2 broad things to contribute to #3:
`CIFAR10DataModuleDrop` class's dataloader into one that can manage dropping the same % from both A and B, while still being useful for the case where we only want to drop from one but not the other
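To illustrate the idea of dropping a percentage from A and B independently, here is a small sketch. It is not the code in these commits: `drop_fraction` and `split_pair` are hypothetical names, and seeded random index dropping is an assumption about how the drop could be done.

```python
import random


def drop_fraction(indices, drop, seed=0):
    """Return a sorted subset of indices with roughly `drop` of them removed."""
    if not 0.0 <= drop <= 1.0:
        raise ValueError("drop must be between 0 and 1")
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    n_keep = len(shuffled) - round(len(shuffled) * drop)
    return sorted(shuffled[:n_keep])


def split_pair(indices, drop_a=0.0, drop_b=0.0, seed=0):
    """Build index sets for A and B, each with its own drop fraction.

    Different seeds are used so A and B do not drop the same items.
    """
    inds_a = drop_fraction(indices, drop_a, seed=seed)
    inds_b = drop_fraction(indices, drop_b, seed=seed + 1)
    return inds_a, inds_b


all_inds = range(10)
# Same % dropped from both A and B.
both_a, both_b = split_pair(all_inds, drop_a=0.2, drop_b=0.2)
# Drop from A only; B keeps everything.
only_a, full_b = split_pair(all_inds, drop_a=0.2, drop_b=0.0)
```

Keeping the two drop fractions as separate arguments covers both cases in the description: pass the same value to drop equally from A and B, or leave one at zero to drop from only one side.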