2310 Add load_csv_datalist utility API #2349
merge master
Signed-off-by: Nic Ma <nma@nvidia.com>
Thanks for Seyed's example code and use cases, I totally changed my previous local code and switched to Thanks.
Signed-off-by: Nic Ma <nma@nvidia.com>
b861d79 to 61208cf
Signed-off-by: Nic Ma <nma@nvidia.com>
61208cf to 58a0fa7
Signed-off-by: Nic Ma <nma@nvidia.com>
Signed-off-by: Nic Ma <nma@nvidia.com>
/black
Signed-off-by: monai-bot <monai.miccai2019@gmail.com>
Signed-off-by: Nic Ma <nma@nvidia.com>
Signed-off-by: Nic Ma <nma@nvidia.com>
/black
I think the CSV reading should be implemented with the MONAI dataset API, with an option of partially loading large CSV files, e.g. https://discuss.pytorch.org/t/how-to-use-dataset-larger-than-memory/37785
Hi @wyli, Thanks for your suggestion. Thanks.
Hi @wyli, I want to double-check your suggestion: if we partially load a large CSV in the dataset, do you mean loading only chunks of the CSV for a training run, or still using the whole dataset for training but not loading the data beforehand, so that every time we open the CSV file and read only 1 row based on the shuffled index of the dataset? Thanks.
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html @ericspod @rijobro @wyli What's the typical use case for a very big CSV file during training? Thanks in advance.
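The two readings of "partial loading" discussed above can be sketched as follows. This is only an illustration using the standard-library csv module, not the code in this PR; the column names, the sample data, and the helper names iter_chunks and read_row are assumptions made up for the example. pandas.read_csv offers chunksize, skiprows and nrows parameters that serve the same purpose.

```python
import csv
import io
from itertools import islice

# Hypothetical in-memory CSV standing in for a large file on disk.
CSV_TEXT = "subject_id,age,label\ns0,31,0\ns1,45,1\ns2,52,0\ns3,28,1\ns4,60,0\n"


def iter_chunks(stream, chunksize):
    """Option 1: yield the rows in fixed-size chunks, never holding the whole file."""
    reader = csv.DictReader(stream)
    while True:
        chunk = list(islice(reader, chunksize))
        if not chunk:
            return
        yield chunk


def read_row(stream, index):
    """Option 2: return only the data row at `index` (0-based, header excluded)."""
    reader = csv.DictReader(stream)
    # islice still scans the preceding lines, so each lookup costs O(index).
    return next(islice(reader, index, index + 1))


chunks = list(iter_chunks(io.StringIO(CSV_TEXT), chunksize=2))
row = read_row(io.StringIO(CSV_TEXT), index=3)
print([len(c) for c in chunks])
print(row)
```

Option 1 suits sequential passes over the file, while option 2 matches a shuffled index but re-reads the file up to the requested row on every access, so it may be slow for very large files.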
Signed-off-by: Nic Ma <nma@nvidia.com>
/black
I'd suggest the assert should be changed to have an error message that states the contents of
Signed-off-by: Nic Ma <nma@nvidia.com>
5dd3e44 to 48d4ef7
/black
Signed-off-by: monai-bot <monai.miccai2019@gmail.com>
Signed-off-by: Nic Ma <nma@nvidia.com>
/black
Signed-off-by: Nic Ma <nma@nvidia.com>
Hi @ericspod, thanks for your suggestions. I printed out the error message locally and solved the issue. @wyli The GPU tests failed due to the error below: Should I wait a while and try again? Thanks.
Signed-off-by: Nic Ma <nma@nvidia.com>
/black
Signed-off-by: monai-bot <monai.miccai2019@gmail.com>
Looks like multiple ninja builds sharing the same build cache is still creating some issues. Any idea, @charliebudd?
wyli
left a comment
Thanks, please add some basic support for missing values; we can have another iteration to update the modules.
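As a rough idea of what basic missing-value support could look like, here is a minimal sketch that fills empty cells with per-column defaults. It is not the implementation in this PR; the sample data, the DEFAULTS mapping, and the load_rows helper are hypothetical, and only the standard-library csv module is used.

```python
import csv
import io

# Hypothetical CSV in which some cells are empty.
CSV_TEXT = "subject_id,age,label\ns0,31,0\ns1,,1\ns2,52,\n"

# Hypothetical per-column defaults; a column not listed here keeps its raw value.
DEFAULTS = {"age": "-1", "label": "unknown"}


def load_rows(stream, defaults):
    """Read all rows, replacing empty or absent cells with the column default."""
    rows = []
    for row in csv.DictReader(stream):
        cleaned = {}
        for key, value in row.items():
            if key is None:
                # Surplus cells beyond the header are ignored in this sketch.
                continue
            missing = value is None or value.strip() == ""
            cleaned[key] = defaults.get(key, value) if missing else value
        rows.append(cleaned)
    return rows


rows = load_rows(io.StringIO(CSV_TEXT), DEFAULTS)
print(rows)
```

Keeping the defaults in a plain mapping lets callers choose a different fill value per column without changing the loading code.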
Signed-off-by: Nic Ma <nma@nvidia.com>
/black
Signed-off-by: monai-bot <monai.miccai2019@gmail.com>
Fixes #2310.
Description
This PR added the load_csv_datalist API to load extra information from CSV files. Users can easily combine this datalist with the image and label, etc. and put in Dataset.
Status
Ready
Types of changes
./runtests.sh -f -u --net --coverage.
./runtests.sh --quick --unittests.
make html command in the docs/ folder.
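The combination of CSV information with image and label entries mentioned in the description could look roughly like the sketch below. It does not call load_csv_datalist, whose signature is not shown in this conversation; the file names, column names, and the merge_csv_info helper are assumptions for illustration only.

```python
import csv
import io

# Hypothetical CSV with extra per-subject information.
CSV_TEXT = "subject_id,age,site\ns0,31,A\ns1,45,B\n"

# Hypothetical image/label datalist of the kind usually passed to a Dataset.
datalist = [
    {"subject_id": "s0", "image": "s0_img.nii.gz", "label": "s0_seg.nii.gz"},
    {"subject_id": "s1", "image": "s1_img.nii.gz", "label": "s1_seg.nii.gz"},
]


def merge_csv_info(items, stream, key):
    """Return new dicts that extend each item with the CSV row sharing `key`."""
    info = {row[key]: row for row in csv.DictReader(stream)}
    # Items without a matching CSV row are returned unchanged.
    return [{**item, **info.get(item[key], {})} for item in items]


merged = merge_csv_info(datalist, io.StringIO(CSV_TEXT), key="subject_id")
print(merged[0])
```

The merged list keeps the usual list-of-dicts shape, so it can be handed to a dataset the same way as the original image/label datalist.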