[Feature] Support downloading datasets from opendatalab #212
mim/commands/download.py
process = subprocess.Popen(
    ['odl', 'get', src_name, '-d', download_root],
    stdin=sys.stdin,
    stdout=subprocess.PIPE,
I think stdout=sys.stdout is better: the user can see that the download is in progress, how much has been downloaded, and how much time is still needed.
I initially redirected the output to sys.stdout, but the problem was that I then couldn't capture the output of odl and return a proper error message from mim download. Of course, this shouldn't be mim's responsibility; odl should ensure the readability of its own error messages. Unfortunately, its error handling is not good: when the user is not logged in, it doesn't raise an error but simply prints 401: {"msg":"login required"} to stdout.
Regardless, redirecting the logs to a pipe did hurt the user experience, so I have reverted the change.
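The trade-off discussed in this thread (live output versus being able to inspect it) can in principle be softened by reading the child's output line by line, echoing each line, and keeping a copy. The following is a minimal sketch of that idea, not what the PR does: run_and_tee is a hypothetical helper name, and a small Python child process stands in for odl so the example runs without odl installed.

```python
import subprocess
import sys


def run_and_tee(cmd):
    """Run cmd, echo its output as it arrives, and return (returncode, text)."""
    captured = []
    process = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same stream
        text=True,
        bufsize=1,  # line-buffered reads in text mode
    )
    for line in process.stdout:
        sys.stdout.write(line)  # show the line to the user immediately
        sys.stdout.flush()
        captured.append(line)  # keep a copy for later inspection
    returncode = process.wait()
    return returncode, ''.join(captured)


if __name__ == '__main__':
    # Stand-in for the real downloader: prints the message quoted in the thread.
    fake_output = '401: {"msg":"login required"}'
    child = [sys.executable, '-c', f'print({fake_output!r})']
    code, output = run_and_tee(child)
    if 'login required' in output:
        print('Please log in before downloading.', file=sys.stderr)
```

One caveat: a tool that draws its progress bar with carriage returns, or that changes its behaviour when stdout is not a terminal, may still look different through a pipe, which is consistent with the user-experience concern raised above.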
## Motivation Please describe the motivation of this PR and the goal you want to achieve through this PR. ## Modification - add dataset-index.yml ## Dependencies - [ ] open-mmlab/mim#212
Thanks for your contribution and we appreciate it a lot. The following instructions would make your pull request more healthy and more easily get feedback. If you do not understand some items, don't worry, just make the pull request and seek help from maintainers.
Motivation
Support downloading datasets from opendatalab and preprocessing the data with the scripts provided by OpenMMLab downstream repositories.
For example, users can download a specified dataset in the target format with the following commands:
Then users can run training and testing without preparing the dataset themselves.
The downstream repos should provide a dataset-index.yml in the project root directory like this:
The script field specifies the preparation script:
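As a rough illustration of how a tool might consume the script field, here is a minimal Python sketch. Only the file name dataset-index.yml and the field name script come from the description above; the dictionary layout, the dataset name, the script path, the helper name build_prepare_command, and the idea of passing the download directory to the script are all assumptions made for the example.

```python
# Hypothetical in-memory stand-in for a parsed dataset-index.yml.
# Only the "script" field name comes from the PR description.
DATASET_INDEX = {
    'example_dataset': {
        'script': 'tools/prepare_example.sh',  # hypothetical path
    },
}


def build_prepare_command(index, name, download_root):
    """Return the command that would run the preparation script, or None."""
    entry = index.get(name)
    if entry is None:
        raise KeyError(f'{name!r} is not listed in the dataset index')
    script = entry.get('script')
    if script is None:
        return None  # nothing to run after the download
    # Assumption: the script receives the download directory as an argument.
    return ['bash', script, download_root]


if __name__ == '__main__':
    print(build_prepare_command(DATASET_INDEX, 'example_dataset', 'data'))
```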
Modification
Please briefly describe what modification is made in this PR.
BC-breaking (Optional)
Does the modification introduce changes that break the back-compatibility of the downstream repos?
If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.
Use cases (Optional)
If this PR introduces a new feature, it is better to list some use cases here, and update the documentation.
Checklist