Change argument variable output_directory to data_directory #66

Closed
thodrek opened this issue Aug 25, 2021 · 4 comments
Labels: bug, documentation, enhancement

Comments

thodrek commented Aug 25, 2021

What is the documentation lacking? Please describe.
The variable output_directory in the input arguments is overloaded. The directory is used to include both the input and output data for a data set. Imprecise naming leads to wrong use of the system.

Describe the improvement you'd like
Rename the variable to data_directory instead of output_directory

thodrek added the bug, documentation, and enhancement labels Aug 25, 2021

AnzeXie (Collaborator) commented Aug 26, 2021

For marius_preprocess, what output_directory contains depends on the dataset type. When preprocessing supported datasets, it contains both the input (downloaded data) and the output (preprocessed data). When preprocessing custom datasets, it contains only the output data.

shivaram commented

Maybe we should have a separate directory called download_directory that stores the downloaded files. I think it is confusing to put both the input and output files of preprocessing in a directory called output_directory.

thodrek (Author) commented Aug 26, 2021

@AnzeXie So for custom datasets, how is the files input of general_parser set? The behavior described above is confusing and should be cleaned up. Specifically, either unify the directory as data_dir or split it into input_dir and output_dir; the current choice is inconsistent.

AnzeXie (Collaborator) commented Aug 27, 2021

For custom datasets, the path passed as the files input of general_parser is set by the user. For supported datasets, the input files are first downloaded and then preprocessed, and until now these downloaded files were placed in output_dir. OK, I will add a separate directory called download_directory specifically for the files downloaded for supported datasets.
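
For readers following the thread, here is a minimal sketch of the split proposed above, written with Python's standard argparse. It is not the project's actual interface: the flag names `--download_directory` and `--files`, the `Layout` class, and the `resolve_layout` helper are all assumptions made for illustration. It only shows how keeping downloads and preprocessed output in separate directories might look for the two cases discussed (supported vs. custom datasets).

```python
import argparse
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional


@dataclass
class Layout:
    """Where raw inputs come from and where preprocessed output goes."""
    download_dir: Optional[Path]  # set only for supported (downloaded) datasets
    input_files: List[Path]       # set only for custom (user-supplied) datasets
    output_dir: Path


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical CLI: these flag names are illustrative assumptions.
    parser = argparse.ArgumentParser(prog="preprocess_sketch")
    parser.add_argument("--output_directory", type=Path, required=True,
                        help="directory for preprocessed data only")
    parser.add_argument("--download_directory", type=Path, default=None,
                        help="directory for raw files of supported datasets")
    parser.add_argument("--files", type=Path, nargs="*", default=[],
                        help="user-supplied input files for a custom dataset")
    return parser


def resolve_layout(args: argparse.Namespace) -> Layout:
    """Decide the directory layout without touching the filesystem."""
    if args.files:
        # Custom dataset: the user chooses the input paths; nothing is downloaded.
        return Layout(None, list(args.files), args.output_directory)
    if args.download_directory is None:
        # Supported dataset: refuse to silently mix downloads into the output.
        raise ValueError("supported datasets need --download_directory")
    return Layout(args.download_directory, [], args.output_directory)
```

With this layout, `output_directory` would mean the same thing in both cases, and the only difference between supported and custom datasets is where the raw input comes from.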


4 participants