Set output_folder to path outside of ml repo #201

Closed
erikr opened this issue Apr 2, 2020 · 6 comments


erikr commented Apr 2, 2020

What
When calling recipes.py, if a user sets an --output_folder path that is not within the repo directory, no results are saved on the host machine.

It would be great if a user could specify any path on their machine in which to save results from running ML4CVD!

Why
The repo should contain code. Results should live in a different directory. Results in the repo directory can clutter the output of git status and subsequent adds, commits, and pushes. Having to move results out of the repo directory adds a step to user workflow.

How
I think this limitation is due to a Docker mount setting. The solution is probably to mount the home directory, and accept a limitation that --output_folder must be within ~/ and not upstream of that. Mounting / seems problematic.
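As a rough sketch of the home-directory idea: assuming the launcher ultimately calls `docker run` with `-v host:container` bind mounts (an assumption, since the script internals are not shown in this thread), the flag could be built like this. `home_mount_flag` is a hypothetical name used only for illustration.

```shell
# Hypothetical helper: build a bind-mount flag that exposes the home directory
# inside the container at the same absolute path it has on the host.
# Assumption: the launcher passes this flag through to `docker run`.
home_mount_flag() {
    printf '%s %s:%s\n' -v "$HOME" "$HOME"
}

# Print the flag; it would be spliced into a `docker run` invocation.
home_mount_flag
```

Mounting at the same path on both sides means an `--output_folder` under `~/` resolves identically inside and outside the container.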

Acceptance Criteria
A user can set any --output_folder, whether or not it is inside the repo directory, and results appear in a subdirectory named by the id argument.

@erikr erikr added the enhancement New feature or request label Apr 2, 2020
@erikr erikr added this to Backlog in Infrastructure via automation Apr 2, 2020
@lucidtronix lucidtronix moved this from Backlog to Todo in Infrastructure Apr 17, 2020
@paolodi paolodi self-assigned this Apr 24, 2020
Contributor

paolodi commented Apr 24, 2020

A current workaround for this is to manually signal the output folder to tf.sh via the -m flag. For example, if the desired output folder is /data/models/output,
./scripts/tf.sh -m /data/output /home/user/ml/ml4cvd/recipes.py --mode train --output_folder /data/models/output

./scripts/tf.sh -m /data/models/output /home/user/ml/ml4cvd/recipes.py \
                --mode train \
                --output_folder /data/models/output

would mount /data/models/output as a shared volume in Docker and the results will be accessible after the run.
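One way to keep the two paths from drifting apart is to derive both from a single value. The wrapper below is only a sketch: `tf_train_cmd` is a hypothetical name, the recipes.py path is the example path from this comment, and the function prints the command instead of running it, so it can be inspected first.

```shell
# Hypothetical wrapper: use the same value for the -m mount and --output_folder
# so the mounted directory always matches where results are written.
# $1 = output folder on the host.
# Note: the path is inserted unquoted, so this sketch assumes no spaces in it.
tf_train_cmd() {
    printf './scripts/tf.sh -m %s /home/user/ml/ml4cvd/recipes.py --mode train --output_folder %s\n' \
        "$1" "$1"
}

# Dry run: show the command that would be executed.
tf_train_cmd /data/models/output
```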

I agree that discovering the results are gone only after the run completes is annoying. I'll see if we can at least emit a warning when the user risks this happening. Also, saving results outside the repo is good practice because it makes startup faster, so we should encourage it by making it easier.
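A warning of that kind might be sketched as a pre-flight check. This is not the project's implementation: `is_within` and `warn_if_unmounted` are hypothetical names, and the check is a plain string comparison that does not resolve symlinks or relative paths.

```shell
# Return 0 if path $1 equals directory $2 or lies beneath it.
# Pure string comparison: trailing slashes are trimmed, symlinks are not resolved.
is_within() {
    [ "${1%/}" = "${2%/}" ] && return 0
    case "${1%/}" in
        "${2%/}"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Warn on stderr when output folder $1 is outside $2, the directory assumed
# to be mounted into the container. Returns 1 when the warning fires.
warn_if_unmounted() {
    if ! is_within "$1" "$2"; then
        echo "warning: $1 is outside $2; results may be lost unless it is mounted with -m" >&2
        return 1
    fi
}

# Example: the repo is assumed to be mounted, the output folder is elsewhere.
warn_if_unmounted /data/models/output /home/user/ml || true
```

Comparing against `"$dir"/*` instead of a bare prefix keeps a sibling such as /home/user/mlx from being treated as inside /home/user/ml.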

Author

erikr commented Apr 24, 2020

Thanks for the workaround, I will try to use that for now!

@paolodi paolodi moved this from Todo to In progress in Infrastructure Apr 24, 2020
Author

erikr commented Apr 25, 2020

Paolo, in your example above, did you mean to suggest mounting the same directory as the user specifies for --output_folder, e.g.:

./scripts/tf.sh -m /data/models/output /home/user/ml/ml4cvd/recipes.py \
    --mode train \
    --output_folder /data/models/output

rather than:

./scripts/tf.sh -m /data/output /home/user/ml/ml4cvd/recipes.py \
    --mode train \
    --output_folder /data/models/output

I did the former and it worked!

Contributor

paolodi commented Apr 25, 2020

You're absolutely right, Erik, sorry for the typo!

Author

erikr commented Apr 25, 2020

Ok thanks -- I know it seems like a silly question but I know so little about Docker I wanted to be sure!

Author

erikr commented May 25, 2020

@paolodi I am not sure exactly what changed, but I no longer have to use the -m flag to mount drives outside of the repo. Can this issue be closed?

@erikr erikr closed this as completed Jul 1, 2020
Infrastructure automation moved this from In progress to Done Jul 1, 2020

2 participants