stage add: fails with dirs that will be created later by the cmd(s) #5802

Closed
jorgeorpinel opened this issue Apr 12, 2021 · 8 comments · Fixed by #8644
Labels
A: pipelines Related to the pipelines feature p2-medium Medium priority, should be done, but less important

Comments

@jorgeorpinel
Contributor

jorgeorpinel commented Apr 12, 2021

Bug Report

Description

If the command of the stage being defined creates a directory, and files in that directory are stage outputs, dvc stage add fails because it can't write the corresponding .gitignore file (the directory doesn't exist yet).

Reproduce

$ dvc stage add -n hidir -o dir/hi 'mkdir dir && echo hi > dir/hi'
ERROR: unexpected error - [Errno 2] No such file or directory: '/.../dir/.gitignore'

Expected

I guess the full path to the .gitignore in question should be created if needed. Or maybe create it at repro? Not sure how, but I would expect the stage definition not to fail for this.
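
To illustrate the first option (creating the full path to the .gitignore if needed), here is a minimal Python sketch. It is not DVC's actual code: the function name `add_ignore_entry` and the `/<name>` entry format are assumptions made for illustration only.

```python
import os
import tempfile


def add_ignore_entry(output_path):
    """Append an ignore entry for output_path to the .gitignore beside it.

    Hypothetical helper, not part of DVC. The parent directory is created
    first, so a missing directory no longer raises [Errno 2].
    """
    parent = os.path.dirname(os.path.abspath(output_path))
    # Create the directory (and any missing ancestors) before writing.
    os.makedirs(parent, exist_ok=True)
    gitignore = os.path.join(parent, ".gitignore")
    # Assumed entry format: a leading slash anchors the pattern to this dir.
    entry = "/" + os.path.basename(output_path)
    with open(gitignore, "a", encoding="utf-8") as fobj:
        fobj.write(entry + "\n")
    return gitignore


# Usage: "dir" does not exist yet, mirroring the reproduction above.
with tempfile.TemporaryDirectory() as workspace:
    written = add_ignore_entry(os.path.join(workspace, "dir", "hi"))
    print(os.path.relpath(written, workspace))
```

One trade-off of this approach: the stage definition leaves an otherwise empty directory behind, which the stage's own `mkdir dir` would then fail on unless the command uses `mkdir -p`.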

Environment information

Output of dvc doctor:

$ dvc doctor
DVC version: 2.0.17 (pip)
---------------------------------
Platform: Python 3.6.9 on Linux-5.4.72-microsoft-standard-WSL2-x86_64-with-Ubuntu-18.04-bionic
Supports: gdrive, hdfs, http, https, s3, ssh, oss
Cache types: <https://error.dvc.org/no-dvc-cache>
Caches: local
Remotes: None
Workspace directory: ext4 on /dev/sdb
Repo: dvc, git
@jorgeorpinel changed the title from "stage add: fails with dirs that will be created by the stage" to "stage add: fails with dirs that will be created by the cmd(s)" on Apr 12, 2021
@dberenbaum added this to "To do" in "DVC Sprint 20 April - 4 May 2021" via automation on Apr 19, 2021
@efiop added the p1-important and p2-medium labels and removed the p1-important label on Apr 20, 2021
@efiop added this to "To do" in "DVC 4 - 18 May 2021" via automation on May 4, 2021
@efiop removed this from "To do" in "DVC 4 - 18 May 2021" on May 18, 2021
@efiop added this to "To do" in "DVC May 18 - Jun 1 2021" via automation on May 18, 2021
@efiop removed this from "To do" in "DVC May 18 - Jun 1 2021" on Jun 1, 2021
@daavoo added the A: pipelines label on Oct 20, 2021
@dberenbaum
Contributor

@skshetry This can also cause dvc exp init to fail:

$ dvc exp init --model models/model.h5 python src/train.py
ERROR: unexpected error - [Errno 2] No such file or directory: '/Users/dave/repo/models/.gitignore'

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!

@jorgeorpinel changed the title from "stage add: fails with dirs that will be created by the cmd(s)" to "stage add: fails with dirs that will be created later by the cmd(s)" on Apr 12, 2022
@dberenbaum
Contributor

@skshetry I don't want to slow down other work for this, but this seems like a fairly serious bug for exp init. It actually breaks the example in https://dvc.org/doc/command-reference/exp/init#example-interactive-mode. What do you think about priority and solution?

@skshetry
Member

skshetry commented May 3, 2022

@dberenbaum, for exp init, we can consider creating the parent directory of those outputs, where the .gitignore file will be added.

However, we can also choose not to generate .gitignore entries on exp init/stage add in these edge cases, as successive dvc repro and dvc exp run will try to generate them as well.

If we do want to always generate .gitignore, we need to fix how we create .gitignore files. We need to change the gitignore-generator to backtrack up to DVC's root directory, and do likewise on dvc remove.
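
One possible reading of "backtrack up to the root directory", sketched in Python: walk up from the output's parent until an existing directory is found, and place the ignore entry there, relative to that directory. This is an interpretation, not DVC's implementation; `nearest_existing_dir` and `ignore_entry` are hypothetical names, and the repo root is assumed to exist.

```python
import os
import tempfile


def nearest_existing_dir(output_path, repo_root):
    """Return the closest existing ancestor of output_path inside repo_root.

    Hypothetical helper; repo_root is assumed to be an existing directory.
    """
    root = os.path.abspath(repo_root)
    current = os.path.dirname(os.path.abspath(output_path))
    if os.path.commonpath([root, current]) != root:
        raise ValueError(f"{output_path!r} is outside {repo_root!r}")
    # Backtrack one level at a time; stop at the root at the latest.
    while current != root and not os.path.isdir(current):
        current = os.path.dirname(current)
    return current


def ignore_entry(output_path, gitignore_dir):
    """Build an anchored ignore pattern relative to gitignore_dir."""
    rel = os.path.relpath(os.path.abspath(output_path), gitignore_dir)
    return "/" + rel.replace(os.sep, "/")


# Usage: "dir" is missing, so the entry would land in the root .gitignore.
with tempfile.TemporaryDirectory() as repo:
    out = os.path.join(repo, "dir", "hi")
    target = nearest_existing_dir(out, repo)
    print(ignore_entry(out, target))  # prints "/dir/hi"
```

Under this reading, removal would need the same lookup, since the entry may live in an ancestor's .gitignore rather than next to the output.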

@dberenbaum
Contributor

dberenbaum commented May 3, 2022

> successive dvc repro and dvc exp run will try to generate them as well.

In that case, do you know why we try to create .gitignore entries on stage add?

@dberenbaum added the p1-important label and removed the p2-medium label on May 11, 2022
@dberenbaum
Contributor

Marking this as p1 until at least the dvc exp init scenario is resolved (see iterative/dvc.org#3430 (comment)).

@daavoo
Contributor

daavoo commented May 13, 2022

> In that case, do you know why we try to create .gitignore entries on stage add?

Any answer to this?

@dberenbaum added the p2-medium label and removed the p1-important label on Jun 7, 2022
@dberenbaum
Contributor

Bumping down to p2 following #7752

@dberenbaum
Contributor

> > In that case, do you know why we try to create .gitignore entries on stage add?
>
> Any answer to this?

See #7740 (comment)
