add fails to add unlocked file which now would need to go to git if was added anew #1651

Closed
yarikoptic opened this issue Jul 21, 2017 · 4 comments

@yarikoptic yarikoptic commented Jul 21, 2017

So there is a change to .gitattributes in the interim:

yoh@hopa:~> datalad create /tmp/testunlock
[INFO   ] Creating a new annex repo at /tmp/testunlock 
create(ok): /tmp/testunlock (dataset)                                                                           

yoh@hopa:~> cd /tmp/testunlock

yoh@hopa:/tmp/testunlock> echo 123 > 123

yoh@hopa:/tmp/testunlock> datalad add 123
add(ok): /tmp/testunlock/123 (file)                                                                             
save(ok): /tmp/testunlock (dataset)
action summary:
  add (ok: 1)
  save (ok: 1)

yoh@hopa:/tmp/testunlock> datalad unlock 123
unlock(ok): 123 (file)

yoh@hopa:/tmp/testunlock> datalad add 123
add(ok): /tmp/testunlock/123 (file)                                                                             
save(notneeded): /tmp/testunlock (dataset)
action summary:
  add (ok: 1)
  save (notneeded: 1)

yoh@hopa:/tmp/testunlock> datalad unlock 123
unlock(ok): 123 (file)

yoh@hopa:/tmp/testunlock> cat .gitattributes
* annex.backend=MD5E

yoh@hopa:/tmp/testunlock> echo '* annex.largefiles=(not(mimetype=text/*))' >> .gitattributes

yoh@hopa:/tmp/testunlock> datalad save -m 'all text files go to git' .gitattributes
save(ok): /tmp/testunlock (dataset)                                                                             

yoh@hopa:/tmp/testunlock> datalad unlock 123

yoh@hopa:/tmp/testunlock> datalad add 123
add(ok): /tmp/testunlock/123 (file) [non-large file; adding content to git repository]                          
add(ok): /tmp/testunlock/123 (file) [non-large file; adding content to git repository]
Failed to run ['git', '-c', 'receive.autogc=0', '-c', 'gc.auto=0', 'commit', '-m', '[DATALAD] added content', u'123'] under '/tmp/testunlock'. Exit code=1. out= err=git-annex: Cannot make a partial commit with unlocked annexed files. You should `git annex add` the files you want to commit, and then run git commit.
git-annex: Cannot make a partial commit with unlocked annexed files. You should `git annex add` the files you want to commit, and then run git commit.

yoh@hopa:/tmp/testunlock> datalad add 123
add(notneeded): /tmp/testunlock/123 (file) [already included in the dataset]
save(ok): /tmp/testunlock (dataset)
action summary:
  add (notneeded: 1)
  save (ok: 1)

yoh@hopa:/tmp/testunlock> git status
On branch master
nothing to commit, working tree clean

yoh@hopa:/tmp/testunlock> ls -l 123
4 lrwxrwxrwx 1 yoh yoh 108 Jul 21 15:52 123 -> .git/annex/objects/pF/Zf/MD5E-s4--ba1f2511fc30423bdbb183fe33f3dd0f/MD5E-s4--ba1f2511fc30423bdbb183fe33f3dd0f

@mih mih commented Jul 22, 2017

To whoever will work on this: Please keep in mind that calling add on any changed file will bring back approx. 324 issues that we discovered to be a problem with this kind of approach. We may want to consider accepting that we cannot cover all possible combinations of various stages of modification and actions without substantial costs. In this particular case, my impulse response would be:

  1. document that such a change in .gitattributes is best done on a clean repo
  2. not "fix" this issue within add, but intercept this particular error, and give instructions on what to do in order to resolve it (current error isn't too bad to begin with) -- manually. This is pretty much conflict resolution, which even git will present as a manual exercise. (I don't think that a posthoc change in .gitattributes implies that an already annexed file needs to magically move out of the annex)
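A minimal sketch of what point 2 could look like, assuming the caller already has the stderr of the failed commit at hand. The marker string is copied from the git-annex error quoted in this issue; the function name and the hint wording are hypothetical and not part of DataLad. The hint only repeats what git-annex itself suggests; whether that fully resolves this case was not established here.

```python
import shlex

# Marker copied from the git-annex error message quoted in this issue.
PARTIAL_COMMIT_MARKER = "Cannot make a partial commit with unlocked annexed files"


def explain_commit_failure(stderr, paths):
    """Return a manual-resolution hint if stderr carries the known error.

    Hypothetical helper: returns None for any other failure so that the
    caller can fall back to its regular error reporting.
    """
    if PARTIAL_COMMIT_MARKER not in stderr:
        return None
    # Quote paths so the suggested command can be pasted into a shell.
    files = " ".join(shlex.quote(p) for p in paths)
    return (
        "git-annex refused a partial commit of unlocked annexed files.\n"
        "To resolve this manually, run:\n"
        f"  git annex add {files}\n"
        "  git commit"
    )
```

A caller would print the returned hint instead of the raw command failure, and leave the actual resolution to the user.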

But it could also be that I misread the bug report (it has no conclusion), because it looks like running add twice resolves the issue?! Hard to say anything without debugging.

@yarikoptic yarikoptic commented Oct 5, 2018

A change in .gitattributes could happen long before that file even exists... It is a matter of a file possibly migrating between git and annex depending on the settings in .gitattributes. E.g., this would lead to a failure as well:

   > datalad create --text-no-annex /tmp/testds2
   > cd /tmp/testds2
   > touch annexed
   > datalad add annexed
   > datalad unlock annexed
   > echo 1 >> annexed
   > datalad add annexed
   add(ok): /tmp/testds2/annexed (file) [non-large file; adding content to git repository]
add(ok): /tmp/testds2/annexed (file) [non-large file; adding content to git repository]
Failed to run ['git', '-c', 'receive.autogc=0', '-c', 'gc.auto=0', 'commit', '-m', '[DATALAD] added content', '--', u'annexed'] under u'/tmp/testds2'. Exit code=1. out= err=git-annex: Cannot make a partial commit with unlocked annexed files. You should `git annex add` the files you want to commit, and then run git commit.

git-annex: Cannot make a partial commit with unlocked annexed files. You should `git annex add` the files you want to commit, and then run git commit.

Not sure which 324 issues were in mind at that point. I would disagree that the current error is not too bad: the whole output makes little sense (add(ok), even twice, seems to come from git-annex since it is a non-large file, but then kaboom, it says that git annex add was needed first).
Running it twice doesn't help:

$> echo 2 >> annexed     

$> datalad add annexed   
add(ok): /tmp/testds2/annexed (file) [non-large file; adding content to git repository]
add(ok): /tmp/testds2/annexed (file) [non-large file; adding content to git repository]
Failed to run ['git', '-c', 'receive.autogc=0', '-c', 'gc.auto=0', 'commit', '-m', '[DATALAD] added content', '--', u'annexed'] under u'/tmp/testds2'. Exit code=1. out= err=git-annex: Cannot make a partial commit with unlocked annexed files. You should `git annex add` the files you want to commit, and then run git commit.

git-annex: Cannot make a partial commit with unlocked annexed files. You should `git annex add` the files you want to commit, and then run git commit.

$> datalad add annexed
add(notneeded): /tmp/testds2/annexed (file) [already included in the dataset]
Failed to run ['git', '-c', 'receive.autogc=0', '-c', 'gc.auto=0', 'commit', '-m', '[DATALAD] added content', '--', u'annexed'] under u'/tmp/testds2'. Exit code=1. out= err=git-annex: Cannot make a partial commit with unlocked annexed files. You should `git annex add` the files you want to commit, and then run git commit.

git-annex: Cannot make a partial commit with unlocked annexed files. You should `git annex add` the files you want to commit, and then run git commit.

Not sure what conclusion was asked for (besides that add fails). Interestingly, save does not seem to fail. It doesn't move the file between annex and git though: it consistently keeps the file wherever it already is (git or annex) across saves. Not sure whether that should be considered a feature or a bug anymore, since IMHO there is no clear definition/idea of what actually should happen (besides that things shouldn't fail).
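To make the "migration" concrete, here is a rough sketch of the decision implied by a rule like annex.largefiles=(not(mimetype=text/*)). It assumes that --text-no-annex installs an equivalent rule and that an empty file is reported with the MIME type inode/x-empty (which would explain why the freshly touched file was annexed and only became a git candidate once it had text in it). Both function names are hypothetical; this is not how git-annex or DataLad implement it.

```python
from fnmatch import fnmatchcase


def goes_to_annex(mimetype, text_glob="text/*"):
    """Emulate (not(mimetype=text/*)): anything that is not text is 'large'."""
    return not fnmatchcase(mimetype, text_glob)


def storage_change(currently_annexed, mimetype):
    """Name the migration, if any, that re-adding a modified file implies."""
    wanted_in_annex = goes_to_annex(mimetype)
    if currently_annexed and not wanted_in_annex:
        return "annex -> git"
    if not currently_annexed and wanted_in_annex:
        return "git -> annex"
    return None  # the file stays where it is


# The second report, step by step (MIME types are assumptions, see above):
print(goes_to_annex("inode/x-empty"))      # empty file right after `touch`
print(storage_change(True, "text/plain"))  # after `echo 1 >> annexed`
```

The failing case in both reports is the "annex -> git" one: the file is already annexed and unlocked, while the current rule says its new content belongs in git.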

yarikoptic added a commit to yarikoptic/datalad that referenced this issue Oct 5, 2018
This way we could aggregate all the result records in case they
come in handy for analysis/output where/when needed.

BK due to datalad#1651
since apparently neither add nor save can deal with files migrating
between git and annex

@yarikoptic yarikoptic commented Nov 23, 2018

awesome that it is fixed in the revolution, I kept hitting it in the real world, need to come up with a workaround or a fix...

@mih mih commented Nov 23, 2018

pip install datalad-revolution or ping #2926

@yarikoptic yarikoptic mentioned this issue Nov 23, 2018
16 tasks
yarikoptic added a commit that referenced this issue Nov 27, 2018
	## 0.11.1 (Nov 25, 2018) -- v7-better-than-v6

	Rushed out bugfix release to stay fully compatible with recent
	[git-annex] which introduced v7 to replace v6.

	### Fixes

	- [install]: be able to install recursively into a dataset ([#2982])
	- [save]: be able to commit/save changes whenever files potentially
	  could have swapped their storage between git and annex
	  ([#1651]) ([#2752]) ([#3009])
	- [aggregate-metadata]:
	  - the dataset itself is now not "aggregated" if specific paths are
		provided for aggregation ([#3002]). That resolves the issue of
		`-r` invocation aggregating all subdatasets of the specified dataset
		as well
	  - also compare/verify the actual content checksum of aggregated metadata
		while considering subdataset metadata for re-aggregation ([#3007])
	- `annex` commands are now chunked assuming 50% "safety margin" on the
	  maximal command line length. Should resolve crashes while operating
	  on too many files at once ([#3001])
	- `run` sidecar config processing ([#2991])
	- no double trailing period in docs ([#2984])
	- correct identification of the repository with symlinks in the paths
	  in the tests ([#2972])
	- re-evaluation of dataset properties in case of dataset changes ([#2946])
	- [text2git] procedure to use `ds.repo.set_gitattributes`
	  ([#2974]) ([#2954])
	- Switch to use plain `os.getcwd()` if inconsistency with env var
	  `$PWD` is detected ([#2914])
	- Make sure that credential defined in env var takes precedence
	  ([#2960]) ([#2950])

	### Enhancements and new features

	- [shub://datalad/datalad:git-annex-dev](https://singularity-hub.org/containers/5663/view)
	  provides a Debian buster Singularity image with build environment for
	  [git-annex]. [tools/bisect-git-annex]() provides a helper for running
	  `git bisect` on git-annex using that Singularity container ([#2995])
	- Added [.zenodo.json]() for better integration with Zenodo for citation
	- [run-procedure] now provides names and help messages with a custom
	  renderer for ([#2993])
	- Documentation: point to [datalad-revolution] extension (prototype of
	  the greater DataLad future)
	- [run]
	  - support injecting of a detached command ([#2937])
	- `annex` metadata extractor now extracts `annex.key` metadata record.
	  Should allow now to identify uses of specific files etc ([#2952])
	- Test that we can install from http://datasets.datalad.org
	- Proper rendering of `CommandError` (e.g. in case of "out of space"
	  error) ([#2958])

* tag '0.11.1':
  Adjust the date -- 25th fell through due to __version__ fiasco
  BF+ENH(TST): boost hardcoded version + provide a test to guarantee consistency in the future
  This (expensive) approach is not needed in v6+
  small tuneup to changelog
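
The command chunking mentioned in the 0.11.1 fixes ([#3001]) could look roughly like the sketch below. It is only an illustration of the idea of a 50% "safety margin": the function name, the max_len parameter, and the way argument length is counted (characters plus one separator each) are assumptions, not DataLad's actual implementation.

```python
def chunk_args(args, max_len, margin=0.5):
    """Yield lists of arguments whose joined length stays within the budget.

    The budget is `margin` times the maximal command line length, so with the
    default only half of `max_len` is ever used for file arguments.
    """
    budget = int(max_len * margin)
    chunk, size = [], 0
    for arg in args:
        cost = len(arg) + 1  # one extra character for the separating space
        if chunk and size + cost > budget:
            yield chunk
            chunk, size = [], 0
        # An argument longer than the budget still gets a chunk of its own.
        chunk.append(arg)
        size += cost
    if chunk:
        yield chunk
```

Each yielded chunk would then be passed to a separate invocation of the underlying command, for example one git annex call per chunk.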