FYI: empty files are annexed; inconsistent mimetype reporting #3663
Comments
FTR:
Apparently that's what git-annex uses. However, we do need to figure out what to do with such configuration procedures, since the behavior doesn't make a lot of sense. Maybe just add
I can't think of a reason why it would be. The file name is already tracked, so there can be no unintended information that's being exposed. I read @bpoldrack's comment as suggesting that you should add
diff --git a/datalad/resources/procedures/cfg_text2git.py b/datalad/resources/procedures/cfg_text2git.py
index 0218d2f9d..490666ea0 100644
--- a/datalad/resources/procedures/cfg_text2git.py
+++ b/datalad/resources/procedures/cfg_text2git.py
@@ -10,7 +10,7 @@
check_installed=True,
purpose='configuration')
-annex_largefiles = '(not(mimetype=text/*))'
+annex_largefiles = '(not((mimetype=text/*)or(mimetype=inode/x-empty)))'
attrs = ds.repo.get_gitattributes('*')
if not attrs.get('*', {}).get(
'annex.largefiles', None) == annex_largefiles:
If a user has instructed datalad to send text files to git, I think it's more likely that they'll be surprised an empty file goes to annex than that they'll be annoyed about our loose interpretation of an empty file as a text file. I suppose it's worth considering if the default .gitattributes (i.e., no
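To make the effect of the patch concrete, here is a minimal sketch that models the two expressions in plain Python. It is a toy model only: it does not call git-annex, the function names are made up for illustration, and it assumes that the `text/*` part of the expression behaves like a shell-style glob.

```python
from fnmatch import fnmatchcase

# The two expressions from the patch above, copied verbatim for reference.
OLD_RULE = "(not(mimetype=text/*))"
NEW_RULE = "(not((mimetype=text/*)or(mimetype=inode/x-empty)))"


def annexed_under_old_rule(mimetype: str) -> bool:
    """Toy model of OLD_RULE: annex everything that is not text/*."""
    return not fnmatchcase(mimetype, "text/*")


def annexed_under_new_rule(mimetype: str) -> bool:
    """Toy model of NEW_RULE: additionally keep inode/x-empty out of the annex."""
    is_text = fnmatchcase(mimetype, "text/*")
    is_empty = mimetype == "inode/x-empty"
    return not (is_text or is_empty)


if __name__ == "__main__":
    # Print mimetype, old decision, new decision for a few sample types.
    for mt in ("text/plain", "inode/x-empty", "image/png"):
        print(mt, annexed_under_old_rule(mt), annexed_under_new_rule(mt))
```

Of the three sample types, only `inode/x-empty` changes its outcome between the two rules, which is the behavior the patch is after.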
@kyleam: My intention actually was to add it either to
Yes, I lean towards that idea as well.
@adswa Would you like to take a crack at wrapping the above patch into a proper commit? I think it should come down to providing the rationale in the commit message and adding a test. I suppose one tricky part about testing cfg_text2git is that the behavior depends on git-annex being built with the MagicMime flag. But assuming MagicMime support should be fine because neurodebian's git-annex has it and it looks like conda's does now too. If it ends up being a problem down the road, we can skip the test based on the
Yes, I'd love to!
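A check for MagicMime support could, for instance, look at the build flags that `git annex version` prints. The following is only a sketch under that assumption: the helper names are hypothetical, and it presumes the output contains a line starting with `build flags:` followed by space-separated flag names.

```python
import subprocess


def has_magicmime(version_output: str) -> bool:
    """Return True if a 'build flags:' line lists MagicMime (assumed format)."""
    for line in version_output.splitlines():
        if line.startswith("build flags:"):
            flags = line.split(":", 1)[1].split()
            return "MagicMime" in flags
    return False


def annex_version_output() -> str:
    """Return the output of `git annex version`, or '' if it cannot be run."""
    try:
        result = subprocess.run(
            ["git", "annex", "version"],
            capture_output=True, text=True, check=True,
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        return ""
    return result.stdout


if __name__ == "__main__":
    print("MagicMime available:", has_magicmime(annex_version_output()))
```

A test could then be skipped when `has_magicmime(annex_version_output())` is false, rather than failing on builds without the flag.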
Empty files are of mimetype inode/x-empty, and hence would be annexed. This behavior is likely unexpected after applying the text2git configuration. See 'datalad#3663 (comment)' for details.
Empty files are of mimetype inode/x-empty, and hence would be annexed. This behavior is likely unexpected after applying the text2git configuration. See 'datalad#3663 (comment)' for details.
use file size rule instead of mime type
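A size-based rule might look like the following. This is a sketch, not the expression from the commit: it assumes the mimetype test is combined with git-annex's `largerthan=0` size test, and it models the result in plain Python with a made-up function name.

```python
from fnmatch import fnmatchcase

# Hypothetical expression for illustration; not copied from the commit.
SIZE_RULE = "((not(mimetype=text/*))and(largerthan=0))"


def annexed_under_size_rule(mimetype: str, size_bytes: int) -> bool:
    """Toy model of SIZE_RULE: annex only non-text files with content."""
    not_text = not fnmatchcase(mimetype, "text/*")
    non_empty = size_bytes > 0
    return not_text and non_empty
```

The appeal of a size test is that it does not depend on which mimetype a given tool assigns to an empty file: a file of size 0 stays out of the annex either way.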
I'm just noting a behavior that confused me:
Let's say I am creating an empty file in a dataset configured with text2git, using touch somefile (no extension, no content). The mimetype command reports this file as text/plain:
A subsequent datalad save annexes the file:
The inconsistency (and my confusion) arises from the fact that text2git configures .gitattributes to not regard any text files as largefiles.
The rules for largefiles in the git-annex docs help to clarify what I was missing:
If I actually run file --mime-type somefile, it is not reported as a text file anymore:
That's just as an FYI. However, is it actually useful/intended to annex files with a size of 0?
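The `file --mime-type` observation can be reproduced outside of a dataset. The sketch below creates an empty file in a temporary directory and asks `file` for its mimetype; the function names are made up, and it returns `None` when the `file` utility is not installed.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path
from typing import Optional


def file_mimetype(path: Path) -> Optional[str]:
    """Return what `file --mime-type` reports, or None if `file` is missing."""
    if shutil.which("file") is None:
        return None
    result = subprocess.run(
        ["file", "--brief", "--mime-type", str(path)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def mimetype_of_empty_file() -> Optional[str]:
    """Create an empty file (like `touch somefile`) and report its mimetype."""
    with tempfile.TemporaryDirectory() as tmp:
        empty = Path(tmp) / "somefile"
        empty.touch()
        return file_mimetype(empty)


if __name__ == "__main__":
    print(mimetype_of_empty_file())
```

On a system with a recent `file`, this should report the same `inode/x-empty` type that the thread mentions, rather than a `text/*` type.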