Git filters are sometimes case sensitive on case insensitive file systems #1537

Closed
larsxschneider opened this issue Sep 26, 2016 · 12 comments

@larsxschneider
Member

Setup

git version 2.10.0
git-lfs/1.4.1 (GitHub; darwin amd64; go 1.7)

Steps to reproduce

$ git init .
$ echo "test1" > upper.DAT
$ echo "test2" > lower.dat
$ git add .
$ git commit -m "add files to Git"
$ git lfs track "*.dat"
$ git add .
$ git commit -m "move files to LFS"
$ rm *
$ git checkout .

Expected behavior

A clean working directory.

Actual behavior

$ git status
On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

    modified:   upper.DAT

Why this is confusing:
As a user, you would think that your file had somehow changed. But if you run git checkout -- upper.DAT to discard the changes (as Git instructs), nothing happens.

What is going on?

The git add . after the git lfs track "*.dat" moved lower.dat to Git LFS, but not upper.DAT (although it should have, since this is a case-insensitive file system):

$ git cat-file blob :upper.DAT
test1
$ git cat-file blob :lower.dat
version https://git-lfs.github.com/spec/v1
oid sha256:7d6fd7774f0d87624da6dcf16d0d3d104c3191e771fbe2f39c86aed4b2bf1a0f
size 6
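
To run the same check on any path without reading the blob by eye, a small helper can test whether content starts with the LFS pointer header shown above. This is a minimal sketch: the function name is_lfs_pointer is made up for illustration, and it only inspects the first line of the pointer.

```shell
# Hypothetical helper: succeeds if stdin starts with the Git LFS pointer header.
is_lfs_pointer() {
  # Read the first line; tolerate input that lacks a trailing newline.
  IFS= read -r first_line || [ -n "$first_line" ] || return 1
  [ "$first_line" = "version https://git-lfs.github.com/spec/v1" ]
}

# Possible usage inside the repository from the steps above:
#   git cat-file blob :upper.DAT | is_lfs_pointer && echo "LFS pointer" || echo "plain blob"
```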

If you add the dat/DAT files after you track the files in LFS then everything works as expected:

$ git init .
$ git lfs track "*.dat"
$ echo "test1" > upper.DAT
$ echo "test2" > lower.dat
$ git add .
$ git commit -m "add files to Git LFS"

My hunch is that the root of the problem is located in Git core.
What do you think?

@technoweenie
Contributor

My hunch is that the root of the problem is located in Git core.

That's what it looks like. git lfs track should just be writing what you send it to .gitattributes. Does it happen if you run git lfs track "*.dat" before creating and adding the files for the first time?

@larsxschneider
Member Author

@technoweenie See the last example in my report. If I track first and add later, everything works OK.

@technoweenie
Contributor

Doh, there it is. The git lfs track command runs ls-files to look for existing files to run through the git lfs clean filter.

I wonder if git ls-files is using the pattern differently?
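
One way to probe that hypothesis is to compare what git ls-files returns for the plain pattern and for the same pattern with Git's ":(icase)" pathspec magic. This is only a sketch: the two function names are hypothetical, and the behaviour may depend on your Git version and configuration.

```shell
# Hypothetical helpers for comparing pathspec matching of tracked files.
# List tracked files matching the pattern exactly as written.
tracked_exact() { git ls-files -- "$1"; }
# List tracked files matching the pattern while ignoring case.
tracked_icase() { git ls-files -- ":(icase)$1"; }

# Possible usage inside the repository from the report:
#   tracked_exact '*.dat'
#   tracked_icase '*.dat'
```

If the first call omits upper.DAT while the second lists it, the pattern is being matched case-sensitively at that step.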

@zezba9000

I just hit this bug... huge issue on Windows! Took forever to fix it. Had to uninstall Git-LFS then pull, then re-install git-lfs then pull again.

@ttaylorr ttaylorr modified the milestones: v2.0.0, v2.1.0 Feb 21, 2017
@ttaylorr ttaylorr modified the milestone: v2.1.0 Apr 4, 2017
@douglasbr

Hello. I ran into this issue when tracking binary files using LFS on Windows, but I don't think it is a bug. I was able to address the issue using glob patterns that work in .gitignore and .gitattributes files.

For example, to track *.dat files in a case-insensitive way, you could use the pattern *.[dD][aA][tT] in the .gitattributes file.

If you don't want to modify the .gitattributes file directly, you could use the command git lfs track "*.[dD][aA][tT]" (quote the pattern so the shell does not expand it before Git LFS sees it).

Either method results in the following line in the .gitattributes file: *.[dD][aA][tT] filter=lfs diff=lfs merge=lfs -text
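
Writing these bracket patterns by hand gets tedious for longer extensions. The following is a minimal sketch of a helper that builds one from an extension; the name ci_glob is hypothetical and not part of Git or Git LFS, and it assumes the extension contains no glob metacharacters.

```shell
# Hypothetical helper: build a case-insensitive glob for a file extension.
ci_glob() {
  ext=$1
  pattern='*.'
  while [ -n "$ext" ]; do
    rest=${ext#?}            # everything after the first character
    ch=${ext%"$rest"}        # the first character itself
    lower=$(printf '%s' "$ch" | tr '[:upper:]' '[:lower:]')
    upper=$(printf '%s' "$ch" | tr '[:lower:]' '[:upper:]')
    if [ "$lower" = "$upper" ]; then
      pattern="$pattern$ch"  # digits and other caseless characters stay as-is
    else
      pattern="$pattern[$lower$upper]"
    fi
    ext=$rest
  done
  printf '%s\n' "$pattern"
}
```

With this defined, git lfs track "$(ci_glob dat)" would pass the pattern *.[dD][aA][tT] to Git LFS.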

If you already have files tracked in the repository and need to apply the new .gitattributes, you can do that using the following procedure. For example, let's say you already committed a bunch of *.DAT files. You added the case-insensitive glob pattern to .gitattributes, and now you want to pick up the files that are already in the repository.

  1. git rm -r --cached *.DAT (removes all *.DAT files from the index, but not from your working directory)
  2. git commit (applies a delete for all the *.DAT files)
  3. git add . (adds all those *.DAT files back into the index, picking up the new attribute)
  4. git commit (commits the re-added files back into the repository, now tracked by LFS)
  5. git push (and you should notice that the files are now uploaded as LFS objects)
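
The steps above can be sketched as a single shell function. The function name and commit messages are hypothetical, the pathspec is quoted so that Git rather than the shell matches it, and the push is left to the caller. It assumes .gitattributes already contains the case-insensitive pattern.

```shell
# Hypothetical wrapper around steps 1-4; run it from the top of the repository.
retrack_with_lfs() {
  pathspec=$1
  # 1. Drop the matching files from the index only; the working tree is untouched.
  git rm -r -q --cached -- "$pathspec" || return 1
  # 2. Record the removal.
  git commit -q -m "Remove $pathspec from the index" || return 1
  # 3. Re-add everything so the attributes in .gitattributes are applied.
  git add . || return 1
  # 4. Commit the re-added files.
  git commit -q -m "Re-add $pathspec with current attributes" || return 1
  # 5. Left to the caller: git push
}

# Possible usage:
#   retrack_with_lfs '*.DAT' && git push
```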

@larsxschneider
Member Author

Thanks for the great summary @douglasbr ! Glob patterns are the way to go on case insensitive file systems 👍👍

I touched on a few other tricks in my Git LFS talk too (patterns start around 13:40):
https://www.youtube.com/watch?v=YQzNfb4IwEY

@ttaylorr
Contributor

ttaylorr commented Jun 8, 2018

I agree with @larsxschneider, and I think that this issue can be closed. Please don't hesitate to ping either of us if you have any trouble in the future. Thanks!

@ttaylorr ttaylorr closed this as completed Jun 8, 2018
@mloskot
Contributor

mloskot commented Nov 28, 2018

@larsxschneider Thank you for the hints and the video.

I've just hit the issue myself :)

migrate: Sorting commits: ..., done
migrate: Examining commits: 100% (164/164), done
*.zip   1.2 GB  20/20 files(s)  100%
*.txt   184 MB    3/3 files(s)  100%
*.TXT   16 MB     1/1 files(s)  100%

@leodutra-aurea-zz

leodutra-aurea-zz commented Dec 17, 2018

Question: if I run migrate import with just the missing patterns, will it override the patterns that already exist in .gitattributes (and in the rewrite itself), or will it keep the existing tracked patterns and add the new ones?

@bk2204
Member

bk2204 commented Dec 17, 2018

The existing patterns in .gitattributes will stay, and it will add the additional patterns as well.

@AnomalousUnderdog

@douglasbr If I follow the steps you mentioned, does that double the amount of space taken in the remote repository (i.e. the files stay in the commit history but are also added as duplicates to the LFS storage)?

@douglasbr

@AnomalousUnderdog To the best of my understanding, the steps I outlined do not remove the binary files from your commit history. They are only removed from the index and moved into LFS storage. So, it would double the total amount of storage (assuming you only had one version of your binaries previously committed). If you wanted to go back into your commit history and actually remove the files from your repository's history, that would take more work. How deep you go into your history would also depend on how long you have been committing binaries and which versions you are willing to delete.
