Git does not see that file in working directory differs from HEAD #312
Comments
I just manually unzipped your DixiLink2.zip and opened a Git Bash in the created directory:
As you can see, the file shows up as modified for me. EDIT: But interestingly, I can reproduce the issue with your script. I've compared the full directory contents from manually unzipping DixiLink2.zip vs. your script, and although they look equal, the Git statuses differ. Indeed odd. |
I think it depends on which unzip tool you use. If possible, please try again unpacking the zip file with Info-ZIP v3.0. What's relevant here is that the timestamps of the files are preserved. |
I noticed there is an off-by-one difference in the seconds of the creation time when the archive is unpacked via "unzip" vs "WinRAR". That is one thing. On the other hand, the issue disappears after deleting The fact that the issue disappears when using "WinRAR" instead of "unzip" makes me believe that the bug is in unzip rather than in Git. Did this issue already show up in the original repo before zipping / unzipping it? |
Yes the issue showed up in the original repo. (And of course also when pulling the repo directory over to another PC over the LAN.) |
I think Git should not cache file timestamps, or at least there should be an option to turn it off. |
Git is very conservative when looking at timestamps (it does not even trust the index if it has the very same timestamp as the file it is looking at), so there must be something else going wrong. |
I also always thought that Git is only tracking file contents, and nothing else. But meanwhile Hannes replied to the corresponding mailing list thread saying something different ... |
Now the fog is clearing. In my case vss2git visits every file in VSS, rewinds the deltas back to the origin, then starts replaying the history and adds and commits every version to Git. It will do so very quickly. Knowing that the precision of timestamps on some Windows filesystems (FAT for example) is 1 or 2 seconds, and knowing that some files (like icon bitmaps) don't change size in their history, it is hard to avoid clashes without inspecting file contents. You can expect other problems because of the different ways timezones and daylight saving time are handled across filesystems. See my latest post on the https://groups.google.com/forum/#!topic/msysgit/6XLoSPH26kc thread. So please, dear developers, please let Git only track file contents (reliably tracked by SHA-1 hashes) and nothing else. If there is a performance penalty, so be it. Any shortcut that comes with a reliability penalty should have a switch to turn it off! I am not familiar enough with the internals of the Git index to really understand what is hidden there, and why the file content/hash inspection has been bypassed. But I know the basics of both the Windows and Unix filesystems, so I will follow any discussion here about the details with interest. I am willing to test any fix; the problem is fully reproducible here. |
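The clash described above can be illustrated with a small sketch. It assumes a 2-second timestamp granularity (as FAT uses for write times) and a stat-only comparison of size and mtime. The helper names and the example size are hypothetical and are not taken from Git or vss2git.

```python
def fat_mtime(seconds):
    """Round a timestamp down to a 2-second granularity (assumed, FAT-like)."""
    return int(seconds) // 2 * 2


def stat_signature(size, mtime):
    """What a stat-only comparison sees: size plus the stored (coarse) mtime."""
    return (size, fat_mtime(mtime))


# Two revisions of a same-sized file written less than two seconds apart.
v1 = stat_signature(size=766, mtime=1_000_000_000.2)
v2 = stat_signature(size=766, mtime=1_000_000_001.9)

# The content may differ, but the stat data alone cannot tell them apart.
print(v1 == v2)
```

Any tool that replays history quickly onto such a filesystem can hit this window, which is why same-sized binary files are the likely victims.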
Looks like the old "racy git" problem? Failed workaround: (as this said) Would it be possible to implement inode numbers via the Windows MFT? |
The screenshot from @YueLinHo shows read-cache.c, for all who don't know Git's code by heart. Regarding nanosecond timestamps: on Windows we have timestamps with at most 100ns resolution according to https://msdn.microsoft.com/en-us/library/windows/desktop/ms724290%28v=vs.85%29.aspx. If I understand the code correctly we also do not use USE_NSEC in msysgit/git-for-windows-sdk. |
@j66st I suppose there is a workaround for your vss2git: insert a git reset --mixed command after the checkout from VSS and before git commit. @t-b I guess the root cause is that Git for Windows lacks some file system features. See the basic logic in match_stat_data() first. Git for Windows has its own lstat(), and st_ino, st_gid, st_uid are always 0 (see do_lstat()). I also recall issue #229; not sure if it has the same root cause. BTW, I guess such problems will come up often, because of SSDs. |
In your Demo.zip, the dixilinkerr.h file has a modification time of 2009-09-17, and the .git/index file has a modification time of 2015-01-23. Yet you say that the file was changed after committing version 2. So it seems your VSS conversion tool messes up modification timestamps. Make sure it doesn't do that and everything should work as expected. Btw. I don't think this is a Windows-specific problem (or even Git-specific). Playing around with file times will fool Git on Linux as well (as outlined in racy-git.txt) as well as most backup software. |
@kblees Thank you. That makes me think more deeply. |
@t-b: Re 100 ns timestamp resolution: that's only in theory. The clock ticks at a slower pace and a file system like FAT can only store seconds. Re changing code: I don't have an environment right now to rebuild git for Windows, nor the time to dive into the code, so I leave that to others. I really wished that Git could preserve timestamps too! I'm so used to seeing a file's age in the directory listing and being able to sort out which files were changed this week, etc.. Everyone moving from VSS to Git (or TFS, for that matter) is complaining about the loss of this information. I changed the vss2git tool to keep this information at least in the commit message. @YueLinHo Using git reset --mixed after extracting the file and before the git add is not a workaround. I tried it, Git still does not see the file as changed. But just touching the file |
At least the rebuilding part we have made pretty easy by now. Simply download the latest snapshot of the Git for Windows SDK, run the "Git Development Environment" from the Start Menu, and do
For reference, here's the FAQ entry why it does not. However, I agree there could be an option to still do so. |
@j66st
|
I hope you understand that a statement like this seems very arrogant because you expect others to spend time helping you but do not see fit to set aside time yourself.
The mtime of a file is the time when it was last modified in this particular spot on the file system. That is why a copy of a file dated January 1st 1980 will get a new time stamp, even if the content technically is really that old. That is the way Git treats timestamps. You probably can convince VSS to update files' mtime to the appropriate value, too, i.e. reflecting the time when it wrote the file (as opposed to the time when the file with those contents entered the repository) and that should fix your woes. In any case, it is well established by now that the problem is not a bug in Git for Windows, it does exactly what it is designed to do. Therefore I close this ticket. |
@dscho:
These words also reflect my opinion. So I thought this was something to be fixed urgently and that it would be wise to humbly leave this work to the developers who are fully familiar with the codebase and who must be able to fix this in hours, rather than weeks. I am willing to discuss the subject, to assist with my knowledge of filesystems and VSS, and to assist in testing, that's all that would make sense for now, I guess. Anyway, glad that I did not yet invest more time in coding a solution. I don't know about the hierarchy of the people contributing in this project, but you must be the authority here if you can decide on your own that this strange behavior is by design and does not need further action. So if I would develop a fix it wouldn't get your approval anyway. Now I know enough to find a suitable workaround to redo my migration project. But I think the least one could do then is adding a section to the FAQ and to the manpages to explain and warn for this surprising behavior as shown in my test script. Maybe I should spend time to do that and add my demo-case as an example. I truly respect people investing so much time, energy and knowledge in open source projects like this, and I apologize if anyone feels insulted by my posts, it was never my intention. It is up to the readers following this discussion to judge for themselves who seems arrogant here. |
Thanks for your good work! I am MS Visual Studio developer, I would have to set up a Gnu C build environment first, and get used to it, at this moment I don't have time for that, already lost too much time with this vss2git migration, and the show must go on. I hope, some day ...
I know the reasons. To accommodate those people who are too ignorant or too lazy to add two characters to their make command to force a rebuild-all whenever they import other people's code :-) At the cost of losing valuable file timestamps. Fine with me, we can agree to disagree here. I spent considerable time to have vss2git save the timestamps in the commit messages, I think it was worth it. |
No you don't. That's exactly what the SDK installer does for you, it's self-contained. |
Why? Touching the files will only refresh the mtime in the directory, so not an expensive operation. Vss2git bundles files having mtimes within a certain time interval like 2 minutes, so normally only a dozen files or so per commit. And vss2git is a batch process, it can run overnight, I don't care about a few minutes more.
Yes, this works (just tried it with my Demo-setup, you can also try it yourself if you like):
Yes, that is also a viable solution (didn't know that it is fine to delete the index file).
Your workarounds will also cost some time (more than touch, I guess). Anyway, I only need to apply this workaround in those cases that git add fails, so no problem. Thanks for your ideas! |
May I take it back? :)
Agree that.
What if there are 2 changed files in one commit, A and B, and git status only show B is modified?
You're welcome! BTW, I am a Git user, too. |
I fear you misunderstood. Git – as Visual Studio, or make – needs to figure out which files have changed since the last time it saw them. The reason for Git – as for Visual Studio, or make – is that it would be prohibitively costly to just handle all the files again. Visual Studio would have to make a full build otherwise, even if all you did was to fix a typo. And Git would have to re-checksum all the files again – a really wasteful way to handle your files. That is why Visual Studio, and make, and Git, and everybody else relies on the mtime as a means to say "this is when the file was last modified here". By the way, that is exactly the same as with backup systems, because backup systems are not intended to synchronize data between two locations. They are intended to reconstruct the exact state of a given working directory as on that machine. That is why they preserve mtime, so that tools like Visual Studio, make – or Git – can say: "Oh, I see, git.c is older than git.o, I guess I do not have to rebuild it". It is important to realize, and I encourage you to wrap your head around that concept, too, that the time when a given file revision entered the repository is a very different beast from the mtime of the file of the reconstructed revision. The puzzlement you saw was of some developers who thought that Git would do something different than what the tools expect, namely to play games with the mtime (i.e. change it to something that does not reflect when that file, in that location, was last modified). But that puzzlement went away when it was clarified that Git does exactly what it is expected to: record source code revisions (and attach a timestamp to the entire revision instead of recording individual mtimes which it is not supposed to).
No, touching is not. But the consequences are. You basically add a lot of churn for the tools that are now fooled into believing that they did not see the most recent contents of the files. You will cause complete builds – which can easily last several hours for complex C++ projects, something that you should not want to cause – or in the case of Git a full reindexing – which can still last several minutes if you have a large project, and doing so for every single revision you want to import will unduly slow down the entire process.
Well, I, in contrast, think it unrealistic to expect others who do not share your particular problem to solve it for you, in particular when the faulty bit – the wrong mtime modification of VSS – has been identified already and you have everything you need in hand to solve the problem yourself. And I need to protect the time of the Git for Windows developers: we have a couple of challenges on hand that we need to solve to make Git for Windows a better user experience (and fixing vss2git is not our concern, so it is also a bit unfair to distract us from our very own problems, unless you help us with our problems in return, something that you indicated to be unlikely because of your lack of knowledge of the code base that you did not intend to change). There has been substantial help you received from developers who could not spend that time on Git for Windows as a consequence (and the solution to this problem was outlined as much as people not using VSS could), therefore I closed the issue. I encourage you to heed the advice – ask VSS to give you a list of files it updated, then touch that set of files, in In closing: you are welcome for all the help you received. |
If I only came here to have a problem with vss2git fixed, you would be right. But I think you are missing my point. To summarize: I discovered failures of vss2git to transfer certain files to Git. That boiled down to git refusing to add a file that was visibly different from the latest revision in the repository. I was surprised to see this happen and after seeing that this only happens if the file size matches, I suspected some kind of bypassing reality for performance reasons. I could not believe this, since Git is designed to be tracking contents and nothing else. So I decided to file an issue here. Git developers tend to think from a Unix/POSIX/Linux perspective, that's why the index reflects the Unix file attributes, of which most components do not even exist on Windows. They treated mtimes the way they are commonly used in Unix. But this differs from the Windows environment, where the modification time of a file has traditionally a different meaning. That is easily shown by looking at the basic copy commands: Unix' cp touches the mdate; Windows' COPY or XCOPY does not! This behavior is reflected by many file management tools in each of these environments (file managers, archiving tools, version control systems, backup programs etc.). Both approaches have their advantages. (I think that historically the meaning of ctime and mtime has been confused, but that's another discussion.) You cannot deny this different approach, and you cannot force your Unix view onto the Windows world.
I don't need an explanation of how mdates affect build tooling. I am an experienced C++ developer, I grew up in Unix before Visual Studio even existed. There are different workflows possible, and in large projects you will always have to decide intelligently whether to force a full rebuild. So, again, this is not a vss2git-only problem! The assumption that equal file size and mtime means that a file is unaltered is going to cause problems under Windows (more likely than under Unix). I cannot estimate how often it will occur in the future, but the hard fact is that Git sometimes ignores diffs and that this may cause loss of files. I am having a hard time "selling" Git to my team who are used to VSS, because (a) from now on they will lose their valuable mtime information, and (b) since yesterday they don't trust that Git will take good care of their data! So I think that there should be a command line option (and a config setting) to opt for reliability over speed, which can force file inspection and rehashing over mtime-based assumptions. I don't think I am the only one. I would welcome a discussion about this among Windows-hosted developers. I think it would not cost much time to implement this. It may be as simple as just forcing a function to return TRUE when told so by a config flag. If I could complete it myself within a day I would do so and offer it to the community. (I am also willing to offer a day of my time in exchange for someone else doing it.)
Git would need to do so ONLY if size and mtime match, just to be sure. Even if a general git diff or git status would not see the changes, these commands should have a simple option to look more thoroughly.
I think an efficient touch tool under Windows should do just that. In closing: don't bother about my vss2git problem, it has been solved, thanks to all who contributed here. Instead, care about 100% reliability of Git in a Windows environment, in workflows that Windows developers are used to. (Where in the Git manuals is it specified that git diff cannot be trusted under certain circumstances? Under Windows this is not just a "racy" problem!) |
^_^ @j66st
IOW, making So, no need to touch files. Just attack one. Assume git want to support the option, maybe "--no-trust-mtime-of-index-file", I guess the modification to the above code would be
(This code is based on Git 1.9.0.) |
@YueLinHo Indeed, zeroing the mtime in the index is a way to force considering the file racy without changing git itself. Normally the index will contain more entries, we would have to walk the list and check every file's size and clear the mtime only if the size is equal. The code change you propose does indeed hit the right spot, IMO. You would also have to include the USE_NSEC case:
What is left to do then is to wire this option flag to a command-line switch for all relevant commands (I don't know which of the many non-porcelain commands are involved) and to a config setting. |
There are 3 kinds of mtime. So, the "attack" does not need to walk all the entries' data.
Actually, I read them first, |
Aha, you mean not altering the index file contents, but just altering the mtime in the .git directory, something like:
? |
The file modification time is updated by the operating system on a successful write() (POSIX) or WriteFile() (Windows) system call. Both POSIX and Win32 documentation are quite clear on this. You are obviously looking for a time stamp representing the last logical content change by some user (or VSS check-in time or whatever). However, there is no field with such a meaning in file system stat data (neither on Windows nor POSIX/UNIX). Just that VSS abuses the modification time for this doesn't make it correct or even smart to do so, as it may interfere with a variety of software that expects the documented behaviour. So please kindly allow me to use terms like "mess up" or "abuse" for this rogue behaviour of VSS.
Why not use the commit date (
Clearing cached work tree stat data is as simple as However, it naturally comes with a huge performance penalty (> factor 100), e.g. with the WebKit repo (~200k files, ~2GB data):
From these numbers, it's pretty obvious that it's just not feasible to have Git look at the content of files that haven't changed. Adding a command line switch or config option to ignore cached stat data would have to be discussed in the upstream mailing list, but given previous discussions about persisting / restoring mtime, I doubt that such a change would be accepted. |
@j66st I've triggered a new snapshot build on our CI, which according to the installer test works fine (modulo the mentioned |
As Git did not detect a change in create date either, it seems that VSS just overwrites the file (rather than delete + recreate). So tracking the FileIndex as inode number wouldn't have helped. AFAIK, only NTFS and ReFS have stable FileIndexes, it makes no sense to track the (generated) FileIndex of FAT volumes or network file systems. ...and _ino_t is just 16 bit on Windows, so we cannot store 64 bit (or 128 bit for ReFS) anyway. |
Yes, but the CopyFile Windows system call preserves file attributes, including mtime. Clearly documented. Most file managers (Windows Explorer, Total Commander, etc.), unzippers, backup-restore tools will do so. If you consider VSS a tool to restore an old version, it is correct to also restore the file attributes. But this is configurable. Vss2git, though, does not use VSS, it replays history itself by directly accessing the VSS repository at the file level. But forget about VSS and vss2git. That problem has been solved. I am worried that this problem might occur in daily use of Git in a Windows context, where copying files from flash drives, network drives, ZIP archives always preserves/restores the original mtime. I want an option to bypass the Git-assumption that it thinks to "know" files which are older than the index. This option may then be off by default, so no overhead at all for users who want to accept the small risk.
I do so. But the commit date may differ considerably from the file's mtime, and the mtime differs between files. So I let vss2git in addition save a listing of file mtimes in the commit message. For the future (as soon as all team members want to use Git) the commit timestamps may be indeed sufficient, although loss of the real mtimes is still a pain.
I proposed to check only files of which the size has not been changed. This is normally only a small percentage of the files, and thus the performance overhead would hardly be noticeable.
I would appreciate if someone can point me to relevant discussions about mtimes and Git in a Windows environment, to enlighten me; apart from my vss2git problems I still can't estimate how big a risk ignoring this issue would be for the future. |
Huh, are you saying that files which have changed their size are the common case...?! |
Source code (text) files which are edited most of the time will get a different size. Binary files like icon bitmaps normally keep the same size. |
You typically only work on a small portion of a project, and the majority of files are unchanged (i.e. same size, modification time, creation time and content). Always looking at the content of those unchanged files has quite noticeable performance impact, as measured above (0.7s vs. 95s).
This is what Git already does: if the size is different, Git knows for sure that the content changed (without even looking). But it seems you're contradicting yourself here? Do you want to check files if the size has not been changed, or if the size is different? |
@kblees Sorry for the confusion! I edited a mistake in my message a minute after posting, but you apparently got the text before that change. Please read my last paragraph again: So by calling "git add" for a file the caller indicates that the file is added or modified; Git should then first compare the file size, only if the size is unchanged the index should not be trusted (because of Git's caching habits) and the file must be inspected. So only added files have to be inspected, all "sleeping" files are left alone. |
@kblees thanks for the pointers regarding FileIndexes. |
Can we agree on a minimal test case which shows the "unwanted" behaviour? @j66st Do you think the following should result in git saying that file.txt was modified?
The fancy ed line is just a way to change the file without changing its inode number. It gives:
|
The discussion is now branching, but I hope you keep focus on the user's POV. So please step back for a minute and listen to this user story: " I ran into the following problem with Git: I wanted to update a file of which a version is already in the repository. So I run the command "git add myfile". Looks OK, no error or warning returned. After adding some more files I do a "git commit". I get the normal success response. A few days later a build behaves strangely: it turns out that myfile was never updated! This smells like Git is taking a shortcut and does not actually read myfile. After discussing this with a long-time Git user I decided to make a simple repro case and submit a bug report. Got surprised reactions even from some Git developers. The fog started clearing as the "racy git problem" was brought in. Then the Product Master came in. From his throne in Linux Heaven he looked down to me, humble noob in Windows Hell. He said that such things only happen in Windows where sinners think they can turn back time. But it would be unlikely to occur anyway, and we have to live with it. After preaching extensively about how mtimes affect build tools He concluded with something like "Thou Shalt Not Play With Mtimes" and "This Is How Git Is Designed", arrogantly made clear that I already wasted too much of his and his disciples' valuable time. He pointed His Thumb down and closed the issue. Then after a short silence slowly the mumbling began in the forum. Some good ideas bubbled to the surface. Several people offered help. Am I, humble Git noob, asking too much? If I ask "git add myfile" isn't it clear what I want? Is it unlikely that the file has been modified? Is there ANY excuse (apart from a hash collision) for silently NOT updating my file if it is different? So I think "git add" should always read the file(s) that I want to add before concluding that they are unchanged. So if I am alone with my "woes", I will have to find my own solution. 
Git is open source, so I can make my private fix. If other Git users feel affected, then maybe we can join forces to find a suitable solution (like an option --i-confess-i-am-a-sinner-so-please-ignore-mtime) in a way that does not itch too much in heaven and makes hell still a place worth living in. " OK guys, thanks for your time, back to work. P.S.: Yes, I know I can go for a full refund. |
I cannot test it with the ed line (ed is not included in msysgit) but if I replace that line with
then I get similar output under Windows. |
@j66st If my test case shows what is, in your eyes, unwanted behaviour, then this is upstream Git's behaviour. This behaviour is a bigger problem on Windows as we only have time stamps with second precision and don't include the inode comparison check. Both things could be changed on Windows if someone volunteers to do it or to pay for it. |
^_^ Nice story. @j66st you are a good story teller. |
Which is not true, CopyFile always sets the archive attribute.
The MSDN documentation doesn't mention file times at all (apart from community additions), so it may change at any time. The current behaviour of copying just mtime (not ctime/atime) is completely broken, as it creates files that are last modified before they were created, which makes no sense at all. AFAICT, the behaviour is also file system specific (e.g. last accessed time behaves differently on FAT and NTFS if NtfsDisableLastAccessUpdate=1 in the registry).
Even if CopyFile preserves mtime, you still need a file of same size and mtime but different content. Which IMO is very unlikely to happen in a normal development workflow. E.g. if you copy a file to the network, edit it there and copy it back, the mtime will have changed. Apart from vss2git, the only currently known way to reproduce the problem requires a special Unix tool (
You're forgetting that most git commands support wildcards / pathspecs. So there is good reason to check stat data first, otherwise So, dear humble Git noob, if you took such great care to produce a file with same size, mtime and ctime, yet different content, is it too much to ask to just |
The one second granularity is due to a former POSIX limitation of the stat structure (changed in 2013 to include nanoseconds). NTFS time stamp resolution is 100ns. Tracking file times with nanosecond precision would make the "same mtime by chance" case even more unlikely than it already is. It won't fix the "reset mtime on purpose" case, though.
Tracking the inode number just helps detecting criss-cross renames. AFAICT it won't affect any of the problems discussed here. That being said, implementing nanosecond precision is reasonably trivial and won't hurt performance, so I gave it a shot. |
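To make the 100 ns resolution mentioned above concrete, here is a minimal sketch of how a Windows FILETIME value (100 ns ticks since 1601-01-01 UTC) maps onto the seconds-plus-nanoseconds pair used by the POSIX stat structure. The function name is hypothetical; this is not the code from the patch mentioned in this comment.

```python
# Number of 100 ns ticks between 1601-01-01 and 1970-01-01 (both UTC).
EPOCH_DIFF_100NS = 116_444_736_000_000_000
TICKS_PER_SECOND = 10_000_000


def filetime_to_unix(filetime):
    """Split a FILETIME tick count into (unix_seconds, nanoseconds)."""
    ticks = filetime - EPOCH_DIFF_100NS
    seconds, remainder = divmod(ticks, TICKS_PER_SECOND)
    return seconds, remainder * 100


# One second and one tick after the Unix epoch.
print(filetime_to_unix(EPOCH_DIFF_100NS + TICKS_PER_SECOND + 1))
```

Keeping the remainder instead of discarding it is what narrows the "same mtime by chance" window from one second to 100 ns.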
I know, I agree it has always been a mess. But it has been this way since the DOS days, I don't expect MS to change it. Ctimes under Windows are hardly used by anyone, I guess. It would be useful to separately keep content modification timestamp and last write timestamps. Fact is that many Windows developers have a workflow where mtime is used as an always visible content modification time. Moving to Git will change that because Git (from their POV) messes up the mtimes. Major pain was, after the vss2git misbehavior, to find out which files were lost, because we couldn't use mtime anymore to identify if the working copy of a file was the most recent. The problem typically occurred with icon bitmaps, where only a few pixels were edited, so a simple diff is useless.
I know. But if I ask to "git add ." I will accept some delay because then I am asking for inspecting every file.
I'm still not convinced that in my vss2git case there ever existed two files with same mtime and different content. I only would expect such thing to happen in a "racy" case with mtimes that are current, but then the index in my sample would contain a fresh timestamp. My VSS repository contained 3 versions of the file, with clearly distinct mtimes in the past, so different from the current time. So I still can't understand the content of the Git index in my Demo.zip sample. To find out, I will have to do a vss2git run from the debugger, break before every "git add" and take snapshots of the working directory and git repo. I'm too busy right now, but I will try that soon. I agree, doing a "touch myfile && git add myfile" would be OK. |
Assume: Assume vss2git checkout file with modification time or checkin-time Something wrong... :-/ @j66st few questions: |
@YueLinHo : Here is the export section of the vss2git log file of the session where I built the Demo.zip from. Demo.zip simply contains the resulting directory. You can see every step, all Git commands and their response have been logged.
The actual timestamps are not logged. I will do a new run to find out. |
In your .git/index file, both ctime and mtime are 0x4ab204b5 = 2009-09-17T09:43:17 (see YueLinHo's screenshot above), which seems to be the mtime of V3. From the vss2git sources (https://code.google.com/p/vss2git/source/browse/Vss2Git/GitExporter.cs#689), at least the ctime should have been that of V1 (2007-06-01... ~= 0x465f604d). This also means that when git saw the content of V2, the file already had mtime/ctime of V3, so something is seriously messing up your file times here. The timestamps you dump to the commit message seem to be correct, where do you get these from (file system or vss2git classes)? Is it possible that your 'dump file times' patch screwed things up? |
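The hexadecimal values quoted above are seconds since the Unix epoch. A short sketch for decoding them, assuming UTC; the function name is only illustrative:

```python
from datetime import datetime, timezone


def decode_index_time(seconds):
    """Format a seconds-since-epoch value from the index as an ISO-like UTC string."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")


print(decode_index_time(0x4AB204B5))  # 2009-09-17T09:43:17
```

The same helper can be applied to any other 32-bit time field read from a hex dump of .git/index.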
@kblees said:
Hmm yes. Thanks for the nanoseconds patch btw. |
The logic I realized from vanilla git code:
|
You picture it very clear in this block of pseudo-code! Yes, this is also the logic I found out, it took me quite some time before I understood the role of the cache entries, the timestamp of the index file itself in relation to the working tree.
Yes, it turned out that my patched version used a field that contains the archived mtime of the last version of the file (even during replay of older versions) to set the mtime of the reconstructed file. My intention was to have the working directory reflect the mtimes we are used to. And it did. I did not check the mtimes of the intermediary reconstructed versions, as I did not care because they are not stored by Git anyway. I now understand that this caused the "racy" condition which confuses Git. Since I discovered that Git will mess up the mtimes in my working directory anyway with every checkout of a different branch, it makes no sense to preserve mtimes any longer. So I now changed vss2git back so that the intermediary mtimes will reflect the changeset time (also used as the commit date in Git). This is the safest bet to avoid any racy conditions, because vss2git's changesets are by definition distinct in time. vss2git is a complicated program, basically it imitates the full retrieval logic of VSS. I did not study its data structures in full depth. My major goal in patching vss2git was to better handle Shared and Branched files in the VSS repo (note these words have a different meaning in Git) because otherwise most of the history would not be transferred. And the second goal was to keep the original mtimes in the commit message, because some team members are not yet ready to change their workflow and want to see the last real modification time. To conclude:
Everyone who contributed, thanks a lot for your help and patience! |
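The interplay between cache entries and the timestamp of the index file can be sketched roughly as follows. This is a simplified illustration of the idea described in racy-git.txt, not the pseudo-code posted in this thread and not Git's actual source; all names are hypothetical, and the file size is an assumed value.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class CachedStat:
    """Stat data remembered in the index for one path (simplified)."""
    size: int
    mtime: int  # seconds since the epoch


def is_racy(entry, index_mtime):
    # The entry could have been modified after the index was written
    # if its mtime is the same as or later than the index file's mtime.
    return index_mtime <= entry.mtime


def trusts_cache(entry, index_mtime, size, mtime):
    """True when the stat check alone decides 'unchanged' and content is not read."""
    stat_unchanged = entry.size == size and entry.mtime == mtime
    return stat_unchanged and not is_racy(entry, index_mtime)


# Values modelled on the Demo.zip case: entry mtime 2009-09-17, index written 2015-01-23.
entry = CachedStat(size=1024, mtime=0x4AB204B5)  # size is an assumed value
index_mtime = int(datetime(2015, 1, 23, tzinfo=timezone.utc).timestamp())
skipped = trusts_cache(entry, index_mtime, size=1024, mtime=0x4AB204B5)
print(skipped)
```

With an old file mtime and a much newer index, the entry is not considered racy, so a same-sized file whose mtime was set back to the cached value is taken as unchanged.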
I learned what I wanted to and enjoyed this conversation with you top guys, so thank you all. ^o^d |
I have a similar case, which may belong here. Interesting, that git status doesn't show replaced changes, if the Is there a way to force git status to show changes, even if the file I tried to set core to: The following solutions makes the changed files appear again as changed: b) git read-tree HEAD But these solutions are just workarounds, not the real permanent solutions. |
By this "restoring" of the original modification time you broke the contract: the mtime should reflect the time of the latest change. You replaced something, i.e. changed the file contents. Git expects the mtime to be adjusted in that case. By painstakingly faking it back to its original value you essentially told Git: don't worry, this file has not changed since you last saw it. There is nothing Git can do to outguess you when you go out of your way to break the most fundamental promise of the mtime value. |
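The effect of such a "restore" can be reproduced outside Git with a small sketch. It overwrites a file with different content of the same size and then puts the old mtime back. The file name is hypothetical, and only size and mtime are compared here; on POSIX systems the ctime would still change.

```python
import os
import tempfile


def stat_signature(path):
    """Size and mtime, the two values a stat-only comparison relies on most."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "icon.bmp")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"AAAA")
    original = os.stat(path)
    before = stat_signature(path)

    # Replace the content with something different of the same length...
    with open(path, "wb") as f:
        f.write(b"BBBB")
    # ...and then restore the original access and modification times.
    os.utime(path, ns=(original.st_atime_ns, original.st_mtime_ns))

    after = stat_signature(path)
    with open(path, "rb") as f:
        content = f.read()

same_stat = before == after
print(same_stat, content)
```

The stat signature is identical before and after, although the bytes on disk are not; a tool that trusts mtime has no cheap way to notice.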
@webmaster33 I would not expect that Git would ever want to try and detect a content change, without mtime (etc) stat change, and try to show it as 'changed', i.e. unstaged. I'd hope it could allow a forced add (but someone [you?] would have to code it) to allow these special cases where you/your code already knows that the file has changed and the update can be forced so that the 'unstaged' indication would never show itself. (note there is a separation / split of concerns, so the suggested solution changed) The main use case here appears to be to shift data from one version control system to another, and retain a coarse mtime value held by the old system (probably as a text field, as git does not itself record it) when recreating revisions in the git system, and what is wanted is a "this is what it is, write it, blinkers-on" approach to copying the data across. [As I type this I realise there maybe some low level blob writing plumbing action that I can't remember the name of that does this (e.g. git-hash-object etc.), the manual is quite a Full manual, so worth a read] |
My memo: |
From: Chapter 10 of Pro Git 2 Book Thanks, that's useful. Let's hope it helps the OP with the macro/script for the transfer from VSS... yeah. Philip |
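For reference, the content check that the stat shortcut avoids amounts to recomputing the blob's object ID, which is what git hash-object prints. A minimal sketch of that computation for SHA-1 repositories; the function name is illustrative:

```python
import hashlib


def git_blob_id(content):
    """SHA-1 object ID of a blob: hash of the header 'blob <size>' + NUL + content."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


# The example used in Pro Git, chapter 10 (echo 'test content' | git hash-object --stdin).
print(git_blob_id(b"test content\n"))  # d670460b4b4aece5915caf5c68d12f560a9fe3e4
```

Every byte of the file has to be read to compute this, which is why doing it for all unchanged files is far slower than comparing stat data.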
I found a strange problem in msysGit, I am wondering if it's a bug.
I already discussed it here:
https://groups.google.com/forum/#!topic/msysgit/6XLoSPH26kc
You can download my Demo.zip attachment there.
It seems not to happen in Git for Linux or OS-X.
The problem occurs in msysGit 1.9.0 and 1.9.5 (running x86 version on a Windows 7 x64 system).
Essentially, the problem seems to be that Git for Windows assumes that two files are the same when both the timestamp and the file size match. Apparently the file contents are not inspected, nor is the hash recalculated.
As mentioned I created a minimal demo package to easily prove the issue. Simply download the zip file, put it in a clean directory and run the enclosed script from bash. Below is a transcript of what happens when I run the script in my situation.
The issue is a real show-stopper in my automated migration from Visual SourceSafe to Git (in a test transferring 20,000 source files the problem caused loss of roughly 0.1% of the files from the history), so I hope for a quick solution.