commit problem with Git Gui / Windows7 / UTF-8 #761
Comments
I suspect this is due to (what appears to be) a default encoding setting in Git GUI.
NOTE: If you change this setting, it may not "fix" the author name of past commits. But future commits made by Git GUI should result in the author's name appearing correctly. It is possible to go back and fix the author name of those past commits if they don't appear correctly after changing this setting. It's a bit involved, depending on your level of knowledge about Git, but it can be done. |
Thanks fourpastmidnight, but the default file encoding was already set to utf-8. |
Full test case:
|
This is a regression, possibly caused by commit 30395c6; tag v2.8.1.windows.1 works. Commit 30395c6 should be reverted. |
@bitjo: Ah, good find. And I apologize, I missed your configuration entry for the Gui in your original post. |
So you ask me to reintroduce a bug? |
So, @bitjo, can you fix it and create a pull request? If there are tests for Git GUI, could you also write a test? |
Yes, please.
Description: A commit with the option 'Amend Last Commit' in Git Gui behaves like 'git commit --reset-author --amend' (tested on Linux using git version 2.7.3 and on Windows using git version 2.8.1). Under Windows with git version 2.8.2 or 2.8.3, the commit acts like 'git commit --amend' (without '--reset-author'). The implementations in different environments (Linux, Windows, older versions) do not behave the same. (One side effect of the change for Windows since git version 2.8.2 is that UTF-8 characters in the author's name are not handled correctly.)
Suggested solution: Revert commit 30395c6. Then the implementations in different environments (Linux, Windows, older versions) behave the same.
Further discussion: The proposal of http://article.gmane.org/gmane.comp.version-control.git/243921 can be discussed. But a user of Git Gui will generally use this function to correct the current commit, and '--reset-author' is then the correct behavior. For a 'git commit --amend' (without '--reset-author') in Git Gui, we should rather introduce a new option (like 'Amend Last Commit and preserve committer information'). Changes must be made in the master repository of Git Gui. |
I'm sorry, but I have no development environment under Windows. Therefore, I do not want to submit even an extremely simple change (it's just a 'git revert'). I would also like to write a test case, but we should open a separate issue for that discussion. |
Hmm, no, I don't think reverting that commit is the answer. That commit clearly corrects incorrect behavior in Git GUI with respect to amending a commit. Having said that, it appears (from a cursory glance) that the solution does not handle UTF-8 in the commit author fields appropriately, which would be causing the problem you're experiencing. The real fix for the problem you're experiencing, then, is handling UTF-8 appropriately in Git commit author fields. NOTE: This is the first time I'm looking at the TCL language, so I'm not real familiar with how it deals with character encodings. I see on lines 30 and 31 when they initially load the commit for amending, that they attempt to use UTF-8 as the default encoding for reading the commit blob. But when they go to actually set the various author fields starting on new line 122, there's no mention of character encoding. Again, I'm not real familiar with TCL and how it deals with character encodings. |
Why? For many years I have seen: Git Gui -> Commit Amend ...: this is 'git commit --reset-author --amend'. (For people who are not Git professionals, this is plausible behavior. Git professionals will never use Git Gui.) We cannot break with this experience in Git Gui for Windows only. The UTF-8 problem is secondary. |
The patch in question was submitted for inclusion upstream, and when it gets accepted, your desired revert would cause the rift to deepen.
Is this not the problem you want to see fixed? I do not see how the revert is a solution to that end. Instead, it will only pile up technical debt. @bitjo so, are you prepared to work on a real fix that does not regress Git GUI even further? |
I did not know that the patch had been submitted for inclusion upstream. Under these circumstances, discussing the patch in this issue is, of course, wrong. My comments above about the patch should be ignored.
I'm not familiar with TCL and the git gui implementation. |
I’ve experienced the same behavior of Git GUI as described by @bitjo. Moreover, Git GUI seems to keep cached commit information (date and wrongly encoded author) and put it to any ongoing commit (not limited to amends) done in Git GUI until I relaunch it. |
@Melebius do you have any experience with Tcl? |
@Melebius do you know who the original author is? It's not me. In fact, the original author of Git GUI was swallowed by the black hole known as Google. It will come as no surprise to you that he will never fix this. The thing is: this is Open Source. You get to use the software. Just like that. You can install it on as many machines as you want. You do not pay for it. You are never asked to support the lives of those who wrote the software. You will never contribute to their being able to pay their rent, as low as it might be. And you do not get to tell them what they should or should not do. If you are interested in seeing this fixed, and if you are prepared to do a little something to that end yourself, let me know, then I will invest some time to point you to the correct code location. |
@dscho I meant the author of the discussed commit 30395c6. It was not made a long time ago. However, I might be able to do some work on it if time permits. In the meantime, I would like to ask you to defer the incorporation of the faulty commit into upstream. You mentioned it may happen and I am not familiar with the process of submitting and reviewing patches in the Git project. |
I think that problem is in passing author name as Someone should test if changing this to something like this (line 395)
would fix this. |
This doesn't work. I wonder why nobody tagged me here, I just stumbled upon this bug report. I'll try to fix it. |
Probably because you have not associated your e-mail |
hmm... I've been trying to investigate this for 2 days. It looks like an inherent bug in TCL. I have a test script that demonstrates the problem:
The output of this is:
If I remove It looks like there's no way to pass a utf-8 string in an environment variable. Does anyone have a suggestion? |
Hi Orgad,
Maybe https://www.tcl.tk/doc/howto/i18n.html which says:
The system encoding is the character encoding used by the operating system for items such as file names and environment variables. Text files used by text editors and other applications are usually encoded in the system encoding as well, unless the application that produced them explicitly saves them in another format (for example, if you use a Shift-JIS text editor on an ISO 8859-1 system). Tcl automatically converts strings from UTF-8 format to the system encoding and vice versa whenever it communicates with the operating system. For example, Tcl automatically handles any encoding conversion needed if you execute commands ...
Aside: I just did the thing of googling for the final "string of frustration" which is often the best summary of the search question. I find that if I hit 'send' then when I read back the posted email I get a new impetus...
The quoted section looks to explain how TCL massages the i18n strings. Hope it helps
Philip
----- Original Message -----
hmm... I've been trying to investigate this for 2 days. It looks like an inherent bug in TCL. I have a test script that demonstrates the problem:
#!/usr/bin/env tclsh encoding system utf-8 Großset a [encoding convertfrom utf-8 [binary decode hex 47726fc39f]] 47726fdf # good
It looks like there's no way to pass a utf-8 string in an environment variable. Does anyone have a suggestion?
--
lmgtfy "tch pass a utf-8 string in an environment variable" ;-) |
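The conversion the quoted Tcl documentation describes can be illustrated outside Tcl. The following is a minimal Python sketch, not the Git GUI code; it assumes the system encoding is Windows-1252 (`cp1252`), which the thread does not state. It reuses the hex bytes from the quoted test script and shows that converting the decoded string to that assumed system encoding yields the byte sequence the script marks as "good".

```python
# Bytes taken from the test script quoted in the thread: "Groß" encoded as UTF-8.
utf8_bytes = bytes.fromhex("47726fc39f")

# Decode to an internal Unicode string, comparable to Tcl's
# `encoding convertfrom utf-8`.
text = utf8_bytes.decode("utf-8")

# Assumption (not stated in the thread): the system encoding is Windows-1252.
SYSTEM_ENCODING = "cp1252"

# Converting the internal string to the system encoding turns the two UTF-8
# bytes c3 9f into the single byte df.
system_bytes = text.encode(SYSTEM_ENCODING)
print(system_bytes.hex())  # 47726fdf
```

If the environment is written with the UTF-8 bytes instead of the system-encoded bytes, a consumer that expects the system encoding will see one extra character per non-ASCII letter.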
@PhilipOakley Thanks for your help. I already read this documentation, that's why I tried to use Still, this doesn't work as expected. I tried many permutations of encodings and conversions. If you're able to find a way that works, share it please. The expected output for the last line is:
|
@patthoyts any ideas? |
Okay, after a couple experiments and a couple of web searches, I came to the conclusion that one should never set the system encoding. Period. It simply does not do what you expect it to. From https://www.tcl.tk/doc/howto/i18n.html:
If you change the system encoding in Tcl, you do not change the system encoding, just Tcl's idea of it. Not a good idea. It also appears that the results are different depending on whether we run in Mintty or in a Win32 console... My guess is that we should make sure that the environment variables are set using |
@dscho, having been involved in Tcl/Tk development some time ago, I concur that you should never touch Basically, Tcl's idea of string processing is quite modern:
|
Commit 7e71adc fixes a problem with git-gui failing to pick up the original author identity during a commit --amend operation. However, the new author details then become persistent for the remainder of the session. This commit fixes this by ensuring the environment variables are reset and the author information reset once the commit is completed. The relevant changes were reworked to reduce global variables. Signed-off-by: Pat Thoyts <patthoyts@users.sourceforge.net>
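The approach this commit message describes (set the author variables for the amend, then reset them once the commit is completed) can be sketched as follows. This is an illustrative Python sketch, not the Tcl change itself; `temporary_author` is a hypothetical helper name, and only the `GIT_AUTHOR_NAME`, `GIT_AUTHOR_EMAIL` and `GIT_AUTHOR_DATE` variable names come from Git.

```python
import os
from contextlib import contextmanager


@contextmanager
def temporary_author(name, email, date, environ=os.environ):
    """Set GIT_AUTHOR_* for one commit, then restore the previous state.

    `temporary_author` is a hypothetical helper for illustration only.
    """
    overrides = {
        "GIT_AUTHOR_NAME": name,
        "GIT_AUTHOR_EMAIL": email,
        "GIT_AUTHOR_DATE": date,
    }
    # Remember what was there before (None means "was not set").
    saved = {key: environ.get(key) for key in overrides}
    environ.update(overrides)
    try:
        yield environ
    finally:
        # Restore even if the commit step raised, so the author details
        # do not persist for the rest of the session.
        for key, old_value in saved.items():
            if old_value is None:
                environ.pop(key, None)
            else:
                environ[key] = old_value
```

Passing a plain dictionary as `environ` makes the behavior easy to check without touching the real process environment.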
This is a problem in the handling of the Windows environment in Tcl. Tcl needs to be updated to use the unicode environment on Windows. A test using an extension to try this out (tclenv) shows this should work once a patch to tcl is ready. This will need a bit of a rework in this part of Tcl as it doesn't currently do any per-platform environment initialization.
|
Or should we introduce an abstraction layer for setting environment variables in Git GUI and use |
This is just a test extension; it doesn't correctly maintain the 'env' variable linkage, so it could lead to confusion without extra work to keep things in sync. As your main concern is git-for-windows, we really want a patched or fixed version of wish in the git-for-windows distribution. I'm rather surprised this hasn't been raised before since we went all unicode in 2000. |
@Melebius : I am experiencing this too, not just with Git GUI, but also when I do rebase from Git Bash (the mintty one), and possibly elsewhere as I have not yet tracked down the exact chain of events leading to the errors in my project database (it tends to be discovered some undefined time after the fact). And it is not only wrongly encoded author, but in fact sometimes wrong author(!) This is however a critical flaw, as it means that the contents of the database cannot be trusted to show the true history. And as such it should be reported in a separate issue, as I think it will be overlooked in this one. |
@superole : Are you sure you don't have GIT_AUTHOR_NAME environment variable set? |
@orgads : yes, I am quite sure. I use the config settings user.name and user.email. There is no GIT_AUTHOR_NAME env var on my system currently, but I guess git could have set it at the time of the error and failed to clean up after. Although I fail to see any good reason why git would tamper with that variable. But as I said this is a separate, and possibly unrelated bug, and should be reported as its own issue. |
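To rule out the environment as a source of the wrong author, one can list the identity variables visible to a process. A minimal Python sketch follows; `leaked_identity_vars` is a hypothetical helper name, while the variable names are the standard Git ones, which take precedence over `user.name` and `user.email`.

```python
import os

# Environment variables Git consults for author and committer identity.
IDENTITY_VARS = (
    "GIT_AUTHOR_NAME",
    "GIT_AUTHOR_EMAIL",
    "GIT_AUTHOR_DATE",
    "GIT_COMMITTER_NAME",
    "GIT_COMMITTER_EMAIL",
    "GIT_COMMITTER_DATE",
)


def leaked_identity_vars(environ=None):
    """Return the identity variables that are set in `environ`.

    Hypothetical helper; defaults to the current process environment.
    """
    environ = os.environ if environ is None else environ
    return {key: environ[key] for key in IDENTITY_VARS if key in environ}


if __name__ == "__main__":
    for key, value in leaked_identity_vars().items():
        print(f"{key}={value!r}")
```

An empty result in the shell that launched Git GUI suggests the variables were not inherited from there, though it says nothing about what Git GUI sets internally.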
@orgads Are you sure you don't have GIT_AUTHOR_NAME environment variable set? @superole Agree. However, I haven’t filed any separate issue on that, the best matching I found was this one. Do you launch Git GUI from Git Bash? I usually launch Git GUI directly using Explorer’s context menu and run Git Bash (or even Git in |
@Melebius : yes, I normally launch it as a background process from git bash ( Aha, so Git GUI may have altered my bash environment, thereby messing up my CLI operations... |
@orgads : patthoyts commit cfe616b amends your pull request to the git-gui project to fix this
should this fix not also be included here? |
The fix, as I understand it, should be in the next G4W (the aforementioned 'here'?) release. The cascade from the git-gui project to the Git project to the G4W project does take a finite time ;-) Philip |
Yes, and it is very possible to help. For example by checking out the latest of the prereleases at https://github.com/git-for-windows/git/releases/ whether they have the fixes, and by preparing appropriate PRs if they do not. |
FYI this issue reproduces for me on Linux. Git gui "amend" breaks Unicode characters in the Author field. git-gui version 0.20.0.44.gccc98 git version 2.11.0 |
@ilor My workaround under Windows (as root / administrator in Git-Bash): The workaround provides the file git-gui/lib/commit.tcl from Git version 2.10.2. |
I have the same as @ilor with git-gui version 0.21.GITGUI and git version 2.14.1 on Ubuntu Linux. Relevant bits of my ~/.gitconfig:
|
As it has been determined that the issue is not Windows-specific, I'll close this ticket, suggesting to move the discussion to the Git mailing list (no HTML, not even alternate part, just plain text, otherwise the mail will be rejected). |
Setup
$ git --version
git version 2.8.2.windows.1
Windows7 64bit
defaults
user.name in ~/.config contains a 'german double s' / 'sharp S' / 'ß'.
Relevant entries in ~/.config:
[user]
email = ...
name = ...ß...
[gui]
spellingdictionary = de_DE
encoding = utf-8
gcwarning = false
Details
Bash
Minimal, Complete, and Verifiable example
A commit in Git Bash works fine.
A commit in Git Gui translates the author name from '...ß...' to '...Ã??...'. The date of the change is also wrong.
Correct author name and time stamp in the repository.
A commit with Git Gui should have the same result as the command 'git commit' in Git Bash.
Git Gui destroys the author name and date.
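One plausible mechanism for the leading 'Ã' in the damaged name is that the UTF-8 bytes of 'ß' are read back one byte per character in a single-byte encoding. The Python sketch below illustrates this under the assumption that the single-byte encoding is Windows-1252; the name fragment is a placeholder, and the exact characters that follow 'Ã' depend on the encoding and terminal actually involved.

```python
# Placeholder name fragment containing the 'sharp S' (U+00DF).
name = "Gro\u00df"

# UTF-8 stores 'ß' as two bytes: c3 9f.
utf8_bytes = name.encode("utf-8")

# Assumption: the reader decodes each byte as one Windows-1252 character,
# so c3 becomes 'Ã' and 9f becomes a second, unrelated character.
misread = utf8_bytes.decode("cp1252")
print(ascii(misread))

# The damage is reversible as long as no byte was replaced along the way.
repaired = misread.encode("cp1252").decode("utf-8")
```

Once a tool substitutes '?' for a character it cannot display or store, the original bytes are lost and the name can no longer be recovered this way.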
The problem occurs in all repositories.
I have been using some repositories for ten years.