New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

recognize .gitattributes, at least "encoding" field #96

Closed
socketpair opened this Issue Oct 28, 2011 · 9 comments

Comments

Projects
None yet
3 participants
@socketpair

socketpair commented Oct 28, 2011

I have .gitattributes with this:

*.sql encoding=cp1251

So, in git GUI, "interactive add" woks out of the box, but not in git cola, as I have russian lines.
Also, git cola displays wrong symbols in these lines.

@davvid

This comment has been minimized.

Show comment
Hide comment
@davvid

davvid Oct 30, 2011

Member

Thanks mmarkk. Do you happen to have an example file I can use for testing? I'll see what we can do.

Member

davvid commented Oct 30, 2011

Thanks mmarkk. Do you happen to have an example file I can use for testing? I'll see what we can do.

@socketpair

This comment has been minimized.

Show comment
Hide comment
@socketpair

socketpair Oct 31, 2011

2011/10/30 David Aguilar
reply@reply.github.com:

Thanks mmarkk.  Do you happen to have an example file I can use for testing?  I'll see what we can do.

  1. Create git repository.
  2. add file.txt with contents:
    example 1
    example 2
    example 3
  3. add and commit it
  4. insert into middle some russian strings in encoding CP-1251
  5. try create contiguous selection containig both russian and english lines.
  6. choose "stage selected lines"
  7. look to the output of git-cola and see the mesage: "patch does not apply".

Reply to this email directly or view it on GitHub:
#96 (comment)

Segmentation fault

socketpair commented Oct 31, 2011

2011/10/30 David Aguilar
reply@reply.github.com:

Thanks mmarkk.  Do you happen to have an example file I can use for testing?  I'll see what we can do.

  1. Create git repository.
  2. add file.txt with contents:
    example 1
    example 2
    example 3
  3. add and commit it
  4. insert into middle some russian strings in encoding CP-1251
  5. try create contiguous selection containig both russian and english lines.
  6. choose "stage selected lines"
  7. look to the output of git-cola and see the mesage: "patch does not apply".

Reply to this email directly or view it on GitHub:
#96 (comment)

Segmentation fault

@davvid

This comment has been minimized.

Show comment
Hide comment
@davvid

davvid Dec 17, 2011

Member

I was hoping you could send me a URL a document I could download in CP-1251. Yes, sorry, I'm very lazy.

I have a rough idea of how we could do this. Supporting the .gitattributes parsing/rules generation is just a matter of coding. That could be kinda fun; dulwich and similar projects may have already written code we can reuse.

Once we have a way to classify paths as being affected by an attribute then we can use that encoding when writing the .patch file that we feed to git apply/apply-index. I think that should work.

mmarkk, does the diff output look correct with these files? Does git always output utf-8 or is cola also displaying the diff incorrectly too?

Sorry for so many questions. Also, are you on Linux/win32/darwin? I know the unicode stuff in msysgit isn't completely there yet, which is why I ask.

Member

davvid commented Dec 17, 2011

I was hoping you could send me a URL a document I could download in CP-1251. Yes, sorry, I'm very lazy.

I have a rough idea of how we could do this. Supporting the .gitattributes parsing/rules generation is just a matter of coding. That could be kinda fun; dulwich and similar projects may have already written code we can reuse.

Once we have a way to classify paths as being affected by an attribute then we can use that encoding when writing the .patch file that we feed to git apply/apply-index. I think that should work.

mmarkk, does the diff output look correct with these files? Does git always output utf-8 or is cola also displaying the diff incorrectly too?

Sorry for so many questions. Also, are you on Linux/win32/darwin? I know the unicode stuff in msysgit isn't completely there yet, which is why I ask.

@weinhold

This comment has been minimized.

Show comment
Hide comment
@weinhold

weinhold Apr 15, 2012

Similar issue here: openSuse 11.4, git-cola v1.7.6. Standard encoding is UTF-8, encoding for the files in question is Latin1 (ISO-8859-1). Staging hunks/lines in git-cola works in principle, but only when the hunk/lines don't contain non-ASCII characters (such a character in the context is sufficient to make things break).

Here's a simple test working directory to reproduce the issue:
http://yellowbites.com/stuff/git-cola-encoding-issue.tar.bz2

It contains a file "test" with two unstaged changes. The first contains a non-ASCII character and cannot be staged. Specifying the encoding via ".gitattributes" fixes the same issue in git-gui. Interestingly using "git add -p" on the command line always works, even without the ".gitattributes".

weinhold commented Apr 15, 2012

Similar issue here: openSuse 11.4, git-cola v1.7.6. Standard encoding is UTF-8, encoding for the files in question is Latin1 (ISO-8859-1). Staging hunks/lines in git-cola works in principle, but only when the hunk/lines don't contain non-ASCII characters (such a character in the context is sufficient to make things break).

Here's a simple test working directory to reproduce the issue:
http://yellowbites.com/stuff/git-cola-encoding-issue.tar.bz2

It contains a file "test" with two unstaged changes. The first contains a non-ASCII character and cannot be staged. Specifying the encoding via ".gitattributes" fixes the same issue in git-gui. Interestingly using "git add -p" on the command line always works, even without the ".gitattributes".

@socketpair

This comment has been minimized.

Show comment
Hide comment
@socketpair

socketpair May 26, 2012

I was hoping you could send me a URL a document I could download in CP-1251

http://dl.dropbox.com/u/4266905/cp1251.txt
And the same in utf-8:
http://dl.dropbox.com/u/4266905/utf8.txt

does the diff output look correct with these files?

Git diff show "incorrectly", as git does know nothing about source encoding

Does git always output utf-8 or is cola also displaying the diff incorrectly too?

Both display incorreclty, but: (http://linux.die.net/man/5/gitattributes):

: Viewing files in GUI tools
:
: encoding
: The value of this attribute specifies the character encoding that should be used by GUI tools (e.g. gitk(1) and git-gui(1)) to
: display the contents of the relevant file. Note >> that due to performance considerations gitk(1) does not use this attribute
: unless you manually enable per-file encodings in its options.
:
: If this attribute is not set or has an invalid value, the value of the gui.encoding configuration variable is used instead
: (See git-config(1)).

I'm still using Ubuntu 12.04 now, still the same problem.

socketpair commented May 26, 2012

I was hoping you could send me a URL a document I could download in CP-1251

http://dl.dropbox.com/u/4266905/cp1251.txt
And the same in utf-8:
http://dl.dropbox.com/u/4266905/utf8.txt

does the diff output look correct with these files?

Git diff show "incorrectly", as git does know nothing about source encoding

Does git always output utf-8 or is cola also displaying the diff incorrectly too?

Both display incorreclty, but: (http://linux.die.net/man/5/gitattributes):

: Viewing files in GUI tools
:
: encoding
: The value of this attribute specifies the character encoding that should be used by GUI tools (e.g. gitk(1) and git-gui(1)) to
: display the contents of the relevant file. Note >> that due to performance considerations gitk(1) does not use this attribute
: unless you manually enable per-file encodings in its options.
:
: If this attribute is not set or has an invalid value, the value of the gui.encoding configuration variable is used instead
: (See git-config(1)).

I'm still using Ubuntu 12.04 now, still the same problem.

@davvid

This comment has been minimized.

Show comment
Hide comment
@davvid

davvid May 27, 2012

Member

What we can do is follow gitk's lead and have a similar variable to turn on the feature if it turns out to have performance implications.

Thanks for the examples. It's certainly possible to teach git-cola about .gitattribute encodings. The easy first step was to assume utf-8 everywhere, which is how things work right now. If you have control over your environment, I would suggest using utf-8 over latin-1, but in principle, yes this needs to be fixed.

Patches and pull requests are certainly welcome! Hopefully this can be done with too many intrusive changes.

Member

davvid commented May 27, 2012

What we can do is follow gitk's lead and have a similar variable to turn on the feature if it turns out to have performance implications.

Thanks for the examples. It's certainly possible to teach git-cola about .gitattribute encodings. The easy first step was to assume utf-8 everywhere, which is how things work right now. If you have control over your environment, I would suggest using utf-8 over latin-1, but in principle, yes this needs to be fixed.

Patches and pull requests are certainly welcome! Hopefully this can be done with too many intrusive changes.

@davvid

This comment has been minimized.

Show comment
Hide comment
@davvid

davvid May 28, 2012

Member

Staging hunks/lines in git-cola works in principle, but only when the hunk/lines don't contain non-ASCII characters

That's good. That means that supporting this should only affect the patch/apply parts of the code. This is very good. I'll see what I can find today.

Member

davvid commented May 28, 2012

Staging hunks/lines in git-cola works in principle, but only when the hunk/lines don't contain non-ASCII characters

That's good. That means that supporting this should only affect the patch/apply parts of the code. This is very good. I'll see what I can find today.

davvid added a commit that referenced this issue May 29, 2012

diff: Support .gitattribute encodings when showing diffs
Teach the diff viewer to display diffs using the encoding
specified in .gitattributes.

This is the first half of the work needed to properly support
.gitattributes for file encodings as described in issue #96.

Set `cola.fileattributes` to true to enable this feature.
The diffs are now displayed correctly for non-utf8 files.

Diffs cannot yet be applied, but this is the first step.

Signed-off-by: David Aguilar <davvid@gmail.com>

@davvid davvid closed this in 11e27e3 May 29, 2012

@davvid

This comment has been minimized.

Show comment
Hide comment
@davvid

davvid May 29, 2012

Member

Thanks for those test repos! That really helped.

I'd really appreciate it if you guys could give it a whirl. If it looks good then I'll start thinking about the next git-cola release and hopefully Ubuntu can pick it up soon.

You'll need to do this in your repo to enable the feature:

git config cola.fileattributes true

The per-file attributes are checked once per cola session (for performance) so you will need to restart git-cola if you are actively making changes to .gitattributes.

Member

davvid commented May 29, 2012

Thanks for those test repos! That really helped.

I'd really appreciate it if you guys could give it a whirl. If it looks good then I'll start thinking about the next git-cola release and hopefully Ubuntu can pick it up soon.

You'll need to do this in your repo to enable the feature:

git config cola.fileattributes true

The per-file attributes are checked once per cola session (for performance) so you will need to restart git-cola if you are actively making changes to .gitattributes.

@weinhold

This comment has been minimized.

Show comment
Hide comment
@weinhold

weinhold May 31, 2012

Thanks David! Used this version for a day and everything worked as advertised. So far I haven't noticed any performance impact either, but I haven't had any really large change sets.

weinhold commented May 31, 2012

Thanks David! Used this version for a day and everything worked as advertised. So far I haven't noticed any performance impact either, but I haven't had any really large change sets.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment