Buggy support for Chinese characters #2536

Closed
Strongc opened this issue Feb 1, 2016 · 22 comments
Labels
💊 bug Something isn't working 🙇‍♂️ help wanted Need your help

Comments

@Strongc

Strongc commented Feb 1, 2016

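1. In a repository imported via hggit, if a directory contains files with Chinese names, opening that directory returns a 500 error.
If the root directory contains a Chinese-named directory or file, the repository home page returns 500.
https://pypi.python.org/pypi/hg-git (hggit version 0.84)
TortoiseHg version 3.62
Gogs uses SQLite as its database; the host runs Windows 8.
The client is Google Chrome on Windows 10.

In testing, repositories that return 500 can still be cloned.
However, the Chinese file and directory names in the clone are already garbled.

2. On the repository home page, the Chinese text in Readme.txt is garbled.
Comparing the two, I found that the Chinese in commits is UTF-8, while the Chinese in the files is GBK or GB18030.

3. When browsing files, the Chinese comments in them are garbled; trying encodings such as GBK and GB18030 does not display them correctly either.
The Chinese in the cloned files is still correct, though.

4. Chinese repository names are not supported, whereas bitbucket.org supports them.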
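
The comparison in point 2 can be reproduced in a few lines. This is a minimal sketch, not Gogs code: the function name `decode_text` is hypothetical, and the fallback assumes that any input that is not valid UTF-8 is GB18030, which fits the files in this report but does not hold in general.

```python
from typing import Tuple


def decode_text(data: bytes) -> Tuple[str, str]:
    """Return (text, encoding), trying strict UTF-8 before GB18030."""
    try:
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Assumption for this sketch: non-UTF-8 input is GB18030,
        # which is a superset of GBK and GB2312.
        return data.decode("gb18030"), "gb18030"


commit_message = "中文".encode("utf-8")   # commit text, stored as UTF-8
file_content = "中文".encode("gb18030")   # file text, stored as GB18030

print(decode_text(commit_message))
print(decode_text(file_content))
```

Trying UTF-8 first is a common ordering because multi-byte GB18030 text is rarely valid UTF-8 by accident, while pure ASCII decodes identically under both.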

@unknwon
Member

unknwon commented Feb 1, 2016

Thanks for your feedback!

What is your Gogs version and can you reproduce this on https://try.gogs.io?

@unknwon unknwon added the status: needs feedback Tell me more about it label Feb 1, 2016
@unknwon unknwon added this to the 0.9.0 milestone Feb 1, 2016
@Strongc
Author

Strongc commented Feb 2, 2016

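The version I am using is: © 2015 Gogs, current version: 0.8.10.1217

The test repository has been uploaded to https://try.gogs.io/cztest/test1.git
The current version there is 0.8.26.0201

To test, I first created an hg repository with TortoiseHg 3.62,
added files with Chinese names, a Readme.txt containing Chinese, and so on,
then pushed it to https://try.gogs.io/cztest/test1.git via hg-git.

The repository on try.gogs.io does not return 500 for Chinese file or path names, so an upgrade may have fixed that problem.
However, the Chinese file names are still garbled, and they are still wrong after downloading.

Also, the Chinese in Readme.Txt in the home page preview is still garbled.
The raw file looks fine.

Pushing to the Gogs repository with git handles Chinese path names correctly.
The directory "中文目录" ("Chinese directory") was committed with git.

If the repository is pulled back into hg via hg-git, this "中文目录" directory comes out garbled.

So it is most likely that hg-git mishandles Chinese path names, which garbles Chinese path and file names on both upload and download.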

@unknwon unknwon added 💊 bug Something isn't working 🙇‍♂️ help wanted Need your help and removed status: needs feedback Tell me more about it labels Feb 2, 2016
@unknwon unknwon removed this from the 0.9.0 milestone Feb 2, 2016
@unknwon
Member

unknwon commented Feb 2, 2016

Thanks... You may want to set a good value for this config option.

@Strongc
Author

Strongc commented Feb 2, 2016

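The Chinese in https://try.gogs.io/cztest/test1/raw/master/main.c displays as garbled text; the downloaded file is fine.
The Chinese in https://try.gogs.io/cztest/test1/raw/master/Readme.txt displays correctly.

This shows that previews are handled inconsistently for different file extensions.
Browser: Google Chrome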

@unknwon
Member

unknwon commented Feb 2, 2016

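The Chinese in https://try.gogs.io/cztest/test1/raw/master/main.c displays as garbled text; the downloaded file is fine.
The Chinese in https://try.gogs.io/cztest/test1/raw/master/Readme.txt displays correctly.

This shows that previews are handled inconsistently for different file extensions.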

Thanks... but I don't see any valuable points here.

And, please do not skip my comments.

@Strongc
Author

Strongc commented Feb 2, 2016

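I set the following in app.ini:
ANSI_CHARSET = GB18030
and updated Gogs to 0.8.25.0129.

Browsing the repository that previously returned 500, the problem remains.

I pushed again with TortoiseHg to a new repository, and the problem is still there.
Browsing paths that contain Chinese directory or file names still shows the 500 page.

Upgrading to the same version as try.gogs.io might fix it, but that requires compiling; I will try again.

For the repository pushed to https://github.com/Strongc/test222.git:
The home page preview of Readme.txt displays the Chinese correctly.
The Chinese in https://github.com/Strongc/test222/blob/master/main.c is garbled, the same as on Gogs.
Changing the browser character set does not fix the display either.
My guess is that these problems are related to hg-git; I will look into hg-git further.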

@unknwon
Member

unknwon commented Feb 2, 2016

Browsing the repository that previously returned 500, the problem remains.

What is the error log for 500?

@unknwon unknwon added the status: needs feedback Tell me more about it label Feb 2, 2016
@Strongc
Author

Strongc commented Feb 2, 2016

It just shows a big 500.
This happens whenever the path contains Chinese or includes a Chinese file name.

The problem does not occur on try.gogs.io.

@unknwon
Member

unknwon commented Feb 2, 2016

Please check log on the server side.

@Strongc
Author

Strongc commented Feb 2, 2016

The errors all look like this:
2016/02/02 10:24:18 [...routers/repo/view.go:134 Home()] [E] GetCommitsInfo: GetCommitByPath (/发布): Length must be 40:
2016/02/02 10:24:26 [...routers/repo/view.go:134 Home()] [E] GetCommitsInfo: GetCommitByPath (/opc消息指令协议.txt): Length must be 40:

[...routers/repo/view.go:134 Home()] [E] GetCommitsInfo: GetCommitByPath (doc//说明 2.docx): Length must be 40:
2016/02/02 11:11:38 [.../runtime/asm_amd64.s:437 call32()] [E] Convert content encoding: Unknown encoding: 18030

@Strongc
Author

Strongc commented Feb 2, 2016

I do not have a Go environment yet and am not familiar with Go or git, but after some searching it looks like the error comes from here.

https://github.com/gogits/git-module/blob/d86a90f801dbe279db095437a8c7ea42c60e8d98/tree.go

step = 40
id, err := NewIDFromString(string(data[pos : pos+step]))
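
To illustrate how a fixed-width slice like the one above can lead to a "Length must be 40" error, here is a minimal Python sketch. It is not the git-module implementation: `new_id_from_string` and `id_at` are hypothetical stand-ins, and the only things taken from the thread are the width of 40 and the error text. The sample line follows the `<mode> <type> <object>\t<path>` layout that `git ls-tree` prints.

```python
import re

ID_HEX_LEN = 40  # a SHA-1 object ID is 20 bytes, i.e. 40 hex characters
_HEX_ID = re.compile(r"[0-9a-f]{40}")


def new_id_from_string(s: str) -> str:
    """Hypothetical stand-in for a hex ID validator."""
    if len(s) != ID_HEX_LEN:
        raise ValueError("Length must be 40: " + s)
    if not _HEX_ID.fullmatch(s):
        raise ValueError("Not a hex object ID: " + s)
    return s


def id_at(data: bytes, pos: int) -> str:
    """Read a fixed-width ID starting at pos, like data[pos : pos+step]."""
    chunk = data[pos:pos + ID_HEX_LEN]
    return new_id_from_string(chunk.decode("ascii", errors="replace"))


# A well-formed entry: the ID starts right after "100644 blob " (12 bytes).
line = b"100644 blob " + b"a" * 40 + b"\tReadme.txt"
print(id_at(line, 12))

# Empty output (for example, a path lookup that matched nothing) yields a
# slice shorter than 40 characters, so validation fails.
try:
    id_at(b"", 0)
except ValueError as err:
    print(err)
```

Whether the 500 errors in this report come from empty output, a shifted offset, or something else is not established in the thread; the sketch only shows that any short or misaligned slice trips the same length check.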

@Strongc
Author

Strongc commented Feb 2, 2016

https://try.gogs.io/cztest/test2/src/master/doc

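In Google Chrome, under More tools > Encoding, selecting GB18030 makes the Chinese file names in the path display correctly.
But the rest of the page then becomes garbled.

https://try.gogs.io/cztest/test2/src/master/main.c
No encoding setting makes the Chinese in this file display correctly.
https://try.gogs.io/cztest/test2/src/master/Readme.txt
With GB18030 selected, the Chinese displays correctly.

This shows that Gogs handles these two kinds of files differently.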

@unknwon
Member

unknwon commented Feb 2, 2016

https://try.gogs.io/cztest/test2/src/master/main.c

What is the real encoding of this file?

@Strongc
Author

Strongc commented Feb 2, 2016

What is the real encoding of this file?

It appears to be GB18030; it is an ordinary file created with Notepad on Chinese-language Windows 10.

@unknwon
Member

unknwon commented Feb 2, 2016

It appears to be GB18030; it is an ordinary file created with Notepad on Chinese-language Windows 10.

You must be sure...

@Strongc
Author

Strongc commented Feb 3, 2016

You must be sure...
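I am sure; see the attached images for evidence.
I used a binary viewer and a few online tools.
GB2312, GBK, and GB18030 are successively larger character sets, and each later one is backward compatible with the earlier ones.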
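
The nesting of the three character sets can be checked with Python's built-in codecs. This is a minimal sketch; the helper name `supported_by` and the three sample characters are chosen here for illustration and do not come from the attached files.

```python
CODECS = ["gb2312", "gbk", "gb18030"]


def supported_by(ch: str) -> list:
    """Return the codecs in CODECS that can encode the character."""
    ok = []
    for name in CODECS:
        try:
            ch.encode(name)
            ok.append(name)
        except UnicodeEncodeError:
            pass  # this codec has no mapping for the character
    return ok


# A common simplified character, a traditional character, and an emoji.
for ch in ["中", "龍", "😀"]:
    print(ch, supported_by(ch))

# Where a character exists in all three sets, the bytes are identical.
print("中".encode("gb2312") == "中".encode("gbk") == "中".encode("gb18030"))
```

Because the smaller sets are subsets of GB18030, decoding with `gb18030` is a reasonable default when a file is known to be in one of the three but not which.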

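This site looks up character codes and recovers garbled text:
http://www.mytju.com/classCode/tools/messyCodeRecover.asp

This is an example of automatically detecting GB18030 and UTF-8:
http://blog.csdn.net/firstboy0513/article/details/7349854
Your goal is full internationalization, so I understand that fully automatic encoding detection is difficult.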

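The garbling happens because Chinese text created on Chinese-language Windows 10 is encoded as GB18030, but the web preview identifies it as Windows-1252, so it displays as garbled text. Forcing the page encoding to GB18030 does not help either, for reasons I do not understand.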
https://try.gogs.io/cztest/test2/src/master/main.c
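
The effect of reading GB18030 bytes as Windows-1252 can be reproduced directly. This is a minimal sketch with hypothetical helper names (`misread`, `recover`) and a two-character sample string; it does not use the actual contents of main.c.

```python
def misread(data: bytes, wrong: str = "cp1252") -> str:
    """Decode bytes with the wrong codec, as a misdetecting viewer would."""
    return data.decode(wrong, errors="replace")


def recover(garbled: str, wrong: str = "cp1252", actual: str = "gb18030") -> str:
    """Undo the misreading by re-encoding, then decoding with the real codec."""
    return garbled.encode(wrong).decode(actual)


raw = "中文".encode("gb18030")  # bytes D6 D0 CE C4
garbled = misread(raw)
print(garbled)           # accented Latin letters instead of Chinese
print(recover(garbled))  # the original text again
```

Recovery of this kind only works while every byte survived the wrong decoding. Windows-1252 leaves a handful of byte values undefined, so text containing those bytes is replaced during the misreading and cannot be restored from the garbled string.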

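Yet the Chinese in Readme.txt displays correctly in the home page preview; why is that?
https://try.gogs.io/cztest/test2/src/master/Readme.txt
When viewing the file directly, forcing the page encoding to GB18030 also displays the Chinese correctly.

https://try.gogs.io/cztest/test2/src/master/doc
The file names display as garbled text, but after forcing the encoding to GB18030 the Chinese displays correctly.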


@Strongc
Author

Strongc commented Feb 3, 2016

The original files:
Readme.txt

main.c.txt
I changed the extension on this one; otherwise it could not be uploaded.

说明 2.docx

@unknwon
Member

unknwon commented Feb 3, 2016

Thanks for the details but I'm sorry I have no idea about what you're trying to say...

I set the following in app.ini:
ANSI_CHARSET = GB18030
and updated Gogs to 0.8.25.0129.
Browsing the repository that previously returned 500, the problem remains.

Would you be able to reproduce this 500 on https://try.gogs.io ?

@Strongc
Author

Strongc commented Feb 4, 2016

Would you be able to reproduce this 500 on https://try.gogs.io ?
There is no 500 error on try.gogs.io.

@unknwon unknwon changed the title from "对中文支持有些问题" ("Some problems with Chinese support") to "Buggy support for Chinese characters" Feb 11, 2016
@unknwon unknwon removed the status: needs feedback Tell me more about it label Feb 11, 2016
ethantkoenig added a commit to ethantkoenig/gogs that referenced this issue Oct 13, 2017
@IssueHuntBot

@0maxxam0 has funded $10.00 to this issue. See it on IssueHunt

@cyfx

cyfx commented Jan 12, 2023

I also ran into this problem. The Gogs version on my server had been upgraded from 0.12.3 to 0.12.10.

When a file I uploaded to git has a Chinese name, the page returns a 500 error.

But when I tried on https://try.gogs.io, there was no problem.

Then I installed Gogs 0.12.10 directly on a new server and found that the problem did not occur there.

So I upgraded the git version on the server from 1.8.3.1 to 2.38.1, and then I found that the problem has been solved.

@unknwon
Member

unknwon commented Jan 12, 2023

So I upgraded the git version on the server from 1.8.3.1 to 2.38.1, and then I found that the problem has been solved.

Amazing discovery, thank you! @cyfx

@unknwon unknwon closed this as completed Jan 12, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 13, 2023