The program closes immediately when handling a file that contains Chinese text but whose charset is iso88591 #10

Open
someonebw opened this issue Jan 29, 2016 · 21 comments

Comments

@someonebw

test-iso88591.txt: text/plain; charset=iso-8859-1
test-utf-8.txt: text/plain; charset=utf-8
[root@node96 test]# hexdump -C test-utf-8.txt
00000000 74 65 73 74 c3 96 c3 90 |test....|
00000008
[root@node96 test]# hexdump -C test-iso88591.txt
00000000 74 65 73 74 d6 d0 |test..|
00000006

The page reports an error and the websocket is closed:

WebSocket connection to 'ws://x.x.x.x:9527/ws' failed: Could not decode a text frame as UTF-8.
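For illustration, the difference between the two files can be reproduced outside the browser. This is a minimal sketch, written for Python 3 and assuming only the bytes shown in the two hexdump listings above; `is_valid_utf8` is a hypothetical helper name, not part of the project.

```python
# Bytes copied from the two hexdump listings in this report.
UTF8_BYTES = bytes.fromhex("74657374c396c390")  # test-utf-8.txt
LATIN1_BYTES = bytes.fromhex("74657374d6d0")    # test-iso88591.txt


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8, False otherwise."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(is_valid_utf8(UTF8_BYTES))    # True
print(is_valid_utf8(LATIN1_BYTES))  # False
```

The second file ends in the bytes `d6 d0`, which do not form a valid UTF-8 sequence, which is consistent with the "Could not decode a text frame as UTF-8" message.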

@xsank
Owner

xsank commented Jan 30, 2016

Uploading tmp.jpg…
It works on my side. Are there any other conditions needed to reproduce it?

@xsank
Owner

xsank commented Jan 30, 2016

[xsank.mz@myhost /home/xsank.mz]
$hexdump -C testdd.txt
00000000 a1 09 69 6e 76 65 72 74 65 64 20 65 78 63 6c 61 |..inverted excla|
00000010 6d 61 74 69 6f 6e 20 6d 61 72 6b 09 26 69 65 78 |mation mark.&iex|
00000020 63 6c 3b 09 26 23 31 36 31 3b 0a a2 09 63 65 6e |cl;.&#161;...cen|
00000030 74 09 26 63 65 6e 74 3b 09 26 23 31 36 32 3b 0a |t.&cent;.&#162;.|
00000040 a3 09 70 6f 75 6e 64 09 26 70 6f 75 6e 64 3b 09 |..pound.&pound;.|
00000050 26 23 31 36 33 3b 0a a4 0a |&#163;...|
00000059

[xsank.mz@myhost /home/xsank.mz]
$file -bi testdd.txt
text/plain; charset=iso-8859-1

[xsank.mz@myhost /home/xsank.mz]
$

@someonebw
Author

[root@node96 ~]# env |grep LANG
LANG=zh_CN.UTF-8
[root@node96 ~]# python
Python 2.7.9 (default, Nov 5 2015, 10:01:01)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.

[root@node96 ~]# uname -r
2.6.32-431.el6.x86_64
[root@node96 ~]# cat /etc/redhat-release
CentOS release 6.5 (Final)
[root@node96 ~]# cat /tmp/test/test-utf-8.txt
testÖÐ[root@node96 ~]# cat /tmp/test/test-iso88591.txt
testא[root@node96 ~]#

That is my environment; there is nothing else special about it.

@someonebw
Author

qq 20160202095559

Screenshot of my scenario

@someonebw
Author

test-iso88591.txt
test-utf-8.txt

The two files I tested with.

@someonebw
Author

Screenshot of the scenario
qq 20160202100058

@someonebw
Author

Screenshot of the scenario
qq 20160202100204

@xsank
Owner

xsank commented Feb 2, 2016

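It looks like the websocket client closes the connection on its own when it receives data that is not UTF-8 encoded. My day job is pretty rough, so I'll look into this problem tonight.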

@someonebw
Author

OK, xsank, thanks for the hard work. :)

@xsank
Owner

xsank commented Feb 2, 2016

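It looks like this cannot be handled in the onmessage method of the JavaScript websocket: the data that method receives has already been parsed out of the frame, and the error is thrown before the method is reached.

Temporary workaround:
Add the chardet module on the server side to check the encoding of the data being sent; if the data is not UTF-8 encoded, report that it is not supported over websocket. I tested this and it works.
I haven't thought of a good solution yet, so I won't commit any code for now.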
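The workaround described above could look roughly like the following. This is only a sketch under stated assumptions: the comment mentions chardet, but here a strict UTF-8 decode is swapped in for the check so that the example needs no third-party package. `prepare_for_text_frame` and `UNSUPPORTED_NOTICE` are hypothetical names, not taken from the project's code.

```python
# Hypothetical notice sent to the browser in place of undecodable output.
UNSUPPORTED_NOTICE = "[output is not UTF-8 encoded; not supported over websocket]"


def prepare_for_text_frame(data: bytes):
    """Return (ok, payload) for data about to be sent as a text frame.

    ok is False when data is not valid UTF-8; payload is then a notice
    rather than the original bytes, so the text frame stays valid UTF-8.
    """
    try:
        return True, data.decode("utf-8")
    except UnicodeDecodeError:
        return False, UNSUPPORTED_NOTICE


# Bytes of test-iso88591.txt from the report: rejected before sending.
ok, payload = prepare_for_text_frame(b"test\xd6\xd0")
print(ok, payload)
```

With this kind of check the browser never receives an invalid text frame, at the cost of not displaying the non-UTF-8 output at all.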

@xsank
Owner

xsank commented Feb 2, 2016

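UTF-8 encoding appears to be a hard requirement of browsers that support websocket. For now this can only be handled at the application layer, or by switching to a browser that fixes the problem.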
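One possible form of application-layer handling is to make sure that whatever is sent is always valid UTF-8 text, even if some characters are lost. The sketch below is an assumption about how that might be done, not the project's implementation; `to_safe_text` and its `fallback_encoding` parameter are hypothetical.

```python
from typing import Optional


def to_safe_text(data: bytes, fallback_encoding: Optional[str] = None) -> str:
    """Decode data into text that can always be re-encoded as UTF-8."""
    # First choice: the data already is valid UTF-8.
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        pass
    # Second choice: an encoding supplied by the caller (an assumption
    # about the source file, e.g. "iso-8859-1").
    if fallback_encoding is not None:
        try:
            return data.decode(fallback_encoding)
        except (UnicodeDecodeError, LookupError):
            pass
    # Last resort: replace undecodable bytes with U+FFFD.
    return data.decode("utf-8", errors="replace")


raw = b"test\xd6\xd0"  # bytes of test-iso88591.txt from the report
print(to_safe_text(raw, "iso-8859-1"))
print(to_safe_text(raw))
```

The lossy last step trades fidelity for a connection that stays open; whether that is acceptable depends on the use case.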

@someonebw
Author

I tested this example myself, and it does not seem to have this encoding problem.
https://github.com/chjj/term.js/blob/master/example/index.js
Not sure whether it helps you!

@someonebw
Author

Still following this......

@xsank
Owner

xsank commented Feb 15, 2016

OK. I just got back to work and should have time this week. I'll take a look at the example you mentioned.

@someonebw
Author

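I tried this plugin, and it also seems to have no encoding problems (it supports copy and paste as well). It appears to be built on hterm. It is also implemented in script, but not with term.js.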

https://github.com/chromium/hterm

@someonebw
Author

This one seems to be built on hterm:

https://github.com/krishnasrinivas/wetty

@someonebw
Author

I sent you an email about hterm; please check it. It is for your reference.

@someonebw
Author

How is it going, @xsank?

@anythingwhat

I have the same problem: the program closes immediately when the Chinese characters are not encoded as UTF-8. Any advice? @someonebw @xsank
