Fixing garbled text in the review corpus #8

Open
zhugw opened this issue Jul 28, 2018 · 3 comments

Comments

@zhugw

zhugw commented Jul 28, 2018

Problem

After a git clone on a Mac, the review corpus files show up as garbled text, as shown below:

➜  neg git:(master) ✗ head neg.0.txt 
��׼��̫�� ���仹����3�ǵ� ������ʩ�dz��¾�.����Ƶ���ϵı�׼����¸���.

Solution

 # Read the file contents
 def getContent(fullname):
-    f = codecs.open(fullname, 'r')
+    f = codecs.open(fullname, 'r', encoding="gbk", errors="ignore")
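
For anyone who wants to try the fix in isolation, here is a minimal sketch of the same idea: open the file with an explicit `gbk` encoding instead of relying on the platform default. The function name `get_content`, the `encoding` parameter, and the sample string are assumptions made for illustration; they are not taken from the repository.

```python
import codecs
import os
import tempfile


def get_content(fullname, encoding="gbk"):
    # Decode explicitly as GBK; errors="ignore" silently drops any bytes
    # that are not valid in that encoding.
    with codecs.open(fullname, "r", encoding=encoding, errors="ignore") as f:
        return f.read()


# Hypothetical sample text, written out as GBK bytes to stand in for a corpus file.
sample = "标准间太差"
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as out:
    out.write(sample.encode("gbk"))

text = get_content(path)
os.remove(path)
print(text == sample)
```

Note that `errors="ignore"` discards undecodable bytes without warning, so if some files turn out not to be GBK, part of their content may be lost silently; `errors="replace"` would make such cases visible instead.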

@wenk207

wenk207 commented Jul 5, 2020

good

@RongBoZ

RongBoZ commented Dec 17, 2020

nice

@lailaiya

beauti
