Response in Chinese or non ASCII char will become Mojibake #32

Closed
Yukimagi opened this issue Oct 23, 2023 · 2 comments
Labels
question Further information is requested

Comments


Yukimagi commented Oct 23, 2023

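Hello,
Sorry to bother you again. I would like the bot to answer me in Chinese, but when it does, the reply comes back as UTF-8-style escape sequences rather than Chinese characters. Is there a way to fix this? I have already tried a few transcoding tricks, such as: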

    decoded_response = codecs.escape_decode(response['text'])[0].decode('utf-8')
    response['text'] = decoded_response

    print(json.dumps(response, indent=2))
    assert response

Or:

    response_text = response["text"].encode("utf-8").decode("unicode-escape")

    print(json.dumps(response_text, indent=2))
    assert response

But the output is still garbled:
"sources_text": "#no\n\u9019\u500b\u6a19\u984c\u5c0d\u53f0\u7a4d\u96fb\u7684\u80a1\u50f9\u662f\u58de\u6d88\u606f\uff0c\u56e0\u70ba\u5b83\u610f\u5473\u8457\u53f0\u7a4d\u96fb\u57282\u5948\u7c73\u88fd\u7a0b\u7684\u6295\u8cc7\u8a08\u756b\u6703\u53d7\u5230\u5ef6\u9072\uff0c\u9019\u53ef\u80fd\u5f71\u97ff\u53f0\u7a4d\u96fb\u5728\u5168\u7403\u6676\u7247\u5e02\u5834\u7684\u7af6\u722d\u529b\u548c\u9818\u5148\u5730\u4f4d\u3002\u6839\u64da[\u4e2d\u6642\u65b0\u805e\u7db2]\u7684\u5831\u5c0e\uff0c\u53f0\u7a4d\u96fb\u539f\u672c\u9810\u8a08\u5728\u4eca\u5e74\u5e74\u5e95\u53d6\u5f97\u4e2d\u79d1\u5712\u5340\u7684\u7528\u5730\uff0c\u4f46\u56e0\u70ba\u74b0\u8a55\u7a0b\u5e8f\u7684\u5ef6\u5b95\uff0c\u9810\u8a08\u8981\u5230\u660e\u5e74\u7b2c\u4e00\u5b63\u624d\u80fd\u4ea4\u5730\u3002\u9019\u5c0d\u65bc\u53f0\u7a4d\u96fb\u4f86\u8aaa\u662f\u4e00\u500b\u6311\u6230\uff0c\u56e0\u70ba\u5b83\u9700\u8981\u57282024\u5e74\u958b\u59cb\u91cf\u75222\u5948\u7c73\u6676\u7247\uff0c\u800c\u4e14\u9084\u8981\u9762\u5c0d\u4f86\u81ea\u4e09\u661f\u7b49\u7af6\u722d\u5c0d\u624b\u7684\u58d3\u529b\u3002 "

Thank you!

Member

JE-Chen commented Oct 23, 2023

.encode("utf-8").decode("unicode_escape")
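A note on when this suggestion helps, as a minimal sketch: `unicode_escape` only recovers text that still contains literal backslash-u sequences. If the string already holds real Chinese characters, the same round trip reads the UTF-8 bytes as Latin-1 and produces mojibake. The variable names below are hypothetical and the sample text is taken from the output quoted above.

```python
# Case 1: the string holds literal "\uXXXX" sequences (backslash + 'u' + hex digits).
escaped_only = "\\u53f0\\u7a4d\\u96fb"
decoded = escaped_only.encode("utf-8").decode("unicode_escape")
print(decoded)  # 台積電

# Case 2: the string already holds the real characters.
# unicode_escape treats each non-escape byte as Latin-1, so the
# three-byte UTF-8 encoding of every character becomes three wrong characters.
already_decoded = "台積電"
mangled = already_decoded.encode("utf-8").decode("unicode_escape")
print(mangled == already_decoded)  # False
print(len(mangled))                # 9
```

So it is worth checking with `print(response["text"])` whether the value is actually escaped before applying this conversion.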

Author

Yukimagi commented Oct 23, 2023

I found a solution: change the code as shown below. Many thanks to the developer, I really appreciate it!

print(json.dumps(response, ensure_ascii=False, indent=2))  # add ensure_ascii=False
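For anyone landing here later, a small self-contained sketch of why this works: by default `json.dumps` escapes every non-ASCII character as `\uXXXX`, so the string in `response` was likely fine all along and only the printed JSON looked garbled. The `response` dict below is a stand-in for illustration, not the library's real return value.

```python
import json

# Stand-in for the real response; only the non-ASCII text matters here.
response = {"text": "台積電"}

escaped = json.dumps(response)                       # default: ensure_ascii=True
readable = json.dumps(response, ensure_ascii=False)  # keep non-ASCII characters as-is

print(escaped)   # {"text": "\u53f0\u7a4d\u96fb"}
print(readable)  # {"text": "台積電"}

# Both forms are valid JSON and parse back to the same data.
assert json.loads(escaped) == json.loads(readable) == response
```

When writing the readable form to a file, open the file with `encoding="utf-8"` so the characters are stored correctly regardless of the platform default.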

@JE-Chen JE-Chen added the question Further information is requested label Oct 23, 2023
@JE-Chen JE-Chen changed the title from "Response in chinese will become Garbled characters" to "Response in Chinese or non ASCII char will become Mojibake" Oct 24, 2023
@JE-Chen JE-Chen closed this as completed Oct 24, 2023