[Bug] 本地导出为乱码,导入失败(Title: [Bug] The local export is garbled and the import fails.) #3562

Open
1 of 3 tasks
taurusduan opened this issue Dec 18, 2023 · 23 comments

Comments

@taurusduan

taurusduan commented Dec 18, 2023

Describe the bug
Version 2.9.13: after exporting the local data, the exported file is garbled and can no longer be imported.
To Reproduce
Steps to reproduce the behavior:

  1. Export the local data.
  2. Import the exported file.
  3. See error

Expected behavior
A clear and concise description of what you expected to happen.

Screenshots
If applicable, add screenshots to help explain your problem.
image

Deployment

  • Docker
  • Vercel
  • Server

Desktop (please complete the following information):

  • OS: [e.g. iOS]
  • Browser [e.g. chrome, safari]
  • Version [e.g. 22]
    Windows 11 + Google Chrome

Smartphone (please complete the following information):

  • Device: [e.g. iPhone6]
  • OS: [e.g. iOS8.1]
  • Browser [e.g. stock browser, safari]
  • Version [e.g. 22]

Additional Logs
Add any logs about the problem here.


taurusduan changed the title from "[Bug] 本地导出为乱码,导入失败" to "[Bug] 本地导出为乱码,导入失败(Title: [Bug] The local export is garbled and the import fails.)" on Dec 18, 2023
@Hub-moon

It seems that only the client side has this issue.

@H0llyW00dzZ
Contributor

This issue is related to #3395.

@sheng-di

+1, the client on macOS has this problem too.

@TCOTC

TCOTC commented Feb 2, 2024

+1, I hope this gets fixed soon; otherwise I can only use it on one device.

@jerrylususu

jerrylususu commented Feb 4, 2024

this issue related to #3395

This is the same bug. The problem is the way the text is converted to a Uint8Array here. charCodeAt returns a UTF-16 code unit (0 to 65535), but a Uint8Array element can only hold 0 to 255, so each value is reduced modulo 256. English (ASCII) characters convert correctly, but the code units of Chinese characters are truncated to their low byte.

new Uint8Array([...text].map((c) => c.charCodeAt(0)))

Example:

const text = "你好世界hellworld";
console.log(new Uint8Array([...text].map((c) => c.charCodeAt(0))));
// Uint8Array(13) [96, 125, 22, 76, 104, 101, 108, 108, 119, 111, 114, 108, 100]

console.log(new Uint32Array([...text].map((c) => c.charCodeAt(0))));
// Uint32Array(13) [20320, 22909, 19990, 30028, 104, 101, 108, 108, 119, 111, 114, 108, 100]

The correct approach is to use TextEncoder:

const str = "你好世界hellworld";
const encoder = new TextEncoder();
const utf8Array = encoder.encode(str);
console.log(utf8Array);
// Uint8Array(21) [228, 189, 160, 229, 165, 189, 228, 184, 150, 231, 149, 140, 104, 101, 108, 108, 119, 111, 114, 108, 100]

This bug should be cross-platform: it is triggered whenever user data (chat content, conversation titles, etc.) contains Chinese characters, or any character whose code is above 255. In addition, because the data itself is truncated, content exported before the fix cannot be easily recovered (i.e., the data is permanently lost).
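To make the difference concrete, here is a minimal sketch that round-trips a small JSON string through both conversions. The helper names (lossyBytes, utf8Bytes, utf8Text, canParse) and the sample object are made up for illustration and are not taken from the project's code; only TextEncoder, TextDecoder, and JSON are standard APIs.

```javascript
// Lossy conversion quoted above: each UTF-16 code unit is reduced
// modulo 256 when it is stored in a Uint8Array.
function lossyBytes(text) {
  return new Uint8Array([...text].map((c) => c.charCodeAt(0)));
}

// UTF-8 conversion with the standard TextEncoder / TextDecoder pair.
function utf8Bytes(text) {
  return new TextEncoder().encode(text);
}

function utf8Text(bytes) {
  return new TextDecoder("utf-8").decode(bytes);
}

// Returns true when the string is valid JSON, false otherwise.
function canParse(text) {
  try {
    JSON.parse(text);
    return true;
  } catch {
    return false;
  }
}

// Hypothetical export payload; the real export format is not shown in this thread.
const sample = JSON.stringify({ topic: "你好世界", note: "hello" });

const lossy = lossyBytes(sample);
const exact = utf8Bytes(sample);

const lossyRoundTrip = utf8Text(lossy);
const exactRoundTrip = utf8Text(exact);

console.log(exactRoundTrip === sample); // true
console.log(lossyRoundTrip === sample); // false
console.log(canParse(exactRoundTrip)); // true
console.log(canParse(lossyRoundTrip)); // false
```

In this sample the truncated bytes decode to a string that JSON.parse rejects, which would be consistent with the import failure reported in this issue, although the thread does not confirm that this is the exact failure path.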

@H0llyW00dzZ
Contributor

omg you are not smart, the bug is not in that part

@H0llyW00dzZ
Contributor

It's because this format is text, not JSON:
https://github.com/ChatGPTNextWeb/ChatGPT-Next-Web/blob/4511aa4d21eda4e6e0b5130d1e3222bb30734672/app/utils.ts#L68C1-L71C7

If you look at my fork, there is no problem whatever you export.

@H0llyW00dzZ
Contributor

H0llyW00dzZ commented Feb 4, 2024

Also, this issue is a duplicate.
yiidaa explained how to fix this before.

@shenyan-008

In 2.10.3, exporting Chinese text still produces garbled characters.

@NightmareZero

v2.10.3 still has this issue
Is anyone paying attention to this issue?

@LinhanXu3928

When my mask is written in Chinese and I export it, the file cannot be opened correctly because of encoding errors, and therefore cannot be imported. The same problem occurs when exporting Chinese chat data.
I hope the developers can pay attention to the vast number of Chinese users!

@TCOTC

TCOTC commented Mar 13, 2024

@Yidadaa I hope this problem can be fixed. Since v2.9.8 (https://github.com/ChatGPTNextWeb/ChatGPT-Next-Web/releases/tag/v2.9.8) it has not been possible to export correct JSON (the import fails). If needed, I can privately provide the same local data exported from v2.9.7 and from v2.9.8.

@TCOTC

TCOTC commented Mar 14, 2024

Fixed by #3972.

With v2.11.3, export now works normally for me. Everyone can give it a try.

@NightmareZero

Fixed by #3972.

With v2.11.3, export now works normally for me. Everyone can give it a try.

I have already switched to using Chatbox.

@jiangying000

jiangying000 commented Apr 17, 2024

I synced the latest code and made a fresh export from Chrome on Windows 11, but importing it into Chrome on iOS 17 still reports "Import failed".
I checked the JSON and found nothing abnormal.

The exported file is 3112 KB, which does not exceed the 5 MB localStorage limit.

After the failure, an error screen shows:
quotaexceedederror: the quota has been exceeded.
followed by a long stack trace.

I am fairly sure this is a problem with browsers on iOS; Firefox and Chrome on desktop both work fine.
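For anyone debugging the quota error, here is a minimal sketch of catching it around a storage write and reporting a rough size. Everything in it is an assumption made for illustration: the helper names (utf16ByteLength, trySetItem, createLimitedStorage) are hypothetical, the in-memory storage only stands in for localStorage, and counting two bytes per UTF-16 code unit is only one way an engine may measure quota. It does not show how the app performs its import.

```javascript
// Rough size of a string held as UTF-16 (2 bytes per code unit).
// Assumption: quota is counted this way; browser engines differ.
function utf16ByteLength(value) {
  return value.length * 2;
}

// Write to a Storage-like object and report quota failures instead of throwing.
function trySetItem(storage, key, value) {
  try {
    storage.setItem(key, value);
    return { ok: true, bytes: utf16ByteLength(value) };
  } catch (err) {
    if (err && err.name === "QuotaExceededError") {
      return { ok: false, reason: "quota", bytes: utf16ByteLength(value) };
    }
    throw err; // unrelated errors are not swallowed
  }
}

// Hypothetical in-memory stand-in for localStorage with a per-item byte limit.
function createLimitedStorage(limitBytes) {
  const items = new Map();
  return {
    setItem(key, value) {
      if (utf16ByteLength(key + value) > limitBytes) {
        const err = new Error("The quota has been exceeded.");
        err.name = "QuotaExceededError";
        throw err;
      }
      items.set(key, String(value));
    },
    getItem(key) {
      return items.has(key) ? items.get(key) : null;
    },
  };
}

const storage = createLimitedStorage(32);
const small = trySetItem(storage, "a", "你好");
const large = trySetItem(storage, "b", "x".repeat(100));

console.log(small.ok, large.ok, large.reason); // true false quota
```

A file's size on disk (UTF-8) and the size of the same text as a JavaScript string can differ, so comparing the two numbers may be a useful first check before blaming the browser.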
