PacketEncoder corrupts UTF-8 message #157

Closed
evgeny-pasynkov opened this issue Sep 15, 2014 · 18 comments

@evgeny-pasynkov

Hi,

When I call SocketIOClient.send("русский"), the client receives the string "Ñ�Ñ�Ñ�Ñ�кий".

I think that the problem is in PacketEncoder lines 272-274:

    String str = b.toString(CharsetUtil.ISO_8859_1);
    if (enc.canEncode(str)) {
        buf.writeBytes(str.getBytes(CharsetUtil.UTF_8));
    }

The string gets corrupted when it is converted with the ISO_8859_1 charset, and it is then sent in that corrupted form.
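
The effect described above can be reproduced with the JDK alone. The following is a minimal sketch, not the PacketEncoder code itself: it assumes the buffer already holds UTF-8 bytes, and the class and method names (`Latin1RoundTripDemo`, `reencodeViaLatin1`) are made up for illustration. Plain `String` and `StandardCharsets` calls stand in for Netty's `ByteBuf.toString(Charset)` and `CharsetUtil`.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Latin1RoundTripDemo {

    // Hypothetical helper: decodes bytes that are already UTF-8 as ISO-8859-1
    // and then encodes the resulting string as UTF-8 again.
    static byte[] reencodeViaLatin1(byte[] utf8Bytes) {
        String misread = new String(utf8Bytes, StandardCharsets.ISO_8859_1);
        return misread.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "русский" written with Unicode escapes, so the demo does not depend
        // on the encoding of this source file.
        String message = "\u0440\u0443\u0441\u0441\u043a\u0438\u0439";
        byte[] original = message.getBytes(StandardCharsets.UTF_8);
        byte[] sent = reencodeViaLatin1(original);

        System.out.println("original length: " + original.length);
        System.out.println("sent length:     " + sent.length);
        System.out.println("identical bytes: " + Arrays.equals(original, sent));
        // A client that decodes the sent bytes as UTF-8 sees mojibake,
        // not the original message.
        System.out.println("client sees:     " + new String(sent, StandardCharsets.UTF_8));
    }
}
```

Every byte of the Cyrillic text is 0x80 or above, so each one becomes a separate Latin-1 character that takes two bytes when encoded as UTF-8 again. The payload doubles in size and no longer decodes to the original string.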

@Maypeur

Maypeur commented Sep 17, 2014

Hello,
I really don't know if it's a good solution, but I used this to make it work in my case:

    CharsetEncoder enc = CharsetUtil.UTF_8.newEncoder();
    String str = b.toString(CharsetUtil.UTF_8);
    if (enc.canEncode(str)) {
        buf.writeBytes(str.getBytes(CharsetUtil.UTF_8));
    } else {
        buf.writeBytes(b);
    }

@mrniko
Owner

mrniko commented Sep 17, 2014

@evgeny-pasynkov hi!
Do you use UTF-8 encoding for your sources?

@evgeny-pasynkov
Author

@mrniko What do you mean by "UTF-8 for sources"?

Do you mean the default encoding of the JVM process? If so, that isn't a good solution: I don't want my software to depend on the server locale :)

Actually, converting my string to ISO_8859_1 damages it, so the client cannot restore it afterwards.

@evgeny-pasynkov
Author

@Maypeur your solution is a tautology :) It is simply equivalent to "buf.writeBytes(b)"
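
A quick way to check the equivalence claim is the sketch below. It uses JDK classes only, as a stand-in for the Netty calls, and the names `Utf8RoundTripDemo` and `decodeThenEncodeUtf8` are invented for this example. It assumes the input bytes are well-formed UTF-8. For malformed input the decoder substitutes U+FFFD, so the round trip would not be byte-identical in that case.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Utf8RoundTripDemo {

    // Hypothetical helper mirroring the shape of the proposed change:
    // decode as UTF-8, check canEncode, encode as UTF-8 again,
    // otherwise fall back to the raw bytes.
    static byte[] decodeThenEncodeUtf8(byte[] utf8Bytes) {
        CharsetEncoder enc = StandardCharsets.UTF_8.newEncoder();
        String str = new String(utf8Bytes, StandardCharsets.UTF_8);
        if (enc.canEncode(str)) {
            return str.getBytes(StandardCharsets.UTF_8);
        }
        return utf8Bytes;
    }

    public static void main(String[] args) {
        // "русский" as Unicode escapes, independent of source file encoding.
        byte[] original = "\u0440\u0443\u0441\u0441\u043a\u0438\u0439"
                .getBytes(StandardCharsets.UTF_8);
        byte[] result = decodeThenEncodeUtf8(original);
        // For well-formed UTF-8 the round trip returns the same bytes,
        // which matches writing the buffer directly.
        System.out.println("same bytes: " + Arrays.equals(original, result));
    }
}
```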

@Maypeur

Maypeur commented Sep 17, 2014

@evgeny-pasynkov maybe! I left it in because there was a problem with websocket and accents!

@mrniko
Owner

mrniko commented Sep 17, 2014

@evgeny-pasynkov could you propose a better solution for this problem?

@evgeny-pasynkov
Author

@mrniko Could you please point me to the mentioned websockets bug? Why not simplify all this to "buf.writeBytes(b)"?

@mrniko
Owner

mrniko commented Sep 18, 2014

To avoid encoding problems.

@evgeny-pasynkov
Author

What problems? In the code comment, you've mentioned the websockets bug. Which one?

BTW, socket.io had some UTF-8 encoding problems, which they claim to have fixed in socket.io 1.1.0.
See this topic: socketio/socket.io#1744

@Maypeur

Maypeur commented Sep 18, 2014

@evgeny-pasynkov it is for #137. Using the new socket.io.client 1.1.0 with socket.io 1.7.3-SNAPSHOT causes problems with UTF-8 characters, but client 1.0.6 with 1.7.3-SNAPSHOT works. So now that the bug is fixed in the client, maybe PacketEncoder needs to be adapted. Only @mrniko can say what to do now!

@mrniko
Owner

mrniko commented Sep 18, 2014

Oh! I plan to release the next version, 1.7.4, with this fix, so it will be compatible with 1.1.0+ only. OK?

@evgeny-pasynkov
Author

For me it is ok.

@Maypeur

Maypeur commented Sep 18, 2014

Since pre-1.1.0 versions have this major bug, I think it must!

@mrniko mrniko added the bug label Sep 19, 2014
@mrniko mrniko added this to the 1.7.4 milestone Sep 23, 2014
@mrniko mrniko self-assigned this Sep 23, 2014
mrniko pushed a commit that referenced this issue Sep 24, 2014
@mrniko
Owner

mrniko commented Sep 24, 2014

please check

@mrniko mrniko closed this as completed Sep 24, 2014
@evgeny-pasynkov
Author

The problem is fixed for my scenarios. Thank you!

@mrniko
Owner

mrniko commented Sep 25, 2014

@Maypeur are you happy too?

@Maypeur

Maypeur commented Sep 25, 2014

UTF-8 is now correct, but I'm trying to find out why the memory sometimes grows and never decreases, even though I have only 5 users!
Anyway, thank you!!!

@mrniko
Owner

mrniko commented Sep 26, 2014

@Maypeur take a look at a memory dump
