Socket.IO merges unicode combining accent marks during transport #669

@coreh

Hey there,

I've been hunting a bug involving filenames with special characters in my project, nide, and I think I've traced it back to Socket.IO.

It seems that whenever I send the server a string containing a combining accent mark, such as 'COMBINING ACUTE ACCENT' (U+0301), the mark is merged with the preceding character and the pair is replaced by the single precomposed accented character. That is, if I call:

socket.emit('test', 'a\u0301')

The server will actually receive the string 'á' (the single precomposed character U+00E1) instead of the string 'a\u0301'. This is hard to notice, because both strings look the same when printed. The problem is that they have different lengths and do not compare as equal:

'a\u0301' == 'á' // evaluates to false
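
To make the difference visible, here is a minimal sketch that compares the two forms directly. It assumes a runtime that supports `String.prototype.normalize` (ES2015 or later); the `codePoints` helper is a hypothetical name introduced only for this illustration, and nothing here is part of Socket.IO.

```javascript
// Decomposed form: 'a' followed by COMBINING ACUTE ACCENT (U+0301).
const decomposed = 'a\u0301';
// Precomposed form: LATIN SMALL LETTER A WITH ACUTE (U+00E1).
const precomposed = '\u00e1';

// Hypothetical helper: list the code points of a string as U+XXXX labels.
function codePoints(s) {
  return Array.from(s, (ch) =>
    'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0')
  );
}

console.log(codePoints(decomposed));  // [ 'U+0061', 'U+0301' ]
console.log(codePoints(precomposed)); // [ 'U+00E1' ]

console.log(decomposed.length);          // 2
console.log(precomposed.length);         // 1
console.log(decomposed === precomposed); // false

// Unicode normalization converts between the two forms:
// NFC composes, NFD decomposes.
console.log(decomposed.normalize('NFC') === precomposed); // true
console.log(precomposed.normalize('NFD') === decomposed); // true
```

The received string matches what NFC normalization would produce, so it looks as though some step in the transport is normalizing the payload. This is only a guess from the symptoms; I have not confirmed where it happens.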

I'm not sure if it's relevant, but I'm currently using the xhr-polling transport.

Is this behavior already known or expected? If so, how can I work around it?

Thanks in advance.
