Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CSL-2435] Use 'TextEncoder' to measure correct length of UTF-8 strings #840

Conversation

nikolaglumac
Copy link
Contributor

@nikolaglumac nikolaglumac commented Apr 5, 2018

JavaScript length method (with ES6) actually counts unicode punycodes of the UTF-16 encoding of the given string. Apart from messing up with surrogate pairs (counting for instance 💩 as two characters because encoded with a pair or punycodes), it also leads to unexpected results when characters have a different UTF-8 and UTF-16 encoding.

Since requests are made using UTF-8 encoding, the Content-Length of these requests should match the actual length when payloads are UTF-8 encoded.

This PR is a port of #839

Acceptance tests results:

Before the fix:
screen shot 2018-04-05 at 11 59 19

After the fix:
screen shot 2018-04-05 at 12 00 07

@nikolaglumac
Copy link
Contributor Author

I can confirm that all our acceptance tests are passing!

@nikolaglumac
Copy link
Contributor Author

My manual testing has also proven that the fix works! 🎉

@nikolaglumac
Copy link
Contributor Author

@DominikGuzei @darko-mijic we need to get this PR merged as soon as possible so that we can resume the release process...

@DominikGuzei
Copy link
Member

@nikolaglumac you forgot to push the acceptance tests implementation?

@nikolaglumac
Copy link
Contributor Author

No @DominikGuzei - I have added them in the first commit: 521d30a
(perhaps this was not the best idea in the world)...

@DominikGuzei
Copy link
Member

@nikolaglumac ah sorry … i just saw the feature file and thought there are some implementation steps missing - but you just reused existing functionality 👍

@DominikGuzei
Copy link
Member

@nikolaglumac i will merge this now as it works as intended. Please feel free to create an issue on YouTrack with the proposal by @cleverca22

@DominikGuzei DominikGuzei merged commit ae68fa3 into release/0.9.1 Apr 5, 2018
@nikolaglumac
Copy link
Contributor Author

Yes that single test captures the issue (in the code without the fix)

@DominikGuzei DominikGuzei deleted the ktorz/csl-2435/cannot-use-non-latin-characters-hotfix branch April 5, 2018 11:54
@darko-mijic darko-mijic mentioned this pull request Apr 5, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants