Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CSL-2435] Use 'TextEncoder' to measure correct length of UTF-8 strings #840

Conversation

Projects
None yet
3 participants
@nikolaglumac
Copy link
Contributor

commented Apr 5, 2018

JavaScript length method (with ES6) actually counts unicode punycodes of the UTF-16 encoding of the given string. Apart from messing up with surrogate pairs (counting for instance 馃挬 as two characters because encoded with a pair or punycodes), it also leads to unexpected results when characters have a different UTF-8 and UTF-16 encoding.

Since requests are made using UTF-8 encoding, the Content-Length of these requests should match the actual length when payloads are UTF-8 encoded.

This PR is a port of #839

Acceptance tests results:

Before the fix:
screen shot 2018-04-05 at 11 59 19

After the fix:
screen shot 2018-04-05 at 12 00 07

@nikolaglumac

This comment has been minimized.

Copy link
Contributor Author

commented Apr 5, 2018

I can confirm that all our acceptance tests are passing!

@nikolaglumac

This comment has been minimized.

Copy link
Contributor Author

commented Apr 5, 2018

My manual testing has also proven that the fix works! 馃帀

@nikolaglumac

This comment has been minimized.

Copy link
Contributor Author

commented Apr 5, 2018

@DominikGuzei @darko-mijic we need to get this PR merged as soon as possible so that we can resume the release process...

@DominikGuzei

This comment has been minimized.

Copy link
Member

commented Apr 5, 2018

@nikolaglumac you forgot to push the acceptance tests implementation?

@nikolaglumac

This comment has been minimized.

Copy link
Contributor Author

commented Apr 5, 2018

No @DominikGuzei - I have added them in the first commit: 521d30a
(perhaps this was not the best idea in the world)...

@DominikGuzei

This comment has been minimized.

Copy link
Member

commented Apr 5, 2018

@nikolaglumac ah sorry 鈥 i just saw the feature file and thought there are some implementation steps missing - but you just reused existing functionality 馃憤

@DominikGuzei

This comment has been minimized.

Copy link
Member

commented Apr 5, 2018

@nikolaglumac i will merge this now as it works as intended. Please feel free to create an issue on YouTrack with the proposal by @cleverca22

@DominikGuzei DominikGuzei merged commit ae68fa3 into release/0.9.1 Apr 5, 2018

3 checks passed

buildkite/daedalus Build #940 passed (26 minutes, 57 seconds)
Details
continuous-integration/appveyor/branch AppVeyor build succeeded
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
@nikolaglumac

This comment has been minimized.

Copy link
Contributor Author

commented Apr 5, 2018

Yes that single test captures the issue (in the code without the fix)

@DominikGuzei DominikGuzei deleted the ktorz/csl-2435/cannot-use-non-latin-characters-hotfix branch Apr 5, 2018

@darko-mijic darko-mijic referenced this pull request Apr 5, 2018

Merged

Release/0.9.1 #784

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can鈥檛 perform that action at this time.