[CSL-2435] Use 'TextEncoder' to measure correct length of UTF-8 strings #840

nikolaglumac · 2018-04-05T10:02:46Z

JavaScript's `length` property (even with ES6) counts the UTF-16 code units of the given string, not its characters or bytes. Apart from miscounting surrogate pairs (for instance, 💩 counts as two because it is encoded as a pair of code units), it also gives unexpected results whenever a character's UTF-8 and UTF-16 encodings differ in size.

Since requests are made using UTF-8 encoding, the Content-Length of these requests should match the actual byte length of the UTF-8 encoded payload.
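To illustrate the mismatch described above, here is a minimal sketch comparing `length` with the UTF-8 byte count obtained from `TextEncoder`. The helper name `utf8ByteLength` and the sample strings are hypothetical and not taken from this PR's diff; the sketch assumes a runtime where `TextEncoder` is available as a global (modern browsers, recent Node.js).

```javascript
// Hypothetical helper for illustration only; not the code from this PR.
function utf8ByteLength(text) {
  // TextEncoder always encodes to UTF-8; the length of the resulting
  // Uint8Array is the number of bytes the string occupies on the wire.
  return new TextEncoder().encode(text).length;
}

// Sample strings: ASCII, a precomposed accented letter, CJK, and an emoji
// that lies outside the Basic Multilingual Plane (a surrogate pair in UTF-16).
const samples = ["abc", "\u00e9", "日本語", "💩"];

for (const s of samples) {
  console.log(
    JSON.stringify(s),
    "length:", s.length,
    "UTF-8 bytes:", utf8ByteLength(s)
  );
}
```

For these samples, `length` reports 3, 1, 3 and 2, while the UTF-8 byte counts are 3, 2, 9 and 4, so a Content-Length derived from `length` would be too small for every non-ASCII payload.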

This PR is a port of #839

Acceptance test results:

Before the fix:

After the fix:

nikolaglumac · 2018-04-05T11:06:05Z

I can confirm that all our acceptance tests are passing!

nikolaglumac · 2018-04-05T11:33:12Z

My manual testing has also proven that the fix works! 🎉

nikolaglumac · 2018-04-05T11:36:08Z

@DominikGuzei @darko-mijic we need to get this PR merged as soon as possible so that we can resume the release process...

DominikGuzei · 2018-04-05T11:44:55Z

@nikolaglumac you forgot to push the acceptance tests implementation?

nikolaglumac · 2018-04-05T11:48:19Z

No @DominikGuzei - I have added them in the first commit: 521d30a
(perhaps this was not the best idea in the world)...

DominikGuzei · 2018-04-05T11:53:29Z

@nikolaglumac ah sorry … i just saw the feature file and thought there are some implementation steps missing - but you just reused existing functionality 👍

DominikGuzei · 2018-04-05T11:54:35Z

@nikolaglumac i will merge this now as it works as intended. Please feel free to create an issue on YouTrack with the proposal by @cleverca22

nikolaglumac · 2018-04-05T11:54:44Z

Yes that single test captures the issue (in the code without the fix)

[CSL-2435] Use 'TextEncoder' to measure correct length of UTF-8 strings

521d30a

nikolaglumac added the bug label Apr 5, 2018

nikolaglumac assigned KtorZ Apr 5, 2018

nikolaglumac mentioned this pull request Apr 5, 2018

[CSL-2435] Use 'TextEncoder' to measure correct length of UTF-8 strings #839

Merged

nikolaglumac requested review from DominikGuzei and darko-mijic April 5, 2018 10:04

[CSL-2435] Add CHANGELOG entry

8f0cd13

DominikGuzei merged commit ae68fa3 into release/0.9.1 Apr 5, 2018

DominikGuzei deleted the ktorz/csl-2435/cannot-use-non-latin-characters-hotfix branch April 5, 2018 11:54

darko-mijic mentioned this pull request Apr 5, 2018

Release/0.9.1 #784

Merged
