Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support characters >1 byte #5

Merged
merged 4 commits into from
Feb 21, 2019
Merged

Conversation

microbit-rosslowe
Copy link
Contributor

Create a new test for issue #1 , testing if strToBytes() supports characters longer than 1 byte.

Create a new test for issue #1 , testing if strToBytes() supports characters longer than 1 byte.
@microbit-carlos
Copy link
Collaborator

Great, thanks Ross!
I normally like follow the AAA pattern (Arrange Act Assert), which you are pretty much following already here, where the only difference is having a blank line between sections.
https://jamescooke.info/arrange-act-assert-pattern-for-python-developers.html

Will you continue working on the fix in this branch/PR?

Per feedback in #5 (review). Also add spacing for AAA format.
- Properly encode >1 byte characters, closes #1 (adds text-encoding as a dependency)
- Add test for bytesToStr()
- Fix test for strToBytes(); add 3 byte character test
@microbit-rosslowe microbit-rosslowe changed the title Create a test for unicode characters Support characters >1 byte Feb 19, 2019
@microbit-rosslowe
Copy link
Contributor Author

Latest commit adds a new test for bytesToStr(), tests for 3 byte characters, and adds a fix for #1

package.json Outdated Show resolved Hide resolved
@microbit-carlos
Copy link
Collaborator

That's a nice polyfill! It makes me a bit uneasy that it is marked as deprecated, but if it works well I think it's fine.
https://www.npmjs.com/package/text-encoding

Did you have a look or try other ones as well?

Move @types/text-encoding dependency to dev-dependencies
@microbit-rosslowe
Copy link
Contributor Author

@microbit-carlos I looked at using https://github.com/mathiasbynens/utf8.js but the way it worked with strings was a little confusing.

@microbit-carlos
Copy link
Collaborator

Yeah, I agree, the \x notation from the utf-8 library is not easy to work with, very glad to see that TextEncoder takes and returns Uint8Arrays.
I was wondering if there were other TextEncoder polyfills that were still maintained, just to check if this one is the best option.

@microbit-rosslowe
Copy link
Contributor Author

I've had a look at using https://github.com/solderjs/TextEncoderLite but it looks like it isn't being actively maintained, and didn't seem to work properly (it was also last published 1 year ago whereas the one used here was 5 months ago). TextEncoding has 1.2 million weekly downloads, so it seems pretty popular. I also looked at fast-text-encoding which had similar issues and was lasted updated September 2017.

@microbit-carlos
Copy link
Collaborator

@microbit-rosslowe Could we record here the licenses for the new dependencies (the polyfill and the types).

@microbit-rosslowe
Copy link
Contributor Author

microbit-rosslowe commented Feb 20, 2019

@microbit-carlos
Copy link
Collaborator

Great, and the @types/text-encoding package is MIT licensed:
https://www.npmjs.com/package/@types/text-encoding
https://github.com/DefinitelyTyped/DefinitelyTyped/blob/master/LICENSE.

It all looks good, if you are happy with the PR you can change the draft state and I can merge.

Copy link
Collaborator

@microbit-carlos microbit-carlos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Ross!

@microbit-carlos microbit-carlos merged commit abb54af into master Feb 21, 2019
@microbit-carlos microbit-carlos deleted the unicode-characters-test branch February 21, 2019 11:48
@microbit-carlos
Copy link
Collaborator

microbit-carlos commented Feb 21, 2019

@microbit-rosslowe This dependency increased the built file from 80Kbs to 700Kbs.

Most of it comes from this file and looks like rollup tree-shaking cannot really remove any of it (since it only uses a small part of the object during runtime, but rollup has no way of knowing that).

Could we find a lighter polyfill? Ideally that only does UTF-8? Or that can easily be trimmed by rollup.

@microbit-rosslowe
Copy link
Contributor Author

We can use text-encoding-lite, which reduces build size to around 88Kbs. Version 1.0.2 hasn't been published to npm but works nicely. See #6

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants