Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(op_crates/web): Add all single byte encodings to TextDecoder #6178

Merged
merged 13 commits into from
Sep 9, 2020

Conversation

AliBasicCoder
Copy link
Contributor

@AliBasicCoder AliBasicCoder commented Jun 8, 2020

i added this ecodings

    ibm866
    iso-8859-2
    iso-8859-3
    iso-8859-4
    iso-8859-5
    iso-8859-6
    iso-8859-7
    iso-8859-8
    iso-8859-10
    iso-8859-13
    iso-8859-14
    iso-8859-15
    iso-8859-16
    koi8-r
    koi8-u
    macintosh
    windows-874
    windows-1250
    windows-1251
    windows-1253
    windows-1254
    windows-1255
    windows-1256
    windows-1257
    windows-1258
    x-mac-cyrillic

NOTES:

  • i needed to make code points start from 0 not 128
  • i made code points 8 items per line

@CLAassistant
Copy link

CLAassistant commented Jun 8, 2020

CLA assistant check
All committers have signed the CLA.

@AliBasicCoder AliBasicCoder changed the title Adding all single byte encodings to Text Decoder Adding all single byte encodings to Text Decoder (#6076) Jun 8, 2020
@ry
Copy link
Member

ry commented Jun 8, 2020

@AliBasicCoder Thanks for the patch. I haven't looked at this in detail yet - but this needs tests before it can be landed.

@kitsonk
Copy link
Contributor

kitsonk commented Jun 9, 2020

Just to point out, browsers only support encoding UTF-8. https://encoding.spec.whatwg.org/#interface-textencoder so this would be a significant variation from the spec.

@lucacasonato
Copy link
Member

Just to point out, browsers only support encoding UTF-8. https://encoding.spec.whatwg.org/#interface-textencoder so this would be a significant variation from the spec.

@kitsonk This is about decoding though. Encoding should stay utf-8 only to be spec compliant, but decoder can have many encodings (Chrome has many encodings for decoding, but only utf-8 for encoding).

@AliBasicCoder
Copy link
Contributor Author

AliBasicCoder commented Jun 9, 2020

Just to point out, browsers only support encoding UTF-8. https://encoding.spec.whatwg.org/#interface-textencoder so this would be a significant variation from the spec.

@kitsonk i just added for decoding and didn't touch TextEncoder at all

@kitsonk
Copy link
Contributor

kitsonk commented Jun 9, 2020

🤦 I can't read. Sorry folks.

@AliBasicCoder
Copy link
Contributor Author

i have just added all the labels for the encodings

@ry
Copy link
Member

ry commented Jun 10, 2020

@AliBasicCoder I'm all for this but we need tests...

Maybe you can take some tests from here: https://github.com/web-platform-tests/wpt/blob/master/encoding/single-byte-decoder.html

@AliBasicCoder
Copy link
Contributor Author

AliBasicCoder commented Jun 10, 2020

so, i found some issuses and fixed theme

and tested it with the following

then made this script

import { TextDecoder } from "./cli/js/web/text_encoding.ts";

// @ts-ignore
window.TextDecoder = TextDecoder;

then bundled it via deno bundle

then converted it to the browser via babel

then downloaded https://github.com/web-platform-tests/wpt

and commneted the XMLHttpRequest test

then included the bundle via a script tag

and this is the result:

screencapture-localhost-5000-encoding-single-byte-decoder-1591817272672

@ry
Copy link
Member

ry commented Jun 10, 2020

@AliBasicCoder We need those tests to run continuously so they don't get broken in the future.

@AliBasicCoder
Copy link
Contributor Author

@AliBasicCoder We need those tests to run continuously so they don't get broken in the future.

so, where to add theme

@ry
Copy link
Member

ry commented Jun 11, 2020

@AliBasicCoder How about in cli/tests/unit/text_encoding_test.ts? Or maybe create a new test module neighboring that one. (If you create a new one, be sure to add it to cli/tests/unit/unit_tests.ts)

Tests can be run with cargo test unit

@AliBasicCoder
Copy link
Contributor Author

AliBasicCoder commented Jun 11, 2020

@AliBasicCoder How about in cli/tests/unit/text_encoding_test.ts? Or maybe create a new test module neighboring that one. (If you create a new one, be sure to add it to cli/tests/unit/unit_tests.ts)

Tests can be run with cargo test unit

tests are added now

@bartlomieju bartlomieju added cli related to cli/ dir web related to Web APIs labels Jun 16, 2020
@bartlomieju bartlomieju requested a review from ry June 27, 2020 11:44
@bartlomieju bartlomieju added this to the 1.2.0 milestone Jul 2, 2020
@bartlomieju bartlomieju modified the milestones: 1.2.0, 1.3.0 Jul 14, 2020
@ry
Copy link
Member

ry commented Aug 3, 2020

@AliBasicCoder We are missing a CLA from you. It's preventing this PR from going green. I believe you've signed it before - it's the same terms - but we updated it changing my personal name to an entity Deno Land Inc.

@bartlomieju bartlomieju removed this from the 1.3.0 milestone Aug 11, 2020
@AliBasicCoder
Copy link
Contributor Author

@AliBasicCoder We are missing a CLA from you. It's preventing this PR from going green. I believe you've signed it before - it's the same terms - but we updated it changing my personal name to an entity Deno Land Inc.

i signed the CLA

@bartlomieju
Copy link
Member

@AliBasicCoder I'm interested in landing this PR, could you rebase it and move the logic to op_crates/web/08_text_encoding.js?

@AliBasicCoder
Copy link
Contributor Author

@AliBasicCoder I'm interested in landing this PR, could you rebase it and move the logic to op_crates/web/08_text_encoding.js?

done

@AliBasicCoder
Copy link
Contributor Author

the problem is
git status -uno --porcelain --ignore-submodules
prints

 M cli/tests/unit/single_byte_encoding_test.ts
 M op_crates/web/08_text_encoding.js
 M op_crates/web/text_encoding_test.js

@ry ry added this to the 1.4.0 milestone Aug 31, 2020
@bartlomieju bartlomieju changed the title Adding all single byte encodings to Text Decoder (#6076) feat(op_crates/web): Add all single byte encodings to Text Decoder (#6076) Sep 9, 2020
Copy link
Member

@bartlomieju bartlomieju left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Huge thanks to @AliBasicCoder and sorry it took so long to land

@bartlomieju bartlomieju changed the title feat(op_crates/web): Add all single byte encodings to Text Decoder (#6076) feat(op_crates/web): Add all single byte encodings to TextDecoder Sep 9, 2020
@bartlomieju bartlomieju merged commit 0d12693 into denoland:master Sep 9, 2020
@AliBasicCoder AliBasicCoder deleted the text_encoding branch September 26, 2020 16:49
@linux-china
Copy link
Contributor

@AliBasicCoder any plan to add utf-16le? Now emscripten's js code use utf-16le for decoding. I want to run wasm compiled by emscripten, and utf-16le is a problem. if you can implement it, and it will be very greatful to run emscripten wasm on Deno. thanks.

// Given a pointer 'ptr' to a null-terminated UTF16LE-encoded string in the emscripten HEAP, returns
// a copy of that string as a Javascript String object.

var UTF16Decoder = typeof TextDecoder !== 'undefined' ? new TextDecoder('utf-16le') : undefined;

@AliBasicCoder
Copy link
Contributor Author

@AliBasicCoder any plan to add utf-16le? Now emscripten's js code use utf-16le for decoding. I want to run wasm compiled by emscripten, and utf-16le is a problem. if you can implement it, and it will be very greatful to run emscripten wasm on Deno. thanks.

// Given a pointer 'ptr' to a null-terminated UTF16LE-encoded string in the emscripten HEAP, returns
// a copy of that string as a Javascript String object.

var UTF16Decoder = typeof TextDecoder !== 'undefined' ? new TextDecoder('utf-16le') : undefined;

see #8108

@xaliphostes
Copy link

@AliBasicCoder any plan to add utf-16le? Now emscripten's js code use utf-16le for decoding. I want to run wasm compiled by emscripten, and utf-16le is a problem. if you can implement it, and it will be very greatful to run emscripten wasm on Deno. thanks.

// Given a pointer 'ptr' to a null-terminated UTF16LE-encoded string in the emscripten HEAP, returns
// a copy of that string as a Javascript String object.

var UTF16Decoder = typeof TextDecoder !== 'undefined' ? new TextDecoder('utf-16le') : undefined;

Agreed
Same problem while using wasm with deno.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cli related to cli/ dir web related to Web APIs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants