Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check destination length in encodeInto #5078

Merged
merged 3 commits into from May 6, 2020
Merged

Check destination length in encodeInto #5078

merged 3 commits into from May 6, 2020

Conversation

seishun
Copy link
Contributor

@seishun seishun commented May 4, 2020

I changed the Encoder interface because adding this check was awkward otherwise. Now it follows the spec more closely.

}

if (inRange(codePoint, 0x00, 0x7f)) {
return codePoint;
return [codePoint];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to be horribly inefficient, since most code points are just number, and now we are creating an array all the time.

We should return a number | number[] | symbol and use a symbol instead of "finished" if this is really a problem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before:

PS C:\Users\Nikolai\deno> Measure-Command { .\target\release\deno.exe run .\cli\tests\text_encoder_perf.js }


Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 94
Ticks             : 945903
TotalDays         : 1.09479513888889E-06
TotalHours        : 2.62750833333333E-05
TotalMinutes      : 0.001576505
TotalSeconds      : 0.0945903
TotalMilliseconds : 94.5903



PS C:\Users\Nikolai\deno> Measure-Command { .\target\release\deno.exe run .\cli\tests\text_encoder_perf.js }


Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 107
Ticks             : 1074975
TotalDays         : 1.24418402777778E-06
TotalHours        : 2.98604166666667E-05
TotalMinutes      : 0.001791625
TotalSeconds      : 0.1074975
TotalMilliseconds : 107.4975



PS C:\Users\Nikolai\deno> Measure-Command { .\target\release\deno.exe run .\cli\tests\text_encoder_perf.js }


Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 98
Ticks             : 987147
TotalDays         : 1.14253125E-06
TotalHours        : 2.742075E-05
TotalMinutes      : 0.001645245
TotalSeconds      : 0.0987147
TotalMilliseconds : 98.7147

After:

PS C:\Users\Nikolai\deno> Measure-Command { .\target\release\deno.exe run .\cli\tests\text_encoder_perf.js }


Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 92
Ticks             : 926395
TotalDays         : 1.07221643518519E-06
TotalHours        : 2.57331944444444E-05
TotalMinutes      : 0.00154399166666667
TotalSeconds      : 0.0926395
TotalMilliseconds : 92.6395



PS C:\Users\Nikolai\deno> Measure-Command { .\target\release\deno.exe run .\cli\tests\text_encoder_perf.js }


Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 93
Ticks             : 939193
TotalDays         : 1.08702893518519E-06
TotalHours        : 2.60886944444444E-05
TotalMinutes      : 0.00156532166666667
TotalSeconds      : 0.0939193
TotalMilliseconds : 93.9193



PS C:\Users\Nikolai\deno> Measure-Command { .\target\release\deno.exe run .\cli\tests\text_encoder_perf.js }


Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 92
Ticks             : 924900
TotalDays         : 1.07048611111111E-06
TotalHours        : 2.56916666666667E-05
TotalMinutes      : 0.0015415
TotalSeconds      : 0.09249
TotalMilliseconds : 92.49

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the memory usage? Still is there a problem with using a symbol?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There doesn't seem to be a good way to measure memory usage of a command on Windows.

Still is there a problem with using a symbol?

It's more verbose. Is there an advantage to using a symbol?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More verbose? It wouldn't be more verbose if you replaced const FINALIZED = Symbol("finalized");. Creating an array on every codepoint is actually more verbose by two characters. It is arguable the whole thing should use symbols instead of numbers for the consts. Symbols are more memory efficient that strings and less likely to cause intented consequences (like string | number[] both have .length).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wouldn't be more verbose if you replaced const FINALIZED = Symbol("finalized");.

Plus type FINISHED = typeof FINISHED;.

Creating an array on every codepoint is actually more verbose by two characters.

That's orthogonal. I made the bytes always be an array to avoid having two branches. I made finished a string so that TypeScript knows that if result isn't finished then it's an array. Strings are nice in that "finished" is both a value and a type.

Symbols are more memory efficient that strings

How many extra bytes are we talking about? V8 would probably intern the strings, so there might be no difference at all.

and less likely to cause intented consequences (like string | number[] both have .length).

That's a fair point, but it won't be a problem if the implementation follows the spec.

Copy link
Member

@ry ry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either a symbol or a string is fine as far as I'm concerned. This isn't exposed to users.

I'll keep an eye on the benchmarks to see if this change effects things, but based on the output @seishun posted it seems like it won't. V8 perf is difficult to predict.

This fixes a bug and adds a test so I'm going to land it. Thanks @seishun.

@ry ry merged commit 76c77bb into denoland:master May 6, 2020
@seishun seishun deleted the fix-encodeInto branch May 6, 2020 17:13
SASUKE40 pushed a commit to SASUKE40/deno that referenced this pull request May 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants