New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Check destination length in encodeInto #5078
Conversation
} | ||
|
||
if (inRange(codePoint, 0x00, 0x7f)) { | ||
return codePoint; | ||
return [codePoint]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is going to be horribly inefficient, since most code points are just number, and now we are creating an array all the time.
We should return a number | number[] | symbol
and use a symbol instead of "finished"
if this is really a problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before:
PS C:\Users\Nikolai\deno> Measure-Command { .\target\release\deno.exe run .\cli\tests\text_encoder_perf.js }
Hours : 0
Minutes : 0
Seconds : 0
Milliseconds : 94
Ticks : 945903
TotalDays : 1.09479513888889E-06
TotalHours : 2.62750833333333E-05
TotalMinutes : 0.001576505
TotalSeconds : 0.0945903
TotalMilliseconds : 94.5903
PS C:\Users\Nikolai\deno> Measure-Command { .\target\release\deno.exe run .\cli\tests\text_encoder_perf.js }
Hours : 0
Minutes : 0
Seconds : 0
Milliseconds : 107
Ticks : 1074975
TotalDays : 1.24418402777778E-06
TotalHours : 2.98604166666667E-05
TotalMinutes : 0.001791625
TotalSeconds : 0.1074975
TotalMilliseconds : 107.4975
PS C:\Users\Nikolai\deno> Measure-Command { .\target\release\deno.exe run .\cli\tests\text_encoder_perf.js }
Days : 0
Hours : 0
Minutes : 0
Seconds : 0
Milliseconds : 98
Ticks : 987147
TotalDays : 1.14253125E-06
TotalHours : 2.742075E-05
TotalMinutes : 0.001645245
TotalSeconds : 0.0987147
TotalMilliseconds : 98.7147
After:
PS C:\Users\Nikolai\deno> Measure-Command { .\target\release\deno.exe run .\cli\tests\text_encoder_perf.js }
Hours : 0
Minutes : 0
Seconds : 0
Milliseconds : 92
Ticks : 926395
TotalDays : 1.07221643518519E-06
TotalHours : 2.57331944444444E-05
TotalMinutes : 0.00154399166666667
TotalSeconds : 0.0926395
TotalMilliseconds : 92.6395
PS C:\Users\Nikolai\deno> Measure-Command { .\target\release\deno.exe run .\cli\tests\text_encoder_perf.js }
Hours : 0
Minutes : 0
Seconds : 0
Milliseconds : 93
Ticks : 939193
TotalDays : 1.08702893518519E-06
TotalHours : 2.60886944444444E-05
TotalMinutes : 0.00156532166666667
TotalSeconds : 0.0939193
TotalMilliseconds : 93.9193
PS C:\Users\Nikolai\deno> Measure-Command { .\target\release\deno.exe run .\cli\tests\text_encoder_perf.js }
Days : 0
Hours : 0
Minutes : 0
Seconds : 0
Milliseconds : 92
Ticks : 924900
TotalDays : 1.07048611111111E-06
TotalHours : 2.56916666666667E-05
TotalMinutes : 0.0015415
TotalSeconds : 0.09249
TotalMilliseconds : 92.49
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the memory usage? Still is there a problem with using a symbol?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There doesn't seem to be a good way to measure memory usage of a command on Windows.
Still is there a problem with using a symbol?
It's more verbose. Is there an advantage to using a symbol?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More verbose? It wouldn't be more verbose if you replaced const FINALIZED = Symbol("finalized");
. Creating an array on every codepoint is actually more verbose by two characters. It is arguable the whole thing should use symbols instead of numbers for the consts. Symbols are more memory efficient that strings and less likely to cause intented consequences (like string | number[]
both have .length
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It wouldn't be more verbose if you replaced const FINALIZED = Symbol("finalized");.
Plus type FINISHED = typeof FINISHED;
.
Creating an array on every codepoint is actually more verbose by two characters.
That's orthogonal. I made the bytes always be an array to avoid having two branches. I made finished a string so that TypeScript knows that if result isn't finished then it's an array. Strings are nice in that "finished"
is both a value and a type.
Symbols are more memory efficient that strings
How many extra bytes are we talking about? V8 would probably intern the strings, so there might be no difference at all.
and less likely to cause intented consequences (like string | number[] both have .length).
That's a fair point, but it won't be a problem if the implementation follows the spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either a symbol or a string is fine as far as I'm concerned. This isn't exposed to users.
I'll keep an eye on the benchmarks to see if this change effects things, but based on the output @seishun posted it seems like it won't. V8 perf is difficult to predict.
This fixes a bug and adds a test so I'm going to land it. Thanks @seishun.
I changed the Encoder interface because adding this check was awkward otherwise. Now it follows the spec more closely.