-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
replace deps encoding with build in textdecoder #39
Conversation
I second the idea of having busboy benchmarks, we could do an informed decision here then. |
lib/utils.js
Outdated
/** | ||
* https://datatracker.ietf.org/doc/html/rfc7578#section-4.7 | ||
*/ | ||
switch (destEncoding) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no default? what will happen if we don't resolve anything there?
edit: right, we will use the passed one
lib/utils.js
Outdated
destEncoding = 'binary'; | ||
break; | ||
} | ||
if (text) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need destEncoding outside of this block?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, we get from outside if it is 7bit or 8bit. But I think I made a mistake because this is actually the textEncoding and not the destEncoding
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
considering that all tests still passed, it's a very alarming sign. we do need to improve our code coverage for this part of functionality ahead of this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be a huge task to do that, with little to no benefits. Or are there any nodejs unit tests for the builtin textdecoder, which we could reuse?
I am very scared of regressing edge cases here, considering both the size of original implementation and low code coverage of it. Wonder if we could transitively increase code coverage of encoder via writing additional tests for busboy itself first. |
I think that jsencoding dependency is actually an archaic TextDecoder Implementation. But what i realized is, that if I output the error messages of the created try catch blocks, it actually throws errors in node 10 and 12 reporting that iso-8891-1, despite the unit test passes. Maybe because the files are utf-8 encoded?
So I think, the issue is, that the built in TextDecoder is depending on the built-in ICU. So what we can do is, to try to get a built-in TextDecoder and if we throw we use the deps/encoding like before as fallback. Maybe if we split the encoders, and require them if necessary reduces the start up time of busboy. But only if it means, that we have some performance gains. So we should postpone this PR till we have some busboy benchmarks. |
@kibertoad We could remove actually all encoders, as we dont need the encoders but only the decoders. |
I removed the encoders and extracted the big json array into encodings.json. |
I tested it with the benchmarks from #40 and it is surprisingly slower. So I would say, We should not use the built-in TextDecoder... |
We should discuss about this first. This replaces the jsencoding code with the built in textdecoder.
We have unfortunately no busboy benchmarks. Maybe we should add first some benchmarks before checking the validity of this PR.
I cache the instantiated TextDecoder in a Map because I expect that instantiating a TextDecoder is expensive and reusing it is cheap and it has a low memory footprint. Maybe my expectation is wrong?
Nice sideeffect, the test coverage increases massively. LOL
Checklist
npm run test
andnpm run benchmark
and the Code of conduct