Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

buffer,string_decoder: consolidate encoding validation logic #7207

Closed
wants to merge 1 commit into from

Conversation

jasnell
Copy link
Member

@jasnell jasnell commented Jun 7, 2016

Checklist
  • make -j4 test (UNIX) or vcbuild test nosign (Windows) passes
  • documentation is changed or added
  • the commit message follows commit guidelines
Affected core subsystem(s)

buffer, string_decoder

Description of change

Buffer.isEncoding and string_decoder.normalizeEncoding shared
quite a bit of logic. This moves the primary logic into
internal/util. The userland modules that monkey patch Buffer.isEncoding
should still work.

@nodejs/buffer @mscdex

@jasnell jasnell added buffer Issues and PRs related to the buffer subsystem. string_decoder Issues and PRs related to the string_decoder subsystem. labels Jun 7, 2016
@nodejs-github-bot nodejs-github-bot added the util Issues and PRs related to the built-in util module. label Jun 7, 2016
@mscdex
Copy link
Contributor

mscdex commented Jun 7, 2016

I don't think this will work for StringDecoder since some libraries override Buffer.isEncoding(), so we need to check the result of that function if we don't match a known internal encoding. Doing that as-is with these changes will mean executing the normalization function twice in the case that Buffer.isEncoding() isn't overridden.

@jasnell
Copy link
Member Author

jasnell commented Jun 7, 2016

Ok, let me take one more stab at it...

}
}
if (!internalUtil)
internalUtil = require('internal/util');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just include it at the top of the file. This is already being required by other modules prior to this point so there's no savings.

@jasnell
Copy link
Member Author

jasnell commented Jun 7, 2016

@mscdex ... ok, how does that look now?

return enc;
default:
if (low)
return undefined;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor nit: this can be simplified to just return;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I did it this way just to make it more explicit. Either way works for me tho

@mscdex
Copy link
Contributor

mscdex commented Jun 7, 2016

LGTM except for minor nit. Should probably run this through citgm though just to be extra safe.

@jasnell
Copy link
Member Author

jasnell commented Jun 8, 2016

CI: https://ci.nodejs.org/job/node-test-pull-request/2952/ ... Green except for unrelated build bot failure
CITGM: https://ci.nodejs.org/view/Node.js-citgm/job/citgm-smoker/300/ ... Green

@@ -114,8 +114,7 @@ exports.normalizeEncoding = function normalizeEncoding(enc) {
case 'hex':
return enc;
default:
if (low)
return undefined;
if (low) return; // undefined
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're going to add the comment we may as well just use return undefined; ;-)

loweredCase = true;
}
}
Buffer[Symbol.for('isEncoding')] = Buffer.isEncoding = function(encoding) {
Copy link
Contributor

@trevnorris trevnorris Jun 8, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say either

  1. Prefix the Symbol to help prevent collisions (e.g. node.isEncoding)
  2. Create and export this symbol from internal/util.js so it's no longer global
    I lean towards the later.

@jasnell
Copy link
Member Author

jasnell commented Jun 8, 2016

@trevnorris @mscdex ... updated and added a couple of tests. It turns out that we were not previously testing or verifying that Buffer.isEncoding() with non-string values would return false.

@trevnorris
Copy link
Contributor

LGTM

@mscdex
Copy link
Contributor

mscdex commented Jun 11, 2016

LGTM if CI is still ok: https://ci.nodejs.org/job/node-test-pull-request/2988/

@jasnell
Copy link
Member Author

jasnell commented Jun 12, 2016

That CI run had a couple buildbot issues. Running again just to be safe
https://ci.nodejs.org/job/node-test-pull-request/2991/

Buffer.isEncoding and string_decoder.normalizeEncoding shared
quite a bit of logic. This moves the primary logic into
internal/util. The userland modules that monkey patch Buffer.isEncoding
should still work.
@jasnell
Copy link
Member Author

jasnell commented Jun 21, 2016

New CI before landing... just in case.. https://ci.nodejs.org/job/node-test-pull-request/3038/

jasnell added a commit that referenced this pull request Jun 21, 2016
Buffer.isEncoding and string_decoder.normalizeEncoding shared
quite a bit of logic. This moves the primary logic into
internal/util. The userland modules that monkey patch Buffer.isEncoding
should still work.

PR-URL: #7207
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
@jasnell
Copy link
Member Author

jasnell commented Jun 21, 2016

Landed in 6dd093d

@jasnell jasnell closed this Jun 21, 2016
@thefourtheye
Copy link
Contributor

Returning different types of values from a function could have been avoided :(

@mscdex
Copy link
Contributor

mscdex commented Jun 21, 2016

@thefourtheye What do you mean?

@thefourtheye
Copy link
Contributor

@mscdex normalizeEncoding returns string and undefined based on the inputs, right? I was thinking, this could have been avoided.

@mscdex
Copy link
Contributor

mscdex commented Jun 21, 2016

@thefourtheye How could it have been avoided?

@thefourtheye
Copy link
Contributor

thefourtheye commented Jun 21, 2016

@mscdex maybe we could have returned an empty string instead of undefined?

@mscdex
Copy link
Contributor

mscdex commented Jun 21, 2016

@thefourtheye Returning undefined or an empty string both have their pros and cons so there's really no difference IMHO, and I believe any performance hit from not consistently returning a string is likely to be negligible in this case.

@Fishrock123
Copy link
Member

seems to have conflicts on v6

@jasnell
Copy link
Member Author

jasnell commented Jun 27, 2016

@Fishrock123 ... this PR builds on @mscdex's work in #6777. This can be backported to v6 if #6777 is. I don't believe a don't-land label is appropriate but we might need a better way of tracking dependencies between PRs.

@mscdex
Copy link
Contributor

mscdex commented Jun 27, 2016

@jasnell That PR is already in v6, it's actually in v6.2.1.

@MylesBorins
Copy link
Contributor

I put the don't land label for v4.x please correct if wrong

addaleax added a commit to addaleax/node that referenced this pull request Sep 7, 2016
Fishrock123 pushed a commit to Fishrock123/node that referenced this pull request Sep 8, 2016
steal a line from nodejs#7207 to make things work

PR-URL: nodejs#8437
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
Fishrock123 pushed a commit to Fishrock123/node that referenced this pull request Sep 8, 2016
steal a line from nodejs#7207 to make things work

PR-URL: nodejs#8437
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
addaleax pushed a commit to addaleax/node that referenced this pull request Sep 9, 2016
Buffer.isEncoding and string_decoder.normalizeEncoding shared
quite a bit of logic. This moves the primary logic into
internal/util. The userland modules that monkey patch Buffer.isEncoding
should still work.

PR-URL: nodejs#7207
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
Fishrock123 pushed a commit that referenced this pull request Sep 13, 2016
Buffer.isEncoding and string_decoder.normalizeEncoding shared
quite a bit of logic. This moves the primary logic into
internal/util. The userland modules that monkey patch Buffer.isEncoding
should still work.

PR-URL: #7207
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: Trevor Norris <trev.norris@gmail.com>

Refs: #8463
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
BridgeAR added a commit to BridgeAR/node that referenced this pull request Mar 2, 2018
Due to code consolidation in nodejs#7207
the isEncoding function got less strict. This commit makes sure
isEncoding returns false for empty strings as before the consolidation.

PR-URL: nodejs#18790
Refs: nodejs#7207
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
MayaLekova pushed a commit to MayaLekova/node that referenced this pull request May 8, 2018
Due to code consolidation in nodejs#7207
the isEncoding function got less strict. This commit makes sure
isEncoding returns false for empty strings as before the consolidation.

PR-URL: nodejs#18790
Refs: nodejs#7207
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
buffer Issues and PRs related to the buffer subsystem. string_decoder Issues and PRs related to the string_decoder subsystem. util Issues and PRs related to the built-in util module.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants