New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

buffer,string_decoder: consolidate encoding validation logic #7207

Closed
wants to merge 1 commit into
base: master
from

Conversation

Projects
None yet
7 participants
@jasnell
Member

jasnell commented Jun 7, 2016

Checklist
  • make -j4 test (UNIX) or vcbuild test nosign (Windows) passes
  • documentation is changed or added
  • the commit message follows commit guidelines
Affected core subsystem(s)

buffer, string_decoder

Description of change

Buffer.isEncoding and string_decoder.normalizeEncoding shared
quite a bit of logic. This moves the primary logic into
internal/util. The userland modules that monkey patch Buffer.isEncoding
should still work.

@nodejs/buffer @mscdex

@mscdex

This comment has been minimized.

Show comment
Hide comment
@mscdex

mscdex Jun 7, 2016

Contributor

I don't think this will work for StringDecoder since some libraries override Buffer.isEncoding(), so we need to check the result of that function if we don't match a known internal encoding. Doing that as-is with these changes will mean executing the normalization function twice in the case that Buffer.isEncoding() isn't overridden.

Contributor

mscdex commented Jun 7, 2016

I don't think this will work for StringDecoder since some libraries override Buffer.isEncoding(), so we need to check the result of that function if we don't match a known internal encoding. Doing that as-is with these changes will mean executing the normalization function twice in the case that Buffer.isEncoding() isn't overridden.

@jasnell

This comment has been minimized.

Show comment
Hide comment
@jasnell

jasnell Jun 7, 2016

Member

Ok, let me take one more stab at it...

Member

jasnell commented Jun 7, 2016

Ok, let me take one more stab at it...

@trevnorris

View changes

Show outdated Hide outdated lib/buffer.js
}
}
if (!internalUtil)
internalUtil = require('internal/util');

This comment has been minimized.

@trevnorris

trevnorris Jun 7, 2016

Contributor

Just include it at the top of the file. This is already being required by other modules prior to this point so there's no savings.

@trevnorris

trevnorris Jun 7, 2016

Contributor

Just include it at the top of the file. This is already being required by other modules prior to this point so there's no savings.

@jasnell

This comment has been minimized.

Show comment
Hide comment
@jasnell

jasnell Jun 7, 2016

Member

@mscdex ... ok, how does that look now?

Member

jasnell commented Jun 7, 2016

@mscdex ... ok, how does that look now?

@mscdex

View changes

Show outdated Hide outdated lib/internal/util.js
return enc;
default:
if (low)
return undefined;

This comment has been minimized.

@mscdex

mscdex Jun 7, 2016

Contributor

minor nit: this can be simplified to just return;

@mscdex

mscdex Jun 7, 2016

Contributor

minor nit: this can be simplified to just return;

This comment has been minimized.

@jasnell

jasnell Jun 7, 2016

Member

Yeah I did it this way just to make it more explicit. Either way works for me tho

@jasnell

jasnell Jun 7, 2016

Member

Yeah I did it this way just to make it more explicit. Either way works for me tho

@mscdex

This comment has been minimized.

Show comment
Hide comment
@mscdex

mscdex Jun 7, 2016

Contributor

LGTM except for minor nit. Should probably run this through citgm though just to be extra safe.

Contributor

mscdex commented Jun 7, 2016

LGTM except for minor nit. Should probably run this through citgm though just to be extra safe.

@jasnell

This comment has been minimized.

Show comment
Hide comment
@jasnell
Member

jasnell commented Jun 8, 2016

@mscdex

View changes

Show outdated Hide outdated lib/internal/util.js
@@ -114,8 +114,7 @@ exports.normalizeEncoding = function normalizeEncoding(enc) {
case 'hex':
return enc;
default:
if (low)
return undefined;
if (low) return; // undefined

This comment has been minimized.

@mscdex

mscdex Jun 8, 2016

Contributor

If we're going to add the comment we may as well just use return undefined; ;-)

@mscdex

mscdex Jun 8, 2016

Contributor

If we're going to add the comment we may as well just use return undefined; ;-)

@trevnorris

View changes

Show outdated Hide outdated lib/buffer.js
loweredCase = true;
}
}
Buffer[Symbol.for('isEncoding')] = Buffer.isEncoding = function(encoding) {

This comment has been minimized.

@trevnorris

trevnorris Jun 8, 2016

Contributor

I'd say either

  1. Prefix the Symbol to help prevent collisions (e.g. node.isEncoding)
  2. Create and export this symbol from internal/util.js so it's no longer global
    I lean towards the later.
@trevnorris

trevnorris Jun 8, 2016

Contributor

I'd say either

  1. Prefix the Symbol to help prevent collisions (e.g. node.isEncoding)
  2. Create and export this symbol from internal/util.js so it's no longer global
    I lean towards the later.
@trevnorris

View changes

Show outdated Hide outdated lib/string_decoder.js
}
}
const nenc = internalUtil.normalizeEncoding(enc);
if (!nenc && (Buffer.isEncoding === isEncoding || !Buffer.isEncoding(enc)))

This comment has been minimized.

@trevnorris

trevnorris Jun 8, 2016

Contributor

I'd say the first check should be typeof nec !== 'string'. After that, why are you checking if isEncoding the same as what's assigned to Buffer.isEncoding? The string decoder should only ever use node's internal implementation. Not any user assigned one. So doing !isEncoding(enc) would be enough.

@trevnorris

trevnorris Jun 8, 2016

Contributor

I'd say the first check should be typeof nec !== 'string'. After that, why are you checking if isEncoding the same as what's assigned to Buffer.isEncoding? The string decoder should only ever use node's internal implementation. Not any user assigned one. So doing !isEncoding(enc) would be enough.

This comment has been minimized.

@jasnell

jasnell Jun 8, 2016

Member

@trevnorris ... see #7207 (comment) ... specifically, the code currently already checks Buffer.isEncoding() and would pick up userland overrides. @mscdex is saying that this needs to preserve that.

@jasnell

jasnell Jun 8, 2016

Member

@trevnorris ... see #7207 (comment) ... specifically, the code currently already checks Buffer.isEncoding() and would pick up userland overrides. @mscdex is saying that this needs to preserve that.

This comment has been minimized.

@trevnorris

trevnorris Jun 11, 2016

Contributor

ah cool.

@trevnorris

trevnorris Jun 11, 2016

Contributor

ah cool.

@jasnell

This comment has been minimized.

Show comment
Hide comment
@jasnell

jasnell Jun 8, 2016

Member

@trevnorris @mscdex ... updated and added a couple of tests. It turns out that we were not previously testing or verifying that Buffer.isEncoding() with non-string values would return false.

Member

jasnell commented Jun 8, 2016

@trevnorris @mscdex ... updated and added a couple of tests. It turns out that we were not previously testing or verifying that Buffer.isEncoding() with non-string values would return false.

@trevnorris

This comment has been minimized.

Show comment
Hide comment
@trevnorris

trevnorris Jun 11, 2016

Contributor

LGTM

Contributor

trevnorris commented Jun 11, 2016

LGTM

@mscdex

This comment has been minimized.

Show comment
Hide comment
@mscdex
Contributor

mscdex commented Jun 11, 2016

@jasnell

This comment has been minimized.

Show comment
Hide comment
@jasnell

jasnell Jun 12, 2016

Member

That CI run had a couple buildbot issues. Running again just to be safe
https://ci.nodejs.org/job/node-test-pull-request/2991/

Member

jasnell commented Jun 12, 2016

That CI run had a couple buildbot issues. Running again just to be safe
https://ci.nodejs.org/job/node-test-pull-request/2991/

buffer,string_decoder: consolidate encoding validation logic
Buffer.isEncoding and string_decoder.normalizeEncoding shared
quite a bit of logic. This moves the primary logic into
internal/util. The userland modules that monkey patch Buffer.isEncoding
should still work.
@jasnell

This comment has been minimized.

Show comment
Hide comment
@jasnell

jasnell Jun 21, 2016

Member

New CI before landing... just in case.. https://ci.nodejs.org/job/node-test-pull-request/3038/

Member

jasnell commented Jun 21, 2016

New CI before landing... just in case.. https://ci.nodejs.org/job/node-test-pull-request/3038/

jasnell added a commit that referenced this pull request Jun 21, 2016

buffer,string_decoder: consolidate encoding validation logic
Buffer.isEncoding and string_decoder.normalizeEncoding shared
quite a bit of logic. This moves the primary logic into
internal/util. The userland modules that monkey patch Buffer.isEncoding
should still work.

PR-URL: #7207
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
@jasnell

This comment has been minimized.

Show comment
Hide comment
@jasnell

jasnell Jun 21, 2016

Member

Landed in 6dd093d

Member

jasnell commented Jun 21, 2016

Landed in 6dd093d

@jasnell jasnell closed this Jun 21, 2016

@thefourtheye

This comment has been minimized.

Show comment
Hide comment
@thefourtheye

thefourtheye Jun 21, 2016

Contributor

Returning different types of values from a function could have been avoided :(

Contributor

thefourtheye commented Jun 21, 2016

Returning different types of values from a function could have been avoided :(

@mscdex

This comment has been minimized.

Show comment
Hide comment
@mscdex

mscdex Jun 21, 2016

Contributor

@thefourtheye What do you mean?

Contributor

mscdex commented Jun 21, 2016

@thefourtheye What do you mean?

@thefourtheye

This comment has been minimized.

Show comment
Hide comment
@thefourtheye

thefourtheye Jun 21, 2016

Contributor

@mscdex normalizeEncoding returns string and undefined based on the inputs, right? I was thinking, this could have been avoided.

Contributor

thefourtheye commented Jun 21, 2016

@mscdex normalizeEncoding returns string and undefined based on the inputs, right? I was thinking, this could have been avoided.

@mscdex

This comment has been minimized.

Show comment
Hide comment
@mscdex

mscdex Jun 21, 2016

Contributor

@thefourtheye How could it have been avoided?

Contributor

mscdex commented Jun 21, 2016

@thefourtheye How could it have been avoided?

@thefourtheye

This comment has been minimized.

Show comment
Hide comment
@thefourtheye

thefourtheye Jun 21, 2016

Contributor

@mscdex maybe we could have returned an empty string instead of undefined?

Contributor

thefourtheye commented Jun 21, 2016

@mscdex maybe we could have returned an empty string instead of undefined?

@mscdex

This comment has been minimized.

Show comment
Hide comment
@mscdex

mscdex Jun 21, 2016

Contributor

@thefourtheye Returning undefined or an empty string both have their pros and cons so there's really no difference IMHO, and I believe any performance hit from not consistently returning a string is likely to be negligible in this case.

Contributor

mscdex commented Jun 21, 2016

@thefourtheye Returning undefined or an empty string both have their pros and cons so there's really no difference IMHO, and I believe any performance hit from not consistently returning a string is likely to be negligible in this case.

@Fishrock123

This comment has been minimized.

Show comment
Hide comment
@Fishrock123

Fishrock123 Jun 27, 2016

Member

seems to have conflicts on v6

Member

Fishrock123 commented Jun 27, 2016

seems to have conflicts on v6

@jasnell

This comment has been minimized.

Show comment
Hide comment
@jasnell

jasnell Jun 27, 2016

Member

@Fishrock123 ... this PR builds on @mscdex's work in #6777. This can be backported to v6 if #6777 is. I don't believe a don't-land label is appropriate but we might need a better way of tracking dependencies between PRs.

Member

jasnell commented Jun 27, 2016

@Fishrock123 ... this PR builds on @mscdex's work in #6777. This can be backported to v6 if #6777 is. I don't believe a don't-land label is appropriate but we might need a better way of tracking dependencies between PRs.

@mscdex

This comment has been minimized.

Show comment
Hide comment
@mscdex

mscdex Jun 27, 2016

Contributor

@jasnell That PR is already in v6, it's actually in v6.2.1.

Contributor

mscdex commented Jun 27, 2016

@jasnell That PR is already in v6, it's actually in v6.2.1.

@MylesBorins

This comment has been minimized.

Show comment
Hide comment
@MylesBorins

MylesBorins Jul 11, 2016

Member

I put the don't land label for v4.x please correct if wrong

Member

MylesBorins commented Jul 11, 2016

I put the don't land label for v4.x please correct if wrong

addaleax added a commit to addaleax/node that referenced this pull request Sep 7, 2016

[squash] fixup for v6 by adding missing require()
steal a line from #7207 to make things work

Fishrock123 added a commit to Fishrock123/node that referenced this pull request Sep 8, 2016

fixup for v6 by adding missing require()
steal a line from #7207 to make things work

PR-URL: nodejs#8437
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>

@Fishrock123 Fishrock123 referenced this pull request Sep 8, 2016

Closed

crypto: fix ucs2/ucs-2/utf16le/utf-16le encoding check #8301

3 of 3 tasks complete

Fishrock123 added a commit to Fishrock123/node that referenced this pull request Sep 8, 2016

fixup for v6 by adding missing require()
steal a line from #7207 to make things work

PR-URL: nodejs#8437
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>

addaleax added a commit to addaleax/node that referenced this pull request Sep 9, 2016

buffer,string_decoder: consolidate encoding validation logic
Buffer.isEncoding and string_decoder.normalizeEncoding shared
quite a bit of logic. This moves the primary logic into
internal/util. The userland modules that monkey patch Buffer.isEncoding
should still work.

PR-URL: nodejs#7207
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: Trevor Norris <trev.norris@gmail.com>

Fishrock123 added a commit that referenced this pull request Sep 13, 2016

buffer,string_decoder: consolidate encoding validation logic
Buffer.isEncoding and string_decoder.normalizeEncoding shared
quite a bit of logic. This moves the primary logic into
internal/util. The userland modules that monkey patch Buffer.isEncoding
should still work.

PR-URL: #7207
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: Trevor Norris <trev.norris@gmail.com>

Refs: #8463
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>

@BridgeAR BridgeAR referenced this pull request Feb 15, 2018

Closed

buffer: improve fill & normalizeEncoding performance #18790

4 of 4 tasks complete

BridgeAR added a commit to BridgeAR/node that referenced this pull request Mar 2, 2018

buffer: stricter isEncoding
Due to code consolidation in nodejs#7207
the isEncoding function got less strict. This commit makes sure
isEncoding returns false for empty strings as before the consolidation.

PR-URL: nodejs#18790
Refs: nodejs#7207
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>

MayaLekova added a commit to MayaLekova/node that referenced this pull request May 8, 2018

buffer: stricter isEncoding
Due to code consolidation in nodejs#7207
the isEncoding function got less strict. This commit makes sure
isEncoding returns false for empty strings as before the consolidation.

PR-URL: nodejs#18790
Refs: nodejs#7207
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment