Commit 28071a1

trevnorris authored and addaleax committed
buffer: introduce latin1 encoding term
When node began using the OneByte API (f150d56), it also switched to officially supporting ISO-8859-1, though at the time no new encoding string was introduced. Introduce the new encoding string 'latin1' to be more explicit. The previous 'binary' string is retained and documented as an alias to 'latin1'. While many tests have switched to use 'latin1', there are still plenty that do both 'binary' and 'latin1' checks side by side to ensure there is no regression.

PR-URL: #7111
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
1 parent 75b37a6 commit 28071a1

37 files changed, 246 additions and 124 deletions

doc/api/buffer.md

Lines changed: 13 additions & 3 deletions
@@ -165,12 +165,22 @@ The character encodings currently supported by Node.js include:
 this encoding will also correctly accept "URL and Filename Safe Alphabet" as
 specified in [RFC 4648, Section 5].

-* `'binary'` - A way of encoding the buffer into a one-byte (`latin-1`)
-encoded string. The string `'latin-1'` is not supported. Instead, pass
-`'binary'` to use `'latin-1'` encoding.
+* `'latin1'` - A way of encoding the buffer into a one-byte encoded string
+(as defined by the IANA in [RFC1345](https://tools.ietf.org/html/rfc1345),
+page 63, to be the Latin-1 supplement block and C0/C1 control codes).
+
+* `'binary'` - (deprecated) A way of encoding the buffer into a one-byte
+(`latin1`) encoded string.

 * `'hex'` - Encode each byte as two hexadecimal characters.

+_Note_: Today's browsers follow the [WHATWG
+spec](https://encoding.spec.whatwg.org/) that aliases both `latin1` and
+`iso-8859-1` to `win-1252`. Meaning, while doing something like `http.get()`,
+if the returned charset is one of those listed in the WHATWG spec it's possible
+that the server actually returned `win-1252` encoded data, and using `latin1`
+encoding may incorrectly decode the graphical characters.
+
 ## Buffers and TypedArray

 Buffers are also `Uint8Array` TypedArray instances. However, there are subtle
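
To illustrate the documentation change in `doc/api/buffer.md`, here is a minimal sketch that is not part of the commit. It assumes a Node.js release that includes this change, so that `'latin1'` is accepted as an encoding name; the variable names are made up for the example. It decodes the same bytes under both names and checks that each byte maps to the code point with the same value.

```javascript
// Bytes covering ASCII, the C1 control range, and the Latin-1 supplement block.
const bytes = Buffer.from([0x48, 0x69, 0x80, 0xe9, 0xff]);

// 'binary' is documented as an alias, so both names should decode identically.
const viaLatin1 = bytes.toString('latin1');
const viaBinary = bytes.toString('binary');

// Each byte becomes the code point with the same numeric value.
const codePoints = Array.from(viaLatin1, (ch) => ch.codePointAt(0));

// Encoding the string back with 'latin1' should restore the original bytes.
const roundTrip = Buffer.from(viaLatin1, 'latin1');

console.log(viaLatin1 === viaBinary);
console.log(codePoints);
console.log(roundTrip.equals(bytes));
```

Byte `0x80` decodes to U+0080, a C1 control code. Under windows-1252 the same byte is the euro sign, which is the kind of mismatch the note in the hunk warns about.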

doc/api/crypto.md

Lines changed: 34 additions & 34 deletions
@@ -160,7 +160,7 @@ console.log(encrypted);
 ### cipher.final([output_encoding])

 Returns any remaining enciphered contents. If `output_encoding`
-parameter is one of `'binary'`, `'base64'` or `'hex'`, a string is returned.
+parameter is one of `'latin1'`, `'base64'` or `'hex'`, a string is returned.
 If an `output_encoding` is not provided, a [`Buffer`][] is returned.

 Once the `cipher.final()` method has been called, the `Cipher` object can no
@@ -198,13 +198,13 @@ The `cipher.setAutoPadding()` method must be called before [`cipher.final()`][].
 ### cipher.update(data[, input_encoding][, output_encoding])

 Updates the cipher with `data`. If the `input_encoding` argument is given,
-it's value must be one of `'utf8'`, `'ascii'`, or `'binary'` and the `data`
+it's value must be one of `'utf8'`, `'ascii'`, or `'latin1'` and the `data`
 argument is a string using the specified encoding. If the `input_encoding`
 argument is not given, `data` must be a [`Buffer`][]. If `data` is a
 [`Buffer`][] then `input_encoding` is ignored.

 The `output_encoding` specifies the output format of the enciphered
-data, and can be `'binary'`, `'base64'` or `'hex'`. If the `output_encoding`
+data, and can be `'latin1'`, `'base64'` or `'hex'`. If the `output_encoding`
 is specified, a string using the specified encoding is returned. If no
 `output_encoding` is provided, a [`Buffer`][] is returned.

@@ -277,7 +277,7 @@ console.log(decrypted);
 ### decipher.final([output_encoding])

 Returns any remaining deciphered contents. If `output_encoding`
-parameter is one of `'binary'`, `'base64'` or `'hex'`, a string is returned.
+parameter is one of `'latin1'`, `'base64'` or `'hex'`, a string is returned.
 If an `output_encoding` is not provided, a [`Buffer`][] is returned.

 Once the `decipher.final()` method has been called, the `Decipher` object can
@@ -313,13 +313,13 @@ The `decipher.setAutoPadding()` method must be called before
 ### decipher.update(data[, input_encoding][, output_encoding])

 Updates the decipher with `data`. If the `input_encoding` argument is given,
-it's value must be one of `'binary'`, `'base64'`, or `'hex'` and the `data`
+it's value must be one of `'latin1'`, `'base64'`, or `'hex'` and the `data`
 argument is a string using the specified encoding. If the `input_encoding`
 argument is not given, `data` must be a [`Buffer`][]. If `data` is a
 [`Buffer`][] then `input_encoding` is ignored.

 The `output_encoding` specifies the output format of the enciphered
-data, and can be `'binary'`, `'ascii'` or `'utf8'`. If the `output_encoding`
+data, and can be `'latin1'`, `'ascii'` or `'utf8'`. If the `output_encoding`
 is specified, a string using the specified encoding is returned. If no
 `output_encoding` is provided, a [`Buffer`][] is returned.

@@ -361,7 +361,7 @@ Computes the shared secret using `other_public_key` as the other
 party's public key and returns the computed shared secret. The supplied
 key is interpreted using the specified `input_encoding`, and secret is
 encoded using specified `output_encoding`. Encodings can be
-`'binary'`, `'hex'`, or `'base64'`. If the `input_encoding` is not
+`'latin1'`, `'hex'`, or `'base64'`. If the `input_encoding` is not
 provided, `other_public_key` is expected to be a [`Buffer`][].

 If `output_encoding` is given a string is returned; otherwise, a
@@ -371,45 +371,45 @@ If `output_encoding` is given a string is returned; otherwise, a

 Generates private and public Diffie-Hellman key values, and returns
 the public key in the specified `encoding`. This key should be
-transferred to the other party. Encoding can be `'binary'`, `'hex'`,
+transferred to the other party. Encoding can be `'latin1'`, `'hex'`,
 or `'base64'`. If `encoding` is provided a string is returned; otherwise a
 [`Buffer`][] is returned.

 ### diffieHellman.getGenerator([encoding])

 Returns the Diffie-Hellman generator in the specified `encoding`, which can
-be `'binary'`, `'hex'`, or `'base64'`. If `encoding` is provided a string is
+be `'latin1'`, `'hex'`, or `'base64'`. If `encoding` is provided a string is
 returned; otherwise a [`Buffer`][] is returned.

 ### diffieHellman.getPrime([encoding])

 Returns the Diffie-Hellman prime in the specified `encoding`, which can
-be `'binary'`, `'hex'`, or `'base64'`. If `encoding` is provided a string is
+be `'latin1'`, `'hex'`, or `'base64'`. If `encoding` is provided a string is
 returned; otherwise a [`Buffer`][] is returned.

 ### diffieHellman.getPrivateKey([encoding])

 Returns the Diffie-Hellman private key in the specified `encoding`,
-which can be `'binary'`, `'hex'`, or `'base64'`. If `encoding` is provided a
+which can be `'latin1'`, `'hex'`, or `'base64'`. If `encoding` is provided a
 string is returned; otherwise a [`Buffer`][] is returned.

 ### diffieHellman.getPublicKey([encoding])

 Returns the Diffie-Hellman public key in the specified `encoding`, which
-can be `'binary'`, `'hex'`, or `'base64'`. If `encoding` is provided a
+can be `'latin1'`, `'hex'`, or `'base64'`. If `encoding` is provided a
 string is returned; otherwise a [`Buffer`][] is returned.

 ### diffieHellman.setPrivateKey(private_key[, encoding])

 Sets the Diffie-Hellman private key. If the `encoding` argument is provided
-and is either `'binary'`, `'hex'`, or `'base64'`, `private_key` is expected
+and is either `'latin1'`, `'hex'`, or `'base64'`, `private_key` is expected
 to be a string. If no `encoding` is provided, `private_key` is expected
 to be a [`Buffer`][].

 ### diffieHellman.setPublicKey(public_key[, encoding])

 Sets the Diffie-Hellman public key. If the `encoding` argument is provided
-and is either `'binary'`, `'hex'` or `'base64'`, `public_key` is expected
+and is either `'latin1'`, `'hex'` or `'base64'`, `public_key` is expected
 to be a string. If no `encoding` is provided, `public_key` is expected
 to be a [`Buffer`][].

@@ -460,7 +460,7 @@ Computes the shared secret using `other_public_key` as the other
 party's public key and returns the computed shared secret. The supplied
 key is interpreted using specified `input_encoding`, and the returned secret
 is encoded using the specified `output_encoding`. Encodings can be
-`'binary'`, `'hex'`, or `'base64'`. If the `input_encoding` is not
+`'latin1'`, `'hex'`, or `'base64'`. If the `input_encoding` is not
 provided, `other_public_key` is expected to be a [`Buffer`][].

 If `output_encoding` is given a string will be returned; otherwise a
@@ -476,14 +476,14 @@ The `format` arguments specifies point encoding and can be `'compressed'`,
 `'uncompressed'`, or `'hybrid'`. If `format` is not specified, the point will
 be returned in `'uncompressed'` format.

-The `encoding` argument can be `'binary'`, `'hex'`, or `'base64'`. If
+The `encoding` argument can be `'latin1'`, `'hex'`, or `'base64'`. If
 `encoding` is provided a string is returned; otherwise a [`Buffer`][]
 is returned.

 ### ecdh.getPrivateKey([encoding])

 Returns the EC Diffie-Hellman private key in the specified `encoding`,
-which can be `'binary'`, `'hex'`, or `'base64'`. If `encoding` is provided
+which can be `'latin1'`, `'hex'`, or `'base64'`. If `encoding` is provided
 a string is returned; otherwise a [`Buffer`][] is returned.

 ### ecdh.getPublicKey([encoding[, format]])
@@ -495,13 +495,13 @@ The `format` argument specifies point encoding and can be `'compressed'`,
 `'uncompressed'`, or `'hybrid'`. If `format` is not specified the point will be
 returned in `'uncompressed'` format.

-The `encoding` argument can be `'binary'`, `'hex'`, or `'base64'`. If
+The `encoding` argument can be `'latin1'`, `'hex'`, or `'base64'`. If
 `encoding` is specified, a string is returned; otherwise a [`Buffer`][] is
 returned.

 ### ecdh.setPrivateKey(private_key[, encoding])

-Sets the EC Diffie-Hellman private key. The `encoding` can be `'binary'`,
+Sets the EC Diffie-Hellman private key. The `encoding` can be `'latin1'`,
 `'hex'` or `'base64'`. If `encoding` is provided, `private_key` is expected
 to be a string; otherwise `private_key` is expected to be a [`Buffer`][]. If
 `private_key` is not valid for the curve specified when the `ECDH` object was
@@ -512,7 +512,7 @@ public point (key) is also generated and set in the ECDH object.

 Stability: 0 - Deprecated

-Sets the EC Diffie-Hellman public key. Key encoding can be `'binary'`,
+Sets the EC Diffie-Hellman public key. Key encoding can be `'latin1'`,
 `'hex'` or `'base64'`. If `encoding` is provided `public_key` is expected to
 be a string; otherwise a [`Buffer`][] is expected.

@@ -604,7 +604,7 @@ console.log(hash.digest('hex'));
 ### hash.digest([encoding])

 Calculates the digest of all of the data passed to be hashed (using the
-[`hash.update()`][] method). The `encoding` can be `'hex'`, `'binary'` or
+[`hash.update()`][] method). The `encoding` can be `'hex'`, `'latin1'` or
 `'base64'`. If `encoding` is provided a string will be returned; otherwise
 a [`Buffer`][] is returned.

@@ -615,7 +615,7 @@ called. Multiple calls will cause an error to be thrown.

 Updates the hash content with the given `data`, the encoding of which
 is given in `input_encoding` and can be `'utf8'`, `'ascii'` or
-`'binary'`. If `encoding` is not provided, and the `data` is a string, an
+`'latin1'`. If `encoding` is not provided, and the `data` is a string, an
 encoding of `'utf8'` is enforced. If `data` is a [`Buffer`][] then
 `input_encoding` is ignored.

@@ -678,7 +678,7 @@ console.log(hmac.digest('hex'));
 ### hmac.digest([encoding])

 Calculates the HMAC digest of all of the data passed using [`hmac.update()`][].
-The `encoding` can be `'hex'`, `'binary'` or `'base64'`. If `encoding` is
+The `encoding` can be `'hex'`, `'latin1'` or `'base64'`. If `encoding` is
 provided a string is returned; otherwise a [`Buffer`][] is returned;

 The `Hmac` object can not be used again after `hmac.digest()` has been
@@ -688,7 +688,7 @@ called. Multiple calls to `hmac.digest()` will result in an error being thrown.

 Updates the `Hmac` content with the given `data`, the encoding of which
 is given in `input_encoding` and can be `'utf8'`, `'ascii'` or
-`'binary'`. If `encoding` is not provided, and the `data` is a string, an
+`'latin1'`. If `encoding` is not provided, and the `data` is a string, an
 encoding of `'utf8'` is enforced. If `data` is a [`Buffer`][] then
 `input_encoding` is ignored.

@@ -768,7 +768,7 @@ object, it is interpreted as a hash containing two properties:
 * `key` : {String} - PEM encoded private key
 * `passphrase` : {String} - passphrase for the private key

-The `output_format` can specify one of `'binary'`, `'hex'` or `'base64'`. If
+The `output_format` can specify one of `'latin1'`, `'hex'` or `'base64'`. If
 `output_format` is provided a string is returned; otherwise a [`Buffer`][] is
 returned.

@@ -779,7 +779,7 @@ called. Multiple calls to `sign.sign()` will result in an error being thrown.

 Updates the `Sign` content with the given `data`, the encoding of which
 is given in `input_encoding` and can be `'utf8'`, `'ascii'` or
-`'binary'`. If `encoding` is not provided, and the `data` is a string, an
+`'latin1'`. If `encoding` is not provided, and the `data` is a string, an
 encoding of `'utf8'` is enforced. If `data` is a [`Buffer`][] then
 `input_encoding` is ignored.

@@ -831,7 +831,7 @@ console.log(verify.verify(public_key, signature));

 Updates the `Verify` content with the given `data`, the encoding of which
 is given in `input_encoding` and can be `'utf8'`, `'ascii'` or
-`'binary'`. If `encoding` is not provided, and the `data` is a string, an
+`'latin1'`. If `encoding` is not provided, and the `data` is a string, an
 encoding of `'utf8'` is enforced. If `data` is a [`Buffer`][] then
 `input_encoding` is ignored.

@@ -843,7 +843,7 @@ Verifies the provided data using the given `object` and `signature`.
 The `object` argument is a string containing a PEM encoded object, which can be
 one an RSA public key, a DSA public key, or an X.509 certificate.
 The `signature` argument is the previously calculated signature for the data, in
-the `signature_format` which can be `'binary'`, `'hex'` or `'base64'`.
+the `signature_format` which can be `'latin1'`, `'hex'` or `'base64'`.
 If a `signature_format` is specified, the `signature` is expected to be a
 string; otherwise `signature` is expected to be a [`Buffer`][].

@@ -869,7 +869,7 @@ or [buffers][`Buffer`]. The default value is `'buffer'`, which makes methods
 default to [`Buffer`][] objects.

 The `crypto.DEFAULT_ENCODING` mechanism is provided for backwards compatibility
-with legacy programs that expect `'binary'` to be the default encoding.
+with legacy programs that expect `'latin1'` to be the default encoding.

 New applications should expect the default to be `'buffer'`. This property may
 become deprecated in a future Node.js release.
@@ -889,7 +889,7 @@ recent OpenSSL releases, `openssl list-cipher-algorithms` will display the
 available cipher algorithms.

 The `password` is used to derive the cipher key and initialization vector (IV).
-The value must be either a `'binary'` encoded string or a [`Buffer`][].
+The value must be either a `'latin1'` encoded string or a [`Buffer`][].

 The implementation of `crypto.createCipher()` derives keys using the OpenSSL
 function [`EVP_BytesToKey`][] with the digest algorithm set to MD5, one
@@ -913,7 +913,7 @@ recent OpenSSL releases, `openssl list-cipher-algorithms` will display the
 available cipher algorithms.

 The `key` is the raw key used by the `algorithm` and `iv` is an
-[initialization vector][]. Both arguments must be `'binary'` encoded strings or
+[initialization vector][]. Both arguments must be `'latin1'` encoded strings or
 [buffers][`Buffer`].

 ### crypto.createCredentials(details)
@@ -968,7 +968,7 @@ recent OpenSSL releases, `openssl list-cipher-algorithms` will display the
 available cipher algorithms.

 The `key` is the raw key used by the `algorithm` and `iv` is an
-[initialization vector][]. Both arguments must be `'binary'` encoded strings or
+[initialization vector][]. Both arguments must be `'latin1'` encoded strings or
 [buffers][`Buffer`].

 ### crypto.createDiffieHellman(prime[, prime_encoding][, generator][, generator_encoding])
@@ -979,7 +979,7 @@ optional specific `generator`.
 The `generator` argument can be a number, string, or [`Buffer`][]. If
 `generator` is not specified, the value `2` is used.

-The `prime_encoding` and `generator_encoding` arguments can be `'binary'`,
+The `prime_encoding` and `generator_encoding` arguments can be `'latin1'`,
 `'hex'`, or `'base64'`.

 If `prime_encoding` is specified, `prime` is expected to be a string; otherwise
@@ -1345,7 +1345,7 @@ unified Stream API, and before there were [`Buffer`][] objects for handling
 binary data. As such, the many of the `crypto` defined classes have methods not
 typically found on other Node.js classes that implement the [streams][stream]
 API (e.g. `update()`, `final()`, or `digest()`). Also, many methods accepted
-and returned `'binary'` encoded strings by default rather than Buffers. This
+and returned `'latin1'` encoded strings by default rather than Buffers. This
 default was changed after Node.js v0.8 to use [`Buffer`][] objects by default
 instead.

lib/_http_outgoing.js

Lines changed: 5 additions & 5 deletions
@@ -130,7 +130,7 @@ OutgoingMessage.prototype._send = function(data, encoding, callback) {
 data = this._header + data;
 } else {
 this.output.unshift(this._header);
-this.outputEncodings.unshift('binary');
+this.outputEncodings.unshift('latin1');
 this.outputCallbacks.unshift(null);
 this.outputSize += this._header.length;
 if (typeof this._onPendingData === 'function')
@@ -453,7 +453,7 @@ OutgoingMessage.prototype.write = function(chunk, encoding, callback) {
 if (typeof chunk === 'string' &&
 encoding !== 'hex' &&
 encoding !== 'base64' &&
-encoding !== 'binary') {
+encoding !== 'latin1') {
 len = Buffer.byteLength(chunk, encoding);
 chunk = len.toString(16) + CRLF + chunk + CRLF;
 ret = this._send(chunk, encoding, callback);
@@ -468,7 +468,7 @@ OutgoingMessage.prototype.write = function(chunk, encoding, callback) {
 this.connection.cork();
 process.nextTick(connectionCorkNT, this.connection);
 }
-this._send(len.toString(16), 'binary', null);
+this._send(len.toString(16), 'latin1', null);
 this._send(crlf_buf, null, null);
 this._send(chunk, encoding, null);
 ret = this._send(crlf_buf, null, callback);
@@ -581,10 +581,10 @@ OutgoingMessage.prototype.end = function(data, encoding, callback) {
 };

 if (this._hasBody && this.chunkedEncoding) {
-ret = this._send('0\r\n' + this._trailer + '\r\n', 'binary', finish);
+ret = this._send('0\r\n' + this._trailer + '\r\n', 'latin1', finish);
 } else {
 // Force a flush, HACK.
-ret = this._send('', 'binary', finish);
+ret = this._send('', 'latin1', finish);
 }

 if (this.connection && data)
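
The `lib/_http_outgoing.js` hunks touch the code that frames chunked bodies. The sketch below is a standalone illustration of that framing, not Node's internal implementation: `frameChunk` and `lastChunk` are hypothetical helpers modelled on the expressions visible in the diff (`len.toString(16) + CRLF + chunk + CRLF` and `'0\r\n' + trailer + '\r\n'`).

```javascript
const CRLF = '\r\n';

// Hypothetical helper: frame one string chunk as
// "<byte length in hex>CRLF<data>CRLF", using the byte length under the
// given encoding rather than the string length.
function frameChunk(chunk, encoding = 'utf8') {
  const len = Buffer.byteLength(chunk, encoding);
  return len.toString(16) + CRLF + chunk + CRLF;
}

// Hypothetical helper: the terminating zero-length chunk, followed by an
// optional trailer block and a final CRLF.
function lastChunk(trailer = '') {
  return '0' + CRLF + trailer + CRLF;
}

// U+00E9 takes two bytes in UTF-8, so the size line is 6, not 5.
const framed = frameChunk('h\u00e9llo');

// The size line holds only hex digits, so encoding it as 'latin1' gives
// the same bytes as encoding it as ASCII.
const sizeLine = Buffer.from((255).toString(16), 'latin1');

console.log(JSON.stringify(framed));
console.log(JSON.stringify(lastChunk()));
console.log(sizeLine.equals(Buffer.from('ff', 'ascii')));
```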

lib/_tls_wrap.js

Lines changed: 1 addition & 1 deletion
@@ -608,7 +608,7 @@ TLSSocket.prototype.setServername = function(name) {

 TLSSocket.prototype.setSession = function(session) {
 if (typeof session === 'string')
-session = Buffer.from(session, 'binary');
+session = Buffer.from(session, 'latin1');
 this._handle.setSession(session);
 };
