Expose position and/or input of source text? #10

gibson042 · 2020-02-06T02:22:16Z

String.prototype.replace passes position and input arguments to replacer functions and the return value from RegExp.prototype.exec has "index" and "input" properties; JSON.parse could behave similarly.

const input = '\n\t"use\\u0020strict"';
let spied;
const parsed = JSON.parse(input, (key, val, context) => (spied = context, val));
parsed === 'use strict';
// → true
spied.source === '"use\\u0020strict"';
// → true
spied.index === 2;
// → true
spied.input === input;
// → true

As noted by @rbuckton, this could be useful for error reporting.

The text was updated successfully, but these errors were encountered:

ljharb · 2020-02-06T02:24:47Z

Please do not repeat the regex match object pattern of adding properties to an array :-)

gibson042 · 2020-02-06T02:40:40Z

No, it would be an object. 😅

michaelficarra · 2020-07-22T21:33:16Z

I'm concerned about the use of insignificant things such as whitespace, key ordering, etc to smuggle additional information. See my comments from February.

gibson042 · 2020-07-23T06:22:25Z

Such a channel already exists in the form of confusable glyphs/invisible whitespace/excess numeric precision/etc., not to mention extra object members and/or array items (and key ordering in particular is unaffected by this proposal; revivers are already called in source text order). That said, it's true that the bandwidth for such information would be dramatically increased. But is it really appropriate to try protecting against the extremely narrow scenarios in which a potentially malicious reviver is used with JSON.parse (e.g., that Eve doesn't have the ability to perform arbitrary inspection but does have unfettered access to deserialized values)? The fix for that would be straightforward—don't pass through source/input/index:

-JSON.parse(src, untrusted)
+JSON.parse(src, function scrubber(key, val, { keys }) { return call(untrusted, this, key, val, { keys }) })

michaelficarra · 2020-07-27T17:24:54Z

@gibson042 I think you misunderstood my concern. I am not worried about an adversarial scenario. I am worried about well-meaning developers deliberately (ab)using these insignificant encoding differences to transmit additional, extra-JSON information. I would prefer that JSON.parse only be able to act on the significant parts of the input.

gibson042 · 2020-07-27T19:41:59Z

This proposal necessarily introduces the ability to differentiate between input like 4 vs. 4.0 and "𝌆" vs. "\uD834\uDF06", which already seems to trip your concern. But beyond that, it currently also exposes whitespace at the non-leaf level (e.g., JSON.parse("0\t", (key, val, {source}) => source) === "0\t"). It would be possible, however, to mitigate that by only providing source text for primitives, reducing the available rope but also the internal consistency. How strong would you say your concerns are?

michaelficarra · 2020-07-27T20:49:45Z

I think that strikes a good balance. We don't have to eliminate the possibility, we just have to limit it enough to make it unappealing.

gibson042 · 2020-07-27T21:10:10Z

To clarify, is that support for a) exposing source text with all values but input and index for none, or b) taking the further step of denying the ability to read source text for objects and arrays?

michaelficarra · 2020-07-27T22:00:17Z

I think the only loss of (significant) information in JavaScript's interpretation of JSON is in the numbers/strings, right? In which case B would be preferable.

gibson042 · 2020-10-06T17:47:46Z

There was consensus on TC39 Incubator call to expose source text only for primitive values, and in particular to not provide input and index, unless a significant use case is identified.

bergus · 2020-12-04T21:09:25Z

index (actually, line number and position) would be very relevant if I use JSON.parse for deserialising values that are encoded as a custom scheme, where the decoding encounters a parse error. This would allow the parser to throw an appropriate human-readable error message, aiding in finding the malformed part of the source file. Not sure if that would count as a significant use case?

Examples:

// parsing JSON-LD
JSON.parse(text, (key, val, context) => {
  if (key == "@type" || key == "@context") {
    try {
      return new URL(val);
    } catch(err) {
      throw Obect.assign(new SyntaxError(err.message + ` at line ${context.line}, character ${context.position}`), err);
    }
  } else {
    return val;
  }
});

// parsing abstract syntax tree
JSON.parse(text, (key, val, context) {
  if (typeof val == "object" && val.type == "TreeNode") {
    if (!Array.isArray(val.children))
      throw new SyntaxError(`Encountered TreeNode without children at line ${context.line}, character ${context.position}`);
    return new TreeNode(val.children);
  }
  return val;
});

sarahghp · 2021-10-27T08:24:27Z

Agree that "throwing indicative errors" is a strong use-case for exposing position information. Especially with other data types in the pipeline (Decimal, etc.), establishing a generalized serialization pattern with indicative errors would be a strong base for growing functionality.

Per tc39#10 (comment)

Per #10 (comment)

gibson042 added a commit to gibson042/proposal-json-parse-with-source that referenced this issue May 28, 2022

Normative: Restrict source text access to primitive values

b5745c5

Per tc39#10 (comment)

gibson042 mentioned this issue May 28, 2022

Normative: Restrict source text access to primitive values #24

Merged

gibson042 added a commit that referenced this issue May 28, 2022

Normative: Restrict source text access to primitive values (#24)

7dd028c

Per #10 (comment)

gibson042 mentioned this issue May 28, 2022

Should reviver functions still get a context argument for non-primitive values? #25

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Expose position and/or input of source text? #10

Expose position and/or input of source text? #10

gibson042 commented Feb 6, 2020

ljharb commented Feb 6, 2020

gibson042 commented Feb 6, 2020

michaelficarra commented Jul 22, 2020

gibson042 commented Jul 23, 2020

michaelficarra commented Jul 27, 2020

gibson042 commented Jul 27, 2020

michaelficarra commented Jul 27, 2020

gibson042 commented Jul 27, 2020

michaelficarra commented Jul 27, 2020

gibson042 commented Oct 6, 2020

bergus commented Dec 4, 2020 •

edited

sarahghp commented Oct 27, 2021

Expose position and/or input of source text? #10

Expose position and/or input of source text? #10

Comments

gibson042 commented Feb 6, 2020

ljharb commented Feb 6, 2020

gibson042 commented Feb 6, 2020

michaelficarra commented Jul 22, 2020

gibson042 commented Jul 23, 2020

michaelficarra commented Jul 27, 2020

gibson042 commented Jul 27, 2020

michaelficarra commented Jul 27, 2020

gibson042 commented Jul 27, 2020

michaelficarra commented Jul 27, 2020

gibson042 commented Oct 6, 2020

bergus commented Dec 4, 2020 • edited

sarahghp commented Oct 27, 2021

bergus commented Dec 4, 2020 •

edited