Extend this proposal to include serialization? #12

Closed
gibson042 opened this issue Feb 6, 2020 · 10 comments · Fixed by #22

Comments

@gibson042
Collaborator

This proposal currently covers only the parsing side, but full round-tripping would also require serialization of e.g. BigInt values as unquoted digit sequences. The committee seemed tepid about including serialization in this proposal, but I still wanted to capture the concept even if it is rejected as expected.
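For context, a minimal sketch of the serialization gap described above: `JSON.stringify` throws when it meets a BigInt, and the usual replacer workaround emits a quoted string rather than an unquoted digit sequence. The variable names are illustrative and not part of the proposal.

```javascript
// Today, JSON.stringify throws a TypeError when it encounters a BigInt.
let threw = false;
try {
  JSON.stringify({ id: 1n });
} catch (err) {
  threw = err instanceof TypeError;
}
console.log(threw); // true

// A common workaround is a replacer that returns a string. The output is
// valid JSON, but the value is now quoted instead of a raw digit sequence.
const quoted = JSON.stringify({ id: 1n }, (_key, value) =>
  typeof value === "bigint" ? String(value) : value
);
console.log(quoted); // {"id":"1"}
```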

@kaizhu256

i think serializing bigint will likely require an additional options argument in JSON.stringify as mentioned in cookbook scenarios.

if the intent is to roundtrip parse and stringify bigints, then i feel this proposal is a dead-end and not the way to go.

@rauschma

Yes, please! Roundtripping seems such a core use case that it would be a shame if it weren’t supported.

One possibility:

function bigintReplacer(_key, value) {
  if (typeof value === 'bigint') {
    return JSON.rawSource(String(value));
    // Or: return {[Symbol.rawJsonSource]: String(value)};
  }
  return value;
}

@gibson042
Collaborator Author

Including serialization will require motivating use cases. I can imagine needing to preserve uint64 data (e.g., Twitter ids) and possibly high-precision sensor data (e.g., IEEE binary128), but could use some broader and/or more concrete examples.
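To illustrate the uint64 concern, here is a small sketch of the precision loss that occurs when a large integer id passes through `JSON.parse` as a Number. The id value is a made-up example (2^53 + 1), not a real Twitter id.

```javascript
// 2^53 + 1 is the smallest positive integer that a Number cannot represent.
const source = '{"id": 9007199254740993}';
const parsed = JSON.parse(source);

console.log(parsed.id); // 9007199254740992
console.log(Number.isSafeInteger(parsed.id)); // false

// A BigInt holds the value exactly, but JSON.stringify cannot emit it
// as an unquoted digit sequence.
const exact = BigInt("9007199254740993");
console.log(exact.toString()); // 9007199254740993
```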

@kaizhu256

> but could use some broader and/or more concrete examples.

hypothetical-but-credible-finapp-example, is message-passing sql-tables with [arbitrary] bigint-columns between browser <-> server.

e.g. serialize following sql-table:

id    stock         market_cap
INT   VARCHAR(4)    BIGINT
--    --------      ------------------
 1    aapl          $1,690,000,000,000
 2    amzn          $1,550,000,000,000
 3    goog          $1,070,000,000,000

to space-efficient json-form:

{
    "columns": [ "id", "stock", "market_cap" ],
    "rows": [
        [ 1, "aapl", 1690000000000 ],
        [ 2, "amzn", 1550000000000 ],
        [ 3, "goog", 1070000000000 ]
    ]
}

and roundtrip-message-pass between [browser] sql.js <-> [server] mssql.

@gibson042
Collaborator Author

Thanks. Those numbers are three orders of magnitude less than Number.MAX_SAFE_INTEGER, but perhaps there is something similar for cryptocurrencies or whole-market summations.
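The magnitude claim can be checked directly. This is only an arithmetic sketch using the market-cap figures from the table above.

```javascript
const marketCaps = [1690000000000, 1550000000000, 1070000000000];

// Every value is a safe integer, so Number represents it exactly.
console.log(marketCaps.every(Number.isSafeInteger)); // true

// Ratio of the safe-integer limit to the largest value in the table.
const ratio = Number.MAX_SAFE_INTEGER / Math.max(...marketCaps);
console.log(Math.floor(Math.log10(ratio))); // 3

// These values therefore already round-trip through JSON without BigInt.
const roundTripped = JSON.parse(JSON.stringify(marketCaps));
console.log(roundTripped.every((n, i) => n === marketCaps[i])); // true
```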

@gibson042
Collaborator Author

There was consensus on the TC39 Incubator call to include serialization in this proposal to avoid shipping an incomplete solution with corresponding ecosystem fragmentation when serialization is ultimately added.

To avoid surreptitious output hijacking, the approach will tentatively use wrapping objects with symbols that are unique for each invocation of JSON.stringify, e.g.

let rawTags = [];
function replacer(key, val, {rawTag}) {
  rawTags.push(rawTag);
  if ( typeof val !== "bigint" ) return val;
  // Serialize BigInt values as raw digit strings.
  return {[rawTag]: String(val)};
}

// BigInt values serialize in context as raw digit strings.
assert.strictEqual(JSON.stringify([1n], replacer), "[1]");
assert.strictEqual(JSON.stringify([2n], replacer), "[2]");

// The replacer was invoked four times: for each JSON.stringify call, once for the array and once for its element.
assert.strictEqual(rawTags.length, 4);

// The rawTag values match within the first call and within the second call...
assert.strictEqual(rawTags[1], rawTags[0]);
assert.strictEqual(rawTags[3], rawTags[2]);

// ...but differ between the first call and the second call.
assert.notStrictEqual(rawTags[1], rawTags[2]);
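Since no engine implements the `rawTag` argument sketched above, the behavior can be approximated in userland to experiment with it. The following is only an emulation under stated assumptions: `stringifyWithRawTag` is a hypothetical helper, and it relies on a random placeholder string that is assumed not to occur in the data being serialized.

```javascript
// Userland emulation (NOT a built-in): a fresh symbol is created for each
// call, so a tag captured during one call has no effect in another call.
function stringifyWithRawTag(value, replacer) {
  const rawTag = Symbol("rawTag");
  // Random placeholder prefix; assumed not to appear in the input data.
  const marker = `__raw_${Math.random().toString(36).slice(2)}_`;
  const rawTexts = [];

  const text = JSON.stringify(value, function (key, val) {
    const result = replacer.call(this, key, val, { rawTag });
    const isTagged =
      result !== null &&
      typeof result === "object" &&
      Object.prototype.hasOwnProperty.call(result, rawTag);
    if (!isTagged) return result;
    // Remember the raw text and emit a placeholder string in its place.
    rawTexts.push(String(result[rawTag]));
    return marker + (rawTexts.length - 1);
  });
  if (text === undefined) return undefined;

  // Swap each quoted placeholder for the raw text it stands for.
  const pattern = new RegExp(`"${marker}(\\d+)"`, "g");
  return text.replace(pattern, (_match, index) => rawTexts[Number(index)]);
}

const seenTags = [];
function bigintReplacer(_key, val, { rawTag }) {
  seenTags.push(rawTag);
  return typeof val === "bigint" ? { [rawTag]: String(val) } : val;
}

console.log(stringifyWithRawTag([1n], bigintReplacer)); // [1]
console.log(stringifyWithRawTag({ id: 2n ** 64n - 1n }, bigintReplacer)); // {"id":18446744073709551615}
console.log(seenTags[0] === seenTags[1], seenTags[1] === seenTags[2]); // true false
```

Unlike a built-in implementation, this emulation does not validate the raw text and could be confused by input that happens to contain the placeholder.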

@bergus

bergus commented Dec 4, 2020

How will this work with .toJSON() methods? Are they allowed to return raw values as well?

Can you elaborate about "surreptitious output hijacking", what scenarios are you worried about? (Is there literature about attack vectors, or generic security advice?) I can think of a well-known Symbol.rawJsonSource passing security boundaries and affecting JSON output where I wouldn't want it, but a realm-specific JSON.rawSource should be only accessible to those who could also overwrite JSON.stringify itself. Unless it's leaked…

Do the raw source contents need to be valid JSON texts/tokens? Could I use JSON.stringify to output, say, YAML with the right replacer?

@mhofman
Member

mhofman commented Oct 21, 2021

Should the algorithm verify that the raw value parses as JSON? That would deal with output hijacking in the case where the replacer is composed of potentially untrusted behaviors (e.g. delegating to a class-specific replacer), without requiring the replacer itself to implement this type of checking.
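One way such a check might look is sketched below. `isRawJSONPrimitive` is a hypothetical helper, and restricting raw text to primitive values is an assumption made for this example rather than something decided in this thread.

```javascript
// Hypothetical check: accept raw text only if it is a complete JSON text
// for a primitive value (number, string, boolean, or null).
function isRawJSONPrimitive(text) {
  if (typeof text !== "string") return false;
  try {
    const value = JSON.parse(text);
    return value === null || typeof value !== "object";
  } catch {
    return false;
  }
}

console.log(isRawJSONPrimitive("12345678901234567890")); // true
// Text that tries to smuggle an extra property into the output is rejected.
console.log(isRawJSONPrimitive('1,"admin":true')); // false
// Objects and arrays are rejected under the primitive-only assumption.
console.log(isRawJSONPrimitive('{"a":1}')); // false
```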

@legendecas
Copy link
Member

I'm curious whether there are other use cases for a raw JSON source replacer besides outputting BigInt values as unquoted digit sequences. Would it be more feasible to just add BigInt primitive support to JSON.stringify instead of exposing a generic raw JSON source replacer?

@bakkot

bakkot commented Oct 22, 2021

I can think of at least a couple:

  • serializing other non-BigInt values as numbers, e.g. from a userland BigDecimal library
  • escaping more characters in strings, e.g. when there's an intermediary which mangles non-ascii characters
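The second use case can be sketched as follows. `toAsciiJSONString` is a hypothetical helper that builds the kind of string literal a raw-source replacer might emit. A replacer cannot emit this text today, because any string it returns is quoted and escaped again by `JSON.stringify`.

```javascript
// Produce a JSON string literal in which every non-ASCII UTF-16 code unit
// is written as a \uXXXX escape, so the output is pure ASCII.
function toAsciiJSONString(str) {
  return JSON.stringify(str).replace(/[\u0080-\uffff]/g, (unit) =>
    "\\u" + unit.charCodeAt(0).toString(16).padStart(4, "0")
  );
}

const literal = toAsciiJSONString("café ☕");
console.log(literal); // "caf\u00e9 \u2615"

// The escaped literal still parses back to the original string.
console.log(JSON.parse(literal) === "café ☕"); // true
```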

gibson042 added a commit to gibson042/proposal-json-parse-with-source that referenced this issue May 28, 2022
gibson042 added a commit that referenced this issue May 28, 2022
7 participants