Preserve non-string dict keys in rich display #9301

Merged
manzt merged 7 commits into main from manzt/dict-keys
Apr 21, 2026

Collaborator

@manzt manzt commented Apr 21, 2026

Fixes #9288
Fixes #2667

Marimo displays dicts using application/json but Python dicts aren't JSON and accept non-string keys (int, tuple, ...).

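The existing serializer worked around this limitation by passing primitive keys to json.dumps (which stringifies them) and running str() on composite keys. This led to silently dropped entries when a stringified key collided with an existing string key (#9288), and to tuple keys rendering as quoted strings with their type information lost (#2667).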
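
The key collision can be reproduced with the standard library alone. The following is a minimal sketch that uses only `json` and the `my_map` example from this description; it does not call any marimo code.

```python
import json

my_map = {"2": "oh", 2: "no"}

# json.dumps stringifies the int key, so the wire text holds two "2" members.
wire = json.dumps(my_map)
print(wire)  # {"2": "oh", "2": "no"}

# Parsing keeps one value per member name (the last one wins in json.loads),
# so an entry disappears without any error.
parsed = json.loads(wire)
print(len(my_map), len(parsed))  # 2 1
print(parsed)  # {'2': 'no'}

# Composite keys are rejected by json.dumps outright, which is why a
# serializer has to convert them to strings some other way first.
try:
    json.dumps({(1, 2): "v"})
except TypeError as exc:
    print("TypeError:", exc)
```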

These changes extend the existing text/plain+<type>: leaf-mimetype convention (already used for value encoding) to dict keys. Non-string keys are emitted as prefixed strings that the frontend decodes on render and copy. Literal string keys that happen to start with text/plain+ are escaped so they round-trip unchanged.

Wire format

| Python key | Wire string |
| --- | --- |
| `"hello"` | `"hello"` |
| `"text/plain+int:2"` (literal str that looks encoded) | `"text/plain+str:text/plain+int:2"` |
| `2` / `2**64` (any int) | `"text/plain+int:<value>"` |
| `2.5` | `"text/plain+float:2.5"` |
| `float('nan')` / `float('inf')` | `"text/plain+float:nan"` / `"text/plain+float:inf"` |
| `True` / `False` | `"text/plain+bool:True"` / `"text/plain+bool:False"` |
| `None` | `"text/plain+none:"` |
| `(1, 2)` | `"text/plain+tuple:[1, 2]"` |
| `frozenset({1, 2})` | `"text/plain+frozenset:[1, 2]"` |
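
To make the table concrete, here is a rough Python sketch of an encoder that produces these wire strings. It is not the `_key_formatter` from this PR: the names `encode_key` and `_escape` are hypothetical, the frozenset ordering (sorted by `repr`) is an assumption made for determinism, and the `str()` fallback for other key types is modelled on the review-feedback commit message rather than on the actual source. Tuple and frozenset elements are assumed to be JSON-serializable.

```python
import json

PREFIX = "text/plain+"


def _escape(text: str) -> str:
    # Strings that already look encoded get a str: wrapper so they round-trip.
    return f"{PREFIX}str:{text}" if text.startswith(PREFIX) else text


def encode_key(key: object) -> str:
    """Hypothetical encoder mirroring the wire-format table (not marimo's code)."""
    if isinstance(key, str):
        return _escape(key)
    # bool must be tested before int: isinstance(True, int) is True in Python.
    if isinstance(key, bool):
        return f"{PREFIX}bool:{key}"
    if isinstance(key, int):
        return f"{PREFIX}int:{key}"
    if isinstance(key, float):
        # repr() gives "nan", "inf" and "-inf" for the non-finite values.
        return f"{PREFIX}float:{key!r}"
    if key is None:
        return f"{PREFIX}none:"
    if isinstance(key, tuple):
        return f"{PREFIX}tuple:{json.dumps(list(key))}"
    if isinstance(key, frozenset):
        # Assumption: sort by repr so the output is deterministic.
        return f"{PREFIX}frozenset:{json.dumps(sorted(key, key=repr))}"
    # Assumption: anything else falls back to str(), escaped like a literal string.
    return _escape(str(key))


# With typed keys, the colliding example keeps both entries through JSON.
my_map = {"2": "oh", 2: "no"}
wire = json.dumps({encode_key(k): v for k, v in my_map.items()})
print(len(json.loads(wire)))  # 2
```

Testing `bool` before `int` matters here because `True` would otherwise be emitted as `text/plain+int:True`.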

Before / after

my_map = {"2": "oh", 2: "no"}

Before:

{ 1 Items
  "2": "no"          # silently dropped one entry
}

After: both entries are kept, with the string key rendered quoted and the int key unquoted.

Copilot AI review requested due to automatic review settings April 21, 2026 14:19
@manzt manzt added the bug Something isn't working label Apr 21, 2026

Comment thread tests/_output/formatters/test_structures.py Outdated
Contributor

Copilot AI left a comment

Pull request overview

This PR fixes rich dict rendering/copying when Python dicts contain non-string keys by introducing a reversible “typed key” wire encoding (text/plain+<type>:) so keys survive JSON round-trips without collisions or type loss.

Changes:

  • Backend: extend structure flattening to support a key_formatter, and use it in the structures formatter to encode non-string dict keys with text/plain+... prefixes (escaping literal string keys that already start with the prefix).
  • Tests (Python): add regression and coverage tests for non-string key encoding (ints, floats incl. NaN/Inf, tuples, frozensets, escaping, nesting).
  • Frontend: decode encoded keys for tree rendering and for “copy as Python” output; add corresponding unit tests.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| `tests/_output/formatters/test_structures.py` | Adds coverage/regression tests asserting dict keys encode safely and round-trip through strict JSON parsing. |
| `marimo/_utils/flatten.py` | Adds optional key_formatter hook to control dict-key repacking during flatten/unflatten. |
| `marimo/_output/formatters/structures.py` | Implements key encoding (_key_formatter) and wires it into format_structure() output for JSON. |
| `frontend/src/components/editor/output/JsonOutput.tsx` | Decodes typed key strings for display in the JSON tree and for Python-like copy output. |
| `frontend/src/components/editor/output/__tests__/json-output.test.ts` | Adds copy-output tests to ensure encoded keys decode into correct Python literals. |
| `frontend/src/components/editor/output/__tests__/JsonOutput-mimetype.test.tsx` | Adds render test verifying encoded keys display as Python-style keys (unquoted ints, tuples, etc.). |

Comment thread frontend/src/components/editor/output/JsonOutput.tsx Outdated
Comment thread frontend/src/components/editor/output/JsonOutput.tsx Outdated
Comment thread frontend/src/components/editor/output/JsonOutput.tsx
Comment thread frontend/src/components/editor/output/__tests__/JsonOutput-mimetype.test.tsx Outdated
Comment thread marimo/_output/formatters/structures.py Outdated
Comment thread marimo/_output/formatters/structures.py Outdated
@manzt manzt force-pushed the manzt/dict-keys branch from 2346ad0 to 64379ce Compare April 21, 2026 14:41
manzt added a commit that referenced this pull request Apr 21, 2026
Review feedback from Copilot on #9301:

- Tuple key with a single element rendered as `(1)` instead of `(1,)`
  (the former is just `1` in Python, not a tuple). Parse the JSON list
  payload and format with a trailing comma for length-1.
- Empty frozenset key rendered as `frozenset({})` (which Python reads
  as constructing from an empty dict). Special-case empty payloads as
  `frozenset()`.
- Both fixes apply to tree rendering (`KEY_DECODERS`) and copy output
  (`decodeKeyForCopy`); shared helpers `formatTuplePayload` and
  `formatFrozensetPayload` handle both paths.
- Python `str(k)` fallback paths could emit strings starting with
  `text/plain+` (e.g. a custom hashable with a hostile `__str__`),
  which the frontend would then mis-decode. Route all fallbacks through
  `_escape_fallback` so they get the same `text/plain+str:` escape
  we use for literal string keys.
- Tightened the `not.toContain(...)` test assertions that had extra
  trailing characters making them pass trivially.

New tests:
- Python: `{frozenset(): "v"}`, `{(42,): "v"}`, and a `Hostile`
  class that returns `text/plain+int:99` from `__str__`.
- Frontend: copy output for 1-tuple and empty-frozenset keys.
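
The two formatting edge cases from this commit message can be illustrated in a few lines. The real helpers (`formatTuplePayload`, `formatFrozensetPayload`) are TypeScript in the frontend; what follows is only a Python transliteration of the behaviour described above, with hypothetical snake_case names, and it assumes flat payloads of scalars.

```python
import json


def _element(value: object) -> str:
    # Double-quote strings (JSON style); use Python spelling for everything else.
    return json.dumps(value) if isinstance(value, str) else repr(value)


def format_tuple_payload(payload: str) -> str:
    items = [_element(v) for v in json.loads(payload)]
    if len(items) == 1:
        # A 1-tuple needs the trailing comma: (1) is just the int 1.
        return "(" + items[0] + ",)"
    return "(" + ", ".join(items) + ")"


def format_frozenset_payload(payload: str) -> str:
    items = [_element(v) for v in json.loads(payload)]
    if not items:
        # frozenset({}) would build from an empty dict, so spell it frozenset().
        return "frozenset()"
    return "frozenset({" + ", ".join(items) + "})"


print(format_tuple_payload("[1]"))       # (1,)
print(format_tuple_payload("[1, 2]"))    # (1, 2)
print(format_frozenset_payload("[]"))    # frozenset()
print(format_frozenset_payload("[1, 2]"))  # frozenset({1, 2})
```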
@manzt manzt force-pushed the manzt/dict-keys branch from 64379ce to 3ec0172 Compare April 21, 2026 14:54
Collaborator Author

manzt commented Apr 21, 2026

As a follow-up, I think we could make quoting consistent between keys and values. (Fixed.)


@manzt manzt force-pushed the manzt/dict-keys branch from 3ec0172 to 4b84ff9 Compare April 21, 2026 15:14
@manzt manzt force-pushed the manzt/dict-keys branch from 4b84ff9 to 0d60939 Compare April 21, 2026 15:15
@manzt manzt force-pushed the manzt/dict-keys branch from fac7e59 to 151907c Compare April 21, 2026 20:20
@manzt manzt force-pushed the manzt/dict-keys branch from 151907c to 1e07171 Compare April 21, 2026 21:05
manzt added 2 commits April 21, 2026 17:25
Rich display of a Python dict was serialized as application/json via
json.dumps, which coerces non-string keys to strings. That means
{"2": "oh", 2: "no"} emitted duplicate JSON keys that JSON.parse
collapses on the frontend (entries silently dropped), and tuple keys
like (1, 2) rendered as quoted "(1, 2)" (type info lost).

Non-string primitive and composite keys are now encoded with the same
text/plain+<type>: mimetype convention used for values. The frontend
can decode them to restore the original Python types.

- flatten: new optional key_formatter param applied to each dict key
  before repacking; existing json_compat_keys behavior preserved as the
  default for other callers.
- structures: _key_formatter handles int, float (incl. NaN/Inf), bool,
  None, tuple, frozenset, and escapes literal string keys that start
  with 'text/plain+' so they round-trip unchanged.
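
As a loose illustration of what a `key_formatter` hook buys, the sketch below rebuilds nested containers while passing every dict key through a caller-supplied function. It is not marimo's `flatten`: `repack_keys` and `tag_ints` are hypothetical names, and the `str` default merely stands in for the pre-existing stringifying behaviour.

```python
from typing import Any, Callable

KeyFormatter = Callable[[Any], str]


def repack_keys(obj: Any, key_formatter: KeyFormatter = str) -> Any:
    """Hypothetical helper: rebuild nested containers with formatted dict keys."""
    if isinstance(obj, dict):
        return {
            key_formatter(k): repack_keys(v, key_formatter)
            for k, v in obj.items()
        }
    if isinstance(obj, (list, tuple)):
        return [repack_keys(v, key_formatter) for v in obj]
    return obj


def tag_ints(key: Any) -> str:
    # Toy formatter: tag ints, stringify everything else.
    if isinstance(key, int) and not isinstance(key, bool):
        return f"text/plain+int:{key}"
    return str(key)


nested = {1: {2: "x"}, "a": [{3: "y"}]}
print(repack_keys(nested, tag_ints))
# {'text/plain+int:1': {'text/plain+int:2': 'x'}, 'a': [{'text/plain+int:3': 'y'}]}

# With the plain str default, "2" and 2 still collapse into one entry.
print(repack_keys({"2": "oh", 2: "no"}))  # {'2': 'no'}
```

Keeping the formatter optional means callers that never pass one see no change in output.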

Fixes #9288. Partial fix for #2667 (frontend render in follow-up).

Decode the text/plain+<type>: keys emitted by the Python side so dict
output renders with the right Python types: int/float/bool/None
unquoted, tuples in parens, frozenset({...}), and string keys that
were escaped (because they looked encoded) are re-quoted as plain
strings.

- JsonOutput: standalone keyRenderer backed by a small KEY_DECODERS
  table, wired into the JsonViewer only when valueTypes is 'python'.
- getCopyValue: pre-walk the data to rewrite encoded keys into
  REPLACE_PREFIX/SUFFIX marker strings so the existing quote-strip
  pass unquotes them as Python literals. NaN/Inf float keys copy as
  float('nan'), float('inf'), -float('inf').
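
The copy path for primitive keys can be sketched as a small decoder from wire string to Python literal text. The actual implementation is TypeScript and works through marker strings; this Python version only mirrors the mapping described above. The name `key_to_python_source` is hypothetical, and tuple and frozenset keys are deliberately left out.

```python
import json

PREFIX = "text/plain+"

# NaN and infinities have no literal syntax in Python, so spell them as calls.
_FLOAT_SPECIALS = {
    "nan": "float('nan')",
    "inf": "float('inf')",
    "-inf": "-float('inf')",
}


def key_to_python_source(wire_key: str) -> str:
    """Hypothetical decoder: wire key -> Python literal text for copy output."""
    if not wire_key.startswith(PREFIX):
        return json.dumps(wire_key)  # ordinary string key, keep it quoted
    kind, _, payload = wire_key[len(PREFIX):].partition(":")
    if kind == "str":
        return json.dumps(payload)  # escaped literal string, re-quoted as-is
    if kind in ("int", "bool"):
        return payload
    if kind == "float":
        return _FLOAT_SPECIALS.get(payload, payload)
    if kind == "none":
        return "None"
    # Tags not handled in this sketch stay as quoted strings.
    return json.dumps(wire_key)


print(key_to_python_source("text/plain+int:2"))      # 2
print(key_to_python_source("text/plain+float:-inf"))  # -float('inf')
print(key_to_python_source("text/plain+str:text/plain+int:2"))  # "text/plain+int:2"
```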

Closes #2667 (tuple keys display as strings). Companion to the
backend encoder that fixes #9288.
manzt added 5 commits April 21, 2026 17:25
Manual smoke-test notebook for the rich display of Python dicts, exercising:

- baselines (empty, single-entry, record-shaped string-key dicts)
- value variety (ints, bigints, floats, NaN/Inf, bools, None, strings,
  lists, tuples, sets, frozensets, nested dicts, bytes)
- non-string keys (collision cases, all primitives, NaN/Inf, tuple,
  frozenset)
- the text/plain+str: string-escape edge case
- nesting and dict-in-list / tuple-of-dict composition
- defaultdict and OrderedDict
- Python-level True/1/1.0 hash-collapse
- the copy-to-Python button with an all-types target
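
The `True`/`1`/`1.0` item in the list is standard Python dict behaviour rather than a serialization problem: the keys merge when the dict is built, so no wire encoding could recover them. A short standalone demonstration:

```python
d = {True: "bool", 1: "int", 1.0: "float"}

# True == 1 == 1.0 and all three hash to the same value, so the dict keeps a
# single entry: the first key object paired with the last value assigned.
print(len(d))  # 1
print(d)       # {True: 'float'}
```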

Each section includes a serialized() helper that shows the wire JSON and
the JSON.parse entry count, making it easy to spot if a future change
silently drops entries.

Related issues: #9288, #2667.

Review feedback from Copilot on #9301:

- Tuple key with a single element rendered as `(1)` instead of `(1,)`
  (the former is just `1` in Python, not a tuple). Parse the JSON list
  payload and format with a trailing comma for length-1.
- Empty frozenset key rendered as `frozenset({})` (which Python reads
  as constructing from an empty dict). Special-case empty payloads as
  `frozenset()`.
- Both fixes apply to tree rendering (`KEY_DECODERS`) and copy output
  (`decodeKeyForCopy`); shared helpers `formatTuplePayload` and
  `formatFrozensetPayload` handle both paths.
- Python `str(k)` fallback paths could emit strings starting with
  `text/plain+` (e.g. a custom hashable with a hostile `__str__`),
  which the frontend would then mis-decode. Route all fallbacks through
  `_escape_fallback` so they get the same `text/plain+str:` escape
  we use for literal string keys.
- Tightened the `not.toContain(...)` test assertions that had extra
  trailing characters making them pass trivially.

New tests:
- Python: `{frozenset(): "v"}`, `{(42,): "v"}`, and a `Hostile`
  class that returns `text/plain+int:99` from `__str__`.
- Frontend: copy output for 1-tuple and empty-frozenset keys.

Previously set values serialized as `text/plain+set:{1, 2, 3}` (Python
set-literal string via `str()`) and frozenset values fell through to
the `text/plain:` fallback (plain-text display). Both used Python's
single-quoted repr for string elements, so a dict like:

    {"a": frozenset({"x", "y"})}

rendered with inconsistent quoting — double-quoted keys/values
throughout except for the frozenset value's elements, which came out
single-quoted (`frozenset({'x', 'y'})`).

Normalize both to the JSON-list payload form we already use for tuple
values and non-string key encoding. The frontend shares a pair of
helpers (`formatSetPayload`, `formatFrozensetPayload`) between the
tree renderer and the copy path, handling the empty cases correctly
(`set()` and `frozenset()`, not `{}`).

Wire format changes:

- set:       `text/plain+set:{1, 2, 3}`       -> `text/plain+set:[1, 2, 3]`
- frozenset: `text/plain:frozenset({'x','y'})` -> `text/plain+frozenset:["x", "y"]`

Rendering is now consistent:

- `{1, 2}` / `set()` for sets
- `frozenset({"x", "y"})` / `frozenset()` for frozensets
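
The new value payloads listed under "Wire format changes" can be approximated in Python as follows. This is a sketch only: `encode_set_value` is a hypothetical name, sorting by `repr` is an assumption to get stable output, and elements are assumed to be JSON-serializable.

```python
import json


def encode_set_value(value: "set | frozenset") -> str:
    """Hypothetical value encoder following the JSON-list payload form."""
    tag = "frozenset" if isinstance(value, frozenset) else "set"
    # Assumption: sort by repr so the element order is stable across runs.
    payload = json.dumps(sorted(value, key=repr))
    return f"text/plain+{tag}:{payload}"


print(encode_set_value({1, 2, 3}))              # text/plain+set:[1, 2, 3]
print(encode_set_value(frozenset({"x", "y"})))  # text/plain+frozenset:["x", "y"]
print(encode_set_value(set()))                  # text/plain+set:[]
```

Because the payload is produced by `json.dumps`, string elements come out double-quoted, which is what makes the quoting match the surrounding keys and values.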

Tests updated accordingly.

Replace `assert x == {literal}` patterns in the dict-key-encoding
tests with `assert x == snapshot({literal})`. Functionally identical
today, but if we ever need to update the expected wire format, the
snapshots auto-update with `pytest --inline-snapshot=update` instead
of being hand-edited in every test.

Also cleans up the redundant `import json` inside several test
functions — the module-level import (hoisted earlier) is in scope.
@manzt manzt force-pushed the manzt/dict-keys branch from 1e07171 to 6ca6b9d Compare April 21, 2026 21:25
@manzt manzt merged commit 2952e30 into main Apr 21, 2026
43 checks passed
@manzt manzt deleted the manzt/dict-keys branch April 21, 2026 21:52
@github-actions

🚀 Development release published. You may be able to view the changes at https://marimo.app?v=0.23.3-dev15

Development

Successfully merging this pull request may close these issues.

Rich dict display collapses integer-like string and int keys
tuple dict keys display as strings

3 participants