feat(cloud): TOON re-encoder for JSON-array outputs#4

Closed
jhonatanjunio wants to merge 1 commit into main from feat/p1-toon-encoder

@jhonatanjunio
Owner

Closes #3

Summary

Adds a Token-Oriented Object Notation (TOON, https://github.com/toon-format/toon) re-encoder applied to JSON-array outputs from gh, kubectl, aws, gcloud, and az. Reduces tokens on this shape by 40-60% by declaring the schema once in a header and listing rows CSV-style.

Why

gh pr list --json …, kubectl get pods -o json, aws ec2 describe-instances --output json, etc. all return arrays of uniform objects, exactly the shape TOON was designed for. Today CloudHandler only strips noise, so every object still repeats the same field names. Declaring those names once is where the expected 40-60% token saving comes from.

What changed

  • New module src/strategies/toon.rs (~340 LoC incl. tests).
    • try_to_toon(text: &str) -> Option<String> — hand-rolled JSON sub-parser + TOON emitter, zero-dep.
    • Strictly lossless-or-reject: returns None for any non-uniform, nested, or out-of-order input, and whenever the TOON output would not be smaller than the input.
  • CloudHandler calls it when the command requests JSON (--json, -o json, -ojson, --output json, --output=json, --format json, --format=json, -f json); otherwise pipes through unchanged.
  • commands::cloud::tests::detects_json_request_flags verifies the detector across CLI conventions.
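
To illustrate the flag detection described above, here is a minimal sketch. The function name `requests_json` and its `&[&str]` argument shape are assumptions made for illustration; the detector in this PR may be structured differently. It only covers the flag spellings listed in the bullet above.

```rust
/// Returns true when the argument list asks the CLI for JSON output.
/// Hypothetical helper: the name and signature are not taken from the PR.
fn requests_json(args: &[&str]) -> bool {
    args.iter().enumerate().any(|(i, &arg)| match arg {
        // Single-token forms, e.g. `gh pr list --json number,title`.
        "--json" | "-ojson" | "--output=json" | "--format=json" => true,
        // Two-token forms: the flag must be followed by the literal `json`.
        "-o" | "--output" | "--format" | "-f" => args.get(i + 1) == Some(&"json"),
        _ => false,
    })
}
```

Matching on whole tokens rather than substrings keeps a value such as `-o yaml` or a field list containing the word "json" from triggering the encoder.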

Worked example

Input (raw gh pr list --json number,title,state):

[
    {"number": 1, "title": "Add feature", "state": "OPEN"},
    {"number": 2, "title": "Fix bug", "state": "MERGED"},
    {"number": 3, "title": "Refactor", "state": "CLOSED"}
]

Output (TOON):

items[3]{number,title,state}:
  1,Add feature,OPEN
  2,Fix bug,MERGED
  3,Refactor,CLOSED
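
The emitter half of this transformation can be sketched as follows. This is a simplified illustration, not the code in src/strategies/toon.rs: it assumes the JSON has already been parsed into string cells, the names `encode_cell` and `emit_toon` are hypothetical, and the quoting rule is a reduced approximation of what a full TOON encoder would need.

```rust
/// Quote a cell when emitting it bare would be ambiguous.
/// Simplified rule (an assumption): quote empty cells and cells containing
/// a comma, a double quote, a newline, or leading/trailing whitespace.
fn encode_cell(value: &str) -> String {
    let needs_quotes = value.is_empty()
        || value.contains(',')
        || value.contains('"')
        || value.contains('\n')
        || value != value.trim();
    if needs_quotes {
        // Escape backslashes first, then quotes, before wrapping.
        format!("\"{}\"", value.replace('\\', "\\\\").replace('"', "\\\""))
    } else {
        value.to_string()
    }
}

/// Emit a header declaring the row count and field names once,
/// followed by one comma-separated line per row.
/// Returns None for an empty table or a row whose width differs from the header.
fn emit_toon(name: &str, fields: &[&str], rows: &[Vec<&str>]) -> Option<String> {
    if rows.is_empty() || rows.iter().any(|row| row.len() != fields.len()) {
        return None;
    }
    let mut out = format!("{}[{}]{{{}}}:\n", name, rows.len(), fields.join(","));
    for row in rows {
        let cells: Vec<String> = row.iter().map(|cell| encode_cell(cell)).collect();
        out.push_str("  ");
        out.push_str(&cells.join(","));
        out.push('\n');
    }
    Some(out)
}
```

Called with the name `items`, the fields `number,title,state`, and the three rows from the input above, this sketch produces the TOON output shown in the worked example.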

Constraints honoured

  • Zero new runtime deps (Cargo.toml unchanged).
  • Returns None if TOON would not be smaller than the input — never makes things worse.
  • Falls back to the existing pipeline on any non-conforming shape.
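
The "never makes things worse" and fallback constraints can be pictured as one small guard. This is a sketch under assumptions: `compress_or_passthrough` is a hypothetical name, the encoder is passed in as a closure so the example stands alone, and byte length is used as a stand-in for token count.

```rust
/// Use the encoder's output only when it exists and is strictly shorter;
/// otherwise return the raw text unchanged.
/// Hypothetical helper; byte length is an assumed proxy for token count.
fn compress_or_passthrough(raw: &str, encode: impl Fn(&str) -> Option<String>) -> String {
    match encode(raw) {
        Some(toon) if toon.len() < raw.len() => toon,
        // Encoder rejected the shape, or the result was not smaller.
        _ => raw.to_string(),
    }
}
```

In the PR the size check lives inside the encoder itself (it returns None when the output would not be smaller); the guard is drawn separately here only to make the fallback path explicit.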

Tests

13 new unit tests:

TOON encoder (12):

  • encodes_uniform_gh_pr_list
  • encodes_kubectl_pods_shape
  • quotes_values_with_commas
  • rejects_heterogeneous_keys
  • rejects_keys_in_different_order
  • rejects_nested_objects
  • rejects_nested_arrays
  • rejects_empty_array
  • rejects_non_array_root
  • rejects_when_toon_not_smaller
  • handles_nulls_and_booleans_and_floats
  • decodes_json_string_escapes

CloudHandler dispatch (1):

  • detects_json_request_flags

Full lib suite: 172/172 passing, including the 13 new tests (12 encoder + 1 dispatch).

Out of scope

  • NDJSON streams from docker ps --format '{{json .}}' — follow-up.
  • cargo metadata --format-version 1 — nested by design, not worth TOON.

…, gcloud, az)

Adds `strategies::toon` — a zero-dep encoder that re-emits JSON arrays of
uniform flat objects as TOON (Token-Oriented Object Notation,
https://github.com/toon-format/toon). TOON declares the schema once in a
header and lists rows CSV-style, typically saving 40-60% on tokens for
this shape.

`CloudHandler` calls the encoder when the command requests JSON output
(`--json`, `-o json`, `--output json`, `--format=json`, etc.) and pipes
through to the standard filter pipeline otherwise.

The encoder is strictly lossless-or-reject: it returns None whenever the
input deviates from a top-level array of flat uniform objects (mismatched
keys, key ordering, nested values, empty array, non-array root, or when
the TOON output would not actually be smaller).

Closes claudioemmanuel#128

Signed-off-by: Jhonatan Junio <jhonatanjuniocp@gmail.com>
@jhonatanjunio force-pushed the feat/p1-toon-encoder branch from 02f4b83 to 5202d61 (May 18, 2026 17:11)
@jhonatanjunio
Owner Author

Superseded by upstream PR — moved to claudioemmanuel/squeez. See README of this fork for the upstream link.

Development

Successfully merging this pull request may close these issues.

P1: Add TOON encoder for JSON-array outputs (gh, kubectl, aws)
