`::JSON.pretty_generate` should sort hash keys

**Summary**
`JSON.pretty_generate` currently preserves the insertion order of Ruby hashes. As a result, hashes that are equal but were built in a different order serialize to different JSON, which makes diffs noisy and reproducibility harder. Add an option to sort hash keys during generation to produce stable, predictable JSON output.

**Motivation / Problem**

* In many workflows (config generation, test fixtures, CI artifacts), stable serialization is critical.
* Hash insertion order may differ across code paths, Ruby versions, or data sources, causing semantically identical objects to produce different JSON.
* This complicates:

  * Git diffs and code reviews
  * Caching and content hashing
  * Snapshot testing
  * Reproducible builds
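A minimal sketch of the problem, using made-up config data: the two hashes below compare equal in Ruby, yet their serialized forms and content digests differ because only the insertion order changed.

```ruby
require "json"
require "digest"

# Illustrative data only: same pairs, different insertion order.
first  = { "name" => "app", "port" => 8080 }
second = { "port" => 8080, "name" => "app" }

hashes_equal = first == second                                   # Hash#== ignores order
json_equal   = JSON.generate(first) == JSON.generate(second)     # serialization does not
digests      = [first, second].map { |h| Digest::SHA256.hexdigest(JSON.generate(h)) }

puts hashes_equal        # true
puts json_equal          # false
puts digests.uniq.size   # 2
```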

**Proposed Solution**
Add an option to `JSON.pretty_generate` (and possibly `JSON.generate`) that sorts object keys lexicographically.

**API Options (one of):**

1. Keyword argument:

   ```ruby
   JSON.pretty_generate(obj, sort_keys: true)
   ```
2. Extend `JSON::State`:

   ```ruby
   state = JSON::State.new(sort_keys: true)
   JSON.pretty_generate(obj, state)
   ```

**Behavior**

* When `sort_keys: true`, all hashes are serialized with keys sorted (string comparison).
* Default remains `false` to preserve backward compatibility and performance characteristics.
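To make "string comparison" concrete, here is a small sketch of the ordering it would imply. It assumes keys are stringified with `to_s` before sorting, which is an assumption of this example and not settled behavior; the sample keys are illustrative.

```ruby
keys = [:b, "a", "10", "2", "B"]

# Symbols and Strings are not mutually comparable, so sorting the raw keys fails.
mixed_sort_failed =
  begin
    keys.sort
    false
  rescue ArgumentError
    true
  end

# Assumed normalization: stringify, then sort with String#<=> (bytewise comparison).
sorted = keys.map(&:to_s).sort

puts mixed_sort_failed  # true
puts sorted.inspect     # ["10", "2", "B", "a", "b"]
```

Note that bytewise ordering places `"10"` before `"2"` and uppercase before lowercase, which may or may not be what callers expect from "lexicographic".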

**Example**

```ruby
obj = { b: 1, a: 2 }

JSON.pretty_generate(obj)
# => {
#      "b": 1,
#      "a": 2
#    }

JSON.pretty_generate(obj, sort_keys: true)
# => {
#      "a": 2,
#      "b": 1
#    }
```

**Alternatives Considered**

* Pre-sorting hashes before serialization:

  * Requires deep traversal and duplication of data structures
  * Error-prone and inefficient for large nested objects

* Relying on insertion order discipline:

  * Not robust across boundaries or contributors
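For reference, the pre-sorting workaround looks roughly like the sketch below. `deep_sort_keys` is a hypothetical helper written for this issue, not part of the json gem; it rebuilds every nested hash, which is the traversal and duplication cost mentioned above.

```ruby
require "json"

# Hypothetical helper (not part of the json gem): returns a deep copy of `value`
# in which every Hash is rebuilt with stringified keys in sorted order.
def deep_sort_keys(value)
  case value
  when Hash
    value
      .map { |key, val| [key.to_s, deep_sort_keys(val)] }
      .sort_by(&:first)
      .to_h
  when Array
    # Array element order is significant in JSON, so only the elements are visited.
    value.map { |item| deep_sort_keys(item) }
  else
    value
  end
end

obj = { b: 1, a: { d: [{ z: 1, y: 2 }], c: 3 } }
compact = JSON.generate(deep_sort_keys(obj))

puts compact  # {"a":{"c":3,"d":[{"y":2,"z":1}]},"b":1}
puts JSON.pretty_generate(deep_sort_keys(obj))
```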

**Impact**

* Improves determinism and reproducibility across tooling and environments
* Reduces diff noise and improves developer experience
* Aligns with behavior available in other ecosystems (e.g., Python’s `json.dumps(sort_keys=True)`)

**Performance Considerations**

* Sorting adds O(n log n) work per object, where n is that object's key count
* Acceptable when opt-in; no impact on default behavior
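A rough way to gauge the overhead today is to time generation with and without pre-sorting. The sketch below uses synthetic data and sorts outside the generator as a stand-in for the proposed option, so it only approximates what a built-in implementation would cost.

```ruby
require "json"
require "benchmark"

# Synthetic data: one flat object with many string keys in shuffled order.
hash = (1..50_000).map { |i| ["key#{i}", i] }.shuffle(random: Random.new(42)).to_h

unsorted_json = nil
sorted_json = nil

unsorted_time = Benchmark.realtime { unsorted_json = JSON.generate(hash) }

# Stand-in for the proposed option: sort pairs by key, rebuild the hash, then generate.
sorted_time = Benchmark.realtime do
  sorted_json = JSON.generate(hash.sort_by { |key, _| key }.to_h)
end

puts format("unsorted: %.4fs, pre-sorted: %.4fs", unsorted_time, sorted_time)
```

Timings vary by machine and Ruby version, so the numbers are only meaningful relative to each other.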

**Backward Compatibility**

* Fully backward compatible if default remains unsorted

**Test Plan**

* Unit tests verifying:

  * Sorted vs unsorted output for flat and deeply nested hashes
  * Stability across multiple invocations
  * Mixed key types (symbols/strings) normalized to strings before sort
* Benchmark comparison with and without sorting

**Open Questions**

* Should sorting be strictly lexicographic on stringified keys?
* Should there be a global default toggle via `JSON::State` configuration?

**Additional Context**
This feature would support reproducible outputs in CI pipelines and long-lived systems where deterministic artifacts are a requirement.


`::JSON.pretty_generate` should sort hash keys #976
