
Conversation

@Sumith-Kumar-Saini
Contributor

Description

This PR refactors fastJsonFormat in src/index.js to improve speed, memory efficiency, and Unicode handling.
Key updates:

  • Added decodeUnicodeString() for proper \uXXXX decoding (including surrogate pairs).
  • Used Uint8Array lookup tables for structural and whitespace characters.
  • Switched to chunked string building for better performance on large inputs.
  • Removed legacy helper functions (scanString, scanAtom, etc.) and simplified logic.
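The thread does not include the new helper's source, but as a rough sketch, a `\uXXXX` decoder along the following lines handles surrogate pairs naturally: JavaScript strings are UTF-16, so decoding the two halves of a pair back to back yields a well-formed astral character. The function name and regex here are illustrative, not the PR's exact code.

```javascript
// Illustrative sketch only; the PR's actual decodeUnicodeString may differ.
// Each \uXXXX escape is decoded to its UTF-16 code unit. Because JS strings
// are UTF-16, decoding a high surrogate escape followed by a low surrogate
// escape (e.g. \ud83d\ude00) produces the correct astral character.
function decodeUnicodeString(str) {
  return str.replace(/\\u([0-9a-fA-F]{4})/g, (_match, hex) =>
    String.fromCharCode(parseInt(hex, 16))
  );
}
```

A production version also has to respect preceding escaped backslashes (`\\u0041` is a literal backslash followed by text, not an escape), which a full string scanner tracks while walking the input.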

Technical Details

  • Fully backward compatible with previous output.
  • Cleaner and more maintainable code with reduced overhead.
  • Improved large-file handling and Unicode correctness.

Benchmark

  • CPU: Intel(R) Core(TM) i5-8279U CPU @ 2.40GHz
  • RAM: 16.0 GB
  • GPU: Intel(R) Iris(R) Plus Graphics 655
[benchmark screenshot, 2025-11-10]

Testing

  • Manually verified formatting accuracy and Unicode output.
  • Compared outputs with JSON.stringify and legacy implementation.
  • Passed all existing fastJsonFormat tests.

Sumith-Kumar-Saini and others added 6 commits October 23, 2025 01:12
Optimized fastJsonFormat by inlining whitespace and atom scanning loops.
Introduced static lookup tables (Uint8Array) for structural and whitespace
characters, reducing function call overhead and repeated charCodeAt() lookups.

Benchmark improvements: ~10–20% faster on large JSON inputs.

Refs: #1
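The lookup-table idea from the commit above can be sketched like this; table contents follow the commit's description, and the names are illustrative rather than the PR's exact code.

```javascript
// Precomputed tables indexed by char code: the hot loop does one array read
// per character instead of chained comparisons or repeated charCodeAt-based
// branching. ASCII only; codes >= 128 fall outside the tables.
const STRUCTURAL = new Uint8Array(128);
for (const ch of '{}[]:,') STRUCTURAL[ch.charCodeAt(0)] = 1;

const WHITESPACE = new Uint8Array(128);
for (const ch of ' \t\n\r') WHITESPACE[ch.charCodeAt(0)] = 1;

function isStructural(code) {
  return code < 128 && STRUCTURAL[code] === 1;
}

function isWhitespace(code) {
  return code < 128 && WHITESPACE[code] === 1;
}
```

A `Uint8Array` of length 128 stays small enough to live in cache, which is where the win over per-character function calls comes from.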
@Sumith-Kumar-Saini
Contributor Author

@helloanoop, I would be grateful if you could review my pull request.

@helloanoop
Contributor

@Sumith-Kumar-Saini can you share screenshots of the benchmark runs before and after your changes, so that I can see the exact improvement across different data sizes?

@Sumith-Kumar-Saini
Contributor Author

Hey @helloanoop 👋
Please find the benchmark screenshots before and after the changes below.
The left side shows results from the old code (upstream/main), and the right side shows results after my changes.

Benchmarks:

[screenshot: before (left) vs. after (right) benchmark results]

System (for reference):

  • Intel Core i5 (4 cores / 8 threads, ~2.4 GHz)
  • 16 GB RAM
  • Integrated Intel GPU
  • Windows 10 Pro

Let me know if you’d like additional sizes or runs.

@helloanoop
Contributor

@Sumith-Kumar-Saini This is fantastic. I wanted to have further discussion. Can you accept my connection on LinkedIn? Let's talk there.

@Sumith-Kumar-Saini
Contributor Author

Thanks! Glad you found it useful. I've accepted your LinkedIn connection — happy to continue the discussion there.

@helloanoop
Contributor

Hey @Sumith-Kumar-Saini Really impressed with the perf improvement 👏 👏

I ran the benchmark on my system. While performance roughly doubles for files around 100 KB, it gets worse than the current implementation for larger data sizes. Could you check what might be causing that?

I am running this on a MacBook Air M4 with 16 GB RAM.

[screenshot: benchmark results on the M4 MacBook Air]

@Sumith-Kumar-Saini
Contributor Author

Thanks for catching that, @helloanoop, I’ll dig into the large file performance issue and see what’s causing the slowdown — will share an update soon.

@helloanoop
Contributor

Thank you @Sumith-Kumar-Saini

The performance improvements in your approach are significant, so I do want to get your PR merged.
If it turns out your approach only works well for smaller data sizes, I am open to keeping both: your approach for sizes under 2 MB and the current approach for larger data sizes.

@helloanoop
Contributor

Fantastic @Sumith-Kumar-Saini

Latest benchmark improvements are 🔥

[screenshot: latest benchmark results]

@helloanoop
Contributor

@Sumith-Kumar-Saini Could you fix the conflicts? They resulted from our merging a PR that decodes forward slashes.

@Sumith-Kumar-Saini Sumith-Kumar-Saini marked this pull request as draft November 13, 2025 11:37
@Sumith-Kumar-Saini
Contributor Author

Hey @helloanoop,

I dug deeper to identify what was causing the long operation and found that the outputArray approach wasn't performing well: string concatenation was taking too much time, which slowed down execution on large inputs. That is why I switched to TextEncoder and TextDecoder with a Uint8Array to store the text buffer in chunks.
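The chunk-buffering idea described above can be sketched roughly as follows; the buffer size and growth factor here are assumptions, not the PR's exact code.

```javascript
// Illustrative sketch: encode output pieces into one growable Uint8Array and
// decode once at the end, avoiding repeated intermediate string
// concatenation. Initial size and geometric growth factor are assumptions.
const encoder = new TextEncoder();
const decoder = new TextDecoder();

function buildOutput(pieces) {
  let buffer = new Uint8Array(1024);
  let length = 0;
  for (const piece of pieces) {
    const bytes = encoder.encode(piece);
    // Grow the buffer geometrically when the next chunk will not fit.
    if (length + bytes.length > buffer.length) {
      const grown = new Uint8Array(Math.max(buffer.length * 2, length + bytes.length));
      grown.set(buffer.subarray(0, length));
      buffer = grown;
    }
    buffer.set(bytes, length);
    length += bytes.length;
  }
  return decoder.decode(buffer.subarray(0, length));
}
```

Amortized geometric growth keeps appends O(1) on average, whereas naive `result += piece` can trigger repeated copies of the whole accumulated string.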


Use of AI Assistance

  • Yes, I have used AI to help me understand some of the low-level programming concepts in JavaScript.
  • It also assisted me in handling some of the more complex logic since I needed to explore different ways to improve performance. From my understanding, working closer to the language’s low-level operations can significantly help with optimisation.
  • I apologise for using AI, but it was mainly to enhance my understanding of complex, low-level logic.

Code Update

  • I have added the forward_slash logic to the decodeEscapedUnicode function.
  • fastJsonFormat now also accepts String object instances (e.g. `new String(...)`), not just string primitives.
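The forward-slash handling mentioned above is not shown in the thread; as a hedged sketch, JSON allows `/` to be escaped as `\/` (although it never has to be), and a combined pass can decode it alongside `\uXXXX`. The function name is illustrative; the PR's decodeEscapedUnicode is likely structured differently.

```javascript
// Illustrative only: decode \uXXXX escapes and the optional \/ escape in one
// pass. When the hex capture group is undefined, the match was \/.
function decodeEscapes(str) {
  return str.replace(/\\u([0-9a-fA-F]{4})|\\\//g, (_match, hex) =>
    hex !== undefined ? String.fromCharCode(parseInt(hex, 16)) : '/'
  );
}
```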

Question

  • Can we convert this CommonJS to ES6+?
    From what I have read, ES modules (ESM) generally perform better than CommonJS (CJS).

Performance Details

  • Based on my testing, the new implementation is approximately 98% faster than the current program.
  • The 98% improvement is an average result from multiple test runs.
  • It also scales well with larger datasets (though it still blocks the main thread, which is the tradeoff).

Current Program Benchmark

| Size   | fast-json-format | json-bigint | lossless-json | JSON.stringify |
|--------|------------------|-------------|---------------|----------------|
| 100 KB | 238 ops/sec      | 233 ops/sec | 191 ops/sec   | 730 ops/sec    |
| 1 MB   | 22 ops/sec       | 22 ops/sec  | 14 ops/sec    | 68 ops/sec     |
| 5 MB   | 4 ops/sec        | 4 ops/sec   | 2 ops/sec     | 14 ops/sec     |
| 10 MB  | 2 ops/sec        | 2 ops/sec   | 1 ops/sec     | 7 ops/sec      |

New Implementation Program Benchmark

| Size   | fast-json-format | json-bigint | lossless-json | JSON.stringify |
|--------|------------------|-------------|---------------|----------------|
| 100 KB | 397 ops/sec      | 246 ops/sec | 199 ops/sec   | 783 ops/sec    |
| 1 MB   | 43 ops/sec       | 23 ops/sec  | 15 ops/sec    | 73 ops/sec     |
| 5 MB   | 8 ops/sec        | 4 ops/sec   | 2 ops/sec     | 14 ops/sec     |
| 10 MB  | 4 ops/sec        | 2 ops/sec   | 1 ops/sec     | 7 ops/sec      |

Future Plans

  • I plan to implement asynchronous processing using data streams: Readable in Node.js, ReadableStream in the browser, or both for cross-environment support.

Lastly, could you please check the program's performance and compare it with the current codebase?

@Sumith-Kumar-Saini Sumith-Kumar-Saini marked this pull request as ready for review November 13, 2025 12:53
@Sumith-Kumar-Saini
Contributor Author

Hey @helloanoop,

Tests

The tests have passed successfully, and this branch is now ready for review and merge.


Quick Question

  • I was wondering if it would be possible to make this package more customizable by refactoring the functionality into a class. That way, we (as contributors) could add features and make certain parts of the package optional via plugins. Is that something we can explore?
  • Can we convert this to TypeScript (ES6+)? It offers better typing, performance, and scalability.

@helloanoop
Contributor

Great job @Sumith-Kumar-Saini !

Below are the before and after benchmarks on my machine

[screenshot: before/after benchmarks]

@helloanoop
Contributor

> I apologise for using AI, but it was mainly to enhance my understanding of complex, low-level logic

No need to apologise. In fact, I would encourage using AI to augment yourself, so long as you know and understand the logic and reasoning behind the code.

> Can we convert this CommonJS to ES6+?

Yes.

> Can we convert this to TypeScript (ES6+)? It offers better typing, performance, and scalability.

Anything, as long as it improves the benchmarks and runs in the browser.

@helloanoop helloanoop merged commit 578b5ab into usebruno:main Nov 15, 2025
1 check passed