
Conversation

@Sumith-Kumar-Saini
Contributor

Description

This PR refactors fastJsonFormat in src/index.js to improve speed, memory efficiency, and Unicode handling.
Key updates:

  • Added decodeUnicodeString() for proper \uXXXX decoding (including surrogate pairs).
  • Used Uint8Array lookup tables for structural and whitespace characters.
  • Switched to chunked string building for better performance on large inputs.
  • Removed legacy helper functions (scanString, scanAtom, etc.) and simplified logic.
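The thread does not include the new helper's source, but as a rough sketch, a `\uXXXX` decoder along the following lines handles surrogate pairs naturally: JavaScript strings are UTF-16, so decoding the two halves of a pair back to back yields a well-formed astral character. The function name and regex here are illustrative, not the PR's exact code.

```javascript
// Illustrative sketch only; the PR's actual decodeUnicodeString may differ.
// Each \uXXXX escape is decoded to its UTF-16 code unit. Because JS strings
// are UTF-16, decoding a high surrogate escape followed by a low surrogate
// escape (e.g. \ud83d\ude00) produces the correct astral character.
function decodeUnicodeString(str) {
  return str.replace(/\\u([0-9a-fA-F]{4})/g, (_match, hex) =>
    String.fromCharCode(parseInt(hex, 16))
  );
}
```

A production version also has to respect preceding escaped backslashes (`\\u0041` is a literal backslash followed by text, not an escape), which a full string scanner tracks while walking the input.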

Technical Details

  • Fully backward compatible with previous output.
  • Cleaner and more maintainable code with reduced overhead.
  • Improved large-file handling and Unicode correctness.

Benchmark

  • CPU: Intel(R) Core(TM) i5-8279U CPU @ 2.40GHz
  • RAM: 16.0 GB
  • GPU: Intel(R) Iris(R) Plus Graphics 655
[benchmark screenshot, 2025-11-10]

Testing

  • Manually verified formatting accuracy and Unicode output.
  • Compared outputs with JSON.stringify and legacy implementation.
  • Passed all existing fastJsonFormat tests.

Sumith-Kumar-Saini and others added 6 commits October 23, 2025 01:12
Optimized fastJsonFormat by inlining whitespace and atom scanning loops.
Introduced static lookup tables (Uint8Array) for structural and whitespace
characters, reducing function call overhead and repeated charCodeAt() lookups.

Benchmark improvements: ~10–20% faster on large JSON inputs.

Refs: #1
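The lookup-table idea from the commit above can be sketched like this; table contents follow the commit's description, and the names are illustrative rather than the PR's exact code.

```javascript
// Precomputed tables indexed by char code: the hot loop does one array read
// per character instead of chained comparisons or repeated charCodeAt-based
// branching. ASCII only; codes >= 128 fall outside the tables.
const STRUCTURAL = new Uint8Array(128);
for (const ch of '{}[]:,') STRUCTURAL[ch.charCodeAt(0)] = 1;

const WHITESPACE = new Uint8Array(128);
for (const ch of ' \t\n\r') WHITESPACE[ch.charCodeAt(0)] = 1;

function isStructural(code) {
  return code < 128 && STRUCTURAL[code] === 1;
}

function isWhitespace(code) {
  return code < 128 && WHITESPACE[code] === 1;
}
```

A `Uint8Array` of length 128 stays small enough to live in cache, which is where the win over per-character function calls comes from.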
@Sumith-Kumar-Saini
Contributor Author

@helloanoop, I would be grateful if you could review my pull request.

@helloanoop
Contributor

@Sumith-Kumar-Saini can you share screenshots of the benchmark runs before and after your changes, so that I can see the exact improvement across different data sizes?

@Sumith-Kumar-Saini
Contributor Author

Hey @helloanoop 👋
Please find the benchmark screenshots before and after the changes below.
The left side shows results from the old code (upstream/main), and the right side shows results after my changes.

Benchmarks:

[screenshot: before (left) vs. after (right) benchmark results]

System (for reference):

  • Intel Core i5 (4 cores / 8 threads, ~2.4 GHz)
  • 16 GB RAM
  • Integrated Intel GPU
  • Windows 10 Pro

Let me know if you’d like additional sizes or runs.

@helloanoop
Contributor

@Sumith-Kumar-Saini This is fantastic. I wanted to have further discussion. Can you accept my connection on LinkedIn? Let's talk there.

@Sumith-Kumar-Saini
Contributor Author

Thanks! Glad you found it useful. I've accepted your LinkedIn connection — happy to continue the discussion there.

@helloanoop
Contributor

Hey @Sumith-Kumar-Saini Really impressed with the perf improvement 👏 👏

I ran the benchmark on my system. While performance roughly doubles for files around 100 KB, it gets worse than the current implementation for larger data sizes. Could you check what might be causing that?

I am running this on a MacBook Air M4 with 16 GB RAM.

[screenshot: benchmark results on the M4 MacBook Air]

@Sumith-Kumar-Saini
Contributor Author

Thanks for catching that, @helloanoop, I’ll dig into the large file performance issue and see what’s causing the slowdown — will share an update soon.

@helloanoop
Contributor

Thank you @Sumith-Kumar-Saini

The performance improvements in your approach are significant, so I do want to get your PR merged.
If it turns out your approach only works well for smaller data sizes, I am open to keeping both: your approach for sizes under 2 MB and the current approach for larger data sizes.

@helloanoop
Contributor

Fantastic @Sumith-Kumar-Saini

Latest benchmark improvements are 🔥

[screenshot: latest benchmark results]

@helloanoop
Contributor

@Sumith-Kumar-Saini Could you fix the conflicts? They resulted from our merging a PR that decodes forward slashes.

@Sumith-Kumar-Saini Sumith-Kumar-Saini marked this pull request as draft November 13, 2025 11:37
@Sumith-Kumar-Saini
Contributor Author

Hey @helloanoop,

I dug deeper to identify what was causing the long operation and found that the outputArray approach wasn't performing well: string concatenation was taking too much time, which slowed down execution on large inputs. That is why I switched to TextEncoder and TextDecoder with a Uint8Array to store the text buffer in chunks.
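The chunk-buffering idea described above can be sketched roughly as follows; the buffer size and growth factor here are assumptions, not the PR's exact code.

```javascript
// Illustrative sketch: encode output pieces into one growable Uint8Array and
// decode once at the end, avoiding repeated intermediate string
// concatenation. Initial size and geometric growth factor are assumptions.
const encoder = new TextEncoder();
const decoder = new TextDecoder();

function buildOutput(pieces) {
  let buffer = new Uint8Array(1024);
  let length = 0;
  for (const piece of pieces) {
    const bytes = encoder.encode(piece);
    // Grow the buffer geometrically when the next chunk will not fit.
    if (length + bytes.length > buffer.length) {
      const grown = new Uint8Array(Math.max(buffer.length * 2, length + bytes.length));
      grown.set(buffer.subarray(0, length));
      buffer = grown;
    }
    buffer.set(bytes, length);
    length += bytes.length;
  }
  return decoder.decode(buffer.subarray(0, length));
}
```

Amortized geometric growth keeps appends O(1) on average, whereas naive `result += piece` can trigger repeated copies of the whole accumulated string.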


Use of AI Assistance

  • Yes, I have used AI to help me understand some of the low-level programming concepts in JavaScript.
  • It also assisted me in handling some of the more complex logic since I needed to explore different ways to improve performance. From my understanding, working closer to the language’s low-level operations can significantly help with optimisation.
  • I apologise for using AI, but it was mainly to enhance my understanding of complex, low-level logic.

Code Update

  • I have added the forward_slash logic to the decodeEscapedUnicode function.
  • fastJsonFormat now also accepts String object instances (e.g. `new String(...)`), not just string primitives.
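The forward-slash handling mentioned above is not shown in the thread; as a hedged sketch, JSON allows `/` to be escaped as `\/` (although it never has to be), and a combined pass can decode it alongside `\uXXXX`. The function name is illustrative; the PR's decodeEscapedUnicode is likely structured differently.

```javascript
// Illustrative only: decode \uXXXX escapes and the optional \/ escape in one
// pass. When the hex capture group is undefined, the match was \/.
function decodeEscapes(str) {
  return str.replace(/\\u([0-9a-fA-F]{4})|\\\//g, (_match, hex) =>
    hex !== undefined ? String.fromCharCode(parseInt(hex, 16)) : '/'
  );
}
```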

Question

  • Can we convert this CommonJS to ES6+?
    From what I have read, ES modules (ESM) generally perform better than CommonJS (CJS).

Performance Details

  • Based on my testing, the new implementation is approximately 98% faster than the current program.
  • The 98% improvement is an average result from multiple test runs.
  • It also scales well with larger datasets (though it still blocks the main thread, which is the tradeoff).

Current Program Benchmark

| Size   | fast-json-format | json-bigint | lossless-json | JSON.stringify |
|--------|------------------|-------------|---------------|----------------|
| 100 KB | 238 ops/sec      | 233 ops/sec | 191 ops/sec   | 730 ops/sec    |
| 1 MB   | 22 ops/sec       | 22 ops/sec  | 14 ops/sec    | 68 ops/sec     |
| 5 MB   | 4 ops/sec        | 4 ops/sec   | 2 ops/sec     | 14 ops/sec     |
| 10 MB  | 2 ops/sec        | 2 ops/sec   | 1 ops/sec     | 7 ops/sec      |

New Implementation Program Benchmark

| Size   | fast-json-format | json-bigint | lossless-json | JSON.stringify |
|--------|------------------|-------------|---------------|----------------|
| 100 KB | 397 ops/sec      | 246 ops/sec | 199 ops/sec   | 783 ops/sec    |
| 1 MB   | 43 ops/sec       | 23 ops/sec  | 15 ops/sec    | 73 ops/sec     |
| 5 MB   | 8 ops/sec        | 4 ops/sec   | 2 ops/sec     | 14 ops/sec     |
| 10 MB  | 4 ops/sec        | 2 ops/sec   | 1 ops/sec     | 7 ops/sec      |

Future Plans

  • I plan to implement asynchronous processing using data streams: Readable in Node.js, ReadableStream in the browser, or both for cross-environment support.

Lastly, could you please check the program's performance and compare it with the current codebase?

@Sumith-Kumar-Saini Sumith-Kumar-Saini marked this pull request as ready for review November 13, 2025 12:53
@Sumith-Kumar-Saini
Contributor Author

Hey @helloanoop,

Tests

The tests have passed successfully, and this branch is now ready for review and merge.


Quick Question

  • I was wondering if it would be possible to make this package more customizable by refactoring the functionality into a class. That way, we (as contributors) could add features and make certain parts of the package optional via plugins. Is that something we can explore?
  • Can we convert this to TypeScript (ES6+)? It offers better typing, performance, and scalability.

@helloanoop
Contributor

Great job @Sumith-Kumar-Saini !

Below are the before and after benchmarks on my machine

[screenshot: before/after benchmarks]

@helloanoop
Contributor

> I apologise for using AI, but it was mainly to enhance my understanding of complex, low-level logic

No need to apologise. In fact, I would encourage using AI to augment yourself, so long as you know and understand the logic and reasoning behind the code.

> Can we convert this CommonJS to ES6+?

Yes.

> Can we convert this to TypeScript (ES6+)? It offers better typing, performance, and scalability.

Anything, as long as it improves the benchmarks and runs in the browser.

@helloanoop helloanoop merged commit 578b5ab into usebruno:main Nov 15, 2025
1 check passed