While investigating workarounds for jruby/jruby#6265, I realized that all dumping (e.g. via to_json) is done first to an in-memory buffer (always a Ruby String), even when given an IO object to which the JSON should be written. This applies to all three implementations: the pure-Ruby version, the C version, and the Java version.
This could obviously be more efficient if the JSON appends were written directly to the given IO, or if it were possible to provide a String-like object that receives the appends. A rework of the generator subsystem would be necessary to pass any provided IO or String-like object through the various dump methods.
This would have several benefits:
- No intermediate String to hold the entirety of the dumped JSON.
- No intermediate Strings for components of a dumped collection; Array and Hash currently dump each element or pair to a separate String and then append that String to the result buffer.
- Reduced allocation, copying, and GC overhead when dumping directly to IO.
- Potential to provide IO-like or String-like receivers of the dumped JSON, allowing for a workaround to the Java 2GB array limitation (see jruby/jruby#6265, "Exception NegativeArraySizeException during JSON.dump of large Hash").
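To make the idea concrete, here is a rough sketch of what passing the receiver through the dump methods might look like in pure Ruby. It is not the json library's actual generator: `StreamingDump` and `ByteCounter` are hypothetical names invented for illustration, and the only assumption about the receiver is that it responds to `<<` and returns itself (as IO, StringIO, and String do). Leaf values still go through `to_json`, so small per-scalar Strings are still allocated; only the whole-document and per-collection buffers are avoided.

```ruby
require "json"
require "stringio"

# Hypothetical helper, not part of the json library: writes the JSON for
# obj straight to out, which only needs to respond to <<.
module StreamingDump
  def self.dump(obj, out)
    case obj
    when Hash
      out << "{"
      first = true
      obj.each do |key, value|
        out << "," unless first
        first = false
        # Keys are converted to Strings, as the json library does.
        out << key.to_s.to_json << ":"
        dump(value, out) # the value is appended to out, not to a temp String
      end
      out << "}"
    when Array
      out << "["
      obj.each_with_index do |element, index|
        out << "," if index > 0
        dump(element, out)
      end
      out << "]"
    else
      # Scalars (String, Numeric, true, false, nil) still build one small
      # String each; a full rework could avoid that as well.
      out << obj.to_json
    end
    out
  end
end

# Hypothetical String-like receiver that keeps nothing in memory and only
# counts the bytes it is given.
class ByteCounter
  attr_reader :bytes

  def initialize
    @bytes = 0
  end

  def <<(chunk)
    @bytes += chunk.bytesize
    self
  end
end

buffer = StringIO.new
StreamingDump.dump({ "a" => [1, 2] }, buffer)
puts buffer.string

counter = StreamingDump.dump({ "a" => [1, 2] }, ByteCounter.new)
puts counter.bytes
```

A receiver like `ByteCounter` could just as well split output across several buffers or files, which is the kind of workaround the last point above has in mind.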
I'm hoping to attempt this for at least the Java and Ruby versions of the generator, but I may need help making the same change in the C extension. If others are interested in helping with any of these implementations, it would be greatly appreciated.