Fetch dataclip body as string, not json, from postgres #3651
still need to figure out how to pretty print it
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3651      +/-   ##
==========================================
- Coverage   89.86%   89.63%   -0.23%     
==========================================
  Files         409      410       +1     
  Lines       17022    17075      +53     
==========================================
+ Hits        15296    15306      +10     
- Misses       1726     1769      +43     
☔ View full report in Codecov by Sentry.
Nicely done
@stuartc, I think I've got something here for #3641. The change, in essence, is to extract these big objects from Postgres as text, not JSON. If we get them from Postgres as JSON, Elixir uses a lot of memory decoding them into maps. To run the benchmarking test, create a noisy/random 2MB dataclip in your system, then fire up your server and run this from IEx:
tl;dr: there's a big impact! 🚀
Problem:
PostgreSQL stores JSON as compact JSONB (1.86 MB for this dataclip).
When loaded as an Elixir map, it expands ~38x due to:
- Immutable data structure overhead
- Metadata for every map/list/string
- Deep nesting creates many small allocations
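To get a feel for the expansion described above, here is a minimal sketch that compares the heap size of a decoded-JSON-shaped term with the size of the equivalent JSON text. The `MemorySketch` module, its function names, and the sample data are all hypothetical, and the ratio it prints depends heavily on the data shape, so it will not match the ~38x measured for the production dataclip.

```elixir
defmodule MemorySketch do
  # Hypothetical helper: builds a term shaped like a decoded JSON array of objects.
  def sample_rows(n) do
    for i <- 1..n do
      %{"id" => i, "name" => "item-#{i}", "tags" => ["a", "b"]}
    end
  end

  # Hand-rolled JSON text for the same rows, so no JSON library is needed here.
  def to_json_text(rows) do
    body =
      Enum.map_join(rows, ",", fn row ->
        ~s({"id":#{row["id"]},"name":"#{row["name"]}","tags":["a","b"]})
      end)

    "[" <> body <> "]"
  end

  # Size of the term on the process heap, in bytes.
  # :erts_debug.flat_size/1 reports the size in machine words.
  def heap_bytes(term) do
    :erts_debug.flat_size(term) * :erlang.system_info(:wordsize)
  end

  # Ratio of in-memory size to JSON text size for n sample rows.
  def amplification(n) do
    rows = sample_rows(n)
    heap_bytes(rows) / byte_size(to_json_text(rows))
  end
end

ratio = Float.round(MemorySketch.amplification(1_000), 1)
IO.puts("in-memory / text size ratio: #{ratio}")
```

Small, deeply nested values tend to push the ratio up, since every map, list cell, and short binary carries its own header words.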
OLD APPROACH (baseline):
- Query JSONB → Elixir map (~38x memory amplification)
- Jason.encode!(map) → JSON string (creates another copy)
- Peak memory: ~70.5 MB for this dataclip
NEW APPROACH (optimized):
- Query with fragment("?::text", d.body) → JSON string directly
- PostgreSQL does the conversion, no Elixir map
- Peak memory: ~1.86 MB for this dataclip
- Memory reduction: ~97% ⭐
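The new approach could look roughly like the sketch below. Only the `fragment("?::text", d.body)` expression comes from the description above; the `DataclipBodySketch` module, the `body_as_text_query/2` function, and the `MyApp.*` names in the usage comment are hypothetical, and the code assumes Ecto is available in the project.

```elixir
defmodule DataclipBodySketch do
  import Ecto.Query

  # `queryable` is whatever schema or table holds dataclips (an assumption here).
  # The cast makes PostgreSQL return the body as a JSON string, so Elixir never
  # builds the intermediate map.
  def body_as_text_query(queryable, id) do
    from d in queryable,
      where: d.id == ^id,
      select: fragment("?::text", d.body)
  end
end

# Usage, assuming a repo module named MyApp.Repo and a schema MyApp.Dataclip:
#   MyApp.Dataclip
#   |> DataclipBodySketch.body_as_text_query(dataclip_id)
#   |> MyApp.Repo.one()
```

The result is a single binary that can be written straight to the response, which is why the peak memory stays close to the size of the JSON text itself.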
Impact on Production:
With 2000MB memory limit and 1.86MB dataclips:
- OLD: ~28 concurrent requests before OOM
- NEW: ~1077 concurrent requests before OOM
- Improvement: 38x more capacity! ⭐
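The capacity figures above follow from simple division, sketched here with a hypothetical `CapacitySketch` module. It assumes every request holds its peak memory at the same time and ignores baseline VM memory, so treat the output as a rough upper bound. With the rounded 1.86 MB figure the new approach comes out at 1075, in line with the ~1077 quoted above, which presumably used the unrounded size.

```elixir
defmodule CapacitySketch do
  # How many requests fit if each one holds `per_request_mb` at peak.
  # Ignores baseline VM memory and garbage-collection timing.
  def max_concurrent(limit_mb, per_request_mb) do
    floor(limit_mb / per_request_mb)
  end

  # Percentage of peak memory saved when moving from old_mb to new_mb.
  def reduction_percent(old_mb, new_mb) do
    Float.round((1 - new_mb / old_mb) * 100, 1)
  end
end

IO.inspect(CapacitySketch.max_concurrent(2000, 70.5), label: "old approach")
IO.inspect(CapacitySketch.max_concurrent(2000, 1.86), label: "new approach")
IO.inspect(CapacitySketch.reduction_percent(70.5, 1.86), label: "memory reduction %")
```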
Additional Benefits:
My full results with the problem dataclip from production: