
JSON.ARRAPPEND: cpu bound #295

Closed
mhaberler opened this issue Mar 30, 2021 · 13 comments · Fixed by #324
Labels: q:performance (Question focusing on performance/observability)

Comments
@mhaberler (Author)

I'm using JSON.ARRAPPEND to append GeoJSON features, as they are received, to a GeoJSON LineString coordinates array and the corresponding properties.

Array lengths are < 5000.

redis-server becomes CPU-bound in this command (all top 10 entries reported by SLOWLOG GET are JSON.ARRAPPEND).

Any suggestions?
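For context, the access pattern described above can be sketched as follows. This is an illustrative guess (the actual repro script lives in mhaberler/rejson-perf); the key name, JSON path, and the `append_feature` helper are assumptions, not the reporter's real code.

```python
import json

def make_feature(lon, lat, alt, props):
    """Serialize one GeoJSON Point feature, as it would be sent to the module."""
    return json.dumps({
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat, alt]},
        "properties": props,
    })

def append_feature(client, key, payload):
    # redis-py exposes no JSON.* helpers for this, so issue the raw module command.
    return client.execute_command(
        "JSON.ARRAPPEND", key, ".properties.coordinateProperties", payload)

if __name__ == "__main__":
    payload = make_feature(26.2153, 60.34085, 417.0, {"temp": 3.6})
    print(json.loads(payload)["geometry"]["coordinates"])  # [26.2153, 60.34085, 417.0]
```

With features arriving continuously, this issues one JSON.ARRAPPEND per feature against an ever-growing array, which is exactly the workload that exposes the CPU usage described here.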

@filipecosta90 (Contributor)

@mhaberler can you share more details about the data, the specific arguments used with JSON.ARRAPPEND, and the version of RedisJSON?

@mhaberler (Author)

I will try to reproduce this in an example.

mhaberler pushed a commit to mhaberler/rejson-perf that referenced this issue Mar 31, 2021
@mhaberler (Author)

mhaberler commented Mar 31, 2021

Here it is: https://github.com/mhaberler/rejson-perf

It does 100 ARRAPPENDs per second on a JSON object.

Let it run for a few minutes and watch the CPU usage of the redis-server; it will eventually max out a core.

Similar results on macOS and Debian buster.

Good luck!

Actually, I'd be interested to learn how you tackle such a problem. My first thought would be profiling the redis-server, but I'm fuzzy on how that would work with a Rust module.

@mhaberler (Author)

Any progress on this issue?

That'd be a release-stopper bug in my book.

@filipecosta90 filipecosta90 self-assigned this Apr 20, 2021
@filipecosta90 (Contributor)

Hi there @mhaberler, sorry for the delay on this (I was focusing on RedisConf-related work and completely missed this one -- mea culpa :) ).
I've assigned this issue to myself and will provide further feedback in the next couple of days.

@filipecosta90 filipecosta90 added the q:performance Question focusing on performance/observability label Apr 20, 2021
@mhaberler (Author)

Thanks for the update -- I really hope this gets fixed, or I'll need to look for an alternative.

@filipecosta90 (Contributor)

@mhaberler confirmed that this is a very good use case to profile further.

Pre-steps:
1. Use the following RDB to mimic the dataset: https://s3.amazonaws.com/benchmarks.redislabs/gh/redisjson/295/dump.rdb
2. Use the following memtier_benchmark command: memtier_benchmark --command '"JSON.ARRAPPEND" "path:sonde:foo" ".properties.coordinateProperties" "{\"type\": \"Feature\", \"geometry\": {\"type\": \"Point\", \"coordinates\": [26.2153, 60.34085, 417.0]}, \"properties\": {\"temp\": 3.6, \"serial\": \"S0730589\", \"humidity\": 64.6, \"vel_h\": 16.0, \"uploader_callsign\": \"Lahti 24h raspberry\", \"time\": 1617220020, \"uploader\": {\"callsign\": \"Lahti 24h raspberry\"}}}"' -n 25 --json-out-file 5k.arrappend-results.json
3. Use the current RedisJSON master version (commit=445aa844) on a Redis v6.2.0 standalone instance.

Initial analysis steps
Using your script just to set the initial JSON, and then using memtier_benchmark to fully load the Redis instance, we can see that:

    • Complexity is not O(1) even for a low number of array elements. We start at around 200 ms p50 latency (not stable at all, but good enough to see the problem) and move to 400-600 ms p50 latency.
    • With 200 concurrent clients, we observe very poor performance (≈50 ops/sec at 436 ms p50 latency, including RTT).
4         Threads
50        Connections per thread
25        Requests per client


ALL STATS
============================================================================================================
Type                   Ops/sec    Avg. Latency     p50 Latency     p99 Latency   p99.9 Latency       KB/sec 
------------------------------------------------------------------------------------------------------------
"json.arrappend"s        53.47       420.36245       436.22300       987.13500       987.13500        20.35 
Totals                   53.47       420.36245       436.22300       987.13500       987.13500        20.35 
    • Even using a simpler ARRAPPEND command, we see the same performance pattern: try it with memtier_benchmark --command '"JSON.ARRAPPEND" "path:sonde:foo" ".geometry.coordinates" "[47, 15, 920]"'
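The non-O(1) behaviour above is consistent with each path operation rebuilding (cloning or reserializing) the whole document, rather than mutating the target array in place. A toy Python model of the two complexity classes (purely an illustration of the cost model, not RedisJSON's actual Rust code):

```python
import copy
import timeit

def append_clone(doc, item):
    # Clone-based path operation: rebuild the whole document per append,
    # so each call costs O(document size).
    new_doc = copy.deepcopy(doc)
    new_doc["coordinates"].append(item)
    return new_doc

def append_inplace(doc, item):
    # In-place append: amortized O(1) per call, independent of array length.
    doc["coordinates"].append(item)
    return doc

doc = {"coordinates": [[float(i), 0.0, 0.0] for i in range(5000)]}
t_clone = timeit.timeit(lambda: append_clone(doc, [1.0, 2.0, 3.0]), number=50)
t_inplace = timeit.timeit(lambda: append_inplace(doc, [1.0, 2.0, 3.0]), number=50)
print(t_clone > t_inplace)  # the clone-based variant is dramatically slower
```

Under this model, appending n features costs O(n^2) total work, which matches the observed drift from ~200 ms to 400-600 ms p50 as the array grows.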

@filipecosta90 (Contributor)

filipecosta90 commented Apr 21, 2021

TODOs that should provide answers right away, @mhaberler:

  • Profile the on-CPU stacks (doing it now)
  • Check for a regression vs v1.0 (cc @gkorland @rafie)

@mhaberler (Author)

boy, did I pick the right command :-)

@filipecosta90 (Contributor)

filipecosta90 commented Apr 21, 2021

@gkorland @rafie this is indeed a regression (comparing master commit=445aa844 vs v1.0.7). If possible, take a look at pub fn value_op<F>(&mut self, path: &str, mut fun: F) -> Result<Value, Error> and its inner fns. I've added the top on-CPU consumers below.

v1.0.7 results (use the following RDB: https://s3.amazonaws.com/benchmarks.redislabs/gh/redisjson/295/dumpv1.0.7.rdb):

$ memtier_benchmark --command '"JSON.ARRAPPEND" "path:sonde:foo" ".properties.coordinateProperties" "{\"type\": \"Feature\", \"geometry\": {\"type\": \"Point\", \"coordinates\": [26.2153, 60.34085, 417.0]}, \"properties\": {\"temp\": 3.6, \"serial\": \"S0730589\", \"humidity\": 64.6, \"vel_h\": 16.0, \"uploader_callsign\": \"Lahti 24h raspberry\", \"time\": 1617220020, \"uploader\": {\"callsign\": \"Lahti 24h raspberry\"}}}"' -n 25 --hide-histogram
Writing results to stdout
[RUN #1] Preparing benchmark client...
[RUN #1] Launching threads now...
[RUN #1 100%,   0 secs]  0 threads:        5000 ops,       0 (avg:   30389) ops/sec, 0.00KB/sec (avg: 11.30MB/sec),  0.00 (avg:  6.58) msec latency

4         Threads
50        Connections per thread
25        Requests per client


ALL STATS
============================================================================================================
Type                   Ops/sec    Avg. Latency     p50 Latency     p99 Latency   p99.9 Latency       KB/sec 
------------------------------------------------------------------------------------------------------------
"json.arrappend"s     30348.83         6.57989         6.43100        13.37500        14.46300     11552.09 
Totals                30348.83         6.57989         6.43100        13.37500        14.46300     11552.09

Master version results (use the following RDB: https://s3.amazonaws.com/benchmarks.redislabs/gh/redisjson/295/dump.rdb):

$ memtier_benchmark --command '"JSON.ARRAPPEND" "path:sonde:foo" ".properties.coordinateProperties" "{\"type\": \"Feature\", \"geometry\": {\"type\": \"Point\", \"coordinates\": [26.2153, 60.34085, 417.0]}, \"properties\": {\"temp\": 3.6, \"serial\": \"S0730589\", \"humidity\": 64.6, \"vel_h\": 16.0, \"uploader_callsign\": \"Lahti 24h raspberry\", \"time\": 1617220020, \"uploader\": {\"callsign\": \"Lahti 24h raspberry\"}}}"' -n 25 --hide-histogram
Writing results to stdout
[RUN #1] Preparing benchmark client...
[RUN #1] Launching threads now...
[RUN #1 100%,  96 secs]  0 threads:        5000 ops,       0 (avg:      51) ops/sec, 0.00KB/sec (avg: 19.76KB/sec),  -nan (avg: 3851.86) msec latency

4         Threads
50        Connections per thread
25        Requests per client


ALL STATS
============================================================================================================
Type                   Ops/sec    Avg. Latency     p50 Latency     p99 Latency   p99.9 Latency       KB/sec 
------------------------------------------------------------------------------------------------------------
"json.arrappend"s        56.60       477.40891       432.12700      1032.19100      1032.19100        21.55 
Totals                   56.60       477.40891       432.12700      1032.19100      1032.19100        21.55

Rust hotspots:
[flame graph screenshots of the on-CPU hotspots omitted]

PS: to properly detect the on-CPU hotspots I've changed the build rule to:

build:
        RUSTFLAGS="-g -C force-frame-pointers=yes" cargo build --all --all-targets $(CARGO_FLAGS)
        cp $(TARGET_DIR)/librejson.$(RUST_SOEXT.$(OS)) $(TARGET)

Used HW:
The performance results provided in this issue were collected using:

Hardware platform: A physical HPE ProLiant DL380 Gen10 Server, with one Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz, Maximum memory capacity of 3 TB, disabling CPU Frequency Scaling and with all configurable BIOS and CPU system settings set to performance.
OS: Ubuntu 18.04 Linux release 4.15.0-123.
Installed Memory: 64GB DDR4-SDRAM @ 2933 MHz

@filipecosta90 (Contributor)

@mhaberler while we sort out the current master-version regression, can you proceed with version 1.0.7? This regression should not be visible there. WDYT?

@filipecosta90 (Contributor)

@mhaberler note that #324 significantly improves the performance of this operation.

@mhaberler (Author)

will have a look
