
Performance issue while serving responses of around 180Kb #4656

Closed
himanshumps opened this issue Mar 29, 2023 · 4 comments

@himanshumps

Version

4.3.7

Context

I encountered an exception that looks suspicious while deserializing to JsonObject (without model classes); performance is also degraded, and CPU utilisation reaches 100%.

Do you have a reproducer?

https://github.com/himanshumps/large-response.git

Steps to reproduce

There are two endpoints: http://localhost:8080/noProcessing and http://localhost:8080/jsonProcessing

Here is the Dockerfile:

FROM registry.access.redhat.com/ubi8/openjdk-17@sha256:79585ca02551ecff9d368905d7ce387232b9fd328256e7a715ae3c4ec7b086d3
COPY ./target/large-response-1.0.0-SNAPSHOT-fat.jar ./app.jar
ENTRYPOINT ["java","-Djava.security.egd=file:/dev/./urandom","-jar","app.jar"]

And here is the Docker command to run it:

docker run -d --memory="4g" --cpus="2" -p 8888:8080 large_response

I ran the tests through plow, but you can use wrk as well (50 concurrent users for a duration of 5 minutes).

./plow http://testmachine.com:8888/noProcessing -c 50 -d 5m
./plow http://testmachine.com:8888/jsonProcessing -c 50 -d 5m

Extra

https://groups.google.com/g/vertx/c/j3IcS8b8nMo

@pmlopes
Contributor

pmlopes commented Apr 3, 2023

I don't fully understand what you're trying to measure.

On one endpoint, you're serving a static JSON resource, for which you should use encode() rather than toString():

https://github.com/himanshumps/large-response/blob/master/src/main/java/com/test/large/response/MainVerticle.java#L24
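
For a payload that never changes, it can also help to encode once at startup and reuse the resulting Buffer, so nothing is re-serialized per request. A minimal sketch, assuming a router and a staticJson JsonObject are in scope (the names are illustrative, not from the reproducer):

    import io.vertx.core.buffer.Buffer;
    import io.vertx.core.http.HttpHeaders;

    // Encode the static JsonObject once, not on every request
    Buffer cached = staticJson.toBuffer();

    router.get("/noProcessing").handler(ctx ->
        ctx.response()
            .putHeader(HttpHeaders.CONTENT_TYPE, "application/json")
            .end(cached.copy())); // copy() guards against downstream mutation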

On the second endpoint, you're doing much more:

  1. parse 188Kb of JSON (which will certainly create a lot of GC pressure)
  2. remove 2 keys
  3. set the response to chunked mode (multiple writes)
  4. encode the new JSON object back to a JSON string
  5. do only 1 write (the call to end(...))

I'd expect some CPU usage here. Also, using random UUIDs requires a lot of entropy, as the implementation relies on SecureRandom, which puts more load on the CPU.
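
In Vert.x terms, the second handler boils down to something like the sketch below (reconstructed from the steps listed above; the variable and key names are hypothetical):

    // 1. parse ~188Kb of JSON into a JsonObject (allocation/GC heavy)
    JsonObject body = new JsonObject(largeJsonString);
    // 2. remove 2 keys
    body.remove("someKey1");
    body.remove("someKey2");
    routingContext.response()
        // 3. chunked mode allows multiple writes
        .setChunked(true)
        // 4 + 5. re-encode and write everything in a single end(...)
        .end(body.encode());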

If the bottleneck of the application is the JSON encode/decode, perhaps you should consider using a different parser? For example, https://jsoniter.com/ can be used with Vert.x too.

Given that this parser can parse into a Map<String, Object> container, you can take the result and wrap it with something along these lines:

new JsonObject((Map) JsonIterator.deserialize(input).asMap())

And the reverse:

JsonStream.serialize(vertxJsonObject.getMap())
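
Put together, the round trip could look like the following sketch (inputBytes is a placeholder; the copy into a Map<String, Object> is needed because asMap() returns a Map of jsoniter Any values):

    import com.jsoniter.JsonIterator;
    import com.jsoniter.any.Any;
    import com.jsoniter.output.JsonStream;
    import io.vertx.core.json.JsonObject;
    import java.util.LinkedHashMap;
    import java.util.Map;

    // Parse once with jsoniter; asMap() exposes the document as a Map
    Any any = JsonIterator.deserialize(inputBytes);
    Map<String, Object> map = new LinkedHashMap<>(any.asMap());

    // Wrap without re-parsing, for code that expects a Vert.x JsonObject
    JsonObject wrapped = new JsonObject(map);

    // Serialize back with jsoniter rather than Vert.x's encoder
    String out = JsonStream.serialize(wrapped.getMap());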

@himanshumps
Author

Hi @pmlopes

Thanks for looking into the issue. I do not think toString() is an issue, as it calls encode() internally:

@Override
public String toString() {
  return encode();
}

For the second point, I did go down the jsoniter route a few days back, but the performance is still bad.

    // Bridge the Reactor pipeline into a Vert.x ReadStream
    ReactiveReadStream<String> reactiveReadStream = ReactiveReadStream.readStream();
    couchbaseReadReactive.get(key)
        .subscribeOn(scheduler)
        .publishOn(scheduler)
        .flatMap(getResult -> {
          // Parse with jsoniter instead of Vert.x's Jackson-based decoder
          Any any = JsonIterator.deserialize(getResult.contentAsBytes());
          Map deserialize = any.asMap();
          verticleUtils.removeMetaDataAndReturnVoid(deserialize);
          // Serialize back to a String with jsoniter, off the event loop
          return Mono.defer(() -> Mono.just(JsonStream.serialize(deserialize))).subscribeOn(scheduler);
          //return verticleUtils.removeMetaData(getResult.contentAsObject());
        })
        .subscribe(reactiveReadStream);
    reactiveReadStream
        .handler(result -> {
          // Write the serialized payload in a single end(...) call
          routingContext.response()
              //.setChunked(true)
              .putHeader(HttpHeaders.CONTENT_TYPE.toString(), HttpConstants.APPLICATION_HAL_JSON)
              .end(result, NULL_HANDLER);
        })
        .exceptionHandler(throwable -> {
          sendErrorResponse(routingContext, throwable);
        });

@surajkumar

Hi @himanshumps, I ran your reproducer in Apache JMeter and tested with 50 concurrent users.

Here are my findings:

The CPU does in fact reach 100% when a few concurrent users call your simple endpoint. However, this only happens when the same user is re-used; if it's a new connection each time, the CPU does not exceed any unexpected thresholds. I was seeing highs of around 10% CPU on my system when I unticked "Same user on each iteration" in JMeter.

But the important thing is, it's not the Vert.x process that has the high CPU usage. It's Apache JMeter that is eating up my CPU.

Therefore, I don't see any bugs with the code you have provided.

I suspect plow is doing the same. You should confirm which process is eating up your CPU.

Hope this helps.

@vietj
Member

vietj commented May 29, 2023

thanks @surajkumar for your investigations

Performance testing should always be done using two distinct machines (or three if there is a database). I will close this issue.

@vietj vietj closed this as completed May 29, 2023
@vietj vietj added invalid and removed bug labels May 29, 2023