Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dto streaming #3103

Merged
merged 2 commits into from
Jan 31, 2017
Merged

Dto streaming #3103

merged 2 commits into from
Jan 31, 2017

Conversation

tareksha
Copy link
Contributor

@tareksha tareksha commented Nov 17, 2016

What does this PR do?

Eliminates the redundant creation of full in-memory JSON tress and then strings when reading DTOs from requests or writing them to responses. Applies to general consumption of DtoFactory as well, not JAX-RS services only.

What issues does this PR fix or reference?

  • Performance.
    Current method always constructs a full replica of the DTO objects as an in-memory JsonElement tree, and then asks Gson to dump the tree as a full string. There are two issues here:
    • Gson can generate the JSON in a stream-based method based on JsonWriter which eliminates the need for any intermediate trees. Actually no JSON elements are created at all here.
    • The whole method of generating/parsing full JSON texts is not needed, the JAX-RS CheJsonProvider passes stream Reader and response Writer that can used directly by Gson, eliminating even the need of an intermediate full JSON texts.
  • Maintainability.
    A large part of the generated DTO implementation code is there to replicate the DTO data in JsonElement trees, bit by bit. The code section in the generator that takes care of generating this java code in the tool is the most complicated area in the already hard to maintain generator. Gson can do this automatically and more efficiently (previous point).

Previous behavior

  • Full JSON trees built when converting DTOs to JSONs.
  • Full JSON text is created before parsing DTOs from JAX-RS requests and (vise versa for responses).

New behavior

  • JSON trees eliminated by using Gson to generate the DTOs objects directly.
  • Full JSON texts eliminated by using request Readers and response Writers directly in DtoFactory.
  • Smaller files generated for DTO implementations: workspace API reduced by 30%, machine API by 29%

Changelog

Removes redundant in-memory JSON trees for faster DTO parsing performance.

Signed-off-by: Tareq Sharafy tareq.sha@gmail.com

@skabashnyuk
Copy link
Contributor

skabashnyuk commented Nov 17, 2016

Hello @tareqhs thanks for this PR. Do you have any statistic that confirms performance optimization?

@codenvy-ci
Copy link

Build # 1052 - FAILED

Please check console output at https://ci.codenvycorp.com/job/che-pullrequests-build/1052/ to view the results.

@tareksha
Copy link
Contributor Author

tareksha commented Nov 17, 2016

Hi @skabashnyuk ,
Here are my benchmark results:

Setup
Prepare a random WorkspaceDto, composed of:

  • 10 ProjectConfigDto, each composed of
    • 10-character name
    • 10-character type
    • 10-character path
    • 10 Link entries, each with a short name and 118-character command line
  • 10 CommandDto, each composed of
    • 10-character name
    • 10-character type
    • 5 simple attributes
  • 3 EnvironmentDto, each composed of
    • 3 ExtendedMachineDto, each with 5 attributes

All times below are in milliseconds.

Benchmark 1
Prepare a reusable Writer that has zero writing overhead (similar to /dev/null).
START TIMER
Call JsonSerializable.toJson(Writer) on the test DTO 20,000 times.
STOP TIMER
Because JsonSerializable.toJson(Writer) was introduced in the fix, the non-fixed benchmark actually does writer.write(dtoObj.toJson())

Without fix : 6601, 5569, 5931, 5479, 5486, 5838, 5523, 5556, 5521, 5495
With fix : 2945, 2714, 2692, 2890, 2704, 2738, 2700, 2705, 2724, 2756

Omitting the bootstrap pass, average benchmark time is reduced by 52%

Benchmark 2
Convert the full DTO to a JSON. Resulting text is made of 42,320 characters.
Open 20,000 UTF-8 streams that read the same content, all in-memory.
START TIMER
Call DtoFactory.createDtoFromJson(InputStream,Class) on all the open streams, one after another.
STOP TIMER

Without fix : 6758, 6460, 6480, 6492, 6602, 6455, 6431, 6478, 6400, 6617
With fix : 2533, 2361, 2337, 2301, 2347, 2294, 2300, 2367, 2451, 2474

Omitting the bootstrap pass, average benchmark time is reduced by 63%

* use Reader/Writer directly in DtoFactory, do not create full JSON texts
* read/write DTOs using direct streaming methods instead of building trees
* configure Gson type adapters instead of generating tree building code

Signed-off-by: Tareq Sharafy <tareq.sha@gmail.com>
@codenvy-ci
Copy link

@tareksha
Copy link
Contributor Author

hi @skabashnyuk , any update on this ?

@codenvy-ci
Copy link

Build # 1149 - FAILED

Please check console output at https://ci.codenvycorp.com/job/che-pullrequests-build/1149/ to view the results.

@skabashnyuk
Copy link
Contributor

ci-build

@codenvy-ci
Copy link

@tareksha
Copy link
Contributor Author

tareksha commented Dec 1, 2016

@skabashnyuk seems the ci passes now

@codenvy-ci
Copy link

@@ -176,6 +177,16 @@ protected String getJsonFieldName(Method getterMethod) {
return new ArrayList<>(getters.values());
}

protected Set<String> getSuperGetterNames(Class<?> dto) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

javadoc

@@ -46,5 +46,18 @@
void setParentField(String parentField);

ChildDto withParentField(String parentField);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of changes in DTOHierarchy class?

Copy link
Contributor Author

@tareksha tareksha Dec 24, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shadowed fields cause caused initial failure, fixed this and added them to make the behavior is consistent
@skabashnyuk

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate a bit more. explanation is not clear for me.

Copy link
Contributor Author

@tareksha tareksha Dec 26, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gson fails if a class has a shadowed field (a field with the same name as another field in a base class). To avoid this shadowed fields are omitted from generated classes and this test verifies the behavior. @skabashnyuk

@codenvy-ci
Copy link

@tareksha
Copy link
Contributor Author

hi @skabashnyuk , any update on this ?

@skabashnyuk
Copy link
Contributor

No updates. I'm looking for some free time-frame to make some integration and performance testing internally. Recently we made some performance testing of ws-agent and one of the slowest part was DTO parsing codenvy/codenvy#1466. Interesting how your PR can affect on this.

@skabashnyuk
Copy link
Contributor

@skabashnyuk skabashnyuk added status/open-for-dev An issue has had its specification reviewed and confirmed. Waiting for an engineer to take it. team/platform sprint/next kind/task Internal things, technical debt, and to-do tasks to be performed. labels Jan 17, 2017
@mkuznyetsov mkuznyetsov merged commit ac876e3 into eclipse-che:master Jan 31, 2017
@mkuznyetsov mkuznyetsov removed the status/open-for-dev An issue has had its specification reviewed and confirmed. Waiting for an engineer to take it. label Jan 31, 2017
@skabashnyuk skabashnyuk added this to the 5.2.0 milestone Jan 31, 2017
@bmicklea
Copy link

Added Changlog entry.

@tareksha tareksha deleted the dto_streaming branch February 1, 2017 06:00
@JamesDrummond JamesDrummond mentioned this pull request Feb 6, 2017
9 tasks
tareksha added a commit to tareksha/che-core that referenced this pull request Feb 13, 2017
down-port eclipse-che/che#3103

Change-Id: I1465e8b1023511e9ce248340c45f3cf9eb957157
tareksha added a commit to tareksha/che-core that referenced this pull request Feb 13, 2017
down-port eclipse-che/che#3103

Change-Id: Iea2704e625105ad483ee9f1348b217b0bc493d6e
Signed-off-by: Tareq Sharafy <tareq.sha@gmail.com>
tareksha added a commit to tareksha/che-core that referenced this pull request Feb 13, 2017
down-port eclipse-che/che#3103

Change-Id: Iea2704e625105ad483ee9f1348b217b0bc493d6e
Signed-off-by: Tareq Sharafy <tareq.sha@gmail.com>
tareksha added a commit to tareksha/che-core that referenced this pull request Feb 13, 2017
down-port eclipse-che/che#3103

Change-Id: Iea2704e625105ad483ee9f1348b217b0bc493d6e
Signed-off-by: Tareq Sharafy <tareq.sha@gmail.com>
tareksha added a commit to tareksha/che-core that referenced this pull request Feb 13, 2017
down-port eclipse-che/che#3103

Change-Id: Iea2704e625105ad483ee9f1348b217b0bc493d6e
Signed-off-by: Tareq Sharafy <tareq.sha@gmail.com>
tareksha added a commit to tareksha/che-core that referenced this pull request Feb 13, 2017
down-port eclipse-che/che#3103

Change-Id: Iea2704e625105ad483ee9f1348b217b0bc493d6e
Signed-off-by: Tareq Sharafy <tareq.sha@gmail.com>
tareksha added a commit to codenvy-legacy/che-core that referenced this pull request Feb 13, 2017
down-port eclipse-che/che#3103

Change-Id: Iea2704e625105ad483ee9f1348b217b0bc493d6e
Signed-off-by: Tareq Sharafy <tareq.sha@gmail.com>
JPinkney pushed a commit to JPinkney/che that referenced this pull request Aug 17, 2017
* Streaming approach for DTO serialization

* use Reader/Writer directly in DtoFactory, do not create full JSON texts
* read/write DTOs using direct streaming methods instead of building trees
* configure Gson type adapters instead of generating tree building code

Signed-off-by: Tareq Sharafy <tareq.sha@gmail.com>

* add missing javadoc
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/task Internal things, technical debt, and to-do tasks to be performed. sprint/current
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants