Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimizations for Node class (CPU time and Memory usage) #3233

Merged
merged 6 commits into from
Jun 15, 2021

Conversation

4everTheOne
Copy link
Contributor

@4everTheOne 4everTheOne commented Apr 22, 2021

Why replacing LinkedList with ArrayList?

Lets take a look at the table below which compares LinkedList with the ArrayList.

Comparison

Operation LinkedList time complexity ArrayList time complexity Preferred
Insert at last index O(1) O(1) (If array copy operation is Considered then O(N)) LinkedList
Insert at given index O(N) O(N) LinkedList
Search by value O(N) O(N) ArrayList
Get by index O(N) O(1) ArrayList
Remove by value O(N) O(N) LinkedList
Remove by index O(N) O(N) LinkedList

Source: Performance Analysis of ArrayList and LinkedList in Java by Anant Mishra

There are some situations where is preferred to use the LinkedList. This happens when the list is updated more time than it's read.

Benefits

In our case LinkedList isn't the best choice since the fields are read more time then they are written to (ex: childNodes is read every time someone calls Node#findAll(...)). In this scenario, the ArrayList fits better, since we can iterate it with a complexity of O(1) instead of O(N), reducing the CPU time required to execute the operation.

Another benefit of using ArrayList instead of LinkedList is the use of less memory overhead. LinkedList requires additional information, pointing where the next node is in memory while the ArrayList uses an sequential Array to store all the values.

Downsides

The insertions of nodes in a tree, now has a complexity of O(N) instead of the previous O(1).

Why was HashSet replaced with ArrayList?

The ideia behind of Node#observers is to keep record of the list of observers to be notified when something happens on the node. The problem with the use of the HashSet in this scenario it uses a lot of memory and it's functionality is not properly used. An HashSet it worth to be used when we need to use the HashSet#contains() that has complexity oif O(1). In this case we are not taking any advantage of this case property, since the method that is used more often is HashSet#foreach. In this case, we can replace the use of a HashSet with an ArrayList that is more CPU and memory friendly.

Why was ArrayList#trimToSize() called after removing items from the list?

This call tries to reduce the memory footprint when an item is removed from a node. When the items is removed from the ArrayList the Array size is the same. If we imagine the Array has space allocated for 30 elements but only stores 2, we are wasting precious memory space. Calling this method makes sure that we are only using the space we really need.

Evaluation

The evaluation was made using the tools VisualVM and JavaParser Symbol Solver tests. In this evaluation, the execution time was reduced the execution time by +- 30 seconds.

Current:
0-Original

Proposed:
3-Final-Optimazation

A HashSet uses more memory that a List and we can achieve an identical behavior by checking if the observer is present in the List before adding it.
ArrayList is much more efficient in read operations than LinkedList making the process of iterating thru the list faster.
@4everTheOne 4everTheOne changed the title Optimizations for Node class (CPU time) Optimizations for Node class (CPU time and Memory usage) Apr 22, 2021
@4everTheOne 4everTheOne deleted the efficiency-improvements branch April 26, 2021 16:17
@4everTheOne 4everTheOne restored the efficiency-improvements branch April 27, 2021 15:51
@4everTheOne 4everTheOne reopened this Apr 27, 2021
@4everTheOne 4everTheOne marked this pull request as draft April 27, 2021 15:55
@4everTheOne 4everTheOne marked this pull request as ready for review April 27, 2021 16:02
@codecov
Copy link

codecov bot commented Jun 1, 2021

Codecov Report

Merging #3233 (a9c5c04) into master (ccdde0e) will increase coverage by 0.003%.
The diff coverage is 80.000%.

❗ Current head a9c5c04 differs from pull request most recent head 2c7be03. Consider uploading reports for the commit 2c7be03 to get more accurate results
Impacted file tree graph

@@               Coverage Diff               @@
##              master     #3233       +/-   ##
===============================================
+ Coverage     57.336%   57.340%   +0.003%     
  Complexity      2570      2570               
===============================================
  Files            616       616               
  Lines          34306     34309        +3     
  Branches        5802      5802               
===============================================
+ Hits           19670     19673        +3     
  Misses         12667     12667               
  Partials        1969      1969               
Flag Coverage Δ
AlsoSlowTests 57.340% <80.000%> (+0.003%) ⬆️
javaparser-core 52.987% <80.000%> (+0.005%) ⬆️
javaparser-symbol-solver 35.442% <60.000%> (-0.001%) ⬇️
jdk-10 57.327% <80.000%> (-0.003%) ⬇️
jdk-11 57.333% <80.000%> (+0.003%) ⬆️
jdk-12 57.327% <80.000%> (-0.003%) ⬇️
jdk-13 57.333% <80.000%> (+0.003%) ⬆️
jdk-14 57.333% <80.000%> (+0.003%) ⬆️
jdk-15 57.327% <80.000%> (+0.003%) ⬆️
jdk-16 57.333% <80.000%> (+0.003%) ⬆️
jdk-8 57.332% <80.000%> (-0.003%) ⬇️
jdk-9 57.330% <80.000%> (+0.006%) ⬆️
macos-latest 57.331% <80.000%> (+0.003%) ⬆️
ubuntu-latest 57.326% <80.000%> (+0.003%) ⬆️
windows-latest 57.331% <80.000%> (+0.003%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
.../src/main/java/com/github/javaparser/ast/Node.java 71.511% <80.000%> (+0.250%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ccdde0e...2c7be03. Read the comment docs.

@maartenc maartenc merged commit 3871088 into javaparser:master Jun 15, 2021
@maartenc
Copy link
Contributor

Merged :-)
Thanks!

@MysterAitch MysterAitch added the PR: Changed A PR that changes implementation without changing behaviour (e.g. performance) label Jul 6, 2021
@MysterAitch MysterAitch added this to the next release milestone Jul 6, 2021
@4everTheOne 4everTheOne deleted the efficiency-improvements branch December 8, 2022 00:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PR: Changed A PR that changes implementation without changing behaviour (e.g. performance)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants