Optimize the memory usage of certain graphs #2381

Merged
sighingnow merged 4 commits into alibaba:main from ht/fragment-mem-opt
Jan 14, 2023

Conversation

sighingnow
Collaborator

@sighingnow sighingnow commented Jan 13, 2023

What do these changes do?

  • Expose the retain_oid option to Python
  • Upgrade vineyard to the latest version to address the memory usage issues
  • Add errors="ignore" to avoid decoding/encoding errors
  • Enable int32_t OID type support
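
As a rough illustration of the errors="ignore" item: the description does not say where the argument is applied, so this sketch assumes it refers to the standard error handler accepted by Python's bytes.decode and str.encode. The sample byte string is made up for the example and is not taken from the PR.

```python
# Illustrative only: this input is hypothetical, not data from the PR.
raw = b"vertex-\xff-label"  # \xff is not a valid UTF-8 byte

# With the default handler (errors="strict"), decoding raises.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# errors="ignore" silently drops the undecodable byte instead of raising.
lenient = raw.decode("utf-8", errors="ignore")

# The same handler works when encoding: characters the target codec
# cannot represent are dropped.
ascii_bytes = "caf\u00e9".encode("ascii", errors="ignore")

print(strict_failed, lenient, ascii_bytes)
```

Note that this handler is lossy: two different inputs can decode to the same string once the offending bytes are dropped, which may matter if the decoded values are used as identifiers.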

Benchmarking

Settings:

  • 1 worker
  • loading from CSV (710MB)
    • nodes: 330MB, 21865475
    • edges: 380MB, 11089373
  • workload: loading (generate_eid=False) -> project -> pagerank

Memory usage:

  • int64_t oid:
    • retain_oid=True:
      • after loading graph, peak: 3.53GB
      • after project, peak: 3.72GB
      • after running the application, peak: 4.49GB
    • retain_oid=False:
      • after loading graph, peak: 3.33GB
      • after project, peak: 3.53GB
      • after running the application, peak: 4.29GB
  • string oid:
    • retain_oid=True:
      • after loading graph, peak: 5.16GB
      • after project, peak: 5.18GB
      • after running the application, peak: 6.00GB
    • retain_oid=False:
      • after loading graph, peak: 4.68GB
      • after project, peak: 4.70GB
      • after running the application, peak: 5.51GB
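
To make the effect of retain_oid easier to read off, the following sketch computes the difference between the two settings at each stage. The peak values are copied from the list above; the helper name `savings` and the stage keys are chosen here for illustration only.

```python
# Peak memory in GB, copied from the benchmark figures above.
# Each pair is (retain_oid=True, retain_oid=False).
peaks = {
    "int64_t": {
        "load": (3.53, 3.33),
        "project": (3.72, 3.53),
        "app": (4.49, 4.29),
    },
    "string": {
        "load": (5.16, 4.68),
        "project": (5.18, 4.70),
        "app": (6.00, 5.51),
    },
}


def savings(retained, dropped):
    """Return (GB saved, percent saved) when retain_oid is switched off."""
    saved = round(retained - dropped, 2)
    percent = round(100 * (retained - dropped) / retained, 1)
    return saved, percent


report = {
    oid: {stage: savings(*pair) for stage, pair in stages.items()}
    for oid, stages in peaks.items()
}

for oid, stages in report.items():
    for stage, (saved, percent) in stages.items():
        print(f"{oid:8s} {stage:8s} saves {saved:.2f} GB ({percent:.1f}%)")
```

On these numbers, retain_oid=False saves roughly 0.2GB (about 4.5% to 5.7%) with int64_t OIDs and roughly 0.48GB to 0.49GB (about 8.2% to 9.3%) with string OIDs.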

Related issue number

Fixes #2127
Fixes #2269

See also: v6d-io/v6d#1151

@sighingnow sighingnow force-pushed the ht/fragment-mem-opt branch 3 times, most recently from 3fd1240 to bfb2746 Compare January 13, 2023 05:58
Signed-off-by: Tao He <linzhu.ht@alibaba-inc.com>
@sighingnow sighingnow force-pushed the ht/fragment-mem-opt branch 2 times, most recently from c38d7e5 to 0980132 Compare January 13, 2023 07:20
@codecov-commenter

codecov-commenter commented Jan 13, 2023

Codecov Report

Merging #2381 (953ccbf) into main (e5aed5a) will increase coverage by 38.05%.
The diff coverage is 89.02%.

Additional details and impacted files

@@             Coverage Diff             @@
##             main    #2381       +/-   ##
===========================================
+ Coverage   34.38%   72.43%   +38.05%     
===========================================
  Files          88       88               
  Lines        9711     9752       +41     
===========================================
+ Hits         3339     7064     +3725     
+ Misses       6372     2688     -3684     
Impacted Files Coverage Δ
python/graphscope/analytical/app/java_app.py 24.36% <ø> (-4.57%) ⬇️
python/graphscope/client/connection.py 38.61% <0.00%> (ø)
python/graphscope/framework/dag_utils.py 65.89% <ø> (+30.79%) ⬆️
python/graphscope/client/session.py 75.37% <40.00%> (+13.34%) ⬆️
python/graphscope/framework/graph_builder.py 88.57% <50.00%> (ø)
python/graphscope/tests/unittest/test_lazy.py 96.21% <50.00%> (+96.21%) ⬆️
python/graphscope/framework/graph.py 85.37% <80.00%> (+25.69%) ⬆️
python/graphscope/tests/conftest.py 82.71% <92.00%> (+34.83%) ⬆️
python/graphscope/client/archive.py 64.15% <100.00%> (+28.30%) ⬆️
python/graphscope/config.py 100.00% <100.00%> (ø)
... and 70 more

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 97b7224...953ccbf.

Signed-off-by: Tao He <linzhu.ht@alibaba-inc.com>
@yecol
Collaborator

yecol commented Jan 13, 2023

Responsive patch! 💐

Signed-off-by: Tao He <linzhu.ht@alibaba-inc.com>
Signed-off-by: Tao He <linzhu.ht@alibaba-inc.com>
@sighingnow sighingnow merged commit 45cec05 into alibaba:main Jan 14, 2023
@sighingnow sighingnow deleted the ht/fragment-mem-opt branch January 14, 2023 02:20
@github-actions
Contributor

github-actions bot commented Jan 14, 2023

😭 Deploy PR Preview 45cec05 failed. Build logs

🤖 By surge-preview

Development

Successfully merging this pull request may close these issues.

  • Reduce peak memory occupation during graph construction
  • Reduce memory usage during graph construction
4 participants