Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

arena_memory_resource optimization: disable tracking allocated blocks by default #732

Merged

Conversation

rongou
Copy link
Contributor

@rongou rongou commented Mar 16, 2021

This is done similarly to #702.

Previously arena_memory_resource maintained a set of allocated blocks, but this was only used for reporting/debugging purposes. Maintaining this set requires a set::find at every deallocation, which can get expensive when there are many allocated blocks. This PR moves the tracking behind a default-undefined preprocessor flag. This results in some speedup in the random allocations benchmark for arena_memory_resource. Tracking can be enabled by defining RMM_POOL_TRACK_ALLOCATIONS.

This should also fix the Spark small shuffle buffer issue: NVIDIA/spark-rapids#1711

Before:

------------------------------------------------------------------------------------
Benchmark                                          Time             CPU   Iterations
------------------------------------------------------------------------------------
BM_RandomAllocations/arena_mr/1000/1            1.36 ms         1.36 ms          457
BM_RandomAllocations/arena_mr/1000/4            1.21 ms         1.21 ms          517
BM_RandomAllocations/arena_mr/1000/64           1.22 ms         1.22 ms          496
BM_RandomAllocations/arena_mr/1000/256          1.08 ms         1.07 ms          535
BM_RandomAllocations/arena_mr/1000/1024        0.949 ms        0.948 ms          583
BM_RandomAllocations/arena_mr/1000/4096        0.853 ms        0.848 ms          680
BM_RandomAllocations/arena_mr/10000/1           98.7 ms         98.3 ms            8
BM_RandomAllocations/arena_mr/10000/4           65.4 ms         65.4 ms            9
BM_RandomAllocations/arena_mr/10000/64          16.6 ms         16.5 ms           38
BM_RandomAllocations/arena_mr/10000/256         11.2 ms         11.2 ms           48
BM_RandomAllocations/arena_mr/10000/1024        9.45 ms         9.44 ms           62
BM_RandomAllocations/arena_mr/10000/4096        9.24 ms         9.20 ms           59
BM_RandomAllocations/arena_mr/100000/1          7536 ms         7536 ms            1
BM_RandomAllocations/arena_mr/100000/4          3002 ms         3002 ms            1
BM_RandomAllocations/arena_mr/100000/64          170 ms          170 ms            3
BM_RandomAllocations/arena_mr/100000/256         107 ms          107 ms            7
BM_RandomAllocations/arena_mr/100000/1024       96.0 ms         95.7 ms            6
BM_RandomAllocations/arena_mr/100000/4096       86.7 ms         86.7 ms            6

After:

------------------------------------------------------------------------------------
Benchmark                                          Time             CPU   Iterations
------------------------------------------------------------------------------------
BM_RandomAllocations/arena_mr/1000/1            1.20 ms         1.20 ms          519
BM_RandomAllocations/arena_mr/1000/4            1.08 ms         1.08 ms          588
BM_RandomAllocations/arena_mr/1000/64           1.11 ms         1.11 ms          552
BM_RandomAllocations/arena_mr/1000/256         0.957 ms        0.957 ms          611
BM_RandomAllocations/arena_mr/1000/1024        0.857 ms        0.857 ms          687
BM_RandomAllocations/arena_mr/1000/4096        0.795 ms        0.793 ms          724
BM_RandomAllocations/arena_mr/10000/1           73.0 ms         73.0 ms           10
BM_RandomAllocations/arena_mr/10000/4           45.7 ms         45.7 ms           14
BM_RandomAllocations/arena_mr/10000/64          14.4 ms         14.4 ms           40
BM_RandomAllocations/arena_mr/10000/256         9.87 ms         9.82 ms           60
BM_RandomAllocations/arena_mr/10000/1024        8.72 ms         8.72 ms           69
BM_RandomAllocations/arena_mr/10000/4096        7.32 ms         7.30 ms           85
BM_RandomAllocations/arena_mr/100000/1          6384 ms         6384 ms            1
BM_RandomAllocations/arena_mr/100000/4          2480 ms         2480 ms            1
BM_RandomAllocations/arena_mr/100000/64          147 ms          147 ms            5
BM_RandomAllocations/arena_mr/100000/256         103 ms          103 ms            7
BM_RandomAllocations/arena_mr/100000/1024       78.1 ms         78.1 ms            9
BM_RandomAllocations/arena_mr/100000/4096       72.3 ms         72.3 ms            9

@abellina

@rongou rongou added 3 - Ready for review Ready for review by team non-breaking Non-breaking change improvement Improvement / enhancement to an existing function cpp Pertains to C++ code labels Mar 16, 2021
@rongou rongou requested a review from harrism March 16, 2021 21:57
@rongou rongou requested a review from a team as a code owner March 16, 2021 21:57
@rongou rongou requested a review from codereport March 16, 2021 21:57
@rongou rongou self-assigned this Mar 16, 2021
Copy link
Member

@harrism harrism left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@rongou
Copy link
Contributor Author

rongou commented Mar 17, 2021

@harrism Thanks! Do I need a second approver for this or is it ready to merge?

Copy link
Contributor

@codereport codereport left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@codereport codereport added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 3 - Ready for review Ready for review by team labels Mar 17, 2021
@rongou
Copy link
Contributor Author

rongou commented Mar 17, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 3826a89 into rapidsai:branch-0.19 Mar 17, 2021
@harrism
Copy link
Member

harrism commented Mar 17, 2021

Yes you need two approvers, hence two reviewers auto-assigned.

@rongou rongou deleted the arena-optional-allocation-tracking branch May 20, 2021 16:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge cpp Pertains to C++ code improvement Improvement / enhancement to an existing function non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants