Use a memory pool per device. #746

Merged: 3 commits merged on Mar 3, 2021

maleadt (Member) commented Mar 2, 2021

Fixes #742. Doesn't seem to impact allocation time, e.g. comparing to #137 (comment):

Memory pool timings:
 ────────────────────────────────────────────────
                                   Time          
                           ──────────────────────
     Tot / % measured:           786s / 5.78%    

 Section           ncalls     time   %tot     avg
 ────────────────────────────────────────────────
 pooled alloc       8.79M    31.0s  68.2%  3.52μs
   1.1 alloc        8.79M    21.3s  47.0%  2.42μs
     pooled free    55.6k   81.3ms  0.18%  1.46μs
   pooled free       115k    169ms  0.37%  1.47μs
 pooled free        8.62M    14.4s  31.8%  1.68μs
 ────────────────────────────────────────────────
Allocator timings:
 ────────────────────────────────────────
                           Time          
                   ──────────────────────
 Tot / % measured:       786s / 3.22%    

 Section   ncalls     time   %tot     avg
 ────────────────────────────────────────
 alloc      8.79M    14.8s  58.5%  1.68μs
 free       8.79M    10.5s  41.5%  1.19μs
 ────────────────────────────────────────

A very small difference. Either way, there's still 50% of execution time being left on the table (comparing the time it takes to do the pool allocation vs the time spent calling the CUDA allocator).
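One possible reading of the "50%" figure, sketched in Python from the numbers in the two tables above. Pairing the top-level "pooled alloc" and "pooled free" rows with the allocator's "alloc" and "free" rows is an assumption about what is being compared, and `overhead_fraction` is a hypothetical helper, not part of CUDA.jl:

```python
# Totals in seconds, copied from the timing tables above.
# Assumption: the top-level pool rows are the right ones to compare
# against the raw allocator rows.
pool = {"alloc_s": 31.0, "free_s": 14.4}        # "pooled alloc" / "pooled free"
allocator = {"alloc_s": 14.8, "free_s": 10.5}   # "alloc" / "free"


def overhead_fraction(outer_s, inner_s):
    """Share of the outer time that is not spent in the inner call."""
    return (outer_s - inner_s) / outer_s


# Allocation path only: pool time vs. time inside the allocator.
alloc_overhead = overhead_fraction(pool["alloc_s"], allocator["alloc_s"])

# Allocation and free combined.
total_overhead = overhead_fraction(
    pool["alloc_s"] + pool["free_s"],
    allocator["alloc_s"] + allocator["free_s"],
)

print(f"alloc path overhead:   {alloc_overhead:.0%}")   # 52%
print(f"alloc + free overhead: {total_overhead:.0%}")   # 44%
```

Under this reading, roughly half of the time spent in the pool on the allocation path is bookkeeping rather than time inside the allocator. The combined figure also matches the "% measured" headers, since 5.78% and 3.22% of 786s come to about 45.4s and 25.3s.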

maleadt marked this pull request as draft March 2, 2021 17:46
codecov (bot) commented Mar 3, 2021

Codecov Report

Merging #746 (afc145a) into master (9fe3015) will increase coverage by 0.03%.
The diff coverage is 52.77%.

@@            Coverage Diff             @@
##           master     #746      +/-   ##
==========================================
+ Coverage   79.46%   79.50%   +0.03%     
==========================================
  Files         124      124              
  Lines        7471     7456      -15     
==========================================
- Hits         5937     5928       -9     
+ Misses       1534     1528       -6     
| Impacted Files | Coverage Δ | |
|---|---|---|
| lib/cudnn/CUDNN.jl | 67.27% <0.00%> (ø) | |
| src/pool/binned.jl | 0.00% <0.00%> (ø) | |
| src/pool/simple.jl | 0.00% <0.00%> (ø) | |
| src/state.jl | 80.00% <0.00%> (-7.18%) | ⬇️ |
| src/utilities.jl | 91.37% <ø> (-0.56%) | ⬇️ |
| src/pool.jl | 87.21% <88.63%> (+4.51%) | ⬆️ |
| src/pool/split.jl | 97.12% <96.92%> (-0.05%) | ⬇️ |
| lib/cudadrv/memory.jl | 84.00% <100.00%> (ø) | |
| lib/cudadrv/pool.jl | 96.42% <100.00%> (ø) | |
| lib/cudadrv/stream.jl | 83.33% <100.00%> (+2.08%) | ⬆️ |
| ... and 9 more | | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 9fe3015...afc145a.

maleadt marked this pull request as ready for review March 3, 2021 09:40
maleadt merged commit 6ec21d3 into master Mar 3, 2021
maleadt deleted the tb/pool_per_device branch March 3, 2021 09:47
Successfully merging this pull request may close these issues.

Per-device memory pool