Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CUDA 11.2: Add MemoryAsyncPool to support malloc_async #4592

Merged
merged 27 commits into from Apr 6, 2021

Conversation

leofang
Copy link
Member

@leofang leofang commented Feb 1, 2021

Blocked by #4537. The new stuff starts at 0b840b1.

This PR enables the following canonical usage, similar to CuPy's other memory pools:

cupy.cuda.set_allocator(cupy.cuda.MemoryAsyncPool().malloc)

Similar to MemoryPool, this new MemoryAsyncPool supports multiple devices, which is not transparent if just using the bare malloc_async() from #4537.

Moreover, having a wrapper like MemoryAsyncPool allows more possibilities, such as

  • Setting which cudaMemPool_t handle to use for each device via the pool constructor
  • Enforce memory alignment
  • Error handling when OOM occurs
  • Matching the interface of other CuPy pools (despite most of them do not work as we are not doing any bookkeeping here)

In the future we could support MemoryAsyncPool.set_limit() by using cudaMemPoolTrimTo(). I do not add it in this PR because it's not yet clear to me how nicely it can be when working with other libraries that also use the async pool.

@leofang leofang changed the title [WIP] Add MemoryAsyncPool supporting stream-ordered memory allocator [WIP] CUDA 11.2: Add MemoryAsyncPool supporting stream-ordered memory allocator Feb 1, 2021
@kmaehashi kmaehashi added cat:feature New features/APIs prio:medium labels Feb 2, 2021
@leofang leofang changed the title [WIP] CUDA 11.2: Add MemoryAsyncPool supporting stream-ordered memory allocator CUDA 11.2: Add MemoryAsyncPool supporting malloc_async Feb 5, 2021
@leofang leofang marked this pull request as ready for review February 5, 2021 03:06
@leofang leofang changed the title CUDA 11.2: Add MemoryAsyncPool supporting malloc_async CUDA 11.2: Add MemoryAsyncPool to support malloc_async Feb 5, 2021
@leofang
Copy link
Member Author

leofang commented Feb 9, 2021

Jenkins, test this please

@chainer-ci
Copy link
Member

Jenkins CI test (for commit b05f5a8, target branch master) failed with status FAILURE.

@leofang
Copy link
Member Author

leofang commented Feb 9, 2021

Jenkins, test this please

@chainer-ci
Copy link
Member

Jenkins CI test (for commit 5b5ed3a, target branch master) succeeded!

@chainer-ci
Copy link
Member

Jenkins CI test (for commit e6538b1, target branch master) failed with status FAILURE.

@leofang
Copy link
Member Author

leofang commented Mar 19, 2021

@emcastillo Looks like the CI could randomly hit OOM in the added tests. What should I do? Make the tests less memory-hungry? Mark them slow?

@leofang
Copy link
Member Author

leofang commented Mar 19, 2021

Jenkins, test this please

@chainer-ci
Copy link
Member

Jenkins CI test (for commit a5e5255, target branch master) failed with status FAILURE.

@leofang
Copy link
Member Author

leofang commented Mar 20, 2021

Jenkins, test this please

@chainer-ci
Copy link
Member

Jenkins CI test (for commit a5e5255, target branch master) succeeded!

@leofang
Copy link
Member Author

leofang commented Mar 20, 2021

@emcastillo Tests added in this PR passed twice in a row. Should be safe now?

@chainer-ci
Copy link
Member

@emcastillo This pull-request is marked as st:test-and-merge, but there were no activities for the last 3 days. Could you check?

1 similar comment
@chainer-ci
Copy link
Member

@emcastillo This pull-request is marked as st:test-and-merge, but there were no activities for the last 3 days. Could you check?

@emcastillo emcastillo removed the st:test-and-merge (deprecated) Ready to merge after test pass. label Mar 28, 2021
@emcastillo
Copy link
Member

lets rebase

@leofang
Copy link
Member Author

leofang commented Mar 28, 2021

lets rebase

Done!

Jenkins, test this please

@chainer-ci
Copy link
Member

Jenkins CI test (for commit 9e1d636, target branch master) succeeded!

@leofang
Copy link
Member Author

leofang commented Apr 4, 2021

Jenkins, test this please

@chainer-ci
Copy link
Member

Jenkins CI test (for commit c49e805, target branch master) succeeded!

@emcastillo emcastillo added this to the v10.0.0a1 milestone Apr 6, 2021
@emcastillo emcastillo added st:test-and-merge (deprecated) Ready to merge after test pass. to-be-backported Pull-requests to be backported to stable branch labels Apr 6, 2021
@mergify mergify bot merged commit 5880b98 into cupy:master Apr 6, 2021
chainer-ci pushed a commit to chainer-ci/cupy that referenced this pull request Apr 6, 2021
CUDA 11.2: Add `MemoryAsyncPool` to support `malloc_async`
@leofang leofang deleted the mempool_impl branch April 6, 2021 01:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cat:feature New features/APIs prio:medium st:test-and-merge (deprecated) Ready to merge after test pass. to-be-backported Pull-requests to be backported to stable branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants