Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve context switching #761

Merged
merged 6 commits into from
Mar 12, 2021
Merged

Improve context switching #761

merged 6 commits into from
Mar 12, 2021

Conversation

maleadt
Copy link
Member

@maleadt maleadt commented Mar 11, 2021

Some performance (inferability) fixes, but also some bug fixes where we didn't switch to the appropriate context from finalizers. This would only surface in a multi-device setting (cc @marius311, in case you're still trying to do multi-device work in a single process).

@maleadt maleadt added enhancement New feature or request bugfix This gets something working again. cuda libraries Stuff about CUDA library wrappers. labels Mar 11, 2021
@codecov
Copy link

codecov bot commented Mar 11, 2021

Codecov Report

Merging #761 (72de062) into master (062ef3d) will decrease coverage by 0.00%.
The diff coverage is 80.72%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #761      +/-   ##
==========================================
- Coverage   78.26%   78.26%   -0.01%     
==========================================
  Files         122      122              
  Lines        7280     7269      -11     
==========================================
- Hits         5698     5689       -9     
+ Misses       1582     1580       -2     
Impacted Files Coverage Δ
lib/cudadrv/pool.jl 88.88% <0.00%> (-7.54%) ⬇️
lib/curand/random.jl 90.76% <0.00%> (+2.70%) ⬆️
src/pool/binned.jl 0.00% <0.00%> (ø)
src/pool/simple.jl 0.00% <0.00%> (ø)
src/state.jl 87.36% <85.71%> (+0.18%) ⬆️
src/pool.jl 83.89% <93.33%> (+0.37%) ⬆️
lib/cudadrv/context.jl 89.87% <100.00%> (+0.83%) ⬆️
lib/cudadrv/events.jl 91.17% <100.00%> (-0.26%) ⬇️
lib/cudadrv/memory.jl 85.03% <100.00%> (ø)
lib/cudadrv/module.jl 70.00% <100.00%> (-0.97%) ⬇️
... and 9 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 062ef3d...72de062. Read the comment docs.

Base automatically changed from tb/async_pin to master March 12, 2021 06:56
The call to is_pinned generates a 'failure' for every memory copy.

Enabling CUTENSOR tests causes some downstream failures in _other_
CUTENSOR tests though, so disable those (pinned memory) tests.
@marius311
Copy link
Contributor

Thanks for the ping, I would have missed this but the timing is perfect for me to give this another go, will report back 🤞

@maleadt maleadt merged commit 3d37595 into master Mar 12, 2021
@maleadt maleadt deleted the tb/context_switching branch March 12, 2021 09:34
@marius311
Copy link
Contributor

So funnily enough the very first thing I ran with this PR I saw finalizer error, then haven't been able to reproduce in an ~hour of some farily complicated stuff. So maybe there's still an edge case out there, but overall definitely seems like an improvement, thanks!

@maleadt
Copy link
Member Author

maleadt commented Mar 12, 2021

I've been seeing some other finalizer errors on CI too, it's always CUDA.cudaError_enum(0x000002c5) which indicates an API call after the context has been destroyed. That only ever can happen at process exit, so it probably doesn't actually hurt, but I'd like to find the source of it of course. So if you know how to reproduce, or even have some vague ideas about it, please let me know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix This gets something working again. cuda libraries Stuff about CUDA library wrappers. enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants