Obscure GPU error when time stepping a model with Julia 1.5 and Oceananigans v0.33.0 #828
Could be a failure of https://github.com/JuliaGPU/Adapt.jl/blob/a62a2568f1199d0e7b154eb6001afa629d31038e/src/base.jl#L3
@maleadt Was wondering if you've seen this issue before or if you have any ideas on how to fix it?
Adapt.jl isn't relevant here; that's for CPU->GPU conversions. This is a tuple operation not getting inferred properly, leading to a dynamic call to a runtime function. Those aren't supported. With 1.5 there have been a bunch of latency optimizations that have affected inference quality, e.g. when not forcing specialization on functions with […]. Your best bet here would be to step through using Cthulhu and figure out which code isn't inferring properly. If it's tough code to get through, you could always add the call to […].
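For reference, a minimal sketch of the Cthulhu workflow described above; the function here is a placeholder, not the actual Oceananigans call:

```julia
using Cthulhu

# Placeholder entry point; substitute the real time-stepping call.
f(x) = x + 1

# @descend opens an interactive view of the typed code. Recurse into
# callees and look for non-concrete (Any/Union) return types, which are
# what force the dynamic runtime calls the GPU compiler can't handle.
@descend f(1)
```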
Also note the scalar iteration; you might want to look into that too.
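The scalar-iteration warning can be turned into a hard error so the offending line shows up in a stack trace; a sketch using CUDA.jl's switch:

```julia
using CUDA

# Disallow scalar indexing of GPU arrays: any accidental element-wise
# access from the CPU now throws an error instead of silently running slowly.
CUDA.allowscalar(false)
```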
Thanks @maleadt! If it's actually in the […]

Mostly just documenting what I did in this comment so I can pick up later on: I tried playing around with Cthulhu.jl a little bit, didn't understand everything it was telling me, but ended up putting […]

Not sure if this is useful, but it seems like the […]
True haha, we only finally started dynamically disabling scalar ops yesterday in PR #851 after being burnt by scalar ops one too many times!
You're looking at CPU code here; you rather want to prefix your application with […]
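A sketch of inspecting device-side typed code with CUDA.jl's reflection macros (the kernel and arguments are placeholders, not the Oceananigans pressure solver):

```julia
using CUDA

# Placeholder kernel; substitute the real kernel launch.
function my_kernel!(a)
    i = threadIdx().x
    @inbounds a[i] = 2f0 * a[i]
    return nothing
end

a = CUDA.zeros(Float32, 32)

# Prints the code_warntype output for the GPU-compiled kernel; Any/Union
# entries mark the inference failures that trigger dynamic runtime calls.
@device_code_warntype @cuda threads=32 my_kernel!(a)
```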
But seeing how it's fairly complex, you might be better off using […]
Thank you @maleadt for the very helpful comments. We toyed around a bit with forced specialization on functions in the past, in other contexts where we had difficulties with inlining. We should look into that. I'm not sure we are forcing specialization everywhere, and it sounds like we may want to, to be conservative regarding GPU compilation. Looking at the error, it seems the problem function is the one in Oceananigans.jl/src/Solvers/solve_for_pressure.jl, lines 36 to 42 at ea33ab0, right? Since the default model is "horizontally periodic", […]
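As background on the forced-specialization point: Julia heuristically avoids specializing on `Function`- and `Type`-valued arguments unless they are called or a type parameter forces it. A minimal sketch of the pattern:

```julia
# Julia may reuse one generic method instance for any f passed here,
# which can degrade inference in downstream GPU kernels:
apply(f, x) = f(x)

# Adding a type parameter forces a specialized method instance per
# concrete function type, which typically restores full inference:
apply_forced(f::F, x) where {F} = f(x)
```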
The function in question is defined in Oceananigans.jl/src/Solvers/index_permutations.jl, lines 11 to 17 at ea33ab0.
Is it worth toying around with different ways to write […]?
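One direction such a rewrite could take, sketched with a hypothetical permutation (not the actual Oceananigans definition): replacing `if`/`else` branches with `ifelse` keeps the expression branch-free and type-stable, which GPU compilers tend to infer more reliably:

```julia
# Hypothetical index permutation; the real one lives in
# src/Solvers/index_permutations.jl. ifelse evaluates both arms but
# produces straight-line code with a single concrete return type.
@inline permute_index(i, N) = ifelse(i <= N ÷ 2, 2i - 1, 2(N - i + 1))
```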
I think the easiest is to check with Cthulhu which of these expressions are badly inferred and result in a call to […].
If we're not sure what the cause of the problem is and Cthulhu.jl doesn't help (or, in my case, I'm too much of a noob), we could try commenting out the pressure solve (or parts of it) to try and isolate the problematic line(s) or kernel(s). I did try playing around with refactoring […]
So I was able to isolate the issue to the […]

@maleadt @vchuravy Does this error message say anything useful? I was not able to reproduce it in a minimal working example for CUDA.jl or KernelAbstractions.jl, unfortunately... I'll open a PR to make this issue a priority.
Might have something to do with the new tuple/named tuple syntax, but I thought Julia 1.4 -> 1.5 wasn't supposed to introduce any breaking changes? https://julialang.org/blog/2020/08/julia-1.5-highlights/#implicit_keyword_argument_values
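For context, the linked 1.5 feature is the implicit keyword/named-tuple shorthand; a sketch of what the new syntax does:

```julia
x, y = 1, 2

# New in Julia 1.5: implicit names in named tuples and keyword calls.
nt = (; x, y)          # equivalent to (x = x, y = y)

f(; x, y) = x + y
f(; x, y)              # equivalent to f(x = x, y = y)
```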
Julia 1.4.2
Julia 1.5.0