Skip to content

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scalar Indexing is not allowed on GPU array (KH instability) #3522

Closed
sangeethasankar01 opened this issue Mar 25, 2024 · 16 comments
Closed

Scalar Indexing is not allowed on GPU array (KH instability) #3522

sangeethasankar01 opened this issue Mar 25, 2024 · 16 comments

Comments

@sangeethasankar01
Copy link

sangeethasankar01 commented Mar 25, 2024

@glwagner I am attempting to run the Kelvin-Helmholtz instability example on a GPU, but the model fails and throws errors. Can someone help me to sort out this error? Please find the attached error below:

using Random, Statistics

mean_perturbation_kinetic_energy = Field(Average(1/2 * (u^2 + w^2)))
noise(x, z) = randn()
set!(model, u=noise, w=noise, b=noise)
rescale!(simulation.model, mean_perturbation_kinetic_energy, target_kinetic_energy=1e-6)
growth_rates, power_method_data = estimate_growth_rate(simulation, mean_perturbation_kinetic_energy, perturbation_vorticity, b)

@info "Power iterations converged! Estimated growth rate: $(growth_rates[end])"
Error: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore should be avoided.

If you want to allow scalar iteration, use `allowscalar` or `@allowscalar`
to enable scalar iteration globally or for the operations in question.
@sangeethasankar01 sangeethasankar01 changed the title @glwagner I am trying to run the Kelvin-Helmholtz instability example on a GPU; however, the model has failed, throwing some errors. Can someone help me to sort this error. Please find the attached error below: Scalar Indexing is not allowed on GPU array (KH instability) Mar 25, 2024
@glwagner
Copy link
Member

What line triggers the error?

@navidcy
Copy link
Collaborator

navidcy commented Mar 25, 2024

@sangeethasankar01 if you post the whole error message that you get with the stack trace we will understand more which line triggered this error.

@sangeethasankar01
Copy link
Author

sangeethasankar01 commented Mar 26, 2024

Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore should be avoided.

If you want to allow scalar iteration, use `allowscalar` or `@allowscalar`
to enable scalar iteration globally or for the operations in question.

Stacktrace:
  [1] error(s::String)
    @ Base .\error.jl:35
  [2] errorscalar(op::String)
    @ GPUArraysCore C:\Users\ADMIN\.julia\packages\GPUArraysCore\GMsgk\src\GPUArraysCore.jl:155
  [3] _assertscalar(op::String, behavior::GPUArraysCore.ScalarIndexing)
    @ GPUArraysCore C:\Users\ADMIN\.julia\packages\GPUArraysCore\GMsgk\src\GPUArraysCore.jl:128
  [4] assertscalar(op::String)
    @ GPUArraysCore C:\Users\ADMIN\.julia\packages\GPUArraysCore\GMsgk\src\GPUArraysCore.jl:116
  [5] getindex(A::CuArray{Float64, 3, CUDA.Mem.DeviceBuffer}, I::Int64)
    @ GPUArrays C:\Users\ADMIN\.julia\packages\GPUArrays\Hd5Sk\src\host\indexing.jl:48
  [6] scalar_getindex
    @ C:\Users\ADMIN\.julia\packages\GPUArrays\Hd5Sk\src\host\indexing.jl:34 [inlined]
  [7] _getindex
    @ C:\Users\ADMIN\.julia\packages\GPUArrays\Hd5Sk\src\host\indexing.jl:17 [inlined]
  [8] getindex
    @ C:\Users\ADMIN\.julia\packages\GPUArrays\Hd5Sk\src\host\indexing.jl:15 [inlined]
  [9] getindex
    @ C:\Users\ADMIN\.julia\packages\OffsetArrays\rMTtC\src\OffsetArrays.jl:422 [inlined]
 [10] getindex
    @ C:\Users\ADMIN\.julia\packages\Oceananigans\E4XVr\src\Fields\field.jl:540 [inlined]
 [11] rescale!(model::NonhydrostaticModel{Oceananigans.TimeSteppers.RungeKutta3TimeStepper{Float64, @NamedTuple{u::Field{Face, Center, Center, Nothing, RectilinearGrid{Float64, Periodic, Flat, Bounded, Float64, Float64, Float64, OffsetArrays.OffsetVector{Fl... (repeats 1 time)
 [12] top-level scope
    @ In[10]:9

This is an error message I am getting from my code. Please share your comments on how to resolve this issue.

@navidcy
Copy link
Collaborator

navidcy commented Mar 26, 2024

thanks!

ps: enclosing code/error in triple backticks

```
code
```

makes it much more readable. (You can edit the post and enclose the error msg in triple backticks)

@navidcy
Copy link
Collaborator

navidcy commented Mar 26, 2024

seems that the issue is coming from rescale!(mode) and if I had to guess from the [1, 1, 1] at

rescale_factor = (target_kinetic_energy / energy[1, 1, 1])

I'll try to reproduce this.

@glwagner
Copy link
Member

Try

rescale_factor = CUDA.@allowscalar (target_kinetic_energy / energy[1, 1, 1]) 

@navidcy
Copy link
Collaborator

navidcy commented Mar 27, 2024

You might have to call

using CUDA

at the top of the script.

@sangeethasankar01
Copy link
Author

@glwagner As you mentioned, I attempted to follow your suggestion by specifying the CUDA.@allowscalar. However, I still encountered the same error on a different line.

Code:

function rescale!(model, energy; target_kinetic_energy = 1e-6)
    compute!(energy)
    rescale_factor = CUDA.@allowscalar √(target_kinetic_energy / energy[1, 1, 1]) 
    #rescale_factor = √(target_kinetic_energy / energy[1, 1, 1])

    for f in merge(model.velocities, model.tracers)
        f .*= rescale_factor
    end

    return nothing
end

Error:

Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore should be avoided.

If you want to allow scalar iteration, use `allowscalar` or `@allowscalar`
to enable scalar iteration globally or for the operations in question.

Stacktrace:
  [1] error(s::String)
    @ Base .\error.jl:35
  [2] errorscalar(op::String)
    @ GPUArraysCore C:\Users\ADMIN\.julia\packages\GPUArraysCore\GMsgk\src\GPUArraysCore.jl:155
  [3] _assertscalar(op::String, behavior::GPUArraysCore.ScalarIndexing)
    @ GPUArraysCore C:\Users\ADMIN\.julia\packages\GPUArraysCore\GMsgk\src\GPUArraysCore.jl:128
  [4] assertscalar(op::String)
    @ GPUArraysCore C:\Users\ADMIN\.julia\packages\GPUArraysCore\GMsgk\src\GPUArraysCore.jl:116
  [5] getindex(A::CuArray{Float64, 3, CUDA.Mem.DeviceBuffer}, I::Int64)
    @ GPUArrays C:\Users\ADMIN\.julia\packages\GPUArrays\Hd5Sk\src\host\indexing.jl:48
  [6] scalar_getindex
    @ C:\Users\ADMIN\.julia\packages\GPUArrays\Hd5Sk\src\host\indexing.jl:34 [inlined]
  [7] _getindex
    @ C:\Users\ADMIN\.julia\packages\GPUArrays\Hd5Sk\src\host\indexing.jl:17 [inlined]
  [8] getindex
    @ C:\Users\ADMIN\.julia\packages\GPUArrays\Hd5Sk\src\host\indexing.jl:15 [inlined]
  [9] getindex
    @ .\subarray.jl:290 [inlined]
 [10] macro expansion
    @ .\multidimensional.jl:917 [inlined]
 [11] macro expansion
    @ .\cartesian.jl:64 [inlined]
 [12] macro expansion
    @ .\multidimensional.jl:912 [inlined]
 [13] _unsafe_getindex!
    @ .\multidimensional.jl:925 [inlined]
 [14] _unsafe_getindex(::IndexCartesian, ::SubArray{Float64, 3, CuArray{Float64, 3, CUDA.Mem.DeviceBuffer}, Tuple{UnitRange{Int64}, UnitRange{Int64}, UnitRange{Int64}}, false}, ::Base.Slice{Base.OneTo{Int64}}, ::Int64, ::Base.Slice{Base.OneTo{Int64}})
    @ Base .\multidimensional.jl:903
 [15] _getindex
    @ .\multidimensional.jl:889 [inlined]
 [16] getindex(::SubArray{Float64, 3, CuArray{Float64, 3, CUDA.Mem.DeviceBuffer}, Tuple{UnitRange{Int64}, UnitRange{Int64}, UnitRange{Int64}}, false}, ::Function, ::Int64, ::Function)
    @ Base .\abstractarray.jl:1291
 [17] estimate_growth_rate(simulation::Simulation{NonhydrostaticModel{Oceananigans.TimeSteppers.RungeKutta3TimeStepper{Float64, @NamedTuple{u::Field{Face, Center, Center, Nothing....

To solve this error, I tried to add the same comment CUDA.@allowscalar to the last line of the code below. Now, the code is running without any errors.

using Random, Statistics

mean_perturbation_kinetic_energy = Field(Average(1/2 * (u^2 + w^2)))
noise(x, z) = randn()
#CUDA.allowscalar()
set!(model, u=noise, w=noise, b=noise)
rescale!(simulation.model, mean_perturbation_kinetic_energy, target_kinetic_energy=1e-6)
growth_rates, power_method_data = CUDA.@allowscalar estimate_growth_rate(simulation, mean_perturbation_kinetic_energy, perturbation_vorticity, b)

@info "Power iterations converged! Estimated growth rate: $(growth_rates[end])"

Thank you. I genuinely appreciate your support. Please let me know about your suggestion about the added comment line (whether it is correct or not).

@sangeethasankar01
Copy link
Author

sangeethasankar01 commented Mar 27, 2024

You might have to call

using CUDA

at the top of the script.

@navidcy I already called this comment at the top of the script. However, I faced the same issue that I mentioned.

@navidcy
Copy link
Collaborator

navidcy commented Mar 27, 2024

Let me try to run the script on GPU.
Out of curiosity, which version of Oceananigans and CUDA you use? Could you print out the output of using Pkg; Pkg.status()?

@sangeethasankar01
Copy link
Author

sangeethasankar01 commented Mar 27, 2024

Let me try to run the script on GPU. Out of curiosity, which version of Oceananigans and CUDA you use? Could you print out the output of using Pkg; Pkg.status()?

Please find the details below.


julia> Pkg.status()
Status `C:\Users\ADMIN\.julia\environments\v1.10\Project.toml`
  [052768ef] CUDA v5.2.0
  [13f3f980] CairoMakie v0.11.9
  [7073ff75] IJulia v1.24.2
  [033835bb] JLD2 v0.4.46
  [bdcacae8] LoopVectorization v0.12.166
  [510215fc] Observables v0.5.5
  [9e8cae18] Oceananigans v0.90.11
  [91a5bcdd] Plots v1.40.2
  [438e738f] PyCall v1.96.4

@glwagner
Copy link
Member

Can you show the estimate_growth_rate function that you are using?

@glwagner
Copy link
Member

Writing @allowscalar will of course always fix a scalar indexing error. But you may end up with slow code, which defeats the purpose of using the GPU.

Taking a look at the script: https://github.com/CliMA/Oceananigans.jl/blob/main/examples/kelvin_helmholtz_instability.jl

we see that there are calls to getindex in a few places:

energy₀ = energy[1, 1, 1]

energy₁ = energy[1, 1, 1]

https://github.com/CliMA/Oceananigans.jl/blob/fb9455bc10721a880d9911a39735e37f1c4961e0/examples/kelvin_helmholtz_instability.jl#L256C1-L257C1

length(σ), energy[1, 1, 1], σ[end], convergence(σ))

push!(power_method_data, (ω=collect(interior(ω)[:, 1, :]), b=collect(interior(b)[:, 1, :]), σ=deepcopy(σ)))

Wherever we write energy[1, 1, 1], it is safe to use @allowscalar. This is just a single indexing operation and will be fast.

However, where we write collect, its probably better to rewrite this in a different way.

But note that collect is only used for plotting. So what we do here depends on the application.

@sangeethasankar01 are you trying to get this to run on the GPU for some specific purpose? Or is this merely an educational exercise?

I also think we should convert this issue to a discussion. I don't think we want to make these changes to the example in the source code.

@sangeethasankar01
Copy link
Author

@sangeethasankar01 are you trying to get this to run on the GPU for some specific purpose? Or is this merely an educational exercise?

I am a research scholar at IIT Madras, working under the supervision of Dr. Arjun Jagannathan. Currently, I'm exploring various problems related to flow instabilities. However, my current focus on this Kelvin-Helmholtz instability problem is purely for educational practice. I aim to understand the architecture of the code and optimize it for GPU execution.

@sangeethasankar01
Copy link
Author

Can you show the estimate_growth_rate function that you are using?

I made the below changes only to enable this code to run on the GPU.

Change:1

function rescale!(model, energy; target_kinetic_energy = 1e-6)
    compute!(energy)
    rescale_factor = CUDA.@allowscalar √(target_kinetic_energy / energy[1, 1, 1]) 
    #rescale_factor = √(target_kinetic_energy / energy[1, 1, 1])

    for f in merge(model.velocities, model.tracers)
        f .*= rescale_factor
    end

    return nothing
end

using Printf

## b) Check for convergence
convergence(σ) = length(σ) > 1 ? abs((σ[end] - σ[end-1]) / σ[end]) : 9.1e18  # pretty big (not Inf tho)

Change:2

using Random, Statistics

mean_perturbation_kinetic_energy = Field(Average(1/2 * (u^2 + w^2)))
noise(x, z) = randn()
#CUDA.allowscalar()
set!(model, u=noise, w=noise, b=noise)
rescale!(simulation.model, mean_perturbation_kinetic_energy, target_kinetic_energy=1e-6)
growth_rates, power_method_data = CUDA.@allowscalar estimate_growth_rate(simulation, mean_perturbation_kinetic_energy, perturbation_vorticity, b)

@info "Power iterations converged! Estimated growth rate: $(growth_rates[end])"

Could you please review this and provide your suggestions?

@navidcy
Copy link
Collaborator

navidcy commented Apr 1, 2024

Ideally you don't want CUDA.@allowscalar in front of estimate_growth_rate but rather you wanna go and modify estimate_growth_rate to ensure that you only put CUDA.@allowscalar in places that does not affect performance. Scalar operations on the GPU could diminish all the speedup you get from the GPU, so CUDA.@allowscalar must be used with great caution!! There is a relevant discussion in the docs.

(I'll convert this Issue into a Discussion btw.)

@CliMA CliMA locked and limited conversation to collaborators Apr 1, 2024
@navidcy navidcy converted this issue into discussion #3527 Apr 1, 2024

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants