Are we actually @unrolling when we think we are? #3374

Closed
glwagner opened this issue Nov 2, 2023 · 14 comments · Fixed by #3403
Labels
cleanup 🧹 Paying off technical debt

Comments

Member

glwagner commented Nov 2, 2023

On Julia 1.10, users are met with an avalanche (or rather, a tsunami) of warnings:

warning: /Users/gregorywagner/.julia/packages/KernelAbstractions/WoCk1/src/extras/loopinfo.jl:28:0: loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering

For example we try

@unroll for i = 2:Nx
    cᵏ⁻¹ = get_coefficient(i-1, j, k, grid, c, p, tridiagonal_direction, args...)
    bᵏ = get_coefficient(i, j, k, grid, b, p, tridiagonal_direction, args...)
    aᵏ⁻¹ = get_coefficient(i-1, j, k, grid, a, p, tridiagonal_direction, args...)
    t[i, j, k] = cᵏ⁻¹ / β
    β = bᵏ - aᵏ⁻¹ * t[i, j, k]
    fᵏ = get_coefficient(i, j, k, grid, f, p, tridiagonal_direction, args...)
    # If the problem is not diagonally-dominant such that `β ≈ 0`,
    # the algorithm is unstable and we elide the forward pass update of ϕ.
    definitely_diagonally_dominant = abs(β) > 10 * eps(float_eltype(ϕ))
    !definitely_diagonally_dominant && break
    ϕ[i, j, k] = (fᵏ - aᵏ⁻¹ * ϕ[i-1, j, k]) / β
end

but this loop probably can't be unrolled because Nx is a runtime value, not a compile-time constant.
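To illustrate the distinction between a runtime value and compile-time information, here is a minimal sketch. The types `RuntimeGrid` and `StaticGrid` and the function `extent` are hypothetical names invented for this example; they are not Oceananigans' grid types.

```julia
# Hypothetical grid types, for illustration only (not Oceananigans' grids).
struct RuntimeGrid
    Nx :: Int               # size stored in a field: a runtime value
end

struct StaticGrid{Nx} end   # size stored as a type parameter: compile-time information

extent(grid::RuntimeGrid) = grid.Nx
extent(::StaticGrid{Nx}) where Nx = Nx

# Grids with different sizes share one type here, so a single compiled
# method has to serve every possible Nx.
same_type = typeof(RuntimeGrid(8)) === typeof(RuntimeGrid(16))          # true

# Here each size is its own type, so the compiler can specialize on Nx.
different_types = typeof(StaticGrid{8}()) !== typeof(StaticGrid{16}())  # true
```

With the first layout, a loop over `1:grid.Nx` has a trip count the compiler cannot see, which is the situation in the tridiagonal solver loop above.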

I don't know if we ever @unroll properly...

Seems like the easiest thing is just to stop pretending that we @unroll.

@jlk9

@glwagner glwagner added the cleanup 🧹 Paying off technical debt label Nov 2, 2023
Collaborator

navidcy commented Nov 6, 2023

(Indeed "tsunami" is more appropriate here rather than "avalanche".)

Collaborator

navidcy commented Dec 31, 2023

Hm... removing all unrolls from solve_batched_tridiagonal_system_kernel didn't heal the warnings...

Member Author

glwagner commented Jan 2, 2024

aren't there more than that

Member Author

glwagner commented Jan 2, 2024

don't remove them all because some could be legit

Collaborator

navidcy commented Jan 2, 2024

Sure, I only removed them just to see if they were the culprit.

Member Author

glwagner commented Jan 2, 2024

That was just an example. There's a lot of erroneous usage.

This might help:

(base) gregorywagner:src/ (glw/fix-adapt) $ grep -r unroll ./*
./Advection/Advection.jl:using KernelAbstractions.Extras.LoopInfo: @unroll
./Advection/stretched_weno_smoothness.jl:        @unroll for j = 1:3
./Advection/stretched_weno_smoothness.jl:        @unroll for j = 1:3
./BoundaryConditions/fill_halo_regions_open.jl:# and need to unroll a loop over the boundary normal direction.
./BoundaryConditions/fill_halo_regions.jl:using KernelAbstractions.Extras.LoopInfo: @unroll
./BoundaryConditions/fill_halo_regions_periodic.jl:using KernelAbstractions.Extras.LoopInfo: @unroll
./BoundaryConditions/fill_halo_regions_periodic.jl:    @unroll for i = 1:H
./BoundaryConditions/fill_halo_regions_periodic.jl:    @unroll for j = 1:H
./BoundaryConditions/fill_halo_regions_periodic.jl:    @unroll for k = 1:H
./BoundaryConditions/fill_halo_regions_periodic.jl:    @unroll for n = 1:M
./BoundaryConditions/fill_halo_regions_periodic.jl:        @unroll for i = 1:H
./BoundaryConditions/fill_halo_regions_periodic.jl:    @unroll for n = 1:M
./BoundaryConditions/fill_halo_regions_periodic.jl:        @unroll for j = 1:H
./BoundaryConditions/fill_halo_regions_periodic.jl:    @unroll for n = 1:M
./BoundaryConditions/fill_halo_regions_periodic.jl:        @unroll for k = 1:H
./BoundaryConditions/fill_halo_regions_flux.jl:using KernelAbstractions.Extras.LoopInfo: @unroll
./Fields/regridding_fields.jl:using KernelAbstractions.Extras.LoopInfo: @unroll
./Fields/regridding_fields.jl:    @inbounds @unroll for k = 1:target_grid.Nz
./Fields/regridding_fields.jl:            @unroll for k_src = k₋_src:k₊_src-1
./Fields/regridding_fields.jl:    @inbounds @unroll for j = 1:target_grid.Ny
./Fields/regridding_fields.jl:            @unroll for j_src = j₋_src:j₊_src-1
./Fields/regridding_fields.jl:    @inbounds @unroll for i = 1:target_grid.Nx
./Fields/regridding_fields.jl:            @unroll for i_src = i₋_src:i₊_src-1
./Fields/field_boundary_buffers.jl:using KernelAbstractions.Extras.LoopInfo: @unroll
./Models/NonhydrostaticModels/update_hydrostatic_pressure.jl:    @unroll for k in grid.Nz-1 : -1 : 1
./Models/NonhydrostaticModels/NonhydrostaticModels.jl:using KernelAbstractions.Extras.LoopInfo: @unroll
./Models/ShallowWaterModels/store_shallow_water_tendencies.jl:    @unroll for t in 1:3
./Models/ShallowWaterModels/ShallowWaterModels.jl:using KernelAbstractions.Extras.LoopInfo: @unroll
./Models/HydrostaticFreeSurfaceModels/HydrostaticFreeSurfaceModels.jl:using KernelAbstractions.Extras.LoopInfo: @unroll
./Models/HydrostaticFreeSurfaceModels/compute_w_from_continuity.jl:    @unroll for k in 2:grid.Nz+1
./Models/HydrostaticFreeSurfaceModels/split_explicit_free_surface_kernels.jl:using KernelAbstractions.Extras.LoopInfo: @unroll
./Models/HydrostaticFreeSurfaceModels/split_explicit_free_surface_kernels.jl:    # hand unroll first loop
./Models/HydrostaticFreeSurfaceModels/split_explicit_free_surface_kernels.jl:    @unroll for k in 2:grid.Nz
./Models/HydrostaticFreeSurfaceModels/split_explicit_free_surface_kernels.jl:    # hand unroll first loop
./Models/HydrostaticFreeSurfaceModels/split_explicit_free_surface_kernels.jl:    @unroll for k in 2:grid.Nz
./Solvers/batched_tridiagonal_solver.jl:        @unroll for i = 2:Nx
./Solvers/batched_tridiagonal_solver.jl:        @unroll for i = Nx-1:-1:1
./Solvers/batched_tridiagonal_solver.jl:        @unroll for j = 2:Ny
./Solvers/batched_tridiagonal_solver.jl:        @unroll for j = Ny-1:-1:1
./Solvers/batched_tridiagonal_solver.jl:        @unroll for k = 2:Nz
./Solvers/batched_tridiagonal_solver.jl:        @unroll for k = Nz-1:-1:1
./Solvers/Solvers.jl:using KernelAbstractions.Extras.LoopInfo: @unroll
./Solvers/fourier_tridiagonal_poisson_solver.jl:    @unroll for i in 2:Nx-1
./Solvers/fourier_tridiagonal_poisson_solver.jl:    @unroll for j in 2:Ny-1
./Solvers/fourier_tridiagonal_poisson_solver.jl:    @unroll for k in 2:Nz-1
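A rough way to triage a listing like this is to separate loops whose bounds are integer literals from everything else. The sketch below is only a text heuristic: the names `LITERAL_RANGE`, `has_literal_bounds`, and `needs_review` are made up for this example, and a flagged loop such as `1:H` may still be fine if `H` turns out to be a type parameter.

```julia
# Matches `@unroll for x = a:b` or `@unroll for x in a:s:b` where a, s, b are integer literals.
const LITERAL_RANGE = r"@unroll\s+for\s+\S+\s*(=|in)\s*-?\d+\s*:\s*(-?\d+\s*:\s*)?-?\d+\s*$"

has_literal_bounds(line) = occursin(LITERAL_RANGE, line)

# An `@unroll for` line needs a closer look when its bounds are not plain literals.
needs_review(line) = occursin("@unroll for", line) && !has_literal_bounds(line)

# A few lines copied from the grep output above.
lines = [
    "./Advection/stretched_weno_smoothness.jl:        @unroll for j = 1:3",
    "./Models/ShallowWaterModels/store_shallow_water_tendencies.jl:    @unroll for t in 1:3",
    "./Solvers/batched_tridiagonal_solver.jl:        @unroll for i = 2:Nx",
    "./Models/NonhydrostaticModels/update_hydrostatic_pressure.jl:    @unroll for k in grid.Nz-1 : -1 : 1",
    "./Solvers/Solvers.jl:using KernelAbstractions.Extras.LoopInfo: @unroll",
]

flagged = filter(needs_review, lines)   # keeps the `2:Nx` and `grid.Nz-1 : -1 : 1` loops
```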

Member Author

glwagner commented Jan 2, 2024

Any place where the loop limits are not types, it's wrong. It only works if the limits are known via types (so they are known at compile time rather than runtime). Typically this would require using Val{N} or Val{H} but even then it can fail sometimes.

Collaborator

navidcy commented Jan 2, 2024

Any place where the loop limits are not types, it's wrong. It only works if the limits are known via types (so they are known at compile time rather than runtime). Typically this would require using Val{N} or Val{H} but even then it can fail sometimes.

I'm not sure I understand what you mean here. Can you give an example that's OK and one that's not?
Also, @unroll comes from KernelAbstractions.Extras.LoopInfo.@unroll, right? The docstring is not really helping me on this:

help?> KernelAbstractions.Extras.LoopInfo.@unroll
  @unroll expr


  Takes a for loop as expr and informs the LLVM unroller to fully unroll it, if it is safe to do so and the loop count is known.

  ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

  @unroll N expr


  Takes a for loop as expr and informs the LLVM unroller to unroll it N times, if it is safe to do so.

In particular, I don't know what "if it is safe to do so" refers to.
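For reference, the two forms in the docstring look like this in use. This is only a sketch, assuming KernelAbstractions is installed; the function names are hypothetical. Both macros attach a hint for LLVM, and the optimizer may decline it, which is what the warning at the top of this issue reports.

```julia
using KernelAbstractions.Extras.LoopInfo: @unroll

# Form 1: request a full unroll. The trip count (3) is a literal, so it is known.
function sum_first_three(x)
    s = zero(eltype(x))
    @unroll for i = 1:3
        @inbounds s += x[i]
    end
    return s
end

# Form 2: request an unroll factor of 4. The trip count is a runtime value here.
function partially_unrolled_sum(x)
    s = zero(eltype(x))
    @unroll 4 for i = 1:length(x)
        @inbounds s += x[i]
    end
    return s
end
```

Either way the result is the same as the plain loop; only the generated code can differ.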

Collaborator

navidcy commented Jan 2, 2024

Any place where the loop limits are not types, it's wrong. It only works if the limits are known via types (so they are known at compile time rather than runtime). Typically this would require using Val{N} or Val{H} but even then it can fail sometimes.

Do you simply mean that

@unroll for j in 1:4; do_this(); end

is OK but

N=4
@unroll for j in 1:N; do_this(); end

is not?

Member Author

glwagner commented Jan 4, 2024

Any place where the loop limits are not types, it's wrong. It only works if the limits are known via types (so they are known at compile time rather than runtime). Typically this would require using Val{N} or Val{H} but even then it can fail sometimes.

Do you simply mean that

@unroll for j in 1:4; do_this(); end

is OK but

N=4
@unroll for j in 1:N; do_this(); end

is not?

Both are fine the way you have written them, because even in the second case the compiler is able to infer that N is always 4. But @unroll for i = 1:grid.Nx is not fine because grid.Nx is not known at compile time; it is passed into the function as a property of the grid. At compile time, only the type of the grid is known, not the values it contains.

If one is careful, the limits of the loop can be passed into a function as compile-time information. Typically this is done with objects like Val(N), which have the type Val{N}. Since N is then type information, it is known to the compiler.
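As a small illustration of how `Val` moves a value into the type domain, here is a sketch in plain Julia. The function `halo_sum` is a hypothetical name for this example; `ntuple(f, Val(H))` is the Base method that takes its length as type information.

```julia
# Val(3) is an instance of the type Val{3}: the number lives in the type.
val_type_matches = typeof(Val(3)) === Val{3}   # true

# H is recovered from the argument's type, so it is known when this method is compiled.
halo_sum(x, ::Val{H}) where H = sum(ntuple(i -> x[i], Val(H)))

x = collect(1.0:10.0)
total = halo_sum(x, Val(3))   # sums x[1], x[2], x[3]
```

Each distinct `H` triggers its own compiled specialization, which is the price of making the loop length visible to the compiler.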

Member Author

glwagner commented Jan 4, 2024

Can you give an example that's OK and one that's not?

The example that is ok is when the limit of the loop N is passed in via an argument with type Val{N}. Then N is known to the compiler. This is what I tried to indicate, sorry for not being clear.

Collaborator

navidcy commented Jan 7, 2024

Gotcha!

Member Author

glwagner commented Jan 7, 2024

Here's the basic structure

Not ok because loop limits are runtime values:

function loop(N)
    @unroll for i = 1:N
        # etc.
    end
end

Maybe ok because, in principle, the loop limit is encoded in type information:

function loop(::Val{N}) where N
    @unroll for i = 1:N
        # etc.
    end
end

The second case is called with loop(Val(N)).
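A filled-in version of the two cases might look like the following. This is a sketch that assumes KernelAbstractions is installed; `runtime_sum` and `static_sum` are hypothetical names, and the loop body is a stand-in for real work.

```julia
using KernelAbstractions.Extras.LoopInfo: @unroll

# Not ok: N is an ordinary argument, so the trip count is a runtime value.
function runtime_sum(x, N)
    s = zero(eltype(x))
    @unroll for i = 1:N
        @inbounds s += x[i]
    end
    return s
end

# Maybe ok: N comes from the type Val{N}, so it is known at compile time.
function static_sum(x, ::Val{N}) where N
    s = zero(eltype(x))
    @unroll for i = 1:N
        @inbounds s += x[i]
    end
    return s
end

x = collect(1.0:10.0)
a = runtime_sum(x, 4)
b = static_sum(x, Val(4))
```

Both calls compute the same sum. To see whether the second one was actually unrolled, one option is to inspect the output of `@code_llvm static_sum(x, Val(4))` from InteractiveUtils and compare it with the runtime version.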

Collaborator

navidcy commented Feb 6, 2024
