Scalarization transformation #2499

Open
LonelyCat124 opened this issue Feb 8, 2024 · 7 comments
Labels: enhancement, NG-ARCH (Issues relevant to the GPU parallelisation of LFRic and other models expected to be used in NG-ARCH)

Comments

@LonelyCat124
Collaborator

In parts of the physics codes for LFRic we come across loop patterns such as this:

do i = ...
  do l = ...
    temp_in(l) = array(l,i) * array2(l,i)
  end do
  call exp_v(n, temp, temp_in)
  do l = ...
    ! do something based on temp(l)
  end do
end do

Once we inline and fuse this loop structure, we get loops like this:

do i = ...
  do l = ...
    temp_in(l) = array(l,i) * array2(l,i)
    temp(l) = exp(temp_in(l))
    ! do something based on temp(l)
  end do
end do

For cases such as this, temp_in and temp can be scalarised, provided that nothing outside the loop depends on their values (which would already be a strange implementation choice, since only the values from the final iteration of i would be visible). This would help us remove some false dependencies: if we use collapse on this loop there is a write-write dependency on temp(l), but it is unnecessary because temp can just be a local scalar instead.

The goal of this transformation would be to take code like the above (post all the other inline and loop fusion transformations) and generate:

do i = ...
  do l = ...
    temp_in_scalar = array(l,i) * array2(l,i)
    temp_scalar = exp(temp_in_scalar)
    ! do something based on temp_scalar
  end do
end do
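
As a sanity check on the rewrite above, the following Python sketch mirrors both loop nests and compares their results. It is only an illustration: the function names, the `out` array and the `+ 1.0` stand-in for "do something" are assumptions, and Python lists indexed `[i][l]` replace the Fortran `array(l,i)`.

```python
import math


def fused_with_array_temps(array, array2):
    """Fused loop nest that still writes the per-l temporary arrays."""
    n_i, n_l = len(array), len(array[0])
    temp_in = [0.0] * n_l
    temp = [0.0] * n_l
    out = [[0.0] * n_l for _ in range(n_i)]
    for i in range(n_i):
        for l in range(n_l):
            temp_in[l] = array[i][l] * array2[i][l]
            temp[l] = math.exp(temp_in[l])
            out[i][l] = temp[l] + 1.0  # hypothetical "do something"
    return out


def fused_with_scalars(array, array2):
    """Same loop nest with the temporaries replaced by local scalars."""
    n_i, n_l = len(array), len(array[0])
    out = [[0.0] * n_l for _ in range(n_i)]
    for i in range(n_i):
        for l in range(n_l):
            temp_in_scalar = array[i][l] * array2[i][l]
            temp_scalar = math.exp(temp_in_scalar)
            out[i][l] = temp_scalar + 1.0  # hypothetical "do something"
    return out


array = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
array2 = [[1.0, 2.0, 3.0], [0.5, 0.25, 0.125]]
same = fused_with_array_temps(array, array2) == fused_with_scalars(array, array2)
print(same)
```

Each `(i, l)` iteration writes its temporaries before reading them, so no value flows between iterations. That property is what allows the scalar version to be collapsed without the write-write dependency.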

At this point, we can apply target + loop with collapse, which will lead to fewer kernel launches and less synchronization, and probably better performance on GPU.

@LonelyCat124
Collaborator Author

First step - on Reference: Find next (and previous?) Reference to this symbol.

@hiker
Collaborator

hiker commented Apr 18, 2024

> First step - on Reference: Find next (and previous?) Reference to this symbol.

The VariableAccess information already contains all accesses in order.

@LonelyCat124
Collaborator Author

> First step - on Reference: Find next (and previous?) Reference to this symbol.

> The VariableAccess information already contains all accesses in order.

Yeah - the plan is to use that data with this transformation.
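
One way such an ordered access list could drive the check is sketched below. This is a simplified illustration, not PSyclone's API: the `Access` record and the `can_scalarize` function are hypothetical, index expressions are compared as plain text, and a write after the loop is assumed to overwrite the whole array.

```python
from dataclasses import dataclass


@dataclass
class Access:
    """Hypothetical record of one access; not PSyclone's AccessInfo."""
    kind: str      # "READ" or "WRITE"
    index: str     # index expression as text, e.g. "l"
    in_loop: bool  # True if the access is inside the candidate loop


def can_scalarize(accesses):
    """Return True if an array looks safe to replace by a scalar.

    `accesses` must be in execution order.
    """
    inside = [a for a in accesses if a.in_loop]
    if not inside:
        return False
    # Every access in the loop must use the same index expression.
    if len({a.index for a in inside}) != 1:
        return False
    # The first access in the loop must be a write, otherwise the
    # iteration consumes a value produced elsewhere.
    if inside[0].kind != "WRITE":
        return False
    # After the loop, the array must be overwritten before it is read,
    # or never be touched again.
    last_inside = max(i for i, a in enumerate(accesses) if a.in_loop)
    after = accesses[last_inside + 1:]
    return not after or after[0].kind == "WRITE"


# temp(l) from the example: written, then read, never used afterwards.
temp = [Access("WRITE", "l", True), Access("READ", "l", True)]
# Same pattern, but something after the loop reads the array.
temp_read_later = temp + [Access("READ", "l", False)]
print(can_scalarize(temp), can_scalarize(temp_read_later))
```

The second case corresponds to code outside the loop depending on the temporary's values, which is the situation the issue description excludes.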

@LonelyCat124
Collaborator Author

LonelyCat124 commented Apr 19, 2024

@hiker I'm a bit confused about how to use the VariablesAccessInfo - is there a linkage between VariablesAccessInfo and the node for a given read/write access? E.g. For a given routine if I wanted to find (in order) all the accesses/dependencies on a given symbol (slash signature) can I do that with the VariablesAccessInfo? I can find the sequence of reads/writes but if I wanted to refer back to the relevant Reference is that currently possible?

Ah, I guess it's .node in AccessInfo?

@hiker
Collaborator

hiker commented Apr 19, 2024

> @hiker I'm a bit confused about how to use the VariablesAccessInfo - is there a linkage between VariablesAccessInfo and the node for a given read/write access? E.g. For a given routine if I wanted to find (in order) all the accesses/dependencies on a given symbol (slash signature) can I do that with the VariablesAccessInfo? I can find the sequence of reads/writes but if I wanted to refer back to the relevant Reference is that currently possible?

> Ah, I guess it's .node in AccessInfo?

Yes :) I saw the comments in the wrong order, and commented elsewhere :)

sergisiso added a commit that referenced this issue Apr 26, 2024
(Towards #2499) Initial implementation of next_access function on Reference
@LonelyCat124
Collaborator Author

@sergisiso If the next access to an array reference (that is otherwise a potential target for scalarization) is contained within an IfBlock that isn't also an ancestor of the Loop I'm "scalarizing", I will just ignore it rather than dealing with the if condition - unless you think we should specifically try to handle if blocks here?
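
The "IfBlock that isn't also an ancestor of the Loop" test can be sketched on a toy tree. The `Node` class and the `is_conditional_relative_to` helper below are hypothetical stand-ins, not PSyclone classes; the sketch only detects the situation and leaves open how the transformation then reacts to it.

```python
class Node:
    """Hypothetical tree node; `kind` is e.g. "Routine", "Loop", "IfBlock"."""

    def __init__(self, kind, parent=None):
        self.kind = kind
        self.parent = parent

    def ancestors(self):
        """Return all ancestors, nearest first."""
        node, found = self.parent, []
        while node is not None:
            found.append(node)
            node = node.parent
        return found


def is_conditional_relative_to(access, loop):
    """True if `access` sits in an IfBlock that does not also enclose `loop`."""
    loop_ancestors = {id(n) for n in loop.ancestors()}
    return any(
        n.kind == "IfBlock" and id(n) not in loop_ancestors
        for n in access.ancestors()
    )


routine = Node("Routine")
outer_if = Node("IfBlock", routine)     # encloses the loop
loop = Node("Loop", outer_if)
inner_if = Node("IfBlock", outer_if)    # follows the loop, inside outer_if
guarded_read = Node("Reference", inner_if)
plain_read = Node("Reference", outer_if)

print(is_conditional_relative_to(guarded_read, loop))
print(is_conditional_relative_to(plain_read, loop))
```

Here `guarded_read` is flagged because `inner_if` does not enclose the loop, while `plain_read` is not, since its only IfBlock ancestor also encloses the loop.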

@LonelyCat124
Collaborator Author

Also I realise that I probably need to be careful with next_access, since I think next_access for something like:

a(i) = a(i) + 1

will point to the LHS of the assignment, so I should also check the RHS of the assignment in this case for scalarization.
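
The ordering problem can be illustrated without PSyclone. In the sketch below an assignment is a hypothetical `(lhs, rhs_names)` tuple, and `accesses_in_order` lists reads on the right-hand side before the write on the left-hand side, since the right-hand side is evaluated first. Both function names are assumptions made for this example.

```python
def accesses_in_order(statements, name):
    """List ("READ" | "WRITE", statement_index) accesses to `name`.

    Each statement is a hypothetical (lhs, rhs_names) tuple. Reads on the
    right-hand side come before the write on the left-hand side.
    """
    result = []
    for idx, (lhs, rhs_names) in enumerate(statements):
        if name in rhs_names:
            result.append(("READ", idx))
        if lhs == name:
            result.append(("WRITE", idx))
    return result


def overwritten_before_read(statements, name, start):
    """True if the first access to `name` from statement `start` on is a write."""
    later = [a for a in accesses_in_order(statements, name) if a[1] >= start]
    return bool(later) and later[0][0] == "WRITE"


# Statement 0: a(i) = b(i) * 2
# Statement 1: a(i) = a(i) + 1
statements = [("a", ["b"]), ("a", ["a"])]
print(accesses_in_order(statements, "a"))
print(overwritten_before_read(statements, "a", 1))
```

A check that looked only at the left-hand side of statement 1 would conclude that `a(i)` is overwritten, whereas the ordered list shows the read happening first, so the value written by statement 0 is still needed.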

LonelyCat124 added a commit that referenced this issue Apr 29, 2024
LonelyCat124 added the NG-ARCH (Issues relevant to the GPU parallelisation of LFRic and other models expected to be used in NG-ARCH) label May 3, 2024
LonelyCat124 self-assigned this May 22, 2024

2 participants