IDEA: Proxy block for adjoint #2979

Closed
connorjward opened this issue Jun 12, 2023 · 5 comments

Comments

@connorjward
Contributor

One of the key performance problems with the adjoint is the cost of setting up fresh solvers as the tape is traversed. Assuming that the adjoint problem involves a time loop, many of these solvers repeat work already done by other blocks (example). I think this problem arises because we unroll time loops on the tape, so information about the equivalence of solve blocks is lost.

My suggestion is as follows:

  1. Add a ProxyBlock class to pyadjoint that points to some original block.
  2. Add a new adjoint kwarg to decorated functions so the functions can know if they are repeated operations or not, and hence whether or not to store themselves as proxy blocks. Something like:
    while t < T:
        solve(..., ad_block_id="mysolve")
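
A minimal sketch of how the two steps above might fit together, written in plain Python rather than against the real pyadjoint API. `SolveBlock`, `ProxyBlock`, `Tape.record_solve` and the `setup_count` counter are hypothetical names invented for illustration; only the `ad_block_id` kwarg comes from the proposal.

```python
class SolveBlock:
    """Toy stand-in for a solve block that owns an expensive adjoint solver."""

    setup_count = 0  # hypothetical counter: how often the expensive setup runs

    def __init__(self, name):
        self.name = name
        self._adjoint_solver = None

    def adjoint_solver(self):
        # Build the solver lazily, the first time the tape traversal needs it.
        if self._adjoint_solver is None:
            SolveBlock.setup_count += 1
            self._adjoint_solver = object()  # placeholder for a real solver
        return self._adjoint_solver

    def evaluate_adj(self):
        return self.adjoint_solver()


class ProxyBlock:
    """Toy block that forwards to an original block instead of building a solver."""

    def __init__(self, original):
        self.original = original

    def evaluate_adj(self):
        return self.original.adjoint_solver()


class Tape:
    def __init__(self):
        self.blocks = []
        self._originals = {}  # ad_block_id -> first block recorded under that id

    def record_solve(self, ad_block_id=None):
        original = None
        if ad_block_id is not None:
            original = self._originals.get(ad_block_id)
        if original is None:
            # First occurrence (or no id given): record a full solve block.
            block = SolveBlock(ad_block_id)
            if ad_block_id is not None:
                self._originals[ad_block_id] = block
        else:
            # Repeated operation: record a proxy pointing at the original.
            block = ProxyBlock(original)
        self.blocks.append(block)
        return block


tape = Tape()
for _ in range(5):  # stands in for the `while t < T` loop
    tape.record_solve(ad_block_id="mysolve")

# Traverse the tape in reverse, as an adjoint evaluation would.
solvers = [block.evaluate_adj() for block in reversed(tape.blocks)]
```

In this toy model the expensive setup runs once for the whole loop. A real proxy block would presumably still need its own per-timestep inputs and outputs, which the sketch leaves out.
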
@dham
Member

dham commented Jun 12, 2023

I think this is something like where we ought to go. I'm not sure whether proxy block is the right way to do it.

The way we share state between forward solves right now is by having all the related solve blocks share some state information. This could be expanded to more shared state (particularly the adjoint solves).

I think it would be worth fleshing out which of these approaches is preferable. Maybe put it on the meeting agenda.

@colinjcotter
Contributor

I like this idea of leaving it up to the coder to decide.

@dham
Member

dham commented Jun 12, 2023

> I like this idea of leaving it up to the coder to decide

I don't think either of these options does that. This is still the same taping process.

What is proposed is that if, for example, a NonlinearVariationalSolver has its solve method called twice, you get either:

  1. A Solve block the first time and then a Proxy block pointing at the Solve block the second time.
  2. Two solve blocks but they both have an (e.g.) ._ad_block_shared_state member which contains the data that is shared between the two blocks (the forward and adjoint solvers, for example).
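
A rough sketch of the second option, in plain Python rather than real pyadjoint or Firedrake code. `SharedSolveState`, `SharedStateSolveBlock` and `ToySolver` are hypothetical names; `_ad_block_shared_state` is the attribute name floated above, and its contents here are an assumption.

```python
class SharedSolveState:
    """Data shared by every block recorded from the same solver object."""

    def __init__(self):
        self.adjoint_solver = None
        self.setup_count = 0  # hypothetical counter for the expensive setup

    def get_adjoint_solver(self):
        # Build the adjoint solver once; later blocks reuse it.
        if self.adjoint_solver is None:
            self.setup_count += 1
            self.adjoint_solver = object()  # placeholder for a real solver
        return self.adjoint_solver


class SharedStateSolveBlock:
    """A full solve block that keeps per-step data but shares solver state."""

    def __init__(self, shared_state, step):
        self._ad_block_shared_state = shared_state
        self.step = step  # per-block data stays on the block itself

    def evaluate_adj(self):
        return self._ad_block_shared_state.get_adjoint_solver()


class ToySolver:
    """Stand-in for a reusable solver object whose solve() is taped."""

    def __init__(self, tape):
        self.tape = tape
        self._shared_state = SharedSolveState()

    def solve(self):
        # Every call records a new block carrying the same shared state.
        block = SharedStateSolveBlock(self._shared_state, step=len(self.tape))
        self.tape.append(block)


tape = []
solver = ToySolver(tape)
solver.solve()
solver.solve()
```

Both entries on the tape are ordinary solve blocks, so the taping process is unchanged; the only difference from today is how much lives in the shared object.
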

@connorjward
Contributor Author

A related pipe dream of mine is for us to employ enough smart caching that calling the solve function performs nearly as well as creating and reusing solver objects.
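
As a toy illustration of the caching idea, the sketch below memoises a solver factory with `functools.lru_cache`. The `get_solver` and `solve` functions, the `build_stats` counter and the tuple used as a cache key are all hypothetical; choosing a sound cache key for a real variational problem is the hard part and is not addressed here.

```python
from functools import lru_cache

build_stats = {"count": 0}  # hypothetical counter for expensive solver builds


@lru_cache(maxsize=None)
def get_solver(problem_key):
    """Build a solver for a hashable problem description, at most once per key."""
    build_stats["count"] += 1
    return {"problem": problem_key}  # placeholder for an expensive solver object


def solve(problem_key, rhs):
    """Function-style solve that transparently reuses a cached solver."""
    solver = get_solver(problem_key)
    return (solver["problem"], rhs)  # placeholder for the real solution


# Repeated calls with the same key, as in a time loop, build the solver once.
results = [solve(("heat", "CG1"), rhs=step) for step in range(4)]
```
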

This would all be interesting to discuss in this week's meeting.

@connorjward
Contributor Author

The conclusion from this week's meeting is that a proxy block like this is practically equivalent to creating and reusing a solver object, because reusing the solver naturally connects the solve blocks.

I still want to find ways to optimise solver instantiation such that these strategies aren't required, but this specific ProxyBlock idea isn't the answer.
