AD: Extend differentiable programming #462

@ranocha

The first steps towards differentiating through Trixi were made in #461. The next steps outlined there are:

  • Benchmark and improve performance of jacobian_ad_forward (improving AD #464)
  • Allow differentiating through a complete solve including time integration (improving AD #464)
  • Make callbacks differentiable
    • AnalysisCallback (Extending AD: AnalysisCallback #487)
    • SaveRestartCallback and SaveSolutionCallback: Should they save the dual numbers or the underlying plain floats (using Printf.tofloat in Julia v1.6)? Decision in the Trixi meeting: save the floats for now, until we need the more complicated variant.
    • AMRCallback: Needs improvement of the indicators, see below
    • VisualizationCallback: We should probably only visualize the real parts, e.g. using Printf.tofloat
  • Fix indicators and shock-capturing volume integrals (the issue is explained in a comment on "AD via ForwardDiff" #461 and tracked in "jacobian_ad_forward Not working with IndicatorHennemannGassner" #1252)
  • Check Euler+gravity
  • Adapt the mesh types to allow differentiating geometric parameters
  • Look for matrix coloring techniques and return sparse matrices to speed up the computation
  • Can we use something like Measurements.jl? (Extend AD tutorial #522)
  • Can we use ModelingToolkit.jl?
  • What about other modes of AD, e.g. reverse mode or something like the other tools used in Flux.jl?
    • Most of them do not support mutating operations, so they are not really useful for us.
  • What about Enzyme.jl? It promises to support mutating operations and works at the LLVM level
  • Check Tapir.jl
  • Integration with ChainRules to make some parts more efficient, e.g. by using the explicit formulae of Jesse Chan and his student?
    • This currently does not work in general since ChainRulesCore does not support mutating operations (mutating calls JuliaDiff/ChainRulesCore.jl#242). However, it would still be nice to see whether we can get more efficient versions using these explicit formulae.
    • Maybe we can provide chain rules for some of the core methods (logarithmic mean, numerical fluxes) to speed up the calculations?
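
To make the last item more concrete, here is a minimal sketch of what an explicit derivative for the logarithmic mean could look like, checked against central finite differences. It uses only Base Julia. The names `log_mean_naive`, `log_mean_naive_gradient`, and `finite_difference_gradient` are made up for this illustration, and the naive formula is an assumption chosen for brevity: it is singular at `a == b` and is not the numerically stable version one would use in a solver. Wiring such a derivative into ChainRules is not shown.

```julia
# Hypothetical helper names for illustration only.

# Naive logarithmic mean L(a, b) = (b - a) / (log(b) - log(a)),
# valid for a, b > 0 with a != b.
log_mean_naive(a, b) = (b - a) / (log(b) - log(a))

# Hand-derived partial derivatives of the naive formula, with d = log(b) - log(a):
#   dL/da = (L / a - 1) / d
#   dL/db = (1 - L / b) / d
function log_mean_naive_gradient(a, b)
    d = log(b) - log(a)
    L = (b - a) / d
    return ((L / a - 1) / d, (1 - L / b) / d)
end

# Central finite differences, used only as an independent check.
function finite_difference_gradient(f, a, b; h = 1.0e-6)
    da = (f(a + h, b) - f(a - h, b)) / (2h)
    db = (f(a, b + h) - f(a, b - h)) / (2h)
    return (da, db)
end

a, b = 1.0, 2.0
exact = log_mean_naive_gradient(a, b)
approx = finite_difference_gradient(log_mean_naive, a, b)
println("explicit:           ", exact)
println("finite differences: ", approx)
```

Since the logarithmic mean is homogeneous of degree one, `a * exact[1] + b * exact[2]` should reproduce `log_mean_naive(a, b)` up to rounding, which gives a second cheap sanity check on the formulae.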

Labels: enhancement (New feature or request)
