In [1]:
%load_ext autoreload
%autoreload 2

# Setting up equations using Automatic Differentiation and abstract equations
This tutorial is meant as an introduction to a new framework for defining and working with (non-linear) equations in PorePy. Specifically, the aim is to develop an approach which:
1. Gives a transparent way of specifying non-linear, multiphysics and multi-dimensional equations.
2. Speeds up assembly of Jacobian matrices, in particular for geometries with many subdomains.
3. Is better suited to combine with general linear and non-linear solvers etc.

## Disclaimer
The framework, referred to as the "Ad framework" (Ad = automatic differentiation), is currently (Spring 2021) under more or less active development. The tutorial below is intended to give an overview of the design and use of the framework, to ease adoption in new projects. Since the code is under active development, it will change; hopefully, this tutorial will keep track. In the same spirit, the tutorial will strive to point out uncertainties about how the code will actually function, indicate code that is likely to change, document best practice where opinions on this exist, etc.

## Background
Over its first few years of existence, PorePy was mainly applied to linear problems; the development was focused more on mixed-dimensionality than on non-linearities. There were two notable exceptions:
1. Simulations of viscous fingering in fractured domains, paper [here](https://arxiv.org/abs/1906.10472)
2. Simulations of thermo-poromechanics coupled with deformation of fractures, where the latter was modeled as a contact mechanics problem, see for instance [this paper](https://arxiv.org/abs/2008.06289)

The two projects both used a Newton-type approach to solving the resulting non-linear system, but took fundamentally different approaches to the linearization: In the contact-mechanics problem, the Jacobian matrix was formed 'by hand' on a block-matrix level, so, to correctly linearize complex expressions, the user was responsible for applying the chain rule correctly on all terms, on all subdomains. In addition to requiring precision from the user, this approach became somewhat cumbersome on the interfaces between subdomains, where extra classes had to be implemented to couple different terms (technically, this has to do with the design of the Assembler object; however, for the purpose of this tutorial there is no need to understand this fully).

The non-linear transport problem took a different approach: The project implemented Ad, and thereby removed the painstaking implementation of the Jacobian matrix. To see how this works, look first at the [tutorial](https://github.com/pmgbergen/porepy/blob/develop/tutorials/automatic_differentiation.ipynb) on Ad in general, and next at the [tutorial](https://github.com/pmgbergen/porepy/blob/develop/tutorials/compressible_flow_with_automatic_differentiation.ipynb) on how to combine Ad with discretization operations.
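To make the contrast with a hand-coded Jacobian concrete, here is a minimal scalar sketch of forward-mode Ad. It is not PorePy's implementation (which works on arrays and sparse Jacobians); the `Dual` class, the `exp` helper and the `residual` function are hypothetical names introduced only for this illustration.

```python
import math


class Dual:
    """A value together with its derivative with respect to one scalar input."""

    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    @staticmethod
    def _lift(other):
        # Plain numbers are constants: their derivative is zero
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = self._lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = self._lift(other)
        # Product rule, applied automatically for every multiplication
        return Dual(self.val * other.val, self.der * other.val + self.val * other.der)

    __rmul__ = __mul__


def exp(x):
    # Chain rule for the exponential function
    return Dual(math.exp(x.val), math.exp(x.val) * x.der)


def residual(x):
    # Hypothetical non-linear residual r(x) = x * exp(x) + 2 * x
    return x * exp(x) + 2 * x


x = Dual(1.0, 1.0)  # seed: dx/dx = 1
r = residual(x)
print(r.val, r.der)
```

The user only writes `residual`; the derivative `exp(x) + x * exp(x) + 2` is never typed out by hand.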

## Scope of the new Ad framework
The new approach to Ad can be seen as an extension of the existing functionality, with the following ambitions:
1. For the purpose of writing equations, it should be possible to consider multiple grids simultaneously, with no need for for-loops or similar.
2. Instead of the immediate evaluation of residuals and derivatives applied in the existing framework, the new approach should apply delayed evaluation.

The first point will both improve code readability and substantially improve runtimes in cases with many subdomains. The second allows for advanced linear and non-linear solvers, and possibly for automatic updates of discretizations of non-linear terms, both of which will be explored in the future.

## Framework components
So far, the framework consists of four types of classes:
1. Grid-dependent operators, defined on one or multiple subdomain grids. Examples are:
    * divergence and trace operators
    * boundary conditions
    * projections between mortar and subdomain grids
    * projections between sets of subdomains and subsets.
2. Variables. These carry the numerical state of the primary variables, and also values at previous time steps and iteration states.
3. Discretization objects. These are mainly shells around standard PorePy discretization methods.
4. Classes needed to turn variables and discretizations into equations, linearize them etc.

## Test case: A mixed-dimensional grid.
As a test case, we define a mixed-dimensional grid, which we for simplicity let be Cartesian:

In [2]:
import numpy as np
import porepy as pp

import scipy.sparse.linalg as spla

# fractures 1 and 2 meet at (2, 3), where fracture 2 starts
frac_1 = np.array([[2, 2], [2, 4]])
frac_2 = np.array([[2, 5], [3, 3]])
# fracture 3 is isolated
frac_3 = np.array([[6, 6], [1, 5]])

gb = pp.meshing.cart_grid([frac_1, frac_2, frac_3], nx=np.array([7, 7]))

Next, we define variables on the subdomains and interfaces. This is done as before:

In [3]:
# String representations of the variables.
pressure_var = 'pressure'
mortar_var = 'mortar_flux'

# Loop over all subdomains, define a cell centered variable
for _, d in gb:
    d[pp.PRIMARY_VARIABLES] = {pressure_var: {'cells': 1}}
    
# Also loop over interfaces
for _, d in gb.edges():
    d[pp.PRIMARY_VARIABLES] = {mortar_var: {'cells': 1}}


Parameter assignment is also done as before, see [this tutorial](https://github.com/pmgbergen/porepy/blob/develop/tutorials/parameter_assignment_assembler_setup.ipynb) for details. Specifically, we will consider a mixed-dimensional flow problem.

In [4]:
param_key = 'flow'

matrix_perm = 1
fracture_perm = 1e2

interface_diffusivity = 1e2

for g, d in gb:
    if g.dim == 2:
        perm = pp.SecondOrderTensor(matrix_perm * np.ones(g.num_cells))

        # Dirichlet conditions on right and left
        left = np.where(np.abs(g.face_centers[0] - gb.bounding_box()[0][0]) < 1e-6)[0]
        right = np.where(np.abs(g.face_centers[0] - gb.bounding_box()[1][0]) < 1e-6)[0]
        bc_cond = ['dir'] * (left.size + right.size)
        bc = pp.BoundaryCondition(g, np.hstack((left, right)), bc_cond)
        
        bc_val = np.zeros(g.num_faces)
        bc_val[left] = 1
        specified_data = {'second_order_tensor': perm,
                         'bc': bc,
                         'bc_values': bc_val}
        d = pp.initialize_data(g, d, param_key, specified_data)
        
    else:
        perm = pp.SecondOrderTensor(fracture_perm * np.ones(g.num_cells))
        
        # No-flow Neumann conditions
        bc = pp.BoundaryCondition(g)
        bc_val = np.zeros(g.num_faces)
        specified_data = {'second_order_tensor': perm,
                         'bc': bc,
                         'bc_values': bc_val}
        d = pp.initialize_data(g, d, param_key, specified_data)
        
# Initialize data for interfaces as well        
for e, d in gb.edges():
    mg = d['mortar_grid']
    kn = interface_diffusivity * np.ones(mg.num_cells)
    pp.initialize_data(mg, d, param_key, {'normal_diffusivity': kn})
    

We also give numerical values to the pressure and flux variables, just so that we get more interesting numbers below.

In [5]:
for g, d in gb:
    pp.set_state(d)
    d[pp.STATE][pressure_var] = np.random.rand(g.num_cells)
    
for e, d in gb.edges():
    pp.set_state(d)
    d[pp.STATE][mortar_var] = np.random.rand(d['mortar_grid'].num_cells)

### Definition of grid-related operators
Now, we are ready to apply the new Ad framework to this mixed-dimensional problem. The key to exploiting this efficiently (in terms of both user friendliness and computational speed) is to operate on several grids simultaneously. To that end, we make a list of all subdomain grids, and similarly of all the edges (*not* mortar grids - we need to keep the link to the adjacent subdomains).

**NOTE**: The order of the grids in the list is important, as it sets the ordering of variables, discretization objects etc. It is recommended to define a list of grids and use this throughout to define variables etc. A list of edges should be made similarly.

In [6]:
grid_list = [g for g, _ in gb]
edge_list = [e for e, _ in gb.edges()]
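Why the list order matters can be sketched without PorePy: the position of a grid in the list decides which slice of the global vector its unknowns occupy. The grid names and cell counts below are hypothetical and chosen only for illustration, and `block_offsets` is not a PorePy function.

```python
from itertools import accumulate

# Hypothetical subdomains as (name, number of cells); sizes are illustrative only
grids = [("matrix_2d", 49), ("frac_1", 2), ("frac_2", 3), ("frac_3", 4), ("point_0d", 1)]


def block_offsets(grid_sizes):
    """Map each grid name to its (start, stop) range in the global vector."""
    sizes = [n for _, n in grid_sizes]
    stops = list(accumulate(sizes))
    starts = [0] + stops[:-1]
    return {name: (a, b) for (name, _), a, b in zip(grid_sizes, starts, stops)}


offsets = block_offsets(grids)
reordered = block_offsets(list(reversed(grids)))

# The same grid ends up in a different slice when the list order changes
print(offsets["frac_2"], reordered["frac_2"])
```

Using one list throughout keeps variables, operators and discretizations aligned on the same slices.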

Now, we can for instance define a joint divergence operator for all subdomains:

In [7]:
div = pp.ad.Divergence(grid_list)

Note that this is not a matrix, but a special object:

In [8]:
type(div)

porepy.numerics.ad.grid_operators.Divergence

We will come back to how to translate div into a numerical expression.

We can also define merged projection operators between the subdomain and mortar grids. This can be done either on the whole gb, or on parts of it. The ordering of the grids is important, and frankly not completely clear, but the following seems to work (if you get a warning, disregard it; this will be handled at a later point):

In [9]:
mortar_proj = pp.ad.MortarProjections(gb=gb)

  "Is it really meaningful to ask for signs of a one sided mortar grid?"


Critically, the initialization defines a list of grids (and edges) in the same way as we did for the grid list above, and, since iteration over the grid bucket items uses a fixed order, we're good.

Finally, we will need a representation of boundary conditions:

In [10]:
bound_ad = pp.ad.BoundaryCondition(param_key, grids=grid_list)

Again, this is not a numerical boundary condition, but rather a way to access given boundary data.

### Mixed-dimensional Ad variables
The next step is to define Ad representations of the (mixed-dimensional) variables. For this, we need no less than three different steps (fortunately, we can use these objects for other parts below as well). 

First, define a degree of freedom manager. For users who have been exposed to the Assembler, this is actually a part of that class which has been moved to a separate object, and which is responsible for keeping track of which local indices belong to which degrees of freedom:

In [11]:
dof_manager = pp.DofManager(gb)  # note: no pp.ad here

Next, define an EquationManager. This is a class which may be significantly changed in the months to come, but for the moment, it is responsible for providing Ad representations of the variables.

In [12]:
equation_manager = pp.ad.EquationManager(gb, dof_manager)

Finally, we can define Ad variables:

In [13]:
p = equation_manager.merge_variables([(g, pressure_var) for g in grid_list])
lmbda = equation_manager.merge_variables([(e, mortar_var) for e in edge_list])

Note that p and lmbda do not have numerical values. What we have done is instead to:
1. Prepare the ground for writing equations with the variables
2. Prepare for the later translation of the equations to numerical values (values and derivatives)

To get some information about the variables, we can type:

In [14]:
print(p)
print(lmbda)

Merged variable with name pressure, id 10
Composed of 5 variables
Degrees of freedom in cells: 1, faces: 0, nodes: 0
Total size: 59

Merged interface variable with name mortar_flux, id 11
Composed of 5 variables
Degrees of freedom in cells: 1, faces: 0, nodes: 0
Total size: 21



### Mixed-dimensional ad equations
Next, we turn to discretization. To be compatible with the Ad framework, PorePy discretizations need a wrapper which mainly allows for the delayed evaluation of the expressions. For instance, the Ad version of Mpfa is defined by writing

In [15]:
mpfa = pp.ad.MpfaAd(param_key, grid_list)

This object, once again, has no numerical values, but is rather an abstract representation of a standard Mpfa discretization. The two versions of Mpfa refer to the discretization matrices resulting from the discretization in similar ways: Mpfa has attributes like 'flux_matrix_key', which specifies where the flux discretization matrix is stored. Similarly, MpfaAd has an attribute 'flux', which, upon parsing of an Ad expression (below), will access the same discretization matrix.

To show how this works in action, we can define the flux discretization on the subdomains as


In [16]:
interior_flux = mpfa.flux * p

In essence, there are two types of Ad objects:
1. Atomic objects, like mpfa.flux and p. These can be considered pointers to places in the data dictionary where the numerical values associated with the objects are stored. For instance, p in our example points to a collection of d[pp.STATE][pressure_var], where d is the data dictionary for each of the grids on which p was defined.
2. Composite objects, like interior_flux, formed by combining Ad objects (which themselves can be atomic or composites) using basic mathematical operations.

These Ad objects are not designed for numerical evaluation by themselves; they can be thought of as recipes for combining discretizations, variables etc. To parse a recipe, it must be wrapped in an additional layer, termed an Ad Expression, and then provided with a GridBucket, from which it can pull numerical values for variables, discretization matrices and so on.

In [17]:
eval_flux = pp.ad.Expression(interior_flux, dof_manager)
eval_flux.discretize(gb)
num_flux = eval_flux.to_ad(gb=gb)
print(num_flux)

Ad array of size 134
Jacobian is of size (134, 59) and has 174 elements


We note that num_flux has the size of the total number of faces in the grids, and that its Jacobian matrix is a mapping from cells to faces.

On a technical level (no need to understand this), composite Ad objects are implemented as a tree structure, where the leaves are atomic Ad objects. Parsing of the expression is done by identifying these leaves, and then using standard forward Ad to evaluate the composites.
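The tree idea can be sketched in a few lines of plain Python. This is a scalar toy under stated assumptions, not the PorePy classes: `Node`, `Leaf` and `Composite` are hypothetical names, there is a single variable, and the "data dictionary" is an ordinary dict.

```python
class Node:
    """Base class: arithmetic operators build a tree instead of computing numbers."""

    def __add__(self, other):
        return Composite("+", self, other)

    def __mul__(self, other):
        return Composite("*", self, other)


class Leaf(Node):
    """Atomic object: a key into a data dictionary, optionally marked as the variable."""

    def __init__(self, key, is_variable=False):
        self.key = key
        self.is_variable = is_variable

    def evaluate(self, data):
        # Return (value, derivative with respect to the single variable)
        return data[self.key], (1.0 if self.is_variable else 0.0)


class Composite(Node):
    """Inner node: an operation applied to two children."""

    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right

    def evaluate(self, data):
        lv, ld = self.left.evaluate(data)
        rv, rd = self.right.evaluate(data)
        if self.op == "+":
            return lv + rv, ld + rd
        return lv * rv, ld * rv + lv * rd  # product rule


# Hypothetical scalar stand-ins for a discretization coefficient, a variable and boundary data
coeff = Leaf("coeff")
state = Leaf("state", is_variable=True)
bc_data = Leaf("bc")
flux = coeff * state + bc_data  # only a recipe; nothing is computed here

data = {"coeff": 3.0, "state": 2.0, "bc": 1.0}
value, derivative = flux.evaluate(data)

data["state"] = 5.0  # the same recipe picks up the updated state
value_new, _ = flux.evaluate(data)
print(value, derivative, value_new)
```

Because the leaves only hold keys, the recipe can be re-evaluated after the state changes without being rebuilt.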

We can define more elaborate combinations of variables. The interior_flux object (side note: Even though we just wrapped it into an Expression, the original composite Ad object is still alive) represents only the part of the flux caused by pressure variations internal to subdomains. To get the full flux, we need to account for boundary conditions from external boundaries, as well as from internal boundaries to domains of lower dimensions.

Note that for the time being, we cannot write 'mpfa.bound_flux * (bound_ad + mortar_proj... * lmbda)'; the parsing of expressions does not respect parentheses the way it should. To be improved, hopefully.

In [18]:
full_flux = interior_flux + mpfa.bound_flux * bound_ad + mpfa.bound_flux*mortar_proj.mortar_to_primary_int * lmbda

Now, it is interesting to see what happens when the numerical value of full_flux is computed:

In [19]:
vals = pp.ad.Expression(full_flux, dof_manager).to_ad(gb)
print(f'Size of value array: {vals.val.shape}')
print(f'Size of Jacobian matrix: {vals.jac.shape}')

Size of value array: (134,)
Size of Jacobian matrix: (134, 80)


Compare the size of the Jacobian matrix with the size of the matrix for interior_flux: The number of rows is still equal to the total number of faces in the grid, but the number of columns has increased to also include derivatives with respect to the mortar variables.
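The column count above (59 pressure unknowns plus 21 mortar unknowns gives 80) follows from stacking one Jacobian block per variable side by side. A small dense sketch with made-up numbers, where `F`, `B`, `matvec` and `hstack` are hypothetical helpers and not PorePy objects:

```python
def matvec(mat, vec):
    """Dense matrix-vector product with matrices stored as lists of rows."""
    return [sum(a * x for a, x in zip(row, vec)) for row in mat]


def hstack(left, right):
    """Concatenate two dense matrices column-wise."""
    return [row_l + row_r for row_l, row_r in zip(left, right)]


# Illustrative sizes: 3 faces, 2 pressure cells, 1 mortar cell
F = [[1.0, 0.0], [-1.0, 1.0], [0.0, -1.0]]  # faces x cells
B = [[0.0], [0.0], [0.5]]  # faces x mortar cells
pressure = [2.0, 1.0]
mortar = [4.0]

# A flux that depends on pressure only: the Jacobian is F
interior = matvec(F, pressure)
jac_interior = F

# Adding a mortar contribution appends one column block to the Jacobian
full = [a + b for a, b in zip(interior, matvec(B, mortar))]
jac_full = hstack(F, B)

print(len(jac_interior[0]), len(jac_full[0]))
```

The number of rows (faces) is unchanged; only the number of columns grows with the set of variables the expression depends on.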

We can also compute the projection of the mortar fluxes onto the lower-dimensional subdomains, where they are manifested as sources:

In [20]:
sources_from_mortar = mortar_proj.mortar_to_secondary_int * lmbda

Put together, we now have the full mass conservation equation on all subdomains:

In [21]:
conservation = div * full_flux + sources_from_mortar
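As a reminder of what an expression of the form `div * flux + term` evaluates to, here is a one-dimensional sketch with dense lists. The grid, the flux values and the cell-wise term are hypothetical, the sign of the cell-wise term is assumed to be already included, and `divergence_1d` is not a PorePy function.

```python
def divergence_1d(num_cells):
    """Dense cell-by-face matrix: -1 on each cell's left face, +1 on its right face."""
    div = [[0.0] * (num_cells + 1) for _ in range(num_cells)]
    for c in range(num_cells):
        div[c][c] = -1.0
        div[c][c + 1] = 1.0
    return div


def matvec(mat, vec):
    """Dense matrix-vector product with matrices stored as lists of rows."""
    return [sum(a * x for a, x in zip(row, vec)) for row in mat]


div_matrix = divergence_1d(3)
face_flux = [2.0, 2.0, 1.0, 1.0]  # one value per face, positive to the right
cell_term = [0.0, 1.0, 0.0]  # hypothetical cell-wise contribution

net_outflow = matvec(div_matrix, face_flux)
residual = [d + s for d, s in zip(net_outflow, cell_term)]
print(net_outflow, residual)
```

The divergence maps face quantities to cell quantities, so the cell-wise term can be added directly to its result.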

We can also define equations for the interface mortars. To that end, we first define the pressure trace on internal boundaries - the most accurate representation of this trace is a bit complex within Mpfa (and Tpfa) methods:

In [22]:
pressure_trace_from_high = (mortar_proj.primary_to_mortar_avg * mpfa.bound_pressure_cell * p
        + mortar_proj.primary_to_mortar_avg * mpfa.bound_pressure_face * mortar_proj.mortar_to_primary_int * lmbda
                 )


Next, we define a discretization object for the mortar equation:

In [23]:
robin = pp.ad.RobinCouplingAd(param_key, edge_list)

Now, we can write the Darcy-type equation for the interface flux:

In [24]:
interface_flux_eq = (robin.mortar_scaling * (pressure_trace_from_high - 
                                                    mortar_proj.secondary_to_mortar_avg * p)
                     + robin.mortar_discr * lmbda)

### Assemble the system of equations
Now, we are ready to assemble the full system, formed by the conservation statement and the interface flux equations. Assembly takes two steps:
1. Convert the Ad objects into Expressions, preparing for numerical evaluation.
2. Feed the Expressions to the EquationManager, thereby joining them together into a coupled system.

We can do this as follows:

In [25]:
eqs = [pp.ad.Expression(conservation, dof_manager), pp.ad.Expression(interface_flux_eq, dof_manager)]
equation_manager.equations += eqs

The equation_manager can be used to assemble the coupled linear system, much in the same way as an Expression is evaluated. Before that, the discretization matrices must be constructed.

**NOTE**: The computed solution is to be interpreted as an update to the existing state, that is, to the random values we assigned above. The solution must therefore be distributed in an additive manner.


In [30]:
# first discretize
equation_manager.discretize(gb)
# next assemble the equations
A, b = equation_manager.assemble_matrix_rhs()

# Solve the linear system for the update; b is used as returned by assemble_matrix_rhs
solution = spla.spsolve(A, b)

# Distribute variable to local data dictionaries
dof_manager.distribute_variable(solution, additive=True)

exporter = pp.Exporter(gb, 'ad_test')
exporter.write_vtu([pressure_var])
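The additive distribution mirrors a Newton step: the linear solve yields an increment, which is added to the current state. A scalar sketch under stated assumptions follows; the function names are hypothetical, and the minus sign is applied explicitly here, whereas whether a given assembly routine already includes it in the right-hand side should be checked against its documentation.

```python
def newton(residual, jacobian, x0, tol=1e-12, max_iter=20):
    """Scalar Newton iteration: solve J * dx = -r, then update the state additively."""
    x = x0
    for _ in range(max_iter):
        r = residual(x)
        if abs(r) < tol:
            break
        dx = -r / jacobian(x)  # the solution of the linearized system is an update
        x += dx  # additive distribution of the update
    return x


# Hypothetical test problem: find the positive root of x**2 - 2
root = newton(lambda x: x**2 - 2.0, lambda x: 2.0 * x, x0=1.0)
print(root)
```

For the linear flow problem in this tutorial a single such step suffices, since the residual is linear in the unknowns.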



## What have we done
We summarize the steps needed to define an equation:
1. Define variables 
2. Define grid-related operators (not strictly necessary, but most often needed)
3. Define discretizations
4. Combine into equations, and evaluate.

## More advanced usage
Below are a few additional techniques which are needed to define other types of equations (to be covered in a more elaborate set of tutorials in the future?):

To access the state of a variable on the previous time step, do:

In [27]:
p_prev = p.previous_timestep()
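A typical place where the previous time step enters is a time derivative. The sketch below uses backward Euler on a hypothetical scalar decay problem; `backward_euler_step`, the `state` dictionary and the parameter values are illustrative and not part of PorePy.

```python
def backward_euler_step(p_prev, dt, decay):
    """Solve (p - p_prev) / dt + decay * p = 0 for p (linear, so one solve suffices)."""
    return p_prev / (1.0 + dt * decay)


state = {"p": 1.0}
history = [state["p"]]
for _ in range(3):
    p_old = state["p"]  # value at the previous time step
    state["p"] = backward_euler_step(p_old, dt=0.5, decay=2.0)
    history.append(state["p"])

print(history)
```

With `dt * decay = 1`, each step halves the value, which makes the role of the stored previous state easy to follow.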

To use a variable on only a few subdomains, use subdomain projections:

In [28]:
g_2d = gb.grids_of_dimension(2)
subdomain_proj = pp.ad.SubdomainProjections(grids=g_2d, gb=gb)

For examples on how to use these, see the BiotContactMechanics model.
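Conceptually, a projection onto a subset of subdomains selects a slice of the global vector, and the reverse mapping scatters local values back. A plain-Python sketch of that idea, with a hypothetical layout and helper names that are not the `SubdomainProjections` API:

```python
def global_indices(offsets, keep):
    """Global indices of the subdomains named in keep, in the order given."""
    return [i for name in keep for i in range(*offsets[name])]


def restrict(vector, indices):
    """Pick the entries of a global vector that live on the chosen subdomains."""
    return [vector[i] for i in indices]


def prolong(local, indices, size):
    """Scatter a local vector back into a global vector, zero elsewhere."""
    out = [0.0] * size
    for i, value in zip(indices, local):
        out[i] = value
    return out


# Hypothetical layout: a matrix grid with 4 cells followed by a fracture with 2 cells
offsets = {"matrix": (0, 4), "fracture": (4, 6)}
pressure = [10.0, 11.0, 12.0, 13.0, 20.0, 21.0]

idx = global_indices(offsets, ["fracture"])
p_fracture = restrict(pressure, idx)
p_back = prolong(p_fracture, idx, len(pressure))
print(p_fracture, p_back)
```

In matrix form, restriction and prolongation are transposes of each other, which is why such operators fit naturally into expressions built from matrix products.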

## Moving parts
As hinted at above, the new Ad framework is under active development. There are currently several known issues and shortcomings, including terms in equations that cannot be handled, function names that will change, parameters to functions that should be included or kicked out etc.

Below is an attempt at guessing how the main components of the framework will evolve in the future:
* The variables, discretizations and Expressions will likely stay more or less as they are, although variable names, functions etc. may be changed.
* The EquationManager will likely evolve, if nothing else because it is the least used part of the code.
* Better support for constitutive laws etc.

