# Flux input files

CO$_2$ fluxes are stored in two forms during the inversion. The optimizer needs them as a control vector, and the transport model needs them as a transport model input file:
- the control vector only contains the flux category to be optimized, at the resolution at which it is optimized (typically lower than the resolution of the transport model);
- the transport model needs fluxes from all categories (including those prescribed), at its native resolution.

The conversion from control vector to transport model input file occurs at the beginning of each iteration, and is done in two steps:
1. Generate a flux structure (e.g. an instance of the `formatters.lagrange.Struct` class) from the control vector. This is done by the `interface.VecToStruct` method.
2. Write a transport model input file from that flux structure. This is done by the `formatters.lagrange.WriteStruct` method.

The simplest way to set up fluxes in the inversion is therefore to create the initial flux structure using one of the three methods below, and to pass it to the interface during its initialization (e.g. via the `ancilliary` attribute in the `__init__` of the `interfaces.monthlyFlux.Interface` class, used in the [example notebook](var4d.html)).

## 1. Creation of a zero flux structure (`formatters.lagrange.CreateStruct`)

The flux structure is essentially a Python dictionary (with an added `__add__` method), with the following hierarchy:

```
{'cat1': {'emis': emis,
  'time_interval': {'time_start': ts, 'time_end': te},
  'lats': lats,
  'lons': lons,
  'region': rname},
 'cat2': 
   ...}
 }
```
where:
- `emis` is a *(nt, nlat, nlon)* numpy array storing the net surface flux (in $\mu$g/m$^2$/s);
- `lats` and `lons` are arrays storing the lat/lon coordinates (respectively of dimension `(nlat, )` and `(nlon, )`) of the center of the grid points;
- `ts` and `te` are arrays of `datetime.datetime` objects (of dimension `(nt, )`) storing the start and end of each time interval;
- `rname` is the region name.
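
As an illustration of this hierarchy, the sketch below builds a one-category structure by hand. The `FluxStruct` class is a hypothetical stand-in, not the actual `formatters.lagrange.Struct`, and the behaviour of its `__add__` (summing `emis` category by category) is an assumption made for the example.

```python
from datetime import datetime, timedelta

import numpy as np


class FluxStruct(dict):
    """Hypothetical stand-in for the flux structure: a dict with an __add__."""

    def __add__(self, other):
        # Assumed semantics: sum `emis` category by category and copy
        # every other field from the left-hand operand.
        result = FluxStruct()
        for cat, fields in self.items():
            result[cat] = dict(fields)
            result[cat]['emis'] = fields['emis'] + other[cat]['emis']
        return result


nt, nlat, nlon = 2, 3, 4
ts = np.array([datetime(2010, 1, 1) + timedelta(days=i) for i in range(nt)])
te = np.array([t + timedelta(days=1) for t in ts])

struct = FluxStruct({
    'biosphere': {
        'emis': np.ones((nt, nlat, nlon)),        # (nt, nlat, nlon)
        'time_interval': {'time_start': ts, 'time_end': te},
        'lats': np.linspace(50.0, 52.0, nlat),    # grid cell centers
        'lons': np.linspace(10.0, 13.0, nlon),
        'region': 'demo',
    }
})

total = struct + struct
print(total['biosphere']['emis'].shape)
```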

The `formatters.lagrange.CreateStruct` function can be used to initialize an empty structure (i.e. with arrays of the correct shape, but set to zero). Call it with `CreateStruct(categories, region, start, end, dt)`.
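
To make the resulting layout concrete, here is a minimal sketch of a zero-filled structure built with plain NumPy. The helper `create_zero_struct` is hypothetical: it takes the coordinates and region name explicitly, whereas the real `CreateStruct` receives a `region` argument whose type is not described here.

```python
from datetime import datetime, timedelta

import numpy as np


def create_zero_struct(categories, lats, lons, start, end, dt, region_name):
    """Hypothetical stand-in for CreateStruct: same hierarchy, emis set to zero."""
    nt = (end - start) // dt  # number of whole time steps in [start, end)
    time_start = np.array([start + i * dt for i in range(nt)])
    time_end = np.array([t + dt for t in time_start])
    return {
        cat: {
            # A fresh array per category, so categories never share data
            'emis': np.zeros((nt, len(lats), len(lons))),
            'time_interval': {'time_start': time_start, 'time_end': time_end},
            'lats': np.asarray(lats),
            'lons': np.asarray(lons),
            'region': region_name,
        }
        for cat in categories
    }


zero = create_zero_struct(
    ['biosphere', 'fossil'],
    lats=np.arange(3.0), lons=np.arange(4.0),
    start=datetime(2010, 1, 1), end=datetime(2010, 1, 2),
    dt=timedelta(hours=3), region_name='demo',
)
print(zero['fossil']['emis'].shape)
```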

## 2. Using a pre-processed input file (`formatters.lagrange.ReadStruct`)

The `ReadStruct` function is used in the inversion to load the data from a file previously written using `WriteStruct`. It is therefore also possible to generate such a file externally, and to simply load it using `ReadStruct`. The file needs to be in netCDF4 format, with a global `time_components` dimension set to **6**, and one group for each flux category. Each group has three dimensions (`nlat`, `nlon` and `nt`) and five variables:
- `emis`: the flux estimate for that category (in $\mu$g/m$^2$/s);
- `lats` and `lons`: the lat/lon coordinates of the centers of the grid cells;
- `times_start` and `times_end`: the times of the start and end of each time step (_int_ arrays, with the dates decomposed as (year, month, day, hour, min, sec)).
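
The date decomposition used for `times_start` and `times_end` can be sketched as follows. The helper names `encode_times` and `decode_times` are hypothetical, and the `(nt, 6)` array layout is an assumption inferred from the `time_components` dimension of size 6.

```python
from datetime import datetime

import numpy as np


def encode_times(times):
    """Decompose datetimes into an (nt, 6) int array:
    (year, month, day, hour, min, sec)."""
    return np.array([t.timetuple()[:6] for t in times], dtype=int)


def decode_times(array):
    """Rebuild datetime objects from an (nt, 6) int array."""
    return [datetime(*(int(v) for v in row)) for row in array]


times = [datetime(2010, 1, 1, 0, 0, 0), datetime(2010, 1, 1, 3, 30, 0)]
encoded = encode_times(times)
print(encoded.shape)
```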

## 3. From a flux archive (`formatters.lagrange.ReadArchive`)

The `formatters.lagrange` module also contains a `ReadArchive` function, which constructs a flux structure from an archive of annual flux files:
- Each annual flux file contains one single flux estimate (i.e. just one flux category).
- The flux files are named following the pattern **{prefix}{source}.{year}.nc**.

The function takes four arguments:
- `prefix`: a _string_, including the path of the files;
- `start`, `end`: two _datetime.datetime_ objects, specifying the start and end of the simulation period;
- `category`: a dictionary mapping each category name in the inversion to the corresponding **source** in the file names.

For example, with `prefix='/path/to/the/files/flux_co2.'`, `category={'biosphere':'LPJ_GUESS', 'fossil':'EDGAR'}`, `start=datetime(2010,1,1)` and `end=datetime(2011,1,1)`, the call `ReadArchive(prefix, start, end, category)` returns a `Struct` spanning one year, constructed from the data in the files **/path/to/the/files/flux_co2.LPJ_GUESS.2010.nc** and **/path/to/the/files/flux_co2.EDGAR.2010.nc**.
Note that the spatial and temporal resolutions are those of the files themselves (no other check is done).
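
The file selection in this example can be reproduced with the short sketch below. The helper `archive_files` is hypothetical and is not part of `formatters.lagrange`; it assumes that the prefix is concatenated directly with the source name (as in the example, where the prefix carries the trailing dot) and that `end` is exclusive.

```python
from datetime import datetime, timedelta


def archive_files(prefix, start, end, category):
    """List the annual files needed for each category (hypothetical helper)."""
    # `end` is treated as exclusive, so a period ending on 1 January
    # does not pull in the following year's file.
    last_year = (end - timedelta(microseconds=1)).year
    years = range(start.year, last_year + 1)
    return {
        cat: [f'{prefix}{source}.{year}.nc' for year in years]
        for cat, source in category.items()
    }


files = archive_files(
    '/path/to/the/files/flux_co2.',
    datetime(2010, 1, 1), datetime(2011, 1, 1),
    {'biosphere': 'LPJ_GUESS', 'fossil': 'EDGAR'},
)
for cat, paths in files.items():
    print(cat, paths)
```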

The archived files are in netCDF4 format, with three dimensions (time, lat and lon), and four variables (time, lat, lon and co2flux), following the example below: