In [1]:
import edit.data

# Modifications

In addition to requesting variables directly, `modifications` can be used to alter how data is retrieved.

### The syntax is as follows:

A modification can be given as a string of the form:

```
- '!accumulate[period: "6 hourly"]:tcwv>accum_tcwv'
```

Here `!ABC` names the function to apply, `[]` holds the init kwargs it needs in JSON form,
and everything after `:` is the normal variable specification, with anything after `>` giving the new variable name.
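To make the pieces of the string form concrete, here is a minimal sketch of how such a string could be split into its parts. It is not the parser used by `edit.data`; the names `parse_modification`, `ParsedModification`, `_SPEC` and `_BARE_KEY` are hypothetical, and the handling of the `[]` section assumes comma-separated `key: value` pairs whose keys may be unquoted, as in the example above.

```python
import json
import re
from typing import NamedTuple, Optional


class ParsedModification(NamedTuple):
    modification: str          # function name after '!'
    kwargs: dict               # init kwargs from the [...] section
    source_var: str            # variable specification after ':'
    target_var: Optional[str]  # new name after '>', if any


# Hypothetical grammar: !name[kwargs]:source>target
# The [kwargs] and >target parts are treated as optional.
_SPEC = re.compile(
    r"^!(?P<name>\w+)"
    r"(?:\[(?P<kwargs>.*)\])?"
    r":(?P<source>[^>]+)"
    r"(?:>(?P<target>.+))?$"
)

# Assumption: keys may be written without quotes, so quote them before
# handing the text to the JSON parser.
_BARE_KEY = re.compile(r"(^|,)\s*(\w+)\s*:")


def parse_modification(spec: str) -> ParsedModification:
    """Split a modification string into name, kwargs, source and target."""
    match = _SPEC.match(spec.strip())
    if match is None:
        raise ValueError(f"Not a modification string: {spec!r}")
    raw_kwargs = match.group("kwargs") or ""
    quoted = _BARE_KEY.sub(r'\1"\2":', raw_kwargs)
    kwargs = json.loads("{" + quoted + "}")
    return ParsedModification(
        modification=match.group("name"),
        kwargs=kwargs,
        source_var=match.group("source").strip(),
        target_var=match.group("target"),
    )


parsed = parse_modification('!accumulate[period: "6 hourly"]:tcwv>accum_tcwv')
print(parsed)
```

The sketch only illustrates which part of the string plays which role; values containing `, key:` sequences inside quotes would confuse the simple key-quoting step.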

Or as a dictionary with the following keys:

```
    - source_var (REQUIRED)     Variable to modify
    - modification (REQUIRED)   Modification to apply
    - target_var                Rename of variable
    - **                        Any other keys for `modification`
```
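As an illustration of the dictionary form, the sketch below writes the accumulation example as a dictionary and separates the documented keys from the extra keys meant for the modification. The helper `split_modification_spec` is hypothetical, and whether the `modification` value should carry the leading `!` is not stated here, so the bare name used below is an assumption.

```python
from typing import Optional, Tuple

REQUIRED_KEYS = {"source_var", "modification"}


def split_modification_spec(spec: dict) -> Tuple[str, str, Optional[str], dict]:
    """Return (modification, source_var, target_var, extra kwargs).

    Hypothetical helper: it only mirrors the key layout listed above.
    """
    remaining = dict(spec)  # copy so the caller's dictionary is untouched
    missing = REQUIRED_KEYS - remaining.keys()
    if missing:
        raise KeyError(f"Missing required keys: {sorted(missing)}")
    modification = remaining.pop("modification")
    source_var = remaining.pop("source_var")
    target_var = remaining.pop("target_var", None)
    # Whatever is left is treated as keyword arguments for the modification.
    return modification, source_var, target_var, remaining


accumulate_spec = {
    "source_var": "tcwv",
    "modification": "accumulate",  # assumption: name given without '!'
    "target_var": "accum_tcwv",
    "period": "6 hourly",          # extra key passed on to the modification
}

print(split_modification_spec(accumulate_spec))
```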

Modifications are transparent to the user and act only when data is retrieved.

Available modifications include:

```
    - !accumulate
    - !mean
    - !aggregate
    - !constant
```

## Automatic Accumulations
When retrieving data from an index, a common operation is accumulating a variable over a temporal period. If the index is decorated with `edit.data.indexes.decorators.variable_modifications` this accumulation can be done in the `variable` specification with no other action needed. 
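For intuition only, the sketch below shows one plausible reading of a "6 hourly" accumulation on hourly data: each step holds the sum of itself and the five preceding steps. The exact window definition used by `!accumulate` is not described here, so the trailing-window choice and the function name `trailing_accumulation` are assumptions, and plain Python lists stand in for the gridded data.

```python
from collections import deque
from typing import Iterable, List, Optional


def trailing_accumulation(hourly: Iterable[float], window: int) -> List[Optional[float]]:
    """Sum each value with the (window - 1) values before it.

    Steps without a full window behind them are returned as None.
    """
    buffer: deque = deque(maxlen=window)  # keeps only the latest `window` values
    accumulated: List[Optional[float]] = []
    for value in hourly:
        buffer.append(value)
        accumulated.append(sum(buffer) if len(buffer) == window else None)
    return accumulated


hourly_values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
six_hourly = trailing_accumulation(hourly_values, window=6)
print(six_hourly)
```

In this toy version the first five steps are `None` because no earlier data is available; an index with access to earlier times could fill those steps instead.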




In [None]:
ERA5_accumulations = edit.data.archive.NCI.ERA5('!accumulate[period: "6 hourly"]:tcwv>accum_tcwv')
ERA5_accumulations

In [None]:
# Single timestep
ERA5_accumulations['2023-01-01T00']

| | Array | Chunk |
|---|---|---|
| Bytes | 3.96 MiB | 255.94 kiB |
| Shape | (721, 1440) | (182, 360) |
| Dask graph | 16 chunks in 8 graph layers | 16 chunks in 8 graph layers |
| Data type | float32 numpy.ndarray | float32 numpy.ndarray |



In [None]:
# Full data of 6 hourly accumulations
ERA5_accumulations['2023-01-01']

| | Array | Chunk |
|---|---|---|
| Bytes | 95.05 MiB | 4.75 MiB |
| Shape | (24, 721, 1440) | (19, 182, 360) |
| Dask graph | 32 chunks in 28 graph layers | 32 chunks in 28 graph layers |
| Data type | float32 numpy.ndarray | float32 numpy.ndarray |



## Constants

Some variables can be treated as constants: the `edit.data.Indexes` can hold them fixed at a single query time and keep them in memory to reduce compute time.
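The idea of a constant can be sketched as a small wrapper that ignores the requested time, always answers with the data at one fixed query time, and loads that data only once. This is an illustration under stated assumptions, not how `edit.data` implements `!constant`; `ConstantIndex` and `fake_loader` are hypothetical names, and a dictionary stands in for the real dataset.

```python
from typing import Any, Callable, List


class ConstantIndex:
    """Hypothetical wrapper that serves one fixed query time for every request."""

    def __init__(self, loader: Callable[[str], Any], query: str):
        self._loader = loader
        self._query = query
        self._cache: Any = None
        self._loaded = False

    def __getitem__(self, requested_time: str) -> Any:
        # The requested time is deliberately ignored: the data is constant.
        if not self._loaded:
            self._cache = self._loader(self._query)  # load once, keep in memory
            self._loaded = True
        return self._cache


load_calls: List[str] = []


def fake_loader(time: str) -> dict:
    """Stand-in for an expensive retrieval; records each time it is called."""
    load_calls.append(time)
    return {"time": time, "lsm": [[0.0, 1.0], [1.0, 0.0]]}


lsm = ConstantIndex(fake_loader, query="2000-01-01T00")
first = lsm["2023-01-01"]
second = lsm["1999-06-15"]
print(first["time"], len(load_calls))
```

Both requests return the same in-memory object, and the loader runs a single time, which is where the compute saving comes from.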

In [2]:
ERA5_constants = edit.data.archive.NCI.ERA5('!constant[query: "2000-01-01T00"]:lsm')
ERA5_constants

In [3]:
# Requesting any other time returns the data from the constant query time
ERA5_constants['2023-01-01']