# Tutorial for the transition state analysis module

## 1. Production uMLIP calculations
The files `mlip_prod.py`, `calc.py`, and `parse_output.py` in the `production_calc` module are provided for running uMLIP calculations. Users may use their own scripts instead.

### 1.1 Production calculations
For the `mlip_prod.py` file to run correctly,

- prepare an input file that contains a single dictionary of `{key: Structure}` pairs and nothing else
- make sure that the input file path is correct
- set the MLIP name in `mlip_prod.py` to the model you want to use
- check that this MLIP name can be called correctly from `calc.py`

After setup, run the following cell, or run the command in a terminal from the directory that contains the input file:

In [None]:
!python mlip_prod.py

By default, `mlip_prod.py` writes `.jsonl` files. This is done for multiprocessing purposes, and the following section assumes this output format. Users may also supply `.json` files, but these must be in the `{key: energy value}` format.

### 1.2 Output parsing
Output parsing is done with the `parse_output.py` file. For this to work correctly,

- set the `methods` variable in the file to a list of the MLIP names you have supplied
- check that the current directory has subdirectories whose names match the elements of `methods`
- make sure these subdirectories contain the calculation results as `.json` or `.jsonl` files
- `.jsonl` files should have the format `{"key": key, "data": energy value}` on each line
- `.json` files should have the format `{key: energy value}`
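The two accepted file formats can be illustrated with a short reader. This is only a sketch of the formats listed above, not the code in `parse_output.py`; the function name `load_energies` is hypothetical.

```python
import json
from pathlib import Path


def load_energies(path):
    """Read one result file into a flat {key: energy} dictionary.

    Hypothetical helper: it only illustrates the two file formats
    described above and is not part of parse_output.py.
    """
    path = Path(path)
    if path.suffix == ".jsonl":
        energies = {}
        with path.open() as handle:
            for line in handle:
                line = line.strip()
                if not line:
                    continue  # tolerate blank lines between records
                record = json.loads(line)  # {"key": key, "data": energy value}
                energies[record["key"]] = record["data"]
        return energies
    if path.suffix == ".json":
        with path.open() as handle:
            return json.load(handle)  # already {key: energy value}
    raise ValueError(f"Unsupported file type: {path.suffix}")
```

Both branches return the same `{key: energy}` shape, so a `.json` file and a `.jsonl` file holding the same results load to equal dictionaries.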

Then run the following cell, or run the command in a terminal from the directory that contains the output directories:

In [None]:
!python parse_output.py

## 2. TS analysis module

### 2.1 Instantiating `TSAnalysis`
The `TSAnalysis` class is instantiated with a data dictionary of the form \
`{` \
`"DFT": {key: energy},` \
`MLIP_name_0: {key: energy},` \
`MLIP_name_1: {key: energy},` \
`...` \
`}`

Note that
- the dictionary must contain a `"DFT"` field
- the keys must be the same for all methods
- the keys should be in the format `identifier.hop_key.image_number` for the `NebPathwayResult`-related features to work
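The three requirements above can be checked before instantiating the class. The following is a minimal sketch, not part of the module: `validate_data` is a hypothetical helper, and it assumes that `image_number` is a non-negative integer and that the last two dot-separated fields of a key are `hop_key` and `image_number`.

```python
def validate_data(data):
    """Return a list of problems found in a {method: {key: energy}} dictionary.

    Hypothetical pre-flight check; an empty list means all three
    requirements listed above appear to be satisfied.
    """
    problems = []
    if "DFT" not in data:
        problems.append("missing 'DFT' field")
        return problems

    reference_keys = set(data["DFT"])
    for method, energies in data.items():
        if set(energies) != reference_keys:
            problems.append(f"keys of '{method}' differ from the DFT keys")

    for key in sorted(reference_keys):
        # Assumption: split from the right so the identifier may contain dots.
        parts = key.rsplit(".", 2)
        if len(parts) != 3 or not parts[2].isdigit():
            problems.append(f"key '{key}' is not identifier.hop_key.image_number")
    return problems
```

Calling `validate_data(data)` and inspecting the returned list before constructing `TSAnalysis` gives a readable error report rather than a failure deep inside the analysis.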

Additionally, a `barrier_cutoff` argument must be supplied. Hops whose barrier is above this cutoff (in eV) are excluded from the analysis.

In [None]:
from ts_analysis import TSAnalysis

# `data` is the {method: {key: energy}} dictionary described above.
tsa = TSAnalysis(
    data=data,
    barrier_cutoff=5  # eV
)
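To make the effect of `barrier_cutoff` concrete, here is a minimal sketch of one way such a filter could work. It is not the `TSAnalysis` implementation: the helper names are hypothetical, and it assumes the barrier of a hop is the highest image energy minus the energy of the first image. The module may define the barrier differently.

```python
from collections import defaultdict


def group_by_hop(energies):
    """Group {identifier.hop_key.image_number: energy} into per-hop energy lists."""
    hops = defaultdict(dict)
    for key, energy in energies.items():
        identifier, hop_key, image = key.rsplit(".", 2)
        hops[(identifier, hop_key)][int(image)] = energy
    # Order the images of each hop by image number.
    return {hop: [images[i] for i in sorted(images)] for hop, images in hops.items()}


def barrier(path_energies):
    """Assumed definition: highest image energy minus the first image energy."""
    return max(path_energies) - path_energies[0]


def hops_below_cutoff(energies, barrier_cutoff):
    """Keep only hops whose (assumed) barrier does not exceed the cutoff in eV."""
    return {
        hop: path
        for hop, path in group_by_hop(energies).items()
        if barrier(path) <= barrier_cutoff
    }
```

With a cutoff of 5 eV, a hop whose energies along the path are `[0.0, 6.0, 0.2]` would be dropped under this definition, while `[0.0, 0.4, 0.1]` would be kept.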

### 2.2 Plot analysis figures
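The methods below produce parity plots and shape error distribution plots. In each call, `mlip_method` is the name of the MLIP to compare against DFT, as used in the data dictionary.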

#### 2.2.1 Barrier error

In [None]:
tsa.get_barrier_scatter_plot(
    mlip_method=mlip_method
)

#### 2.2.2 Point-wise energy error

In [None]:
tsa.get_point_wise_energy_error_plot(
    mlip_method=mlip_method
)

#### 2.2.3 Transition state shape error

In [None]:
tsa.get_sign_change_plot(
    mlip_method=mlip_method,
    type="energy_diff_sign"
)
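The name `energy_diff_sign` suggests a comparison of the signs of energy differences between consecutive images. The sketch below shows one plausible reading of such a shape metric. It is an assumption for illustration only, not the module's definition: the helper names and the tolerance `tol` are hypothetical.

```python
def diff_signs(path_energies, tol=1e-6):
    """Sign (+1, 0, -1) of each energy change between consecutive images."""
    signs = []
    for previous, current in zip(path_energies, path_energies[1:]):
        delta = current - previous
        if delta > tol:
            signs.append(1)
        elif delta < -tol:
            signs.append(-1)
        else:
            signs.append(0)  # change is within the tolerance, treat as flat
    return signs


def count_sign_mismatches(dft_path, mlip_path):
    """Number of segments where the MLIP path rises or falls unlike the DFT path."""
    return sum(a != b for a, b in zip(diff_signs(dft_path), diff_signs(mlip_path)))


dft_path = [0.0, 0.3, 0.5, 0.2, 0.0]
mlip_path = [0.0, 0.2, 0.1, 0.15, 0.0]
mismatches = count_sign_mismatches(dft_path, mlip_path)
```

Under this reading, a mismatch count of zero means the MLIP reproduces the rise and fall pattern of the DFT path, even if the energies themselves differ.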