# How to Aggregate and Plot AmpTools .fit Results
If you're here, you've most likely just finished running an amplitude analysis using the [AmpTools](https://github.com/mashephe/AmpTools) framework, and now you have a whole bunch of fit results, stored as `.fit` files, that you need to start plotting. Within this tutorial you will learn how to:
1. Aggregate these fit and data files into flattened `.csv` files
2. Load and plot the `.csv` fit results using Python's pandas and matplotlib libraries
   1. Analyze mass-independent fit results across multiple bins of invariant mass
   1. Analyze the distributions of several fit results within a mass bin

As mentioned in the [repo's README](../README.md), this tutorial assumes you are already familiar with how AmpTools functions and the basics of amplitude analysis. Please note that this tutorial will *not* cover plotting the fit's angular distributions. That requires the full information of the `.root` files and is effectively handled by the [halld_sim plotter scripts](https://github.com/JeffersonLab/halld_sim/blob/master/src/programs/AmplitudeAnalysis/vecps_plotter/vecps_plotter.cc).

## Environment
1. At the top right of the notebook your language / kernel is listed. Make sure `.venv (Python 3.9.18)` is selected. If the option does not appear, make sure the virtual environment is active (see [**Setup > Python** in the README](../README.md) for details).
2. Next we want to ensure that our GlueX environment is set up. In a terminal we could simply run `source setup_gluex.csh`, but inside the Jupyter notebook we need to run the script in a subshell and copy its environment variables into the Python process:

In [None]:
from pathlib import Path
import os
import subprocess

# first let's define the parent dir (repo home)
parent_dir = str(Path().resolve().parents[0]) # 0 is the 1st parent directory, i.e. the repo home
print(parent_dir)

import sys
sys.path.insert(0, parent_dir) # add the repo home directory to the list of directories Python uses to look for modules

# run the setup script (done here in csh, but bash could be used instead)
command = f"source {parent_dir}/setup_gluex.csh && env"
proc = subprocess.Popen(command, stdout=subprocess.PIPE, shell=True, executable='/bin/csh')
output, _ = proc.communicate()

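# Parse the environment variables printed by `env`
env_vars = {}
for line in output.decode().splitlines():
    if '=' not in line:
        continue  # skip lines that are not KEY=VALUE pairs (e.g. continuation lines of multi-line values)
    key, value = line.split('=', 1)
    env_vars[key] = value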

# add the environment variables
os.environ.update(env_vars)

## File Aggregation
We will want to create the following two files in preparation for our analysis:
1. `data.csv`: constructed from the `.root` data files in each mass bin, containing the information for that bin
2. `best_fits.csv`: contains all the fit results across the entire mass range, made from the "best" of all the randomized fits in each bin

### Data
If we want to plot our fit results, we need to include the original data we are actually fitting to. This information is unfortunately not included in the `.fit` files, so we need to read it into a `data.csv` file using [convert_to_csv.py](../scripts/convert_to_csv.py). The Python scripts here use `argparse`, so we can conveniently see this one's options through its help message:

In [None]:
%run -i $parent_dir/scripts/convert_to_csv.py -h

Great, now let's see which (sorted) files we are going to combine, and then run the conversion on them.

In [None]:
%run -i $parent_dir/scripts/convert_to_csv.py -i $parent_dir/data/*/*Amplitude.root -p
%run -i $parent_dir/scripts/convert_to_csv.py -i $parent_dir/data/*/*Amplitude.root

You should now see a `data.csv` file here in the [analysis directory](./).
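Before moving on, it can help to confirm that the file was written and to see which columns it contains. The following is a minimal sketch using pandas. The helper name `summarize_csv` is hypothetical (it is not part of the repo), and the sketch assumes the file was written to the current working directory as `data.csv`.

```python
from pathlib import Path

import pandas as pd


def summarize_csv(csv_path):
    """Load a csv file, print a short summary, and return it as a DataFrame."""
    csv_path = Path(csv_path)
    if not csv_path.is_file():
        raise FileNotFoundError(f"{csv_path} not found; did the conversion step run?")
    df = pd.read_csv(csv_path)
    print(f"{csv_path.name}: {len(df)} rows, {len(df.columns)} columns")
    print(list(df.columns))
    return df


# Assumed location: the analysis directory this notebook runs in
if Path("data.csv").is_file():
    data_df = summarize_csv("data.csv")
```

The printed column names are the ones to use in any later plotting code, since the exact names depend on what the conversion script writes out.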

### Fit Results
Next we combine all of our `.fit` results across the mass bins into a flattened `.csv` file to prepare them for analysis in Python. This is again done with the [convert_to_csv.py](../scripts/convert_to_csv.py) script, which behind the scenes interacts with [extract_fit_results.cc](../scripts/extract_fit_results.cc) to load the AmpTools `FitResults` class and import the information we need.

In [None]:
%run -i $parent_dir/scripts/convert_to_csv.py -i $parent_dir/data/*/*best.fit -o best_fits.csv

## Analysis

### Mass independent fit results
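As a starting point, a fit-result column can be plotted against the mass-bin column with pandas and matplotlib. This is only a sketch: the helper `plot_vs_mass` is hypothetical, and the column names in the usage comment (`mass`, `intensity`, `intensity_err`) are placeholders, not names confirmed to exist in `best_fits.csv`. Check the actual names with `fit_df.columns` first.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_vs_mass(df, mass_col, value_col, err_col=None, ax=None):
    """Plot one fit-result column against the mass-bin column, with optional error bars."""
    if ax is None:
        _, ax = plt.subplots()
    # Sort by mass so the points are drawn in bin order
    ordered = df.sort_values(mass_col)
    yerr = ordered[err_col].to_numpy() if err_col is not None else None
    ax.errorbar(
        ordered[mass_col].to_numpy(),
        ordered[value_col].to_numpy(),
        yerr=yerr,
        fmt="o",
        label=value_col,
    )
    ax.set_xlabel(mass_col)
    ax.set_ylabel(value_col)
    ax.legend()
    return ax


# Hypothetical usage; replace the placeholder column names with real ones
# fit_df = pd.read_csv("best_fits.csv")
# plot_vs_mass(fit_df, "mass", "intensity", err_col="intensity_err")
```

Passing an existing `ax` lets several columns share one set of axes, which is convenient when comparing results across the mass range.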

### Randomized fit results

TODO: describe here how we can aggregate a csv for just a single mass bin

NOTE: Aggregation need not be strictly applied to mass bin results. Bootstrap, $-t$ bins, or any other collection of `.fit` results can be grouped together.
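For any such collection, the first step is to gather the `.fit` files in a predictable order. Below is a minimal sketch of that step. The helper `collect_fit_files` is hypothetical, and the directory in the usage comment is a placeholder, since the layout of the randomized or bootstrap fit directories is an assumption here.

```python
import re
from pathlib import Path


def collect_fit_files(directory, pattern="*.fit"):
    """Return the files in `directory` matching `pattern`, in natural sort order."""

    def natural_key(path):
        # Split the name into digit and non-digit chunks so "fit_2" sorts before "fit_10"
        return [int(chunk) if chunk.isdigit() else chunk for chunk in re.split(r"(\d+)", path.name)]

    return sorted(Path(directory).glob(pattern), key=natural_key)


# Hypothetical usage; point this at the directory holding one bin's fits
# fit_files = collect_fit_files("path/to/one_mass_bin")
# print(*fit_files, sep="\n")
```

The resulting list can then be handed to `convert_to_csv.py` through the same `-i` and `-o` options used in the File Aggregation section.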