# Plotting data from GEA
## Intro
This is an example Jupyter Notebook to show how our hacky EPICS Data Capture and Plotting python scripts can be turned into a Powerful AF (tm) web tool to analyze data.

<div class="alert alert-block alert-danger">
<b>Disclaimer!:</b> This is not meant to completely replace GEA but to be more of an in-depth follow-up analysis tool.
</div>

## The awesome Real Time EPICS data analysis module
The developer's purpose with this module was to give users some flexibility in how to plot and analyze data gathered from our EPICS systems.

The first part of the module is the utility that allows the user to get data from GEA or directly from EPICS IOCs. GEA, our engineering archive, is an extensive database compiled since time immemorial. However, being able to harvest data from IOCs for a particular test or investigation without having to configure said records in GEA (!!!) is also nice.

The main script in this section, found in `./util/`, is:
 - `inPosDataCap.py`

The library used by this script, found in `./lib/`, is:
 - `dataFromGea.py`
 
The second part of the module is the Python library that provides the users with tools to manipulate and plot data. In this library the users will find a myriad of methods/functions/utilities to filter/compare/enhance data. However, the killer functionality is the set of classes that allow for plots to be formatted and tiled without having to deal with all that fun and exciting matplotlib crap (jk, matplotlib is the goat).
All goodies are found in the python file in `./lib/`:
 - `plotEpicsData.py`

<div class="alert alert-block alert-warning">
<b>Note:</b> As a bonus, the module allowed the dev to be a lazy pos that couldn't be bothered to develop an actual UI, which would have been way better.
</div>

### Requirements and installation
This tool does not have very complicated requirements to run. The installation is really just cloning the repo from GitLab.

<div class="alert alert-block alert-warning">
<b>Futurism Alert!:</b> I'm working on a container that should take care of most of the requirements, possibly more. But don't hold your breath.
</div>

#### Linux Requirements
Your machine (or container) needs the following core packages (as well as all their requirements, of course):
 - **EPICS >= 3.14** (or if you find a way to run Channel Access without the core epics installation, even better)
 - **Python >= 3.9** (I couldn't get interactive jupyter plots to work with 3.6)
 - **PIP3**
 - **node.js**

#### Python module requirements
These modules are needed to run the Python scripts and functions (any release that supports Python >= 3.9 should work; install the latest with `pip3`):
 - **pyepics**
 - **numpy**
 - **h5py**
 - **ipympl**
 - **jupyterlab**
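
A quick way to check the list above before running anything is a short preflight script. This is a minimal sketch, not part of the repo: `REQUIRED_MODULES` and `missing_modules` are hypothetical names, and the only non-obvious detail is that the `pyepics` package is imported as `epics`.

```python
import importlib.util
import sys

# Import names for the packages listed above (pyepics is imported as `epics`).
REQUIRED_MODULES = ["epics", "numpy", "h5py", "ipympl", "jupyterlab"]


def missing_modules(names):
    """Return the names that cannot be found by this Python interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The requirements above ask for Python >= 3.9.
python_ok = sys.version_info >= (3, 9)
missing = missing_modules(REQUIRED_MODULES)

print(f"Python >= 3.9: {python_ok}")
print("Missing modules:", ", ".join(missing) if missing else "none")
```

Anything reported as missing can be installed with `pip3`. The check only covers the Python side; it says nothing about the EPICS or node.js installation.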

#### Installation



## How to gather data
As mentioned above, data can be harvested after the fact from GEA or captured as a live data stream by tapping directly into the IOCs. Data is gathered using the script `inPosDataCap.py`. All data is saved by default in the `./data/` directory.

The utility `inPosDataCap.py` has two modes:
 - **GEA**: This mode is enabled by the argument `gea`. In its most basic form it **reads an input text file** passed as an argument, which contains a list of EPICS channels. The data is gathered for the **time window specified by the time string input** arguments with format `yymmddTHHMM`. The user also needs to specify the **site from which data is being gathered** (input argument `gs` for Gemini South and `gn` for Gemini North). The data is then returned as an __[hdf5](https://www.neonscience.org/resources/learning-hub/tutorials/about-hdf5)__ file with the default prefix `recDataGea-{start_date}`. An example command would go as follows:
    ```
    $ ./util/inPosDataCap.py gea gs FastTrack-chans.txt 230412T2200 230412T2300
    ```
    This would gather the data from **GEA South** for the channels in `FastTrack-chans.txt` and write it to the file `recDataGea-230412T2200.h5`. The file is saved to `./data`, since no custom dir was specified in the arguments.

 - **Channel Access**: This mode is enabled by the input argument `ca`. Data is collected live for the list of EPICS channels specified in the **input text file**, for the duration specified by the user (e.g. `-hr 3` would set the script to capture data for 3 hours). An example command is shown below:
    ```
    $ ./util/inPosDataCap.py ca BtO-chans.txt -min 45
    ```
    This would set CA monitors to gather data from the channels specified in `BtO-chans.txt` for 45 min after the execution of the command. Once the capture timer is done, the file `recMonCA-{date-of-execution}.h5` would be generated and stored in `./data`, as no user dir was specified.

<div class="alert alert-block alert-success">
<b>Dig deeper:</b> There are other options for interacting with this utility. Check <code>inPosDataCap.py gea -h</code> and <code>inPosDataCap.py ca -h</code> for more details.
</div>
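
The `yymmddTHHMM` time strings used in GEA mode are easy to get wrong, so it can help to sanity-check a window before launching a long query. The sketch below is not taken from `inPosDataCap.py`: `TIME_FORMAT` and `parse_window` are hypothetical names, and the parsing is timezone-naive because the source does not say which time zone the script assumes.

```python
from datetime import datetime

# strptime pattern matching the yymmddTHHMM strings, e.g. 230412T2200
TIME_FORMAT = "%y%m%dT%H%M"


def parse_window(start, end):
    """Parse two yymmddTHHMM strings and check that the window is not empty."""
    start_dt = datetime.strptime(start, TIME_FORMAT)
    end_dt = datetime.strptime(end, TIME_FORMAT)
    if end_dt <= start_dt:
        raise ValueError(f"end time {end} is not after start time {start}")
    return start_dt, end_dt


# Same window as the GEA example command above
start_dt, end_dt = parse_window("230412T2200", "230412T2300")
print(f"start: {start_dt.isoformat()}")
print(f"end:   {end_dt.isoformat()}")
print(f"span:  {end_dt - start_dt}")
```

A malformed string (say, a four-digit year) raises a `ValueError` from `strptime`, which is cheaper to discover here than halfway through a capture.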

## How to analyze and plot the data
Now we get to the good stuff. We have the data, what now?
As we saw earlier, the lib script `plotEpicsData.py` provides us with classes that handle all the plotting infrastructure. It also provides a set of functions to manipulate the data to our heart's content. For more examples, check out `examples/`, where you can find example scripts that use this library.
To illustrate how we can go about using this awesome set of tools, let's go through an example exercise.

### Fault Report FR-42809
For this example we will be analyzing data related to __[FR-42809](https://osc.cl.gemini.edu/browse/FR-42809)__. Run the cells in the following sections to obtain a set of graphs that will help identify what happened during the FR's event.

<div class="alert alert-block alert-warning">
<b>Warning!:</b> If you don't have access to the data file used in the example, please uncomment and run the cell below.
</div>

In [None]:
#!../util/inPosDataCap.py gea gs ../examples/FR42809-chans.txt 230417T2200 230418T0200 -cn FT-OpsNight

#### Modules
The first step is to import the necessary tools from `plotEpicsData.py`. The stuff to import is explained as follows:
 - `DataAx`: This is the class that defines an individual plot. This means defining the axis box characteristics as well as the data visualization properties.
 - `DataAxePlotter`: This is the class that defines the grid and handles the relative size and position for a group of axes.

These classes are imported individually, since they could be used in several places throughout the script, depending on the number of plots involved. We also import `plotEpicsData` as a whole module in order to use the different functions defined in the script.

In [None]:
# The following line is needed for interactive plotting embedded in the notebook
%matplotlib ipympl

# Load the necessary modules
from chanmonitor.lib.plotEpicsData import DataAx, DataAxePlotter
import chanmonitor.lib.plotEpicsData as ped

#### The data
In the first section of the script, we define the following:
 - The **name of the file** that holds the data
 - The **location** of said file
 - The **start and end times** that define the span of the plot
 
In this case, we will be looking at `FT-OpsNight-230417T2200.h5`, a file with data gathered from the Operations Night on April 17th, 2023 using the utility described in the first section. We won't define a time window explicitly, which means we'll plot the entire time range contained in the file.

In [None]:
file_name = 'FT-OpsNight-230417T2200.h5'
file_loc = '../data/'
hdf5File = file_loc + file_name
stime = ''
etime = ''

The data is extracted from the file using `extract_hdf5`. This function generates a python dictionary containing the data from all the records contained in the hdf5 file. In this case, the dictionary generated is stored in `recData`.
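
Judging from how the records are indexed in the cells of this notebook, each entry of the dictionary appears to be a two-element list: timestamps at index `0` and values at index `1`, where a value can itself be an array for array-valued records. The sketch below uses a hand-made stand-in to show that layout. The numbers are synthetic, the float timestamps are placeholders (the real timestamp type is not stated here), and `pick_element` and `summarize` are hypothetical helpers, not part of `plotEpicsData.py`.

```python
# Hand-made stand-in for the dictionary layout: {channel: [timestamps, values]}
rec_data = {
    # scalar record: one number per timestamp
    "mc:azCurrentPos": [[0.0, 1.0, 2.0], [10.0, 10.5, 11.0]],
    # array record: one array per timestamp
    "mc:followA.J": [[0.0, 1.0], [[0.0, 0.1, 0.2, 10.0], [1.0, 0.1, 0.2, 10.5]]],
}


def pick_element(record, index):
    """Turn an array-valued record into a scalar one by keeping one element per sample."""
    times, samples = record
    return [times, [sample[index] for sample in samples]]


def summarize(data):
    """Map each channel name to its number of samples."""
    return {chan: len(record[0]) for chan, record in data.items()}


print(summarize(rec_data))
print(pick_element(rec_data["mc:followA.J"], 3))
```

Printing a summary like this right after loading a file is a cheap way to spot channels that came back empty.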

In [None]:
# Read h5 file
recData = ped.extract_hdf5([hdf5File],
                           stime,
                           etime)

#### Data manipulation and analysis
Once we have the dictionary, we can start playing with the data. For this example we will be doing the following processing:
 - **Extract demands from demand arrays**: Pretty self-explanatory
 - **Missing demands analysis**: We want to know if any of the demands generated by the TCS were lost during transmission to the MCS. For this we'll use a function that finds any demands present at the TCS but missing in the MCS. We'll also calculate statistics to include as info in our plots. Finally, we'll generate an array that bins the number of lost demands over a set time differential (bin window).
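
To make the missing-demands idea concrete, here is a toy version of the two steps on synthetic data. It is only an illustration and is not how `ped.lost_dmd` or `ped.lost_dmd_diff` are implemented: `find_missing` and `bin_counts` are hypothetical names, the matching is a plain value lookup, and timestamps are assumed to be seconds stored as floats.

```python
from collections import Counter


def find_missing(sent, received):
    """Return [times, values] for samples in `sent` whose value never appears in `received`."""
    seen = set(received[1])
    times, values = [], []
    for t, v in zip(sent[0], sent[1]):
        if v not in seen:
            times.append(t)
            values.append(v)
    return [times, values]


def bin_counts(times, window_s=60.0):
    """Count events per bin of `window_s` seconds, starting at the first event."""
    if not times:
        return [[], []]
    t0 = times[0]
    counts = Counter(int((t - t0) // window_s) for t in times)
    n_bins = max(counts) + 1
    starts = [t0 + i * window_s for i in range(n_bins)]
    return [starts, [counts.get(i, 0) for i in range(n_bins)]]


# Synthetic records in [timestamps, values] form
sent = [[0.0, 10.0, 70.0, 130.0, 140.0], [1.0, 1.1, 1.2, 1.3, 1.4]]
received = [[0.5, 10.5, 130.5], [1.0, 1.1, 1.3]]

lost = find_missing(sent, received)
lost_percent = len(lost[0]) / len(sent[0]) * 100
binned = bin_counts(lost[0], window_s=60.0)

print(f"lost {len(lost[0])}/{len(sent[0])} ({lost_percent:.1f}%)")
print("bins:", binned)
```

Exact value matching is fine for a toy example, but real demands are floating-point numbers that may be repeated or rescaled along the way, so a production comparison would need something more robust.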

In [None]:
# Extract demands from TCS demands array
tcs_dmd_val = [d[13] for d in recData['tcs:drives:driveMCS.VALI'][1]]
tcs_dmd_array = [recData['tcs:drives:driveMCS.VALI'][0], tcs_dmd_val]
# Extract demands at end point array in MCS
mcs_fllw_val = [d[3] for d in recData['mc:followA.J'][1]]
mcs_fllw_array = [recData['mc:followA.J'][0], mcs_fllw_val]

# Find missing demands
tcs_lost = ped.lost_dmd(recData['tcs:drives:driveMCS.VALI'],
                        recData['mc:followA.J'])
# Calculate lost demands statistics
tcs_total = len(recData['tcs:drives:driveMCS.VALI'][0])
tcs_total_lost = len(tcs_lost[0])
lost_prcnt = (tcs_total_lost/tcs_total) * 100
tcs_total_accum = [tcs_lost[0], list(range(1,tcs_total_lost+1))]

# Distribute lost demands over bins of diff_window span
diff_window = 1
lost_pkg_diff = ped.lost_dmd_diff(tcs_lost, diff_min=diff_window)

#### Configuring plots
With the data manipulation out of the way, it's time to take care of data viz. For this example, we'll plot the following:
 - **Azimuth**: Demand, Demand at Tx, Demand at Rx, Current Position, Position error, Error reported by the PMAC
 - **Lost demands**: scatter plot of lost demands over time, binned plot of lost demands over time in bins of 1 min.

Each plot is configured using the `DataAx` class. With this class we can define many attributes. For this example, we'll be configuring attributes such as:
 - Line and marker color, eg. __[xkcd colors](https://xkcd.com/color/rgb/)__
 - Line width
 - Marker size
 - Marker type
 - Axis label
 - Plot label
 - Box height
 
Special mention to **box height**. If this parameter is not specified, **all boxes have the same height**. When this parameter is set for a plot, **all unspecified boxes get a height of 1, and the specified height acts as a multiplier of that unit height**. This description can be confusing, but all that matters is that **plots will always fill up the entire column in which they are plotted**. The example plot should clarify this further.
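
The arithmetic behind the box height rule can be spelled out in a few lines. This sketch only illustrates the rule as described above and makes no claim about how `DataAxePlotter` implements it; `box_fractions` is a hypothetical helper and `None` stands for "height not specified".

```python
def box_fractions(heights):
    """Convert per-plot heights (None = unspecified) into fractions of the column."""
    units = [1 if h is None else h for h in heights]
    total = sum(units)
    return [u / total for u in units]


# Three plots in one column, the middle one with height=2
fractions = box_fractions([None, 2, None])
print(fractions)

# No heights given: every plot gets the same share of the column
print(box_fractions([None, None, None]))
```

In the first case the middle plot takes half of the column and the other two a quarter each, so the column is always filled no matter which heights are chosen.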

<div class="alert alert-block alert-success">
<b>Go nuts:</b> Feel free to play by changing each attribute and see how the plot changes.
</div>

In [None]:
# NOTE: `zones_dict` and `ylim_dr_mcs` are used below but are not defined in any cell shown in this notebook.
# Define them before running this cell, or drop the `zone=` and `ylims=` arguments.
mc_azDmd = DataAx(recData['mc:azDemandPos'],
                  'xkcd:grass green',
                  label='mc:azDemandPos',
                  ylabel='Position [deg]',
                  marker='o',
                  marksize=5,
                  zone=zones_dict,
                  linewidth=1.5)

mc_azPos = DataAx(recData['mc:azCurrentPos'],
                  'xkcd:bright blue',
                  label='mc:azCurrentPos',
                  ylabel='Position [deg]',
                  marker='o',
                  marksize=5,
                  linewidth=1.25)

mc_azPos_tcs = DataAx(tcs_dmd_array,
                      'xkcd:hot pink',
                      label='tcs:drives:driveMCS.VALI',
                      ylabel='Position [deg]',
                      marker='o',
                      marksize=5,
                      linewidth=1.25)

mc_azPos_fllw = DataAx(mcs_fllw_array,
                       'xkcd:neon blue',
                       label='mc:followA.J',
                       ylabel='Position [deg]',
                       marker='o',
                       marksize=5,
                       linewidth=1.25)

mc_azErr = DataAx([recData['mc:azPosError'][0],
                   recData['mc:azPosError'][1]],
                  'xkcd:brick red',
                  label='mc:azPosError',
                  ylabel='Position Error [deg]',
                  ylims = ylim_dr_mcs,
                  height = 2,
                  linewidth=1.25)

mc_azPmacErr = DataAx(recData['mc:azPmacPosError'],
                      'xkcd:plum',
                      label='mc:azPmacPosError',
                      ylabel='Position Error [deg]',
                      ylims = ylim_dr_mcs,
                      height = 2,
                      linewidth=1.25)

tcs_lost_dmd = DataAx(tcs_lost,
                      'xkcd:cherry',
                      linestyle='',
                      marker='o',
                      label=f'Lost Dmd: {lost_prcnt:.2f}% {tcs_total_lost}/{tcs_total}',
                      ylabel='Lost Pkg [bool]')

tcs_lost_diff = DataAx(lost_pkg_diff,
                       'xkcd:cherry',
                       marker='o',
                       height = 2,
                       drawstyle='steps-post',
                       label=f'Acc Lost Dmd, Diff={diff_window} [min] ',
                       ylabel='Acc Lost Pkg [count]')

In [None]:
plts = DataAxePlotter(ncols=1)

plts.Axe['c1']['mc_azPos'] = mc_azPos
#plts.Axe['c2']['mc_azDmd'] = mc_azDmd
#plts.Axe['c2']['mc_azPos'] = DataAx.update_axe(mc_azPos,
#                                                   shaxname='mc_azDmd')
#plts.Axe['c2']['mc_azPos_tcs'] = DataAx.update_axe(mc_azPos_tcs,
#                                                   shaxname='mc_azDmd')
#plts.Axe['c2']['mc_azPos_fllw'] = DataAx.update_axe(mc_azPos_fllw,
#                                                   shaxname='mc_azDmd')
# plts.Axe['c1']['mc_azPmacDmd'] = mc_azPmacDmd
#plts.Axe['c1']['mc_azErr'] = mc_azErr
#plts.Axe['c1']['mc_azPmacErr'] = mc_azPmacErr
#plts.Axe['c1']['tcs_lost_dmd'] = tcs_lost_dmd
#plts.Axe['c1']['tcs_lost_diff'] = tcs_lost_diff

# plts.Axe['c2']['mc_elDmd'] = mc_elDmd
# plts.Axe['c2']['mc_elPos'] = DataAx.update_axe(mc_elPos,
                                                   # shaxname='mc_elDmd')
# plts.Axe['c2']['mc_elPmacDmd'] = mc_elPmacDmd
# plts.Axe['c2']['mc_elErr'] = mc_elErr
# plts.Axe['c2']['mc_elPmacErr'] = mc_elPmacErr

# plts.Axe['c3']['mc_trk_azerr_fft'] = mc_trk_azerr_fft
# plts.Axe['c3']['mc_trk_elerr_fft'] = mc_trk_elerr_fft
plts.positionPlot()
plts.plotConfig('FR-42809 Analysis')