# Tutorial 2: Plotting Time Resolved, Temperature-Jump SAXS Data Analysis

**Package Information:**<br>
Currently the [tr_tjump_saxs](https://gitlab.oit.duke.edu/tr_t-jump_saxs/y22-23/-/tree/main?ref_type=heads "tr_tjump_saxs") package only works through the Python3 command line. The full dependencies can be found on the [Henderson GitLab Page](https://gitlab.oit.duke.edu/tr_t-jump_saxs/y22-23/-/tree/main?ref_type=heads "tr_tjump_saxs") and the environment can be cloned from the [environment.yml file](https://gitlab.oit.duke.edu/tr_t-jump_saxs/y22-23/-/blob/main/environment.yml?ref_type=heads "environment.yml file"). The data analysis can be executed from an interactive Python command line such as [IPython](https://www.python.org/) or [Jupyter](https://jupyter.org/), or the code can be written in a script to run in non-interactive mode. The preferred usage is in Jupyter Lab, as this is the environment the package was developed in. A Jupyter notebook also keeps all code, code output, and notes in a single file, which serves as a record of the data analysis performed, the code used to conduct it, and its output.

**Tutorial Information:**<br>
This set of tutorial notebooks will cover how to use the `tr_tjump_saxs` package to analyze TR, T-Jump SAXS data and the <a href="https://www.science.org/doi/10.1126/sciadv.adj0396">workflow used to study HIV-1 Envelope glycoprotein dynamics. </a> This package contains multiple modules, each containing a set of functions to accomplish a specific subtask of the TR, T-Jump SAXS data analysis workflow. Many of the functions are modular and some can be helpful for analyzing static SAXS and other data sets as well. 

**Package Modules:**<br>
> 1. `file_handling`<br>
> 2. `saxs_processing`<br>
> 3. `saxs_qc`<br>
> 4. `saxs_kinetics`<br>
> 5. `saxs_modeling`<br>

**Developer:** [@ScientistAsh](https://github.com/ScientistAsh "ScientistAsh GitHub")

**Updated:** 31 January 2024

# Tutorial 2 Introduction
In this Tutorial 2 notebook, I introduce the plotting function in the `file_handling` module from the `tr_tjump_saxs` package. If you need help with loading the files or slicing loaded data, see [Tutorial 1](https://gitlab.oit.duke.edu/tr_t-jump_saxs/y22-23/-/blob/main/TUTORIALS/tutorial1_file-handling.ipynb?ref_type=heads). If you find any issues with this tutorial, please create an issue on the repository GitLab page ([tr_tjump_saxs issues](https://gitlab.oit.duke.edu/tr_t-jump_saxs/y22-23/-/issues "tr_tjump_saxs Issues")).

## Module functions:
> `make_dir()` makes a new directory to store output. <br>
> `make_flist()` makes a list of files. <br>
> `load_saxs()` loads a single SAXS scattering or difference curve. <br>
> `load_set()` loads a set of SAXS scattering or difference curves. <br>
> `plot_curve()` plots a single curve or a set of SAXS scattering or difference curves. <br>


## Tutorial Files:

### Data Files
The original data used in this analysis is deposited on the [SASBDB](https://www.sasbdb.org/) with accession numbers:
> **Static Data:** <br>
    - *CH505 Temperature Series*: SASDT29, SASDT39, SASDT49, SASDT59 <br>
    - *CH848 Temperature Series*: SASDTH9, SASDTJ9, SASDTK9, SASDTL9 <br>
<br>
> **T-Jump Data:** <br>
    - *CH505 T-Jump Data*: SASDT69, SASDT79, SASDT89, SASDT99, SASDTA9, SASDTB9, SASDTC9, SASDTD9, SASDTE9, SASDTF9, SASDTG9 <br>
     - *CH848 T-Jump Data*: SASDTM9, SASDTN9, SASDTP9, SASDTQ9, SASDTR9, SASDTS9, SASDTT9, SASDTU9, SASDTV9, SASDTW9 <br>
<br>
> **Static Env SOSIP Panel:** SASDTZ9, SASDU22, SASDU32, SASDU42, SASDTX9, SASDTY9 <br>

Additional MD data associated with the paper can be found on [Zenodo](https://zenodo.org/records/10451687).

### Output Files
Example output is included in the [OUTPUT](https://github.com/ScientistAsh/tr_tjump_saxs/tree/main/TUTORIALS/OUTPUT/) subdirectory in the [TUTORIALS](https://gitlab.oit.duke.edu/tr_t-jump_saxs/y22-23/-/tree/main/TUTORIALS?ref_type=heads) directory.

# How to Use Jupyter Notebooks
You can execute the code directly in this notebook or create your own notebook and copy the code there.

<div class="alert alert-block alert-info">
    
    <b><i class="fa fa-info-circle" aria-hidden="true"></i>&nbsp; Tips</b><br>
    
    <b>1.</b> To run the currently highlighted cell, hit the <code>shift</code> and <code>enter</code> keys at the same time.<br>
    <b>2.</b> To get help with a specific function, place the cursor inside the function's parentheses and hit the <code>shift</code> and <code>tab</code> keys at the same time.

</div>

<div class="alert alert-block alert-info" style="background-color: white; border: 2px solid; padding: 10px">
    <b><i class="fa fa-star" aria-hidden="true"></i>&nbsp; In the Literature</b><br>
    
    Our <a href="https://www.science.org/doi/10.1126/sciadv.adj0396">recent paper </a> in Science Advances provides an example of the type of data, the analysis procedure, and example output for this type of data analysis.  <br> 
    
    <p style="text-align:center">
    
</div>

# Import Modules

The first step is to import the necessary Python packages. The dependencies are imported automatically with the package import.

In [None]:
# import sys so the package location can be added to Python's module search path
import sys

# append the path for the tr_tjump_saxs package to sys.path
sys.path.append(r'../')

# import the file_handling functions along with their dependencies
from file_handling import *

<div class="alert alert-block alert-info">
    <b><i class="fa fa-info-circle" aria-hidden="true"></i>&nbsp; Tips</b><br>
    Be sure that the path for the <code>tr_tjump_saxs</code> package appended to the <code>PYTHONPATH</code> matches the path to the repository on your machine.
    </div>
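
If the import fails, it can help to confirm where the relative path actually points before appending it. The following is a minimal sketch that uses only the Python standard library. The variable name `repo_path` is an assumption introduced for illustration, and `'../'` is simply the relative path used in the cell above; adjust it to match your own clone of the repository.

```python
import sys
from pathlib import Path

# '../' is the relative path used above; adjust it to your local clone
repo_path = Path('../').resolve()

# report where the relative path points and whether it is a directory
print(f'Package path: {repo_path} (is a directory: {repo_path.is_dir()})')

# append the path only once so repeated cell runs do not duplicate it
if str(repo_path) not in sys.path:
    sys.path.append(str(repo_path))
```

Resolving the path to an absolute location makes the printed message unambiguous, regardless of which directory the notebook was started from.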

# Load Data
We will use the `load_set` function to load a set of SAXS difference curves to plot. See [Tutorial 1](https://gitlab.oit.duke.edu/tr_t-jump_saxs/y22-23/-/blob/main/TUTORIALS/tutorial1_file-handling.ipynb?ref_type=heads) for more information on this function if needed.

In [None]:
# first, make a file list
files = make_flist(directory='../../../TR_T-jump_SAXS_July2022/protein_20hz_set01/processedb/',
                  prefix='diff_protein_20hz_set01_1ms_', suffix='_Q.chi')

# sort files
files.sort()

# load files into np arrays
data, data_arr, q, err = load_set(flist=files, delim=' ', mask=10, err=False)

# Plot Data

## Overview
Once all the data you would like to work with are loaded, it is best practice to plot the data to check that they were loaded and sliced appropriately. You can use general-purpose Python plotting libraries, but this module also contains a function, `plot_curve`, to plot SAXS data sets. The `plot_curve` function has 13 input parameters.

## Input Parameters
> 1. `data_arr` is the array containing the data to be plotted and is required input. This can be either a single curve or a set of curves. <br>
> 2. `q_arr` is the array containing the scattering vector values and is required input. <br>
> 3. `labels` is the list of strings used to label the plotted curves, as in the examples below. <br>
> 4. `qmin`, `qmax`, `imin`, and `imax` are optional and control if and how the inset plot is scaled. No inset is made when they are set to `None`, which is the default value for these parameters. <br>
> 5. `x`, `y`, and `title` indicate the labels to use for the x-axis, y-axis, and plot title, respectively. These parameters are optional and default to generic labels that are common for SAXS data. <br>
> 6. `save`, `save_dir`, and `save_name` are optional and indicate if and where files will be saved. By default, nothing is saved. <br>
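
To get a feel for what a `qmin`/`qmax` window selects, the sketch below picks out the points of a curve that fall inside a q range using plain Python lists. It is only an illustration and not the implementation used by `plot_curve`: the values in `q_demo` and `i_demo` are made up, the variable names are hypothetical, and treating the bounds as inclusive is an assumption.

```python
# hypothetical scattering vector values and intensities (not real data)
q_demo = [0.01, 0.02, 0.05, 0.10, 0.15, 0.20]
i_demo = [5.0, 4.0, 2.5, 1.2, 0.6, 0.3]

# q range of interest, matching the values used in the examples below
qmin, qmax = 0.02, 0.15

# keep only the points whose q value falls inside [qmin, qmax]
window = [(q, i) for q, i in zip(q_demo, i_demo) if qmin <= q <= qmax]
q_inset = [q for q, _ in window]
i_inset = [i for _, i in window]

print(q_inset)
print(i_inset)
```

Checking the selected range this way before plotting can help confirm that the feature of interest actually lies inside the window you plan to pass.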

## Returned Values
This function does not return any values.

## Raised Errors
This function does not raise any custom errors. If errors are raised, then refer to the docs for the function cited in the traceback. 

## Example 1: Basic Usage

In [None]:
plot_curve(data_arr=data_arr, q_arr=q, labels=files, qmin=0.02, qmax=0.15, 
           imin=None, imax=None, x='Scattering Vector Å $^{-1}$', y='Scattering Intensity',
           title='CH505 TR, T-Jump Difference Curves for 1ms', save=True, save_dir='./OUTPUT/TUTORIAL2/', 
           save_name='tutorial2_ex1.png')

<br>

We passed the previously loaded scattering curves for the 1ms time delay from the example data to be plotted. In this example data set, the feature of interest is between 0.02-0.15 Å$^{-1}$ scattering vector values, so the inset is applied to this q range. The strings for the labels are a bit too long and make viewing the output difficult. To make shorter string values to use for labels you can add a simple loop to build a label list. 

## Example 2: Adding a label list

In [None]:
labs = []
for i in files:
    labs.append(i[-9:-6])
    
plot_curve(data_arr=data_arr, q_arr=q, labels=labs, qmin=0.02, qmax=0.15, 
           imin=None, imax=None, x='Scattering Vector Å $^{-1}$', y='Scattering Intensity',
           title='CH505 TR, T-Jump Difference Curves for 1ms', save=True, save_dir='./OUTPUT/TUTORIAL2/', 
           save_name='tutorial2_ex2.png')

<br>

Notice how, after creating a label list, the labels indicate the image number from that dataset.
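
The slice `[-9:-6]` works because the file suffix `_Q.chi` is six characters long, so the three characters just before it are taken as the label. The sketch below illustrates this on a single file name. The file name is hypothetical and only follows the prefix and suffix used in this tutorial; the variable names are also assumptions. The second approach, which splits on underscores, is an alternative that does not depend on the image number having exactly three digits.

```python
# hypothetical file name; real names depend on your data directory
fname = 'diff_protein_20hz_set01_1ms_045_Q.chi'
suffix = '_Q.chi'

# fixed-width slice: the three characters immediately before the suffix
label_slice = fname[-9:-6]

# alternative: drop the suffix, then take the last underscore-separated field
label_split = fname[:-len(suffix)].split('_')[-1]

print(label_slice, label_split)
```

Both approaches give the same label here; the split-based version may be preferable if the numbering in your file names is not zero-padded to a fixed width.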

<br>

## Example 3: Sorting data before plotting

If you would like the curves plotted in sorted order and the files are not already sorted, sort the file list before loading the data into arrays.

In [None]:
files.sort()

data, data_arr, q, err = load_set(flist=files, delim=' ', mask=10, err=False)

labs = []
for i in files:
    labs.append(i[-9:-6])
    
plot_curve(data_arr=data_arr, q_arr=q, labels=labs, qmin=0.02, qmax=0.15, 
           imin=None, imax=None, x='Scattering Vector Å $^{-1}$', y='Scattering Intensity',
           title='CH505 TR, T-Jump Difference Curves for 1ms', save=True, save_dir='./OUTPUT/TUTORIAL2/', 
           save_name='tutorial2_ex3.png')

<div class="alert alert-block alert-warning">
    
    <i class="fa fa-exclamation-triangle"></i>&nbsp; <b>Remember to sort before loading files</b><br>
    If you sort the files after loading them, the file list indices will no longer match the data array indices.
    </div>
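
The mismatch described in this warning can be reproduced with plain Python lists. The sketch below uses made-up file names and stand-in values in place of loaded curves, so every name in it is hypothetical. It also shows one possible way to recover when the data are already loaded, which is to sort the file names and the data together as pairs.

```python
# hypothetical file names in unsorted order
files_demo = ['img_003.chi', 'img_001.chi', 'img_002.chi']

# stand-in for loaded curves: one value per file, in the same order
data_demo = [name.upper() for name in files_demo]

# sorting only the file list after loading breaks the pairing
files_sorted = sorted(files_demo)
mismatched = list(zip(files_sorted, data_demo))

# sorting file names and data together keeps them aligned
paired = sorted(zip(files_demo, data_demo))

print(mismatched[0])
print(paired[0])
```

In the mismatched case the first file name is paired with data that came from a different file, which is exactly the situation that sorting before loading avoids.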


## Example 4: Looping Over Multiple Time Delays

In [None]:
%%time

# define time delays as a list of strings
times = ['10us', '50us', '100us', '500us', '1ms']

# loop over time delay
for t in times:
    # print dataset 
    print('\033[94;1mLoading ' + str(t) + ' curves\033[0m')
    
    # make file list for t time delay
    files = make_flist(directory='../../../TR_T-jump_SAXS_July2022/protein_20hz_set01/processedb/',
                  prefix='diff_protein_20hz_set01_' + str(t), suffix='_Q.chi')
    
    # sort files
    files.sort()

    # load data
    data, data_arr, q, err = load_set(flist=files, delim=' ', mask=10, err=False)

    # create labels for plots
    labs = []
    for i in files:
        labs.append(i[-9:-6])
    
    # plot curves for time delay t
    plot_curve(data_arr=data_arr, q_arr=q, labels=labs, qmin=0.02, qmax=0.15, 
               imin=None, imax=None, x='Scattering Vector Å $^{-1}$', y='Scattering Intensity',
               title='CH505 TR, T-Jump Difference Curves for ' + str(t), save=True, 
               save_dir='./OUTPUT/TUTORIAL2/', 
               save_name='tutorial2_ex4_' + str(t) + '.png')
    
print('\033[92mCompleted plotting analysis!\033[0m')

<div class="alert alert-block alert-success">
    
    <i class="fa fa-check-circle"></i>&nbsp; <b>Congratulations!</b><br>
       You completed the second tutorial! Continue with <a href="https://gitlab.oit.duke.edu/tr_t-jump_saxs/y22-23/-/blob/main/TUTORIALS/tutorial3_sys_err.ipynb?ref_type=heads">Tutorial 3</a> to learn about processing SAXS scattering curves.
    </div>