## Lifetime Analysis

Starting with what I want. Building up from simple things that work. Create simulated data to verify it's functioning. Use object oriented coding.

**Purpose:** 

* A simple, object-oriented set of scripts for analyzing half-life data. 
* Should work for both Ne19 and He6. Maybe even for anyone analyzing half-life data with the CAEN digitizer? 


**Process:** 

* Begin with a simple exponential and add complexity from there. 
    * Next add a simple non-paralyzable dead-time model. 
    * Be sure to integrate over bin widths.
* Use the data that was taken in Sept and Michael's code as a starting place. 
* Keep organized. This is priority #1. 
* Use Brent's code and Kris's code and my own CreateFakeSpecSimple code as a guide and reminder. 
* Begin by writing the code in an ipynb, then move that code into scripts that are imported once things begin to work as you want them to. 
* In the end the ipynb should only contain a bit of code and should mostly just step through the analysis process which should be intuitive. 
* Simulate fake data to make sure we know it's working and also to get an idea for the stats we will need to take a competitive measurement. 
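
The first steps of this process can be sketched roughly as follows. This is a minimal sketch, not the final analysis code: the function names (`expected_counts`, `apply_dead_time`, `simulate_counts`) and all numerical values (initial rate, half-life, background, dead time) are placeholders chosen for illustration. Two approximations are assumed: the dead-time correction is applied to the average rate in each bin, and the simulated counts are drawn from a Poisson distribution even after the dead-time correction.

```python
import numpy as np

def expected_counts(edges, N0, half_life, background=0.0):
    """Expected counts per bin: the rate N0*exp(-lam*t) + background,
    integrated analytically over each bin."""
    lam = np.log(2) / half_life
    lo, hi = edges[:-1], edges[1:]
    decay = N0 / lam * (np.exp(-lam * lo) - np.exp(-lam * hi))
    return decay + background * (hi - lo)

def apply_dead_time(true_counts, widths, tau):
    """Non-paralyzable dead time: measured rate m = n / (1 + n*tau),
    applied to the average true rate n in each bin (an approximation)."""
    n = true_counts / widths
    return n / (1.0 + n * tau) * widths

def simulate_counts(edges, N0, half_life, background=0.0, tau=0.0, seed=0):
    """Draw fake binned counts around the dead-time-corrected expectation."""
    rng = np.random.default_rng(seed)
    widths = np.diff(edges)
    mu = apply_dead_time(expected_counts(edges, N0, half_life, background), widths, tau)
    return rng.poisson(mu)

# Placeholder values: 1 s bins over 70 s, illustrative rate, half-life and dead time.
edges = np.arange(0.0, 71.0, 1.0)
fake = simulate_counts(edges, N0=5000.0, half_life=17.2, background=2.0, tau=1e-6)
```

Setting `tau=0.0` and `background=0.0` recovers the simple exponential, so the same functions can be used while adding complexity one step at a time.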

**Class Structure:** 

* Start with a HalflifeAnalysis object that reads and breaks up the data. The data will be pd df's that are attributes of the HalflifeAnalysis object.
    * The init function will call the splitter and create the attributes.
* Then the different types of analysis will be subclasses. 
    * The first one being SimpleExponential. Be sure to integrate over the bins. 
    * Use lmfit. Take time to understand the stats that you are doing. 
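
A rough sketch of this class structure is shown below. It makes several assumptions: the base class takes already-split event times (in seconds) instead of reading the digitizer files, `scipy.optimize.curve_fit` stands in for lmfit, and the weighted least-squares fit with sqrt(N) errors is only a first approximation to the statistics. The method names and the fake data are hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit

class HalflifeAnalysis:
    """Holds the event times (seconds, relative to t=0) for each cycle."""

    def __init__(self, cycle_times):
        self.cycle_times = [np.asarray(t, dtype=float) for t in cycle_times]

    def histogram(self, edges):
        """Sum all cycles into one histogram with the given bin edges."""
        counts, _ = np.histogram(np.concatenate(self.cycle_times), bins=edges)
        return counts

class SimpleExponential(HalflifeAnalysis):
    """Fits N0*exp(-lam*t), integrated over each bin."""

    @staticmethod
    def model(bins, N0, half_life):
        lo, hi = bins  # lower and upper edge of every bin
        lam = np.log(2) / half_life
        return N0 / lam * (np.exp(-lam * lo) - np.exp(-lam * hi))

    def fit(self, edges, p0):
        counts = self.histogram(edges)
        bins = np.vstack([edges[:-1], edges[1:]])
        sigma = np.sqrt(np.maximum(counts, 1))  # avoid zero errors in empty bins
        popt, pcov = curve_fit(self.model, bins, counts, p0=p0,
                               sigma=sigma, absolute_sigma=True)
        return popt, np.sqrt(np.diag(pcov))

# Fake data: two cycles of exponentially distributed decay times.
rng = np.random.default_rng(1)
true_half_life = 10.0
times = rng.exponential(true_half_life / np.log(2), size=20000)
analysis = SimpleExponential([times[:10000], times[10000:]])
edges = np.arange(0.0, 61.0, 1.0)
(N0_fit, half_life_fit), errors = analysis.fit(edges, p0=(1000.0, 8.0))
print(half_life_fit, errors[1])
```

Keeping the histogramming in the base class means later subclasses (dead time, background) only need to supply a different model.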

In [65]:
from pathlib import Path
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats
import scipy as sp
import math
import random
from scipy.optimize import curve_fit,minimize
from numpy.linalg import inv
from tqdm import tqdm


class LifetimeAnalysis: 
    """
    Reads the three digitizer channel files for one run and splits Ch0 and Ch1 into cycles.
    Attributes: Data, DataCycles. Methods: CycleSplit.
    """
  
            
    def __init__(self, Date, Cycle, TimeInterval):
        
        """
        __init__ (Method): 
            Description: Creates class attributes based on Date, Cycle, and TimeInterval. 
            Arguments: Date (str): mmddyyyy , Cycle (int): # , TimeInterval (tuple): (start, stop) in seconds relative to each Ch2 signal
            Returns: NA 
            Attributes Created: 
                * self.Data: A list of pandas df's consisting of [dfCh0,dfCh1,dfCh2] respectively.
                * self.DataCycles: A list consisting of [Ch0Cycles,Ch1Cycles], where each is a list
                of df's corresponding to each cycle. 
            Notes: 
                * Need to stick to run naming convention when taking data. runTESTmmddyyyy_#
                * Need to keep data in directory Data/Ne19Run_mmddyyyy
            
        """
    
        Path0 = Path().absolute() / 'Data/Ne19Run_{}/CH0@DT5725_1146_Data_runTEST{}_{}.csv'.format(Date, Date, Cycle)
        Path1 = Path().absolute() / 'Data/Ne19Run_{}/CH1@DT5725_1146_Data_runTEST{}_{}.csv'.format(Date, Date, Cycle)
        Path2 = Path().absolute() / 'Data/Ne19Run_{}/CH2@DT5725_1146_Data_runTEST{}_{}.csv'.format(Date, Date, Cycle)
        dfCh0 = pd.read_csv(Path0,sep=';')
        dfCh1 = pd.read_csv(Path1,sep=';')
        dfCh2 = pd.read_csv(Path2,sep=';')
        
        self.Data = [dfCh0,dfCh1,dfCh2]
        
        self.DataCycles = self.CycleSplit(self.Data,TimeInterval)
        
        
    def CycleSplit(self, Data, TimeInterval):
        """ 
        CycleSplit (Method): 
            Description: Splits attribute self.Data into cycles for both Ch0 and Ch1 data. 
            Arguments: 
                * Data: A list of pandas df's consisting of [dfCh0,dfCh1,dfCh2] respectively.
                * TimeInterval: A tuple representing the time interval relative to t=0 that defines each cycle. 
            Returns: 
                *DataCycles: A list consisting of [Ch0Cycles,Ch1Cycles], where each is a list
                of df's corresponding to each cycle. 
            Notes: 
                * As an example: Run_1[0][5] is a df that corresponds to Ch0, Cycle 5 of Run_1. 
         """
            
        Ch0Cycles  = []
        Ch1Cycles  = []
        
        # Iterating through each Ch2 signal, which marks t=0 of a cycle.
        for t0 in Data[2]['TIMETAG']:
            
            # Edges of the cycle window. TIMETAG is in units of 10**-12 s, TimeInterval is in s.
            start = t0 + TimeInterval[0]*10**12
            stop = t0 + TimeInterval[1]*10**12
            
            # Creating a temporary df for the events within the TimeInterval.
            df0Temp = Data[0][(Data[0]['TIMETAG'] > start) & (Data[0]['TIMETAG'] < stop)]
            df1Temp = Data[1][(Data[1]['TIMETAG'] > start) & (Data[1]['TIMETAG'] < stop)]
            
            Ch0Cycles.append(df0Temp)
            Ch1Cycles.append(df1Temp)
        
        # Create a list of lists of dfs. 
        DataCycles = [Ch0Cycles,Ch1Cycles]
               
        return DataCycles  

 

In [39]:
def Split_Data(ch0_dataFrame,ch2_dataFrame,minimum_counts = 0):
    """Return the channel0 counts data split by cycle, based on the triggering from channel2.
    Parameters
    ----------
    ch0_dataFrame : Pandas Dataframe, counts data.
    ch2_dataFrame : Pandas Dataframe, run trigger data.
    minimum_counts : int, 
        minimum counts to remove any possible signals that may occur when not 
        running but still collecting data. Default is 0.
    Returns
    -------
    ch0split_data : list of Pandas Dataframes,
        One Dataframe for each cycle in the run.
    """
    # Initializing an empty list for the ch0 data, split by the clock from ch2.
    ch0split_data = []
    
    # Iterating through the ch2 clock signals. Each cycle needs the signal before and
    # after it, so the loop stops one short and the last cycle is not returned.
    for i in np.arange(0,len(ch2_dataFrame)-1):
        
        # Creating a mask to separate the full data into individual cycles for ch0.
        # The pull-back time is in units of 10**-12 seconds.
        ch0time_mask = np.where((ch0_dataFrame['TIMETAG'] > ch2_dataFrame['TIMETAG'][i]) & 
                                (ch0_dataFrame['TIMETAG'] < ch2_dataFrame['TIMETAG'][i+1]-(80*10**12)))
        # Note (Drew): Michael set this "pull-back" time to 89s (the line above uses 80s).
        # If the cycle length is less than this you need to change it. 
        # Ch2: it was at t=64, now it's at t=0. (So maybe try to change to 25s?)
        # The minimum counts cut removes any signals that occur when we are not running but still collecting data.
        if len(ch0time_mask[0]) >= minimum_counts:
            
            # For each cycle, create a new dataframe with a reset index and store it in the list.
            split_dataFrame0 = ch0_dataFrame.iloc[ch0time_mask[0]].copy()
            split_dataFrame0 = split_dataFrame0.reset_index(drop=True)
            
            ch0split_data.append(split_dataFrame0)

    # Returns a list of dataframes for channel0.
    return ch0split_data

In [66]:
# The triple-quoted strings are docstrings, not comments: they are saved with the class and methods. 
help(LifetimeAnalysis.CycleSplit)

TimeInterval = (53,124)
a = LifetimeAnalysis('11242020',0,TimeInterval)

a.DataCycles
len(a.DataCycles)
a.DataCycles[0][1]
# Ok, right now it's only 2 cycles long... 

Help on function CycleSplit in module __main__:

CycleSplit(self, Data, TimeInterval)
    CycleSplit (Method): 
        Description: Splits attribute self.Data into cycles for both Ch0 and Ch1 data. 
        Arguments: 
            * Data: A list of pandas df's consisting of [dfCh0,dfCh1,dfCh2] respectively.
            * TimeInterval: A tuple representing the time interval relative to t=0 that defines each cycle. 
        Returns: 
            *DataCycles: A list consisting of [Ch0Cycles,Ch1Cycles], where each is a list
            of df's corresponding to each cycle. 
        Notes: 
            * As an example: Run_1[0][5] is a df that corresponds to Ch0, Cycle 5 of Run_1.



Unnamed: 0,TIMETAG,ENERGY,ENERGYSHORT,FLAGS
2251480,205467015027998,65,61,0x4000
2251481,205467039715997,37,35,0x4000
2251482,205467042049085,102,101,0x4000
2251483,205467063283996,91,87,0x4000
2251484,205467069859997,34,32,0x4000
...,...,...,...,...
2961702,276463786017215,120,116,0x4000
2961703,276463912544244,29,25,0x4000
2961704,276464705827998,80,77,0x4000
2961705,276465682627998,31,28,0x4000


In [1]:
for i in a.DataCycles[1]:  
    print(i)
    BinNum = int(i['TIMETAG'].max()*10**-12) # Number of bins = largest TIMETAG in this cycle, converted from ps to seconds. 
    plt.figure(figsize=(15,8))
#     plt.title('09182020 Run {}'.format(i))
#     plt.ylabel('Counts')
#     plt.xlabel('Time (Seconds)')
#     plt.yscale('log')
    y, x, _ = plt.hist(i['TIMETAG']*10**-12,bins=BinNum);
    plt.tight_layout() # Called after plotting so the layout accounts for the histogram. 

print(y,x)


NameError: name 'a' is not defined

## Next Steps: 

* Go on and break up the cycle with a method that is called by init. Make those cycles attributes. 
* Make a plot method for nice plots of the cycles with orange lines in there for cycles. 
* Make a subclass for the simple exponential fit using lmfit. 


# Debugging: 

* Right now it's missing the last cycle. Why? 
    * Because it needs both the ch2 signal before and the one after. Maybe I should make it work differently, with an interval relative to t = 0 as an input, so that everything is based on the t = 0 signal. That way it gets even the last cycle, and it is also easier to alter. 
* I actually think it would be better to have a cycles or data or run class and then have a LifetimeAnalysis class that takes the cycles as an argument. Seems more intuitive. 
* Why can I not just subtract some number from a column? Still getting an error for this, need to fix at some point. But working fine now at least. 
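
The error itself is not recorded here, so the following is only a guess at one possible cause. If the subtraction was assigned back onto a cycle df that is a filtered slice of the full df, older pandas versions may emit a `SettingWithCopyWarning`. Taking an explicit `.copy()` of the slice first avoids that. The TIMETAG values below are taken from the output above; `t0` and the `TIME_S` column name are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'TIMETAG': [205467015027998, 205467039715997, 276463786017215],
                   'ENERGY': [65, 37, 120]})
t0 = 205467000000000  # hypothetical Ch2 trigger time, in ps

# Copy the filtered slice so that new columns are written to an independent df.
cycle = df[df['TIMETAG'] > t0].copy()

# Subtract the trigger time from the whole column and convert ps to seconds.
cycle['TIME_S'] = (cycle['TIMETAG'] - t0) * 1e-12
print(cycle)
```

The original df is left untouched, so the same split can be repeated with a different interval.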

# Coding Questions: 

* When should you use self as an argument for a method in a class? What is the significance of this exactly? 
* Does a class method have open access to all class attributes, or do they have to be listed as arguments explicitly? And what is the best practice here? 
    * Yes, it does, through self. But I need to be careful with my notation here. Python classes have class attributes and instance attributes. Class attributes are shared by all instances and are also available on the class itself, whereas instance attributes exist only on the individual instance. An ordinary method receives the instance as self, so it can reach both kinds without listing them as arguments. Generally I am working with instance attributes (self.Data).
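
A small example of the points above. The `Run` class and its attribute names are made up for illustration. Only the digitizer name comes from the data file names.

```python
class Run:
    digitizer = 'DT5725'  # class attribute: shared by every instance and by the class

    def __init__(self, date):
        self.date = date  # instance attribute: belongs to this one object

    def label(self):
        # self gives the method access to both kinds of attribute,
        # so neither has to be passed in as an argument.
        return '{}_{}'.format(self.digitizer, self.date)

run = Run('11242020')
print(run.label())       # method called on the instance; self is filled in automatically
print(Run.label(run))    # the same call written out explicitly, passing the instance as self
print(Run.digitizer)     # class attribute read from the class itself
print(hasattr(Run, 'date'))  # instance attributes do not exist on the class
```

The second and third lines show what self is: `run.label()` is shorthand for `Run.label(run)`.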

# Fitting with a Maximum Likelihood method

**Reading:**

* https://towardsdatascience.com/a-gentle-introduction-to-maximum-likelihood-estimation-9fbff27ea12f
    * So the standard deviation is a parameter? 
    * This is gold in terms of what packages to use and such. Very clear. 
    * Maybe I should look into the seaborn.regplot tutorial and package. Maybe compare this outcome to the MLE outcome?
    * Also statsmodels. Look into this. He does all this in like 3 lines of code haha. Lots to learn. 
* Ch6 from sssssstuff
    
**Process:**

* Start with being able to use MLE to fit a line with randomness and an exponential with randomness before you ever start to use our own data. 
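
A minimal sketch of these two warm-up fits, assuming `scipy.optimize.minimize` as the optimizer. All true parameter values, starting guesses, and function names are made up. In the line fit the standard deviation of the noise is itself a fit parameter, which speaks to the question in the reading notes above. The exponential fit uses a Poisson likelihood on binned counts, with the rate integrated over each bin.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# --- 1. Line with Gaussian noise: slope, intercept and sigma are all fit parameters.
x = np.linspace(0.0, 10.0, 200)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.5, size=x.size)

def nll_line(p):
    slope, intercept, log_sigma = p
    sigma = np.exp(log_sigma)  # fit log(sigma) so that sigma stays positive
    resid = y - (slope * x + intercept)
    # Gaussian negative log-likelihood, constant term dropped.
    return 0.5 * np.sum(resid**2) / sigma**2 + x.size * log_sigma

line_fit = minimize(nll_line, x0=[1.0, 0.5, 0.5], method='Nelder-Mead',
                    options={'maxiter': 2000})
slope, intercept = line_fit.x[0], line_fit.x[1]
sigma = np.exp(line_fit.x[2])

# --- 2. Binned exponential with Poisson noise.
edges = np.arange(0.0, 61.0, 1.0)

def expected(p):
    N0, half_life = p
    lam = np.log(2) / half_life
    # Rate N0*exp(-lam*t) integrated over each bin.
    return N0 / lam * (np.exp(-lam * edges[:-1]) - np.exp(-lam * edges[1:]))

counts = rng.poisson(expected([1000.0, 10.0]))

def nll_poisson(p):
    if p[0] <= 0 or p[1] <= 0:
        return np.inf  # keep the optimizer in the physical region
    mu = expected(p)
    # Poisson negative log-likelihood, dropping the log(counts!) term.
    return np.sum(mu - counts * np.log(mu))

exp_fit = minimize(nll_poisson, x0=[800.0, 8.0], method='Nelder-Mead')
N0_fit, half_life_fit = exp_fit.x

print(slope, intercept, sigma)
print(N0_fit, half_life_fit)
```

Both fits should return values close to the ones used to generate the data, which is the check to pass before moving on to our own data.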