# Processing of waveforms

**GOAL**
- count all events
- plot amplitude vs. time to discriminate between 1 phe signals ("real counts" and afterpulses) and 2 phe signals (crosstalk)
- evaluate the percentage of afterpulses and crosstalk events with respect to the total number of events

Note: the time axis has a physical minimum set by the width of a waveform

**Procedure**
- combine the csv files of waveforms and timestamps
- slice the waveform csv file into single waveforms
- find the minima (absolute and relative), plot them, count them (count only the single point of each absolute minimum) and save their timestamps and amplitudes
- use the amplitude value to tell whether a peak is noise (crosstalk, afterpulse)
- get amplitudes and timestamps of each minimum

*Afterpulse*: typically occurs about 0.5 $\mu$s after a real signal, with an amplitude almost equal to 1 phe; an event is considered an afterpulse of the preceding event if it follows it by less than 6 $\mu$s.  
*Crosstalk*: single peak of amplitude 2 phe
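
The two definitions above can be written as a simple classification rule. This is only a minimal sketch: it assumes amplitudes are already expressed in units of 1 phe and time differences in seconds, and the function name `classify_event` and the tolerance `amp_tol` are hypothetical choices, not values taken from the measurement.

```python
AFTERPULSE_WINDOW_S = 6e-6  # threshold quoted above: 6 microseconds


def classify_event(amplitude_phe, delta_t_s, amp_tol=0.4):
    """Label one peak from its amplitude (in phe) and the time since the previous peak (in s).

    amp_tol is an assumed tolerance around the nominal 1 phe and 2 phe amplitudes.
    """
    if abs(amplitude_phe - 2.0) <= amp_tol:
        return "crosstalk"
    if abs(amplitude_phe - 1.0) <= amp_tol:
        if delta_t_s < AFTERPULSE_WINDOW_S:
            return "afterpulse"
        return "real count"
    return "unclassified"


labels = [
    classify_event(1.0, 1e-3),     # 1 phe, far from the previous peak
    classify_event(0.95, 0.5e-6),  # about 1 phe, 0.5 us after the previous peak
    classify_event(2.05, 1e-3),    # 2 phe
]
```

The very first event has no preceding peak, so its time difference would need to be handled separately (for example by passing a large value).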


In [1]:
import pandas as pd
from scipy.signal import argrelextrema
    # Finds the relative extrema (here: the minima) of an array of sampled values
import numpy
import matplotlib

In [56]:
timestamp_table = pd.read_csv('HPKR00030_2cicli_OV3_time.csv', header = 0)
timestamp_table.rename(columns={'X: (s)':'Event', 'Y: (Hits)':'Delta t'}, inplace=True)
timestamp_table.head()

Unnamed: 0,Event,Delta t
0,0,0.0
1,1,0.015416
2,2,0.000355
3,3,0.004935
4,4,0.010274


In [57]:
N_of_events = len(timestamp_table) # Corresponds to the "FastFrame Count" written in the header of the wf files

In [58]:
wf_data_point = 0
line_counter = 0
n_line = -1
waveform_file = open('HPKR00030_2cicli_OV3_wf.csv') # Just opened to make automatic search of the header

In [59]:
# Routine to count number of lines of header

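while n_line == -1:
    line = waveform_file.readline()
    if line.startswith("Record Length"): wf_data_point = int(line.split(',')[-1])
    if line.startswith("TIME"): n_line = line_counter
    line_counter += 1
    if n_line == -1 and (line == "" or line_counter >= 100):
        # End of file reached, or no "TIME" line in the first 100 lines: stop instead of looping forever
        print("ERROR")
        break
waveform_file.close()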

In [6]:
wf_data_point

6250

In [7]:
n_line

9

In [61]:
waveform_table = pd.read_csv('HPKR00030_2cicli_OV3_wf.csv', header=n_line-1)

MemoryError: Unable to allocate 2.00 MiB for an array with shape (262144,) and data type int64
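
Loading the whole file at once can fail with a `MemoryError` like the one above. One possible workaround is to read one waveform at a time through the `chunksize` argument of `pd.read_csv`. The sketch below assumes that every waveform has exactly the same number of rows and that the rows of successive waveforms follow each other without separators; the helper name `iter_waveforms` and the small in-memory csv are made up for illustration.

```python
import io

import pandas as pd


def iter_waveforms(csv_source, header_line, points_per_waveform):
    """Yield one DataFrame per waveform instead of loading the whole file."""
    reader = pd.read_csv(csv_source, header=header_line, chunksize=points_per_waveform)
    for chunk in reader:
        # Each chunk holds the rows of one waveform; restart its index from 0
        yield chunk.reset_index(drop=True)


# Tiny synthetic stand-in for the oscilloscope file:
# 2 header lines, the column names, then 2 waveforms of 3 points each
demo_csv = io.StringIO(
    "Model,DEMO\n"
    "Record Length,3\n"
    "TIME,CH1\n"
    "0.0,-0.1\n"
    "1.0,-0.5\n"
    "2.0,-0.2\n"
    "0.0,-0.1\n"
    "1.0,-0.9\n"
    "2.0,-0.3\n"
)
demo_waveforms = list(iter_waveforms(demo_csv, header_line=2, points_per_waveform=3))
```

With the real file, the call would presumably be `iter_waveforms('HPKR00030_2cicli_OV3_wf.csv', n_line-1, wf_data_point)`, processing each waveform inside the loop rather than collecting them all in a list.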

In [None]:
waveform_table.head()

In [62]:
timestamp_table.at[0,'Timestamp'] = timestamp_table.iloc[0]['Delta t']

In [66]:
timestamp_table['Delta t']

0      0.000000
1      0.015416
2      0.000355
3      0.004935
4      0.010274
         ...   
995    0.017280
996    0.017200
997    0.017403
998    1.396270
999    1.322320
Name: Delta t, Length: 1000, dtype: float64

In [88]:
timestamp_table.at[1,'Timestamp']

1.0

In [89]:
for i in range(len(timestamp_table) - 1): # stop one row early, otherwise .at[i+1] appends an extra row
    timestamp_table.at[i+1,'Timestamp'] = timestamp_table.at[i,'Timestamp'] + timestamp_table.at[i,'Delta t']

In [94]:
timestamp_table.Timestamp.head(10)

0    0.000000
1    0.000000
2    0.015416
3    0.015771
4    0.020706
5    0.030981
6    0.035933
7    0.036720
8    0.043858
9    0.055211
Name: Timestamp, dtype: float64
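
The same column can be obtained without an explicit loop. The sketch below reproduces the recurrence used above, Timestamp[i+1] = Timestamp[i] + Delta t[i] with Timestamp[0] = 0, as a shifted cumulative sum; the helper name `cumulative_timestamps` is hypothetical, and the demo values are the first five rows of `Delta t` shown earlier.

```python
import pandas as pd


def cumulative_timestamps(delta_t):
    """Shifted cumulative sum: Timestamp[i+1] = Timestamp[i] + Delta t[i], Timestamp[0] = 0."""
    return delta_t.cumsum().shift(1, fill_value=0.0)


demo = pd.DataFrame({"Delta t": [0.0, 0.015416, 0.000355, 0.004935, 0.010274]})
demo["Timestamp"] = cumulative_timestamps(demo["Delta t"])
```

If `Delta t` is instead meant as the time elapsed since the previous event, a plain `delta_t.cumsum()` without the shift would be the natural choice; which convention applies depends on how the oscilloscope writes the file.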

In [74]:
timestamp_table.loc[[i],['Timestamp']] = 1
timestamp_table.loc[[i],['Timestamp']]

Unnamed: 0,Timestamp
999,1.0


In [85]:
for i in range(len(timestamp_table)):
    # timestamp_table.loc[i+1,['Timestamp']] = timestamp_table.loc[i,['Timestamp']] + timestamp_table.loc[i,['Delta t']]
    print(timestamp_table.at[i,'Timestamp'] + timestamp_table.at[i,'Delta t'])
    if i == 10: break

0.0
1.0154156
nan
nan
nan
nan
nan
nan
nan
nan
nan


In [53]:
timestamp_table.Timestamp

0       0.0
1       NaN
2       NaN
3       NaN
4       NaN
       ... 
998     NaN
999     NaN
1000    NaN
1001    NaN
1002    NaN
Name: Timestamp, Length: 1003, dtype: float64

In [29]:
# Create function to analyze the data
# As an alternative, you can create a function that analyzes a single waveform, then loop it over all of them
# Inside this loop one can insert the whole analysis
import matplotlib.pyplot as plt

def analysis(timestamp_table, waveform_table, wf_datapoint):
    for n in range(len(timestamp_table)):
        snippet = waveform_table.loc[wf_datapoint*n:wf_datapoint*(n+1)-1].copy()
        # This is a slice of the wf dataframe for a single event (.loc slicing includes the last label, hence the -1)
        # Notice that it doesn't even need the timestamps

        minimum_list = argrelextrema(snippet.CH1.values, numpy.less_equal, order = 50)[0]
            # the less_equal operator of numpy is used as comparison
            # order is the number of points on each side used for the comparison window
            # [0] because argrelextrema returns a tuple of index arrays (one per axis) and we only want the first

        minima = snippet.iloc[minimum_list] # Rows of this snippet that are minima
        snippet['min'] = minima.CH1 # NaN unless the row is a minimum, then it holds its amplitude
        snippet['deltat'] = minima.TIME # Time of each minimum within the waveform (assumed intent of this column)

        plt.scatter(snippet.TIME, snippet.CH1, marker='.')
        plt.scatter(snippet.TIME, snippet['min'], color='darkred')
        plt.show()

## Tips and tricks

You have to find an automated way to find the "good" absolute minimum.
How?

1. Estimate the baseline using only the pre-trigger amplitudes.
2. Raise an error or quarantine the snippet if the number of minima is too large.
3. Evaluate the derivative of the rising edge of a peak.
4. The dark count rate is the total number of events divided by the total time.
5. Flag pairs of peaks whose distance is too small (less than the width of the search window).
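
Tips 1, 2 and 5 can be combined in a single helper. The following is only a sketch under several assumptions: pulses are negative-going, `threshold` is a positive depth below the baseline, peaks closer than `order` samples are merged into the deepest one, and the name `find_good_minima` and all default values are hypothetical.

```python
import numpy as np
from scipy.signal import argrelextrema


def find_good_minima(ch1, n_pretrigger, threshold, order=50, max_minima=3):
    """Return indices of minima deeper than `threshold` below the baseline.

    Returns None when the waveform has more than `max_minima` such minima,
    so that the caller can quarantine it.
    """
    ch1 = np.asarray(ch1, dtype=float)
    baseline = ch1[:n_pretrigger].mean()  # tip 1: pre-trigger samples only
    signal = ch1 - baseline
    candidates = argrelextrema(signal, np.less_equal, order=order)[0]
    candidates = candidates[signal[candidates] < -threshold]

    # less_equal flags every sample of a flat bottom, and nearby peaks can be
    # closer than the window: keep one index per group of close candidates (tip 5)
    kept = []
    for idx in candidates:
        if kept and idx - kept[-1] < order:
            if signal[idx] < signal[kept[-1]]:
                kept[-1] = idx  # keep the deeper of the two
        else:
            kept.append(idx)

    if len(kept) > max_minima:  # tip 2: too many minima
        return None
    return np.array(kept, dtype=int)


# Synthetic waveform: flat baseline with two well separated negative pulses
demo_wf = np.zeros(300)
demo_wf[149:152] = [-0.5, -1.0, -0.5]
demo_wf[250] = -1.0
demo_minima = find_good_minima(demo_wf, n_pretrigger=50, threshold=0.3, order=20)
```

Whether close peaks should be merged, as done here, or the whole waveform rejected is a design choice that the notes above leave open.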

Options to follow:

1. Throw away all noise: if a waveform is not good (it has more than one minimum beyond the threshold), discard it completely; the discarded waveforms must be counted to estimate their fraction of the total events.
2. Check for saturated events: a waveform is saturated if it has points beyond the threshold and also a certain number of points (say, 10) within a very narrow range of values (all equal).
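
The saturation check in option 2 could look like the sketch below. It assumes negative-going pulses (so "beyond the threshold" means below `threshold_level`) and requires the equal samples to be consecutive; the name `is_saturated` and the parameters `n_flat` and `flat_tol` are hypothetical.

```python
import numpy as np


def is_saturated(ch1, threshold_level, n_flat=10, flat_tol=0.0):
    """True if at least `n_flat` consecutive samples below `threshold_level` are (nearly) equal."""
    ch1 = np.asarray(ch1, dtype=float)
    beyond = ch1 < threshold_level
    equal_pairs = 0  # number of consecutive neighbouring pairs that are equal and beyond threshold
    for i in range(1, len(ch1)):
        if beyond[i] and beyond[i - 1] and abs(ch1[i] - ch1[i - 1]) <= flat_tol:
            equal_pairs += 1
            if equal_pairs + 1 >= n_flat:  # k equal pairs in a row span k + 1 samples
                return True
        else:
            equal_pairs = 0
    return False


# Clipped pulse: 15 identical samples below the threshold
clipped = np.zeros(100)
clipped[40:55] = -2.0

# Normal pulse: beyond the threshold but never flat
normal = np.zeros(100)
normal[40:45] = [-1.2, -1.8, -2.0, -1.7, -1.1]

demo_flags = (is_saturated(clipped, -1.0), is_saturated(normal, -1.0))
```

With real data a small non-zero `flat_tol` may be needed, since digitised samples in a clipped region are not guaranteed to be exactly identical.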

Final goals:

1. Create 2D plot with time differences (x) and amplitude (y): time differences are the differences between timestamps of any two successive "good" peaks
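
A possible way to build the inputs of this plot is sketched below, assuming the timestamps and amplitudes of the "good" peaks have already been collected into two arrays of equal length. The function names `time_differences` and `plot_dt_vs_amplitude` are hypothetical, and the demo values are invented for illustration only.

```python
import numpy as np


def time_differences(timestamps, amplitudes):
    """Pair each good peak (from the second one on) with the time since the previous good peak."""
    timestamps = np.asarray(timestamps, dtype=float)
    amplitudes = np.asarray(amplitudes, dtype=float)
    order = np.argsort(timestamps)  # make sure the peaks are in time order
    dt = np.diff(timestamps[order])
    return dt, amplitudes[order][1:]


def plot_dt_vs_amplitude(dt, amp):
    """Scatter plot with time differences on x and amplitudes on y."""
    import matplotlib.pyplot as plt  # imported here so the computation above needs no plotting library

    plt.scatter(dt, amp, marker='.')
    plt.xscale('log')  # assumed choice: time differences span several orders of magnitude
    plt.xlabel('Time since previous peak (s)')
    plt.ylabel('Amplitude')
    plt.show()


# Invented example: four peaks, the second one 0.5 us after the first
demo_dt, demo_amp = time_differences(
    [0.0, 0.5e-6, 0.015, 0.020],
    [1.0, 0.9, 2.0, 1.0],
)
```

Calling `plot_dt_vs_amplitude(demo_dt, demo_amp)` then draws the scatter plot.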

In [48]:
range(10)

range(0, 10)