This exercise consists of analyzing a dataset containing timing information from a series of Time-to-Digital Converters (TDCs) implemented in a pair of FPGAs. Each measurement (i.e. each row of the input file) consists of a flag that specifies the type of message ('HEAD', which in this case is always 1), two addresses of the TDC providing the signal ('FPGA' and 'TDC_CHANNEL'), and the timing information ('ORBIT_CNT', 'BX_COUNTER', and 'TDC_MEAS'). Each TDC count corresponds to 25/30 ns, whereas a unit of BX_COUNTER corresponds to 25 ns, and the ORBIT_CNT is increased every 'x' BX_COUNTER. This makes it possible to store the time in a way similar to hours, minutes and seconds.
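
Read like a clock, the three counters combine into a single time stamp. As a sketch, with $x$ the number of BX_COUNTER units per orbit (still to be determined at this point):

$$
t\,[\mathrm{ns}] = 25\left(x \cdot \mathrm{ORBIT\_CNT} + \mathrm{BX\_COUNTER} + \frac{\mathrm{TDC\_MEAS}}{30}\right)
$$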

In [1]:
# If you haven't downloaded it yet, please get the data file with wget
#!wget https://www.dropbox.com/s/xvjzaxzz3ysphme/data_000637.txt -P ./data/

1\. Create a Pandas DataFrame reading N rows of the `data/data_000637.txt` dataset. Choose N to be smaller than or equal to the maximum number of rows and larger than 10k (check the documentation).

In [2]:
import pandas as pd
import numpy as np

N = 100000
f = "./data/data_000637.txt"
df = pd.read_csv(f, nrows=N)
df

Unnamed: 0,HEAD,FPGA,TDC_CHANNEL,ORBIT_CNT,BX_COUNTER,TDC_MEAS
0,1,0,123,3869200167,2374,26
1,1,0,124,3869200167,2374,27
2,1,0,63,3869200167,2553,28
3,1,0,64,3869200167,2558,19
4,1,0,64,3869200167,2760,25
...,...,...,...,...,...,...
99995,1,0,64,3869201161,2378,29
99996,1,0,70,3869201161,2472,26
99997,1,0,58,3869201161,2558,0
99998,1,0,57,3869201161,2561,23
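
The requirement that N not exceed the number of rows can be checked before reading. The sketch below counts the data rows first and clamps N accordingly. It uses a small in-memory CSV (`csv_text`, made up for illustration) in place of the real file, and the names `total_rows`, `requested` and `n_rows` are hypothetical.

```python
import io
import pandas as pd

# Hypothetical stand-in for the data file: a tiny CSV with the same columns.
header = "HEAD,FPGA,TDC_CHANNEL,ORBIT_CNT,BX_COUNTER,TDC_MEAS\n"
rows = "".join(f"1,0,{i % 128 + 1},{1000 + i // 10},{i % 100},{i % 30}\n" for i in range(50))
csv_text = header + rows

# Count the data rows: total number of lines minus the header line.
total_rows = sum(1 for _ in io.StringIO(csv_text)) - 1

# Clamp the requested N to what the file actually holds.
requested = 100000
n_rows = min(requested, total_rows)

toy = pd.read_csv(io.StringIO(csv_text), nrows=n_rows)
print(total_rows, len(toy))
```

With the real file, the same count can be obtained by iterating over `open(f)` instead of the `io.StringIO` object.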


2\. Estimate the number of BX in an ORBIT (the value 'x').

In [3]:
y = df['BX_COUNTER']/df['ORBIT_CNT']
sum_up = y.sum()
x = sum_up/N
print("The number of BX in an ORBIT is ", x)

The number of BX in an ORBIT is  4.6338946346658543e-07
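
The value printed above is an average of BX_COUNTER/ORBIT_CNT, which is a dimensionless ratio rather than a count of bunch crossings. A possible alternative, assuming BX_COUNTER starts at 0 and wraps around each time ORBIT_CNT increases, is to take the largest observed BX_COUNTER plus one. The sketch below illustrates this on made-up rows; `toy` and `x_est` are hypothetical names, and the numbers are not taken from the dataset.

```python
import pandas as pd

# Made-up rows: BX_COUNTER falls back towards 0 whenever ORBIT_CNT increases.
toy = pd.DataFrame({
    "ORBIT_CNT":  [10, 10, 10, 11, 11, 12],
    "BX_COUNTER": [5, 42, 99, 0, 47, 63],
})

# If the counter runs from 0 to x - 1, then x is (at least) max + 1.
x_est = int(toy["BX_COUNTER"].max()) + 1
print("Estimated BX per orbit:", x_est)
```

This is a lower bound: it only equals x if the sample contains at least one hit in the last bunch crossing of some orbit, which becomes more likely as N grows.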


3\. Find out the duration of the data taking in hours, minutes and seconds. You can either make an estimate based on the fraction of the measurements (rows) you read, or perform this check precisely by reading the whole dataset.

In [4]:
dv = pd.read_csv(f)
count = dv['BX_COUNTER'].sum()
count = count * 25
count1 = dv['TDC_MEAS'].sum()
count1 = count1 * (25/30)
print(count1)
count += count1
seg = count/1000000000
# Split the total into whole hours, whole minutes and leftover seconds
minu = int(seg // 60)
seg = seg % 60
hour = minu // 60
minu = minu % 60
print("Duration of the data taking is ", hour, " hours, ", minu, " minutes and ", seg, " seconds.")

14552702.5
Duration of the data taking is  0  hours,  0  minutes and  58.3775171775  seconds.
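
Summing BX_COUNTER and TDC_MEAS over all rows adds up time stamps rather than measuring an elapsed interval, so the figure above is probably not the run duration. One alternative is to convert the first and last time stamps to ns and split their difference with `divmod`. The sketch below does this on made-up rows; `bx_per_orbit` is estimated from the toy data itself and `split_hms` is a hypothetical helper.

```python
import pandas as pd

# Made-up rows standing in for the dataset.
toy = pd.DataFrame({
    "ORBIT_CNT":  [1000, 1000, 1001, 1003],
    "BX_COUNTER": [10, 99, 0, 50],
    "TDC_MEAS":   [0, 15, 29, 6],
})

# Assumption: BX_COUNTER runs from 0 to x - 1, so x is max + 1 (100 here).
bx_per_orbit = int(toy["BX_COUNTER"].max()) + 1

# Time stamp of every row in ns: 25 ns per BX, 25/30 ns per TDC count.
t_ns = (toy["ORBIT_CNT"] * bx_per_orbit + toy["BX_COUNTER"]) * 25 + toy["TDC_MEAS"] * 25 / 30
duration_ns = t_ns.max() - t_ns.min()

def split_hms(total_seconds):
    """Split a duration in seconds into (hours, minutes, seconds)."""
    hours, rest = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return int(hours), int(minutes), seconds

hours, minutes, seconds = split_hms(duration_ns * 1e-9)
print(hours, "hours,", minutes, "minutes and", seconds, "seconds")
```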


4\. Create a new column with the absolute time in ns (as a combination of the other three columns with timing information) since the beginning of the data acquisition.

In [5]:
cont = 0
lista = []
for i in range(len(df)):
    cont += df.iloc[i, 4]*25
    cont += df.iloc[i, 5]*(25/30)
    lista.append(cont)
df['ABS_TIME'] = lista
df

Unnamed: 0,HEAD,FPGA,TDC_CHANNEL,ORBIT_CNT,BX_COUNTER,TDC_MEAS,ABS_TIME
0,1,0,123,3869200167,2374,26,5.937167e+04
1,1,0,124,3869200167,2374,27,1.187442e+05
2,1,0,63,3869200167,2553,28,1.825925e+05
3,1,0,64,3869200167,2558,19,2.465583e+05
4,1,0,64,3869200167,2760,25,3.155792e+05
...,...,...,...,...,...,...,...
99995,1,0,64,3869201161,2378,29,4.483226e+09
99996,1,0,70,3869201161,2472,26,4.483288e+09
99997,1,0,58,3869201161,2558,0,4.483352e+09
99998,1,0,57,3869201161,2561,23,4.483416e+09
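
The loop above accumulates BX_COUNTER and TDC_MEAS from row to row and does not use ORBIT_CNT, so ABS_TIME grows with the number of rows rather than with the recorded time. A vectorised sketch that combines all three counters is shown below on made-up rows. Treating the first row as the start of the acquisition is an assumption, as is the name `ABS_TIME_NS`.

```python
import pandas as pd

# Made-up rows standing in for the dataset.
toy = pd.DataFrame({
    "ORBIT_CNT":  [1000, 1000, 1001],
    "BX_COUNTER": [10, 99, 0],
    "TDC_MEAS":   [0, 12, 6],
})

# Assumption: number of BX per orbit estimated as max(BX_COUNTER) + 1.
bx_per_orbit = int(toy["BX_COUNTER"].max()) + 1

# Combine the three counters into ns, column-wise and without a Python loop.
t_ns = (toy["ORBIT_CNT"] * bx_per_orbit + toy["BX_COUNTER"]) * 25 + toy["TDC_MEAS"] * 25 / 30

# Express the time relative to the first measurement.
toy["ABS_TIME_NS"] = t_ns - t_ns.iloc[0]
print(toy)
```

Operating on whole columns also avoids the per-row `iloc` lookups, which become slow for large N.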


5\. Use the `.groupby()` method to find out the noisy channels, i.e. the TDC channels with most counts (print to screen the top 3 and the corresponding counts).

In [65]:
dv['HITS'] = dv.groupby('TDC_CHANNEL')['TDC_CHANNEL'].agg(['count'])
lehen = 0
bi = 0
hiru = 0
sum1 = 0
sum2 = 0
sum3 = 0
ind = 0
for j in range(3):
    ind = dv['HITS'].idxmax()
    if j==0:
        lehen = dv.at[ind, 'TDC_CHANNEL']
        sum1 = dv.at[ind, 'HITS']
    elif j==1:
        bi = dv.at[ind, 'TDC_CHANNEL']
        sum2 = dv.at[ind, 'HITS']
    elif j==2:
        hiru = dv.at[ind, 'TDC_CHANNEL']
        sum3 = dv.at[ind, 'HITS']
    dv.at[ind, 'HITS'] = 0
print("These are the top 3 channels")
print(lehen, " with ", sum1, " hits.")
print(bi, " with ", sum2, " hits.")
print(hiru, " with ", sum3, " hits.")

These are the top 3 channels
56  with  108059.0  hits.
60  with  66020.0  hits.
8  with  64642.0  hits.
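
In the cell above, the grouped counts are indexed by channel number, so assigning them to a column aligns them with the row labels of `dv`. `idxmax()` then appears to return a channel number, while `dv.at[ind, 'TDC_CHANNEL']` reads whichever channel happens to sit in that row, so the printed channel labels may not match the counts. A shorter sketch that keeps the counts indexed by channel is shown below on made-up data (`toy`, `hits` and `top3` are hypothetical names).

```python
import pandas as pd

# Made-up hits: channel 7 fires 4 times, channel 3 three times, and so on.
toy = pd.DataFrame({"TDC_CHANNEL": [7, 7, 7, 7, 3, 3, 3, 9, 9, 1]})

# Number of rows per channel; the index of `hits` is the channel number.
hits = toy.groupby("TDC_CHANNEL").size()

# Keep the three largest counts together with their channel labels.
top3 = hits.nlargest(3)
for channel, count in top3.items():
    print(channel, "with", count, "hits.")
```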


6\. Count the number of non-empty orbits (i.e. the number of orbits with at least one hit). Count also the number of unique orbits with at least one measurement from TDC_CHANNEL=139.

In [75]:
orbits = dv['ORBIT_CNT'].value_counts()
print(orbits)
unique_orbits = dv.loc[dv['TDC_CHANNEL'] == 139]
print(unique_orbits['ORBIT_CNT'].nunique(), 'of unique orbits on 139 TDC_CHANNEL')

3869208772    351
3869207118    337
3869209661    324
3869206967    322
3869206506    305
             ... 
3869204462      3
3869203571      2
3869205800      2
3869206180      1
3869204142      1
Name: ORBIT_CNT, Length: 11001, dtype: int64
10976 of unique orbits on 139 TDC_CHANNEL
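
The number of non-empty orbits is the `Length` of the value counts printed above. It can also be obtained directly with `nunique()`, as in the following sketch on made-up rows (`n_orbits` and `n_orbits_139` are hypothetical names).

```python
import pandas as pd

# Made-up rows: three distinct orbits, two of which contain channel 139.
toy = pd.DataFrame({
    "ORBIT_CNT":   [100, 100, 101, 103, 103, 103],
    "TDC_CHANNEL": [139, 5, 7, 139, 139, 8],
})

# Every orbit that appears in the table has at least one hit by construction.
n_orbits = toy["ORBIT_CNT"].nunique()

# Restrict to channel 139 first, then count the distinct orbits.
n_orbits_139 = toy.loc[toy["TDC_CHANNEL"] == 139, "ORBIT_CNT"].nunique()

print(n_orbits, "non-empty orbits")
print(n_orbits_139, "unique orbits with a hit on TDC_CHANNEL 139")
```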


7\. **Optional:** Make two occupancy plots (one for each FPGA), i.e. plot the number of counts per TDC channel
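
One possible approach, sketched on made-up data, is to count the rows per (FPGA, TDC_CHANNEL) pair and draw one bar chart per FPGA. The helper `plot_occupancy` is hypothetical and assumes matplotlib is installed; it is defined but not called here, so the counting part runs without it.

```python
import pandas as pd

# Made-up hits on two FPGAs.
toy = pd.DataFrame({
    "FPGA":        [0, 0, 0, 1, 1, 1, 1],
    "TDC_CHANNEL": [1, 1, 2, 1, 3, 3, 3],
})

# Number of hits per channel, separately for each FPGA.
occupancy = toy.groupby(["FPGA", "TDC_CHANNEL"]).size()
print(occupancy)

def plot_occupancy(occupancy):
    """Draw one bar chart per FPGA (matplotlib is imported lazily)."""
    import matplotlib.pyplot as plt
    fpgas = occupancy.index.get_level_values("FPGA").unique()
    fig, axes = plt.subplots(1, len(fpgas), figsize=(10, 4), squeeze=False)
    for ax, fpga in zip(axes[0], fpgas):
        counts = occupancy.loc[fpga]  # Series indexed by TDC_CHANNEL
        ax.bar(counts.index, counts.values)
        ax.set_title(f"FPGA {fpga}")
        ax.set_xlabel("TDC_CHANNEL")
        ax.set_ylabel("counts")
    fig.tight_layout()
    return fig
```

Calling `plot_occupancy(occupancy)` in a notebook cell should display the two panels side by side.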