This exercise consists of analyzing a dataset containing timing information from a series of Time-to-Digital Converters (TDCs) implemented in a pair of FPGAs. Each measurement (i.e. each row of the input file) consists of a flag that specifies the type of message ('HEAD', which in this case is always 1), two addresses of the TDC providing the signal ('FPGA' and 'TDC_CHANNEL'), and the timing information ('ORBIT_CNT', 'BX_COUNTER', and 'TDC_MEAS'). Each TDC count corresponds to 25/30 ns, whereas a unit of BX_COUNTER corresponds to 25 ns, and ORBIT_CNT is increased every 'x' BX_COUNTER units. This allows the time to be stored in a way similar to hours, minutes and seconds.
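
Read this way, the three counters combine like hours, minutes and seconds. As a sketch, writing $x$ for the still unknown number of BX per orbit and assuming each counter restarts from 0 when the next one increments, the time of a hit in ns is:

$$
t = 25\,\text{ns} \times \left( x \cdot \text{ORBIT\_CNT} + \text{BX\_COUNTER} + \frac{\text{TDC\_MEAS}}{30} \right)
$$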

In [None]:
# If you haven't downloaded it yet, please get the data file with wget
#!wget https://www.dropbox.com/s/xvjzaxzz3ysphme/data_000637.txt -P ./data/

1\. Create a Pandas DataFrame reading N rows of the `data/data_000637.txt` dataset. Choose N to be smaller than or equal to the maximum number of rows and larger than 10k (check the documentation).

In [5]:
import pandas as pd

nrows = 20000
df = pd.read_csv("data_000637.txt", nrows = nrows)
print(df[:10])

   HEAD  FPGA  TDC_CHANNEL   ORBIT_CNT  BX_COUNTER  TDC_MEAS
0     1     0          123  3869200167        2374        26
1     1     0          124  3869200167        2374        27
2     1     0           63  3869200167        2553        28
3     1     0           64  3869200167        2558        19
4     1     0           64  3869200167        2760        25
5     1     0           63  3869200167        2762         4
6     1     0           61  3869200167        2772        14
7     1     0          139  3869200167        2776         0
8     1     0           62  3869200167        2774        21
9     1     0           60  3869200167        2788         7


2\. Estimate the number of BX in an ORBIT (the value 'x').

In [8]:
# Per orbit: the sum of the BX_COUNTER values of its hits ("TOTAL BX")
# and the number of hits, i.e. rows, in that orbit ("Different BX").
per_orbit = df.groupby("ORBIT_CNT")["BX_COUNTER"].agg(["sum", "count"])

print("\n\nFirst 10 orbits:\n")
for orbit, row in per_orbit.head(10).iterrows():
    print("Orbit ", orbit, "   TOTAL BX: ", row["sum"], "   Different BX: ", row["count"])

# Largest per-orbit sum of BX_COUNTER values
maxBX = per_orbit["sum"].max()
print("\n\nNumber of BX of the orbit that has the highest BX: ", maxBX)



First 10 orbits:

Orbit  3869200167    TOTAL BX:  124133    Different BX:  43
Orbit  3869200168    TOTAL BX:  97201    Different BX:  85
Orbit  3869200169    TOTAL BX:  144343    Different BX:  127
Orbit  3869200170    TOTAL BX:  217462    Different BX:  98
Orbit  3869200171    TOTAL BX:  228822    Different BX:  109
Orbit  3869200172    TOTAL BX:  162033    Different BX:  89
Orbit  3869200173    TOTAL BX:  153284    Different BX:  88
Orbit  3869200174    TOTAL BX:  211070    Different BX:  128
Orbit  3869200175    TOTAL BX:  312015    Different BX:  128
Orbit  3869200176    TOTAL BX:  148031    Different BX:  51


Number of BX of the orbit that has the highest BX:  347968
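
The totals printed above are sums of the BX_COUNTER values of the hits in each orbit, so they grow with the number of hits and do not directly give 'x'. One possible estimate, assuming BX_COUNTER counts from 0 to x - 1 and wraps when ORBIT_CNT increments, is the largest observed BX_COUNTER plus one. The sketch below uses the ten rows printed in exercise 1 as a sample; the helper name `estimate_bx_per_orbit` is hypothetical.

```python
import pandas as pd

# Sample: the BX_COUNTER values of the ten rows printed in exercise 1.
# In the notebook, pass the DataFrame `df` read from the data file instead.
sample = pd.DataFrame({
    "ORBIT_CNT": [3869200167] * 10,
    "BX_COUNTER": [2374, 2374, 2553, 2558, 2760, 2762, 2772, 2776, 2774, 2788],
})

def estimate_bx_per_orbit(frame):
    """Lower-bound estimate of x, assuming BX_COUNTER counts from 0 to x - 1."""
    return int(frame["BX_COUNTER"].max()) + 1

x_est = estimate_bx_per_orbit(sample)
print("Estimated BX per orbit (lower bound):", x_est)
```

With only ten rows the estimate is far too low; it is a lower bound that improves as more rows are read, because it needs at least one hit near the end of an orbit.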


3\. Find out the duration of the data taking in hours, minutes and seconds. You can either make an estimate based on the fraction of the measurements (rows) you read, or perform this check precisely by reading the whole dataset.

In [10]:

import pandas as pd

data = pd.read_csv('data_000637.txt', sep=",", nrows=15000)

# As in exercise 2: sum of the BX_COUNTER values of the hits in each orbit
bx_sum = data.groupby("ORBIT_CNT")["BX_COUNTER"].sum()

print("First 10 orbits:\n")
for orbit, total in bx_sum.head(10).items():
    print("ORBIT_CNT:", orbit, "has", total, "BX")

First 10 orbits:

ORBIT_CNT: 3869200167 has 124133 BX
ORBIT_CNT: 3869200168 has 97201 BX
ORBIT_CNT: 3869200169 has 144343 BX
ORBIT_CNT: 3869200170 has 217462 BX
ORBIT_CNT: 3869200171 has 228822 BX
ORBIT_CNT: 3869200172 has 162033 BX
ORBIT_CNT: 3869200173 has 153284 BX
ORBIT_CNT: 3869200174 has 211070 BX
ORBIT_CNT: 3869200175 has 312015 BX
ORBIT_CNT: 3869200176 has 148031 BX
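
The cell above repeats the per-orbit sums from exercise 2 and does not yet give a duration. A minimal sketch of one way to get it: convert every hit to ns, take the difference between the latest and the earliest hit, and split it into hours, minutes and seconds. The function name `duration_hms` and the parameter `bx_per_orbit` (the value 'x') are hypothetical, and the toy DataFrame uses made-up numbers with x = 4 purely for illustration.

```python
import pandas as pd

BX_NS = 25.0            # one BX_COUNTER unit in ns (from the exercise text)
TDC_NS = 25.0 / 30.0    # one TDC_MEAS unit in ns (from the exercise text)

def duration_hms(frame, bx_per_orbit):
    """Return (hours, minutes, seconds) between the earliest and latest hit.

    bx_per_orbit is the value 'x' from exercise 2; it is passed in because
    this sketch does not assume a particular value.
    """
    # Subtract the first orbit before multiplying to keep the numbers small.
    orbit0 = frame["ORBIT_CNT"].min()
    t_ns = ((frame["ORBIT_CNT"] - orbit0) * bx_per_orbit * BX_NS
            + frame["BX_COUNTER"] * BX_NS
            + frame["TDC_MEAS"] * TDC_NS)
    total_s = float(t_ns.max() - t_ns.min()) * 1e-9
    hours, rest = divmod(total_s, 3600)
    minutes, seconds = divmod(rest, 60)
    return int(hours), int(minutes), seconds

# Toy data (made-up values, x = 4) just to exercise the function.
toy = pd.DataFrame({
    "ORBIT_CNT": [10, 12],
    "BX_COUNTER": [1, 3],
    "TDC_MEAS": [0, 15],
})
h, m, s = duration_hms(toy, bx_per_orbit=4)
print(h, "h", m, "min", s, "s")
```

Applied to a DataFrame holding only the first N rows, this gives the duration of those N rows; scaling by the ratio of total rows to N gives the rough estimate the exercise mentions.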


4\. Create a new column with the absolute time in ns (as a combination of the other three columns with timing information) since the beginning of the data acquisition.
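
As a sketch of this step, the three timing columns can be combined in one vectorized expression and shifted so that the first row is at 0 ns. The column name `ABS_TIME_NS`, the helper `add_abs_time_ns` and the value `X_PLACEHOLDER` are hypothetical; replace the placeholder with your own estimate of 'x' from exercise 2. The sample is the first three rows printed in exercise 1, which share one orbit, so the placeholder does not affect the result here.

```python
import pandas as pd

X_PLACEHOLDER = 4000  # hypothetical value of 'x'; use your estimate from exercise 2

def add_abs_time_ns(frame, bx_per_orbit):
    """Return a copy of frame with an ABS_TIME_NS column (ns since the first row)."""
    # Work relative to the first orbit so the floats stay small and precise.
    orbit_offset = frame["ORBIT_CNT"] - frame["ORBIT_CNT"].iloc[0]
    t_ns = (orbit_offset * bx_per_orbit * 25.0
            + frame["BX_COUNTER"] * 25.0
            + frame["TDC_MEAS"] * 25.0 / 30.0)
    out = frame.copy()
    out["ABS_TIME_NS"] = t_ns - t_ns.iloc[0]
    return out

# Sample: timing columns of the first three rows printed in exercise 1.
sample = pd.DataFrame({
    "ORBIT_CNT": [3869200167, 3869200167, 3869200167],
    "BX_COUNTER": [2374, 2374, 2553],
    "TDC_MEAS": [26, 27, 28],
})
timed = add_abs_time_ns(sample, X_PLACEHOLDER)
print(timed)
```

Taking the first row as the origin assumes the file is ordered in time; subtracting `t_ns.min()` instead would avoid that assumption.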

5\. Use the `.groupby()` method to find the noisy channels, i.e. the TDC channels with the most counts (print to screen the top 3 and the corresponding counts).
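
A minimal sketch of this step, using the address columns of the ten rows printed in exercise 1 as a sample. Grouping by both 'FPGA' and 'TDC_CHANNEL' is a choice made here, on the assumption that the same channel number on the two FPGAs refers to different TDCs; group by 'TDC_CHANNEL' alone to merge them.

```python
import pandas as pd

# Sample: FPGA and TDC_CHANNEL of the ten rows printed in exercise 1.
# In the notebook, use the DataFrame read from the data file instead.
sample = pd.DataFrame({
    "FPGA": [0] * 10,
    "TDC_CHANNEL": [123, 124, 63, 64, 64, 63, 61, 139, 62, 60],
})

# Count rows per (FPGA, TDC_CHANNEL) pair, then keep the three largest counts.
counts = sample.groupby(["FPGA", "TDC_CHANNEL"]).size()
top3 = counts.sort_values(ascending=False).head(3)
print(top3)
```

In this small sample channels 63 and 64 have two hits each and every other channel has one, so the third entry is an arbitrary pick among the ties.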

6\. Count the number of non-empty orbits (i.e. the number of orbits with at least one hit). Count also the number of unique orbits with at least one measurement from TDC_CHANNEL=139.
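
Since every row is a hit, every orbit that appears in the file is non-empty, so both counts reduce to counting distinct 'ORBIT_CNT' values. The sketch below uses a small toy DataFrame with made-up values.

```python
import pandas as pd

# Toy data (made-up values): three distinct orbits, two of them with channel 139.
toy = pd.DataFrame({
    "ORBIT_CNT":   [100, 100, 101, 103, 103, 103],
    "TDC_CHANNEL": [139,  60,  61, 139, 139,  62],
})

# Orbits with at least one hit.
n_orbits = toy["ORBIT_CNT"].nunique()

# Orbits with at least one hit from TDC_CHANNEL 139.
n_orbits_139 = toy.loc[toy["TDC_CHANNEL"] == 139, "ORBIT_CNT"].nunique()

print("Non-empty orbits:", n_orbits)
print("Orbits with a hit on channel 139:", n_orbits_139)
```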

7\. **Optional:** Make two occupancy plots (one for each FPGA), i.e. plot the number of counts per TDC channel.
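
One possible way to draw the occupancy plots, assuming matplotlib is available: count the hits per ('FPGA', 'TDC_CHANNEL') pair and draw one bar chart per FPGA. The helper name `plot_occupancy` is hypothetical and the toy DataFrame contains made-up values.

```python
import matplotlib.pyplot as plt
import pandas as pd

def plot_occupancy(frame):
    """Draw one bar chart of hit counts per TDC channel for each FPGA."""
    counts = frame.groupby(["FPGA", "TDC_CHANNEL"]).size()
    fpgas = counts.index.get_level_values("FPGA").unique()
    fig, axes = plt.subplots(1, len(fpgas), figsize=(6 * len(fpgas), 4),
                             squeeze=False)
    for ax, fpga in zip(axes[0], fpgas):
        per_channel = counts.loc[fpga]  # Series indexed by TDC_CHANNEL
        ax.bar(per_channel.index, per_channel.values)
        ax.set_title(f"FPGA {fpga}")
        ax.set_xlabel("TDC_CHANNEL")
        ax.set_ylabel("Counts")
    fig.tight_layout()
    return fig, axes[0]

# Toy data (made-up values) with hits on both FPGAs.
toy = pd.DataFrame({
    "FPGA":        [0, 0, 0, 1, 1],
    "TDC_CHANNEL": [63, 63, 64, 1, 2],
})
fig, axes = plot_occupancy(toy)
```

In a notebook the figure is displayed when the cell ends; in a script, call `plt.show()` or `fig.savefig(...)` afterwards.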