## Pandas analysis

This exercise analyzes a dataset containing timing information from a series of Time-to-Digital Converters (TDCs) implemented in a couple of FPGAs. Each measurement (i.e. each row of the input file) consists of a flag that specifies the type of message ('HEAD', which in this case is always 1), two addresses identifying the TDC that provided the signal ('FPGA' and 'TDC_CHANNEL'), and the timing information ('ORBIT_CNT', 'BX_COUNTER', and 'TDC_MEAS'). Each TDC count corresponds to 25/30 ns, whereas a unit of BX_COUNTER corresponds to 25 ns, and ORBIT_CNT is incremented every 'x' BX_COUNTER units. Time is thus stored in a way similar to hours, minutes and seconds.
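The unit conversion described above can be sketched as a small helper. This is only an illustration: the function name `absolute_time_ns` and the parameter `bx_per_orbit` are hypothetical, and `bx_per_orbit` stands for the unknown 'x', which has to be determined from the data rather than assumed.

```python
def absolute_time_ns(orbit_cnt, bx_counter, tdc_meas, bx_per_orbit):
    """Combine the three counters into a single time in nanoseconds.

    bx_per_orbit is the unknown 'x' from the text: it is passed in
    explicitly because its value must be measured from the dataset.
    """
    bx_ns = 25.0           # one BX_COUNTER unit lasts 25 ns (from the text)
    tdc_ns = 25.0 / 30.0   # one TDC count lasts 25/30 ns (from the text)
    # Orbits are first expressed in BX units, then everything is converted to ns.
    return (orbit_cnt * bx_per_orbit + bx_counter) * bx_ns + tdc_meas * tdc_ns
```

Because the function uses only arithmetic operators, it should also work element-wise when the arguments are pandas columns instead of plain numbers.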

In [2]:
# If you didn't download it yet, please get the relevant file now!
!wget https://www.dropbox.com/s/xvjzaxzz3ysphme/data_000637.txt -P ~/data/


--2022-01-14 13:42:53--  https://www.dropbox.com/s/xvjzaxzz3ysphme/data_000637.txt
Resolving www.dropbox.com (www.dropbox.com)... 162.125.69.18
Connecting to www.dropbox.com (www.dropbox.com)|162.125.69.18|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: /s/raw/xvjzaxzz3ysphme/data_000637.txt [following]
--2022-01-14 13:42:54--  https://www.dropbox.com/s/raw/xvjzaxzz3ysphme/data_000637.txt
Reusing existing connection to www.dropbox.com:443.
HTTP request sent, awaiting response... 302 Found
Location: https://uc6534c5c7e37ab3e1ba00643572.dl.dropboxusercontent.com/cd/0/inline/Bdye4RGad3c6csaihBOgz1YtVVcNzK8_mhLdjsJiSGqjry_ChMm4U4P_TRQLUkRFZ-sz2I1ZVmCV3qgGfj806vC82_cgYt9MWL27KDUkvZgqsefvrKU_gQvOmP2KjXgrKzsbJdsi081VZ3s945C1WmTc/file# [following]
--2022-01-14 13:42:55--  https://uc6534c5c7e37ab3e1ba00643572.dl.dropboxusercontent.com/cd/0/inline/Bdye4RGad3c6csaihBOgz1YtVVcNzK8_mhLdjsJiSGqjry_ChMm4U4P_TRQLUkRFZ-sz2I1ZVmCV3qgGfj806vC82_cgYt9MWL27KDUkvZg


1\. Create a Pandas DataFrame reading N rows of the 'data_000637.txt' dataset. Choose N to be smaller than or equal to the maximum number of rows and larger than 10k.

2\. Find out the number of BX in an ORBIT (the value 'x').

3\. Find out how long the data taking lasted. You can either make an estimate based on the fraction of the measurements (rows) you read, or perform this check precisely by reading out the whole dataset.

4\. Create a new column with the absolute time in ns (as a combination of the other three columns with timing information).

5\. Replace the values (all 1) of the HEAD column randomly with 0 or 1.

6\. Create a new DataFrame that contains only the rows with HEAD=1.

7\. Make two occupancy plots (one for each FPGA), i.e. plot the number of counts per TDC channel.

8\. Use the groupby method to find out the noisy channels, i.e. the TDC channels with the most counts (say the top 3).

9\. Count the number of unique orbits. Count the number of unique orbits with at least one measurement from TDC_CHANNEL=139.
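Several of the tasks above map onto one-line pandas operations. The sketch below runs them on a tiny made-up DataFrame: the column names come from the dataset description, but every value in `toy` is invented for illustration. The estimate of 'x' assumes that BX_COUNTER starts at 0 and that its maximum value is actually reached in the sample, which may not hold for a small N.

```python
import numpy as np
import pandas as pd

# Made-up toy rows using the dataset's column names (values are illustrative only).
toy = pd.DataFrame({
    "HEAD":        [1, 1, 1, 1, 1, 1],
    "FPGA":        [0, 0, 1, 1, 0, 1],
    "TDC_CHANNEL": [139, 139, 5, 139, 7, 5],
    "ORBIT_CNT":   [100, 100, 100, 101, 101, 102],
    "BX_COUNTER":  [3, 9, 4, 0, 9, 2],
    "TDC_MEAS":    [12, 0, 29, 7, 15, 3],
})

# Task 2 (assumption: the counter starts at 0 and its maximum appears in the sample).
bx_per_orbit = int(toy["BX_COUNTER"].max()) + 1

# Task 5: overwrite HEAD with random 0/1 values (seeded so the result is reproducible).
rng = np.random.default_rng(12345)
toy["HEAD"] = rng.integers(0, 2, size=len(toy))

# Task 6: a new DataFrame holding only the rows with HEAD == 1.
head1 = toy[toy["HEAD"] == 1].copy()

# Tasks 7 and 8: counts per (FPGA, TDC_CHANNEL), sorted with the busiest channels first.
counts = toy.groupby(["FPGA", "TDC_CHANNEL"]).size().sort_values(ascending=False)
top3 = counts.head(3)

# Task 9: unique orbits overall, and unique orbits containing a hit from channel 139.
n_orbits = toy["ORBIT_CNT"].nunique()
n_orbits_139 = toy.loc[toy["TDC_CHANNEL"] == 139, "ORBIT_CNT"].nunique()
```

Grouping by both 'FPGA' and 'TDC_CHANNEL' is one possible choice, made here because the same channel number can appear on both FPGAs; grouping by 'TDC_CHANNEL' alone would merge them. The same `counts` series, selected per FPGA, could feed the occupancy plots of task 7.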

In [2]:
import pandas as pd
import numpy as np
import numpy.random as npr
npr.seed(12345)

In [4]:
! cd


C:\Users\media\Documents\GitHub\LaboratoryOfComputationalPhysics_Y4


In [7]:

import os
# Location used by the wget cell above; adjust it if your copy of the file lives elsewhere.
nam = os.path.expanduser("~/data/data_000637.txt")
df = pd.read_csv(nam)
print(df)
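Task 1 asks for only the first N rows, which `pd.read_csv` supports through its `nrows` argument. The sketch below is a minimal illustration that reads from an in-memory stand-in instead of the real file: the rows in `fake_file` are made up, the comma-separated layout with a header row is an assumption about the file format, and `N = 3` is a placeholder (the exercise requires N larger than 10k on the real data).

```python
import io
import pandas as pd

# Stand-in for the real file: made-up rows, assuming a comma-separated
# layout with a header row naming the six columns.
fake_file = io.StringIO(
    "HEAD,FPGA,TDC_CHANNEL,ORBIT_CNT,BX_COUNTER,TDC_MEAS\n"
    "1,0,123,100,10,5\n"
    "1,0,124,100,11,17\n"
    "1,1,63,100,12,29\n"
    "1,1,64,101,0,3\n"
    "1,0,139,101,1,0\n"
)

N = 3  # placeholder value; use N > 10k with the real dataset
df_n = pd.read_csv(fake_file, nrows=N)  # stops after N data rows (header excluded)
print(df_n)
```

With the real dataset, the path to the downloaded file would replace `fake_file`.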