## Pandas analysis

This exercise consists of analyzing a dataset containing timing information from a series of Time-to-Digital Converters (TDCs) implemented in a couple of FPGAs. Each measurement (i.e. each row of the input file) consists of a flag that specifies the type of message ('HEAD', which in this case is always 1), two addresses of the TDC providing the signal ('FPGA' and 'TDC_CHANNEL'), and the timing information ('ORBIT_CNT', 'BX_COUNTER', and 'TDC_MEAS'). Each TDC count corresponds to 25/30 ns, whereas a unit of BX_COUNTER corresponds to 25 ns, and ORBIT_CNT is increased by one every 'x' units of BX_COUNTER. This allows the time to be stored in a way similar to hours, minutes and seconds.
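The unit conversions above can be combined into a single time in ns. The following is a minimal sketch, not the reference solution: the function name `absolute_time_ns` and the parameter `bx_per_orbit` (the value 'x' from the text) are illustrative names, and the value 100 used in the call is a made-up number chosen only to keep the arithmetic simple.

```python
def absolute_time_ns(orbit_cnt, bx_counter, tdc_meas, bx_per_orbit):
    """Combine the three counters into one time in ns.

    bx_per_orbit is the value 'x' from the text. It is a parameter here
    because the exercise asks for it to be determined from the data.
    """
    # One BX_COUNTER unit is 25 ns; one orbit spans bx_per_orbit such units.
    coarse_ns = (orbit_cnt * bx_per_orbit + bx_counter) * 25.0
    # One TDC_MEAS count is 25/30 ns.
    fine_ns = tdc_meas * 25.0 / 30.0
    return coarse_ns + fine_ns


# Hypothetical x = 100, used only for illustration (not the real value).
t = absolute_time_ns(orbit_cnt=2, bx_counter=3, tdc_meas=6, bx_per_orbit=100)
print(t)
```

Because the function uses only arithmetic operators, it should also accept whole pandas columns in place of the scalar arguments.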

In [None]:
# If you didn't download it yet, please get the relevant file now!
!wget https://www.dropbox.com/s/xvjzaxzz3ysphme/data_000637.txt -P ~/data/

1\. Create a Pandas DataFrame reading N rows of the 'data_000637.txt' dataset. Choose N to be smaller than or equal to the maximum number of rows and larger than 10k.

2\. Find out the number of BX in an ORBIT (the value 'x').

3\. Find out how long the data taking lasted. You can either make an estimate based on the fraction of the measurements (rows) you read, or perform this check precisely by reading out the whole dataset.

4\. Create a new column with the absolute time in ns (as a combination of the other three columns with timing information).

5\. Replace the values (all 1) of the HEAD column randomly with 0 or 1.

6\. Create a new DataFrame that contains only the rows with HEAD=1.

7\. Make two occupancy plots (one for each FPGA), i.e. plot the number of counts per TDC channel.

8\. Use the groupby method to find out the noisy channels, i.e. the TDC channels with the most counts (say the top 3).

9\. Count the number of unique orbits. Count the number of unique orbits with at least one measurement from TDC_CHANNEL=139.
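Steps 5 to 9 can be sketched on a small toy DataFrame before running them on the real file. This is one possible approach under stated assumptions: the rows below are made up, only the column names match the dataset, and the variable names (`head1`, `occupancy`, `noisy`, and so on) are illustrative. The plots of step 7 are left out so that the sketch depends only on pandas and NumPy.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; only the column names match the real file.
df = pd.DataFrame({
    "HEAD":        [1, 1, 1, 1, 1, 1],
    "FPGA":        [0, 0, 1, 1, 0, 1],
    "TDC_CHANNEL": [139, 64, 139, 2, 64, 64],
    "ORBIT_CNT":   [10, 10, 11, 11, 12, 12],
})

# Step 5: overwrite HEAD with random 0/1 values (seeded for reproducibility).
rng = np.random.default_rng(seed=0)
df["HEAD"] = rng.integers(0, 2, size=len(df))

# Step 6: new DataFrame with only the rows where HEAD == 1.
head1 = df[df["HEAD"] == 1]

# Step 7 (counts only): hits per TDC channel, separately for each FPGA.
occupancy = df.groupby(["FPGA", "TDC_CHANNEL"]).size()

# Step 8: the three channels with the most counts.
noisy = df.groupby("TDC_CHANNEL").size().sort_values(ascending=False).head(3)

# Step 9: unique orbits overall, and unique orbits containing channel 139.
n_orbits = df["ORBIT_CNT"].nunique()
n_orbits_139 = df.loc[df["TDC_CHANNEL"] == 139, "ORBIT_CNT"].nunique()

print(noisy)
print(n_orbits, n_orbits_139)
```

With matplotlib installed, a call such as `occupancy.loc[0].plot.bar()` would be one way to turn the counts for FPGA 0 into an occupancy plot.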

In [63]:
import numpy as np
import pandas as pd
import random

In [64]:
file_name="/home/marco/LaboratoryOfComputationalPhysics_Y5/data_000637.txt"
data=pd.read_csv(file_name)
data

Unnamed: 0,HEAD,FPGA,TDC_CHANNEL,ORBIT_CNT,BX_COUNTER,TDC_MEAS
0,1,0,123,3869200167,2374,26
1,1,0,124,3869200167,2374,27
2,1,0,63,3869200167,2553,28
3,1,0,64,3869200167,2558,19
4,1,0,64,3869200167,2760,25
...,...,...,...,...,...,...
1310715,1,0,62,3869211171,762,14
1310716,1,1,4,3869211171,763,11
1310717,1,0,64,3869211171,764,0
1310718,1,0,139,3869211171,769,0


In [67]:
# random.randint is inclusive on both ends, so start at 10001 to keep N > 10k.
N = random.randint(10001, data.shape[0])
data = data.iloc[:N]
data

Unnamed: 0,HEAD,FPGA,TDC_CHANNEL,ORBIT_CNT,BX_COUNTER,TDC_MEAS
0,1,0,123,3869200167,2374,26
1,1,0,124,3869200167,2374,27
2,1,0,63,3869200167,2553,28
3,1,0,64,3869200167,2558,19
4,1,0,64,3869200167,2760,25
...,...,...,...,...,...,...
412510,1,0,59,3869204319,2831,4
412511,1,1,4,3869204319,2920,3
412512,1,1,139,3869204319,2930,0
412513,1,1,3,3869204319,2922,6
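An alternative to slicing after a full read is to let `pd.read_csv` stop after N rows through its `nrows` parameter. The sketch below uses a short in-memory stand-in for the file (the header plus the first five rows shown above) so that it runs without the download; `csv_text`, `subset` and the value N = 3 are illustrative only.

```python
import io

import pandas as pd

# In-memory stand-in for 'data_000637.txt': same header, first five rows.
csv_text = """HEAD,FPGA,TDC_CHANNEL,ORBIT_CNT,BX_COUNTER,TDC_MEAS
1,0,123,3869200167,2374,26
1,0,124,3869200167,2374,27
1,0,63,3869200167,2553,28
1,0,64,3869200167,2558,19
1,0,64,3869200167,2760,25
"""

N = 3  # illustrative; on the real file N would be larger than 10k
subset = pd.read_csv(io.StringIO(csv_text), nrows=N)  # parse only the first N rows
print(subset.shape)
```

On the real file the first argument would be the path to 'data_000637.txt' instead of the `io.StringIO` object.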


In [78]:
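# Exploratory check: for each run of consecutive rows sharing an ORBIT_CNT,
# print how many rows follow the first one (run length minus one).
# The last run is never printed because no orbit change follows it.
orb = data['ORBIT_CNT']
c = 0
for i in range(1, orb.shape[0]):
    if orb.iloc[i] == orb.iloc[i-1]:
        c += 1
    else:
        print(c)
        c = 0

# Keep only the two columns needed to study how BX_COUNTER relates to ORBIT_CNT.
orb = data.loc[:, ['ORBIT_CNT', 'BX_COUNTER']]

# All rows of the first orbit in the file.
a = data.loc[data['ORBIT_CNT'] == 3869200167]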



42
84
126
97
108
...
54
162
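The per-orbit row counts above do not give 'x' directly. One possible estimate, sketched below on made-up data, rests on the assumption that BX_COUNTER runs from 0 to x - 1 and resets when ORBIT_CNT increases, so that x is the largest observed BX_COUNTER plus one. This only holds if the sample contains a hit in the last BX of some orbit. The names `bx_per_orbit`, `abs_ns` and `duration_ns` are illustrative.

```python
import pandas as pd

# Hypothetical toy sample; column names match the dataset, values are made up.
df = pd.DataFrame({
    "ORBIT_CNT":  [5, 5, 5, 6, 6, 7],
    "BX_COUNTER": [0, 40, 99, 12, 98, 3],
    "TDC_MEAS":   [0, 15, 29, 3, 0, 6],
})

# Assumption: BX_COUNTER counts 0 .. x-1 within an orbit, so x = max + 1.
bx_per_orbit = int(df["BX_COUNTER"].max()) + 1

# Absolute time in ns: 25 ns per BX unit, 25/30 ns per TDC count.
abs_ns = (
    (df["ORBIT_CNT"] * bx_per_orbit + df["BX_COUNTER"]) * 25.0
    + df["TDC_MEAS"] * 25.0 / 30.0
)

# Duration of the sample: latest hit minus earliest hit.
duration_ns = abs_ns.max() - abs_ns.min()
print(bx_per_orbit, duration_ns)
```

Taking the difference between the maximum and the minimum, rather than between the last and first rows, avoids assuming that the rows are strictly ordered in time.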
