## SiPMs Analysis
By running this notebook you will obtain histograms of the following quantities: ['Pins Height (back)', 'Pins Height (front)', 'PCB Width', 'SiPMs Height'].

(The INPUT information is stored in .xlsx files as returned by the IR02 machine)

## Distributions in xlsx files

Here are two examples of the distributions you can expect from the input data.

##### ANVERSO

Measurements of the boards for the front side

<img src="../lib/dist_anverso.png" alt="anverso"/>

##### REVERSO

Measurements of the boards for the back side

<img src="../lib/dist_reverso.png" alt="reverso"/>

## RUN ALL

First, we import the packages and define the column names that our tables will use from now on.

In [None]:
# Importing the function and packages to be used. Make sure you have installed them!
import sys; sys.path.insert(0, '../'); from lib import *

In [None]:
# Labels for the columns
pcbs_labels = ['Diametro Taladro 1 (S1)', 'Posición Y S1', 'Posición X S1', 'Diametro Taladro 2 (S6)', 'Posición Y TS6', 'Posición X TS6', 'Planitud PCB','Anchura (media)', 'Anchura (maxima)', 'Longitud (media)','Longitud (maxima)', 'Longitud rebaje izquierdo', 'Anchura rebaje izquierdo', 'Longitud rebaje derecho', 'Anchura rebaje derecho', 'Distancia entre taladros']
sipm_labels = ['Anchura', 'Longitud', 'Posición X', 'Posición Y', 'Planitud', 'Altura']
pina_labels = ['Posición X', 'Posición Y', 'Altura Pin'] # pins, front side (anverso)
pinr_labels = ['Altura Pin'] # pins, back side (reverso)

Next, you need to select the names and locations of the files that you want to analyze.

_Example_

```python
    folder = "../data/2023_03_10/"
    pcbs0,sipm0,pinr0,pina0 = data2npy(folder=folder,pcbs_labels=pcbs_labels,sipm_labels=sipm_labels,pins_labels=pina_labels,mode=10,debug=False)

    df_pcbs_ids = df_display(pcbs0,labels = pcbs_labels+["IDs"],name="pcbs",terminal_output=True,save=True)
    df_sipm_ids = df_display(sipm0,labels = sipm_labels+["IDs"],name="sipm",terminal_output=True,save=True,index=["SiPM #1","SiPM #2","SiPM #3","SiPM #4","SiPM #5","SiPM #6"]*len(df_pcbs_ids))
    df_pina_ids = df_display(pina0,labels = pina_labels+["IDs"],name="pina",terminal_output=True,save=True,index=["Pin #1","Pin #2","Pin #3","Pin #4","Pin #5","Pin #6","Pin #7","Pin #8"]*len(df_pcbs_ids))
    df_pinr_ids = df_display(pinr0,labels = pinr_labels+["IDs"],name="pinr",terminal_output=True,save=True,index=["Pin #1","Pin #2","Pin #3","Pin #4","Pin #5","Pin #6","Pin #7","Pin #8"]*len(df_pcbs_ids))
```

INPUT PARAMETERS YOU MAY NEED TO CHANGE:

```data2npy```
- folder: location of the files
- *_labels: names of the columns that you want to use (previously stored in these variables)
- mode: number of measurements per file (typically 10)

```df_display```
- If save = True, the results are written as .txt files in the fit_data folder
- They can also be printed in the notebook with terminal_output = True
- If index is provided it will use it to rename the rows
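
As a rough illustration of the ```index``` argument, the sketch below builds one label per row by repeating the per-board labels once per board. The helper ```repeated_index``` and the numbers are hypothetical and are not part of ```lib```; the only assumption is that the rows are ordered board by board.

```python
import pandas as pd

def repeated_index(prefix, per_board, n_boards):
    """Build labels such as 'SiPM #1' ... 'SiPM #6', repeated once per board."""
    return [f"{prefix} #{i}" for i in range(1, per_board + 1)] * n_boards

# Made-up heights for 2 boards with 6 SiPMs each (illustration only)
heights = [1.40, 1.41, 1.39, 1.40, 1.42, 1.38,
           1.41, 1.40, 1.40, 1.39, 1.41, 1.40]
index = repeated_index("SiPM", per_board=6, n_boards=2)

# The index must contain exactly one label per row
df_example = pd.DataFrame({"Altura": heights}, index=index)
print(df_example.head(7))
```

If the length of the index does not match the number of rows, pandas raises an error when the DataFrame is built, which is a quick way to notice a wrong ```mode``` or a missing file.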

You need to select the type of SiPM you are analysing ("HPK" or "FBK") and make sure its specifications are saved in the ```Specifications/*.txt``` files.
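
Before running the analysis, it may help to confirm that the specification files for the chosen model exist. The sketch below is only an illustration: ```missing_spec_files``` is a hypothetical helper (not part of ```lib```), and the file naming (```pcbs_<model>.txt```, ```sipm_<model>.txt```, ```pina_<model>.txt```, ```pinr_<model>.txt```) is taken from the loading cell further down in this notebook.

```python
from pathlib import Path

def missing_spec_files(model, spec_dir="../Specifications",
                       kinds=("pcbs", "sipm", "pina", "pinr")):
    """Return the expected specification files that are not on disk."""
    spec_dir = Path(spec_dir)
    expected = [spec_dir / f"{kind}_{model.lower()}.txt" for kind in kinds]
    return [str(path) for path in expected if not path.is_file()]

missing = missing_spec_files("FBK")
if missing:
    print("Missing specification files:", missing)
else:
    print("All specification files found.")
```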

In [None]:
# Generating data_frames for the pcbs, sipm, pins_anverso/reverso. 

# SIPM_MODEL = "HPK" # CHOOSE THE SiPM YOU ARE ANALYSING "HPK" or "FBK" 

# folders = [ 
#             # "../data/2023_03_10/" # NOT CHECKED SET; NEED TO MEASURE AGAIN
#             "../data/2023_05_05/", 
#             "../data/2023_05_05_bunchx3/", 
#             "../data/2023_06_19/", 
#             "../data/2023_07_03/", 
#             "../data/2023_08_30/", 
#             "../data/2023_09_11/", 
#             "../data/2023_09_13/", 
#             "../data/2023_10_17/", 
#             "../data/2023_11_16/"
#           ]

# modes = [10, 3, 10, 10, 10, 10, 10, 10, 10]

SIPM_MODEL = "FBK" # CHOOSE THE SiPM YOU ARE ANALYSING "HPK" or "FBK" 

folders = [ 
            "../data/2024_01_08/", 
          ]

modes = [10]

for f,folder in enumerate(folders):
  p, s, r, a = data2npy(folder=folder, pcbs_labels=pcbs_labels, sipm_labels=sipm_labels, pins_labels=pina_labels, mode=modes[f], debug=False)
  print(p)
  if f == 0: pcbs = p; sipm = s; pinr = r; pina = a
  else:
    pcbs = np.concatenate((pcbs,p),axis=0)
    sipm = np.concatenate((sipm,s),axis=0)
    pinr = np.concatenate((pinr,r),axis=0)
    pina = np.concatenate((pina,a),axis=0)

df_pcbs_ids = df_display(pcbs, labels=pcbs_labels+["IDs"], name="pcbs", terminal_output=True, save=True)
df_sipm_ids = df_display(sipm, labels=sipm_labels+["IDs"], name="sipm", terminal_output=True, save=True, index=["SiPM #1","SiPM #2","SiPM #3","SiPM #4","SiPM #5","SiPM #6"]*len(df_pcbs_ids))
df_pina_ids = df_display(pina, labels=pina_labels+["IDs"], name="pina", terminal_output=True, save=True, index=["Pin #1","Pin #2","Pin #3","Pin #4","Pin #5","Pin #6","Pin #7","Pin #8"]*len(df_pcbs_ids))
df_pinr_ids = df_display(pinr, labels=pinr_labels+["IDs"], name="pinr", terminal_output=True, save=True, index=["Pin #1","Pin #2","Pin #3","Pin #4","Pin #5","Pin #6","Pin #7","Pin #8"]*len(df_pcbs_ids))


In [None]:
# DataFrames with loaded data + IDs (box, board, sipm; useful for identification)
df_pcbs = df_pcbs_ids[pcbs_labels] 
df_sipm = df_sipm_ids[sipm_labels]
df_pina = df_pina_ids[pina_labels]
# df_pinr = df_pinr_ids[pinr_labels]

pcbs_mean,pcbs_std,pcbs_max,pcbs_min = npy2df(df_pcbs[pcbs_labels], [])
sipm_mean,sipm_std,sipm_max,sipm_min = npy2df(df_sipm[sipm_labels], ["Posición Y"])
pina_mean,pina_std,pina_max,pina_min = npy2df(df_pina[pina_labels], ["Posición Y"])
# pinr_mean,pinr_std,pinr_max,pinr_min = npy2df(df_pinr[pinr_labels], [])

df_pcbs_exp = pd.DataFrame(np.array((pcbs_mean,pcbs_max,pcbs_min)),columns=pcbs_labels, index=["Experimental", "Max", "Min"])
df_sipm_exp = pd.DataFrame(np.array((sipm_mean,sipm_max,sipm_min)),columns=sipm_labels, index=["Experimental", "Max", "Min"])
df_pina_exp = pd.DataFrame(np.array((pina_mean,pina_max,pina_min)),columns=pina_labels, index=["Experimental", "Max", "Min"])
# df_pinr_exp = pd.DataFrame(np.array((pinr_mean,pinr_max,pinr_min)),columns=pinr_labels, index=["Experimental", "Max", "Min"])

If this is the first time you look at this kind of data, you may want to verify that the loading process is correct.

You can do this by setting ```debug = True``` in the ```data2npy``` function; you can also use the following cell to check the data for consistency.

```sanity_check``` looks for repeated data so that you can spot a possible pattern "artificially" introduced by your reading algorithm.
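
The idea of looking for repeated rows can be sketched with plain pandas. This is not the implementation of ```sanity_check```; ```find_repeated_rows``` is a hypothetical helper and the values are made up for illustration.

```python
import pandas as pd

def find_repeated_rows(df):
    """Return every row that appears more than once (all copies included)."""
    mask = df.duplicated(keep=False)
    return df[mask]

# Made-up measurements: rows 0 and 2 are identical
example = pd.DataFrame({"Posición X": [0.01, 0.02, 0.01, 0.03],
                        "Altura":     [1.40, 1.41, 1.40, 1.39]})

repeated = find_repeated_rows(example)
print(repeated)
```

A few repeated rows can be genuine when the measurement has limited resolution; a regular pattern of repetitions is more likely to come from the reading step.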

In [None]:
###### SANITY CHECK: to confirm we haven't duplicated data in the arrays ######
# print("---- PCBs ----")
# sanity_check(df_pcbs,pcbs)

# print("---- SiPM ----")
# sanity_check(df_sipm,sipm)

# # print("---- Pins_anv ----")
# # sanity_check(df_pina,pina) # It is "common" to have duplicated rows

# # print("---- Pins_rev ----")
# # sanity_check(df_pinr,pinr) # It is common to have duplicated rows as there are only two measurements per pin

We now load the values set by the specifications of the SiPMs and the PCBs.

MAKE SURE THESE TXT FILES ARE WHAT YOU EXPECT !

To this new dataframe we add the measurement results obtained previously.

The final ```df_*_all``` will be used to get the plots.
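
As a minimal sketch of this step, the example below reads a small tab-separated table from a string and appends the experimental rows. It assumes the specification files hold three rows in the order nominal value, upper tolerance, lower tolerance; the experimental numbers are invented for illustration.

```python
import io
import pandas as pd

# Stand-in for a specification file (assumed layout: nominal, STD+, STD-)
spec_txt = "Altura\tAnchura\n1.4\t6.0\n0.1\t0.1\n0.1\t0.1\n"
df_spec = pd.read_csv(io.StringIO(spec_txt), sep="\t")

# Invented experimental summary (mean, max, min)
df_exp = pd.DataFrame([[1.42, 6.03], [1.48, 6.10], [1.37, 5.95]],
                      columns=["Altura", "Anchura"],
                      index=["Experimental", "Max", "Min"])

# Stack both tables, then give the specification rows readable names
df_all = pd.concat((df_spec, df_exp))
df_all = df_all.rename(index={0: "Theoretical", 1: "STD+", 2: "STD-"})
print(df_all)
```

The rename only works if the specification rows still carry the default integer index 0, 1, 2, which is why the files are read without an index column.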

In [None]:
# Load the specification files (*_hpk.txt or *_fbk.txt, depending on SIPM_MODEL) as dataframes

df_pcbs_all = pd.read_csv('../Specifications/pcbs_%s.txt'%SIPM_MODEL.lower(), sep='\t')
df_sipm_all = pd.read_csv('../Specifications/sipm_%s.txt'%SIPM_MODEL.lower(), sep='\t')
df_pina_all = pd.read_csv('../Specifications/pina_%s.txt'%SIPM_MODEL.lower(), sep='\t')
df_pinr_all = pd.read_csv('../Specifications/pinr_%s.txt'%SIPM_MODEL.lower(), sep='\t')

# Add the "Experimental", "Max", and "Min" rows
df_pcbs_all = pd.concat((df_pcbs_all,df_pcbs_exp.loc[["Experimental", "Max", "Min"]]))
df_sipm_all = pd.concat((df_sipm_all,df_sipm_exp.loc[["Experimental", "Max", "Min"]]))
df_pina_all = pd.concat((df_pina_all,df_pina_exp.loc[["Experimental", "Max", "Min"]]))
# df_pinr_all = pd.concat((df_pinr_all,df_pinr_exp.loc[["Experimental", "Max", "Min"]]))

# Rename the row names
df_pcbs_all = df_pcbs_all.rename(index={0: "Theoretical", 1: "STD+", 2: "STD-"})
df_sipm_all = df_sipm_all.rename(index={0: "Theoretical", 1: "STD+", 2: "STD-"})
df_pina_all = df_pina_all.rename(index={0: "Theoretical", 1: "STD+", 2: "STD-"})
# df_pinr_all = df_pinr_all.rename(index={0: "Theoretical", 1: "STD+", 2: "STD-"})

pd.set_option('display.float_format', '{:.2f}'.format)
# Display the dataframes
print("\n---- PCBs %s ----" % SIPM_MODEL)
display(df_pcbs_all)
# display(pd.DataFrame(np.array(( [2.4,2.375,4,2.4,2.375,4,None,8.0,8.0,119.8,119.8,4.25,2.25,4.25,2.25,115],[0.1,0.1,0.1,0.1,0.1,0.1,None,0.2,0.2,0.25,0.25,0.1,0.1,0.1,0.1,0.2],[0.1,0.1,0.1,0.1,0.1,0.1,None,0.2,0.2,0.2,0.2,0.1,0.1,0.1,0.1,0.2],pcbs_mean,pcbs_max,pcbs_min)),columns=pcbs_labels, index=["Theoretical", "STD+", "STD-", "Experimental", "Max", "Min"]))
print("\n---- SiPM %s ----" % SIPM_MODEL)
display(df_sipm_all)
# display(pd.DataFrame(np.array(( [6,6,0,[7.525,27.525,47.525,67.525,87.525,107.525],0,1.4],[0.1,0.1,0.1,[0.1,0.1,0.1,0.1,0.1,0.1],0.1,0.1],[0.1,0.1,0.1,[0.1,0.1,0.1,0.1,0.1,0.1],0.1,0.1], sipm_mean, sipm_max, sipm_min )),columns=sipm_labels, index=["Theoretical", "STD+", "STD-", "Experimental", "Max", "Min"]))
print("\n---- Pins_anv %s ----" % SIPM_MODEL)
display(df_pina_all)
# display(pd.DataFrame(np.array(( [0,[14.5,20.5,34.5,40.5,74.5,80.5,94.5,100.5],0],[0.1,[0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1],1],[0.1,[0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1],1], pina_mean, pina_max, pina_min )), columns=pina_labels, index=["Theoretical", "STD+", "STD-", "Experimental", "Max", "Min"]))
print("\n---- Pins_rev %s ----" % SIPM_MODEL)
# display(df_pinr_all)
# display(pd.DataFrame(np.array(( [8.7],[0.5],[0.5], pinr_mean, pinr_max, pinr_min )), columns=pinr_labels, index=["Theoretical", "STD+", "STD-", "Experimental", "Max", "Min"]))

The next cell checks that the measured values are within the specifications.

You just need to run ```check_especifications``` and it will return the out-of-specification values with their IDs.
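
The kind of check described above can be sketched as follows. This is an assumption about the logic, not the code of ```check_especifications```: the hypothetical helper ```out_of_spec``` flags a value when it falls outside the band from Theoretical minus STD- to Theoretical plus STD+. The table and measurements are invented.

```python
import pandas as pd

def out_of_spec(df_ids, df_all, column):
    """Return the rows of df_ids whose value lies outside the tolerance band."""
    nominal = df_all.loc["Theoretical", column]
    upper = nominal + df_all.loc["STD+", column]
    lower = nominal - df_all.loc["STD-", column]
    outside = (df_ids[column] < lower) | (df_ids[column] > upper)
    return df_ids.loc[outside, [column, "IDs"]]

# Hypothetical specification table and measurements (illustration only)
df_all_example = pd.DataFrame({"Altura": [1.4, 0.1, 0.1]},
                              index=["Theoretical", "STD+", "STD-"])
df_ids_example = pd.DataFrame({"Altura": [1.42, 1.55, 1.28],
                               "IDs": ["board_A", "board_B", "board_C"]})

bad = out_of_spec(df_ids_example, df_all_example, "Altura")
print(bad)
```

Quantities whose nominal value differs per SiPM or per pin (such as the Y positions) would need a per-row nominal value, which this sketch does not cover.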

In [None]:
print("PCB")
check_especifications(df_pcbs_ids, df_pcbs_all, pcbs_labels, "../fit_data/errors_pcb")

print("SiPMs")
check_especifications(df_sipm_ids, df_sipm_all, sipm_labels, "../fit_data/errors_sipm")

print("PIN_ANV")
check_especifications(df_pina_ids, df_pina_all, pina_labels, "../fit_data/errors_pina")

# print("PIN_REV")
# check_especifications(df_pinr_ids, df_pinr_all, pinr_labels, "../fit_data/errors_pinr")

### TIME TO PLOT !

The following cells plot the histograms, combining all the data loaded above.

They cover the distributions of the PCBs, then the SiPMs and finally the pins.

The burrs get a special treatment: the right and left measurements are combined into a single distribution (use the combined plot obtained in the following independent cell).

The variables defined in these cells are used to make the plots; feel free to change them (see ```plotlytos``` for more info).
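
The left/right combination used for the burrs can be sketched in isolation. ```stack_left_right``` is a hypothetical helper, not part of ```lib```, and the measurements are invented; the column names follow the labels defined at the top of this notebook.

```python
import pandas as pd

def stack_left_right(df, left, right, new_name):
    """Stack two columns into one and repeat the IDs to match."""
    values = pd.concat([df[left], df[right]], axis=0, ignore_index=True)
    out = pd.DataFrame({new_name: values})
    out["IDs"] = df["IDs"].to_list() * 2  # left block first, then right block
    return out

# Invented burr lengths for two boards (illustration only)
df_example_burrs = pd.DataFrame({
    "Longitud rebaje izquierdo": [4.24, 4.27],
    "Longitud rebaje derecho":   [4.26, 4.23],
    "IDs": ["board_A", "board_B"],
})

combined = stack_left_right(df_example_burrs,
                            "Longitud rebaje izquierdo",
                            "Longitud rebaje derecho",
                            "Length")
print(combined)
```

Each board then contributes two entries to the histogram, so the vertical axis counts burrs rather than boards.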

In [None]:
## BURRS ##
aux_pcbs1 =  pd.concat([df_pcbs_ids['Longitud rebaje izquierdo'], df_pcbs_ids['Longitud rebaje derecho']], axis=0, ignore_index=True)
aux_pcbs2 =  pd.concat([df_pcbs_ids['Anchura rebaje izquierdo'],  df_pcbs_ids['Anchura rebaje derecho']],  axis=0, ignore_index=True)
aux_pcbs1 = pd.DataFrame(aux_pcbs1, columns=['Length']); aux_pcbs1["IDs"] = df_pcbs_ids["IDs"].to_list()*2
aux_pcbs2 = pd.DataFrame(aux_pcbs2, columns=['Width']);  aux_pcbs2["IDs"] = df_pcbs_ids["IDs"].to_list()*2

aux_mean1 = pd.DataFrame()
aux_mean2 = pd.DataFrame()
aux_mean1["Length"] = df_pcbs_all['Longitud rebaje derecho']
aux_mean2["Width"]  = df_pcbs_all['Anchura rebaje derecho']

titles = ['PCB - burr length (R1) ', 'PCB - burr width (R2)']
xlabel = ['Length [mm]', 'Width [mm]']
ylabel = ['Nº PCBs'] * len(titles)
colums = ['Length', 'Width']
colors = ["purple","orange","red"]
df_raw = [aux_pcbs1, aux_pcbs2]
df_fin = [aux_mean1, aux_mean2]

for i in range(len(titles)): fig = plotlytos(titles[i], xlabel[i], ylabel[i], df_raw[i], df_fin[i], colums[i],colors=colors,decimales=3,text_auto=False); fig.show()

In [None]:
## PCBS (ignoring the burrs here) ##
titles = ['PCB - Diameter S1', 'PCB - Distance to Drill (D1)', 'PCB - Position X (S1)', 'PCB - Diameter S6', 'PCB - Distance to Drill (D2)', 'PCB - Position X (S6)', 'PCB - Width','PCB - Length', 'PCB - burr length (R2) ', 'PCB - burr width (R1)', 'PCB - Right burr length', 'PCB - Right burr width', 'PCB - Distance between drills']
xlabel = ['Diameter S1 [mm]', 'Position [mm]', 'Position [mm]', 'Diameter S6 [mm]', 'Position [mm]', 'Position [mm]', 'Width [mm]','Length [mm]', 'Left burr length [mm]', 'Left burr width [mm]', 'Right burr length [mm]', 'Right burr width [mm]', 'Distance between drills [mm]']
ylabel = ['Nº PCBs'] * len(titles)
colums = ['Diametro Taladro 1 (S1)', 'Posición Y S1', 'Posición X S1', 'Diametro Taladro 2 (S6)', 'Posición Y TS6', 'Posición X TS6', 'Anchura (maxima)','Longitud (maxima)', 'Longitud rebaje izquierdo', 'Anchura rebaje izquierdo', 'Longitud rebaje derecho', 'Anchura rebaje derecho', 'Distancia entre taladros']
colors = ["purple","orange","red"]
df_raw = df_pcbs_ids
df_fin = df_pcbs_all

for i in range(len(titles)): fig = plotlytos(titles[i], xlabel[i], ylabel[i], df_raw, df_fin, colums[i],colors=colors,decimales=3,text_auto=False); fig.show()

# show_html(fig)
# save_html(fig,str(colums[i])+".html")
# fig.write_image(str(colums[i])+".png")

 # Not showing 'Planitud' (flatness) because it is not in the specifications

In [None]:
## SiPMs ##
titles = ['SiPMs - Position X', 'Position Y', 'SiPMs - Height']
xlabel = ['Position [mm]', 'Position [mm]', 'Height [mm]']
ylabel = ['Nº SiPMs'] * len(titles)
colums = ['Posición X', 'Posición Y', 'Altura']
colors = ["purple","orange","red"]
df_raw = df_sipm_ids
df_fin = df_sipm_all

for i in range(len(titles)): fig = plotlytos(titles[i], xlabel[i], ylabel[i], df_raw, df_fin, colums[i],colors=colors,decimales=2,text_auto=False); fig.show()

In [None]:
## Pins ##
titles = ['Pins - Position X (front)', 'Position Y (front)', 'Pins - weld height', 'Pins - Height']
xlabel = ['Position [mm]', 'Position [mm]', 'Height [mm]', 'Height [mm]']
ylabel = ['Nº Pins'] * len(titles)
colums = ['Posición X', 'Posición Y', 'Altura Pin', 'Altura Pin']
colors = ["purple","orange","red"]
df_raw = [df_pina_ids] * (len(titles)-1) + [df_pinr_ids]
df_fin = [df_pina_all] * (len(titles)-1) + [df_pinr_all]

for i in range(len(titles)): fig = plotlytos(titles[i], xlabel[i], ylabel[i], df_raw[i], df_fin[i], colums[i],colors=colors,text_auto=False); fig.show()