# Union of the Sources
Search for the MOT by unifying the key data sources in a single time series.

## Overview

### Problem
When we look at data sources such as the SSD or the CMOS images separately, we find many promising hints of a MOT signal, but we cannot tell whether any of them is really the Francium MOT signal. Reconstructing each potential signal by hand from the other data would be too time-consuming.


### Idea
We slice the time axis into intervals and display all relevant sources in a single time series plot. When we see a potential Fr MOT signal in one source, we can directly cross-check it against the other sources for verification.
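
The idea can be sketched as a stack of time-aligned panels, one per source, restricted to a single interval. This is only a minimal illustration on synthetic data: the function name `plot_interval` and the layout of `sources` (a label mapped to a datetime-indexed pandas Series) are assumptions made for this sketch, not part of the existing code base.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def plot_interval(sources, start, end):
    """Plot every source restricted to [start, end] on stacked axes sharing one time axis.

    `sources` is assumed to map a label to a pandas Series indexed by datetime.
    """
    fig, axes = plt.subplots(len(sources), 1, sharex=True, squeeze=False,
                             figsize=(10, 2 * len(sources)))
    for ax, (label, series) in zip(axes[:, 0], sources.items()):
        window = series.loc[start:end]  # label-based slice, both ends inclusive
        ax.plot(window.index, window.values)
        ax.set_ylabel(label)
    axes[-1, 0].set_xlabel("time")
    return fig


# Synthetic stand-in data: one sample per second for ten minutes.
index = pd.date_range("2022-09-18 14:30:00", periods=600, freq=pd.Timedelta(seconds=1))
rng = np.random.default_rng(0)
sources = {
    "SSD rate": pd.Series(rng.poisson(5, len(index)), index=index),
    "ROI sum": pd.Series(rng.normal(5e6, 1e4, len(index)), index=index),
}
fig = plot_interval(sources,
                    pd.Timestamp("2022-09-18 14:32:00"),
                    pd.Timestamp("2022-09-18 14:36:00"))
```

Sharing the x-axis keeps the panels aligned when zooming, so a feature in one source can be compared with the others at the same instant.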

### Relevant sources
We will consider the following sources: 
- Neutralizer heater
- SSD1
- SSD2
- MOT coil
- CMOS images
- Wavemeter
- Linear manipulator position (if available)


## Data sources
As the amount of data is 1) large for pandas and 2) distributed across many files, we need a scheme for processing it. To find one, we look at how the data from each source is distributed across files.

### Heater
The log of the neutralizer heater is stored as one file per day. It contains a timestamp, a duration, a voltage and a power. In principle, we can mark each heating as a vertical line at its timestamp. We can also use the heater information to query the other data: if there is no heating, the data is not interesting.

### SSD1 and SSD2
The SSD data is distributed across many files of varying size, each covering a non-overlapping time span. In the SSDAnalysis, we combine these files, convert the pulses into a pulse rate, and find the peaks. As a first step, it would be possible to work only with these peaks or with a sub-sampling of the pulse rate. The data is stored at WEDATA/data/ in files with names like -20220919-193809-Slot3-In1.csv. The meaning of the different slots is:
- SSD1: Comes from the ion beam irradiation on the Yttrium, so we can use it to cross-check the position of the rod.
- SSD2: Tells us whether the Fr was released during the heating.
- SSD2, Slot 2 In 1: This is the PMT (photomultiplier tube) signal. Here we want to see peaks when we heat the Yttrium and release the Fr.
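
The conversion from pulses to a pulse rate, followed by a peak search, can be illustrated as follows. This is not the SSDAnalysis implementation. It is a minimal sketch that assumes the pulses are available as arrival times in seconds, bins them at a fixed width, and flags bins that are local maxima above a threshold. The names `pulse_rate` and `find_peaks_simple` are hypothetical.

```python
import numpy as np


def pulse_rate(pulse_times_s, bin_width_s=1.0):
    """Return (bin start times, pulses per second) for pulse arrival times in seconds."""
    pulse_times_s = np.asarray(pulse_times_s, dtype=float)
    start = np.floor(pulse_times_s.min())
    n_bins = int(np.ceil((pulse_times_s.max() - start) / bin_width_s)) + 1
    edges = start + bin_width_s * np.arange(n_bins + 1)
    counts, _ = np.histogram(pulse_times_s, bins=edges)
    return edges[:-1], counts / bin_width_s


def find_peaks_simple(rate, threshold):
    """Indices of bins that exceed `threshold` and are local maxima."""
    rate = np.asarray(rate, dtype=float)
    peaks = []
    for i in range(len(rate)):
        left = rate[i - 1] if i > 0 else -np.inf
        right = rate[i + 1] if i < len(rate) - 1 else -np.inf
        if rate[i] > threshold and rate[i] >= left and rate[i] > right:
            peaks.append(i)
    return peaks


# Toy example: a burst of four pulses between t = 2 s and t = 3 s.
times = [0.1, 1.2, 2.1, 2.3, 2.5, 2.9, 3.4, 5.0]
bin_starts, rate = pulse_rate(times, bin_width_s=1.0)
peaks = find_peaks_simple(rate, threshold=2.0)
```

Working with the binned rate or only with the peak positions reduces the data volume considerably compared with the raw pulse list, which is what makes it attractive as a first step.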

### MOT coil
The status of the MOT coil is distributed across the parent folders of the CMOS images. We can find it in folders such as mot_data/20220918-143000/ as all_data.csv. These log files also contain the region of interest (ROI) sum. In the ImageAnalysis, all these files are combined and enriched with the fitting results of the CMOS images. Thus, we can treat this source as a single file.
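
For reference, collecting the per-run log files by hand could look like the sketch below. It assumes one `all_data.csv` directly inside each run folder under a common root, as in the path above. The helper names `find_run_logs` and `load_run_logs` are hypothetical, and this is not how the ImageAnalysis does it internally.

```python
from pathlib import Path

import pandas as pd


def find_run_logs(root, filename="all_data.csv"):
    """Return the log file of every run folder directly below `root`, sorted by path."""
    return sorted(Path(root).glob(f"*/{filename}"))


def load_run_logs(root):
    """Concatenate all run logs and remember which run folder each row came from.

    Raises ValueError (from pandas.concat) if no log file is found.
    """
    frames = []
    for path in find_run_logs(root):
        frame = pd.read_csv(path)
        frame["run"] = path.parent.name  # e.g. "20220918-143000"
        frames.append(frame)
    return pd.concat(frames, ignore_index=True)
```

Keeping the run folder name as a column makes it possible to trace every row back to its original file after concatenation.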

### CMOS images
The images are distributed across folders, one for each run. With the ImageAnalysis, we can convert the images into a result table that contains, per image:
- timestamp 
- coil status
- ROI sum
- fitting results

The images are saved in the folder mot_data/20220918-143000/cmos_roidata/0917141341/ as cmos_000001.csv.

### Wavemeter
We could not find the wavemeter data.

### Linear manipulator position
We could not find the linear manipulator position data.

## Imports

In [1]:
from google.colab import drive
drive.mount('/content/drive')

filepath_drive = "/content/drive/.shortcut-targets-by-id/1B48ps8379Krem2Eym3IJNaUK-_pZ9Q2w/NP2012-AVF72-05"

Mounted at /content/drive


In [2]:
cd /content/drive/MyDrive/data-engineering-utokyo/notebooks

/content/drive/MyDrive/data-engineering-utokyo/notebooks


In [3]:
# Put the repository root (parent of notebooks/) on the import path so that `src` can be imported
import sys
sys.path.insert(0, '..')

In [4]:
# Standard 
from datetime import datetime
import numpy as np 
import pandas as pd
import matplotlib.pyplot as plt

In [5]:
# Recorders
from src.recorders.heat_time_recorder import HeatTimeRecorder
from src.recorders.ssd_recorder import SSDRecorder
from src.recorders.file_recorder import FileRecorder

# Analyses
from src.analyses.ssd_analysis import SSDAnalysis

## Data loading

### Heater
The heat time log allows us to find the interesting intervals, namely [t - 2 min, t + 2 min] around each heating time t.

In [6]:
heat_fp_18 = filepath_drive + "/HeatTimeLog/" + "20220918BT.csv"
heat_fp_19 = filepath_drive + "/HeatTimeLog/" + "20220919BT.csv"
day_18 = "2022-09-18-"
day_19 = "2022-09-19-"

In [7]:
ht_df_18 = HeatTimeRecorder(heat_fp_18, day_18).get_table()
ht_df_19 = HeatTimeRecorder(heat_fp_19, day_19).get_table()

In [8]:
ht_df_18.head(5)

Unnamed: 0,Time,VoltageDurationPower,Coil,datetime,timestamp
0,14:30:52,2V8s 1kW,ON,2022-09-18 14:30:52,1667917852000000000
1,14:36:13,2V10s 1kW,ON,2022-09-18 14:36:13,1667918173000000000
2,14:58:50,4V3s 1kW,ON,2022-09-18 14:58:50,1667919530000000000
3,15:03:27,4V3s 1kW,ON,2022-09-18 15:03:27,1667919807000000000
4,15:12:41,4V3s 1kW,ON,2022-09-18 15:12:41,1667920361000000000
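
The `VoltageDurationPower` column packs three quantities into one string. If they are needed as numbers, a parser along the following lines could be used. It is a sketch that assumes every entry follows the pattern seen in the tables of this notebook (for example `2V8s 1kW` or `5V10s 150W`). The function name `parse_heater_setting` and the output keys are hypothetical.

```python
import re

# Voltage in V, duration in s, then power in W or kW, e.g. "2V8s 1kW".
_SETTING_PATTERN = re.compile(
    r"(?P<voltage>\d+(?:\.\d+)?)V"
    r"(?P<duration>\d+(?:\.\d+)?)s\s+"
    r"(?P<power>\d+(?:\.\d+)?)(?P<unit>k?W)"
)


def parse_heater_setting(text):
    """Split a heater setting string into voltage (V), duration (s) and power (W)."""
    match = _SETTING_PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"Unexpected heater setting format: {text!r}")
    power = float(match["power"])
    if match["unit"] == "kW":
        power *= 1000.0  # normalise to watts
    return {
        "voltage_V": float(match["voltage"]),
        "duration_s": float(match["duration"]),
        "power_W": power,
    }
```

On a loaded table, this could be applied with `ht_df_18["VoltageDurationPower"].apply(parse_heater_setting)`. Entries in any other format raise a `ValueError` rather than being silently misread.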


In [9]:
ht_df_19.head(5)

Unnamed: 0,Time,VoltageDurationPower,Coil,datetime,timestamp
0,14:19:36,5V10s 150W,ON,2022-09-19 14:19:36,1667917176000000000
1,14:20:04,4V4s 1kW,ON,2022-09-19 14:20:04,1667917204000000000
2,14:27:11,5V10s 150W,ON,2022-09-19 14:27:11,1667917631000000000
3,14:27:38,4V4s 1kW,ON,2022-09-19 14:27:38,1667917658000000000
4,14:33:18,5V10s 150W,ON,2022-09-19 14:33:18,1667917998000000000
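
Turning the heating times into the intervals [t - 2 min, t + 2 min] and keeping only the measurements inside them could be sketched as below. The helper names `heating_windows` and `select_in_windows` are hypothetical, and the measurement table is synthetic. Only the two heating times are taken from the 2022-09-18 log shown above.

```python
import pandas as pd


def heating_windows(heat_times, margin=pd.Timedelta(minutes=2)):
    """Build one [t - margin, t + margin] window per heating time t."""
    heat_times = pd.to_datetime(pd.Series(heat_times))
    return pd.DataFrame({"start": heat_times - margin, "end": heat_times + margin})


def select_in_windows(df, windows, time_col="datetime"):
    """Keep the rows of `df` whose time falls into at least one window (bounds inclusive)."""
    times = pd.to_datetime(df[time_col])
    mask = pd.Series(False, index=df.index)
    for start, end in zip(windows["start"], windows["end"]):
        mask |= times.between(start, end)
    return df[mask]


windows = heating_windows(["2022-09-18 14:30:52", "2022-09-18 14:36:13"])

# Synthetic measurements standing in for any of the other sources.
measurements = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2022-09-18 14:28:00", "2022-09-18 14:29:00", "2022-09-18 14:32:00",
        "2022-09-18 14:33:30", "2022-09-18 14:35:00", "2022-09-18 14:40:00",
    ]),
    "value": [0, 1, 2, 3, 4, 5],
})
selected = select_in_windows(measurements, windows)
```

Looping over the windows is adequate for a few dozen heatings per day. For much longer logs, an interval index or an as-of join would scale better.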


### Image Analysis Results
In the separate notebook beamtime_analysis, we applied the ImageAnalysis, which estimates the MOT number from the images. This data also contains the region of interest (ROI) sum.

In [10]:
image_analysis_result_fp_18 = filepath_drive + "/results" + "/Sunday_Fr_1300_to_2100_min_signal0_nref500_deadp2" + "/image_analysis_results.csv"
image_analysis_result_fp_19 = filepath_drive + "/results" + "/Monday_Fr_1800_2100_min_signal0_nref20_deadp2" + "/image_analysis_results.csv"

In [11]:
iar_df_18 = pd.read_csv(image_analysis_result_fp_18)
iar_df_19 = pd.read_csv(image_analysis_result_fp_19)

In [12]:
iar_df_18.head(5)

Unnamed: 0,filename,filepath,filename_with_extension,Time,ROI Sum,Coil (1:ON 0:OFF),timestamp,datetime,A,A_unc,...,mu_x,mu_x_unc,mu_y,mu_y_unc,C,C_unc,X-squared,p-value,R^2,fit_successful
0,cmos_000066,/content/drive/.shortcut-targets-by-id/1B48ps8...,cmos_000066.csv,2022/09/18 13:32:07.642,4988707,1,1663507927642000000,2022-09-18 13:32:07.642,,,...,,,,,,,,,,False
1,cmos_000068,/content/drive/.shortcut-targets-by-id/1B48ps8...,cmos_000068.csv,2022/09/18 13:32:08.211,4995350,1,1663507928211000000,2022-09-18 13:32:08.211,,,...,,,,,,,,,,False
2,cmos_000069,/content/drive/.shortcut-targets-by-id/1B48ps8...,cmos_000069.csv,2022/09/18 13:32:08.493,5001682,1,1663507928493000000,2022-09-18 13:32:08.493,2367.459153,31327.088482,...,0.000346,6e-06,0.000261,0.001580762,23.186008,0.227139,5749.228983,5.186479999999999e-143,0.003346,True
3,cmos_000071,/content/drive/.shortcut-targets-by-id/1B48ps8...,cmos_000071.csv,2022/09/18 13:32:09.074,5002216,1,1663507929074000000,2022-09-18 13:32:09.074,493.741945,109.034998,...,0.000353,2e-06,0.000436,2.560902e-06,24.163239,0.147631,5699.19282,2.6528869999999998e-138,0.024132,True
4,cmos_000072,/content/drive/.shortcut-targets-by-id/1B48ps8...,cmos_000072.csv,2022/09/18 13:32:09.355,5009241,1,1663507929355000000,2022-09-18 13:32:09.355,94005.605291,36865.676273,...,0.000432,1.7e-05,0.00044,6.009432e-07,26.702313,0.284232,4134.434362,1.835077e-24,0.752007,True


In [13]:
iar_df_19.head(5)

Unnamed: 0,filename,filepath,filename_with_extension,Time,ROI Sum,Coil (1:ON 0:OFF),timestamp,datetime,A,A_unc,...,mu_x_unc,mu_y,mu_y_unc,C,C_unc,X-squared,p-value,R^2,signal_sum,fit_successful
0,cmos_033304,/content/drive/.shortcut-targets-by-id/1B48ps8...,cmos_033304.csv,2022/09/19 18:01:26.801,5652251,1,1663610486801000000,2022-09-19 18:01:26.801,,,...,,,,,,,,,,False
1,cmos_033306,/content/drive/.shortcut-targets-by-id/1B48ps8...,cmos_033306.csv,2022/09/19 18:01:27.283,5653231,1,1663610487283000000,2022-09-19 18:01:27.283,,,...,,,,,,,,,,False
2,cmos_033307,/content/drive/.shortcut-targets-by-id/1B48ps8...,cmos_033307.csv,2022/09/19 18:01:27.753,5652151,1,1663610487753000000,2022-09-19 18:01:27.753,,,...,,,,,,,,,,False
3,cmos_033309,/content/drive/.shortcut-targets-by-id/1B48ps8...,cmos_033309.csv,2022/09/19 18:01:28.234,5648730,1,1663610488234000000,2022-09-19 18:01:28.234,,,...,,,,,,,,,,False
4,cmos_033311,/content/drive/.shortcut-targets-by-id/1B48ps8...,cmos_033311.csv,2022/09/19 18:01:28.707,5638679,1,1663610488707000000,2022-09-19 18:01:28.707,15770260.0,50828470.0,...,0.000114,0.00036,1.4e-05,-31.574592,104.335409,118194.165113,0.0,0.036981,815603.14117,True
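
To relate each successfully fitted image to the most recent heating, an as-of join is one option. The sketch below uses small synthetic tables. The column names `datetime`, `ROI Sum` and `fit_successful` follow the tables above, while `heat_time` and `since_heating` are hypothetical names introduced here. It joins on the `datetime` columns rather than on `timestamp`, because in the heater tables printed above the `timestamp` values do not appear to correspond to the same date as the `datetime` values.

```python
import pandas as pd

# Synthetic stand-ins for the image analysis results and the heater log.
images = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2022-09-18 14:20:00", "2022-09-18 14:30:00",
        "2022-09-18 14:31:00", "2022-09-18 14:37:00",
    ]),
    "ROI Sum": [4988707, 4995350, 5001682, 5002216],
    "fit_successful": [True, False, True, True],
})
heater = pd.DataFrame({
    "heat_time": pd.to_datetime(["2022-09-18 14:30:52", "2022-09-18 14:36:13"]),
})

# Keep only images with a successful fit; merge_asof needs both sides sorted by key.
fits = images[images["fit_successful"]].sort_values("datetime")
merged = pd.merge_asof(
    fits,
    heater.sort_values("heat_time"),
    left_on="datetime",
    right_on="heat_time",
    direction="backward",  # attach the latest heating at or before each image
)
merged["since_heating"] = merged["datetime"] - merged["heat_time"]
```

Images taken before the first heating get `NaT` in `heat_time`, so they can be dropped or inspected separately.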


In [None]:
!git status