# Data Analysis - Irradiation of a Fibre Optic Sensor
Measuring humidity in radioactive environments is challenging because, as you might expect, radiation severely damages sensors. Most commercial humidity sensors are electronic and fail after even a small dose of radiation.
<!-- This presents a problem for measuring humidity in many environments with high radiation - in space, in nuclear industries and in high energy physics experiments. -->
<!-- (What commercial sensors?
How much radiation kills them??) -->

However, a new type of technology has recently emerged for these cases - fibre optic sensors (FOS). These can survive radiation doses up to the megagray (MGy) level while still giving a clear signal. Even better, the change in the signal itself can be used to measure the radiation dose. Combined with recent breakthroughs in FOS for measuring humidity, these sensors are a promising potential product with applications in areas such as the nuclear industry, high energy physics research, and space travel.
<!-- Link to FOS radiation studies
Link to LPG humidity studies
How much damage would MGy do to a person? -->

This project will take you through the analysis of a sensor under irradiation, using real data recorded at the [PS-IRRAD facility](https://ps-irrad.web.cern.ch/ps-irrad/) at CERN, Geneva, Switzerland.

## Exploratory Data Analysis (EDA)
- 4 datasets in `irrad-demo-data/`:
    - `spectra.txt` - a series of spectra, each containing 20,000 points
    - `hyperion-transmission-spectrum.txt` - a single spectrum of 20,000 points
    - `SEC_data_06-11_09_2023.csv` - data from the IRRAD facility recording the dose rate over time
    - `times.txt` - a series of time values, each corresponding to a spectrum from `spectra.txt`

In [None]:
# Let's take a look at the data

In [None]:
# Import libraries
import pandas as pd
import numpy as np
# Load the data
data_dir = 'irrad-demo-data/'
spectra = np.loadtxt(data_dir + 'spectra.txt', delimiter=',')
times = np.loadtxt(data_dir + 'times.txt')
baseline = np.loadtxt(data_dir + 'hyperion-transmission-spectrum.txt')
irradiation_df = pd.read_csv(data_dir + 'SEC_data_06-11_09_2023.csv')

In [2]:
irradiation_df.head()

| | SEC_ID | TIMESTAMP | SEC_counts |
|---|---|---|---|
| 0 | SEC_01 | 11-SEP-23 11.49.12 | 25376 |
| 1 | SEC_01 | 11-SEP-23 11.49.01 | 25574 |
| 2 | SEC_01 | 11-SEP-23 11.48.53 | 25340 |
| 3 | SEC_01 | 11-SEP-23 11.48.48 | 25320 |
| 4 | SEC_01 | 11-SEP-23 11.48.31 | 25269 |
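
The `TIMESTAMP` column is loaded as plain text, and the rows above run from newest to oldest. Below is a minimal sketch of one way to parse and sort it, using only the five rows shown above so that it runs on its own. The format string is an assumption read off those rows (day, month abbreviation, two-digit year, then hour.minute.second), and `sample` is a hypothetical name used just for this illustration.

```python
import pandas as pd

# The five rows printed by irradiation_df.head(), re-typed so the sketch is standalone
sample = pd.DataFrame({
    "SEC_ID": ["SEC_01"] * 5,
    "TIMESTAMP": [
        "11-SEP-23 11.49.12",
        "11-SEP-23 11.49.01",
        "11-SEP-23 11.48.53",
        "11-SEP-23 11.48.48",
        "11-SEP-23 11.48.31",
    ],
    "SEC_counts": [25376, 25574, 25340, 25320, 25269],
})

# Assumed format: day-MON-yy hour.minute.second.
# str.title() turns "SEP" into "Sep" so it matches %b (English month names assumed).
sample["TIMESTAMP"] = pd.to_datetime(
    sample["TIMESTAMP"].str.title(), format="%d-%b-%y %H.%M.%S"
)

# Reorder from oldest to newest and renumber the index
sample = sample.sort_values("TIMESTAMP").reset_index(drop=True)
print(sample)
```

If the assumed format holds for the whole file, the same two steps could be applied to `irradiation_df` directly.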


In [6]:
data_dict = {
    'spectra.txt': spectra,
    'times.txt': times,
    'hyperion-transmission-spectrum.txt': baseline,
    'SEC_data_06-11_09_2023.csv': irradiation_df
}
for name, data in data_dict.items():
    print(f"{name} shape: {np.shape(data)}")

spectra.txt shape: (48, 20000)
times.txt shape: (48,)
hyperion-transmission-spectrum.txt shape: (20000,)
SEC_data_06-11_09_2023.csv shape: (31039, 3)


So it looks like we have 48 spectra of 20,000 points each, one time value per spectrum, and a baseline spectrum of the same length. The irradiation data has 3 columns and just over 31,000 rows.
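
Since the later analysis relies on these arrays lining up, it may be worth turning the visual check into an explicit one. The sketch below is a hypothetical helper (`check_shapes` is not part of the original notebook) that raises an error if the number of time values or the baseline length does not match the spectra. It is demonstrated here on zero-filled stand-in arrays with the shapes printed above, not on the real data.

```python
import numpy as np


def check_shapes(spectra, times, baseline):
    """Hypothetical helper: confirm the three arrays are mutually consistent."""
    if spectra.ndim != 2:
        raise ValueError(f"expected 2-D spectra, got {spectra.ndim}-D")
    n_spectra, n_points = spectra.shape
    if times.shape != (n_spectra,):
        raise ValueError(f"{n_spectra} spectra but times has shape {times.shape}")
    if baseline.shape != (n_points,):
        raise ValueError(f"{n_points} points per spectrum but baseline has shape {baseline.shape}")
    return n_spectra, n_points


# Stand-in arrays with the shapes reported above (zeros, not the measured data)
demo_spectra = np.zeros((48, 20000))
demo_times = np.zeros(48)
demo_baseline = np.zeros(20000)

n_spectra, n_points = check_shapes(demo_spectra, demo_times, demo_baseline)
print(f"{n_spectra} spectra, {n_points} points each")
```

In the notebook itself, the call would be `check_shapes(spectra, times, baseline)` on the arrays loaded earlier.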