# Muon Separator Voltage Signal Exploration: Pre-capacitor Installation

In this notebook, we explore the raw voltage signal. The data was sampled at 1000 Hz with 100 elements per packet, so the DAQ sent a packet of 100 readings to the IOC every 0.1 seconds.

The aim is to see how noisy the data is and whether it contains a 50 Hz component. The data was taken before a 0.1 µF capacitor was fitted, so it is expected to be noisier than data taken afterwards. Data taken after fitting the capacitor will need further analysis.
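As a quick sanity check on the sampling figures above, the sketch below works out the packet duration, the frequency resolution a single packet would give, and how many 50 Hz cycles fit into one packet. The constant names are illustrative only and are not taken from the project code.

```python
# Figures quoted in the text above; the names are illustrative only.
SAMPLE_RATE_HZ = 1000      # DAQ sampling frequency
SAMPLES_PER_PACKET = 100   # number of elements per packet
MAINS_HZ = 50              # frequency of the component we are looking for

# Time covered by one packet of readings
packet_duration_s = SAMPLES_PER_PACKET / float(SAMPLE_RATE_HZ)
# Spacing between frequency bins if a single packet is Fourier transformed
frequency_resolution_hz = SAMPLE_RATE_HZ / float(SAMPLES_PER_PACKET)
# Highest frequency that can be represented at this sampling rate
nyquist_hz = SAMPLE_RATE_HZ / 2.0
# Number of full 50 Hz cycles contained in one packet
cycles_per_packet = MAINS_HZ * packet_duration_s

print(packet_duration_s, frequency_resolution_hz, nyquist_hz, cycles_per_packet)
```

Under these figures, 50 Hz is well below the Nyquist frequency and is a whole multiple of the frequency resolution, so a single packet should in principle be enough to show such a component.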

In [1]:
import numpy as np
import altair as alt
import os
import pandas as pd

from src.data_processing import create_data_from_entry, calibrate_data, time_period
from src.vizualization import LINE_COLOUR

# Render altair charts correctly
alt.renderers.enable('notebook')

# Store the Altair chart JSON separately rather than in the notebook, to reduce
# notebook size
alt.data_transformers.enable('json')

%load_ext autoreload
%autoreload 2

## Cleaning the data

The raw data takes the following form:

- Column 0: Time since the epoch at which the reading was taken, as recorded by the Python collection script `Muon-data-logger.py`.
- Columns 1-100: Data from the `DAQ:_RAW` record, i.e. the DAQ voltage readings. `Muon-data-logger.py` requested these readings from the IOC every 0.05 seconds.

In [2]:
raw = pd.read_csv(os.path.join(os.getcwd(), "..", "data", "raw", "muon_results.csv"), nrows=36000, header=None)

IOError: File C:\Instrument\Dev\separator-signal-analysis\notebooks\..\data\raw\muon_results.csv does not exist

In [None]:
raw.head()

Now we clean the data by converting the epoch time to a datetime, dropping any rows with duplicate readings, and resetting the index.

In [None]:
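def clean_data(dataframe):
    """
    Converts the epoch time column to a datetime column and removes duplicate rows.

    Args:
        dataframe: Pandas data frame with columns labeled 0-100.
            Column 0 is an epoch time in seconds and columns 1-100 are voltage readings.

    Returns:
        dataframe: Data frame with a "Datetime" column in place of column 0 and
            rows with duplicate readings removed.
    """
    # Convert seconds since the epoch to a pandas datetime
    dataframe["Datetime"] = pd.to_datetime(dataframe[0], unit="s")
    # Drop the original epoch column (axis=1 selects columns)
    dataframe = dataframe.drop(0, axis=1)
    # Rows are duplicates when all 100 readings match; the first occurrence is kept
    dataframe = dataframe.drop_duplicates(subset=list(range(1, 100 + 1)))
    dataframe = dataframe.reset_index(drop=True)
    return dataframe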

data = clean_data(raw)
calibrated_data = calibrate_data(data, 20)
calibrated_data.head()
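To illustrate why duplicate rows can appear: the logger polls every 0.05 seconds while the DAQ sends a new packet only every 0.1 seconds, so consecutive polls may return the same packet. The sketch below shows the de-duplication step on a small synthetic frame with 3 readings per row instead of 100. All values are made up, and `raw_example` is a hypothetical stand-in for the raw CSV.

```python
import pandas as pd

# Synthetic stand-in for the raw CSV: column 0 is the epoch time of the poll,
# columns 1-3 are readings (the real file has 100 reading columns).
raw_example = pd.DataFrame([
    [0.00, 1.0, 1.1, 1.2],
    [0.05, 1.0, 1.1, 1.2],   # same packet polled a second time
    [0.10, 1.3, 1.4, 1.5],
    [0.15, 1.3, 1.4, 1.5],   # same packet polled a second time
    [0.20, 1.6, 1.7, 1.8],
])

reading_columns = [1, 2, 3]
deduplicated = (
    raw_example
    .drop_duplicates(subset=reading_columns)  # keeps the first occurrence
    .reset_index(drop=True)
)
print(deduplicated)
```

Note that `drop_duplicates` compares all rows, not just neighbouring ones, so two genuinely different packets that happened to contain identical readings would also be collapsed into one.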

In [None]:
calibrated_data.tail()

In [None]:
calibrated_data.to_csv(os.path.join(os.getcwd(), "..", "data", "processed", "pre-capactitor-raw-data.csv"), index=False)

In [None]:
# Number of rows and columns remaining after cleaning
calibrated_data.shape

Let's now look at how the readings are spaced in time.

In [None]:
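# Timestamps of each cleaned row
times = calibrated_data.loc[:, "Datetime"]
# Gap between each row and the one before it
differences = [time2 - time1 for time1, time2 in zip(times[:-1], times[1:])]

# Mean gap over the first 10 pairs of rows
pd.Series(differences[:10]).mean()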
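The same gaps can also be computed without a Python loop. The sketch below uses `Series.diff()` on made-up timestamps; with the real data, `calibrated_data["Datetime"]` would take the place of the synthetic `times` series.

```python
import pandas as pd

# Made-up timestamps standing in for calibrated_data["Datetime"].
times = pd.Series(pd.to_datetime([0.0, 0.25, 0.5, 1.0], unit="s"))

# diff() gives the gap to the previous row; the first entry has no
# predecessor, so it is missing and is dropped.
gaps_s = times.diff().dt.total_seconds().dropna()

# Summary statistics show the spread of the gaps, not only their mean.
print(gaps_s.describe())
```

Looking at the minimum and maximum gap as well as the mean can help spot dropped or delayed packets.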

In [None]:
time_period(calibrated_data, len(calibrated_data.index) - 1)

So we have a new row of readings about every 0.2671 seconds, spread across 1964 seconds.



## Visualizing the data

First we plot the first element (column 1) of each row against the time it was collected.

In [None]:
base = alt.Chart().mark_line(color=LINE_COLOUR).encode(
    x = alt.X("Datetime:T", timeUnit="hoursminutesseconds", title="Time (h:m:s)"),
    y = alt.Y("1:Q", title="Voltage (kV)", scale = alt.Scale(domain=[84, 96]))
)

alt.layer(base, data = calibrated_data,
          title="Voltage over {} seconds".format(time_period(calibrated_data, len(calibrated_data.index) - 1)),
          config={"background": "white"},
          width = 850
         )

We now plot the first element of each of the first 450 rows against the time it was collected, to see how the voltage behaves over a shorter time period.

In [None]:
alt.layer(base, data = calibrated_data[:450],
          title="Voltage readings over {} seconds".format(time_period(calibrated_data, 450)),
          config={"background": "white"},
          width = 850
         )
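The plots above use only the first element of each row, so successive points are about 0.27 seconds apart, which is much longer than the 20 ms period of a 50 Hz signal. A single row, on the other hand, holds 100 readings taken at 1000 Hz. The sketch below reshapes one row into a long-form frame with a time offset per reading, which could then be plotted in the same way. It assumes the readings within a row are in time order and evenly spaced; the row itself is synthetic, and the column names `offset_ms` and `voltage_kv` are illustrative.

```python
import numpy as np
import pandas as pd

SAMPLE_RATE_HZ = 1000  # from the sampling description at the top of the notebook

# Synthetic stand-in for one row of calibrated_data (columns 1..100):
# a made-up 50 Hz ripple plus noise around 90 kV.
rng = np.random.RandomState(0)
t = np.arange(100) / float(SAMPLE_RATE_HZ)
row = pd.Series(
    90 + 0.5 * np.sin(2 * np.pi * 50 * t) + 0.1 * rng.randn(100),
    index=range(1, 101),
)

# Long-form frame: one line per reading, with its offset inside the packet.
packet = pd.DataFrame({
    "offset_ms": (row.index.values - 1) * 1000.0 / SAMPLE_RATE_HZ,
    "voltage_kv": row.values,
})
print(packet.head())
```

With real data, `row` would be one row of `calibrated_data` restricted to columns 1 to 100.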

## Conclusion

The signal appears to contain a sinusoidal component, but it is hard to see in these plots. More analysis is required.
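One possible next step is to look at the frequency content of a packet directly. The sketch below is not part of the original analysis: it applies NumPy's real FFT to a synthetic packet with a made-up 50 Hz ripple, to show the kind of check that could be run on columns 1 to 100 of a real row. The amplitudes and noise level are assumptions chosen for illustration.

```python
import numpy as np

SAMPLE_RATE_HZ = 1000      # sampling frequency quoted in the introduction
SAMPLES_PER_PACKET = 100   # readings per row

# Synthetic packet: made-up 50 Hz ripple plus noise around 90 kV.
rng = np.random.RandomState(0)
t = np.arange(SAMPLES_PER_PACKET) / float(SAMPLE_RATE_HZ)
packet = 90 + 0.5 * np.sin(2 * np.pi * 50 * t) + 0.1 * rng.randn(SAMPLES_PER_PACKET)

# Subtract the mean so the DC level does not dominate the spectrum.
spectrum = np.abs(np.fft.rfft(packet - packet.mean()))
# Frequency of each bin, in Hz.
freqs = np.fft.rfftfreq(SAMPLES_PER_PACKET, d=1.0 / SAMPLE_RATE_HZ)

# Frequency of the largest spectral peak.
peak_hz = freqs[np.argmax(spectrum)]
print(peak_hz)
```

On real data, averaging the spectra of many rows would likely give a clearer picture than a single packet, since the noise varies from row to row.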