# MuonDataLib Tutorial 3: Filtering


One of the key advantages of event data is the option to create histograms from a subset of the information. Typically this is done based on the values of sample logs. 

## Sample logs

The sample logs will automatically be loaded if they are present in the event nexus file. However, if they are missing it is possible to add them manually. 
In this example we will load some data, then create a pair of sample logs:

- An oscillation in the temperature
- A linear field

The first step is to define the functions that will generate the simulated log values.

In [None]:
import numpy as np

def linear(x, m, c):
    return m*x + c

def osc(x, amp, omega, phi):
    return amp*np.sin(omega*x + phi)
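
As a quick sanity check, the two functions can be evaluated on a short time axis before they are used to build the logs. This is only a sketch: the three time values below are made up for illustration and are not taken from the data file.

```python
import numpy as np

def linear(x, m, c):
    return m*x + c

def osc(x, amp, omega, phi):
    return amp*np.sin(omega*x + phi)

# A short, made-up time axis in seconds (not read from the data file)
times = np.array([0.0, 500.0, 1000.0])

# Same parameter values as used later in this tutorial
field_values = linear(times, 3.1, 0.1)
temp_values = osc(times, 3.0, 0.002, 0.91)

print('field:', field_values)
print('temp:', temp_values)
```

The linear log grows without bound, whereas the oscillating log always stays within plus or minus the amplitude.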

Next we will need to load the data.

In [None]:
from MuonDataLib.data.loader.load_events import load_events
from MuonDataLib.plot.basic import Figure
import os

file_name = 'HIFI00195790.nxs'
input_file = os.path.join('..', '..', '..', '..', 'test', 'data_files', file_name)
data = load_events(input_file, 64)


To create simulated data that matches the actual data, we need to know when each frame of data starts. The `get_frame_start_times` method provides a list of the start times in seconds.

In [None]:
frame_start_times = data.get_frame_start_times()
print(frame_start_times)
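
Printing a long list of start times is hard to read, so it can help to summarise it instead. The sketch below uses a hypothetical array, `example_start_times`, with one frame every $0.02$ seconds; the same NumPy calls should work on `frame_start_times` if it is array-like, which is an assumption here.

```python
import numpy as np

# Hypothetical frame start times: 50 frames, one every 0.02 s
example_start_times = np.arange(50) * 0.02

# Time between the start of consecutive frames
frame_spacing = np.diff(example_start_times)

print('number of frames:', len(example_start_times))
print('typical spacing (s):', np.median(frame_spacing))
print('last frame starts at (s):', example_start_times[-1])
```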

From the frame start times we can see that the first frame runs from $0$ to $0.02$ seconds and that the last frame starts at $1345$ seconds. Next we create the simulated sample logs, which include random noise and contain $100$ measurements spread across the whole collection period.

In [None]:
from MuonDataLib.data.utils import create_data_from_function
from MuonDataLib.plot.basic import Figure

# Sample the logs over the whole collection period in 100 steps
t = frame_start_times
step = (t[-1] - t[0])/100

# Oscillating temperature log, parameters are (amp, omega, phi)
x, y = create_data_from_function(t[0], t[-1]+1, step, [3, 0.002, 0.91], osc, seed=1)
data.add_sample_log("Temp", x, y)
fig = Figure(y_label='Temp')
fig.plot(x, y, 'Temp data')
fig.show()

# Linear field log, parameters are (m, c)
x, y = create_data_from_function(t[0], t[-1]+1, step, [3.1, 0.1], linear, seed=1)
data.add_sample_log("field", x, y)
fig2 = Figure(y_label='Field')
fig2.plot(x, y, 'Field data')
fig2.show()
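
To make the log creation step less opaque, here is a NumPy-only sketch of the general idea: evaluate a function on a regular grid and add Gaussian noise. It is not the implementation of `create_data_from_function`. The helper name `noisy_samples`, the noise level and the time range are all assumptions made for illustration.

```python
import numpy as np

def noisy_samples(start, stop, step, params, func, noise=0.1, seed=None):
    """Hypothetical helper: evaluate func on a regular grid and add Gaussian noise."""
    rng = np.random.default_rng(seed)
    x = np.arange(start, stop, step)
    y = func(x, *params) + rng.normal(0.0, noise, size=len(x))
    return x, y

def linear(x, m, c):
    return m*x + c

# Illustrative range: 0 to 1345 s in steps of 13.45 s (about 100 points)
x, y = noisy_samples(0.0, 1345.0, 13.45, [3.1, 0.1], linear, seed=1)
print('number of points:', len(x))
print('largest deviation from the noise-free line:', np.max(np.abs(y - linear(x, 3.1, 0.1))))
```

Fixing the seed makes the noise reproducible, so the same simulated log is produced on every run.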

Now we have the sample log data we can look at the different types of filters. The two plots above will be helpful for creating sensible filters.

## Sample log filters

The first type of filter we will look at is one that is directly applied based on a sample log value. 

In [None]:
#data.add_time_filter('early frames', 0.0, 1.98e-2)
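
One way to think about a filter based on a sample log value is as a set of time windows in which the log satisfies some condition. The sketch below finds those windows for a small, made-up log using NumPy only. The helper `windows_where` is hypothetical and is not part of MuonDataLib.

```python
import numpy as np

def windows_where(x, y, condition):
    """Hypothetical helper: (start, end) times of each run where condition(y) is True."""
    mask = condition(y)
    # Pad with False so every run of True values has a rising and a falling edge
    padded = np.concatenate(([False], mask, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    starts = edges[0::2]       # index of the first True in each run
    ends = edges[1::2] - 1     # index of the last True in each run
    return [(float(x[s]), float(x[e])) for s, e in zip(starts, ends)]

# Made-up log: one value per second
x = np.arange(0.0, 10.0, 1.0)
y = np.array([0, 1, 3, 4, 2, 0, 0, 5, 6, 1], dtype=float)

windows = windows_where(x, y, lambda v: v > 2.5)
print(windows)
```

Each pair gives the first and last log time at which the condition holds, so a log that crosses the threshold several times produces several windows.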

We can then add a second filter that covers a later time range, from $1100$ to $1300$ seconds.

In [None]:
#data.add_time_filter('more frames', 1.1e3, 1.3e3)

If we then notice a mistake, for example that the `early frames` filter should remove the first $100$ seconds rather than just the first frame, we can correct it. However, simply reusing the same filter name (in this case `early frames`) leads to an error because that filter already exists.

Replacing a filter requires two steps. The first is to remove the original one using the `remove_time_filter` method, which takes the filter name as an argument. If the filter name does not exist it will produce an error. The second step is to add the filter again with the correct values.

In [None]:
#data.remove_time_filter('early frames')
#data.add_time_filter('early frames', 0.0, 100)
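
The remove-then-add pattern can be illustrated with a toy example. The `ToyFilters` class below is a hypothetical stand-in written for this sketch. It is not the MuonDataLib implementation, and the exception type and messages are assumptions.

```python
class ToyFilters:
    """Hypothetical named filter store; not the MuonDataLib class."""

    def __init__(self):
        self._filters = {}

    def add(self, name, start, end):
        # Refuse to silently overwrite an existing filter
        if name in self._filters:
            raise RuntimeError(f'The filter {name} already exists')
        self._filters[name] = (start, end)

    def remove(self, name):
        # Removing an unknown filter is also treated as an error
        if name not in self._filters:
            raise RuntimeError(f'The filter {name} does not exist')
        del self._filters[name]

    def report(self):
        return dict(self._filters)

filters = ToyFilters()
filters.add('early frames', 0.0, 1.98e-2)

try:
    filters.add('early frames', 0.0, 100.0)
except RuntimeError as err:
    print('error:', err)

# Step 1: remove the old filter. Step 2: add it back with the new values.
filters.remove('early frames')
filters.add('early frames', 0.0, 100.0)
print(filters.report())
```

Raising an error on a duplicate name means a filter can never be overwritten by accident, at the cost of one extra call when a change is intended.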

To check that the filter has done something, we will produce a histogram of the data. 

In [None]:
#filter_hist, bins = data._events.histogram()

#fig = Figure(y_label='Counts')
#fig.plot(bins, full_hist[21], label='unfiltered')
#fig.plot(bins, filter_hist[21], label='filtered')
#fig.show()

As you can see from the plot, the filtered data has a smaller amplitude as expected. 

It is also possible to get a summary of the current filters by using the `report_filters` method. If you want to remove all of the filters at once, there is also a `clear_time_filters` method. The code below shows the summary before and after clearing the filters.

In [None]:
#print('before clear:', data.report_filters())
#data.clear_time_filters()
#print('after clear:', data.report_filters())


For the next example we will add one filter that removes approximately half of the data. This can easily be done by halving the time for the start of the last frame (`frame_start_times[-1]`). We then plot the filtered and unfiltered data.

In [None]:
#data.add_time_filter('half', 0, 0.5*frame_start_times[-1])
#filter_hist, bins = data._events.histogram()

#fig = Figure(y_label='Counts')
#fig.plot(bins, full_hist[21], label='unfiltered')
#fig.plot(bins, filter_hist[21], label='filtered')
#fig.show()
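
The claim that this filter removes about half of the data can be checked with a small calculation. The sketch below uses hypothetical, evenly spaced frame start times and assumes that a frame is removed when its start time lies inside the filter window. How MuonDataLib treats frames at the window edges is not covered here.

```python
import numpy as np

# Hypothetical frame start times: 1000 frames, one every 0.02 s
example_start_times = np.arange(1000) * 0.02

# Filter window from zero to half of the last frame start time
filter_start = 0.0
filter_end = 0.5 * example_start_times[-1]

# Assumption: a frame is removed if its start time is inside the window
removed = (example_start_times >= filter_start) & (example_start_times <= filter_end)
kept_fraction = 1.0 - removed.mean()

print('fraction of frames kept:', kept_fraction)
```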

Sometimes it is helpful to save a complex series of filters to a file so that they can be reused at a later date. This can be achieved with the `save_filters` method, which takes the name of the file you want to create as an argument.

In [None]:
#filter_file = os.path.join('..', 'Output_files', 'filters_tutorial_3.json')
#data.save_filters(filter_file)
#data.clear_time_filters()

The above also cleared the filters to make sure that the loading does not pick up any of the previous filters. 
The loading is done with the `load_filters` method, which takes a file name as an argument. As you can see it reproduces the filters as expected. 

In [None]:
#data.load_filters(filter_file)
#print('loaded', data.report_filters())
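
Since the filter file has a `.json` extension, the save and load steps can be pictured as a JSON round trip. The sketch below uses only the Python standard library. The dictionary layout and the file name `filters_example.json` are assumptions for illustration and may not match what `save_filters` actually writes.

```python
import json
import os
import tempfile

# Assumed layout: filter name mapped to [start, end] in seconds
filters = {'half': [0.0, 672.5]}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'filters_example.json')

    # Save the filters to file
    with open(path, 'w') as handle:
        json.dump(filters, handle, indent=2)

    # Load them back and compare with the original
    with open(path) as handle:
        loaded = json.load(handle)

print('round trip matches:', loaded == filters)
```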