# MuonDataLib Tutorial 5: Advanced Filtering


In the previous tutorials we used the sample logs and times to filter the event data. However, there are some additional features that are worth discussing. 

First, let's set up the data with a sample log.


In [None]:
from MuonDataLib.data.loader.load_events import load_events
from MuonDataLib.plot.basic import Figure
from MuonDataLib.data.utils import create_data_from_function
import os
import numpy as np

file_name = 'HIFI00195790.nxs'
input_file = os.path.join('..', '..', '..', '..', 'test', 'data_files', file_name)
data = load_events(input_file, 64)

frame_start_times = data.get_frame_start_times()

def osc(x, amp, omega, phi):
    return amp*np.sin(omega*x + phi) + amp*1.1

start = frame_start_times[0]
end = frame_start_times[-1]+1
step = (frame_start_times[-1]-frame_start_times[0])/40

x, y = create_data_from_function(start, end, step, [3, 6.1, 0.91], osc, seed=1)
data.add_sample_log("Temp", x, y)

Next, let's create an unfiltered histogram.

In [None]:
no_filter_hist, bins = data.histogram()
fig = Figure(y_label='Counts')
fig.plot_from_histogram(bins, no_filter_hist, [0])
fig.show()

## Mix and match filters

In the previous tutorial we used multiple filters of the same type (e.g. `only_keep_data_time_between`). However, it is possible to use any combination of filters. Consider an experiment in which we are only interested in temperatures between $1$ and $4$ Kelvin, but we also know that the detectors had an error between $2.2$ and $3.3$ seconds. We can then add the following two filters:

In [None]:
data.remove_data_time_between('detector_error', 2.2, 3.3)
data.keep_data_sample_log_between('Temp', 1, 4)

Next, let's look at the sample log.

In [None]:
hist_mix, bins = data.histogram()
fig = Figure(y_label='Temperature (Kelvin)', x_label='Time (seconds)')
fig.plot_from_sample_log(data.get_sample_log('Temp'))
fig.show()


The result is as expected. Next, let's look at the impact the filters have on the histograms.

In [None]:
fig = Figure(y_label='Counts')
fig.plot_from_histogram(bins, no_filter_hist, [0], 'unfiltered, ')
fig.plot_from_histogram(bins, hist_mix, [0], 'filtered (mix), ')
fig.show()

We can see that the filters have removed some data as expected.
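
To make the combined effect of a time filter and a sample log filter concrete, here is a minimal NumPy sketch of the idea. It is not MuonDataLib's implementation: the helper `combined_mask`, the example arrays, and the inclusive treatment of the range boundaries are all assumptions made for illustration. A frame is kept only if it passes every filter.

```python
import numpy as np


def combined_mask(times, temps, remove_window, keep_range):
    """Return a boolean mask that is True for frames passing both filters.

    Hypothetical helper for illustration only; it is not part of MuonDataLib.
    """
    t_min, t_max = remove_window
    low, high = keep_range
    # Time filter: drop frames that start inside the removal window.
    outside_window = (times < t_min) | (times > t_max)
    # Sample log filter: keep frames whose log value lies inside the range
    # (inclusive boundaries are an assumption of this sketch).
    in_range = (temps >= low) & (temps <= high)
    # A frame survives only if it passes both filters.
    return outside_window & in_range


# Made-up frame start times (seconds) and temperatures (Kelvin).
times = np.array([0.5, 1.5, 2.5, 3.0, 3.5, 4.5])
temps = np.array([0.5, 2.0, 3.0, 5.0, 3.9, 1.2])

mask = combined_mask(times, temps, remove_window=(2.2, 3.3), keep_range=(1, 4))
kept_times = times[mask]
```

In this toy data the frame at $0.5$ seconds is rejected by the temperature range, the frames at $2.5$ and $3.0$ seconds are rejected by the time window, and the remaining three frames are kept.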

## Managing filters

When comparing two different samples, we need to make sure that both data sets use the same filters. This can be done by saving the filters to a file with the `save_filters` command:

In [None]:
filter_file = os.path.join('..', 'Output_files', 'filters.json')
data.save_filters(filter_file)

The argument defines where the file will be saved.
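
Because the file name ends in `.json`, a saved filter file can be inspected with Python's standard `json` module. The sketch below only illustrates that round trip. The tutorial does not show the schema that `save_filters` writes, so the dictionary here is made up and is written to a temporary file purely so that the example runs on its own.

```python
import json
import os
import tempfile

# Made-up filter description; the real schema written by save_filters
# is not shown in this tutorial and may differ.
example_filters = {
    "time_filters": {"detector_error": [2.2, 3.3]},
    "sample_log_filters": {"Temp": [1, 4]},
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "filters.json")
    # Write the dictionary as human-readable JSON.
    with open(path, "w") as handle:
        json.dump(example_filters, handle, indent=2)
    # Read it back, as one might do to check a saved filter file.
    with open(path) as handle:
        loaded = json.load(handle)

# The loaded dictionary should match what was written.
round_trip_ok = loaded == example_filters
```

Keeping the filters in a plain text format like this makes it easy to review them or share them alongside the analysis.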

The next step is to load the second data set. Due to storage limits on GitHub, we delete the original data set and load the same file again, but we will pretend it's a second data set.

In [None]:
del data

from MuonDataLib.data.loader.load_events import load_events
from MuonDataLib.plot.basic import Figure
from MuonDataLib.data.utils import create_data_from_function
import os
import numpy as np

file_name = 'HIFI00195790.nxs'
input_file = os.path.join('..', '..', '..', '..', 'test', 'data_files', file_name)
data_2 = load_events(input_file, 64)

frame_start_times = data_2.get_frame_start_times()

def osc(x, amp, omega, phi):
    return amp*np.sin(omega*x + phi) + amp*1.1

start = frame_start_times[0]
end = frame_start_times[-1]+1
step = (frame_start_times[-1]-frame_start_times[0])/50

x, y = create_data_from_function(start, end, step, [3, 6.1, 0.91], osc, seed=2)
data_2.add_sample_log("Temp", x, y)

The next step is to load the filters that we saved earlier.

In [None]:
data_2.load_filters(filter_file)


To check that the filters have loaded as expected, we can use the `report_filters` command.

In [None]:
print(data_2.report_filters())

We can see that both filters are present. However, this second experiment didn't have the detector error, so we can remove that filter.

In [None]:
data_2.delete_remove_data_time_between('detector_error')
print(data_2.report_filters())

We now have one filter as expected. To verify that it worked correctly, let's look at the sample log and histogram.

In [None]:
load_data, bins = data_2.histogram()
fig = Figure(y_label='Temperature (Kelvin)', x_label='Time (seconds)')
fig.plot_from_sample_log(data_2.get_sample_log('Temp'))
fig.show()

fig = Figure(y_label='Counts')
fig.plot_from_histogram(bins, no_filter_hist, [0], 'unfiltered, ')
fig.plot_from_histogram(bins, load_data, [0], 'filtered (loaded), ')
fig.show()

The final feature of the filters is the ability to delete all of them in one command. To demonstrate this, let's first add some more filters to our data.

In [None]:
data_2.only_keep_data_time_between('one', 1., 1.4)
data_2.only_keep_data_time_between('two', 2., 2.9)
print(data_2.report_filters())

As you can see, we now have three filters. To remove all of them we can use the `clear_filters` command.

In [None]:
data_2.clear_filters()
print(data_2.report_filters())

To verify that they have all been cleared, let's look at the sample log and histogram.

In [None]:
clear_data, bins = data_2.histogram()
fig = Figure(y_label='Temperature (Kelvin)', x_label='Time (seconds)')
fig.plot_from_sample_log(data_2.get_sample_log('Temp'))
fig.show()

fig = Figure(y_label='Counts')
fig.plot_from_histogram(bins, no_filter_hist, [0], 'unfiltered, ')
fig.plot_from_histogram(bins, clear_data, [0], 'cleared, ')
fig.show()

As expected, this data is identical to the unfiltered data.
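
Comparing plots by eye can miss small differences, so a numerical check is a useful complement. The following is a minimal sketch that assumes the histograms are NumPy arrays (or convertible to them). The helper `histograms_match` is hypothetical and the arrays are stand-ins; in the notebook one would pass `no_filter_hist` and `clear_data` instead.

```python
import numpy as np


def histograms_match(hist_a, hist_b):
    """Return True if two histogram arrays have the same shape and counts.

    Hypothetical helper for illustration only; it is not part of MuonDataLib.
    """
    hist_a = np.asarray(hist_a)
    hist_b = np.asarray(hist_b)
    # Check the shapes first so that mismatched arrays are never broadcast.
    return hist_a.shape == hist_b.shape and bool(np.allclose(hist_a, hist_b))


# Stand-in arrays; in the notebook these would be no_filter_hist and clear_data.
unfiltered = np.array([[10.0, 8.0, 5.0], [9.0, 7.0, 4.0]])
cleared = unfiltered.copy()

# A histogram that has lost some counts, as a filtered one would.
filtered = unfiltered.copy()
filtered[0, 1] = 0.0

same = histograms_match(unfiltered, cleared)
different = histograms_match(unfiltered, filtered)
```

Here `same` is `True` and `different` is `False`, mirroring the cleared and filtered cases above.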