## SmoothData Demo
<div class="alert alert-block alert-info">
<p><b>Information:</b> This notebook shows that a <code>scipp</code> function developed for smoothing data is equivalent to the original <code>SmoothData</code> algorithm found in <code>Mantid</code>.</p>
    
<p>A random dataset is generated from a Gaussian distribution with Gaussian errors scaled by the square root of the signal.  
A second dataset with a few outliers is generated as well to see how the smoothing handles such cases.  
Data is smoothed using both solutions and the results are plotted.</p>

<p><b>Requirements:</b> To run this notebook, you need <code>mantid</code>, <code>scipp</code>, <code>matplotlib</code> and <code>numpy</code> installed, as well as the Python script <code>smooth_data.py</code> (imported below as <code>smooth_data</code>) placed in the same folder as this notebook.</p>
</div>
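
The random data generated below differs on every run. As a minimal sketch, assuming reproducible input is wanted when comparing the two implementations, the same kind of dataset can be drawn from a seeded generator. The seed value and the 7% outlier probability are illustrative choices, not values taken from this notebook.

```python
import numpy as np

# Hypothetical seed: any fixed integer makes the dataset reproducible
rng = np.random.default_rng(seed=1234)

# Gaussian peak on a constant background, same shape as used in this notebook
x = np.arange(30)
y0 = 20.0 + 50.0 * np.exp(-(x - 8.0) ** 2 / 120.0)

# Gaussian noise scaled by the square root of the signal
y = y0 + np.sqrt(y0) * rng.normal(size=x.size)
err = np.sqrt(y)

# Outliers: add 45 counts to each point with a small, illustrative probability
is_outlier = rng.random(x.size) < 0.07
y_outlier = y + 45.0 * is_outlier
err_outlier = np.sqrt(y_outlier)
```

Re-running this cell with the same seed yields the same arrays, so a discrepancy between the two smoothing routines can be revisited on identical input.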

In [None]:
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = [10, 8]
%matplotlib notebook

import mantid.simpleapi as mapi
import numpy as np
import scipp as sc

import smooth_data

In [None]:
# Create data with a Gaussian peak on a constant background (Mantid plot example)
x = np.arange(30)
y0 = 20. + 50. * np.exp(-(x - 8.)**2. / 120.)
err0 = np.sqrt(y0)

# Add random noise
y = y0 + err0 * np.random.normal(size=len(err0))
err = np.sqrt(y)

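# Add a dataset with outliers: np.around(...) is 1 for roughly 7% of the
# points (uniform sample scaled to [0, 0.54)), and those points get 45 extra counts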
y_outlier = y + 45. * np.around(np.random.sample(size=len(err))*0.54)
err_outlier = np.sqrt(y_outlier)

In [None]:
def plot_comparison_mantid(x, y0, y, err, NPoints):
    """
    Plot input and output of Mantid's SmoothData
    Three curves are displayed:
    - initial distribution
    - initial distribution with added noise (used as input of SmoothData)
    - output of SmoothData
  
    x: array of values for the coordinate
    
    y0: array of values for the counts

    y: array of "noisy" values for the counts
    
    err: std values associated with y-values
    
    NPoints: number of points to use in the mean for SmoothData (odd number)
    """
    fig, ax = plt.subplots()
    # plot the initial distribution with black line
    ax.plot(x, y0, 'k-', label='Original data') 
    # plot initial noisy data with errorbars, using red squares
    ax.errorbar(x,
                y,
                yerr=err,
                fmt='rs',
                label='Original data with noise') 
    
    # Create Mantid workspaces to apply SmoothData
    w = mapi.CreateWorkspace(DataX=x,
                             DataY=y,
                             DataE=err,
                             NSpec=1,
                             UnitX='tof')
    smooth = mapi.SmoothData(w, NPoints)
    
    # Plot output of SmoothData with errorbars, using blue circles
    ax.errorbar(smooth.readX(0),
                smooth.readY(0),
                yerr=smooth.readE(0),
                fmt='bo',
                label=f'Smoothed with {NPoints} points') 
    ax.legend()
    ax.set_xlabel(r'Time-of-flight ($\mu$s)')  # raw string avoids the invalid '\m' escape
    ax.set_title('Using Mantid')
    ax.grid()

### Reference Mantid plot, 3 and 5 point smoothing
Here we see a comparison between smoothing with 3 and 5 points using the reference Mantid routine.
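
To make "smoothing with N points" concrete, here is a plain NumPy sketch of a centered moving average with Gaussian error propagation. It is an illustration under stated assumptions, not the source of either implementation: the function name `boxcar_smooth` is hypothetical, and the window is assumed to shrink at the edges of the data so that only existing points are averaged.

```python
import numpy as np

def boxcar_smooth(values, errors, npoints):
    """Centered moving average over `npoints` neighbours (illustrative sketch).

    Assumption: near the edges the window is truncated to the points that exist.
    Errors are propagated as sqrt(sum of variances) / number of points averaged.
    """
    if npoints < 3 or npoints % 2 == 0:
        # Choice made for this sketch: insist on an odd window of at least 3
        raise ValueError("npoints must be an odd integer >= 3")
    values = np.asarray(values, dtype=float)
    variances = np.asarray(errors, dtype=float) ** 2
    half = npoints // 2
    out_values = np.empty_like(values)
    out_errors = np.empty_like(values)
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        n = hi - lo  # number of points actually inside the window
        out_values[i] = values[lo:hi].sum() / n
        out_errors[i] = np.sqrt(variances[lo:hi].sum()) / n
    return out_values, out_errors

smoothed, smoothed_err = boxcar_smooth([1.0, 2.0, 3.0, 4.0, 5.0],
                                       [1.0, 1.0, 1.0, 1.0, 1.0], 3)
```

With unit errors, an interior point of the 3-point average ends up with an error of `sqrt(3) / 3`, which is why the smoothed error bars in the plots are shorter than the input ones.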

In [None]:
plot_comparison_mantid(x, y0, y, err, 3)
plot_comparison_mantid(x, y0, y, err, 5)

### Reference Mantid plot, 3 and 5 point smoothing of data with outliers

Here we see a comparison between smoothing with 3 and 5 points using the reference Mantid routine. A few outliers have been added to this dataset.
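
As a rough illustration of what to expect in these plots, the sketch below applies a plain moving average to a flat signal with a single outlier of 45 counts. It uses `np.convolve` with a uniform kernel as a stand-in for the smoothing and is not a call into Mantid or scipp. Only points whose full window fits inside the data are kept (`mode="valid"`).

```python
import numpy as np

# Flat signal of 20 counts with one outlier of +45 counts in the middle
flat = np.full(7, 20.0)
spiked = flat.copy()
spiked[3] += 45.0

# Uniform kernels for 3- and 5-point averages
kernel3 = np.ones(3) / 3
kernel5 = np.ones(5) / 5

# mode="valid" keeps only points with a complete window
smooth3 = np.convolve(spiked, kernel3, mode="valid")
smooth5 = np.convolve(spiked, kernel5, mode="valid")

# Excess above the flat level after smoothing
excess3 = smooth3 - 20.0
excess5 = smooth5 - 20.0
```

The outlier is not removed. It is spread over the window: three neighbouring points are raised by 15 counts for the 3-point average, and five by 9 counts for the 5-point average.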

In [None]:
plot_comparison_mantid(x, y0, y_outlier, err_outlier, 3)
plot_comparison_mantid(x, y0, y_outlier, err_outlier, 5)

In [None]:
def plot_comparison_mantid_scipp(x, y0, y, err, NPoints):
    """ 
    Create plot comparing scipp and mantid's implementation of SmoothData
    
    x: array of values for the coordinate
    
    y0: array of values for the counts

    y: array of "noisy" values for the counts
    
    err: std values associated with y-values
    
    NPoints: number of points to use in the mean for SmoothData (odd number)
    """
    fig, ax = plt.subplots()
    ax.grid()
   
    ax.set_title('Mantid and scipp implementation of SmoothData')
    ax.set_xlabel(r'Time-of-flight ($\mu$s)')  # raw string avoids the invalid '\m' escape
    
    # plot the noisy input data with errorbars, using red squares
    ax.errorbar(x, y, yerr=err, fmt='rs', label='Original data with noise')
    
    # plot the initial distribution with black line
    ax.plot(x, y0, 'k-', label='Original data')
    
    # Calculate scipp's version of smoothdata and add output to plot
    input_y = sc.Variable(dims=['tof'], values=y, variances=err**2, unit=sc.units.us)
    output = smooth_data.smooth_data(input_y, dim='tof', NPoints=NPoints)
    # plot with errorbars, using blue circles
    ax.errorbar(x, output.values, yerr=np.sqrt(output.variances), fmt='bo', label=f'Smoothed in scipp with {NPoints} points')
    
    # Calculate Mantid's version of SmoothData and add output to plot
    w = mapi.CreateWorkspace(DataX=x, DataY=y, DataE=err, NSpec=1, UnitX='Tof')
    smooth_mtd = mapi.SmoothData(w, NPoints)
    # plot with errorbars, using green crosses
    ax.errorbar(smooth_mtd.readX(0),
                smooth_mtd.readY(0),
                yerr=smooth_mtd.readE(0),
                fmt='gx',
                label=f'Smoothed in Mantid with {NPoints} points') 
    ax.legend()

### Scipp plot, 3 and 5 point smoothing
Here we see a comparison between smoothing with 3 and 5 points using the developed scipp routine and Mantid's version. Results from scipp (blue circles) visually coincide with those from Mantid (green crosses).

In [None]:
plot_comparison_mantid_scipp(x, y0, y, err, 3)
plot_comparison_mantid_scipp(x, y0, y, err, 5)

### Scipp plot, 3 and 5 point smoothing of data with outliers

Here we see a comparison between smoothing with 3 and 5 points using the developed scipp routine and Mantid's version. A few outliers have been added to this dataset. In the plots, the scipp routine handles the outliers in the same way as the reference.

In [None]:
plot_comparison_mantid_scipp(x, y0, y_outlier, err_outlier, 3)
plot_comparison_mantid_scipp(x, y0, y_outlier, err_outlier, 5)

## Numerical test for identical results
Here the two methods are applied to the same dataset. Instead of plotting the results, NumPy's `allclose` is used to check that the returned values and errors agree within its default tolerances (`rtol=1e-05`, `atol=1e-08`).
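
For reference, `np.allclose(a, b)` tests `|a - b| <= atol + rtol * |b|` element by element. The sketch below uses made-up arrays (not output of either smoothing routine) to show what passes and what fails with the default tolerances. It also shows `np.testing.assert_allclose`, which raises an error and reports the mismatch instead of returning `False`.

```python
import numpy as np

# Made-up reference values, standing in for one routine's output
reference = np.array([20.0, 35.5, 70.0])

# Relative deviation of 1e-7: well inside the default rtol of 1e-5
candidate = reference * (1 + 1e-7)
# Relative deviation of 1e-3: outside the default tolerances
off = reference * (1 + 1e-3)

same = np.allclose(candidate, reference)
differs = np.allclose(off, reference)

# A single number summarising the largest disagreement
max_abs_diff = np.max(np.abs(off - reference))

# Raises AssertionError with a diagnostic message if the arrays disagree
np.testing.assert_allclose(candidate, reference, rtol=1e-5, atol=1e-8)
```

Here `same` is `True` and `differs` is `False`. Note that `np.testing.assert_allclose` has stricter defaults (`rtol=1e-07`, `atol=0`) than `np.allclose`, so the tolerances are passed explicitly to keep the two checks comparable.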

In [None]:
# Scipp smooth
input_y = sc.Variable(dims=['tof'], values=y, variances=err**2, unit=sc.units.us)
output = smooth_data.smooth_data(input_y, dim='tof', NPoints=3)

# Mantid smooth reference
w = mapi.CreateWorkspace(DataX=x,
                         DataY=y,
                         DataE=err,
                         NSpec=1,
                         UnitX='tof')
smooth = mapi.SmoothData(w, 3)

np.allclose(output.values, smooth.readY(0))

In [None]:
np.allclose(np.sqrt(output.variances), smooth.readE(0))

In [None]:
# Scipp smooth
output = smooth_data.smooth_data(input_y, dim='tof', NPoints=5)

# Mantid smooth reference
smooth = mapi.SmoothData(w, 5)

np.allclose(output.values, smooth.readY(0))

In [None]:
np.allclose(np.sqrt(output.variances), smooth.readE(0))