Drift correction based on fiducials is implemented in B-Store in two separate parts:

1. the `FiducialDriftCorrect` processor for finding fiducials and correcting a dataset, and
2. a `ComputeTrajectories` class that describes the algorithm for fitting smoothed curves to the fiducial localizations.

Fiducial-based drift correction was split into these two parts for extensibility. You can use the interactive search feature provided by `FiducialDriftCorrect` but write your own algorithm for creating smoothed curves from the data if the default algorithm is not suitable.

In [1]:
# Be sure not to use the %pylab inline option
%pylab
from bstore import processors as proc
from pathlib import Path
import pandas as pd

Using matplotlib backend: Qt4Agg
Populating the interactive namespace from numpy and matplotlib


# Load the test data
The test data for this example is in the [B-Store test files repository](https://github.com/kmdouglass/bstore_test_files). Download or clone this repository, and set the variable below to point to */processor_test_files/test_localizations_with_fiducials.csv*.

In [2]:
pathToData = Path('../../bstore_test_files/processor_test_files/test_localizations_with_fiducials.csv')

# Load the test data
with open(str(pathToData), 'r') as f:
    df = pd.read_csv(f)

In [3]:
df.describe()

| | x [nm] | y [nm] | frame | z [nm] | uncertainty [nm] | intensity [photon] | offset [photon] | loglikelihood | sigma [nm] |
|---|---|---|---|---|---|---|---|---|---|
| count | 40000.0 | 40000.0 | 40000.0 | 40000.0 | 40000.0 | 40000.0 | 40000.0 | 40000.0 | 40000.0 |
| mean | 635.003427 | 624.977005 | 4999.5 | 0.0 | 10.001933 | 999.510493 | 99.913865 | 180.180078 | 139.885318 |
| std | 600.039317 | 592.486494 | 2886.787417 | 0.0 | 0.99813 | 100.241879 | 9.979632 | 30.054977 | 20.089461 |
| min | 0.0 | 0.0 | 0.0 | 0.0 | 5.854368 | 600.233502 | 55.167737 | 44.681219 | 59.516892 |
| 25% | 150.0 | 150.0 | 2499.75 | 0.0 | 9.329977 | 931.249431 | 93.15644 | 159.870738 | 126.406587 |
| 50% | 468.395694 | 466.950576 | 4999.5 | 0.0 | 9.998947 | 999.041524 | 99.875153 | 180.32281 | 140.063141 |
| 75% | 1021.781722 | 981.370269 | 7499.25 | 0.0 | 10.67276 | 1066.81176 | 106.621118 | 200.593788 | 153.214561 |
| max | 1590.525406 | 1562.316093 | 9999.0 | 0.0 | 14.349163 | 1410.296374 | 138.635389 | 297.525426 | 226.003164 |


The test dataset has four clusters of localizations. Two of the clusters are stationary and the other two have drifted over time.

In [4]:
plt.scatter(df['x [nm]'], -df['y [nm]'])
plt.xlabel('x-position')
plt.ylabel('y-position')
plt.axis('equal')
plt.grid(True)
plt.show()

# Basic Drift Correction
Drift correction based on fiducials is implemented using the `FiducialDriftCorrect` processor. This works just like other processors: we first create a `FiducialDriftCorrect` instance and then apply it to the DataFrame.

Drift correction is performed in two steps. In the first step, the user selects the fiducials by clicking and dragging a rectangle around the features in the 2D histogram that is displayed when the drift correction processor is called. The 2D histogram simply groups localizations in close proximity into square bins. The color of the bin encodes the number of localizations within it. Fiducials tend to lie in bins that have higher localization counts, which appear more yellow than red in the histogram.
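To make the binning idea concrete, here is a minimal sketch using NumPy's `histogram2d`. It is not B-Store's own plotting code. The synthetic localizations, the 100 nm bin width, and all variable names are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: a sparse background plus one dense, fiducial-like cluster
background = rng.uniform(0, 1600, size=(500, 2))
fiducial = rng.normal(loc=(850, 850), scale=10, size=(2000, 2))
locs = np.vstack([background, fiducial])

# Group the localizations into square bins (assumed bin width: 100 nm)
bin_size = 100
edges = np.arange(0, 1600 + bin_size, bin_size)
counts, x_edges, y_edges = np.histogram2d(locs[:, 0], locs[:, 1],
                                          bins=[edges, edges])

# The bin with the highest count is the most likely fiducial location
ix, iy = np.unravel_index(np.argmax(counts), counts.shape)
densest_bin = (x_edges[ix], x_edges[ix + 1], y_edges[iy], y_edges[iy + 1])
print('Densest bin (x_min, x_max, y_min, y_max):', densest_bin)
```

In the interactive window, a bin like this one would stand out by its color, which is what makes fiducials easy to spot by eye.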

After a region is drawn, **press the space bar to add it to the processor**. You may then select another region in the same manner. To finish searching for fiducials, simply close the window.

Try selecting the fiducial that lies between 750 and 950 nm on the x-axis and 750 and 850 nm on the y-axis by running the code below:

In [5]:
# coordCols = ['x', 'y'] by default
dc = proc.FiducialDriftCorrect(coordCols = ['x [nm]', 'y [nm]'])
processed_df = dc(df)

Performing spline fits...


In [6]:
processed_df.describe()

| | x [nm] | y [nm] | frame | z [nm] | uncertainty [nm] | intensity [photon] | offset [photon] | loglikelihood | sigma [nm] | dx | dy |
|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 30000.0 | 30000.0 | 30000.0 | 30000.0 | 30000.0 | 30000.0 | 30000.0 | 30000.0 | 30000.0 | 30000.0 | 30000.0 |
| mean | 539.330613 | 554.38452 | 4999.5 | 0.0 | 10.004109 | 999.707243 | 99.94794 | 180.196789 | 139.956211 | 40.674675 | 20.601365 |
| std | 684.158062 | 676.796187 | 2886.799445 | 0.0 | 0.995834 | 100.432718 | 9.987884 | 30.124589 | 19.998153 | 28.984676 | 14.4795 |
| min | -90.873228 | -45.680475 | 0.0 | 0.0 | 5.854368 | 600.233502 | 55.167737 | 44.681219 | 59.516892 | -9.64132 | -5.00124 |
| 25% | -15.604841 | -8.097425 | 2499.75 | 0.0 | 9.336187 | 931.078126 | 93.182242 | 159.932164 | 126.580223 | 15.609873 | 8.099959 |
| 50% | 159.295955 | 179.412412 | 4999.5 | 0.0 | 10.00168 | 999.313689 | 99.901629 | 180.239357 | 140.166669 | 40.704045 | 20.587588 |
| 75% | 1493.787031 | 1500.974928 | 7499.25 | 0.0 | 10.670678 | 1066.920403 | 106.694537 | 200.634685 | 153.220725 | 65.754171 | 33.096438 |
| max | 1528.857359 | 1523.50618 | 9999.0 | 0.0 | 14.349163 | 1410.296374 | 138.635389 | 297.525426 | 226.003164 | 90.873228 | 45.680475 |


If you performed the steps correctly, then you should notice two important changes to the DataFrame.

1. The total number of localizations decreased from 40,000 to 30,000. This happens because the processor removes fiducials from the dataset by default. To retain all localizations in the processed DataFrame, set the `removeFiducials` argument of the processor's constructor to `False`:

```python
dc = proc.FiducialDriftCorrect(coordCols = ['x [nm]', 'y [nm]'], removeFiducials = False)
```

2. Two new columns were added: **dx** and **dy**. These contain the distance by which the x- and y-columns were shifted during the drift correction. To get the original coordinate values back, you can add the values in these columns to the x- and y-coordinates.
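As a minimal sketch of the second point, the original coordinates can be recovered with plain pandas arithmetic. The small DataFrame below is made up for illustration and is not taken from the test dataset; only the column names follow the example above.

```python
import pandas as pd

# Hypothetical drift-corrected localizations with their recorded shifts
corrected = pd.DataFrame({
    'x [nm]': [95.0, 190.0],
    'y [nm]': [48.0, 97.0],
    'dx': [5.0, 10.0],
    'dy': [2.0, 3.0],
})

# Adding the shifts back restores the uncorrected coordinates
original = corrected[['x [nm]', 'y [nm]']].copy()
original['x [nm]'] = corrected['x [nm]'] + corrected['dx']
original['y [nm]'] = corrected['y [nm]'] + corrected['dy']
print(original)
```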

In [7]:
plt.scatter(df['x [nm]'], -df['y [nm]'], label = 'Original')
plt.scatter(processed_df['x [nm]'], -processed_df['y [nm]'], marker = 'd', color = 'green', label = 'Corrected')
plt.xlabel('x-position')
plt.ylabel('y-position')
plt.axis('equal')
plt.grid(True)
plt.legend()
plt.show()

# Modifying the fiducial fits
## Changing which fiducials are used in the average trajectory
We can plot the individual fiducial localizations and their fits, and even choose which fiducials to use after they have been selected. This is done using methods found inside the `driftComputer` of `FiducialDriftCorrect`.

In [8]:
dc.driftComputer.plotFiducials()

If you only had one fiducial selected, you should see one plot with the individual localizations in blue and the smoothed drift trajectory in red.

Now, **select the two fiducials in the middle and at the lower right corner.**

In [9]:
processed_df = dc(df)

Performing spline fits...


In [10]:
dc.driftComputer.plotFiducials()

This should open two windows, one for each fiducial. The average drift trajectory is in red and the localizations are again in blue.

Let's suppose now that the fiducial labeled with index `1` is too noisy or not very good. We can recompute the average trajectory by telling the drift computer to use only the fiducial at index `0`.

In [11]:
dc.interactiveSearch = False
dc.driftComputer.useTrajectories = [0]
processed_df = dc(df)
dc.driftComputer.plotFiducials()

Performing spline fits...


Here's what we have just done: we first told the drift correction processor to disable the interactive search. This means that it will retain the fiducial localization information instead of allowing the user to choose fiducials. Next, we tell the driftComputer to only use the trajectory labeled with index `0`. We perform the drift correction again on the original DataFrame (`df`) and plot the fiducials.

This time, the localizations in the fiducial track labeled with a `1` are grayed out. Additionally, the red curve should lie perfectly over the fiducials in track `0`. In this way, we correct the input localizations using only fiducial trajectory `0`.

If we want to use both fiducials again, set `dc.driftComputer.useTrajectories` to either `[0, 1]` or an empty list, `[]`, which means "use all fiducials."
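The selection rule can be illustrated with a short, self-contained sketch. This is not B-Store's code: the function `average_trajectories`, the toy trajectories, and the use of a plain unweighted mean are all assumptions made for illustration.

```python
import numpy as np

def average_trajectories(trajectories, use_trajectories=()):
    """Average the selected trajectories; an empty selection means 'use all'."""
    if len(use_trajectories) == 0:
        selected = list(trajectories)
    else:
        selected = [trajectories[i] for i in use_trajectories]
    # Assumption: a simple unweighted mean over the selected trajectories
    return np.mean(selected, axis=0)

# Two hypothetical fiducial trajectories sampled at the same three frames
trajectories = [np.array([0.0, 1.0, 2.0]), np.array([0.0, 3.0, 6.0])]

only_first = average_trajectories(trajectories, [0])  # trajectory 0 only
both = average_trajectories(trajectories, [])         # empty list: use all
print(only_first, both)
```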

In [12]:
# dc.interactiveSearch should still be False.
dc.driftComputer.useTrajectories = [] # Empty list means use all fiducials
processed_df = dc(df)
dc.driftComputer.plotFiducials()

Performing spline fits...


## Changing the smoothing spline parameters
`smoothingWindowSize` and `smoothingFilterSize` change the size of the moving average window and the Gaussian smoothing filter width, respectively. These can be adjusted to capture fine detail in the trajectories if desired.

In [13]:
dc.driftComputer.smoothingWindowSize = 50 # units are frames
dc.driftComputer.smoothingFilterSize = 25  # units are frames
processed_df = dc(df)
dc.driftComputer.plotFiducials()

Performing spline fits...


In [14]:
# Reset to default values
dc.driftComputer.smoothingWindowSize = 600 # units are frames
dc.driftComputer.smoothingFilterSize = 400  # units are frames

## Changing the zero-frame
Sometimes the fiducial beads may shift relative to one another during the first few frames of an acquisition. In this case, we can set the frame number at which all fiducial trajectories are set to zero by modifying the driftComputer's `zeroFrame`. The default value for `zeroFrame` is `0`.

In [15]:
dc.driftComputer.zeroFrame = 2500 # Trajectories will equal zero at frame 2500 instead of frame 0.
processed_df = dc(df)
dc.driftComputer.plotFiducials()

Performing spline fits...


Examining the plots, you can see that the fiducial trajectories are now zero at frame 2500 instead of frame 0. This feature can help to better align noisy fiducial tracks.
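Conceptually, changing the zero frame amounts to subtracting a trajectory's value at that frame from the whole trajectory. The sketch below shows the idea on a made-up linear drift. It is an illustration under that assumption, not the code that B-Store runs.

```python
import numpy as np
import pandas as pd

# Hypothetical trajectory: a linear drift of 0.01 nm per frame
frames = np.arange(0, 5000)
trajectory = pd.Series(0.01 * frames, index=pd.Index(frames, name='frame'))

# Shift the trajectory so that it equals zero at the chosen frame
zero_frame = 2500
shifted = trajectory - trajectory.loc[zero_frame]
print(shifted.loc[[0, 2500, 4999]])
```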

## Eliminating outlier localizations

If outliers were selected when you manually identified fiducial regions, you can prevent them from being included in the spline fits using the `maxRadius` attribute of the drift computer. Setting `maxRadius` to 50, for example, will remove all localizations farther than 50 x-y units from the center of the cluster of localizations.

In [16]:
# This should be done before computing any spline fits
dc.driftComputer.maxRadius = 50

processed_df = dc(df)
dc.driftComputer.plotFiducials()

Performing spline fits...


The outliers are the localizations that are now grayed out and have an `x` as a marker. Localizations can have x- or y-values that are greater than an outlier's x- or y-value because the rejection is performed on the *distance* from the cluster's center, i.e. \\( \sqrt{x^2 + y^2} \\).
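A radius-based rejection of this kind can be sketched in a few lines of NumPy. The points below are made up, and taking the cluster center to be the mean position is an assumption of this sketch; B-Store's own definition of the center may differ.

```python
import numpy as np

# Hypothetical fiducial region: a 3 x 3 grid of inliers plus one outlier
grid = np.array([[x, y] for x in (90.0, 100.0, 110.0)
                 for y in (90.0, 100.0, 110.0)])
outlier = np.array([[190.0, 100.0]])
locs = np.vstack([grid, outlier])

max_radius = 50.0

# Assumption: the cluster center is the mean position of all localizations
center = locs.mean(axis=0)

# Reject on the Euclidean distance from the center, not on x or y alone
radius = np.sqrt(((locs - center) ** 2).sum(axis=1))
keep = radius <= max_radius
print('Kept', keep.sum(), 'of', len(locs), 'localizations')
```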

In [17]:
# Setting it to None includes all localizations
dc.driftComputer.maxRadius = None

# Modifying the trajectory-fitting algorithm
*You may skip this section if you do not want to program your own drift computer.*

By default, B-Store uses a curve fitting algorithm based on a cubic smoothing spline. The algorithm is implemented in a class called `DefaultDriftComputer`, which implements the `ComputeTrajectories` interface. You can write your own drift computer by inheriting from this interface.

In [18]:
import inspect
print(inspect.getsource(proc.ComputeTrajectories))

class ComputeTrajectories(metaclass = ABCMeta):
    """Basic functionality for computing drift trajectories from fiducials.
    
    Attributes
    ----------
    fiducialLocs : Pandas DataFrame
        The localizations for individual fiducials.  
    
    """
    def __init__(self):
        """Initializes the trajectory computer.
        
        """
        self._fiducialData = None
        
    @property
    def fiducialLocs(self):
        """DataFrame holding the localizations for individual fiducials.
        
        """
        return self._fiducialData
        
    @fiducialLocs.setter
    def fiducialLocs(self, fiducialData):
        """Checks that the fiducial localizations are formatted correctly.
        
        """
        if fiducialData is not None:
            assert 'region_id' in fiducialData.index.names, \
                'fiducialLocs DataFrame requires index named "region_id"'
                          
            # Sort the multi-index to allow slicing
        

The `ComputeTrajectories` interface provides a property and a method:

1. `fiducialLocs` contains a DataFrame with all of the fiducial localizations. It must have at least one index with the label 'region_id' that identifies which region the localizations came from.

2. `clearFiducialLocs()` removes the localization information that is held by the drift computer.

In addition, there is one abstract method called `computeDriftTrajectory`. Any class that implements this interface must define a function with this name. As inputs, the method must accept:

1. the DataFrame containing the fiducial localizations,
2. the starting frame number in the dataset (for datasets whose first frame is not zero), and
3. the last frame number in the dataset.

As an example, the actual implementation of this interface by the `DefaultDriftComputer` is printed below:

In [19]:
print(inspect.getsource(proc.DefaultDriftComputer.computeDriftTrajectory))

    def computeDriftTrajectory(self, fiducialLocs, startFrame, stopFrame):
        """Computes the final drift trajectory from fiducial localizations.
        
        Parameters
        ----------
        fiducialLocs    : Pandas DataFrame
            DataFrame containing the localizations belonging to fiducials.
        startFrame      : int
            The minimum frame number in the full dataset.
        stopFrame       : int
            The maximum frame number in the full dataset.
            
        Returns
        -------
        self.avgSpline : Pandas DataFrame
            DataFrame with 'frame' index column and 'xS' and 'yS' position
            coordinate columns representing the drift of the sample during the
            acquisition.
            
        Notes
        -----
        computeDriftTrajectory() requires the start and stop frames
        because the fiducial localizations may not span the full range
        of frames in the dataset.
        
        """
       

The method relies on a few helper methods that are not part of the interface; `DefaultDriftComputer` uses them to fit the individual curves and combine them. Finally, it returns the averaged spline, which is a Pandas DataFrame with an index column named `frame` and regular columns named `xS` and `yS`. Any custom implementation should return the drift trajectory in this format.

In [20]:
# Print the first five values of the DataFrame returned by the drift computer
dc.driftComputer.avgSpline.head()

| frame | xS | yS |
|---|---|---|
| 0 | -18.568223 | -10.750095 |
| 1 | -18.566531 | -10.749974 |
| 2 | -18.562302 | -10.748264 |
| 3 | -18.558069 | -10.74655 |
| 4 | -18.553832 | -10.744832 |


To set the drift computer used by the `FiducialDriftCorrect` processor, you can either set its `driftComputer` property to the new computer instance or specify the `driftComputer` argument in its constructor:

```python
newDC = proc.FiducialDriftCorrect(driftComputer = myCustomComputer)
```
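To tie this section together, here is a rough sketch of what a custom drift computer might look like. So that it runs without B-Store installed, it defines a stand-in abstract class, `TrajectoryInterface`; in practice the class would inherit from `proc.ComputeTrajectories` instead. The class `MeanDriftComputer`, the column names `frame`, `x`, and `y`, and the averaging scheme are all assumptions made for illustration. Only the method name, its three arguments, and the returned format (`frame` index, `xS` and `yS` columns) come from the interface described above.

```python
from abc import ABCMeta, abstractmethod

import numpy as np
import pandas as pd


class TrajectoryInterface(metaclass=ABCMeta):
    """Stand-in for proc.ComputeTrajectories so the sketch runs on its own."""

    @abstractmethod
    def computeDriftTrajectory(self, fiducialLocs, startFrame, stopFrame):
        pass


class MeanDriftComputer(TrajectoryInterface):
    """Hypothetical computer: averages fiducial displacements per frame."""

    def computeDriftTrajectory(self, fiducialLocs, startFrame, stopFrame):
        # Assumed layout: 'region_id' index; 'frame', 'x', 'y' columns
        df = fiducialLocs.reset_index().sort_values(['region_id', 'frame'])

        # Express each fiducial relative to its own first localization
        first = df.groupby('region_id')[['x', 'y']].transform('first')
        df[['x', 'y']] = df[['x', 'y']] - first

        # Average over fiducials, then fill missing frames by interpolation
        perFrame = df.groupby('frame')[['x', 'y']].mean()
        frames = pd.Index(np.arange(startFrame, stopFrame + 1), name='frame')
        filled = perFrame.reindex(frames).interpolate(limit_direction='both')

        self.avgSpline = filled.rename(columns={'x': 'xS', 'y': 'yS'})
        return self.avgSpline


# Two made-up fiducials that drift identically; frame 2 is missing
locs = pd.DataFrame({
    'region_id': [0, 0, 0, 1, 1, 1],
    'frame': [0, 1, 3, 0, 1, 3],
    'x': [100.0, 101.0, 103.0, 500.0, 501.0, 503.0],
    'y': [200.0, 200.5, 201.5, 50.0, 50.5, 51.5],
}).set_index('region_id')

traj = MeanDriftComputer().computeDriftTrajectory(locs, 0, 3)
print(traj)
```

The start and stop frames matter here for the reason given in the docstring above: the fiducial localizations may not cover every frame, so the trajectory has to be filled in over the full frame range.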

# Summary
+ Fiducial-based drift correction is implemented in two parts: a `FiducialDriftCorrect` processor and an interface known as `ComputeTrajectories`.
+ The default drift computer in B-Store is called `DefaultDriftComputer`. It implements the `ComputeTrajectories` interface.
+ Fiducials are manually identified by setting `interactiveSearch` to True and applying the drift correction processor to a DataFrame containing your localizations.
+ Select fiducials by dragging a rectangle around them and pressing the space bar.
+ You can investigate the individual fiducial trajectories with the `plotFiducials()` method belonging to the drift computer.
+ You can change which fiducials are used by setting `interactiveSearch` to False, and then sending a list of fiducial indexes to `useTrajectories`.
+ The zero frame can be modified by setting the `zeroFrame` property of the drift computer. This may help in aligning trajectories.
+ Setting `maxRadius` to a smaller value can help reduce the effects of outlier localizations on the spline fits. Setting it to `None` includes all localizations in the manually selected region.
+ If you wish, you can write your own drift correction algorithm by implementing the `ComputeTrajectories` interface and specifying a `computeDriftTrajectory()` method.
+ Your custom drift computer may be specified in the `driftComputer` property of the `FiducialDriftCorrect` processor.