In [1]:
from imgseries import ImgSeries
from imgseries.analysis import Analysis, Results, Formatter
from imgseries.analysis import PandasFormatter, PandasTsvResults
from imgseries.viewers import AnalysisViewer

import pandas as pd
import matplotlib.pyplot as plt

%matplotlib tk

# Create custom analysis

Here, we show a minimal example that computes the minimum and maximum pixel values of each image across a whole image series.

Two classes must be defined by subclassing the following classes:
- `Analysis`: defines the calculation that is made on the images,
- `Formatter`: defines how to format and store the analysis results (e.g. in a pandas dataframe).

There are also optional classes that can be subclassed:
- `Results`: gives access to the results and analysis metadata, and can store them into (and load them from) files,
- `AnalysisViewer`: can be defined in order to show the analysis live and inspect results afterwards.

Below we will show step by step how to construct the analysis classes, using a simple image sequence to analyze:

In [2]:
images = ImgSeries('data/img1')

## 1) Define how the analysis is made (on a single image): `Analysis`

In [3]:
class MinMax(Analysis):
    """Analysis of min and max pixel values in imgseries"""

    measurement_type = 'min_max'

    def _analyze(self, img):
        """What to do on the image. Must return a dict of data"""
        val_min = img.min()
        val_max = img.max()
        return {'min': val_min, 'max': val_max}

Now the analysis can be tested on any image of the sequence, identified by its number (index) in the sequence (`num`). The image number is automatically added to the data dictionary:

In [4]:
minmax = MinMax(images)
minmax.analyze(num=20)

{'min': 27, 'max': 255, 'num': 20}
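
As a rough mental model of why `num` shows up in the output, the sketch below mimics the split between a public `analyze()` and a user-defined `_analyze()`. `ToyAnalysis`, `ToyMinMax` and the random test arrays are illustrative stand-ins and not part of imgseries; the real `Analysis.analyze()` may well do more than this.

```python
import numpy as np


class ToyAnalysis:
    """Hypothetical stand-in for a base analysis class (NOT imgseries code)."""

    def __init__(self, images):
        self.images = images  # any indexable sequence of 2D arrays

    def _analyze(self, img):
        raise NotImplementedError  # supplied by subclasses

    def analyze(self, num):
        """Fetch image number `num`, analyze it, and tag the result with `num`."""
        img = self.images[num]
        data = self._analyze(img)
        data['num'] = num
        return data


class ToyMinMax(ToyAnalysis):

    def _analyze(self, img):
        # int() converts NumPy scalars to plain Python integers
        return {'min': int(img.min()), 'max': int(img.max())}


rng = np.random.default_rng(seed=0)
toy_images = [rng.integers(0, 256, size=(4, 4)) for _ in range(3)]
result = ToyMinMax(toy_images).analyze(num=1)
print(result)
```

The subclass only has to describe what happens on one image; bookkeeping such as the image number stays in the base class.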

## 2) Define how to store the sequence of results in a table or data structure: `Formatter`

This is the role of the `Formatter` class. This class must define three methods (+1 optional):
- `_prepare_data_storage()`: How to create the data structure
- `_store_data()`: How to include the raw analysis data generated by `_analyze()` (see above) in the data structure
- `_to_results()`: How to store final data structure in an `Analysis.Results` object.
- `_regenerate_data()`: Basically the inverse function to `_store_data()` [optional, see Viewer section below]
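
To picture how these hooks fit together, here is a library-free sketch of a run loop calling them in order: prepare once, store once per image, finalize once. The call order is an assumption inferred from the method descriptions above; `ToyFormatter`, `toy_run()` and `fake_analyze()` are hypothetical names, not imgseries code.

```python
class ToyFormatter:
    """Hypothetical formatter that mirrors the hook names listed above."""

    def __init__(self):
        self.results = None

    def _prepare_data_storage(self):
        self.rows = []

    def _store_data(self, data):
        self.rows.append((data['num'], data['min'], data['max']))

    def _to_results(self):
        self.results = {'rows': list(self.rows)}


def toy_run(analyze, formatter, nums):
    """Assumed call order: prepare once, store per image, finalize once."""
    formatter._prepare_data_storage()
    for num in nums:
        formatter._store_data(analyze(num))
    formatter._to_results()
    return formatter.results


def fake_analyze(num):
    # Stand-in for Analysis.analyze(): returns made-up values
    return {'num': num, 'min': num, 'max': 255 - num}


results = toy_run(fake_analyze, ToyFormatter(), nums=range(0, 6, 2))
print(results)
```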

In [5]:
class MinMaxFormatter(Formatter):

    def _prepare_data_storage(self):
        """Prepare structure(s) that will hold the analyzed data"""
        self.min_data = []
        self.max_data = []

    def _store_data(self, data):
        """How to store data generated by analysis on a single image.

        Input
        -----
        data is a dictionary, output of Analysis.analyze()
        """
        self.min_data.append(data['min'])
        self.max_data.append(data['max'])

    def _to_results(self):
        """How to pass stored data into an AnalysisResults class/subclass.

        For most simple cases, just store the final version of your data
        structure in self.analysis.results.data.
        """
        df = pd.DataFrame(
            {'min': self.min_data, 'max': self.max_data}
        )
        self.analysis.results.data = df
        
        # OPTIONAL: metadata saving (dict); typically analysis parameters
        # metadata is saved to JSON when calling results.save()
        self.analysis.results.metadata = {'info': 'Add your metadata here'}
        
    def _regenerate_data(self, num):
        """OPTIONAL, how to move back from data structure to data dict.
        
        Basically the inverse of _store_data()
        """
        data = {}
        data['min'] = self.analysis.results.data.loc[num, 'min']
        data['max'] = self.analysis.results.data.loc[num, 'max']
        return data

Now you can run the analysis on a (sub-)sequence of the images and see the results:

In [6]:
minmax = MinMax(images, Formatter=MinMaxFormatter)
minmax.run(skip=2)
minmax.results.data.head()

100%|████████████████████████████████████████████████████████████████████████████| 15/15 [00:00<00:00, 371.65it/s]


| | min | max |
|---|---|---|
| 0 | 26 | 255 |
| 1 | 27 | 255 |
| 2 | 28 | 255 |
| 3 | 28 | 255 |
| 4 | 29 | 255 |


When the data is easily stored in a pandas dataframe, it is often more convenient to use a pre-defined pandas formatter, which automatically adds information about the images (num, name, etc.). Note for example that in the results above, the index of the dataframe is not the image number (`num`); with `PandasFormatter`, this information is added automatically.
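
The difference between a positional index and an index made of image numbers matters for label-based lookups such as `.loc[num, 'min']`. The pandas-only sketch below uses made-up values standing in for a run with `skip=2`; it only illustrates the indexing issue and does not reproduce what `PandasFormatter` does internally.

```python
import pandas as pd

# Made-up values standing in for an analysis run with skip=2
nums = list(range(0, 10, 2))          # image numbers actually analyzed
df = pd.DataFrame({'min': [26, 27, 28, 28, 29], 'max': [255] * 5})

print(df.index.tolist())              # positional index: [0, 1, 2, 3, 4]

# Re-index by image number so that .loc[num] refers to the right image
df.index = pd.Index(nums, name='num')
print(df.loc[4, 'min'])               # row for image number 4
```

With the positional index, label 4 would instead point to the fifth analyzed image (image number 8 here).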

To use `PandasFormatter`, replace the `_to_results()` method with the `_to_pandas()` method:

In [7]:
class MinMaxFormatter_Pandas(PandasFormatter):

    def _prepare_data_storage(self):
        """SAME AS ABOVE"""
        self.min_data = []
        self.max_data = []

    def _store_data(self, data):
        """SAME AS ABOVE"""
        self.min_data.append(data['min'])
        self.max_data.append(data['max'])

    def _to_pandas(self):
        """(Almost) SAME AS ABOVE, but return the DataFrame instead. 
        
        (storing in analysis.results.data is managed by _to_results()
        which is itself managed by PandasFormatter behind the scenes.)
        """
        df = pd.DataFrame(
            {'min': self.min_data, 'max': self.max_data}
        )
        return df
    
    def _regenerate_data(self, num):
        """SAME AS ABOVE"""
        data = {}
        data['min'] = self.analysis.results.data.loc[num, 'min']
        data['max'] = self.analysis.results.data.loc[num, 'max']
        return data

In [8]:
minmax = MinMax(images, Formatter=MinMaxFormatter_Pandas)
minmax.run(skip=2)
minmax.results.data.head()

100%|████████████████████████████████████████████████████████████████████████████| 15/15 [00:00<00:00, 387.88it/s]


| num | folder | filename | time (unix) | min | max |
|---|---|---|---|---|---|
| 0 | data/img1 | img-00610.png | 1696408000.0 | 26 | 255 |
| 2 | data/img1 | img-00612.png | 1696408000.0 | 27 | 255 |
| 4 | data/img1 | img-00614.png | 1696408000.0 | 28 | 255 |
| 6 | data/img1 | img-00616.png | 1696408000.0 | 28 | 255 |
| 8 | data/img1 | img-00618.png | 1696408000.0 | 29 | 255 |


In [9]:
data = minmax.formatter._to_pandas()
nums = minmax.formatter.analysis.nums

In [10]:
data.index = nums

Note that in the results above, the unix time is extracted automatically from the image files. To import real time data for the images (if available), see `ImgSeries.load_time()`.

Finally, once you have decided on a Formatter, you can set it as the default formatter of your Analysis class, so that you don't have to pass it every time:

In [11]:
class MinMax_Pandas(MinMax):
    """Version of MinMax with PandasFormatter as default."""

    DefaultFormatter = MinMaxFormatter_Pandas
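
The idea of a class-level default that an explicit argument can still override can be sketched without imgseries. The classes below are hypothetical, and the fallback logic is an assumption about how such a default is typically resolved, not a copy of the library's constructor.

```python
class ToyFormatterA:
    name = 'A'


class ToyFormatterB:
    name = 'B'


class ToyAnalysis:
    """Hypothetical base class (NOT the imgseries implementation)."""

    DefaultFormatter = ToyFormatterA

    def __init__(self, Formatter=None):
        # Fall back on the class-level default when nothing is passed
        formatter_class = self.DefaultFormatter if Formatter is None else Formatter
        self.formatter = formatter_class()


class ToyAnalysisB(ToyAnalysis):
    DefaultFormatter = ToyFormatterB   # subclasses only override the attribute


default_name = ToyAnalysis().formatter.name
subclass_name = ToyAnalysisB().formatter.name
override_name = ToyAnalysisB(Formatter=ToyFormatterA).formatter.name
print(default_name, subclass_name, override_name)
```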

In [12]:
minmax = MinMax_Pandas(images)
minmax.run(skip=2)
minmax.results.data.head()

100%|████████████████████████████████████████████████████████████████████████████| 15/15 [00:00<00:00, 402.20it/s]


| num | folder | filename | time (unix) | min | max |
|---|---|---|---|---|---|
| 0 | data/img1 | img-00610.png | 1696408000.0 | 26 | 255 |
| 2 | data/img1 | img-00612.png | 1696408000.0 | 27 | 255 |
| 4 | data/img1 | img-00614.png | 1696408000.0 | 28 | 255 |
| 6 | data/img1 | img-00616.png | 1696408000.0 | 28 | 255 |
| 8 | data/img1 | img-00618.png | 1696408000.0 | 29 | 255 |


## 3) How to save/load results and metadata to/from files: `Results` [optional]

By default, calling `Results.save()` only saves metadata (`Results.metadata`, the active transforms of `images`, and code versions) to a JSON file (by default, *Results.json*); the data itself is not saved, because how to save it depends on its format. The same applies to `Results.load()`.

In order to be able to use `save()` / `load()` with actual data, either:
- subclass the `Results` class to define the `_save_data()` and `_load_data()` methods (automatically called by `save()` and `load()`), or
- use a pre-defined `Results` subclass (e.g. `PandasTsvResults`, which saves pandas data to .tsv files)

In both cases, it is also possible to set a class attribute `default_filename` that sets the filename (without extension) that is used when calling `load()` or `save()` without arguments; the filename impacts both the data file and the metadata file.
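
For intuition about what a data file plus a metadata file sharing one base name looks like, here is a plain pandas/JSON round trip. It is only a sketch under assumptions: the temporary folder, the exact file layout and the made-up values are illustrative, and this is not the actual implementation of `PandasTsvResults`.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

df = pd.DataFrame(
    {'min': [26, 27, 28], 'max': [255, 255, 255]},
    index=pd.Index([0, 2, 4], name='num'),
)
metadata = {'info': 'Add your metadata here'}

with tempfile.TemporaryDirectory() as folder:
    basename = Path(folder) / 'MinMax_Results'   # filename without extension

    # Save: data as tab-separated values, metadata as JSON
    df.to_csv(basename.with_suffix('.tsv'), sep='\t')
    basename.with_suffix('.json').write_text(json.dumps(metadata, indent=4))

    # Load both files back
    loaded_df = pd.read_csv(basename.with_suffix('.tsv'), sep='\t', index_col='num')
    loaded_metadata = json.loads(basename.with_suffix('.json').read_text())

print(loaded_df)
print(loaded_metadata)
```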

Below is an example of use of `PandasTsvResults` as the results class.

In [13]:
class MinMaxResults_PandasTsv(PandasTsvResults):
    """Results class that uses pandas to save to .tsv files"""
    
    default_filename = 'MinMax_Results'


class MinMax_PandasTsv(MinMax):
    """Analysis class which uses the above Results class"""

    DefaultFormatter = MinMaxFormatter_Pandas
    DefaultResults = MinMaxResults_PandasTsv

In [14]:
minmax = MinMax_PandasTsv(images, savepath='data/untracked_data')
minmax.run(skip=2)
minmax.results.data.head()

100%|████████████████████████████████████████████████████████████████████████████| 15/15 [00:00<00:00, 398.01it/s]


| num | folder | filename | time (unix) | min | max |
|---|---|---|---|---|---|
| 0 | data/img1 | img-00610.png | 1696408000.0 | 26 | 255 |
| 2 | data/img1 | img-00612.png | 1696408000.0 | 27 | 255 |
| 4 | data/img1 | img-00614.png | 1696408000.0 | 28 | 255 |
| 6 | data/img1 | img-00616.png | 1696408000.0 | 28 | 255 |
| 8 | data/img1 | img-00618.png | 1696408000.0 | 29 | 255 |


In [15]:
minmax.results.save()


## 4) How to view and inspect results: `AnalysisViewer` [optional]

It is often convenient to view the analysis in real time, or inspect the results afterward in an interactive manner. In order to do so, it is possible to subclass `AnalysisViewer`.

**IMPORTANT NOTE**: if the live view does not appear, try calling `plt.ion()` beforehand, or `plt.show()` after the commands that require interactive matplotlib graphs. Changing matplotlib's backend may also help.

### *Live view of analysis*

Here is a minimal example that plots each image and its detected minimum in real time during the analysis.

In [16]:
class MinMaxViewer(AnalysisViewer):

    def _create_figure(self):
        """Must define self.fig and self.axs"""
        self.fig, self.axs = plt.subplots(2, 1)
        self.ax_img, self.ax_analysis = self.axs

    def _first_plot(self, data):
        """What to do when the first frame is displayed
        --> create curves and image objects etc.
        
        data is what comes out of Analysis.analyze() (dict);
        
        the 'image' and 'num' keys are automatically added by
        Analysis.analyze()
        (compared to the raw results of _analyze()).
        
        Must define self.updated_artists as an iterable of 
        matplotlib artists that will be updated in subsequent
        frames.
        """
        img = data['image']
        num = data['num']

        # image: we use the imgseries _imshow() method for convenience
        # (the display is then already calibrated by imgseries)
        self.ax_img.set_title(f'img #{num}')
        self.imshow = self.analysis.img_series._imshow(img, ax=self.ax_img)
        
        # analysis data
        self.nums = [num,]
        self.min_data = [data['min'],]
        self.min_pts, = self.ax_analysis.plot(num, self.min_data, 'o')
        
        self.updated_artists = (self.min_pts, self.imshow)

    def _update_plot(self, data):
        """What to do upon iterations of the plot after the first time."""
        img = data['image']
        num = data['num']
        self.nums.append(num)
        self.min_data.append(data['min'])

        # Update displayed image and image number
        self.ax_img.set_title(f'img #{num}')
        self.imshow.set_array(img)
    
        # Update plot of analysis data
        self.min_pts.set_data(self.nums, self.min_data)
        
        # Adapt analysis axes to fit new data
        self.ax_analysis.relim()  # without this, the axes limits are not updated
        self.ax_analysis.autoscale(axis='both')
        

class MinMax_WithViewer(MinMax):
    """Analysis class with live view option"""

    DefaultFormatter = MinMaxFormatter_Pandas
    DefaultResults = PandasTsvResults
    DefaultViewer = MinMaxViewer

In [17]:
minmax = MinMax_WithViewer(images, savepath='data/untracked_data')
minmax.run(live=True)
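
The update pattern used in `_update_plot()` (modify an existing artist with `set_data()`, then call `relim()` and `autoscale()`) can be tried in isolation with plain matplotlib. The sketch below uses made-up data and selects the non-interactive `Agg` backend only so that it runs without a display; drop that line when working interactively.

```python
import matplotlib
matplotlib.use('Agg')  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
nums, values = [0], [26]
pts, = ax.plot(nums, values, 'o')      # first frame: create the artist once

for num, value in [(2, 27), (4, 28), (6, 28)]:
    nums.append(num)
    values.append(value)
    pts.set_data(nums, values)         # later frames: update the existing artist
    ax.relim()                         # recompute data limits from the artists
    ax.autoscale(axis='both')          # rescale the view to the new limits

xmin, xmax = ax.get_xlim()
ymin, ymax = ax.get_ylim()
print((xmin, xmax), (ymin, ymax))
plt.close(fig)
```

Updating one artist in place avoids creating a new line object on every frame.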

### *Interactive inspection after analysis*

In order to use the `analysis.show()`, `analysis.inspect()` and `analysis.animate()` tools after the analysis has run, the `Formatter` used by the analysis must define the `_regenerate_data()` method (see above). This method creates a dict of data similar to the one made by `analysis.analyze()`, but from stored data instead of live analysis data.

The Viewer is the same as the one used for the live view of the analysis (see above).

In [19]:
minmax.show(num=10)

array([<Axes: title={'center': 'img #10'}>, <Axes: >], dtype=object)

In [20]:
minmax.animate()

<matplotlib.animation.FuncAnimation at 0x2991fb2b0>

In [21]:
minmax.inspect()

<imgseries.viewers.KeyPressSlider at 0x299806cb0>

**NOTE**: it is possible to load results and inspect them directly by using `analysis.regenerate()`; see the examples in the Contour Tracking and Grey Level Analysis notebooks.