Copyright (c) 2020 Ryan Cohn and Elizabeth Holm. All rights reserved. <br />
Licensed under the MIT License (see LICENSE for details) <br />
Written by Ryan Cohn

# Instance segmentation performance evaluation and sample characterization

In this example we will do the following:

  * Evaluate how well the predicted masks agree with the hand-drawn annotations
  * Perform basic sample measurements (e.g., particle size)
  * Match satellites to particles to measure the satellite content of samples
 
 
## Note: 
We lump the predictions on training images together with those on validation images. Because the available data is still very limited, the goal here is only to show the process for analyzing the results. The process is exactly the same for larger quantities of data, so after generating predictions
you can replace the file paths with more validation images, or even unlabeled images, to get a better representation of the performance of the model.

In [1]:
import json
import matplotlib.pyplot as plt
import numpy as np
import os
from pathlib import Path
import pandas as pd
import pickle
import pycocotools.mask as RLE
import seaborn as sns
import skimage
import skimage.io
from IPython.display import display
from math import pi

ampis_root = str(Path('..','..','..'))
import sys
if ampis_root not in sys.path:
    sys.path.append(ampis_root)

from ampis import analyze, data_utils
from ampis.applications import powder
from ampis.structures import InstanceSet
from ampis.visualize import display_iset

%matplotlib inline

# Loading Data
For evaluating the segmentation performance, we need to load back the original ground truth labels.

You can use your own predictions generated from before by replacing the paths, but as an example I am including mine from the fully trained model.

In [2]:
## load predicted labels

particles_path = Path('..','data','sample_particle_outputs.pickle')
assert particles_path.is_file()

satellites_path = Path('..','data','sample_satellite_outputs.pickle')
assert satellites_path.is_file()

with open(particles_path, 'rb') as f:
    particle_pred = pickle.load(f)

with open(satellites_path, 'rb') as f:
    satellites_pred = pickle.load(f)

## Load data to InstanceSet objects
To standardize the format of the ground truth and predicted instances, and for convenient analysis, everything is loaded into an InstanceSet class object.

In [3]:
# Predicted instance sets
iset_particles_pred = [InstanceSet().read_from_model_out(x, inplace=False) for x in particle_pred]
iset_satellites_pred = [InstanceSet().read_from_model_out(x, inplace=False) for x in satellites_pred]

# Powder Characterization: Size Distribution

Once we have the masks, computing various properties is straightforward. With binary masks we can use [skimage regionprops](https://scikit-image.org/docs/dev/api/skimage.measure.html#skimage.measure.regionprops), which provides many convenient measurements out of the box. If you need additional measurements, you can also access the masks directly and define your own methods.
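As a rough illustration of defining your own measurement directly from a mask, the sketch below computes a few values for a single boolean mask with plain NumPy. The helper name `mask_measurements` and the toy mask are hypothetical and not part of ampis. The equivalent diameter is the diameter of a circle with the same area as the mask, which is the same definition regionprops uses.

```python
import numpy as np

def mask_measurements(mask):
    """Return simple pixel-based measurements for one boolean mask (hypothetical helper)."""
    mask = np.asarray(mask, dtype=bool)
    area = int(mask.sum())  # number of foreground pixels
    # diameter of the circle whose area equals the mask area
    equivalent_diameter = float(np.sqrt(4.0 * area / np.pi))
    rows, cols = np.nonzero(mask)
    # bounding-box extents in pixels (0 for an empty mask)
    bbox_height = int(rows.max() - rows.min() + 1) if area else 0
    bbox_width = int(cols.max() - cols.min() + 1) if area else 0
    return {'area': area, 'equivalent_diameter': equivalent_diameter,
            'bbox_height': bbox_height, 'bbox_width': bbox_width}

# toy example: a 4 x 6 pixel rectangle inside a 10 x 10 image
toy_mask = np.zeros((10, 10), dtype=bool)
toy_mask[2:6, 1:7] = True
toy_measurements = mask_measurements(toy_mask)
print(toy_measurements)
```

The same pattern extends to any quantity that can be computed from the pixel coordinates of a mask.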

In [4]:
# region properties to compute for every instance
rprops_keys = ['area', 'equivalent_diameter', 'major_axis_length', 'minor_axis_length', 'perimeter', 'eccentricity']
# compute the properties once per instance set (skip sets that already have them)
for iset in iset_satellites_pred + iset_particles_pred:
    if iset.rprops is None:
        iset.compute_rprops(keys=rprops_keys)
# show the satellite measurements for each image
for iset in iset_satellites_pred:
    display(iset.rprops)


Unnamed: 0,area,equivalent_diameter,major_axis_length,minor_axis_length,perimeter,eccentricity,class_idx
0,[152],[13.911592676604096],[17.375878095794715],[11.359672389597044],[44.970562748477136],[0.7567010528645579],0
1,[388],[22.22649192646566],[36.619606797474795],[15.413003403113409],[89.84062043356595],[0.9071094404190556],0
2,[322],[20.248040236149336],[22.332318022508556],[18.877750349097003],[67.45584412271572],[0.534274831630293],0
3,[261],[18.229523339239496],[22.265857654370528],[15.53806070342821],[60.8700576850888],[0.7162512911402337],0
4,[1054],[36.63324282875953],[40.83619449896972],[33.64672414374235],[123.29646455628165],[0.5666717051577475],0
...,...,...,...,...,...,...,...
100,[159],[14.228319915326999],[17.03961715372614],[12.245633881067388],[47.21320343559643],[0.6953651161143564],0
101,[180],[15.138795132120961],[18.56416076202329],[12.622186400253424],[52.04163056034262],[0.7332841886444574],0
102,[19],[4.918490759365935],[6.954590648492471],[3.5273975077709983],[15.071067811865476],[0.8618259168425948],0
103,[9],[3.385137501286538],[4.312594463574638],[2.660928660904759],[8.82842712474619],[0.7869526281849772],0


Unnamed: 0,area,equivalent_diameter,major_axis_length,minor_axis_length,perimeter,eccentricity,class_idx
0,[467],[24.384480051691096],[33.11091147674296],[22.20206448267687],[100.91168824543142],[0.7418768197807244],0
1,[848],[32.85889733292062],[34.6392523481299],[31.182720388111765],[106.32590180780451],[0.43544892818348385],0
2,[533],[26.05065598682386],[32.23533222705127],[21.931676149917266],[90.18376618407356],[0.7328767625439622],0
3,[410],[22.847936741452536],[27.06193314945405],[19.36035420554165],[75.69848480983501],[0.6987057810830929],0
4,[407],[22.764193258431348],[23.77966150201993],[21.825206218486475],[71.35533905932738],[0.3970203804673073],0
...,...,...,...,...,...,...,...
77,[65],[9.09728368293446],[11.076195667183065],[7.657584449514528],[28.727922061357855],[0.7225151312719631],0
78,[73],[9.640875829802336],[14.15464484520288],[7.08855719574817],[34.14213562373095],[0.8655666662117659],0
79,[148],[13.727325035155395],[18.360779697723014],[10.917485242133752],[47.21320343559643],[0.8040149949241654],0
80,[44],[7.48482063701911],[10.692532195080702],[5.590014995852598],[25.106601717798213],[0.8524577134528233],0


Unnamed: 0,area,equivalent_diameter,major_axis_length,minor_axis_length,perimeter,eccentricity,class_idx
0,[913],[34.09498063268556],[37.3177970594049],[31.37148613236023],[112.08326112068522],[0.5415671342459734],0
1,[722],[30.319613310508885],[38.31052904207784],[24.313480451362388],[101.94112549695427],[0.7728060630941946],0
2,[2866],[60.407818494057345],[68.01844418803408],[56.61618431085523],[222.06601717798213],[0.5542275820718369],0
3,[251],[17.876888032555495],[18.01014552615005],[17.83446504130898],[56.384776310850235],[0.13933379685266373],0
4,[501],[25.25654394235911],[35.43802764572966],[19.336199521864756],[96.18376618407356],[0.8380234664540639],0
...,...,...,...,...,...,...,...
86,[50],[7.978845608028654],[8.577104426560998],[7.383960973344033],[23.071067811865476],[0.5087876631528436],0
87,[585],[27.29185104880338],[40.74145187581437],[19.841748375307485],[107.01219330881975],[0.8733929183656493],0
88,[177],[15.012108426804138],[17.14471814700666],[13.29044995254972],[47.55634918610404],[0.6317255877371419],0
89,[73],[9.640875829802336],[10.334568393025341],[9.03259452607181],[29.899494936611664],[0.48589428706710264],0


Unnamed: 0,area,equivalent_diameter,major_axis_length,minor_axis_length,perimeter,eccentricity,class_idx
0,[244],[17.625846048215095],[23.033818728256488],[14.114193624063933],[62.62741699796952],[0.790269364639981],0
1,[168],[14.625465582862905],[19.500750446051477],[11.604825686168493],[50.14213562373095],[0.8036543520235548],0
2,[2899],[60.75460015659091],[64.99329237136658],[56.90482917218646],[198.8528137423857],[0.4831288542109779],0
3,[241],[17.517155313611116],[24.48467401508363],[12.928152439139765],[60.970562748477136],[0.8492381599425521],0
4,[2058],[51.18912954002017],[60.90098124596868],[43.43566716789164],[175.53910524340094],[0.7009424315405651],0
...,...,...,...,...,...,...,...
77,[53],[8.214724333230155],[11.08245814648832],[6.4027670710494995],[27.071067811865476],[0.8162216455510621],0
78,[45],[7.569397566060481],[8.98209551620838],[6.379712028752402],[23.071067811865476],[0.7039294552051439],0
79,[200],[15.957691216057308],[19.968636309852773],[12.820825399514243],[52.62741699796952],[0.7666643409898004],0
80,[18],[4.787307364817192],[5.741271865845617],[4.015497119353468],[13.242640687119286],[0.7147216758158795],0


Unnamed: 0,area,equivalent_diameter,major_axis_length,minor_axis_length,perimeter,eccentricity,class_idx
0,[697],[29.790064831759068],[35.31079234995361],[25.456244281246814],[99.15432893255071],[0.6930186500359038],0
1,[270],[18.5411616971131],[20.1346973737354],[17.44537223513878],[60.76955262170047],[0.4992928413015595],0
2,[563],[26.77375326109316],[28.325246207171592],[25.456111286696647],[87.01219330881975],[0.43854854270195975],0
3,[552],[26.510907730475957],[31.74771167166308],[22.427018691793787],[87.94112549695427],[0.7077991766849692],0
4,[1678],[46.2222452512381],[47.8288270677604],[44.76782029530789],[151.4386001800126],[0.35199787845623],0
...,...,...,...,...,...,...,...
76,[73],[9.640875829802336],[10.883293370592105],[8.552519724622865],[29.071067811865476],[0.6184307422326941],0
77,[69],[9.373021315815206],[11.993493495282522],[7.511990802583782],[30.727922061357855],[0.77955093328846],0
78,[79],[10.02925341359355],[11.611562436081087],[8.764687735735935],[31.31370849898476],[0.6559272093071936],0
79,[64],[9.0270333367641],[11.981077270830037],[6.987759829187025],[29.31370849898476],[0.8123048992108851],0


Unnamed: 0,area,equivalent_diameter,major_axis_length,minor_axis_length,perimeter,eccentricity,class_idx
0,[182],[15.222667215103916],[16.41586639438671],[14.335865377184499],[47.89949493661167],[0.4871949229014828],0
1,[2390],[55.16377898510071],[58.23666991809482],[52.37985898895272],[181.19595949289334],[0.4370630399234085],0
2,[350],[21.11004122822376],[22.030429995762322],[20.341858099409446],[67.94112549695427],[0.3839527325284285],0
3,[1707],[46.61995176813166],[47.40864738882738],[45.922899829307944],[151.29646455628165],[0.24838718227418966],0
4,[895],[33.757212452126],[34.63166478841384],[32.94766554313486],[108.5685424949238],[0.30803815893369174],0
...,...,...,...,...,...,...,...
64,[51],[8.058239062071396],[10.193521727243967],[6.6774880623142225],[26.242640687119284],[0.7555666809223188],0
65,[38],[6.955796338302048],[10.056245781487952],[5.036367299408325],[23.071067811865476],[0.8655515572217096],0
66,[11],[3.742410318509555],[6.410747983365109],[1.9649396885733192],[11.207106781186548],[0.9518683762700559],0
67,[32],[6.383076486422923],[7.641967428360681],[5.333299525047738],[18.82842712474619],[0.7161989849411932],0


Unnamed: 0,area,equivalent_diameter,major_axis_length,minor_axis_length,perimeter,eccentricity,class_idx
0,[306],[19.738573927438622],[31.113992572390213],[13.056564186579278],[74.2842712474619],[0.9076922838988873],0
1,[462],[24.253590861306396],[33.556387876207026],[18.364484837959395],[87.01219330881976],[0.8369545526170283],0
2,[321],[20.216574731145414],[21.717598304719477],[19.076153073784226],[65.11269837220809],[0.4779757149138484],0
3,[329],[20.466944330257718],[26.42757677850081],[16.354696870913465],[69.11269837220809],[0.7855094524367009],0
4,[715],[30.172276587716105],[46.860135060070434],[20.906270735349445],[117.39696961966999],[0.8949622255484394],0
...,...,...,...,...,...,...,...
64,[130],[12.865501965161373],[13.940805879326946],[11.994435201943498],[40.72792206135786],[0.5096476680897968],0
65,[322],[20.248040236149336],[21.10879746337997],[19.475208849493878],[63.698484809834994],[0.3857316849838435],0
66,[47],[7.735777827895049],[8.16232809822246],[7.47182629324302],[23.414213562373096],[0.4025367717383524],0
67,[51],[8.058239062071396],[11.105839276049776],[6.618787953006264],[27.899494936611664],[0.8030039647241264],0


Unnamed: 0,area,equivalent_diameter,major_axis_length,minor_axis_length,perimeter,eccentricity,class_idx
0,[1247],[39.846326208130506],[46.84187195581838],[34.802108282635274],[136.5685424949238],[0.6693246553225767],0
1,[699],[29.83277462405867],[31.02556786700215],[28.745841222516255],[95.84062043356595],[0.376242971282452],0
2,[1429],[42.65512055341712],[45.299785378076045],[40.42592947717148],[139.29646455628165],[0.45122770348930924],0
3,[574],[27.034043328329254],[29.56461914883415],[25.078780930635173],[87.94112549695429],[0.5295638999831588],0
4,[208],[16.273715780512877],[20.026794550292085],[13.577794155282787],[54.384776310850235],[0.7350790601946711],0
...,...,...,...,...,...,...,...
78,[310],[19.867165345562018],[23.071531700088983],[17.8874906858028],[67.698484809835],[0.6315861206478512],0
79,[29],[6.076507779746499],[6.77151066707085],[5.582089479111626],[17.899494936611667],[0.5660815996723955],0
80,[48],[7.817640190446719],[9.058514056019789],[6.7262124712205615],[23.31370849898476],[0.669813539693384],0
81,[40],[7.136496464611085],[8.25179785493205],[6.1618042943068865],[21.071067811865476],[0.6651355471914897],0


Unnamed: 0,area,equivalent_diameter,major_axis_length,minor_axis_length,perimeter,eccentricity,class_idx
0,[609],[27.846056861676377],[28.764470793693373],[27.765804679898256],[94.76955262170047],[0.2612127188070624],0
1,[530],[25.977239243415305],[29.774002010044388],[23.174996587253748],[86.32590180780451],[0.6278138679460951],0
2,[320],[20.185060176161283],[25.931463970186492],[16.09074718939179],[68.04163056034261],[0.784198091594483],0
3,[514],[25.582125126616702],[26.7633948075632],[24.486381139607683],[81.59797974644665],[0.4036338715261486],0
4,[415],[22.98683125324351],[23.599366870484634],[22.543524220279348],[73.11269837220809],[0.2957683167376257],0
...,...,...,...,...,...,...,...
80,[331],[20.529059630371258],[22.69035803951461],[19.359125745448974],[71.11269837220809],[0.5216046381165209],0
81,[78],[9.965574970333758],[11.849392974830101],[8.40631782388618],[31.31370849898476],[0.7047756522307735],0
82,[422],[23.179885415554555],[29.973682984166462],[18.095189380997887],[77.21320343559643],[0.7972096397756279],0
83,[264],[18.333991376950163],[24.289904742474267],[14.618128880691554],[64.87005768508881],[0.7986324295451243],0


Unnamed: 0,area,equivalent_diameter,major_axis_length,minor_axis_length,perimeter,eccentricity,class_idx
0,[855],[32.99423905394037],[33.45181835559577],[32.57743022012233],[106.32590180780451],[0.2271436371258412],0
1,[419],[23.09734550211416],[23.994406314509163],[22.284828902518633],[73.698484809835],[0.37070416322260125],0
2,[290],[19.21560480373171],[19.781538481634882],[18.68657909496144],[60.04163056034262],[0.32808729104373185],0
3,[398],[22.511093682995384],[25.371733761911816],[20.32276814838881],[74.42640687119285],[0.5986638507137036],0
4,[367],[21.616635097022034],[24.381629656750036],[19.335094669842015],[69.698484809835],[0.6091969501848311],0
...,...,...,...,...,...,...,...
86,[94],[10.940041919714261],[13.48260566719357],[9.039386975851484],[34.72792206135785],[0.7419564560520797],0
87,[267],[18.437867513470433],[25.944760071937434],[13.240468182394737],[65.11269837220809],[0.8599768555651484],0
88,[2337],[54.54870132318528],[59.18644035299608],[50.46536745642868],[180.61017305526642],[0.5224811605382117],0
89,[1286],[40.464627201166934],[45.14174681564298],[38.35249326655088],[155.53910524340094],[0.5274252428839842],0


Note that the measurements are in pixels. If the pixel-to-micron conversion is known, they can be converted to physical units.
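A minimal sketch of such a conversion is shown here. Lengths scale with the conversion factor, areas scale with its square, and dimensionless quantities such as eccentricity do not change. The helper name `pixels_to_microns` is hypothetical, and the scale of 0.5 microns per pixel is a made-up value for illustration only; use the calibration of your own images.

```python
def pixels_to_microns(measurements, microns_per_pixel):
    """Convert pixel measurements to microns (hypothetical helper).

    Lengths are multiplied by the scale, areas by the scale squared,
    and all other (dimensionless) values are copied unchanged.
    """
    length_keys = {'equivalent_diameter', 'major_axis_length',
                   'minor_axis_length', 'perimeter'}
    area_keys = {'area'}
    converted = {}
    for key, value in measurements.items():
        if key in length_keys:
            converted[key] = value * microns_per_pixel
        elif key in area_keys:
            converted[key] = value * microns_per_pixel ** 2
        else:
            converted[key] = value
    return converted

# a few values from the first satellite row shown above, in pixels
row_px = {'area': 152,
          'equivalent_diameter': 13.911592676604096,
          'eccentricity': 0.7567010528645579}
# ASSUMPTION: 0.5 microns per pixel is an invented scale, not a property of this dataset
row_um = pixels_to_microns(row_px, microns_per_pixel=0.5)
print(row_um)
```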

# Satellite content measurements
The process here is fairly straightforward. We have masks for powder particles and masks for satellites. To match the satellites to their corresponding particles, we simply overlay the masks and look for intersections. Then, it is trivial to count the number of particles containing satellites. 
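The overlay idea can be sketched with plain NumPy as follows. This is only an illustration of matching by mask intersection and is not necessarily how ampis implements its matching. The function name `match_by_overlap`, the rule of assigning each satellite to the particle with the largest overlap, and the toy masks are all assumptions.

```python
import numpy as np

def match_by_overlap(particle_masks, satellite_masks):
    """Assign each satellite to the particle it overlaps most (illustrative sketch).

    Both inputs are boolean arrays of shape (n_masks, height, width) with the
    same height and width. Returns (match_pairs, unmatched): match_pairs maps a
    particle index to a list of satellite indices, and unmatched lists the
    satellites that do not intersect any particle.
    """
    particles = np.asarray(particle_masks, dtype=bool)
    satellites = np.asarray(satellite_masks, dtype=bool)
    n_pixels = particles.shape[1] * particles.shape[2]
    # flatten each mask to a vector; a dot product then counts shared pixels
    p_flat = particles.reshape(-1, n_pixels).astype(np.int64)
    s_flat = satellites.reshape(-1, n_pixels).astype(np.int64)
    overlap = p_flat @ s_flat.T  # overlap[p, s] = intersection area in pixels

    match_pairs, unmatched = {}, []
    for s_idx in range(overlap.shape[1]):
        column = overlap[:, s_idx]
        if column.size == 0 or column.max() == 0:
            unmatched.append(s_idx)
        else:
            match_pairs.setdefault(int(column.argmax()), []).append(s_idx)
    return match_pairs, unmatched

# toy data: two 3 x 3 particles in opposite corners of a 6 x 6 image
toy_particles = np.zeros((2, 6, 6), dtype=bool)
toy_particles[0, 0:3, 0:3] = True
toy_particles[1, 3:6, 3:6] = True
toy_satellites = np.zeros((4, 6, 6), dtype=bool)
toy_satellites[0, 0:2, 0:2] = True  # inside particle 0
toy_satellites[1, 4:6, 4:6] = True  # inside particle 1
toy_satellites[2, 0:2, 4:6] = True  # touches no particle
toy_satellites[3, 2:4, 0:2] = True  # partly overlaps particle 0
toy_matches, toy_unmatched = match_by_overlap(toy_particles, toy_satellites)
print(toy_matches, toy_unmatched)
```

Once satellites are grouped by particle like this, the number of satellited particles is simply the number of keys in the resulting dictionary.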



We have more labeled satellite images than particle images, so we keep only the images that have labels for both particles and satellites.
To simplify the implementation, the particle and satellite masks for each image are combined in a PowderSatelliteImage object.

In [5]:
# align the two lists so that each pair of instance sets refers to the same image
iset_particles_pred_ss, iset_satellites_pred_ss = analyze.align_instance_sets(iset_particles_pred, iset_satellites_pred)
psi_pred = []
for pp, sp in zip(iset_particles_pred_ss, iset_satellites_pred_ss):
    # the files are in the same order and there are no excess files
    assert Path(pp.filepath).name == Path(sp.filepath).name
    psi_pred.append(powder.PowderSatelliteImage(particles=pp, satellites=sp))
# match satellites to particles in each image
for psi in psi_pred:
    psi.compute_matches()
print(psi_pred)

[<ampis.applications.powder.PowderSatelliteImage object at 0x147a4d560820>, <ampis.applications.powder.PowderSatelliteImage object at 0x147a4d5602e0>, <ampis.applications.powder.PowderSatelliteImage object at 0x147a4d560a30>, <ampis.applications.powder.PowderSatelliteImage object at 0x147a4d560af0>, <ampis.applications.powder.PowderSatelliteImage object at 0x147a4d5609d0>]


# Computing Powder Characteristics
This takes all of the satellite and particle mask data and computes mean statistics. 

In [6]:
# mean region properties per image, for satellites ("sat") and particles ("par")
data = {'mean_sat_eccentricity': [], 'mean_sat_area': [], 'mean_sat_perimeter': [],
        'mean_par_eccentricity': [], 'mean_par_area': [], 'mean_par_perimeter': []}

for psi in psi_pred:
    for prefix, iset in (('sat', psi.satellites), ('par', psi.particles)):
        for prop in ('eccentricity', 'area', 'perimeter'):
            # each rprops entry is a length-1 array, so the column mean is one too
            mean_value = iset.rprops[prop].mean()[0]
            # keep four decimal places
            data[f'mean_{prefix}_{prop}'].append(float(f'{mean_value:.4f}'))
print('Mean_Sat_Eccentricity ', data['mean_sat_eccentricity'], '\nMean_sat_area: ', data['mean_sat_area'], '\nMean_sat_perimeter: ', data['mean_sat_perimeter'])
print('Mean_Par_Eccentricity ', data['mean_par_eccentricity'], '\nMean_par_area: ', data['mean_par_area'], '\nMean_par_perimeter: ', data['mean_par_perimeter'])

Mean_Sat_Eccentricity  [0.617, 0.6097, 0.6459, 0.6234, 0.6162] 
Mean_sat_area:  [361.6341, 319.0706, 291.6857, 324.2899, 411.9878] 
Mean_sat_perimeter:  [64.3735, 60.7984, 52.2803, 63.6438, 65.7459]
Mean_Par_Eccentricity  [0.5363, 0.4905, 0.5153, 0.4883, 0.4929] 
Mean_par_area:  [5821.405, 3829.444, 5096.335, 4810.9958, 4812.0134] 
Mean_par_perimeter:  [260.7941, 201.6097, 243.9873, 232.571, 229.603]


# Computing/Graphing Satellite Area Ratio Distribution

In [60]:
sat_area_ratio = []
for psi in psi_pred:
    # match_pairs maps a particle index to the list of satellite indices matched to it
    match_pairs = psi.matches['match_pairs']
    for particle_idx, satellite_idxs in match_pairs.items():
        particle_area = psi.particles.rprops.iloc[particle_idx]['area']
        for satellite_idx in satellite_idxs:
            satellite_area = psi.satellites.rprops.iloc[satellite_idx]['area']
            sat_area_ratio.append(satellite_area / particle_area)

# Optional: print the mean ratio and plot a histogram of the distribution
# print('Average Satellite Area Ratio: ', sum(sat_area_ratio) / len(sat_area_ratio))
# plt.hist(sat_area_ratio, 100, facecolor='blue', alpha=0.5)
# plt.show()

df = pd.DataFrame(sat_area_ratio, columns=['Area_Ratios'])
print(df)

     Area_Ratios
0       0.038708
1       0.006495
2       0.023236
3       0.083859
4       0.015089
..           ...
398     0.127739
399     0.105324
400     1.271715
401     1.083529
402     0.552354

[403 rows x 1 columns]


The matches for each PowderSatelliteImage object psi are stored in psi.matches as a dictionary. The key 'match_pairs' returns a dictionary whose keys are the indices of particle masks that contain satellites. The value corresponding to each key is a list of indices of the satellite masks that matched (note that multiple satellites can match with a single particle).

To compute the fraction of satellited particles, we can divide the number of keys in 'match_pairs' (the satellited particles) by the total number of particle masks.
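As a small sketch of working with a dictionary shaped like 'match_pairs', the snippet below counts satellited particles and derives a satellited fraction for a single image. The helper name `satellite_summary`, the output keys, and the toy dictionary are hypothetical; the ampis summary function in the next section aggregates over all images and may compute its values differently.

```python
from statistics import median

def satellite_summary(match_pairs, n_particles):
    """Summarize a match_pairs-style dictionary for one image (hypothetical helper).

    match_pairs maps a particle index to the list of satellite indices matched
    to that particle; n_particles is the total number of particle masks.
    """
    satellites_per_particle = [len(sats) for sats in match_pairs.values()]
    n_satellited = len(match_pairs)
    return {
        'n_satellited_particles': n_satellited,
        'n_matched_satellites': sum(satellites_per_particle),
        'satellited_fraction': n_satellited / n_particles if n_particles else float('nan'),
        'median_satellites_per_satellited_particle':
            median(satellites_per_particle) if satellites_per_particle else float('nan'),
    }

# toy example: 6 particles, 3 of which have at least one satellite
toy_match_pairs = {0: [1, 2], 3: [0], 5: [4]}
toy_summary = satellite_summary(toy_match_pairs, n_particles=6)
print(toy_summary)
```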

## Final satellite measurements

The number of satellites in each set, fraction of satellited particles, and some other information can be displayed with one command.
The results can be printed directly and are also returned.

In [51]:
print('predicted results')
results_pred = powder.satellite_measurements(psi_pred, True, True)


predicted results
number of images                   	5
number of particles                	1138
number of matched satellites       	403
number of unmatched satellites     	20
number of satellited particles     	269
fraction of satellited particles   	0.23637961335676624
median number of satellites per
satellited particle             	1.0


The results are (optionally) returned as a dictionary in case you need to store them for further analysis or post-processing.

In [22]:
results_gt

{'n_images': 5,
 'n_particles': 1360,
 'n_satellites': 585,
 'n_satellites_unmatched': 2,
 'n_satellited_particels': 315,
 'sat_frac': 0.23161764705882354,
 'mspp': 1.0,
 'unique_satellites_per_particle': array([ 1,  2,  3,  4,  5,  6,  7, 14]),
 'counts_satellites_per_particle': array([0.53333333, 0.79047619, 0.9047619 , 0.96507937, 0.98095238,
        0.99047619, 0.9968254 , 1.        ])}
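One possible way to store such a dictionary is JSON. NumPy arrays are not JSON serializable, so they have to be converted to lists first. The sketch below assumes a results dictionary shaped like the one above; the helper name `results_to_json` is hypothetical, and `example_results` copies only a few of the values shown.

```python
import json
import numpy as np

def results_to_json(results):
    """Serialize a results dictionary to a JSON string (hypothetical helper)."""
    def convert(value):
        if isinstance(value, np.ndarray):
            return value.tolist()  # arrays become plain lists
        if isinstance(value, np.generic):
            return value.item()    # NumPy scalars become Python scalars
        return value
    return json.dumps({key: convert(value) for key, value in results.items()}, indent=2)

# a subset of the dictionary shown above
example_results = {
    'n_images': 5,
    'sat_frac': 0.23161764705882354,
    'unique_satellites_per_particle': np.array([1, 2, 3, 4, 5, 6, 7, 14]),
}
json_text = results_to_json(example_results)
restored = json.loads(json_text)  # arrays come back as lists
print(json_text)
```

To write to disk instead, the same string can be saved with `Path(...).write_text(json_text)` at a location of your choosing.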