# Example of how to use the nuQLOUD framework
In this notebook, we demonstrate how to use the nuQLOUD framework on real data. The data is part of the publication and consists of one table containing the x, y, z coordinates of the cells of three samples aged 12 hpf, 24 hpf and 48 hpf. The table also contains the following annotation columns:
* `sample`: name of the sample. Corresponds to the image file name.
* `cell id`: unique number identifying the cell. Has to be > 0 for use with voro++.
* `age`: age of the sample in hours post fertilisation (hpf)
* `archetype`: boolean; True=Amorphous, False=Crystalline
* `cdh1`, `cdh2`: boolean; binary classification of each cell's expression of E-cadherin (`cdh1`) and N-cadherin (`cdh2`), quantified from transgenic reporter fish lines.
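
The constraint on `cell id` can be checked before any Voronoi computation. The following is a minimal sketch, not part of nuQLOUD: the helper `check_cell_ids` is hypothetical, and it assumes that IDs only need to be unique within a sample (the column description does not say whether uniqueness is per sample or global).

```python
import pandas as pd

def check_cell_ids(table, id_column="cell id", group_column="sample"):
    """Return a list of problems found in the ID column (empty if none).

    Hypothetical helper for illustration; not a nuQLOUD function.
    """
    problems = []
    ids = table[id_column]
    # voro++ needs strictly positive IDs.
    if (ids <= 0).any():
        problems.append(f"{int((ids <= 0).sum())} IDs are not > 0")
    # Assumption: an ID must be unique within its sample.
    duplicated = table.duplicated(subset=[group_column, id_column])
    if duplicated.any():
        problems.append(f"{int(duplicated.sum())} duplicated IDs within a sample")
    return problems

# Toy table with one invalid ID (0) in sample "b".
toy = pd.DataFrame({
    "sample": ["a", "a", "b", "b"],
    "cell id": [1, 2, 1, 0],
})
problems = check_cell_ids(toy)
print(problems)
```

On the real table, the same call would be `check_cell_ids(df)` once `df` has been loaded.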

In [25]:
import nuqloud
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tqdm
import vedo
vedo.settings.notebookBackend = 'k3d'
vedo.settings.k3dPointShader = '3d'

## Load data

In [26]:
df = pd.read_csv('example_data/df_time.csv').drop(['Unnamed: 0'], axis=1)
df

Unnamed: 0,x,y,z,sample,cell id,age,archetype,cdh2,cdh1
0,390.987,469.611,294.366,pseudo_timelapse_cdh12_24hpf_fish01,973,24.0,False,False,True
1,186.227,435.218,535.213,pseudo_timelapse_cdh12_12hpf_fish02,27478,12.0,True,True,False
2,117.212,651.048,425.325,pseudo_timelapse_cdh12_48hpf_fish02,113393,48.0,False,True,False
3,353.837,243.890,215.778,pseudo_timelapse_cdh12_48hpf_fish02,104129,48.0,False,True,False
4,605.802,157.672,558.818,pseudo_timelapse_cdh12_12hpf_fish02,36584,12.0,False,False,False
...,...,...,...,...,...,...,...,...,...
280040,315.597,1665.090,427.916,pseudo_timelapse_cdh12_48hpf_fish02,146389,48.0,True,True,False
280041,321.760,1814.840,418.610,pseudo_timelapse_cdh12_48hpf_fish02,146390,48.0,True,True,False
280042,453.660,2401.690,434.272,pseudo_timelapse_cdh12_48hpf_fish02,146391,48.0,True,True,False
280043,398.821,1519.660,404.287,pseudo_timelapse_cdh12_48hpf_fish02,146394,48.0,False,False,False


In [27]:
df['cell id']

0            973
1          27478
2         113393
3         104129
4          36584
           ...  
280040    146389
280041    146390
280042    146391
280043    146394
280044    146395
Name: cell id, Length: 280045, dtype: int64

In [28]:

vedo.show(vedo.Points(df.loc[df['sample']=='pseudo_timelapse_cdh12_48hpf_fish02', list('xyz')].values))

Plot(antialias=True, axes=['x', 'y', 'z'], axes_helper=1.0, background_color=16777215, camera=[2, -3, 0.2, 0.0…

3D rendering of the initial point distribution. All points are used in the following processing. Colours are just to illustrate the difference in point density in one corner.

## Generate restricted Voronoi diagram and organisational features
First, we generate a radially restricted Voronoi diagram with our modified version of voro++. The modification exposes the radial restriction on the command line, which the original voro++ code does not provide.
Then we evaluate the Voronoi diagram and derive organisational features from it. In addition, we compute a kernel density estimate at several length scales (multi-scale density).
Each sample is processed individually.
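
To make the idea of a multi-scale density concrete, here is a small NumPy-only sketch. It is an illustration under stated assumptions, not nuQLOUD's implementation: it uses a top-hat kernel (neighbours inside a sphere of radius r, divided by the sphere volume), and the function name `multi_scale_density_sketch` is made up for this example. The kernel that nuQLOUD actually uses may differ.

```python
import numpy as np

def multi_scale_density_sketch(points, radii):
    """Neighbours per unit volume inside spheres of the given radii.

    points: (n, 3) array; radii: 1-D array. Returns an (n, len(radii)) array.
    Illustrative top-hat estimate only; not the nuQLOUD algorithm.
    """
    points = np.asarray(points, dtype=float)
    radii = np.asarray(radii, dtype=float)
    # Pairwise distances; O(n^2) memory, so suitable for toy data only.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour
    # counts[i, k] = number of points within radii[k] of point i
    counts = (dist[:, :, None] <= radii[None, None, :]).sum(axis=1)
    volumes = 4.0 / 3.0 * np.pi * radii ** 3
    return counts / volumes

points = np.array([[0, 0, 0], [1, 0, 0], [0, 3, 0], [10, 10, 10]], dtype=float)
radii = np.arange(5, 44, 5)  # the radii used in this notebook: 5, 10, ..., 40
density = multi_scale_density_sketch(points, radii)
print(density.shape)
```

Each column of the result corresponds to one length scale, so small radii capture local crowding and large radii capture tissue-level trends. For tables of the size used here, a spatial index (for example a k-d tree) would be needed instead of the dense distance matrix.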

In [29]:
sdf['neighbour boundaries']

0        1
1        0
2        1
3        0
4        0
        ..
81993    0
81994    0
81995    1
81996    0
81997    2
Name: neighbour boundaries, Length: 81998, dtype: int64

In [None]:
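list_df = []
# Process each sample separately so that cells of different samples never interact.
for sid in df['sample'].unique():
    sdf = df.loc[df['sample'] == sid].copy()
    # Radially restricted Voronoi diagram (modified voro++).
    sdf = nuqloud.Voronoi.voronoi_restricted(sdf)
    # Organisational features derived from the Voronoi diagram.
    sdf = nuqloud.FeatureGeneration.voronoi_features(sdf)
    # Multi-scale density at radii 5, 10, ..., 40; the result is not reassigned here.
    nuqloud.FeatureGeneration.multi_scale_density(sdf, np.arange(5, 44, 5))
    list_df.append(sdf)
# Recombine the per-sample tables into one DataFrame.
df = pd.concat(list_df)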

Voronoi cell creation: 100%|██████████| 81998/81998 [00:11<00:00, 6889.14it/s]
Adaptive radial restriction: 100%|██████████| 49423/49423 [31:53<00:00, 25.82it/s]  
Voronoi cell creation: 100%|██████████| 178707/178707 [00:34<00:00, 5128.13it/s]
number of neighbours: 100%|██████████| 81998/81998 [04:09<00:00, 328.04it/s]
voronoi density: 100%|██████████| 81998/81998 [00:49<00:00, 1655.14it/s]
neighbourhood voronoi volume: 100%|██████████| 81998/81998 [00:40<00:00, 2041.58it/s]
neighbourhood voronoi sphericity: 100%|██████████| 81998/81998 [00:40<00:00, 2048.96it/s]
neighbourhood n neighbours: 100%|██████████| 81998/81998 [00:40<00:00, 2028.41it/s]
neighbourhood centroid offset: 100%|██████████| 81998/81998 [00:40<00:00, 2046.45it/s]
100%|██████████| 8/8 [00:06<00:00,  1.16it/s]
Voronoi cell creation: 100%|██████████| 51655/51655 [00:11<00:00, 46

## Visualisation
Here we illustrate the distributions of organisational features on our test data in 3D by colouring the points according to their feature values.
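
Feature values such as Voronoi volumes can contain extreme outliers, which compress the colour range when points are coloured by value. The sketch below shows one common remedy, percentile clipping before mapping to colours. It is an assumption for illustration: `normalise_feature` is a hypothetical helper, and it is not known whether `nuqloud.Visualisation.show_features` applies any such scaling.

```python
import numpy as np

def normalise_feature(values, lower_pct=1.0, upper_pct=99.0):
    """Scale values to [0, 1], clipping everything outside the given percentiles.

    Hypothetical helper for illustration; not a nuQLOUD function.
    """
    values = np.asarray(values, dtype=float)
    lo, hi = np.nanpercentile(values, [lower_pct, upper_pct])
    if hi == lo:
        # Constant feature: map everything to the bottom of the colour range.
        return np.zeros_like(values)
    return np.clip((values - lo) / (hi - lo), 0.0, 1.0)

volumes = np.array([1.0, 2.0, 3.0, 4.0, 1000.0])  # toy values with one outlier
scaled = normalise_feature(volumes, lower_pct=0, upper_pct=75)
print(scaled)
```

The scaled values can then be passed to any colormap that expects inputs in [0, 1].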

In [19]:
df.columns

Index(['x', 'y', 'z', 'cell id', 'sample', 'vertex number', 'edge number',
       'edge distance', 'face number', 'voronoi surface area',
       'voronoi volume', 'voronoi sphericity', 'x centroid', 'y centroid',
       'z centroid', 'centroid offset', 'neigbour cell ids',
       'neighbour boundaries', 'coordinates vertices', 'vertices per face',
       'point type', 'n neighbours', 'density voronoi mean',
       'density voronoi std', 'neighbourhood voronoi volume mean',
       'neighbourhood voronoi volume std',
       'neighbourhood voronoi sphericity mean',
       'neighbourhood voronoi sphericity std',
       'neighbourhood n neighbours mean', 'neighbourhood n neighbours std',
       'neighbourhood centroid offset mean',
       'neighbourhood centroid offset std', 'shell 5', 'shell 10', 'shell 15',
       'shell 20', 'shell 25', 'shell 30', 'shell 35', 'shell 40'],
      dtype='object')

In [20]:
vedo.show(nuqloud.Visualisation.show_features(
    df.loc[df['sample'] == 'pseudo_timelapse_cdh12_48hpf_fish02'],
    ['voronoi volume'],
))

Plot(antialias=True, axes=['x', 'y', 'z'], axes_helper=1.0, background_color=16777215, camera=[2, -3, 0.2, 0.0…