# Tutorial about adding properties for each localization in LocData

In [1]:
import numpy as np
import pandas as pd

import locan as sp

In [2]:
sp.show_versions(system=False, dependencies=False, verbose=False)


Locan:
   version: 0.7.dev3+gb9aca40

Python:
   version: 3.8.8


## Some localization data

In [3]:
path = sp.ROOT_DIR / 'tests/Test_data/rapidSTORM_dstorm_data.txt'
dat = sp.load_rapidSTORM_file(path=path)
dat.print_summary()

identifier: "1"
comment: ""
creation_date: "2021-03-04 13:47:30 +0100"
modification_date: ""
source: EXPERIMENT
state: RAW
element_count: 999
frame_count: 48
file_type: RAPIDSTORM
file_path: "c:\\users\\soeren\\mydata\\programming\\python\\projects\\locan\\locan\\tests\\Test_data\\rapidSTORM_dstorm_data.txt"



In [4]:
print(dat.properties)

{'localization_count': 999, 'position_x': 16066.234912912912, 'position_y': 17550.369092792796, 'region_measure_bb': 1064111469.8204715, 'localization_density_bb': 9.388114199807877e-07, 'subregion_measure_bb': 130483.2086}


## Adding a property to each localization in LocData.data

If you have processed your data and derived a new property for each localization in the LocData object, you can add this property to the data. In this example we compute the nearest-neighbor distance for each localization and add *nn_distance* as a new property.

### Nearest-neighbor distance for each localization

In [5]:
nn = sp.NearestNeighborDistances().compute(dat)
nn.results

Unnamed: 0,nn_distance,nn_index
0,909.435242,276
1,54.429771,38
2,5.385165,30
3,5.047187,706
4,3.544009,27
...,...,...
994,5.935697,858
995,1077.190438,781
996,566.894355,626
997,115.443579,690


In [6]:
nn.results['nn_distance']

0       909.435242
1        54.429771
2         5.385165
3         5.047187
4         3.544009
          ...     
994       5.935697
995    1077.190438
996     566.894355
997     115.443579
998       4.707441
Name: nn_distance, Length: 999, dtype: float64
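
To make the two result columns concrete, here is a minimal brute-force sketch of a nearest-neighbor search with NumPy. It is not how locan computes the values. The function name `nearest_neighbor_table` and the toy coordinates are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd


def nearest_neighbor_table(points):
    """Return nn_distance and nn_index for each row of an (n, d) array."""
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances, shape (n, n).
    diff = points[:, np.newaxis, :] - points[np.newaxis, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # A point must not be reported as its own neighbor.
    np.fill_diagonal(dist, np.inf)
    nn_index = dist.argmin(axis=1)
    nn_distance = dist[np.arange(len(points)), nn_index]
    return pd.DataFrame({"nn_distance": nn_distance, "nn_index": nn_index})


# Hypothetical coordinates, not taken from the data set above.
toy_points = [[0.0, 0.0], [3.0, 4.0], [3.0, 5.0]]
toy_nn = nearest_neighbor_table(toy_points)
print(toy_nn)
```

The pairwise distance matrix needs memory quadratic in the number of points, so this approach only suits small data sets. A tree-based search is the usual choice for larger ones.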

### Adding nn_distance as new property to each localization in LocData object

In [7]:
dat.dataframe = dat.dataframe.assign(nn_distance=nn.results['nn_distance'])
dat.data.head()

Unnamed: 0,position_x,position_y,frame,intensity,chi_square,local_background,nn_distance
0,9657.4,24533.5,0,33290.1,1192250.0,767.733,909.435242
1,16754.9,18770.0,0,21275.4,2106810.0,875.461,54.429771
2,14457.6,18582.6,0,20748.7,526031.0,703.37,5.385165
3,6820.58,16662.8,0,8531.77,3179190.0,852.789,5.047187
4,19183.2,22907.2,0,14139.6,448631.0,662.77,3.544009


### Adding nn_distance as new property to each localization in LocData object with an empty dataframe

If the LocData object was created with LocData.from_selection(), the LocData.dataframe attribute holds no data of its own, and LocData.data is generated from the referenced locdata and the index list.

In this case LocData.dataframe can still be filled with additional data, which is merged into the result whenever LocData.data is returned.
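
The idea can be sketched with plain pandas. This is only an assumption about the general mechanism, not locan's actual implementation. The names `reference_data`, `own_dataframe` and `cluster_label` are hypothetical stand-ins.

```python
import pandas as pd

# Hypothetical stand-in for the data of the referenced LocData object.
reference_data = pd.DataFrame(
    {"position_x": [0.0, 1.0, 2.0, 3.0], "position_y": [5.0, 6.0, 7.0, 8.0]}
)
indices = [1, 3]

# Stand-in for the selection's own dataframe, holding only extra columns.
own_dataframe = pd.DataFrame({"cluster_label": [7, 9]}, index=indices)

# Pick the referenced rows, then merge the extra columns on the index.
selected = reference_data.loc[indices]
combined = pd.merge(
    selected, own_dataframe, left_index=True, right_index=True, how="outer"
)
print(combined)
```

The selection stores nothing but the index list and its extra columns. The referenced rows are looked up each time the combined table is requested.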

In [8]:
dat_selection = sp.LocData.from_selection(dat, indices=[1, 3, 4, 5])
dat_selection.data

Unnamed: 0,position_x,position_y,frame,intensity,chi_square,local_background,nn_distance
1,16754.9,18770.0,0,21275.4,2106810.0,875.461,54.429771
3,6820.58,16662.8,0,8531.77,3179190.0,852.789,5.047187
4,19183.2,22907.2,0,14139.6,448631.0,662.77,3.544009
5,31972.6,18136.8,0,10936.8,327028.0,659.928,5.703508


In [9]:
dat_selection.dataframe

In [10]:
nn_selection = sp.NearestNeighborDistances().compute(dat_selection)
nn_selection.results

Unnamed: 0,nn_distance,nn_index
0,4797.193422,2
1,10155.343702,0
2,4797.193422,0
3,13650.108737,2


Make sure the index of nn_selection.results matches that of dat_selection.data:

In [11]:
dat_selection.data.index

Int64Index([1, 3, 4, 5], dtype='int64')

In [12]:
nn_selection.results.index = dat_selection.data.index
nn_selection.results

Unnamed: 0,nn_distance,nn_index
1,4797.193422,2
3,10155.343702,0
4,4797.193422,0
5,13650.108737,2
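
Re-indexing matters because pandas aligns a Series on its index labels when it is assigned as a column. The following sketch uses hypothetical toy values (`frame`, `results`) to show what happens with and without matching indices.

```python
import pandas as pd

# Hypothetical selection with the same index labels as dat_selection.data.
frame = pd.DataFrame({"position_x": [10.0, 20.0, 30.0, 40.0]}, index=[1, 3, 4, 5])

# Results computed on the selection come back with a fresh 0..n-1 index.
results = pd.Series([0.5, 1.5, 2.5, 3.5], name="nn_distance")

# Without re-indexing, pandas aligns on labels: only labels 1 and 3 exist in both,
# so rows 4 and 5 receive NaN and rows 1 and 3 receive the wrong values.
misaligned = frame.assign(nn_distance=results)

# After copying the index, values are matched row by row as intended.
results_aligned = results.copy()
results_aligned.index = frame.index
aligned = frame.assign(nn_distance=results_aligned)

print(misaligned)
print(aligned)
```

Passing `results.to_numpy()` instead would also assign by position, at the cost of losing any check that the labels agree.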


Then assign the corresponding result to dataframe:

In [13]:
dat_selection.dataframe = dat_selection.dataframe.assign(nn_distance=nn_selection.results['nn_distance'])
dat_selection.dataframe

Unnamed: 0,nn_distance
1,4797.193422
3,10155.343702
4,4797.193422
5,13650.108737


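Calling `data` returns the complete dataset. Because the referenced locdata already contains an *nn_distance* column, both columns are kept: *nn_distance_x* holds the values from the referenced locdata and *nn_distance_y* holds those computed for the selection.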

In [14]:
dat_selection.data

Unnamed: 0,position_x,position_y,frame,intensity,chi_square,local_background,nn_distance_x,nn_distance_y
1,16754.9,18770.0,0,21275.4,2106810.0,875.461,54.429771,4797.193422
3,6820.58,16662.8,0,8531.77,3179190.0,852.789,5.047187,10155.343702
4,19183.2,22907.2,0,14139.6,448631.0,662.77,3.544009,4797.193422
5,31972.6,18136.8,0,10936.8,327028.0,659.928,5.703508,13650.108737
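
The `_x` and `_y` suffixes match the default behavior of `pandas.merge` when both frames share a column name. The sketch below reproduces that behavior with hypothetical toy values and shows one way to avoid it by renaming the column first. The name `nn_distance_selection` is an illustrative choice, not a locan convention.

```python
import pandas as pd

index = [1, 3]
# Toy values only; they do not come from the data set above.
left = pd.DataFrame({"position_x": [1.0, 2.0], "nn_distance": [0.1, 0.2]}, index=index)
right = pd.DataFrame({"nn_distance": [10.0, 20.0]}, index=index)

# Default suffixes disambiguate the clashing column names.
clashing = pd.merge(left, right, left_index=True, right_index=True)
print(list(clashing.columns))

# Renaming before the merge keeps both columns under descriptive names.
renamed = pd.merge(
    left,
    right.rename(columns={"nn_distance": "nn_distance_selection"}),
    left_index=True,
    right_index=True,
)
print(list(renamed.columns))
```

Choosing a distinct property name before assigning it to the selection's dataframe keeps the column names in `data` unambiguous.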
