# Tutorial about loading localization data from file

Localization data is typically provided as a text file whose format depends on the fitting software.

In [1]:
from pathlib import Path

import surepy as sp

In [2]:
sp.show_versions(system=False, dependencies=False, verbose=False)


Surepy:
   version: 0.7.dev3+gb9aca40

Python:
   version: 3.8.8


Throughout this manual it can be helpful to use pathlib to provide path information. In all cases a string with the path is also accepted.
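
As a minimal sketch of that equivalence, the snippet below builds the same location once as a `Path` object and once as a string. The relative path is hypothetical and only serves as an illustration; no file is opened.

```python
from pathlib import Path

# Hypothetical relative location, used only for illustration.
path_obj = Path('tests') / 'Test_data' / 'rapidSTORM_dstorm_data.txt'

# The same location as a plain string.
path_str = str(path_obj)

# Both forms describe the same file; Path adds convenient accessors.
print(path_obj.name)               # rapidSTORM_dstorm_data.txt
print(path_obj.suffix)             # .txt
print(Path(path_str) == path_obj)  # True
```

A `Path` object is mainly a convenience here: it joins path components in a platform-independent way and exposes parts such as the file name and suffix.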

## Load rapidSTORM data file

Here we identify some data in the Test_data directory and provide a path using pathlib:

In [3]:
path = sp.ROOT_DIR / 'tests/Test_data/rapidSTORM_dstorm_data.txt'
print(path, '\n')

c:\users\soeren\mydata\programming\python\projects\surepy\surepy\tests\Test_data\rapidSTORM_dstorm_data.txt 



The data is then loaded from a rapidSTORM localization file. The file header is read to provide correct property names. The number of localizations to be read can be limited by *nrows*.

In [4]:
dat = sp.load_rapidSTORM_file(path=path, nrows=1000)
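
To illustrate how property names can be derived from a file header, here is a rough sketch. It assumes the header is a single comment line holding XML-like field descriptions. The sample header is shortened and hypothetical, and the name mapping is illustrative only; it is not necessarily the mapping surepy applies.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical header line in the style of a rapidSTORM file.
header = (
    '# <localizations insequence="true" repetitions="variable">'
    '<field identifier="Position-0-0" unit="nanometer" />'
    '<field identifier="Position-1-0" unit="nanometer" />'
    '<field identifier="ImageNumber-0-0" unit="frame" />'
    '<field identifier="Amplitude-0-0" unit="camera ADC" />'
    '</localizations>'
)

# Illustrative mapping from field identifiers to property names (an assumption).
NAME_MAP = {
    'Position-0-0': 'position_x',
    'Position-1-0': 'position_y',
    'ImageNumber-0-0': 'frame',
    'Amplitude-0-0': 'intensity',
}


def column_names(header_line):
    """Return property names for the fields declared in the header line."""
    # Drop the leading comment marker before parsing the XML fragment.
    root = ET.fromstring(header_line.lstrip('#').strip())
    identifiers = [field.get('identifier') for field in root.iter('field')]
    # Unknown identifiers are passed through unchanged.
    return [NAME_MAP.get(identifier, identifier) for identifier in identifiers]


print(column_names(header))  # ['position_x', 'position_y', 'frame', 'intensity']
```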

Print information about the data: 

In [5]:
print('Data head:')
print(dat.data.head(), '\n')
print('Summary:')
dat.print_summary()
print('Properties:')
print(dat.properties)

Data head:
   position_x  position_y  frame  intensity  chi_square  local_background
0     9657.40     24533.5      0   33290.10   1192250.0           767.733
1    16754.90     18770.0      0   21275.40   2106810.0           875.461
2    14457.60     18582.6      0   20748.70    526031.0           703.370
3     6820.58     16662.8      0    8531.77   3179190.0           852.789
4    19183.20     22907.2      0   14139.60    448631.0           662.770 

Summary:
identifier: "1"
comment: ""
creation_date: "2021-03-04 13:46:57 +0100"
modification_date: ""
source: EXPERIMENT
state: RAW
element_count: 999
frame_count: 48
file_type: RAPIDSTORM
file_path: "c:\\users\\soeren\\mydata\\programming\\python\\projects\\surepy\\surepy\\tests\\Test_data\\rapidSTORM_dstorm_data.txt"

Properties:
{'localization_count': 999, 'position_x': 16066.234912912912, 'position_y': 17550.369092792796, 'region_measure_bb': 1064111469.8204715, 'localization_density_bb': 9.388114199807877e-07, 'subregion_measure_bb'

## Load Zeiss Elyra data file

The Elyra super-resolution microscopy system from Zeiss uses a slightly different file format. Elyra column names are replaced by surepy property names.

In [6]:
path_Elyra = sp.ROOT_DIR / 'tests/Test_data/Elyra_dstorm_data.txt'
print(path_Elyra, '\n')

c:\users\soeren\mydata\programming\python\projects\surepy\surepy\tests\Test_data\Elyra_dstorm_data.txt 



In [7]:
dat_Elyra = sp.load_Elyra_file(path=path_Elyra, nrows=1000)
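
The renaming step can be pictured with the following sketch. The Elyra-style column names, the tab separator, and the mapping are assumptions made for illustration. Only the target property names and the sample values are taken from the output shown in this tutorial.

```python
import csv
import io

# Assumed Elyra-style column names mapped to surepy property names;
# the mapping actually used by surepy may differ.
ELYRA_TO_PROPERTY = {
    'First Frame': 'frame',
    'Position X [nm]': 'position_x',
    'Position Y [nm]': 'position_y',
    'Number Photons': 'intensity',
}

# Tiny tab-separated sample; the separator is an assumption.
sample = (
    'First Frame\tPosition X [nm]\tPosition Y [nm]\tNumber Photons\n'
    '1\t15850.6\t23502.1\t472\n'
    '1\t25617.3\t24310.2\t529\n'
)

reader = csv.reader(io.StringIO(sample), delimiter='\t')
header = next(reader)

# Replace known column names and keep unknown ones unchanged.
columns = [ELYRA_TO_PROPERTY.get(name, name) for name in header]
rows = [dict(zip(columns, row)) for row in reader]

print(columns)                # ['frame', 'position_x', 'position_y', 'intensity']
print(rows[0]['position_x'])  # 15850.6
```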

In [8]:
print('Data head:')
print(dat_Elyra.data.head(), '\n')
print('Summary:')
dat_Elyra.print_summary()
print('Properties:')
print(dat_Elyra.properties)

Data head:
   original_index  frame  frames_number  frames_missing  position_x  \
0               1      1              1               0     15850.6   
1               2      1              1               0     25617.3   
2               3      1              1               0     20155.8   
3               4      1              1               0     10776.9   
4               5      1              1               0     28966.9   

   position_y  uncertainty  intensity  local_background_sigma  chi_square  \
0     23502.1          8.6        472                    5.33        0.28   
1     24310.2          9.5        529                    4.38        0.31   
2     24039.1         13.0        306                    3.06        0.23   
3     10047.4         13.4        369                    3.98        0.25   
4      8731.6         18.1        428                   14.73        0.41   

   psf_half_width  channel  slice_z  
0           110.0        1        1  
1           129.8      

## Localization data from a custom text file

Other custom text files can be read with a function that wraps pandas.read_table().

In [9]:
path_csv = sp.ROOT_DIR / 'tests/Test_data/five_blobs.txt'
print(path_csv, '\n')

c:\users\soeren\mydata\programming\python\projects\surepy\surepy\tests\Test_data\five_blobs.txt 



Here data is loaded from a comma-separated-value file. Column names are read from the first line, and a warning is given if the naming does not comply with surepy conventions. Column names can also be provided as *columns*. The separator, e.g. a tab '\t', can be provided as *sep*. The number of localizations to be read can be limited by *nrows*.

In [10]:
dat_csv = sp.load_txt_file(path=path_csv, sep=',', columns=None, nrows=100)
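
The interplay of *sep*, *columns* and *nrows* can be sketched with the standard library alone. This is a stand-in for illustration, not the pandas-based implementation; the function `read_text_table` and the set `KNOWN_NAMES` are hypothetical.

```python
import csv
import io
import itertools
import warnings

# Assumed subset of accepted property names, for illustration only.
KNOWN_NAMES = {'index', 'position_x', 'position_y', 'frame', 'intensity',
               'cluster_label'}


def read_text_table(text, sep=',', columns=None, nrows=None):
    """Hypothetical reader returning (column_names, rows) from delimited text."""
    reader = csv.reader(io.StringIO(text), delimiter=sep)
    if columns is None:
        # Take the column names from the first line.
        columns = next(reader)
    for name in columns:
        if name not in KNOWN_NAMES:
            warnings.warn(f'Column {name!r} does not follow the naming convention.')
    # islice stops after nrows rows; nrows=None reads all remaining rows.
    rows = list(itertools.islice(reader, nrows))
    return list(columns), rows


sample = (
    'index,position_x,position_y,cluster_label\n'
    '0,624,919,3\n'
    '1,611,873,3\n'
    '2,388,1015,0\n'
)

columns, rows = read_text_table(sample, sep=',', nrows=2)
print(columns)
print(rows)
```

If *columns* is given explicitly, this sketch treats every line as data, so it fits files without a header line.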

In [11]:
print('Data head:')
print(dat_csv.data.head(), '\n')
print('Summary:')
dat_csv.print_summary()
print('Properties:')
print(dat_csv.properties)

Data head:
   index  position_x  position_y  cluster_label
0      0         624         919              3
1      1         611         873              3
2      2         388        1015              0
3      3         209         465              2
4      4        1001         851              4 

Summary:
identifier: "3"
comment: ""
creation_date: "2021-03-04 13:46:58 +0100"
modification_date: ""
source: EXPERIMENT
state: RAW
element_count: 50
frame_count: 0
file_type: CUSTOM
file_path: "c:\\users\\soeren\\mydata\\programming\\python\\projects\\surepy\\surepy\\tests\\Test_data\\five_blobs.txt"

Properties:
{'localization_count': 50, 'position_x': 608.68, 'position_y': 777.34, 'region_measure_bb': 493145, 'localization_density_bb': 0.00010139005769094282, 'subregion_measure_bb': 2892}
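
The numbers printed above are consistent with `localization_density_bb` being the localization count divided by `region_measure_bb`. The sketch below assumes that the `_bb` suffix refers to the axis-aligned bounding box of the localizations. The helper `bounding_box_density` is hypothetical, and the five points are the rows of the data head above.

```python
import math


def bounding_box_density(points):
    """Return (count, bounding-box area, count / area) for 2D points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    area = (max(xs) - min(xs)) * (max(ys) - min(ys))
    return len(points), area, len(points) / area


# First five localizations from the data head above.
points = [(624, 919), (611, 873), (388, 1015), (209, 465), (1001, 851)]
count, area, density = bounding_box_density(points)
print(count, area, density)

# Consistency check against the printed properties: 50 localizations
# in a bounding box of measure 493145.
print(math.isclose(50 / 493145, 0.00010139005769094282, rel_tol=1e-6))
```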


## Load localization data file

A general function for loading localization data is provided. Specific localization file formats are selected through the `file_type` parameter.

In [12]:
path = sp.ROOT_DIR / 'tests/Test_data/rapidSTORM_dstorm_data.txt'
print(path, '\n')

c:\users\soeren\mydata\programming\python\projects\surepy\surepy\tests\Test_data\rapidSTORM_dstorm_data.txt 



In [13]:
dat = sp.load_locdata(path=path, file_type=sp.FileType.RAPIDSTORM, nrows=1000)

In [14]:
dat.print_summary()

identifier: "4"
comment: ""
creation_date: "2021-03-04 13:46:58 +0100"
modification_date: ""
source: EXPERIMENT
state: RAW
element_count: 999
frame_count: 48
file_type: RAPIDSTORM
file_path: "c:\\users\\soeren\\mydata\\programming\\python\\projects\\surepy\\surepy\\tests\\Test_data\\rapidSTORM_dstorm_data.txt"
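
One way to picture such a general loading function is a dispatch table keyed by file type. The sketch below is not surepy's implementation. The stand-in enum, the loader functions, and `load_any` are hypothetical, and the loaders return a label instead of reading a file.

```python
from enum import Enum


class FileType(Enum):
    """Stand-in enum with three members named as in this tutorial."""
    CUSTOM = 1
    RAPIDSTORM = 2
    ELYRA = 3


# Hypothetical format-specific loaders; each returns a label instead of data.
def load_custom(path, nrows=None):
    return ('custom', path, nrows)


def load_rapidstorm(path, nrows=None):
    return ('rapidstorm', path, nrows)


def load_elyra(path, nrows=None):
    return ('elyra', path, nrows)


LOADERS = {
    FileType.CUSTOM: load_custom,
    FileType.RAPIDSTORM: load_rapidstorm,
    FileType.ELYRA: load_elyra,
}


def load_any(path, file_type=FileType.CUSTOM, nrows=None):
    """Select the loader that matches file_type and call it."""
    try:
        loader = LOADERS[file_type]
    except KeyError:
        raise ValueError(f'No loader registered for {file_type!r}') from None
    return loader(path, nrows=nrows)


result = load_any('data.txt', file_type=FileType.RAPIDSTORM, nrows=1000)
print(result)  # ('rapidstorm', 'data.txt', 1000)
```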



The file type can be specified using the enum class `FileType`; tab completion helps to choose a member.

In [15]:
sp.FileType.__members__

mappingproxy({'UNKNOWN_FILE_TYPE': <FileType.UNKNOWN_FILE_TYPE: 0>,
              'CUSTOM': <FileType.CUSTOM: 1>,
              'RAPIDSTORM': <FileType.RAPIDSTORM: 2>,
              'ELYRA': <FileType.ELYRA: 3>,
              'THUNDERSTORM': <FileType.THUNDERSTORM: 4>,
              'ASDF': <FileType.ASDF: 5>,
              'NANOIMAGER': <FileType.NANOIMAGER: 6>})
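
When the file type arrives as a string or a number, for example from a configuration file, a standard Python `Enum` can resolve it by name or by value. The sketch below uses a local mirror of the members listed above so that it runs without surepy; it assumes that `sp.FileType` behaves like a plain `enum.Enum`.

```python
from enum import Enum


# Local mirror of the members listed above (an assumption, not the surepy class).
class FileType(Enum):
    UNKNOWN_FILE_TYPE = 0
    CUSTOM = 1
    RAPIDSTORM = 2
    ELYRA = 3
    THUNDERSTORM = 4
    ASDF = 5
    NANOIMAGER = 6


by_name = FileType['ELYRA']  # lookup by member name
by_value = FileType(3)       # lookup by member value

print(by_name is by_value)   # True
print(by_name.name, by_name.value)  # ELYRA 3
```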

In [16]:
sp.FileType.RAPIDSTORM

<FileType.RAPIDSTORM: 2>