# Pickprops & Filter

This notebook analyzes a 'locs_picked.hdf5' file and produces a 'locs_picked_props.hdf5' file containing kinetic parameters according to lbFCS and qPAINT analysis. A kinetic filter is applied automatically if `kin_filter=True`. The resulting file can be dragged and dropped into the picasso.filter module.

![](../../docs/figures/pickprops&filter.png)


### Define paths
Define the paths to the data, i.e. the '_locs_picked.hdf5' files obtained from the picasso.render module. Multiple files can be processed by extending the lists via `dir_names.extend(['directory path'])` and `file_names.extend(['file name'])`. `dir_names` and `file_names` must be of equal length.

In [33]:
dir_names=[]
dir_names.extend(['/fs/pool/pool-schwille-paint/Data/p12.ACAB/19-08-07_id143+144/114_Pm2_5nM_p35uW_1/19-08-07_PS'])

file_names=[]
file_names.extend(['114_Pm2_5nM_p35uW_1_MMStack_Pos0.ome_locs_render_picked.hdf5']) 

### Define input variables
Define input variables for lbFCS and qPAINT analysis and data saving:
* Set imager concentration (in nM) of each measurement in `conc`.
* Set ignore-dark (in frames) value of qPAINT analysis in `ignore`, e.g. `ignore=1` means that two bright events interrupted by 1 dark frame will be treated as one bright event of combined duration.
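
How `ignore` merges events can be illustrated with a minimal sketch on a toy binary trace. The function name `bright_times` is hypothetical and not part of the lbfcs package, and the sketch assumes that the combined duration is counted from the first to the last bright frame, i.e. including the bridged dark frame; the package may count this differently.

```python
import numpy as np

def bright_times(trace, ignore=1):
    """Durations (in frames) of bright events in a binary trace.

    Dark gaps of at most `ignore` frames are bridged, so the two
    neighbouring bright events are merged into one.
    Assumption: the merged duration includes the bridged dark frames.
    """
    frames = np.flatnonzero(np.asarray(trace) > 0)  # indices of bright frames
    if frames.size == 0:
        return np.array([], dtype=int)
    gaps = np.diff(frames) - 1             # dark frames between bright frames
    breaks = np.flatnonzero(gaps > ignore)  # gaps too long to be bridged
    starts = np.concatenate(([frames[0]], frames[breaks + 1]))
    ends = np.concatenate((frames[breaks], [frames[-1]]))
    return ends - starts + 1

toy_trace = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0]
print(bright_times(toy_trace, ignore=0))  # three separate events
print(bright_times(toy_trace, ignore=1))  # first two events merged
```

With `ignore=0` the toy trace yields three events; with `ignore=1` the single dark frame between the first two events is bridged and only two events remain.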

The 'advanced' variables mean the following:
* The result is saved under the input path, with the file extension `savename_ext` inserted before '.hdf5'.
* If `omit_dist=False`, the complete arrays (autocorrelation, autocorrelation lag time, bright-time distribution, dark-time distribution) are part of the output and the data will not be saved automatically. In return, single traces and autocorrelations can be plotted at the end of the notebook. If `omit_dist=True`, these arrays are omitted from the results and the result is saved automatically.
* `kin_filter=True` applies automatic filtering as described in the paper.
* `NoPartitions` sets the number of partitions for parallel computing using [dask](https://docs.dask.org/en/latest/). If `NoPartitions=1`, props are computed with the non-parallelized version, which is generally faster on machines with low computing power.
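
The naming rule for the saved file can be sketched as follows. The input file name `sample_locs_picked.hdf5` is a hypothetical placeholder; the string replacement mirrors the one used in the save step of this notebook.

```python
ignore = 1
savename_ext = '_props_ig%i' % ignore  # same pattern as in the input cell

in_path = 'sample_locs_picked.hdf5'    # hypothetical input file name
# Insert the extension in front of the '.hdf5' suffix
out_path = in_path.replace('.hdf5', savename_ext + '.hdf5')
print(out_path)  # sample_locs_picked_props_ig1.hdf5
```

Note that `str.replace` substitutes every occurrence of '.hdf5', so this assumes the substring appears only once in the path.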

In [37]:
#### Standard
conc=[5]
ignore=1

#### Advanced
savename_ext='_props_ig%i'%(ignore) # File extension for processed file

omit_dist=False
kin_filter=True
NoPartitions=30
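
Since the run loop indexes `conc[i]` once per file and only handles `NoPartitions>=1`, it can help to check the inputs before starting a long computation. The helper `check_inputs` below is a hypothetical sketch and not part of the lbfcs package.

```python
def check_inputs(dir_names, file_names, conc, n_partitions):
    """Raise ValueError if the notebook inputs are inconsistent."""
    if len(dir_names) != len(file_names):
        raise ValueError('dir_names and file_names must be of equal length')
    if len(conc) != len(file_names):
        raise ValueError('conc needs one concentration per file')
    if n_partitions < 1:
        raise ValueError('NoPartitions must be at least 1')

# Example call with placeholder values (one file, one concentration)
check_inputs(['some_dir'], ['some_file.hdf5'], [5], 30)
```

In the notebook this would be called as `check_inputs(dir_names, file_names, conc, NoPartitions)` right after the input cell.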

### Run pickprops & filter

In [38]:
#################################################### Load packages
import os #platform independent paths
import importlib
from IPython.display import clear_output
import warnings
warnings.filterwarnings("ignore")

# Load user defined functions
import lbfcs.pickprops as props
import lbfcs.io as io
import lbfcs.pickprops_calls as props_call
# Reload modules
importlib.reload(props)
importlib.reload(props_call)

############################################################# Read locs, apply props & save locs
#### Create list of paths
path=[os.path.join(dir_names[i],file_names[i]) for i in range(0,len(file_names))]

#### Dictionary of added content for the info '.yaml' file
props_info={'Generated by':'pickprops.get_props',
            'ignore':ignore,
            'omit_dist':omit_dist,
            'kin_filter': kin_filter}

#### Read-Apply-Save loop
for i in range(0,len(path)):
    #### File read in
    print('File read in ...')
    locs,locs_info=io.load_locs(path[i])
    
    #### Get number of frames
    NoFrames=locs_info[0]['Frames']
    
    #### Apply props
    print('Calculating kinetics ...')
    if NoPartitions==1:
        print('... non-parallel')
        locs_props=props.apply_props(locs,conc[i],NoFrames,ignore)
    elif NoPartitions>1:
        print('... in parallel')
        locs_props=props.apply_props_dask(locs,conc[i],NoFrames,ignore,NoPartitions)
    
    #### Drop objects for saving if omit=True
    if omit_dist:
        print('Removing distribution-lists from output ...')
        locs_props=locs_props.drop(['trace','tau','g','tau_b_dist','tau_d_dist'],axis=1)
    
    if kin_filter:
        print('Applying kinetic filter ...')
        locs_props=props._kin_filter(locs_props)
    
    #### Add nearest neighbour pick and distance
    print('Calculating nearest neighbour ...')
    locs_props=props_call.props_add_nn(locs_props)
    
    #### Save .hdf5 and .yaml of locs_props
    if omit_dist:
        print('File saving ...')
        io.save_locs(path[i].replace('.hdf5',savename_ext+'.hdf5'),
                        locs_props,
                        [locs_info,props_info],
                        mode='picasso_compatible')

File read in ...
Calculating kinetics ...
... in parallel
[########################################] | 100% Completed | 24.9s
Applying kinetic filter ...
Calculating nearest neighbour ...
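
The nearest-neighbour step can be illustrated with a brute-force sketch on the pick centers `mean_x` and `mean_y`. This is not the implementation of `props_call.props_add_nn`; the function name `add_nearest_neighbour` and the output columns `nn_group` and `nn_dist` are hypothetical names chosen for illustration.

```python
import numpy as np
import pandas as pd

def add_nearest_neighbour(df):
    """Add nearest-neighbour pick ID and distance for every pick (brute force)."""
    xy = df[['mean_x', 'mean_y']].to_numpy(dtype=float)
    # Pairwise Euclidean distances between all pick centers
    diff = xy[:, None, :] - xy[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a pick is not its own neighbour
    nn_idx = dist.argmin(axis=1)
    out = df.copy()
    out['nn_group'] = df['group'].to_numpy()[nn_idx]     # hypothetical column
    out['nn_dist'] = dist[np.arange(len(df)), nn_idx]    # hypothetical column
    return out

# Toy picks with made-up coordinates
toy = pd.DataFrame({'group': [10, 12, 13],
                    'mean_x': [0.0, 3.0, 10.0],
                    'mean_y': [0.0, 4.0, 0.0]})
print(add_nearest_neighbour(toy))
```

The pairwise distance matrix grows quadratically with the number of picks, so for several thousand picks a spatial index such as a k-d tree is likely the better choice.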

## Further usage
### What are the results?
Here is an overview of all the computed variables per pick saved in the pandas.DataFrame `locs_props`.

In [43]:
locs_props.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6109 entries, 0 to 6108
Data columns (total 33 columns):
group           6109 non-null int64
g               6109 non-null object
mono_A          6109 non-null float64
mono_A_lin      6109 non-null float64
mono_chi        6109 non-null float64
mono_tau        6109 non-null float64
mono_tau_lin    6109 non-null float64
mono_taub       6109 non-null float64
mono_taud       6109 non-null float64
tau             6109 non-null object
trace           6109 non-null object
n_events        6109 non-null float64
tau_b           6109 non-null float64
tau_b_dist      6109 non-null object
tau_b_lin       6109 non-null float64
tau_b_mean      6109 non-null float64
tau_d           6109 non-null float64
tau_d_dist      6109 non-null object
tau_d_lin       6109 non-null float64
tau_d_mean      6109 non-null float64
mean_frame      6109 non-null float64
mean_x          6109 non-null float64
mean_y          6109 non-null float64
mean_photons    6109 non-n

For example, the number of bright events `n_events` for all picks (i.e. `group`) can be accessed by typing:

In [46]:
locs_props.loc[:,['group','n_events']]

| | group | n_events |
|---|---|---|
| 0 | 10 | 164.0 |
| 1 | 12 | 153.0 |
| 2 | 13 | 157.0 |
| 3 | 18 | 121.0 |
| 4 | 23 | 122.0 |
| 5 | 24 | 109.0 |
| 6 | 25 | 138.0 |
| 7 | 26 | 111.0 |
| 8 | 27 | 172.0 |
| 9 | 29 | 98.0 |
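
In the same way, picks can be selected by a condition on any column. A minimal sketch on a toy DataFrame that reuses the ten rows shown above; the threshold of 150 events is an arbitrary value chosen for illustration.

```python
import pandas as pd

# Toy stand-in for locs_props, restricted to the two columns shown above
toy_props = pd.DataFrame({
    'group': [10, 12, 13, 18, 23, 24, 25, 26, 27, 29],
    'n_events': [164.0, 153.0, 157.0, 121.0, 122.0,
                 109.0, 138.0, 111.0, 172.0, 98.0],
})

min_events = 150  # arbitrary threshold for illustration
# Boolean mask on the rows, column selection by label
selected = toy_props.loc[toy_props['n_events'] > min_events, ['group', 'n_events']]
print(selected)
```

Here the picks with `group` 10, 12, 13 and 27 remain.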


Here is a complete list of the meaning of all variables:
* `group` : ID of a specific pick, as assigned by picasso.render when a '_picks.yaml' file is loaded and the localizations are saved via File > Save picked localizations