# Tether particles preprocessing - Example

Contains the procedure for extracting and preprocessing tether particles, and the related boundary and segment particles, for subtomogram averaging, as presented in https://doi.org/10.1101/2024.12.18.629213.

Performs the following tasks:  
- Extracts tethers detected by the Pyto hierarchical connectivity procedure
- Extracts boundary and segment particles
- Filters tether particles
- Selectively smooths (filters) membranes of tethers
- Randomizes outside-of-mask pixels in particle images
- Generates star files for refinement by RELION

Prerequisites:
- Tomograms for munc13-snap25 project (EMPIAR-12512)
- Presnaptic detection and analysis by Pyto
- The above were published in Papantoniou and Laugks et al. (2023), https://doi.org/10.1126/sciadv.adf6222

Preprocessing workflows for tether averaging:

(i) Plain tether particles: Task 1, Task 2e, Task 5a  
(ii) Filtered SV: Task 1, Task 2a, Task 2e, Task 3, Task 4, Task 5a  
(iii) Filtered SV&PM: Task 1, Task 2a, Task 2e, Task 3, Task 4, Task 5a  
(iv) Focused: Task 1, Task 2a, Task 2d, Task 2e, Task 3, Task 3, Task 5b, Task 5c, Task 5a  

Common: The Initialization and Parameters sections need to be executed first for all workflows  

Boundaries set (for boundary averaging): Task 1, Task 2b, Task 2e  
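
The workflow lists above can be kept in a small lookup table, which makes it easy to see which workflows share a given task. The sketch below is only a convenience and is not part of Pyto: the names `workflows` and `workflows_using` are hypothetical, and the repeated Task 3 entry of workflow (iv) is listed once.

```python
# Task lists copied from the workflow overview; the keys are illustrative names
workflows = {
    "plain": ["1", "2e", "5a"],
    "filtered_sv": ["1", "2a", "2e", "3", "4", "5a"],
    "filtered_sv_pm": ["1", "2a", "2e", "3", "4", "5a"],
    "focused": ["1", "2a", "2d", "2e", "3", "5b", "5c", "5a"],
    "boundaries": ["1", "2b", "2e"],
}


def workflows_using(task):
    """Return the names of the workflows that include the given task."""
    return [name for name, tasks in workflows.items() if task in tasks]


# Which workflows need the regions generated in Task 2a, and which need Task 2b?
print(workflows_using("2a"))
print(workflows_using("2b"))
```

Task 1 and Task 2e appear in every list, which is why their outputs are reused by all later steps.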



## Initialization

In [1]:
import sys
import os
import pickle
from copy import copy, deepcopy
import re
import itertools

import numpy as np
import scipy as sp
import pandas as pd 
import skimage

import matplotlib.pyplot as plt

import pyto
from pyto.io.pandas_io import PandasIO
from pyto.segmentation.neighborhood import Neighborhood
from pyto.geometry.cylinder import Cylinder
from pyto.geometry.rigid_3d import Rigid3D
from pyto.particles.set_path import SetPath
import pyto.particles.relion_tools as relion_tools
from pyto.particles.relion_tools import get_array_data, write_table
from pyto.particles.set import Set
from pyto.particles.label_set import LabelSet
from pyto.particles.boundary_set import BoundarySet
import pyto.particles.extract_mps as extract_mps
from pyto.particles.extract_mps import ExtractMPS, Paths
from pyto.spatial.multi_particle_sets import MultiParticleSets
import pyto.spatial.coloc_functions as col_func
from pyto.spatial.coloc_functions import get_tomo_id
import pyto.projects.presynaptic as presynaptic
from pyto.projects.presynaptic import Presynaptic, tomo_generator
from pyto.spatial.particle_sets import ParticleSets
from pyto.spatial.multi_particle_sets import MultiParticleSets

%autosave 0
sys.version

Autosave disabled


'3.9.18 | packaged by conda-forge | (main, Dec 23 2023, 16:33:10) \n[GCC 12.3.0]'

In [2]:
# Settings

# force showing all rows in a table (default 60 rows)
pd.set_option('display.max_rows', None)
pd.set_option('display.width', None)

In [3]:
# set hostname
hostname = os.uname()[1]
print(f"Host: {hostname}")

Host: rauna


## Parameters

Need to be executed each time

In [4]:
#
# General parameters that may need to be changed
#

# particle (image) size [pix]
particle_size = 64

# distance in z between subtomo center and tether position [pix]
particle_size_center = {64: 10, 48: 8, 32: 6}
tether_to_center = particle_size_center[particle_size]

# invert contrast (from density low to density high pixel values)
invert_contrast = True

# select tomos
if hostname == 'rauna':
    #morse_root = '../../columns_munc13/morse'
    #pre_star_path = '../external/extended_pre_pick-3_v2.star'
    tomo_ids = ['m13_ctrl_204']
else:
    #morse_root = '../../columns_munc13/morse'
    #pre_star_path = '../external/extended_pre_pick-3_v2.star'
    tomo_ids = None  # use all tomos
    #tomo_ids = ['m13_ctrl_204']  # use specified tomos
    
# particle classes that are considered for splitting into subclasses
#class_names = ['tethi']
class_names = None  # keep all classes

# subclasses codes and names
class_code = {0: 'short', 1: 'medium', 2: 'long'}

# print info
verbose = True

In [5]:
#
# General parameters that should not be changed
#

# name given to the original (plain) tether particle set 
name = 'original'
if not invert_contrast:
    name = f"{name}_tomo_contrast"

# ids assigned to active zone (plasma membrane), cytosol and synaptic vesicles 
# of region particles generated in task 2a and used for smoothing (task 4)
# and task 2e;
# they all have to be different
az_mem_id = 2
presyn_cyto_id = 3
sv_id = 5

# id assigned to all segments of segment particles generated in tasks 2c and 2d,
# used in task 2e 
seg_id = 4

# comment written in star files
star_comment = 'All hierarchical connectivity tethers'

# plain (original) particle paths, good to keep the default values
root_template = '../particles_example_bin-2_size-{size}'
tables_dir = 'tables'
paths = Paths(
    name=name, root_template=root_template, size=particle_size, tables=tables_dir)
preliminary_tables_dir = 'tables_preliminary'
preliminary_paths = Paths(
    name=name, root_template=root_template, size=particle_size, tables=preliminary_tables_dir)

# clean initial (input) tethers
clean_initial = True
    
# randomize rot angle
randomize_rot = True

# out star file labels
label_format = {
    'rlnMicrographName': '%s', 'rlnCtfImage': '%s', 
    'rlnImageName': '%s', 'rlnCoordinateX': '%d', 
    'rlnCoordinateY': '%d', 
    'rlnCoordinateZ': '%d', 
    'rlnAngleTilt': '%8.3f', 'rlnAngleTiltPrior': '%8.3f', 
    'rlnAnglePsi': '%8.3f', 'rlnAnglePsiPrior': '%8.3f', 
    'rlnAngleRot': '%8.3f'}

# ctf label
ctf_label = 'rlnCtfImage'

# tomo and boundary related columns
tomo_particle_col = 'tomo_particle'
region_particle_col = 'reg_particle'
in_tomo_particle_col = 'in_tomo_particle'

# particle center coordinates in reg frame
center_reg_frame_cols = [
    'x_center_reg_frame', 'y_center_reg_frame', 'z_center_reg_frame']

# particle center coordinates in full tomo frame
center_init_frame_cols = [
    'x_center_init_frame', 'y_center_init_frame', 'z_center_init_frame']    

# columns containing membrane normal angles
normal_angle_cols = ['normal_theta', 'normal_phi']

# columns related to box corners 
tomo_l_corner_cols = ['x_l_corner_tomo', 'y_l_corner_tomo', 'z_l_corner_tomo']
tomo_r_corner_cols = ['x_r_corner_tomo', 'y_r_corner_tomo', 'z_r_corner_tomo']
reg_l_corner_cols = ['x_l_corner_reg', 'y_l_corner_reg', 'z_l_corner_reg']
reg_r_corner_cols = ['x_r_corner_reg', 'y_r_corner_reg', 'z_r_corner_reg']

# columns indicating whether particles are inside tomos 
tomo_inside_col = 'tomo_inside'
reg_inside_col = 'region_inside'

# column that indicates if regions and segments are present
found_ids_col = 'found_ids'
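
The `label_format` dictionary pairs each star file label with a printf-style format. As a rough illustration of how such formats turn one table row into a line of text, the sketch below formats a single row. The actual star files are written by Pyto; the helper `format_row` and the example values are made up for this sketch.

```python
# Subset of the label formats defined in the Parameters section
label_format = {
    'rlnCoordinateX': '%d', 'rlnCoordinateY': '%d', 'rlnCoordinateZ': '%d',
    'rlnAngleTilt': '%8.3f', 'rlnAnglePsi': '%8.3f', 'rlnAngleRot': '%8.3f'}


def format_row(row, formats):
    """Format the values of one row, in the order given by the formats dict."""
    return ' '.join(fmt % row[label] for label, fmt in formats.items())


# hypothetical particle row (values chosen only for illustration)
row = {
    'rlnCoordinateX': 636, 'rlnCoordinateY': 471, 'rlnCoordinateZ': 193,
    'rlnAngleTilt': 17.99377, 'rlnAnglePsi': 5.31244, 'rlnAngleRot': 208.17333}
line = format_row(row, label_format)
print(line)
```

Coordinates are written as integers and angles with three decimals in a field of width eight, so the columns of a star file stay aligned.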


## Task 1: Make plain (original) particles from mps

### Prerequisite - Input data

* Shows the format of the input tables
* Provides a template for making the input object

In [6]:
# path to the input tomos and particles MPS object
mps_path = 'example_input_mps/input.pkl'

# read input object
mps_in = MultiParticleSets.read(path=mps_path)

Read  MPS object example_input_mps/input.pkl


In [7]:
mps_in.tomos.columns

Index(['tomo_id', 'tomo', 'region', 'region_offset_x', 'region_offset_y',
       'region_offset_z', 'pixel_nm', 'coord_bin', 'region_id', 'region_bin',
       'rlnCtfImage'],
      dtype='object')

In [8]:
mps_in.tomos.head()

index,tomo_id,tomo,region,region_offset_x,region_offset_y,region_offset_z,pixel_nm,coord_bin,region_id,region_bin,rlnCtfImage
0,m13_ctrl_108,/fs/pool/pool-lucic2/columns_munc13/morse/syn_...,/fs/pool/pool-lucic2/columns_munc13/spatial_an...,980,820,356,0.878,4,-1,2,/fs/pool/pool-lucic2/columns_munc13/morse/trac...
1,m13_ctrl_109,/fs/pool/pool-lucic2/columns_munc13/morse/syn_...,/fs/pool/pool-lucic2/columns_munc13/spatial_an...,1064,1256,476,0.878,4,-1,2,/fs/pool/pool-lucic2/columns_munc13/morse/trac...
2,m13_ctrl_110,/fs/pool/pool-lucic2/columns_munc13/morse/syn_...,/fs/pool/pool-lucic2/columns_munc13/spatial_an...,320,928,456,0.878,4,-1,2,/fs/pool/pool-lucic2/columns_munc13/morse/trac...
3,m13_ctrl_114,/fs/pool/pool-lucic2/columns_munc13/morse/syn_...,/fs/pool/pool-lucic2/columns_munc13/spatial_an...,1188,880,436,0.878,4,-1,2,/fs/pool/pool-lucic2/columns_munc13/morse/trac...
4,m13_ctrl_202,/fs/pool/pool-lucic2/columns_munc13/morse/syn_...,/fs/pool/pool-lucic2/columns_munc13/spatial_an...,1184,1028,256,0.878,4,-1,2,/fs/pool/pool-lucic2/columns_munc13/morse/trac...


In [9]:
mps_in.particles.columns

Index(['group', 'tomo_id', 'particle_id', 'class_number', 'class_name',
       'pixel_nm', 'x_orig', 'y_orig', 'z_orig', 'rlnAngleTilt',
       'rlnAngleTiltPrior', 'rlnAnglePsi', 'rlnAnglePsiPrior', 'rlnAngleRot'],
      dtype='object')

In [10]:
mps_in.particles.head()

,group,tomo_id,particle_id,class_number,class_name,pixel_nm,x_orig,y_orig,z_orig,rlnAngleTilt,rlnAngleTiltPrior,rlnAnglePsi,rlnAnglePsiPrior,rlnAngleRot
0,m13_ctrl,m13_ctrl_108,37,0,tethi,0.878,636,471,193,17.99377,10.101216,5.31244,351.63832,208.17333
1,m13_ctrl,m13_ctrl_108,38,2,tethi,0.878,650,461,347,114.488287,106.882497,353.920785,348.855454,174.59049
2,m13_ctrl,m13_ctrl_108,100,1,tethi,0.878,537,699,266,82.453127,81.204421,338.82046,327.325796,171.52008
3,m13_ctrl,m13_ctrl_108,144,0,tethi,0.878,544,666,224,67.721503,63.767802,335.326754,330.217158,177.09833
4,m13_ctrl,m13_ctrl_108,163,2,tethi,0.878,629,477,186,35.949287,29.870639,342.492736,354.655863,174.75474


In [None]:
# Template for making input object

# input object path
mps_path_example = "..."

# make tomos table like above
tomos = pd.DataFrame(...)

# make particles table like above
particles = pd.DataFrame(...)

# make an object and save
mps_in = MultiParticleSets()
mps_in.tomos = tomos
mps_in.particles = particles
mps_in.write(path=mps_path_example, verbose=True)
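
As a concrete illustration of the template, the two tables could be filled as sketched below. The sketch only builds pandas tables with the column names shown above. All file paths are placeholders, the numeric values merely mimic the displayed rows, and the Pyto calls are left as comments because they need Pyto and real data.

```python
import pandas as pd

# tomos table: one row per tomogram (all paths are placeholders)
tomos = pd.DataFrame({
    'tomo_id': ['m13_ctrl_108'],
    'tomo': ['/path/to/tomo'],
    'region': ['/path/to/region'],
    'region_offset_x': [980], 'region_offset_y': [820],
    'region_offset_z': [356],
    'pixel_nm': [0.878], 'coord_bin': [4], 'region_id': [-1],
    'region_bin': [2],
    'rlnCtfImage': ['/path/to/ctf']})

# particles table: one row per tether
particles = pd.DataFrame({
    'group': ['m13_ctrl', 'm13_ctrl'],
    'tomo_id': ['m13_ctrl_108', 'm13_ctrl_108'],
    'particle_id': [37, 38],
    'class_number': [0, 2],
    'class_name': ['tethi', 'tethi'],
    'pixel_nm': [0.878, 0.878],
    'x_orig': [636, 650], 'y_orig': [471, 461], 'z_orig': [193, 347],
    'rlnAngleTilt': [17.99, 114.49], 'rlnAngleTiltPrior': [10.10, 106.88],
    'rlnAnglePsi': [5.31, 353.92], 'rlnAnglePsiPrior': [351.64, 348.86],
    'rlnAngleRot': [208.17, 174.59]})

# sanity check: every particle has to belong to a listed tomogram
missing = set(particles['tomo_id']) - set(tomos['tomo_id'])
if missing:
    raise ValueError(f"Particles refer to unknown tomos: {missing}")

# With Pyto available, the tables would then be attached and saved as in
# the template above:
#   mps_in = MultiParticleSets()
#   mps_in.tomos = tomos
#   mps_in.particles = particles
```

Checking that the `tomo_id` values of both tables agree before saving helps catch mismatched identifiers early.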


### Execute task

In [8]:
# Parameters - no reason to change them really

# path to the input tomos and particles MPS object
mps_path = 'example_input_mps/input.pkl'

# particle write flags
write_parts = True
write_regions = False

# invert direction of input particle Euler angles
reverse = False  # angles define direction from membrane to cytoplasm
#reverse = True  # angles define direction from membrane to extracellular space

# use input prior angles
use_priors = True

# tomo column name
tomo_col = 'tomo'

# particle normalization parameters
mean = 0
std = 1

# useful when running on different machines or when file system changes 
# used to convert path beginning up to (including) convert_path_common by convert_path_helper
if hostname.startswith('hpcl'):
    convert_path_common = 'pool-lucic2'
    convert_path_helper = '/fs/pool/pool-lucic2'
elif hostname == 'rauna':
    convert_path_common = 'morse'
    #morse_root = '../../columns_munc13/morse'
    convert_path_helper = os.path.abspath('../../columns_munc13/morse')
else:
    convert_path_common = None
    convert_path_helper = None
    
# setup and execute task
ex_mps = ExtractMPS(
    tomo_ids=tomo_ids, box_size=particle_size, root_template=root_template, 
    normal_angle_cols=normal_angle_cols,
    center_init_frame_cols=center_init_frame_cols, center_reg_frame_cols=center_reg_frame_cols,
    tomo_col=tomo_col, 
    ctf_label=ctf_label, check_ctf=False,
    tomo_l_corner_cols=tomo_l_corner_cols, tomo_r_corner_cols=tomo_r_corner_cols,
    reg_l_corner_cols=reg_l_corner_cols, reg_r_corner_cols=reg_r_corner_cols,
    tomo_inside_col=tomo_inside_col, reg_inside_col=reg_inside_col,
    tomo_particle_col=tomo_particle_col, region_particle_col=region_particle_col,
    tables_dir=preliminary_tables_dir, class_names=class_names, class_code=class_code)

ex_mps.extract_particles_task(
    input_mode='mps', mps_path=mps_path, name=name, reverse=reverse,
    randomize_rot=randomize_rot, particle_to_center=tether_to_center, 
    expand_particle=False, expand_region=True,
    mean=mean, std=std, invert_contrast=invert_contrast,
    name_prefix='tether_', name_suffix='',
    convert_path_common=convert_path_common, convert_path_helper=convert_path_helper,     
    write_particles=write_parts, write_regions=write_regions, morse_regions=False,
    star_comment=star_comment, verbose=True
)

Read  MPS object ../simple_mps/example

All particles:
Wrote particles to ../particles_example_bin-2_size-64/original
Pickled preliminary MPS object to ../particles_example_bin-2_size-64/original/tables_preliminary/original_tmp.pkl
Wrote all particles star file ../particles_example_bin-2_size-64/original/tables_preliminary/original_all.star
Pickled json converted DataFrame version of all particles star file to ../particles_example_bin-2_size-64/original/tables_preliminary/original_all_star_json.pkl

Particle classes:
Wrote class short particles star file ../particles_example_bin-2_size-64/original/tables_preliminary/original_short.star
Pickled json converted DataFrame version of class short particles star file to ../particles_example_bin-2_size-64/original/tables_preliminary/original_short_star_json.pkl
Wrote class medium particles star file ../particles_example_bin-2_size-64/original/tables_preliminary/original_medium.star
Pickled json converted DataFrame version of class medium parti
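
The `mean`, `std` and `invert_contrast` arguments of Task 1 describe a standard image normalization. Below is a minimal NumPy sketch of that idea, assuming normalization to the target mean and standard deviation combined with a sign flip for contrast inversion. The function `normalize_particle` is hypothetical, and the order of operations inside Pyto may differ.

```python
import numpy as np


def normalize_particle(image, mean=0.0, std=1.0, invert_contrast=True):
    """Scale an image to the given mean and std, optionally inverting contrast."""
    image = np.asarray(image, dtype=float)
    # bring to zero mean and unit standard deviation
    normalized = (image - image.mean()) / image.std()
    if invert_contrast:
        # flip the sign so that dense (dark) features get high values
        normalized = -normalized
    return normalized * std + mean


# synthetic subtomogram standing in for an extracted particle
rng = np.random.default_rng(0)
subtomo = rng.normal(loc=120.0, scale=15.0, size=(64, 64, 64))
out = normalize_particle(subtomo, mean=0, std=1, invert_contrast=True)
print(out.mean(), out.std())
```

With `invert_contrast=True`, voxels that were darker than average in the tomogram end up with positive values.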

## Task 2: Make regions (boundaries) and segments

* Extracts boundary and segment particles (subtomos) from the presynaptic m13-snap25 project
* Reads output of task 1
* Note: 'regions' and 'boundaries' are used interchangeably

### Prerequisite - Read presynaptic analysis data

Execute for Tasks 2a and 2b

In [9]:
# Parameters

# path to the presynaptic analysis work file 
if hostname.startswith('hpcl5'):
    work_file_rel = 'pool-lucic2/christos/MunC13/analysis/work_vl/work_vl.py'
elif hostname == 'rauna':
    work_file_rel = 'pool/munc13/analysis/work_vl/work_vl.py'
else:
    work_file_rel = 'Presynaptic analysis work file'
work_path = os.path.join(os.getenv("HOME"), work_file_rel)

# columns of the presynaptic scalar tables that contain image paths
boundary_path_col = 'sv_membrane_file'
tethers_path_col = 'tethers_file'
connectors_path_col = 'connectors_file'

# path conversion rules for the above paths that allow running on different machines 
# used to convert path beginning up to (including) convert_path_common by convert_path_helper
if hostname == 'rauna':
    common_path = 'segmentation'
    helper_path = '/home/vladan/pool/munc13/segmentation'
elif hostname.startswith('hpcl5'):
    common_path = 'pool-lucic2'
    helper_path = '/fs/pool/pool-lucic2'
else:
    common_path = None
    helper_path = None

In [10]:
# Import the work file from the presynaptic analysis and preprocess

# import work
presyn = Presynaptic()
work = presyn.load(path=work_path)

# save absolute path to work
work_path_norm = os.path.normpath( os.path.join(os.getcwd(), work.__file__) )

# Read presynaptic munc13-snap25 project tether data
tether_indexed = work.tether.indexed_data.sort_values(by=['group', 'identifiers'])
tether_scalar = work.tether.scalar_data.sort_values(by=['group', 'identifiers'])
tether_scalar, tether_indexed = presyn.format(
    scalar=tether_scalar, indexed=tether_indexed)

# convert paths 
set_path = SetPath(common=common_path, helper_path=helper_path)
tether_scalar[boundary_path_col] = tether_scalar[boundary_path_col].map(
    lambda x: set_path.convert_path(x))
tether_scalar[tethers_path_col] = tether_scalar[tethers_path_col].map(
    lambda x: set_path.convert_path(x))
tether_scalar[connectors_path_col] = tether_scalar[connectors_path_col].map(
    lambda x: set_path.convert_path(x))
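
The path conversion used above replaces the beginning of a path, up to and including a common directory name, by a machine-specific prefix. The following stand-alone sketch shows that rule. It is not Pyto's `SetPath` implementation; the function `convert_path` and the example paths are invented for illustration.

```python
def convert_path(path, common, helper_path):
    """Replace the path start, up to and including `common`, by `helper_path`."""
    if common is None or helper_path is None:
        # no conversion rule given, keep the path as it is
        return path
    parts = path.split('/')
    if common not in parts:
        # the common directory is not part of this path, nothing to convert
        return path
    rest = parts[parts.index(common) + 1:]
    return '/'.join([helper_path.rstrip('/')] + rest)


# hypothetical path saved on one machine, converted for another one
old = '/fs/pool/pool-lucic2/some_dir/tomo_1/tethers.pkl'
new = convert_path(old, common='pool-lucic2', helper_path='/home/user/pool')
print(new)
```

Because the prefix is cut at a directory name rather than at a fixed length, the same rule works regardless of where the data tree is mounted.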

In [11]:
# Somewhat heuristic determination of AZ and SV ids
# ToDo: Use presyn object to determine ids for each tomo

print(f"Id values used in the following section are:")
bound_ids = np.vstack(tether_indexed['boundaries'].to_numpy())
az_ids = np.unique(bound_ids[:, 0])
if len(az_ids) == 1:
    az_mem_id_old = az_ids[0]
    print(f"\t- AZ membrane (var name az_mem_id_old): {az_mem_id_old} (determined)")
else:
    print(f"\t- AZ membrane (var name az_mem_id_old): could not be determined: "
          + f"values found {az_ids}")
presyn_cyto_id_old = 3 
print(f"\t- Presynaptic cytoplasm (var name presyn_cyto_id_old): "
          + f"{presyn_cyto_id_old} (guessed)")
min_sv_id_old = work.sv.indexed_data['ids'].to_numpy().min()
if min_sv_id_old > bound_ids[:, 1].min():
    print("Could not determine the smallest SV id")
else:
    print(f"\t- Smallest SV id (var name min_sv_id_old): {min_sv_id_old} (determined)")


Id values used in the following section are:
	- AZ membrane (var name az_mem_id_old): 2 (determined)
	- Presynaptic cytoplasm (var name presyn_cyto_id_old): 3 (guessed)
	- Smallest SV id (var name min_sv_id_old): 9 (determined)


### Task 2a: Make regions where all ids are different

Requires a previous execution of Task 1  
Used for further tasks that require filtering (Filtered and Focused preprocessing)

In [12]:
# read particles; execute if making original particles (task 1) was
# not executed in this session
tet = MultiParticleSets.read(path=preliminary_paths.mps_path_tmp)

Read  MPS object ../particles_example_bin-2_size-64/original/tables_preliminary/original_tmp.pkl
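
Judging from the logged file names, output locations follow the pattern root / set name / tables directory / file, where the root is obtained by filling the particle size into `root_template`. The sketch below reproduces that pattern with plain string handling. It is inferred from the output shown in this notebook and is not the `Paths` class itself; `mps_tmp_path` is a made-up helper.

```python
import posixpath

# template from the Parameters section
root_template = '../particles_example_bin-2_size-{size}'


def mps_tmp_path(name, size, tables_dir):
    """Compose the path of the preliminary MPS pickle for a particle set."""
    root = root_template.format(size=size)
    return posixpath.join(root, name, tables_dir, f'{name}_tmp.pkl')


path = mps_tmp_path(name='original', size=64, tables_dir='tables_preliminary')
print(path)
```

Changing `particle_size` therefore moves all outputs to a separate directory tree, so particle sets of different box sizes do not overwrite each other.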


In [13]:
# Set parameters and run

# name given to the regions (boundaries) generated here
regions_name = 'regions'

# flag indicating if region particles are written
write_regions = True

# bin factor of regions (boundaries) tomo from presynaptic project
region_bin = 4

# bin factor needed to bring region_bin to particles
# (>1 means magnify, that is transform to lower bin)
region_bin_factor = 2

# determines how the images are obtained, values:
#   - 'pkl_boundary': boundaries (regions) from structure pickles
#   - 'pkl_segment': segments from structure pickles
region_path_mode = 'pkl_boundary'

# rules that specify boundary ids, given as argument to function ex_mps.prepare_func(),
# (see above how the old id values were determined) 
normalize_bound_fun_kwargs = {
    'min_id_old': min_sv_id_old, 'id_new': sv_id, 
    'id_conversion': {
        az_mem_id_old: az_mem_id, presyn_cyto_id_old: presyn_cyto_id}} 

# dilate regions
dilate = None

# output dtype
out_dtype = np.float32

# remove region related columns 
# (so that they don't get confused with the new values)
#if hostname != 'slava':
#    remove_region_initial = True
#else:
remove_region_initial = False

# execute task
ex_mps = ExtractMPS(
    tomo_ids=tomo_ids, box_size=particle_size, root_template=root_template,
    region_bin=region_bin, region_bin_factor=region_bin_factor,
    remove_region_initial=remove_region_initial,
    init_coord_cols=center_init_frame_cols, center_reg_frame_cols=center_reg_frame_cols,    
    tomo_l_corner_cols=tomo_l_corner_cols, tomo_r_corner_cols=tomo_r_corner_cols,
    reg_l_corner_cols=reg_l_corner_cols, reg_r_corner_cols=reg_r_corner_cols,
    tomo_inside_col=tomo_inside_col, reg_inside_col=reg_inside_col,
    tomo_particle_col=tomo_particle_col, region_particle_col=region_particle_col,
    tables_dir=preliminary_tables_dir, class_names=class_names, class_code=class_code
)
ex_mps.extract_regions_task(
    mps=tet, scalar=tether_scalar, indexed=tether_indexed, struct_path_col='tethers_file', 
    region_path_mode=region_path_mode,
    convert_path_common=common_path, convert_path_helper=helper_path,
    path_col=tet.region_col, offset_cols=tet.region_offset_cols, 
    shape_cols=tet.region_shape_cols, bin_col=tet.region_bin_col,
    normalize_kwargs=normalize_bound_fun_kwargs, 
    dilate=dilate, out_dtype=out_dtype,
    fun=None, fun_kwargs={},
    expand=True, name_prefix='seg_', name_suffix='', 
    write_regions=write_regions, regions_name=regions_name, 
    mps_path=preliminary_paths.mps_path, verbose=verbose)

All regions:
Wrote regions (pkl_boundary) to ../particles_example_bin-2_size-64/regions
Pickled preliminary MPS object to ../particles_example_bin-2_size-64/regions/tables_preliminary/regions.pkl
Pickled preliminary MPS object to ../particles_example_bin-2_size-64/original/tables_preliminary/original.pkl
Wrote all regions star file ../particles_example_bin-2_size-64/regions/tables_preliminary/regions_all.star
Pickled json converted DataFrame version of all regions star file to ../particles_example_bin-2_size-64/regions/tables_preliminary/regions_all_star_json.pkl

Individual classes of regions:
Wrote class short regions star file ../particles_example_bin-2_size-64/regions/tables_preliminary/regions_short.star
Pickled json converted DataFrame version of class short regions star file to ../particles_example_bin-2_size-64/regions/tables_preliminary/regions_short_star_json.pkl
Wrote class medium regions star file ../particles_example_bin-2_size-64/regions/tables_preliminary/regions_medium.
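
The rules in `normalize_bound_fun_kwargs` can be read as follows: every id equal to or larger than `min_id_old` (that is, every synaptic vesicle) becomes `id_new`, and the remaining ids are mapped through `id_conversion`. The NumPy sketch below follows this reading, using the id values determined above. `normalize_ids` is a hypothetical stand-in for the Pyto function, whose handling of ids not covered by the rules may differ.

```python
import numpy as np


def normalize_ids(labels, min_id_old, id_new, id_conversion=None):
    """Map ids >= min_id_old to id_new and convert the other listed ids."""
    labels = np.asarray(labels)
    out = labels.copy()
    # all masks are computed on the original labels, so conversions
    # cannot chain into each other
    for old, new in (id_conversion or {}).items():
        out[labels == old] = new
    out[labels >= min_id_old] = id_new
    return out


# toy label values: background, AZ membrane, cytoplasm and three vesicles
labels = np.array([0, 2, 3, 9, 10, 57])

# Task 2a style: all regions get different ids
all_different = normalize_ids(
    labels, min_id_old=9, id_new=5, id_conversion={2: 2, 3: 3})
print(all_different)

# Task 2b style: AZ membrane and vesicles share one id
same_membrane = normalize_ids(
    labels, min_id_old=9, id_new=2, id_conversion={2: 2, 3: 1})
print(same_membrane)
```

Collapsing all vesicle ids into one value is what lets later steps treat "vesicle membrane" as a single region, regardless of which vesicle a tether contacts.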

### Task 2b: Make regions where membrane ids are the same 

Requires a previous execution of Task 1  
Used for averaging regions (Boundaries set)

In [14]:
# read particles, execute if making original particles (task 1) 
# was not executed in this session
tet = MultiParticleSets.read(path=preliminary_paths.mps_path_tmp)

Read  MPS object ../particles_example_bin-2_size-64/original/tables_preliminary/original_tmp.pkl


In [15]:
# Set parameters and run

# name given to the regions (boundaries) generated here
regions_name = 'regions_az-sv-2_cyto-1_float'

# flag indicating if region particles are written
write_regions = True

# bin factor of regions (boundaries) tomo from presynaptic project
region_bin = 4

# bin factor needed to bring region_bin to particles
# (>1 means magnify, that is transform to lower bin)
region_bin_factor = 2

# determines how the images are obtained, values:
#   - 'pkl_boundary': boundaries (regions) from structure pickles
#   - 'pkl_segment': segments from structure pickles
region_path_mode = 'pkl_boundary'

# ids assigned to active zone (plasma membrane), cytosol and synaptic vesicles 
# of region particles generated here
az_mem_id_loc = 2
presyn_cyto_id_loc = 1
sv_id_loc = 2

# rules that specify boundary ids, given as argument to ex_mps.prepare_func()
# (see above how the old id values were determined) 
normalize_bound_fun_kwargs = {
    'min_id_old': min_sv_id_old, 'id_new': sv_id_loc, 
    'id_conversion': {
        az_mem_id_old: az_mem_id_loc, presyn_cyto_id_old: presyn_cyto_id_loc}}
    
# dilate regions
dilate = None

# output dtype
out_dtype = np.float32

# remove region related columns 
# (so that they don't get confused with the new values)
#if hostname != 'slava':
#    remove_region_initial = True
#else:
remove_region_initial = False

# execute task
ex_mps = ExtractMPS(
    tomo_ids=tomo_ids, box_size=particle_size, root_template=root_template,
    region_bin=region_bin, region_bin_factor=region_bin_factor,
    remove_region_initial=remove_region_initial,
    init_coord_cols=center_init_frame_cols, center_reg_frame_cols=center_reg_frame_cols,    
    tomo_l_corner_cols=tomo_l_corner_cols, tomo_r_corner_cols=tomo_r_corner_cols,
    reg_l_corner_cols=reg_l_corner_cols, reg_r_corner_cols=reg_r_corner_cols,
    tomo_inside_col=tomo_inside_col, reg_inside_col=reg_inside_col,
    tomo_particle_col=tomo_particle_col, region_particle_col=region_particle_col,
    tables_dir=preliminary_tables_dir, class_names=class_names, class_code=class_code
)
ex_mps.extract_regions_task(
    mps=tet, scalar=tether_scalar, indexed=tether_indexed, struct_path_col='tethers_file', 
    region_path_mode=region_path_mode,
    convert_path_common=common_path, convert_path_helper=helper_path,
    path_col=tet.region_col, offset_cols=tet.region_offset_cols, 
    shape_cols=tet.region_shape_cols, bin_col=tet.region_bin_col,
    normalize_kwargs=normalize_bound_fun_kwargs,
    dilate=dilate, out_dtype=out_dtype,
    expand=True, name_prefix='seg_', name_suffix='', 
    write_regions=write_regions, regions_name=regions_name, mps_path=None,
    verbose=verbose)

All regions_az-sv-2_cyto-1_float:
Wrote regions_az-sv-2_cyto-1_float (pkl_boundary) to ../particles_example_bin-2_size-64/regions_az-sv-2_cyto-1_float
Pickled preliminary MPS object to ../particles_example_bin-2_size-64/regions_az-sv-2_cyto-1_float/tables_preliminary/regions_az-sv-2_cyto-1_float.pkl
Wrote all regions_az-sv-2_cyto-1_float star file ../particles_example_bin-2_size-64/regions_az-sv-2_cyto-1_float/tables_preliminary/regions_az-sv-2_cyto-1_float_all.star
Pickled json converted DataFrame version of all regions_az-sv-2_cyto-1_float star file to ../particles_example_bin-2_size-64/regions_az-sv-2_cyto-1_float/tables_preliminary/regions_az-sv-2_cyto-1_float_all_star_json.pkl

Individual classes of regions_az-sv-2_cyto-1_float:
Wrote class short regions_az-sv-2_cyto-1_float star file ../particles_example_bin-2_size-64/regions_az-sv-2_cyto-1_float/tables_preliminary/regions_az-sv-2_cyto-1_float_short.star
Pickled json converted DataFrame version of class short regions_az-sv-2_cyto

### Task 2c: Make tether segments 

Reads (saved output of) a previous execution of Task 1  
Just an example, not used for further processing (use task 2d instead)

In [16]:
# read particles, execute if making original particles (task 1) 
# was not executed in this session
tet = MultiParticleSets.read(path=preliminary_paths.mps_path_tmp)

Read  MPS object ../particles_example_bin-2_size-64/original/tables_preliminary/original_tmp.pkl


In [17]:
# Set parameters and run

# name given to segment particles generated here 
regions_name = 'segments'

# flag indicating if region particles are written
write_regions = True

# bin factor of regions (boundaries) tomo from presynaptic project
region_bin = 4

# bin factor needed to bring region_bin to particles
# (>1 means magnify, that is transform to lower bin)
region_bin_factor = 2

# determines how the images are obtained, values:
#   - 'pkl_boundary': boundaries (regions) from structure pickles
#   - 'pkl_segment': segments from structure pickles
region_path_mode = 'pkl_segment'

# smallest segment id of segment tomos from presynaptic project
min_seg_id_old = 1

# id assigned to all segments of segment particles generated here
seg_id = 4

# rules for changing segment ids, given as argument to ex_mps.prepare_func()
normalize_bound_fun_kwargs = {
    'min_id_old': min_seg_id_old, 'id_new': seg_id}

# dilate segments
dilate = None

# output dtype
out_dtype = np.float32

# remove region related columns 
# (so that they don't get confused with the new values)
#if hostname != 'slava':
#    remove_region_initial = True
#else:
remove_region_initial = False

# execute task
ex_mps = ExtractMPS(
    tomo_ids=tomo_ids, box_size=particle_size, root_template=root_template,
    region_bin=region_bin, region_bin_factor=region_bin_factor,
    remove_region_initial=remove_region_initial,
    init_coord_cols=center_init_frame_cols, center_reg_frame_cols=center_reg_frame_cols,    
    tomo_l_corner_cols=tomo_l_corner_cols, tomo_r_corner_cols=tomo_r_corner_cols,
    reg_l_corner_cols=reg_l_corner_cols, reg_r_corner_cols=reg_r_corner_cols,
    tomo_inside_col=tomo_inside_col, reg_inside_col=reg_inside_col,
    tomo_particle_col=tomo_particle_col, region_particle_col=region_particle_col,
    tables_dir=preliminary_tables_dir, class_names=class_names, class_code=class_code
)
ex_mps.extract_regions_task(
    mps=tet, scalar=tether_scalar, indexed=tether_indexed, struct_path_col='tethers_file', 
    region_path_mode=region_path_mode,
    convert_path_common=common_path, convert_path_helper=helper_path,
    path_col=tet.region_col, offset_cols=tet.region_offset_cols, 
    shape_cols=tet.region_shape_cols, bin_col=tet.region_bin_col,
    normalize_kwargs=normalize_bound_fun_kwargs,
    dilate=dilate, out_dtype=out_dtype,
    expand=True, name_prefix='seg_', name_suffix='', 
    write_regions=write_regions, regions_name=regions_name, mps_path=None, 
    verbose=verbose)

All segments:
Wrote segments (pkl_segment) to ../particles_example_bin-2_size-64/segments
Pickled preliminary MPS object to ../particles_example_bin-2_size-64/segments/tables_preliminary/segments.pkl
Wrote all segments star file ../particles_example_bin-2_size-64/segments/tables_preliminary/segments_all.star
Pickled json converted DataFrame version of all segments star file to ../particles_example_bin-2_size-64/segments/tables_preliminary/segments_all_star_json.pkl

Individual classes of segments:
Wrote class short segments star file ../particles_example_bin-2_size-64/segments/tables_preliminary/segments_short.star
Pickled json converted DataFrame version of class short segments star file to ../particles_example_bin-2_size-64/segments/tables_preliminary/segments_short_star_json.pkl
Wrote class medium segments star file ../particles_example_bin-2_size-64/segments/tables_preliminary/segments_medium.star
Pickled json converted DataFrame version of class medium segments star file to ../par

### Task 2d: Make dilated tether segments 

Reads (saved output of) a previous execution of Task 1  
Used for Focused preprocessing

In [18]:
# read particles, execute if making original particles (task 1) 
# was not executed in this session
tet = MultiParticleSets.read(path=preliminary_paths.mps_path_tmp)

Read  MPS object ../particles_example_bin-2_size-64/original/tables_preliminary/original_tmp.pkl


In [19]:
# Set parameters and run

# extent of dilation (in pixels)
dilate = 5

# name given to segment particles generated here 
regions_name = f'segments-dilate-{dilate}'

# flag indicating if region particles are written
write_regions = True

# bin factor of regions (boundaries) tomo from presynaptic project
region_bin = 4

# bin factor needed to bring region_bin to particles
# (>1 means magnify, that is transform to lower bin)
region_bin_factor = 2

# determines how the images are obtained, values:
#   - 'pkl_boundary': boundaries (regions) from structure pickles
#   - 'pkl_segment': segments from structure pickles
region_path_mode = 'pkl_segment'

# image magnification function
mag_fun = sp.ndimage.zoom
mag_fun_kwargs = {'zoom': region_bin_factor, 'order': 0}

# smallest segment id of segment tomos from presynaptic project
min_seg_id_old = 1

# rules for changing segment ids, given as argument to ex_mps.prepare_func()
# (see above how the old id values were determined) 
normalize_bound_fun = ExtractMPS.normalize_bound_ids
normalize_bound_fun_kwargs = {
    'min_id_old': min_seg_id_old, 'id_new': seg_id}

# dilation function and arguments
dilate_fun = sp.ndimage.grey_dilation
dilate_fun_kwargs = {'footprint': skimage.morphology.ball(dilate)}

# put functions together (unlike in Tasks 2a-c, explicit function form is used,
# just to show how other functions can be added)
fun = (normalize_bound_fun, mag_fun, dilate_fun, np.asarray)
fun_kwargs = (
    normalize_bound_fun_kwargs, mag_fun_kwargs, dilate_fun_kwargs,
    {'dtype': np.float32})

# remove region related columns 
# (so that they don't get confused with the new values)
#if hostname != 'slava':
#    remove_region_initial = True
#else:
remove_region_initial = False

# execute task
ex_mps = ExtractMPS(
    tomo_ids=tomo_ids, box_size=particle_size, root_template=root_template,
    region_bin=region_bin, region_bin_factor=region_bin_factor,
    remove_region_initial=remove_region_initial,
    init_coord_cols=center_init_frame_cols, center_reg_frame_cols=center_reg_frame_cols,    
    tomo_l_corner_cols=tomo_l_corner_cols, tomo_r_corner_cols=tomo_r_corner_cols,
    reg_l_corner_cols=reg_l_corner_cols, reg_r_corner_cols=reg_r_corner_cols,
    tomo_inside_col=tomo_inside_col, reg_inside_col=reg_inside_col,
    tomo_particle_col=tomo_particle_col, region_particle_col=region_particle_col,
    tables_dir=preliminary_tables_dir, class_names=class_names, class_code=class_code
)
ex_mps.extract_regions_task(
    mps=tet, scalar=tether_scalar, indexed=tether_indexed, struct_path_col='tethers_file', 
    region_path_mode=region_path_mode,
    convert_path_common=common_path, convert_path_helper=helper_path,
    path_col=tet.region_col, offset_cols=tet.region_offset_cols, 
    shape_cols=tet.region_shape_cols, bin_col=tet.region_bin_col,
    fun=fun, fun_kwargs=fun_kwargs,
    expand=True, name_prefix='seg_', name_suffix='', 
    write_regions=write_regions, regions_name=regions_name, mps_path=None, 
    verbose=verbose)

All segments-dilate-5:
Wrote segments-dilate-5 (pkl_segment) to ../particles_example_bin-2_size-64/segments-dilate-5
Pickled preliminary MPS object to ../particles_example_bin-2_size-64/segments-dilate-5/tables_preliminary/segments-dilate-5.pkl
Wrote all segments-dilate-5 star file ../particles_example_bin-2_size-64/segments-dilate-5/tables_preliminary/segments-dilate-5_all.star
Pickled json converted DataFrame version of all segments-dilate-5 star file to ../particles_example_bin-2_size-64/segments-dilate-5/tables_preliminary/segments-dilate-5_all_star_json.pkl

Individual classes of segments-dilate-5:
Wrote class short segments-dilate-5 star file ../particles_example_bin-2_size-64/segments-dilate-5/tables_preliminary/segments-dilate-5_short.star
Pickled json converted DataFrame version of class short segments-dilate-5 star file to ../particles_example_bin-2_size-64/segments-dilate-5/tables_preliminary/segments-dilate-5_short_star_json.pkl
Wrote class medium segments-dilate-5 star fil
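
The `fun` and `fun_kwargs` tuples in the cell above pair each function with its own keyword arguments. The sketch below shows one way such a chain could be applied, assuming that each function is called on the result of the previous one. The helper `apply_chain()` and the stand-in `relabel()` are hypothetical names used only for illustration; they are not part of Pyto, and `np.repeat` merely plays the role of the magnification step.

```python
import numpy as np

def apply_chain(data, funs, funs_kwargs):
    """Call each function on the previous result, with its own kwargs."""
    for fun, kwargs in zip(funs, funs_kwargs):
        data = fun(data, **kwargs)
    return data

def relabel(data, min_id_old, id_new):
    """Hypothetical stand-in: ids >= min_id_old become id_new, the rest 0."""
    return np.where(data >= min_id_old, id_new, 0)

# toy label image with two segments (ids 3 and 7)
labels = np.array([[0, 3], [7, 0]])

result = apply_chain(
    labels,
    funs=(relabel, np.repeat, np.asarray),
    funs_kwargs=(
        {'min_id_old': 1, 'id_new': 1},   # merge all segment ids into one
        {'repeats': 2, 'axis': 0},        # crude magnification along axis 0
        {'dtype': np.float32}))           # final conversion to float
```

Keeping functions and their arguments in parallel tuples makes it easy to add or remove a step without touching the others.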

### Task 2e: Clean particles

Reads (saved output of) previous execution of Tasks 1 and 2a-d.  

Removes particles where:
* The corresponding region particles do not contain all expected regions (AZ membrane, vesicle and cytoplasmic region)
* The corresponding segment particles do not contain the (tether) segment

This particle removal amounts to:
* Adding a column to the particle tables that indicates which particles are to be kept or removed
* Writing new star files that contain only the particles that are to be kept

The resulting tables and star files are saved in tables/ directories, as opposed to the tables_preliminary/ directories, which contain all particles (before cleaning).

Used for all preprocessing cases
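
The cleaning rule described above can be sketched in plain Python. This is only an illustration of the logic, not the code of `clean_particles_task()`: the record layout, the field names and the id values are assumptions made for the example.

```python
# Hypothetical stand-in ids; the real values are set in the Parameters section
AZ_MEM_ID, PRESYN_CYTO_ID, SV_ID, SEG_ID = 2, 3, 4, 1

def mark_keep(particles, expected_regions, expected_segments):
    """Return copies of the particle records with a boolean 'keep' flag."""
    exp_reg = set(expected_regions)
    exp_seg = set(expected_segments)
    marked = []
    for part in particles:
        # a particle is kept only if all expected ids were found
        keep = (exp_reg <= set(part['found_region_ids'])
                and exp_seg <= set(part['found_segment_ids']))
        marked.append({**part, 'keep': keep})
    return marked

particles = [
    {'particle_id': 0, 'found_region_ids': [2, 3, 4], 'found_segment_ids': [1]},
    {'particle_id': 1, 'found_region_ids': [2, 3], 'found_segment_ids': [1]},    # no vesicle
    {'particle_id': 2, 'found_region_ids': [2, 3, 4], 'found_segment_ids': []},  # no segment
]

marked = mark_keep(
    particles, expected_regions=[AZ_MEM_ID, PRESYN_CYTO_ID, SV_ID],
    expected_segments=[SEG_ID])

# only the flagged particles would be written to the cleaned star files
kept_ids = [part['particle_id'] for part in marked if part['keep']]
```

Flagging particles in a column, instead of deleting rows, keeps the preliminary and the cleaned tables aligned.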

In [20]:
# Set parameters and run

# processing results that are to be cleaned
processing_cases = [
    'original', 'regions', 'regions_az-sv-2_cyto-1_float',
    'segments', 'segments-dilate-5']

# expected ids of regions particles
expected_regions = [az_mem_id, presyn_cyto_id, sv_id]

# expected ids of segment particles
expected_segments = [seg_id]

# star file comment
star_comment = "Cleaned all hierarchical tethers"

# execute task
ex_mps = ExtractMPS(
    tables_dir=tables_dir, class_names=class_names, class_code=class_code,
    box_size=particle_size, root_template=root_template)
ex_mps.clean_particles_task(
    processing_cases=processing_cases, preliminary_tables_dir=preliminary_tables_dir,
    expected_regions=expected_regions, expected_segments=expected_segments,
    found_col=found_ids_col, verbose=verbose)

Read  MPS object ../particles_example_bin-2_size-64/regions/tables_preliminary/regions.pkl
Read  MPS object ../particles_example_bin-2_size-64/segments/tables_preliminary/segments.pkl

Processing original: 
Read  MPS object ../particles_example_bin-2_size-64/original/tables_preliminary/original.pkl
Pickled  MPS object to ../particles_example_bin-2_size-64/original/tables/original.pkl
Wrote  star file ../particles_example_bin-2_size-64/original/tables/original_all.star
Pickled json converted DataFrame version of  star file to ../particles_example_bin-2_size-64/original/tables/original_all_star_json.pkl
Wrote class short  star file ../particles_example_bin-2_size-64/original/tables/original_short.star
Pickled json converted DataFrame version of class short  star file to ../particles_example_bin-2_size-64/original/tables/original_short_star_json.pkl
Wrote class medium  star file ../particles_example_bin-2_size-64/original/tables/original_medium.star
Pickled json converted DataFrame versio

## Task 3: Filter tether particles

Reads saved outputs of tasks 1 and 2a

Used for Filtered and Focused preprocessing
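
The filter width is given in nm (`sigma_nm`), while `sp.ndimage.gaussian_filter` expects sigma in pixels. The sketch below shows the conversion on a toy image. The helper `gauss_filter_nm()` and the pixel size of 1 nm are assumptions made for the example; how `filter_particles_task()` obtains the pixel size is not shown here.

```python
import numpy as np
from scipy import ndimage

def gauss_filter_nm(image, sigma_nm, pixel_nm):
    """Gaussian low-pass filter with sigma given in nm."""
    sigma_pix = sigma_nm / pixel_nm   # convert nm to pixels
    return ndimage.gaussian_filter(image, sigma=sigma_pix)

# toy particle image containing white noise
rng = np.random.default_rng(0)
particle = rng.normal(size=(32, 32, 32)).astype(np.float32)

pixel_nm = 1.0   # hypothetical pixel size, not taken from the notebook

smooth_strong = gauss_filter_nm(particle, sigma_nm=5, pixel_nm=pixel_nm)
smooth_weak = gauss_filter_nm(particle, sigma_nm=0.5, pixel_nm=pixel_nm)
```

The larger sigma removes much more of the high-frequency signal, which is why the two filtered sets are later used for different membranes.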

In [6]:
# read initial particles, execute if needed
tet = MultiParticleSets.read(path=paths.mps_path)

Read  MPS object ../particles_example_bin-2_size-64/original/tables/original.pkl


In [7]:
# Gauss low-pass filtering sigma [nm]
sigma_nm = 5

# name given to the output image, MPS and star files 
name_local = f'gauss-{sigma_nm}nm'
if not invert_contrast:
    name_local = f"{name_local}_tomo_contrast"

# comment
star_comment = 'All hierarchical connectivity tethers, Gauss low-pass filtered'

# execute task
ex_mps = ExtractMPS(
    root_template=root_template, box_size=particle_size, 
    in_tomo_particle_col=in_tomo_particle_col, ctf_label=ctf_label,
    class_names=class_names, class_code=class_code)
ex_mps.filter_particles_task(
    mps=tet, fun=sp.ndimage.gaussian_filter, fun_kwargs={}, sigma_nm=sigma_nm, 
    name_init=name, name_filtered=name_local, 
    star_comment=star_comment)


All particles
Wrote particles to ../particles_example_bin-2_size-64/gauss-5nm
Pickled  MPS object to ../particles_example_bin-2_size-64/gauss-5nm/tables/gauss-5nm.pkl
Wrote  star file ../particles_example_bin-2_size-64/gauss-5nm/tables/gauss-5nm_all.star
Pickled json converted DataFrame version of  star file to ../particles_example_bin-2_size-64/gauss-5nm/tables/gauss-5nm_all_star_json.pkl

Individual classes of gauss-5nm:
Wrote class short  star file ../particles_example_bin-2_size-64/gauss-5nm/tables/gauss-5nm_short.star
Pickled json converted DataFrame version of class short  star file to ../particles_example_bin-2_size-64/gauss-5nm/tables/gauss-5nm_short_star_json.pkl
Wrote class medium  star file ../particles_example_bin-2_size-64/gauss-5nm/tables/gauss-5nm_medium.star
Pickled json converted DataFrame version of class medium  star file to ../particles_example_bin-2_size-64/gauss-5nm/tables/gauss-5nm_medium_star_json.pkl
Wrote class long  star file ../particles_example_bin-2_size-6

In [8]:
# Gauss low-pass filtering sigma [nm]
sigma_nm = 0.5

# name given to the output image, MPS and star files 
name_local = f'gauss-{sigma_nm}nm'
if not invert_contrast:
    name_local = f"{name_local}_tomo_contrast"

# comment
star_comment = 'All hierarchical connectivity tethers, Gauss low-pass filtered'

# execute task
ex_mps = ExtractMPS(
    root_template=root_template, box_size=particle_size, 
    in_tomo_particle_col=in_tomo_particle_col, ctf_label=ctf_label,
    class_names=class_names, class_code=class_code)
ex_mps.filter_particles_task(
    mps=tet, fun=sp.ndimage.gaussian_filter, fun_kwargs={}, sigma_nm=sigma_nm, 
    name_init=name, name_filtered=name_local, 
    star_comment=star_comment)


All particles
Wrote particles to ../particles_example_bin-2_size-64/gauss-0.5nm
Pickled  MPS object to ../particles_example_bin-2_size-64/gauss-0.5nm/tables/gauss-0.5nm.pkl
Wrote  star file ../particles_example_bin-2_size-64/gauss-0.5nm/tables/gauss-0.5nm_all.star
Pickled json converted DataFrame version of  star file to ../particles_example_bin-2_size-64/gauss-0.5nm/tables/gauss-0.5nm_all_star_json.pkl

Individual classes of gauss-0.5nm:
Wrote class short  star file ../particles_example_bin-2_size-64/gauss-0.5nm/tables/gauss-0.5nm_short.star
Pickled json converted DataFrame version of class short  star file to ../particles_example_bin-2_size-64/gauss-0.5nm/tables/gauss-0.5nm_short_star_json.pkl
Wrote class medium  star file ../particles_example_bin-2_size-64/gauss-0.5nm/tables/gauss-0.5nm_medium.star
Pickled json converted DataFrame version of class medium  star file to ../particles_example_bin-2_size-64/gauss-0.5nm/tables/gauss-0.5nm_medium_star_json.pkl
Wrote class long  star file .

## Task 4: Smooth membranes in tether particles

Reads saved output of tasks 1, 2a and 3 

Used for Filtered and Focused preprocessing
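
Selective smoothing can be pictured as copying pixels from a filtered particle into the original particle, but only where the region image carries a given id. The sketch below assumes that this is how the regions and the filtered sets are combined; `smooth_regions()` and the id values are hypothetical and only illustrate the idea.

```python
import numpy as np

def smooth_regions(original, regions, filtered, region_ids):
    """Replace pixels of selected regions by pixels from filtered images.

    filtered maps a region name to its filtered image,
    region_ids maps a region name to its id in the regions image.
    """
    result = original.copy()
    for name, filtered_image in filtered.items():
        inside = (regions == region_ids[name])
        result[inside] = filtered_image[inside]
    return result

# toy data: the first slab is "sv", the last slab is "az"
region_ids = {'sv': 4, 'az': 2}   # hypothetical ids
original = np.arange(27, dtype=float).reshape(3, 3, 3)
regions = np.zeros((3, 3, 3), dtype=int)
regions[0] = region_ids['sv']
regions[2] = region_ids['az']

# constant images stand in for the Gauss filtered particles
filtered = {
    'sv': np.full((3, 3, 3), -1.0),
    'az': np.full((3, 3, 3), -2.0)}

smoothed = smooth_regions(original, regions, filtered, region_ids)
```

Pixels that belong to neither region (the middle slab here) keep their original values.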

In [6]:
# smooth definitions
region_filters = {'sv': 'gauss-5nm'}
region_ids = {'sv': sv_id, "az": az_mem_id}

# names
name_init = 'original'
smooth_prefix = ''

# comment
star_comment = 'All hierarchical connectivity tethers, membranes smoothed'

# execute task
ex_mps = ExtractMPS(
    root_template=root_template, box_size=particle_size,     
    in_tomo_particle_col=in_tomo_particle_col, ctf_label=ctf_label,
    class_names=class_names, class_code=class_code)
ex_mps.smooth_regions_task(
    region_other=region_filters, region_ids=region_ids, 
    name_init=name_init, name_smooth=None, 
    prefix=smooth_prefix, star_comment=star_comment)
    

Read  MPS object ../particles_example_bin-2_size-64/original/tables/original.pkl
Read  MPS object ../particles_example_bin-2_size-64/gauss-5nm/tables/gauss-5nm.pkl

All particles
Wrote particles to ../particles_example_bin-2_size-64/sv-gauss-5nm
Pickled  MPS object to ../particles_example_bin-2_size-64/sv-gauss-5nm/tables/sv-gauss-5nm.pkl
Wrote  star file ../particles_example_bin-2_size-64/sv-gauss-5nm/tables/sv-gauss-5nm_all.star
Pickled json converted DataFrame version of  star file to ../particles_example_bin-2_size-64/sv-gauss-5nm/tables/sv-gauss-5nm_all_star_json.pkl

Individual classes of sv-gauss-5nm:
Wrote class short  star file ../particles_example_bin-2_size-64/sv-gauss-5nm/tables/sv-gauss-5nm_short.star
Pickled json converted DataFrame version of class short  star file to ../particles_example_bin-2_size-64/sv-gauss-5nm/tables/sv-gauss-5nm_short_star_json.pkl
Wrote class medium  star file ../particles_example_bin-2_size-64/sv-gauss-5nm/tables/sv-gauss-5nm_medium.star
Pickled 

In [7]:
# smooth definitions
region_filters = {'sv': 'gauss-5nm', 'az': 'gauss-0.5nm'}
region_ids = {'sv': sv_id, "az": az_mem_id}

# names
name_init = 'original'
smooth_prefix = ''

# comment
star_comment = 'All hierarchical connectivity tethers, membranes smoothed'

# execute task
ex_mps = ExtractMPS(
    root_template=root_template, box_size=particle_size,     
    in_tomo_particle_col=in_tomo_particle_col, ctf_label=ctf_label,
    class_names=class_names, class_code=class_code)
ex_mps.smooth_regions_task(
    region_other=region_filters, region_ids=region_ids, 
    name_init=name_init, name_smooth=None, 
    prefix=smooth_prefix, star_comment=star_comment)
    

Read  MPS object ../particles_example_bin-2_size-64/original/tables/original.pkl
Read  MPS object ../particles_example_bin-2_size-64/gauss-5nm/tables/gauss-5nm.pkl
Read  MPS object ../particles_example_bin-2_size-64/gauss-0.5nm/tables/gauss-0.5nm.pkl

All particles
Wrote particles to ../particles_example_bin-2_size-64/sv-gauss-5nm_az-gauss-0.5nm
Pickled  MPS object to ../particles_example_bin-2_size-64/sv-gauss-5nm_az-gauss-0.5nm/tables/sv-gauss-5nm_az-gauss-0.5nm.pkl
Wrote  star file ../particles_example_bin-2_size-64/sv-gauss-5nm_az-gauss-0.5nm/tables/sv-gauss-5nm_az-gauss-0.5nm_all.star
Pickled json converted DataFrame version of  star file to ../particles_example_bin-2_size-64/sv-gauss-5nm_az-gauss-0.5nm/tables/sv-gauss-5nm_az-gauss-0.5nm_all_star_json.pkl

Individual classes of sv-gauss-5nm_az-gauss-0.5nm:
Wrote class short  star file ../particles_example_bin-2_size-64/sv-gauss-5nm_az-gauss-0.5nm/tables/sv-gauss-5nm_az-gauss-0.5nm_short.star
Pickled json converted DataFrame versio

## Task 5: Randomize pixels

Reads saved outputs of task 1 and possibly of task 4

### Task 5a: Cylindrical mask for Plain and Filtered

Reads saved outputs of tasks 1 or 4

Used for Plain and Filtered preprocessing
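
The sketch below illustrates the two steps of this task on a toy image: making a cylindrical mask and randomizing the pixels outside of it. It is written with numpy only and makes several assumptions: the array index order is (x, y, z), the cylinder bounds are inclusive, and randomization is done by shuffling the outside-of-mask pixels. `Cylinder.make_image()` and `randomize_task()` may differ in all of these points.

```python
import numpy as np

def cylinder_mask(shape, z_min, z_max, rho):
    """Boolean cylinder mask, axis parallel to z and centered in x-y."""
    x, y, z = np.indices(shape)
    center_x, center_y = (shape[0] - 1) / 2, (shape[1] - 1) / 2
    inside_r = (x - center_x) ** 2 + (y - center_y) ** 2 <= rho ** 2
    inside_z = (z >= z_min) & (z <= z_max)
    return inside_r & inside_z

def randomize_outside(image, mask, rng):
    """Shuffle pixel values outside the mask, keep the inside unchanged."""
    result = image.copy()
    outside = ~mask
    result[outside] = rng.permutation(image[outside])
    return result

rng = np.random.default_rng(seed=1)
box = 64
image = rng.normal(size=(box, box, box))

# same cylinder parameters as in the first cell of this task
mask = cylinder_mask((box, box, box), z_min=12, z_max=34, rho=10)
randomized = randomize_outside(image, mask, rng)
```

Shuffling preserves the grey-value distribution outside the mask while removing its spatial structure.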

In [17]:
# Short Plain tethers randomization mask

# name of the particles that are randomized
init_name = 'original'

# cylindrical mask params 
#cylinder_z = [12, 42]
cylinder_z = [12, 34]
cylinder_rho = 10

# cylinder name (appended to init_name)
#cylinder_name = 'medium'
cylinder_name = f'cyl-r-{cylinder_rho}-z-{cylinder_z[0]}-{cylinder_z[1]}'

# make mask
cylinder = Cylinder.make_image(
    z_min=cylinder_z[0], z_max=cylinder_z[1], rho=cylinder_rho, 
    shape=particle_size, axis_xy='center')

# testing only, if True sets pixels that should be randomized to 0
test = False
test_suffix = ''
if test:
    test_suffix = '_test'

#randomize
ex_mps = ExtractMPS(
    root_template=root_template, box_size=particle_size, 
    in_tomo_particle_col=in_tomo_particle_col, ctf_label=ctf_label,
    class_names=class_names, class_code=class_code)
ex_mps.randomize_task(
    mask=cylinder, name_init=init_name, test=test,
    name_random=f'{init_name}_{cylinder_name}{test_suffix}',
    star_comment=f'Randomized {init_name} using {cylinder_name} cylinder mask.')

Read  MPS object ../particles_example_bin-2_size-64/original/tables/original.pkl

All particles
Wrote particles to ../particles_example_bin-2_size-64/original_cyl-r-10-z-12-34
Pickled  MPS object to ../particles_example_bin-2_size-64/original_cyl-r-10-z-12-34/tables/original_cyl-r-10-z-12-34.pkl
Wrote  star file ../particles_example_bin-2_size-64/original_cyl-r-10-z-12-34/tables/original_cyl-r-10-z-12-34_all.star
Pickled json converted DataFrame version of  star file to ../particles_example_bin-2_size-64/original_cyl-r-10-z-12-34/tables/original_cyl-r-10-z-12-34_all_star_json.pkl

Individual classes of original_cyl-r-10-z-12-34:
Wrote class short  star file ../particles_example_bin-2_size-64/original_cyl-r-10-z-12-34/tables/original_cyl-r-10-z-12-34_short.star
Pickled json converted DataFrame version of class short  star file to ../particles_example_bin-2_size-64/original_cyl-r-10-z-12-34/tables/original_cyl-r-10-z-12-34_short_star_json.pkl
Wrote class medium  star file ../particles_ex

In [18]:
# Short Filtered SV tethers randomization mask

# name of the particles that are randomized
init_name = 'sv-gauss-5nm'

# cylindrical mask params (bottom z, top z and radius)
cylinder_z = [12, 34]
cylinder_rho = 15

# cylinder name (appended to init_name)
#cylinder_name = 'medium'
cylinder_name = f'cyl-r-{cylinder_rho}-z-{cylinder_z[0]}-{cylinder_z[1]}'

# make mask
cylinder = Cylinder.make_image(
    z_min=cylinder_z[0], z_max=cylinder_z[1], rho=cylinder_rho, 
    shape=particle_size, axis_xy='center')

# testing only, if True sets pixels that should be randomized to 0
test = False
test_suffix = ''
if test:
    test_suffix = '_test'

#randomize
ex_mps = ExtractMPS(
    root_template=root_template, box_size=particle_size, 
    in_tomo_particle_col=in_tomo_particle_col, ctf_label=ctf_label,
    class_names=class_names, class_code=class_code)
ex_mps.randomize_task(
    mask=cylinder, name_init=init_name, name_random=f'{init_name}_{cylinder_name}',
    star_comment=f'Randomized {init_name} using {cylinder_name} cylinder mask.')

Read  MPS object ../particles_example_bin-2_size-64/sv-gauss-5nm/tables/sv-gauss-5nm.pkl

All particles
Wrote particles to ../particles_example_bin-2_size-64/sv-gauss-5nm_cyl-r-15-z-12-34
Pickled  MPS object to ../particles_example_bin-2_size-64/sv-gauss-5nm_cyl-r-15-z-12-34/tables/sv-gauss-5nm_cyl-r-15-z-12-34.pkl
Wrote  star file ../particles_example_bin-2_size-64/sv-gauss-5nm_cyl-r-15-z-12-34/tables/sv-gauss-5nm_cyl-r-15-z-12-34_all.star
Pickled json converted DataFrame version of  star file to ../particles_example_bin-2_size-64/sv-gauss-5nm_cyl-r-15-z-12-34/tables/sv-gauss-5nm_cyl-r-15-z-12-34_all_star_json.pkl

Individual classes of sv-gauss-5nm_cyl-r-15-z-12-34:
Wrote class short  star file ../particles_example_bin-2_size-64/sv-gauss-5nm_cyl-r-15-z-12-34/tables/sv-gauss-5nm_cyl-r-15-z-12-34_short.star
Pickled json converted DataFrame version of class short  star file to ../particles_example_bin-2_size-64/sv-gauss-5nm_cyl-r-15-z-12-34/tables/sv-gauss-5nm_cyl-r-15-z-12-34_short_sta

In [19]:
# Short Filtered SV&PM tethers randomization mask

# name of the particles that are randomized
init_name = 'sv-gauss-5nm_az-gauss-0.5nm'

# cylindrical mask params (bottom z, top z and radius)
cylinder_z = [12, 34]
cylinder_rho = 15

# cylinder name (appended to init_name)
#cylinder_name = 'medium'
cylinder_name = f'cyl-r-{cylinder_rho}-z-{cylinder_z[0]}-{cylinder_z[1]}'

# make mask
cylinder = Cylinder.make_image(
    z_min=cylinder_z[0], z_max=cylinder_z[1], rho=cylinder_rho, 
    shape=particle_size, axis_xy='center')

# testing only, if True sets pixels that should be randomized to 0
test = False
test_suffix = ''
if test:
    test_suffix = '_test'

#randomize
ex_mps = ExtractMPS(
    root_template=root_template, box_size=particle_size, 
    in_tomo_particle_col=in_tomo_particle_col, ctf_label=ctf_label,
    class_names=class_names, class_code=class_code)
ex_mps.randomize_task(
    mask=cylinder, name_init=init_name, name_random=f'{init_name}_{cylinder_name}',
    star_comment=f'Randomized {init_name} using {cylinder_name} cylinder mask.')

Read  MPS object ../particles_example_bin-2_size-64/sv-gauss-5nm_az-gauss-0.5nm/tables/sv-gauss-5nm_az-gauss-0.5nm.pkl

All particles
Wrote particles to ../particles_example_bin-2_size-64/sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-34
Pickled  MPS object to ../particles_example_bin-2_size-64/sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-34/tables/sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-34.pkl
Wrote  star file ../particles_example_bin-2_size-64/sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-34/tables/sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-34_all.star
Pickled json converted DataFrame version of  star file to ../particles_example_bin-2_size-64/sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-34/tables/sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-34_all_star_json.pkl

Individual classes of sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-34:
Wrote class short  star file ../particles_example_bin-2_size-64/sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-34/tables/sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-34_short.star
P

In [14]:
# Intermediate Plain tethers randomization mask

# name of the particles that are randomized
init_name = 'original'

# cylindrical mask params 
cylinder_z = [12, 42]
#cylinder_z = [12, 34]
cylinder_rho = 10

# cylinder name (appended to init_name)
#cylinder_name = 'medium'
cylinder_name = f'cyl-r-{cylinder_rho}-z-{cylinder_z[0]}-{cylinder_z[1]}'

# make mask
cylinder = Cylinder.make_image(
    z_min=cylinder_z[0], z_max=cylinder_z[1], rho=cylinder_rho, 
    shape=particle_size, axis_xy='center')

# testing only, if True sets pixels that should be randomized to 0
test = False
test_suffix = ''
if test:
    test_suffix = '_test'

#randomize
ex_mps = ExtractMPS(
    root_template=root_template, box_size=particle_size, 
    in_tomo_particle_col=in_tomo_particle_col, ctf_label=ctf_label,
    class_names=class_names, class_code=class_code)
ex_mps.randomize_task(
    mask=cylinder, name_init=init_name, test=test,
    name_random=f'{init_name}_{cylinder_name}{test_suffix}',
    star_comment=f'Randomized {init_name} using {cylinder_name} cylinder mask.')

Read  MPS object ../particles_example_bin-2_size-64/original/tables/original.pkl

All particles
Wrote particles to ../particles_example_bin-2_size-64/original_cyl-r-10-z-12-42
Pickled  MPS object to ../particles_example_bin-2_size-64/original_cyl-r-10-z-12-42/tables/original_cyl-r-10-z-12-42.pkl
Wrote  star file ../particles_example_bin-2_size-64/original_cyl-r-10-z-12-42/tables/original_cyl-r-10-z-12-42_all.star
Pickled json converted DataFrame version of  star file to ../particles_example_bin-2_size-64/original_cyl-r-10-z-12-42/tables/original_cyl-r-10-z-12-42_all_star_json.pkl

Individual classes of original_cyl-r-10-z-12-42:
Wrote class short  star file ../particles_example_bin-2_size-64/original_cyl-r-10-z-12-42/tables/original_cyl-r-10-z-12-42_short.star
Pickled json converted DataFrame version of class short  star file to ../particles_example_bin-2_size-64/original_cyl-r-10-z-12-42/tables/original_cyl-r-10-z-12-42_short_star_json.pkl
Wrote class medium  star file ../particles_ex

In [15]:
# Intermediate Filtered SV tethers randomization mask

# name of the particles that are randomized
init_name = 'sv-gauss-5nm'

# cylindrical mask params (bottom z, top z and radius)
cylinder_z = [12, 42]
cylinder_rho = 15

# cylinder name (appended to init_name)
#cylinder_name = 'medium'
cylinder_name = f'cyl-r-{cylinder_rho}-z-{cylinder_z[0]}-{cylinder_z[1]}'

# make mask
cylinder = Cylinder.make_image(
    z_min=cylinder_z[0], z_max=cylinder_z[1], rho=cylinder_rho, 
    shape=particle_size, axis_xy='center')

# testing only, if True sets pixels that should be randomized to 0
test = False
test_suffix = ''
if test:
    test_suffix = '_test'

#randomize
ex_mps = ExtractMPS(
    root_template=root_template, box_size=particle_size, 
    in_tomo_particle_col=in_tomo_particle_col, ctf_label=ctf_label,
    class_names=class_names, class_code=class_code)
ex_mps.randomize_task(
    mask=cylinder, name_init=init_name, name_random=f'{init_name}_{cylinder_name}',
    star_comment=f'Randomized {init_name} using {cylinder_name} cylinder mask.')

Read  MPS object ../particles_example_bin-2_size-64/sv-gauss-5nm/tables/sv-gauss-5nm.pkl

All particles
Wrote particles to ../particles_example_bin-2_size-64/sv-gauss-5nm_cyl-r-15-z-12-42
Pickled  MPS object to ../particles_example_bin-2_size-64/sv-gauss-5nm_cyl-r-15-z-12-42/tables/sv-gauss-5nm_cyl-r-15-z-12-42.pkl
Wrote  star file ../particles_example_bin-2_size-64/sv-gauss-5nm_cyl-r-15-z-12-42/tables/sv-gauss-5nm_cyl-r-15-z-12-42_all.star
Pickled json converted DataFrame version of  star file to ../particles_example_bin-2_size-64/sv-gauss-5nm_cyl-r-15-z-12-42/tables/sv-gauss-5nm_cyl-r-15-z-12-42_all_star_json.pkl

Individual classes of sv-gauss-5nm_cyl-r-15-z-12-42:
Wrote class short  star file ../particles_example_bin-2_size-64/sv-gauss-5nm_cyl-r-15-z-12-42/tables/sv-gauss-5nm_cyl-r-15-z-12-42_short.star
Pickled json converted DataFrame version of class short  star file to ../particles_example_bin-2_size-64/sv-gauss-5nm_cyl-r-15-z-12-42/tables/sv-gauss-5nm_cyl-r-15-z-12-42_short_sta

In [16]:
# Intermediate Filtered SV&PM tethers randomization mask

# name of the particles that are randomized
init_name = 'sv-gauss-5nm_az-gauss-0.5nm'

# cylindrical mask params (bottom z, top z and radius)
cylinder_z = [12, 42]
cylinder_rho = 15

# cylinder name (appended to init_name)
#cylinder_name = 'medium'
cylinder_name = f'cyl-r-{cylinder_rho}-z-{cylinder_z[0]}-{cylinder_z[1]}'

# make mask
cylinder = Cylinder.make_image(
    z_min=cylinder_z[0], z_max=cylinder_z[1], rho=cylinder_rho, 
    shape=particle_size, axis_xy='center')

# testing only, if True sets pixels that should be randomized to 0
test = False
test_suffix = ''
if test:
    test_suffix = '_test'

#randomize
ex_mps = ExtractMPS(
    root_template=root_template, box_size=particle_size, 
    in_tomo_particle_col=in_tomo_particle_col, ctf_label=ctf_label,
    class_names=class_names, class_code=class_code)
ex_mps.randomize_task(
    mask=cylinder, name_init=init_name, name_random=f'{init_name}_{cylinder_name}',
    star_comment=f'Randomized {init_name} using {cylinder_name} cylinder mask.')

Read  MPS object ../particles_example_bin-2_size-64/sv-gauss-5nm_az-gauss-0.5nm/tables/sv-gauss-5nm_az-gauss-0.5nm.pkl

All particles
Wrote particles to ../particles_example_bin-2_size-64/sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-42
Pickled  MPS object to ../particles_example_bin-2_size-64/sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-42/tables/sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-42.pkl
Wrote  star file ../particles_example_bin-2_size-64/sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-42/tables/sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-42_all.star
Pickled json converted DataFrame version of  star file to ../particles_example_bin-2_size-64/sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-42/tables/sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-42_all_star_json.pkl

Individual classes of sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-42:
Wrote class short  star file ../particles_example_bin-2_size-64/sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-42/tables/sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-42_short.star
P

### Task 5b: Segments mask

Reads outputs of tasks 1 and 2d 

Used for Focused preprocessing
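
The mask used in this task comes from the dilated segment particles (`segments-dilate-5`), which Task 2d makes with `sp.ndimage.grey_dilation` and a `skimage.morphology.ball` footprint. The sketch below shows how a dilated segment turns into a boolean mask. The `ball()` helper is a numpy stand-in for `skimage.morphology.ball`, and the toy segment, its id and the radius of 3 pixels are assumptions made for the example.

```python
import numpy as np
from scipy import ndimage

def ball(radius):
    """Numpy stand-in for skimage.morphology.ball (1 inside, 0 outside)."""
    grid = np.indices((2 * radius + 1,) * 3) - radius
    return (np.sum(grid ** 2, axis=0) <= radius ** 2).astype(np.uint8)

# thin toy "tether" along z, labeled by a hypothetical segment id 1
segment = np.zeros((21, 21, 21), dtype=np.int32)
segment[10, 10, 8:13] = 1

# dilate the label image and convert it to a boolean mask
dilated = ndimage.grey_dilation(segment, footprint=ball(3))
mask = dilated > 0
```

Pixels where `mask` is False are the ones that would be randomized, so the dilation radius controls how much of the tether surroundings is retained.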

In [12]:
# Focused tethers randomization mask

# name of the particles that are randomized
init_name = 'original'

# name of the segments to use as masks
segment_name = 'segments-dilate-5'

# testing only, if True sets pixels that should be randomized to 0
test = False
test_suffix = ''
if test:
    test_suffix = '_test'

#randomize
ex_mps = ExtractMPS(
    root_template=root_template, box_size=particle_size, 
    in_tomo_particle_col=in_tomo_particle_col, ctf_label=ctf_label,
    class_names=class_names, class_code=class_code)
ex_mps.randomize_task(
    mask=ex_mps.region_particle_col, mask_mode='path_col', 
    name_init=init_name, name_segment=segment_name,
    name_random=f'{init_name}_{segment_name}{test_suffix}', test=test, 
    star_comment=f'Randomized {init_name} using {segment_name} segmentation mask.')

Read  MPS object ../particles_example_bin-2_size-64/original/tables/original.pkl
Read  MPS object ../particles_example_bin-2_size-64/segments-dilate-5/tables/segments-dilate-5.pkl

All particles
Wrote particles to ../particles_example_bin-2_size-64/original_segments-dilate-5
Pickled  MPS object to ../particles_example_bin-2_size-64/original_segments-dilate-5/tables/original_segments-dilate-5.pkl
Wrote  star file ../particles_example_bin-2_size-64/original_segments-dilate-5/tables/original_segments-dilate-5_all.star
Pickled json converted DataFrame version of  star file to ../particles_example_bin-2_size-64/original_segments-dilate-5/tables/original_segments-dilate-5_all_star_json.pkl

Individual classes of original_segments-dilate-5:
Wrote class short  star file ../particles_example_bin-2_size-64/original_segments-dilate-5/tables/original_segments-dilate-5_short.star
Pickled json converted DataFrame version of class short  star file to ../particles_example_bin-2_size-64/original_segmen

### Task 5c: Segments mask dilated on cytoplasm

Reads outputs of tasks 2a, 2d, 3 and 5b 

Used for Focused preprocessing

In [13]:
# Put filtered SV and PM on segment-randomized particles

# smooth definitions
region_filters = {'sv': 'gauss-5nm', 'az': 'gauss-0.5nm'}
region_ids = {'sv': sv_id, "az": az_mem_id}

# names
name_init = 'original_segments-dilate-5'
name_regions = 'regions'
prefix = 'segments-5_'

# comment
star_comment = 'All hierarchical connectivity tethers, segments dilated, membranes smoothed'

# execute task
ex_mps = ExtractMPS(
    root_template=root_template, box_size=particle_size,     
    in_tomo_particle_col=in_tomo_particle_col, ctf_label=ctf_label,
    class_names=class_names, class_code=class_code)
ex_mps.smooth_regions_task(
    region_other=region_filters, region_ids=region_ids, 
    name_init=name_init, name_smooth=None, name_regions=name_regions,
    prefix=prefix, star_comment=star_comment)
    

Read  MPS object ../particles_example_bin-2_size-64/original_segments-dilate-5/tables/original_segments-dilate-5.pkl
Read  MPS object ../particles_example_bin-2_size-64/gauss-5nm/tables/gauss-5nm.pkl
Read  MPS object ../particles_example_bin-2_size-64/gauss-0.5nm/tables/gauss-0.5nm.pkl
Read  MPS object ../particles_example_bin-2_size-64/regions/tables/regions.pkl

All particles
Wrote particles to ../particles_example_bin-2_size-64/segments-5_sv-gauss-5nm_az-gauss-0.5nm
Pickled  MPS object to ../particles_example_bin-2_size-64/segments-5_sv-gauss-5nm_az-gauss-0.5nm/tables/segments-5_sv-gauss-5nm_az-gauss-0.5nm.pkl
Wrote  star file ../particles_example_bin-2_size-64/segments-5_sv-gauss-5nm_az-gauss-0.5nm/tables/segments-5_sv-gauss-5nm_az-gauss-0.5nm_all.star
Pickled json converted DataFrame version of  star file to ../particles_example_bin-2_size-64/segments-5_sv-gauss-5nm_az-gauss-0.5nm/tables/segments-5_sv-gauss-5nm_az-gauss-0.5nm_all_star_json.pkl

Individual classes of segments-5_sv-

### Task 5a: Cylindrical mask for Focused

Reads saved outputs of task 5c

Used for Focused preprocessing

In [22]:
# Short Focused tethers cylindrical randomization mask

# Needed to restrict the size of filtered SV and PM, so that the filtered 
# SV and PM imposed in 5c do not reach out of the randomization cylinder used in 5a  

# name of the particles that are randomized
init_name = 'segments-5_sv-gauss-5nm_az-gauss-0.5nm'

# cylindrical mask params (bottom z, top z and radius)
#cylinder_z = [12, 42]
cylinder_z = [12, 34]
cylinder_rho = 15

# cylinder name (appended to init_name)
#cylinder_name = 'short'
cylinder_name = f'cyl-r-{cylinder_rho}-z-{cylinder_z[0]}-{cylinder_z[1]}'

# make mask
cylinder = Cylinder.make_image(
    z_min=cylinder_z[0], z_max=cylinder_z[1], rho=cylinder_rho, 
    shape=particle_size, axis_xy='center')

# testing only, if True sets pixels that should be randomized to 0
test = False
test_suffix = ''
if test:
    test_suffix = '_test'

#randomize
ex_mps = ExtractMPS(
    root_template=root_template, box_size=particle_size, 
    in_tomo_particle_col=in_tomo_particle_col, ctf_label=ctf_label,
    class_names=class_names, class_code=class_code)
ex_mps.randomize_task(
    mask=cylinder, name_init=init_name, test=test,
    name_random=f'{init_name}_{cylinder_name}{test_suffix}',
    star_comment=f'Randomized {init_name} using {cylinder_name} cylinder mask.')

Read  MPS object ../particles_example_bin-2_size-64/segments-5_sv-gauss-5nm_az-gauss-0.5nm/tables/segments-5_sv-gauss-5nm_az-gauss-0.5nm.pkl

All particles
Wrote particles to ../particles_example_bin-2_size-64/segments-5_sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-34
Pickled  MPS object to ../particles_example_bin-2_size-64/segments-5_sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-34/tables/segments-5_sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-34.pkl
Wrote  star file ../particles_example_bin-2_size-64/segments-5_sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-34/tables/segments-5_sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-34_all.star
Pickled json converted DataFrame version of  star file to ../particles_example_bin-2_size-64/segments-5_sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-34/tables/segments-5_sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-34_all_star_json.pkl

Individual classes of segments-5_sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-34:
Wrote class short  star file ../particles_example_bin-2_size-64

In [23]:
# Intermediate Focused tethers cylindrical randomization mask

# Needed to restrict the size of filtered SV and PM, so that the filtered 
# SV and PM imposed in 5c do not reach out of the randomization cylinder used in 5a  

# name of the particles that are randomized
init_name = 'segments-5_sv-gauss-5nm_az-gauss-0.5nm'

# cylindrical mask params (bottom z, top z and radius)
cylinder_z = [12, 42]
#cylinder_z = [12, 34]
cylinder_rho = 15

# cylinder name (appended to init_name)
#cylinder_name = 'medium'
cylinder_name = f'cyl-r-{cylinder_rho}-z-{cylinder_z[0]}-{cylinder_z[1]}'

# make mask
cylinder = Cylinder.make_image(
    z_min=cylinder_z[0], z_max=cylinder_z[1], rho=cylinder_rho, 
    shape=particle_size, axis_xy='center')

# testing only, if True sets pixels that should be randomized to 0
test = False
test_suffix = ''
if test:
    test_suffix = '_test'

#randomize
ex_mps = ExtractMPS(
    root_template=root_template, box_size=particle_size, 
    in_tomo_particle_col=in_tomo_particle_col, ctf_label=ctf_label,
    class_names=class_names, class_code=class_code)
ex_mps.randomize_task(
    mask=cylinder, name_init=init_name, test=test,
    name_random=f'{init_name}_{cylinder_name}{test_suffix}',
    star_comment=f'Randomized {init_name} using {cylinder_name} cylinder mask.')

Read  MPS object ../particles_example_bin-2_size-64/segments-5_sv-gauss-5nm_az-gauss-0.5nm/tables/segments-5_sv-gauss-5nm_az-gauss-0.5nm.pkl

All particles
Wrote particles to ../particles_example_bin-2_size-64/segments-5_sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-42
Pickled  MPS object to ../particles_example_bin-2_size-64/segments-5_sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-42/tables/segments-5_sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-42.pkl
Wrote  star file ../particles_example_bin-2_size-64/segments-5_sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-42/tables/segments-5_sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-42_all.star
Pickled json converted DataFrame version of  star file to ../particles_example_bin-2_size-64/segments-5_sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-42/tables/segments-5_sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-42_all_star_json.pkl

Individual classes of segments-5_sv-gauss-5nm_az-gauss-0.5nm_cyl-r-15-z-12-42:
Wrote class short  star file ../particles_example_bin-2_size-64

## Auxiliary task: Star file manipulations

Helps with making relion star files
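
The two manipulations done in this task, copying the current angles to priors and redirecting image paths to another particle set, can be sketched on plain records. This is not the code of `relion_tools.update_priors_replace()`: the function `update_priors_and_paths()`, the record layout and the image file name are hypothetical. The column names follow the standard relion labels.

```python
def update_priors_and_paths(
        rows, pattern=None, replace=None, label='rlnImageName'):
    """Copy angles to priors and replace a pattern in the path column."""
    angle_labels = ['rlnAngleRot', 'rlnAngleTilt', 'rlnAnglePsi']
    updated = []
    for row in rows:
        new_row = dict(row)
        # current angles become priors for the next refinement
        for angle in angle_labels:
            new_row[f'{angle}Prior'] = row[angle]
        # point to the same particle in another particle set
        if pattern is not None:
            new_row[label] = row[label].replace(pattern, replace)
        updated.append(new_row)
    return updated

rows = [{
    'rlnImageName': '../particles_bin-2_size-64/original/short/p_1.mrc',
    'rlnAngleRot': 10.0, 'rlnAngleTilt': 20.0, 'rlnAnglePsi': 30.0}]

updated = update_priors_and_paths(
    rows, pattern='particles_bin-2_size-64/original',
    replace='particles_bin-2_size-64/sv-gauss-3nm')
```

Because only the directory part of the path changes, the same refinement result can be reused as a starting point for a differently preprocessed particle set.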

In [5]:
# Modifies an existing star file:
#   - copies the current angles to priors
#   - changes particle sets by replacing particle set directories (useful
#     when the same operation has to be done on different particle sets)

# input star
in_star_path = '../particles_bin-2_size-64/original/tables/original_all.star'

# output star
out_star_path = 'test.star'

# indicates whether prior angles are updated
update_priors = True

# original string (None for no replacement)
path_pattern = 'particles_bin-2_size-64/original'

# replacement string
path_replace = 'particles_bin-2_size-64/sv-gauss-3nm'

# star column where pattern replacement is made
path_update_label = 'rlnImageName'

# comment written in the output star file
star_comment = "Updated priors and image paths"

relion_tools.update_priors_replace(
    in_star_path=in_star_path, out_star_path=out_star_path, 
    update_priors=update_priors, 
    pattern=path_pattern, replace=path_replace, star_comment=star_comment)