### Dependencies

In [None]:
%load_ext autoreload
%autoreload 2

from cryocat import cryomap
from cryocat import surfsamp
from cryocat import cryomotl
import numpy as np
import pandas as pd
import os
from skimage.transform import resize

### Expected time

All steps should complete within seconds to minutes. `boundary_sampling` and `inner_and_outer_pc` might take longer, depending on the number of points.

### Input data

The data and expected output for this tutorial are available in 'tests/test_data/point_clouds'. The expected output of the multi-tomogram example can be found in 'cryoCAT/tests/test_data/point_clouds/motls'.

### Loading shape labels and sampling the surface

This example loads a shape label stored in a CSV file, which contains the layer number and point coordinates as a (V, 4) table, into the SamplePoints class and then samples the surface. Sample points are positioned along the edges of the convex hull formed by the input coordinates. Each point is assigned a normal vector perpendicular to the surface. The sampling distance between points can be set by the user; it defaults to 1 unit if not provided.

In [None]:
spcsv = surfsamp.SamplePoints.load("../../../../tests/test_data/point_clouds/040_1_shape.csv")
spcsv.boundary_sampling(sampling_distance = 10)
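
To make the idea of a convex hull with outward normals more concrete, here is a minimal sketch using `scipy.spatial.ConvexHull`. It is not cryoCAT's implementation, and the input points and variable names are assumptions chosen for illustration only.

```python
import numpy as np
from scipy.spatial import ConvexHull

# Hypothetical input: the 8 corners of a unit cube plus one interior point.
points = np.array(
    [[x, y, z] for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
    + [[0.5, 0.5, 0.5]]
)

hull = ConvexHull(points)

# Each row of hull.equations is [nx, ny, nz, offset] for one triangular facet;
# the first three entries form the outward-pointing unit normal.
facet_normals = hull.equations[:, :3]

# Centroid of each facet, a natural place to attach the facet normal.
facet_centers = points[hull.simplices].mean(axis=1)

print(hull.area)            # total surface area of the hull
print(facet_normals.shape)  # one normal per facet
```

The interior point does not contribute to the hull, which is why only the outer surface of a shape label is sampled.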

### Loading mask and sampling the surface

This example loads a mask into the SamplePoints class and then samples the mask boundary. Sample points are positioned at the edge of the mask, each with a normal vector perpendicular to the surface.

In [None]:
spmask = surfsamp.SamplePoints.load("../../../../tests/test_data/point_clouds/masks/040_generated_mask_2.mrc")
spmask.boundary_sampling(sampling_distance = 10)
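
As a rough illustration of what "the edge of the mask" means, the sketch below extracts boundary voxels from a binary volume with `scipy.ndimage` and estimates outward normals from the gradient of a smoothed copy. This is one possible approach under stated assumptions (a synthetic cube mask, Gaussian smoothing with `sigma=1.0`), not necessarily how `boundary_sampling` works internally.

```python
import numpy as np
from scipy import ndimage

# Hypothetical binary mask: a solid 5x5x5 cube inside a 9x9x9 volume.
mask = np.zeros((9, 9, 9), dtype=bool)
mask[2:7, 2:7, 2:7] = True

# Voxels that survive erosion are interior; the rest of the mask is boundary.
interior = ndimage.binary_erosion(mask)
boundary = mask & ~interior

# Coordinates of the boundary voxels, shape (V, 3).
boundary_coords = np.argwhere(boundary)

# Approximate normals from the gradient of the smoothed mask: the gradient
# points toward higher values (into the mask), so negate it to point outward.
smoothed = ndimage.gaussian_filter(mask.astype(float), sigma=1.0)
grad = np.stack(np.gradient(smoothed), axis=-1)[boundary]
normals = -grad / np.linalg.norm(grad, axis=1, keepdims=True)

print(boundary_coords.shape, normals.shape)
```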

### Writing the sample points to a motl_list.em file

Sample points will be written into a motl_list file, where only the x, y, z, phi, psi, and theta columns are filled, while all other columns are set to 0 by default. Users can specify additional columns by providing an input_dict, where the keys correspond to the column names to be filled, and the values are NumPy arrays of shape (V, 1), where V matches the number of points.\
In this example:
- The spmask example saves the sample points with tomo_id and object_id filled.
- The spcsv example saves only the sample points.

In [None]:
tomo_id = '040'
obj_id = '001'
input_dict = {'tomo_id':np.ones(spmask.vertices.shape[0])*float(tomo_id), 'object_id':np.ones(spmask.vertices.shape[0])*float(obj_id)}
spmask.write("../../../../tests/test_data/point_clouds/motl_040mask_sp10.em", input_dict)
spcsv.write("../../../../tests/test_data/point_clouds/motl_040csv_sp10.em")
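
The constant ID columns can also be built with `np.full`, which is equivalent to multiplying `np.ones` by the value. The sketch below uses a hypothetical helper, `make_id_columns`, and a placeholder array standing in for `spmask.vertices`; neither is part of cryoCAT.

```python
import numpy as np

def make_id_columns(n_points, tomo_id, obj_id):
    """Hypothetical helper: build constant ID columns, one value per point."""
    return {
        "tomo_id": np.full(n_points, float(tomo_id)),
        "object_id": np.full(n_points, float(obj_id)),
    }

# Stand-in for spmask.vertices: 5 points with x, y, z coordinates.
vertices = np.zeros((5, 3))
input_dict = make_id_columns(vertices.shape[0], "040", "001")

# Every column must have exactly one entry per sample point.
assert all(len(col) == vertices.shape[0] for col in input_dict.values())
print(input_dict["tomo_id"])
```

Note that converting the string `'040'` with `float` drops the leading zero, so the stored tomo_id is the number 40.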

### Reset normal of points

The point cloud generated by `boundary_sampling` can also be used to replace the normals in a motl_list. For each motl point, the normal of the nearest sample point is used to update its angles.

In [None]:
# parameters for generating the oversampled points at the right binning
b_factor = 4
pal_thickness = 20
samp_dist = 10

mask = cryomap.read("../../../../tests/test_data/point_clouds/masks/040_generated_mask_2.mrc")
# rescale the mask based on b_factor (each dimension is multiplied by b_factor)
size = tuple(b_factor*i for i in mask.shape)
bin_mask = resize(mask, size, order=0, mode='constant')
# generate oversampled points from the rescaled mask
spmask40 = surfsamp.SamplePoints.load(bin_mask)
spmask40.boundary_sampling(sampling_distance = samp_dist)
spmask40out,_ = spmask40.inner_and_outer_pc(thickness = pal_thickness*b_factor)
# load the motl that you want to modify
motl = cryomotl.Motl.load('../../../../tests/test_data/point_clouds/motl_040_STAexample.em')
spmask40out.reset_normals(motl)
motl.write_out('../../../../tests/test_data/point_clouds/motl_040_STArenormal.em')
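
The nearest-point lookup described above can be sketched with a k-d tree. The example below uses `scipy.spatial.cKDTree` on small made-up arrays; it only shows how a normal is picked for each particle and leaves out the conversion from normals to Euler angles. It is an illustration, not the code behind `reset_normals`.

```python
import numpy as np
from scipy.spatial import cKDTree

# Hypothetical sample points with their normals (e.g. from a sampled surface).
sample_coords = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0], [0.0, 10.0, 0.0]])
sample_normals = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])

# Hypothetical particle positions whose orientations should be replaced.
particle_coords = np.array([[1.0, 1.0, 0.0], [9.0, 0.5, 0.0]])

# Find, for every particle, the index of the closest sample point.
tree = cKDTree(sample_coords)
distances, nearest_idx = tree.query(particle_coords)

# Each particle inherits the normal of its nearest sample point.
new_normals = sample_normals[nearest_idx]
print(nearest_idx)
print(new_normals)
```

A lookup like this is only meaningful when both point sets share the same coordinate scale, which is why the mask is rescaled by `b_factor` before sampling.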

### Calculate surface area

In [None]:
# You may notice that spmask has twice the area of spcsv.
# This is because the mask input represents a shell, whereas spcsv only considers the outer surface.
print(spmask.area)
print(spcsv.area)
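
The factor of about two can be checked with simple arithmetic on an idealized shape. The sketch below assumes a spherical shell with made-up radius and thickness values; real masks are not spheres, so the ratio will only be approximately two.

```python
import math

# Hypothetical spherical shell: outer radius and wall thickness in pixels.
outer_radius = 100.0
thickness = 5.0
inner_radius = outer_radius - thickness

outer_area = 4.0 * math.pi * outer_radius**2
inner_area = 4.0 * math.pi * inner_radius**2

# A shell exposes both surfaces, a solid shape only the outer one.
shell_area = outer_area + inner_area
ratio = shell_area / outer_area
print(ratio)
```

The thinner the shell relative to its radius, the closer the ratio gets to two.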

### Oversampling the mask surface and generating a motl_list for multiple tomograms

For multiple tomograms, here is an example script to process them all together.\
_User inputs_
- _mask_folder: Path to the folder containing input masks._
- _output_folder: Path to the folder where the output motl_list files will be saved._
- _mask_list: Path to a CSV file with information about the masks, one row per object (there is one mask for each object). The script below reads the tomogram number and the object number from each row._
- _pal_thickness: Thickness of the shell in the mask, used to separate the inner and outer layers of the point clouds._
- _sampling_dist: The distance between sampling points in pixels._

In [None]:
# Sample the surface of each mask to create point clouds. Keep only the outer shell of each point cloud and save it into a motl_list
mask_folder = '../../../../tests/test_data/point_clouds/masks'
mask_list = '../../../../tests/test_data/point_clouds/mask_list.csv'
output_folder = '../../../../tests/test_data/point_clouds/motls'
pal_thickness = 20
sampling_dist = 5
shift_dist = -6

# read in mask_list
mask_array = pd.read_csv(mask_list, header=None).to_numpy().astype(str)
# zero-pad the tomo number to 3 digits
mask_array[:, 0] = np.char.zfill(mask_array[:, 0], 3)
for i in mask_array:
    tomo_id, obj_id = i
    # file name of input and output
    mask_file = f'{tomo_id}_generated_mask_{obj_id}.mrc'
    motl_file = f'{tomo_id}_{obj_id}_pointcloud.em'
    mask = cryomap.read(f'{mask_folder}/{mask_file}')
    # sample the surface of the mask and keep only the outer sample points of the shell
    sp = surfsamp.SamplePoints.load(mask)
    sp.boundary_sampling(sampling_distance = sampling_dist)
    outer_sp,_ = sp.inner_and_outer_pc(thickness = pal_thickness)
    # shift the coordinates by shift_dist pixels along the normal vectors; a negative value shifts against the normal direction
    outer_sp.shift_points(shift_dist)
    # create input_dict to fill in 'tomo_id' and 'object_id'
    input_dict = {'tomo_id':np.ones(outer_sp.vertices.shape[0])*float(tomo_id), 'object_id':np.ones(outer_sp.vertices.shape[0])*float(obj_id)}
    outer_sp.write(f'{output_folder}/{motl_file}', input_dict)
    print(f'{motl_file} was written')
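
The file-name logic of the loop can be tried out without any mask files. The sketch below is an alternative way to build the same names with pandas string methods; the inline CSV content is made up for illustration and the column names are assumptions.

```python
import io
import pandas as pd

# Hypothetical mask list with one row per object: tomogram number, object number.
csv_text = "40,1\n40,2\n113,1\n"

# Read everything as strings so that IDs are not converted to numbers.
mask_table = pd.read_csv(io.StringIO(csv_text), header=None, dtype=str)
mask_table.columns = ["tomo_id", "obj_id"]

# Zero-pad the tomogram number to 3 digits, as the mask file names expect.
mask_table["tomo_id"] = mask_table["tomo_id"].str.zfill(3)

mask_files = [
    f"{row.tomo_id}_generated_mask_{row.obj_id}.mrc"
    for row in mask_table.itertuples(index=False)
]
print(mask_files)
```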

If you would like to shift points by different distances depending on their normals, use the method `shift_points_in_groups` with a shift_dict input. The shift_dict is a dictionary whose keys are tuples representing normal vectors and whose values are the shift magnitudes in pixels. Here is an example that shifts points with normals pointing out of/into the tomogram by 10 pixels and all others by -6 pixels.

In [None]:
shift_dist = -6
shift_dist_tb = 10
# create a shift_dict with keys from all normal vectors
shift_dict = {key: shift_dist for key in tuple(map(tuple,outer_sp.normals))}
# assign a different value to the target normal vectors
shift_dict[(1,0,0)] = shift_dist_tb
shift_dict[(-1,0,0)] = shift_dist_tb
# run shift_points_in_groups
outer_sp.shift_points_in_groups(shift_dict)
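
To show what a per-normal shift amounts to numerically, here is a small NumPy sketch. The helper `shift_by_normal_group` is hypothetical and only mimics the idea of moving each point along its normal by the distance stored for that normal; it is not the cryoCAT method.

```python
import numpy as np

def shift_by_normal_group(coords, normals, shift_dict):
    """Hypothetical helper: move each point along its normal by the distance
    stored in shift_dict under that normal (as a tuple)."""
    distances = np.array([shift_dict[tuple(n)] for n in normals], dtype=float)
    return coords + normals * distances[:, None]

# Three made-up points at the same position with different normals.
coords = np.array([[5.0, 5.0, 5.0], [5.0, 5.0, 5.0], [5.0, 5.0, 5.0]])
normals = np.array([[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])

# Default shift for every normal, then override the two target normals.
shift_dict = {tuple(n): -6.0 for n in normals}
shift_dict[(1.0, 0.0, 0.0)] = 10.0
shift_dict[(-1.0, 0.0, 0.0)] = 10.0

shifted = shift_by_normal_group(coords, normals, shift_dict)
print(shifted)
```

Because dictionary keys are compared exactly, an override such as `(1, 0, 0)` only applies to normals whose components equal those values exactly.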

Finally, merge and renumber the motls from all objects into one.

In [None]:
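# merge only the per-object files written above (skips e.g. a previously merged file in the same folder)
motl_name = sorted(i for i in os.listdir(output_folder) if i.endswith('_pointcloud.em'))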
motl_merge = cryomotl.Motl.merge_and_renumber([f'{output_folder}/{i}' for i in motl_name])
motl_merge.write_out(f'{output_folder}/allmotl_pcShift-6.em')