## Motl child classes

cryoCAT provides a framework for reading, manipulating, and writing motive lists (motls) in several file formats. It defines a base `Motl` class and several child classes, each corresponding to the motl format of a specific software package.

* **EmMotl**: handles motl files in the EM format (used, for example, by novaSTA and TOM/AV3). EM is a compact binary format that is widely used for cryo-ET data and supported by several software suites.
* **StopgapMotl**: handles motl files generated by STOPGAP, a software package for cryo-ET data processing and subtomogram averaging. The format is a plain-text STAR file with columns for coordinates, angles, and other per-particle information, and it is often used for intermediate processing steps.
* **ModMotl**: handles the model files of IMOD, a widely used software package for 3D reconstruction and visualization in electron microscopy. A `.mod` file is a binary file that stores model objects, contours, and points, and it is tightly integrated with the IMOD suite.
* **RelionMotl**: handles motl files for RELION, a program for cryo-EM single-particle analysis and subtomogram averaging. RELION stores particle lists in STAR files, a self-describing text-based format.
* **DynamoMotl**: handles motl files used by Dynamo, a platform for cryo-ET data processing and subtomogram averaging. The Dynamo motl format is typically a `.tbl` file, a plain-text table with 35 columns that record particle coordinates, orientations, scores, and related metadata.
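
The correspondence between file types and classes can be summarized in a small lookup. The helper below is a hypothetical illustration and not part of cryoCAT: `guess_motl_class` and the `CANDIDATES` table are assumptions that only restate the formats listed above. Because STOPGAP and RELION both use STAR files, the extension alone cannot decide between them.

```python
from pathlib import Path

# Hypothetical lookup: file extension -> candidate cryoCAT class names.
# The table only restates the formats described in this tutorial; it is not cryoCAT API.
CANDIDATES = {
    ".em": ["EmMotl"],
    ".star": ["StopgapMotl", "RelionMotl", "RelionMotlv5"],  # ambiguous: all use STAR files
    ".mod": ["ModMotl"],
    ".tbl": ["DynamoMotl"],
}


def guess_motl_class(path):
    """Return the class names that could read `path`, judged by extension only."""
    return CANDIDATES.get(Path(path).suffix.lower(), [])


print(guess_motl_class("/path/to/emmotl.em"))       # ['EmMotl']
print(guess_motl_class("/path/to/particles.star"))  # ['StopgapMotl', 'RelionMotl', 'RelionMotlv5']
```

For `.star` files the choice has to be made by the user, who knows which program produced the file.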

### Example usage of subclasses
NOTE: All examples below assume that the motl classes have been imported from the `cryomotl` module.

In [None]:
from cryocat import cryomotl
from cryocat.cryomotl import EmMotl, StopgapMotl, ModMotl, DynamoMotl
import pandas as pd
import os
os.chdir("../../../")  # Set the project root as working directory so that the tutorial files can be found

#### EmMotl


##### Initialization

An `EmMotl` object can be constructed from an EM file, from another `EmMotl` object, or from a pandas DataFrame:

In [None]:
emmotl_from_file = EmMotl(input_motl = "/path/to/emmotl.em", header = "/path/to/emmotl.em")

In [None]:
emmotl_copy = EmMotl(emmotl_from_file)

In [None]:
motl_df = pd.DataFrame({
    "subtomo_id": [1, 2, 3],
    "tomo_id": [1, 1, 2],
    "object_id": [10, 11, 12],
    "x": [15.0, 25.5, 35.2],
    "y": [16.1, 26.2, 36.3],
    "z": [17.2, 27.3, 37.4],
    "shift_x": [0.0, -0.2, 0.5],
    "shift_y": [0.1, 0.0, -0.3],
    "shift_z": [0.0, 0.0, 0.0],
    "phi": [0.0, 45.0, 90.0],
    "psi": [0.0, 30.0, 60.0],
    "theta": [0.0, 15.0, 30.0],
    "class": [1, 1, 2],
    "score": [0.95, 0.85, 0.90],
})
emmotl_from_df = EmMotl(motl_df)

##### Writing out

`EmMotl` objects can be saved back into EM format:

In [None]:
emmotl_from_df.write_out(output_path = "/output/path/output.em")

#### StopgapMotl
The `StopgapMotl` class handles **motive lists stored in the STOPGAP STAR format**, used by the Stopgap subtomogram averaging pipeline.  
These files are plain text `.star` tables. Internally, `StopgapMotl` keeps two DataFrames:

- `.sg_df` → the original STAR-format DataFrame as read from file  
- `.df` → the standardized 20-column `Motl` DataFrame used across cryoCAT  
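
The relationship between the two tables is largely a renaming of columns. As a rough illustration, the sketch below pairs STOPGAP column names with the standardized names used in the DataFrame examples of this tutorial. The pairing is an assumption read off the column names (for example `orig_x` to `x`, `the` to `theta`), and `SG_TO_MOTL` and `rename_columns` are hypothetical names, not cryoCAT API. Plain dictionaries of columns stand in for DataFrames to keep the example free of dependencies.

```python
# Assumed pairing of STOPGAP column names with standardized motl column names.
SG_TO_MOTL = {
    "subtomo_num": "subtomo_id",
    "tomo_num": "tomo_id",
    "object": "object_id",
    "orig_x": "x",
    "orig_y": "y",
    "orig_z": "z",
    "x_shift": "shift_x",
    "y_shift": "shift_y",
    "z_shift": "shift_z",
    "the": "theta",
}


def rename_columns(table):
    """Rename known STOPGAP columns; columns without a mapping keep their name."""
    return {SG_TO_MOTL.get(name, name): values for name, values in table.items()}


sg_table = {"subtomo_num": [1, 2], "orig_x": [100, 200], "the": [0, 0], "score": [0.5, 0.6]}
motl_table = rename_columns(sg_table)
print(list(motl_table))  # ['subtomo_id', 'x', 'theta', 'score']
```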

##### Initialization

A `StopgapMotl` object can be constructed from a STOPGAP `.star` file, from another `StopgapMotl` object, or from a pandas DataFrame:

In [None]:
stopgap_from_file = StopgapMotl("/path/to/stopgap.star")

In [None]:
stopgap_copy = StopgapMotl(stopgap_from_file)

In [None]:
df = pd.DataFrame({
    "subtomo_num": [1, 2],
    "tomo_num": [1, 1],
    "object": [1, 1],
    "orig_x": [100, 200],
    "orig_y": [150, 250],
    "orig_z": [200, 300],
    "score": [0.5, 0.6],
    "x_shift": [0, 0],
    "y_shift": [0, 0],
    "z_shift": [0, 0],
    "phi": [0, 90],
    "psi": [0, 0],
    "the": [0, 0],
    "class": [1, 1],
    "halfset": ["A", "B"],
    "motl_idx": [1, 2]
})
sg_from_df = StopgapMotl(df)

##### Writing out

`StopgapMotl` objects can be saved in STOPGAP STAR format or in EM format. The `write_out` method of `StopgapMotl` takes two additional arguments, `reset_index` and `update_coord`:

In [None]:
sg_from_df.write_out("/output/path/particles.star")

In [None]:
sg_from_df.write_out("/output/path/particles.em")

In [None]:
# Ensures motl_idx is sequential 1...N
sg_from_df.write_out(output_path = "/output/path/particles.star", reset_index = True)

In [None]:
# Recompute coordinates before saving
sg_from_df.write_out(output_path = "/output/path/particles.star", update_coord = True)
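
Conceptually, `reset_index = True` renumbers `motl_idx` so that it runs from 1 to N in row order. The snippet below is a stand-alone sketch of that renumbering on a plain list. It does not call cryoCAT, and `renumber` is a hypothetical helper.

```python
def renumber(motl_idx):
    """Return sequential indices 1..N, one per existing entry, in row order."""
    return list(range(1, len(motl_idx) + 1))


# Indices with gaps, for example after some particles were removed
motl_idx = [1, 2, 5, 9]
print(renumber(motl_idx))  # [1, 2, 3, 4]
```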

#### ModMotl
The `ModMotl` class handles IMOD `.mod` files, which store particle positions and contours as part of 3D models used in cryo-ET. Unlike other formats, `.mod` files are tied to IMOD’s visualization and modeling environment, so this class is particularly useful when particles are picked or annotated in IMOD and later need to be converted into a standard motive list.

##### Initialization

A `ModMotl` object can be constructed from a `.mod` file, from another `ModMotl` object, or from a pandas DataFrame. It can also be constructed from multiple files in the same folder by specifying a file prefix and a suffix (which defaults to `.mod`).

In [None]:
mod_from_file = ModMotl("/path/to/modmotl.mod")

In [None]:
# Construct the object from all .mod files in the specified folder
mod_from_path = ModMotl("/path/to/mod/folder")

In [None]:
# Construct the object from all .mod files in the specified folder whose names start with the given prefix
mod_from_path_prefix = ModMotl(input_path = "/path/to/mod/folder", mod_prefix = "prefix_")

In [None]:
# Construct the object from all .mod files in the specified folder whose names end with the given suffix
mod_from_path_suffix = ModMotl(input_path = "/path/to/mod/folder", mod_suffix = ".mod")
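
One way to picture the folder-based constructors is as a filename filter on prefix and suffix. The following sketch uses only the standard library and a temporary folder. `collect_mod_files` is a hypothetical helper, and cryoCAT may select files differently.

```python
import tempfile
from pathlib import Path


def collect_mod_files(folder, prefix="", suffix=".mod"):
    """Return sorted file names in `folder` that start with `prefix` and end with `suffix`."""
    return sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.is_file() and p.name.startswith(prefix) and p.name.endswith(suffix)
    )


# Create a few empty files in a temporary folder to filter
with tempfile.TemporaryDirectory() as folder:
    for name in ["prefix_001.mod", "prefix_002.mod", "other_003.mod", "notes.txt"]:
        (Path(folder) / name).touch()
    all_mods = collect_mod_files(folder)
    prefixed = collect_mod_files(folder, prefix="prefix_")

print(all_mods)  # ['other_003.mod', 'prefix_001.mod', 'prefix_002.mod']
print(prefixed)  # ['prefix_001.mod', 'prefix_002.mod']
```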

In [None]:
mod_copy = ModMotl(mod_from_file)

In [None]:
mod_df = pd.DataFrame({
    'object_id': [1, 1, 2, 2],
    'x': [1, 2, 1, 2],
    'y': [1, 2, 1, 2],
    'z': [1, 2, 1, 2],
    'mod_id': [1, 1, 2, 2],
    'contour_id': [1, 1, 2, 2],
    'object_radius': [0.5, 0.5, 0.5, 0.5]
})
mod_from_df = ModMotl(mod_df)

##### Writing out

`ModMotl` objects can be saved back into MOD format:

In [None]:
mod_from_df.write_out("/output/path/output.mod")

#### DynamoMotl
The `DynamoMotl` class handles **motl files in Dynamo format**, which are typically used in Dynamo subtomogram averaging pipelines.  
Dynamo files are plain-text `.tbl` files with fixed-column indexing. Internally, `DynamoMotl` keeps:

- `.dynamo_df` → the original Dynamo-format DataFrame  
- `.df` → the standardized 20-column `Motl` DataFrame used across cryoCAT 


##### Initialization

A `DynamoMotl` object can be constructed from a Dynamo `.tbl` file, from another `DynamoMotl` object, or from a pandas DataFrame:

In [None]:
dynamo_from_file = DynamoMotl("/path/to/dynamo.tbl")

In [None]:
dynamo_copy = DynamoMotl(dynamo_from_file)

In [None]:
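# dynamo_df is assumed to be a pandas DataFrame prepared beforehand (its definition is not shown here)
dynamo_from_df = DynamoMotl(dynamo_df)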

##### Writing out

`DynamoMotl` objects can be saved back into Dynamo `.tbl` format:

In [None]:
dynamo_from_df.write_out("/output/path/dynamo.tbl")

#### RelionMotl
The `RelionMotl` class handles RELION STAR files containing particle lists for subtomogram averaging and single-particle analysis. RELION’s motl format is text-based and self-describing, and it stores particle coordinates, orientations, class assignments, and metadata. RELION versions up to 4.0 are supported.

##### Initialization
`RelionMotl` objects can be constructed from a RELION STAR file, a pandas DataFrame, or another `RelionMotl` object. The class supports RELION versions 3.0, 3.1, and 4.0; for RELION 5.x see the `RelionMotlv5` class.
The `RelionMotl` constructor takes the following arguments:

`version` : the RELION version of the `RelionMotl` object (3.0, 3.1, or 4.0, as described above)

`pixel_size` : the physical size of one voxel in Ångström, used to convert particle coordinates from voxels to real-space units and to set the optics header in the STAR file

`binning` : the downsampling factor applied to the data, which scales the voxel coordinates and the pixel size when the STAR file is written out. Must be specified for RELION 4.x or above

`optics_data` : the optics information (CTF, pixel size, defocus, etc.), given as a DataFrame, a dictionary, or a path to a STAR file. It is used when writing the STAR file or updating particle metadata. Not supported for RELION 3.0 objects.
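
The arithmetic behind `pixel_size` and `binning` can be sketched as follows. The sketch assumes the usual convention that a coordinate in Ångström is the voxel coordinate times the voxel size, and that binning by a factor b multiplies the voxel size by b. Whether cryoCAT applies exactly these steps, and in which order, is not confirmed here, and both function names are hypothetical.

```python
def voxels_to_angstrom(coord_vox, pixel_size):
    """Convert a voxel coordinate to Ångström, given the voxel size in Å."""
    return coord_vox * pixel_size


def binned_pixel_size(pixel_size, binning):
    """Voxel size after downsampling by `binning` (assumed: the size scales linearly)."""
    return pixel_size * binning


pixel_size = 2.23  # Å per voxel, as in the RELION 3.0 example
binning = 2        # downsampling factor, as in the RELION 4.0 example

print(binned_pixel_size(pixel_size, binning))  # 4.46
print(voxels_to_angstrom(100, binned_pixel_size(pixel_size, binning)))
```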

In [None]:
from cryocat.cryomotl import RelionMotl

In [None]:
relion_from_file = RelionMotl(input_motl = os.path.abspath("tests/test_data/motl_data/relion_3.0.star"), version = 3.0, pixel_size = 2.23)

In [None]:
relion_from_object = RelionMotl(input_motl = relion_from_file)

In [None]:
relion_df = pd.DataFrame({
    "rlnMicrographName": ["micrograph_01.mrc", "micrograph_02.mrc"],
    "rlnCoordinateX": [10.5, 22.1],
    "rlnCoordinateY": [15.2, 18.3],
    "rlnCoordinateZ": [8.0, 10.0],
    "rlnAngleRot": [0.0, 45.0],
    "rlnAngleTilt": [5.0, 10.0],
    "rlnAnglePsi": [180.0, 90.0],
    "rlnCtfMaxResolution": [3.5, 4.0],
    "rlnImageName": ["subtomo_0001_0001.mrc", "subtomo_0002_0001.mrc"],
    "rlnCtfImage": ["ctf_01.mrc", "ctf_02.mrc"],
    "rlnPixelSize": [1.2, 1.2],
    "rlnOpticsGroup": [1, 1],
    "rlnGroupNumber": [1, 1],
    "rlnOriginXAngst": [0.0, 0.0],
    "rlnOriginYAngst": [0.0, 0.0],
    "rlnOriginZAngst": [0.0, 0.0],
    "rlnClassNumber": [1, 2],
    "rlnNormCorrection": [1.0, 0.95],
    "rlnRandomSubset": [1, 2],
    "rlnLogLikeliContribution": [-120.5, -110.2],
    "rlnMaxValueProbDistribution": [0.98, 0.92],
    "rlnNrOfSignificantSamples": [50, 60]
})

relion_from_df = RelionMotl(input_motl = relion_df, version = 3.1, optics_data = "tests/test_data/motl_data/relion_3.1_optics2.star")

In [None]:
relion_v4 = RelionMotl(input_motl = "tests/test_data/motl_data/relion_4.0.star", binning = 2)

##### Writing
`RelionMotl` objects can be written out to a STAR file in RELION format. In addition to the arguments described above, `write_out` accepts several arguments that customize the output:

`tomo_format` and `subtomo_format` : templates for the tomogram and subtomogram names written to the STAR file

`use_original_entries` : if True, the original RELION STAR file entries are used, and coordinates, rotations, and classes are updated in them

`keep_all_entries` : only used if `use_original_entries` is True; keeps all original entries unchanged

When these arguments are set to False, the output DataFrame is rebuilt from `self.df`.

`add_object_id` and `add_subunit_id` : if True, "object_id" and "subunit_id" from `self.df` are added to the output DataFrame as "ccObjectName" and "ccSubunitName"

`subtomo_size` : the edge length of the extracted subtomograms in voxels

In [None]:
#Example with basic arguments
relion_from_file.write_out(output_path = "tests/test_data/motl_data/tutorial_data/relion_test.star", write_optics = False, version = 3.0, pixel_size = 2.23)

In [None]:
#Example with tomo-subtomo_format arguments
rln_motl = RelionMotl()
rln_motl.fill({"tomo_id": [2], "subtomo_id":[33]})
rln_motl.write_out(output_path = "tests/test_data/motl_data/tutorial_data/relion_test2.star",tomo_format="/path/to/$xxxx.rec", subtomo_format="/path/to/$xxxx/$xxxx_$yy_1.2A.mrc", version=3.1)
#Let's reload the Motl
rln_check = RelionMotl("tests/test_data/motl_data/tutorial_data/relion_test2.star")
print(rln_check.relion_df["rlnMicrographName"].iloc[0])  # expected: "/path/to/0002.rec"
print(rln_check.relion_df["rlnImageName"].iloc[0])       # expected: "/path/to/0002/0002_33_1.2A.mrc"
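
Judging from the expected output above, the placeholders in `tomo_format` and `subtomo_format` behave like zero-padded fields: `$xxxx` is replaced by the tomogram ID padded to four digits and `$yy` by the subtomogram ID padded to two. The sketch below reproduces that substitution with a regular expression. `fill_template` is a hypothetical helper written for this tutorial, not the function cryoCAT uses internally.

```python
import re


def fill_template(template, tomo_id, subtomo_id):
    """Replace $x... with the tomogram ID and $y... with the subtomogram ID.

    The number of letters after '$' sets the minimum zero-padded width.
    """

    def padder(value):
        def substitute(match):
            width = len(match.group(0)) - 1  # letters after the '$' sign
            return str(value).zfill(width)

        return substitute

    filled = re.sub(r"\$x+", padder(tomo_id), template)
    return re.sub(r"\$y+", padder(subtomo_id), filled)


print(fill_template("/path/to/$xxxx.rec", 2, 33))                 # /path/to/0002.rec
print(fill_template("/path/to/$xxxx/$xxxx_$yy_1.2A.mrc", 2, 33))  # /path/to/0002/0002_33_1.2A.mrc
```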

In [None]:
# Example with use_original_entries and keep_all_entries:
# Here the STAR file is built from scratch from the standardized motl DataFrame. Extra columns such as rlnCtfImage or rlnNormCorrection are not preserved.
relion_from_df.write_out(output_path="tests/test_data/motl_data/tutorial_data/false_orig.star", write_optics=False, use_original_entries=False)  # extra metadata disappears
rln1_check = RelionMotl("tests/test_data/motl_data/tutorial_data/false_orig.star")
print(rln1_check.relion_df.columns)

# Here the STAR file is built from the existing relion_df, and coordinates/angles/classes are updated from the current motl DataFrame.
# Row 1 of relion_df is edited first, to show further below that coordinates stay unchanged when keep_all_entries is used.
relion_from_df.relion_df.loc[1, ["rlnCoordinateX", "rlnCoordinateY", "rlnCoordinateZ"]] = [30.0, 35.0, 40.0]
relion_from_df.write_out(output_path="tests/test_data/motl_data/tutorial_data/true_orig.star", write_optics=False, use_original_entries=True, keep_all_entries=False)  # extra metadata preserved, columns updated
rln2_check = RelionMotl("tests/test_data/motl_data/tutorial_data/true_orig.star")
print(rln2_check.relion_df.loc[1, ["rlnCoordinateX", "rlnCoordinateY", "rlnCoordinateZ"]])

# Here, even if the motl DataFrame has dropped some particles, all rows from the original STAR file are still written out unchanged.
relion_from_df.write_out(output_path="tests/test_data/motl_data/tutorial_data/true_all_orig.star", write_optics=False, use_original_entries=True, keep_all_entries=True)
rln3_check = RelionMotl("tests/test_data/motl_data/tutorial_data/true_all_orig.star")
print(rln3_check.relion_df.loc[1, ["rlnCoordinateX", "rlnCoordinateY", "rlnCoordinateZ"]])

In [None]:
#Example with add_object_id and add_subunit_id:
relion_from_df = RelionMotl(input_motl = relion_df, version = 3.1, optics_data = "tests/test_data/motl_data/relion_3.1_optics2.star")
relion_from_df.write_out(output_path="tests/test_data/motl_data/tutorial_data/add_obj_sub_id.star", write_optics=False, add_object_id=True, add_subunit_id=True)
rln_check3= RelionMotl("tests/test_data/motl_data/tutorial_data/add_obj_sub_id.star")
print(rln_check3.relion_df.columns) #2 new columns are added

#### RelionMotlv5
`RelionMotlv5` is a class for handling RELION 5 STAR files, which split the metadata across two files: one describing the tomograms and one describing the particles. Both files are needed at initialization to capture all relevant information. Both Warp2 and RELION 5 STAR files can be read, and output can be written in either format. RELION 5 introduces a new coordinate system in which coordinates are centered and expressed in Ångström.
`RelionMotlv5` inherits from both `Motl` and `RelionMotl`. Only the substantial differences from `RelionMotl` are shown below; everything else works as in `RelionMotl`.
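
To illustrate what "centered and in Ångström" means, the sketch below converts between a coordinate measured in pixels from the tomogram corner and a centered coordinate in Å. It assumes that the origin sits at half the tomogram size and that one pixel size applies to both the coordinate and the tomogram size. The exact conventions of RELION 5 and cryoCAT (which pixel size is used, whether a half-pixel offset applies) should be checked against their documentation. The function names are hypothetical, and the numbers are borrowed from the particle and tomogram examples of this tutorial.

```python
def corner_px_to_centered_angstrom(coord_px, tomo_size_px, pixel_size):
    """Shift the origin to the tomogram center (assumed at size / 2), then scale to Å."""
    return (coord_px - tomo_size_px / 2) * pixel_size


def centered_angstrom_to_corner_px(coord_angst, tomo_size_px, pixel_size):
    """Inverse operation: scale Å to pixels, then shift the origin back to the corner."""
    return coord_angst / pixel_size + tomo_size_px / 2


tomo_size_x = 4000    # pixels, like rlnTomoSizeX in the tomogram example
pixel_size = 1.35     # Å per pixel, like rlnTomoTiltSeriesPixelSize
x_centered = 1775.0   # Å, like rlnCenteredCoordinateXAngst

x_px = centered_angstrom_to_corner_px(x_centered, tomo_size_x, pixel_size)
x_back = corner_px_to_centered_angstrom(x_px, tomo_size_x, pixel_size)
print(x_px, x_back)
```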

##### Initialization
`RelionMotlv5` objects can be constructed from different combinations of objects and files for the two main arguments, tomograms and particles. Tomogram data can be loaded from a pandas DataFrame, a RELION 5 STAR file, or anything compatible with `ioutils.dimensions_load`. Particle data can likewise be loaded from a pandas DataFrame, a RELION 5 STAR file, or a `RelionMotlv5` object.

`RelionMotlv5` objects do not require the `version` argument, as it defaults to 5.0.

`tomo_idx` : path to a file containing tomogram indices, or a 1D array with the indices. Necessary if the `input_tomograms` passed to `ioutils.dimensions_load` does not contain tomogram index information.

In [None]:
from cryocat.cryomotl import RelionMotlv5

In [None]:
#Basic relion5 initialization with files
relion5_from_file = RelionMotlv5(input_particles="tests/test_data/motl_data/relion5/clean/R5_tutorial_run_data.star", 
                                 input_tomograms="tests/test_data/motl_data/relion5/clean/R5_tutorial_run_tomograms.star")

#Basic warp2 initialization with files
warp2_from_file = RelionMotlv5(input_particles="tests/test_data/motl_data/relion5/clean/warp2_particles_matching.star",
                               input_tomograms="tests/test_data/motl_data/relion5/clean/warp2_matching_tomograms.star")

In [None]:
particles_df = pd.DataFrame({
    "rlnTomoName": ["TS_01", "TS_01"],
    "rlnTomoSubtomogramRot": [167.7, 175.4],
    "rlnTomoSubtomogramTilt": [98.1, 75.3],
    "rlnTomoSubtomogramPsi": [-164.6, -132.3],
    "rlnAngleRot": [-0.25, -22.2],
    "rlnAngleTilt": [97.3, 87.0],
    "rlnAnglePsi": [9.0, -1.1],
    "rlnAngleTiltPrior": [90.0, 90.0],
    "rlnAnglePsiPrior": [0.0, 0.0],
    "rlnOpticsGroup": [1, 1],
    "rlnTomoParticleName": ["TS_01/12", "TS_01/13"],
    "rlnTomoVisibleFrames": [
        [0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0],
        [0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0]
    ],
    "rlnImageName": [
        "Extract/TS_01/12_stack2d.mrcs",
        "Extract/TS_01/13_stack2d.mrcs"
    ],
    "rlnOriginXAngst": [0.03, 1.04],
    "rlnOriginYAngst": [1.04, 0.03],
    "rlnOriginZAngst": [0.37, -0.31],
    "rlnCenteredCoordinateXAngst": [1775.0, 1667.4],
    "rlnCenteredCoordinateYAngst": [-951.5, -798.3],
    "rlnCenteredCoordinateZAngst": [1127.4, 1116.3],
    "rlnGroupNumber": [1, 1],
    "rlnClassNumber": [1, 1],
    "rlnNormCorrection": [1.0, 1.0],
    "rlnRandomSubset": [2, 1],
    "rlnLogLikeliContribution": [1.746e6, 1.745e6],
    "rlnMaxValueProbDistribution": [0.0187, 0.0233],
    "rlnNrOfSignificantSamples": [114, 120]
})
relion5_from_file_and_df = RelionMotlv5(input_particles=particles_df, 
                                 input_tomograms="tests/test_data/motl_data/relion5/clean/R5_tutorial_run_tomograms.star")

In [None]:
tomograms_df = pd.DataFrame({
    "rlnTomoName": ["TS_01", "TS_03", "TS_43", "TS_45", "TS_54"],
    "rlnVoltage": [300.0]*5,
    "rlnSphericalAberration": [2.7]*5,
    "rlnAmplitudeContrast": [0.1]*5,
    "rlnMicrographOriginalPixelSize": [0.675]*5,
    "rlnTomoHand": [-1.0]*5,
    "rlnOpticsGroupName": ["optics1"]*5,
    "rlnTomoTiltSeriesPixelSize": [1.35]*5,
    "rlnTomoTiltSeriesStarFile": [
        "Tomograms/job006/tilt_series/TS_01.star",
        "Tomograms/job006/tilt_series/TS_03.star",
        "Tomograms/job006/tilt_series/TS_43.star",
        "Tomograms/job006/tilt_series/TS_45.star",
        "Tomograms/job006/tilt_series/TS_54.star"
    ],
    "rlnEtomoDirectiveFile": [
        "AlignTiltSeries/job005/external/TS_01/TS_01.edf",
        "AlignTiltSeries/job005/external/TS_03/TS_03.edf",
        "AlignTiltSeries/job005/external/TS_43/TS_43.edf",
        "AlignTiltSeries/job005/external/TS_45/TS_45.edf",
        "AlignTiltSeries/job005/external/TS_54/TS_54.edf"
    ],
    "rlnTomoTomogramBinning": [7.407407]*5,
    "rlnTomoSizeX": [4000]*5,
    "rlnTomoSizeY": [4000]*5,
    "rlnTomoSizeZ": [2000]*5,
    "rlnTomoReconstructedTomogramHalf1": [
        "Tomograms/job006/tomograms/rec_TS_01_half1.mrc",
        "Tomograms/job006/tomograms/rec_TS_03_half1.mrc",
        "Tomograms/job006/tomograms/rec_TS_43_half1.mrc",
        "Tomograms/job006/tomograms/rec_TS_45_half1.mrc",
        "Tomograms/job006/tomograms/rec_TS_54_half1.mrc"
    ],
    "rlnTomoReconstructedTomogramHalf2": [
        "Tomograms/job006/tomograms/rec_TS_01_half2.mrc",
        "Tomograms/job006/tomograms/rec_TS_03_half2.mrc",
        "Tomograms/job006/tomograms/rec_TS_43_half2.mrc",
        "Tomograms/job006/tomograms/rec_TS_45_half2.mrc",
        "Tomograms/job006/tomograms/rec_TS_54_half2.mrc"
    ]
})

# Separate name, so that the object created from the particles DataFrame above is not overwritten
relion5_from_df_and_file = RelionMotlv5(input_particles="tests/test_data/motl_data/relion5/clean/R5_tutorial_run_data.star", 
                                 input_tomograms=tomograms_df)

In [None]:
# Copy of a RelionMotlv5 object: the only case in which the input_tomograms argument can be omitted
relion5_copy = RelionMotlv5(input_particles = relion5_from_file)

##### Writing
`RelionMotlv5` objects can be written out to a STAR file in RELION 5 or Warp2 format. Since `RelionMotlv5` inherits from `RelionMotl`, the only major addition for writing is the `convert` argument:

`convert` : if True, the output is written in the opposite format to the one the object was created from. For example, a motl created from RELION 5 files is written in Warp2 format, and vice versa.

###### Conversion between Relion5 and Warp2

In [None]:
warp2_from_file.write_out(output_path="tests/test_data/motl_data/tutorial_data/warp2relion.star", convert=True, write_optics=False)

In [None]:
relion5_from_file.write_out(output_path="tests/test_data/motl_data/tutorial_data/relion2warp.star", convert=True, write_optics=False)

## Conversions between Motl classes --- <u>to be updated</u>
Conversion functions between the main Motl classes are available. They map cryoCAT's standard internal pandas DataFrame to each format while preserving coordinates, orientations, and metadata, so motive lists can be shared between programs without re-extraction or manual adjustment.


##### emmotl2relion


##### stopgap2emmotl


##### emmotl2stopgap


##### relion2stopgap


##### stopgap2relion
