## Motl basics

* Motl stands for "motive list". It contains a list of particles and their properties and can be used in subtomogram averaging or for contextual analysis.
* Internally, the particle list is stored within a Motl class as a pandas DataFrame with 20 columns (see below) and N rows, where N is the number of particles.
* Externally, the particle list can be loaded from and written to a binary file in EM format (novaSTA, TOM/AV3, ArtiaX compatible), a RELION starfile (currently up to version 4.x), a STOPGAP starfile, or a simple CSV file.

### Motl class

* The Motl class is a parent class containing functions that are general to all formats.
* The particle list itself is stored in the member variable `df` as a pandas DataFrame with the following columns:

    1.  "score" - a quality metric (typically cross-correlation value between the particle and the reference)
    2.  "geom1" - a free geometric property
    3.  "geom2" - a free geometric property
    4.  "subtomo_id" - a subtomogram id; **IMPORTANT:** many functions rely on this id being unique
    5.  "tomo_id" - the id of the tomogram the particle belongs to
    6.  "object_id" - the id of the object the particle belongs to
    7.  "subtomo_mean" - a mean value of the subtomogram
    8.  "x" - a position in the tomogram (an integer value), typically used for subtomogram extraction
    9.  "y" - a position in the tomogram (an integer value), typically used for subtomogram extraction
    10. "z" - a position in the tomogram (an integer value), typically used for subtomogram extraction
    11. "shift_x" - shift of the particle in X direction (a float value); the complete position of a particle is given by x + shift_x
    12. "shift_y" - shift of the particle in Y direction (a float value); the complete position of a particle is given by y + shift_y
    13. "shift_z" - shift of the particle in Z direction (a float value); the complete position of a particle is given by z + shift_z
    14. "geom3" - a free geometric property
    15. "geom4" - a free geometric property
    16. "geom5" - a free geometric property
    17. "phi" - a phi angle describing rotation around the first Z axis (following Euler zxz convention)
    18. "psi" - a psi angle describing rotation around the second Z axis (following Euler zxz convention)
    19. "theta" - a theta angle describing rotation around the X axis (following Euler zxz convention)
    20. "class" - a class of the particle
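
The column layout above can be sketched with plain pandas. This is only an illustration that builds a stand-in table by hand: the particle values are made up, and `MOTL_COLUMNS` is a name chosen for this example rather than something taken from cryocat. A real Motl object provides this table through its `df` attribute.

```python
import pandas as pd

# Column names in the order listed above (hypothetical constant for this example).
MOTL_COLUMNS = [
    "score", "geom1", "geom2", "subtomo_id", "tomo_id", "object_id",
    "subtomo_mean", "x", "y", "z", "shift_x", "shift_y", "shift_z",
    "geom3", "geom4", "geom5", "phi", "psi", "theta", "class",
]

# Two made-up particles; only a few properties are specified.
particles = [
    {"subtomo_id": 1, "tomo_id": 1, "x": 100, "y": 200, "z": 50,
     "shift_x": 0.5, "shift_y": -1.25, "shift_z": 0.0},
    {"subtomo_id": 2, "tomo_id": 1, "x": 120, "y": 210, "z": 55,
     "shift_x": -0.75, "shift_y": 0.25, "shift_z": 2.0},
]

# Arrange the columns in the order above and fill unspecified ones with 0.0.
df = pd.DataFrame(particles).reindex(columns=MOTL_COLUMNS, fill_value=0.0)

# Complete particle position = integer position + float shift.
complete_position = (
    df[["x", "y", "z"]].to_numpy()
    + df[["shift_x", "shift_y", "shift_z"]].to_numpy()
)

# Many functions rely on subtomo_id being unique, so it is worth checking.
ids_are_unique = df["subtomo_id"].is_unique
```

The same two expressions (summing position and shift columns, checking `is_unique`) should apply unchanged to the `df` of a loaded particle list, assuming it follows the layout described above.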

### Work with particle lists: Basic examples
In the following section, some examples of how to use Motl files in your analysis pipelines are provided. For a complete list of functions, please refer to the `cryomotl` module in the API guide. <br>
NOTE: For all the functions displayed, it is assumed that the `cryomotl` module is imported:

In [None]:
import cryocat
from cryocat import cryomotl

#### Loading a particle list as a Motl object
The first step in working with a particle list is to load it and store it as a Motl object. This can be accomplished with the `load()` function, which is used to initialize the Motl class. <br>
Example:

In [None]:
my_motl = cryomotl.Motl.load('path/to/motl_file')

Then, the properties of the particles can be displayed by inspecting the `df` attribute (dataframe).

In [None]:
my_motl.df

This will print out the content of the pandas DataFrame with all the columns described in the previous section.

#### Extract a subset of particles
To work on a subset of particles based on the value of a particular feature, you can extract them with the `get_motl_subset()` function. By default, the feature taken into account is the tomogram ID (`tomo_id`) and the function returns a new Motl object; however, you can ask for a pandas DataFrame instead. Here are some examples:

In [None]:
subset_motl_tomo = my_motl.get_motl_subset(2) #select particles belonging to tomogram 2 and return a new motl object
subset_motl_class = my_motl.get_motl_subset(1, feature_id="class") #select particles belonging to class 1 and return a new motl object
subset_df_tomo = my_motl.get_motl_subset(2, return_df=True) #select particles belonging to tomogram 2 and return a pandas dataframe
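
Conceptually, such a selection corresponds to filtering the rows of the underlying DataFrame by the value of one column. The sketch below shows that idea with plain pandas on a made-up table. `select_rows` is a hypothetical helper written for this illustration; it is not the cryocat implementation of `get_motl_subset()`, which may behave differently in its details.

```python
import pandas as pd

def select_rows(df, feature_values, feature_id="tomo_id"):
    """Return the rows whose `feature_id` column matches one of `feature_values`."""
    # Accept a single value as well as a collection of values.
    if not isinstance(feature_values, (list, tuple, set)):
        feature_values = [feature_values]
    return df[df[feature_id].isin(feature_values)].reset_index(drop=True)

# Made-up table with only the columns needed for the example.
toy_df = pd.DataFrame({
    "subtomo_id": [1, 2, 3, 4],
    "tomo_id": [1, 2, 2, 3],
    "class": [1, 1, 2, 1],
})

tomo_2 = select_rows(toy_df, 2)                        # particles from tomogram 2
class_1 = select_rows(toy_df, 1, feature_id="class")   # particles of class 1
```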

#### Clean a particle list
Different functions are available to clean a particle list, depending on the analysis pipeline. <br/>

**Clean particles by distance:** Remove duplicate particles, which is commonly required when the particles originate from oversampling. The function that accomplishes this is `clean_by_distance()`.

In [None]:
my_motl.clean_by_distance(10) #remove particles that are closer than 10 voxels to each other, keeping the one with the highest score value
my_motl.clean_by_distance(10, feature_id="class") #remove particles that are closer than 10 voxels to each other, grouping the particles by class and keeping the one with the highest score value
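
The idea behind distance-based cleaning can be illustrated with a small greedy procedure: visit the particles from the highest to the lowest score and keep a particle only if no already kept particle lies closer than the given distance. This is a simplified sketch on made-up coordinates, not the cryocat implementation; `greedy_clean` and its arguments are hypothetical names.

```python
import math

def greedy_clean(particles, min_distance):
    """Keep the best-scoring particle among those closer than `min_distance`.

    `particles` is a list of (score, (x, y, z)) tuples.
    """
    kept = []
    # Visit particles from the highest to the lowest score.
    for score, position in sorted(particles, key=lambda p: p[0], reverse=True):
        # Keep the particle only if it is far enough from every kept particle.
        if all(math.dist(position, other) >= min_distance for _, other in kept):
            kept.append((score, position))
    return kept

particles = [
    (0.30, (10.0, 10.0, 10.0)),
    (0.45, (12.0, 11.0, 10.0)),  # about 2.2 voxels away from the first particle
    (0.20, (50.0, 50.0, 50.0)),  # far away from both
]
cleaned = greedy_clean(particles, min_distance=10)
```

In this toy input the first two particles are near-duplicates, so only the one with the higher score survives, while the distant third particle is kept regardless of its lower score.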

**Clean particles by CC value:** Remove particles with low cross-correlation (`score`) values. This is accomplished with the `clean_by_otsu()` function.
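
Judging by its name, the function relies on Otsu's thresholding, which picks the threshold that best separates a set of values into a low and a high group by maximizing the between-class variance of a histogram. The following is a minimal, self-contained sketch of that method applied to made-up scores. `otsu_threshold` and the `bins` parameter are assumptions made for this illustration, and the actual `clean_by_otsu()` may differ (for example in binning or in how particles are grouped).

```python
def otsu_threshold(scores, bins=10):
    """Return the score threshold that maximizes the between-class variance."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return lo  # all scores identical, nothing to separate
    width = (hi - lo) / bins

    # Histogram of the scores; the maximum value falls into the last bin.
    counts = [0] * bins
    for s in scores:
        counts[min(int((s - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]

    total = len(scores)
    total_sum = sum(c * m for c, m in zip(counts, centers))
    best_threshold, best_variance = lo, -1.0
    weight_low, sum_low = 0, 0.0
    # Try a split after each bin and keep the one with the largest variance.
    for i in range(bins - 1):
        weight_low += counts[i]
        sum_low += counts[i] * centers[i]
        weight_high = total - weight_low
        if weight_low == 0 or weight_high == 0:
            continue
        mean_low = sum_low / weight_low
        mean_high = (total_sum - sum_low) / weight_high
        variance = weight_low * weight_high * (mean_low - mean_high) ** 2
        if variance > best_variance:
            best_variance = variance
            best_threshold = lo + (i + 1) * width
    return best_threshold

# Made-up scores: four low-scoring and four high-scoring particles.
scores = [0.05, 0.06, 0.07, 0.08, 0.40, 0.42, 0.45, 0.50]
threshold = otsu_threshold(scores)
kept_scores = [s for s in scores if s >= threshold]
```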

#### Write a Motl to disk
If you have edited the `df` attribute of your Motl object, or you have generated a new Motl object, and you wish to save it to a particle list file, use the `write_out()` function:

In [None]:
my_motl.write_out("path/to/desired_output_file.em") #this will save my_motl to desired_output_file.em