<img src="https://raw.githubusercontent.com/ccameron/REPIC/main/imgs/repic_icon.png" alt="REPIC icon" width="200"/>

## Welcome to the official REPIC Jupyter Notebook for Google Colab!

REliable PIcking by Consensus (REPIC) is a consensus methodology for harnessing multiple cryogenic-electron microscopy (cryo-EM) particle picking algorithms. It identifies particles common to multiple picked particle sets (i.e., consensus particles) using graph theory and integer linear programming.

Development: https://github.com/ccameron/REPIC

Documentation: https://repic.readthedocs.io/en/latest/


Below are the steps required to install REPIC and run it in both the one-shot and iterative modes. Note &mdash; GPUs are recommended for the iterative mode (Runtime > Change runtime type > GPU).


If you encounter any issues with this notebook, please raise an issue in the REPIC GitHub repo: https://github.com/ccameron/REPIC/issues

Author: Christopher JF Cameron (2023)

### Step 1 &mdash; check availability of ```conda``` command

1. Check the availability of the ```conda``` command

In [None]:
! conda --version

If Conda is available, the output will resemble the following (the version may differ): ```conda 23.1.0```


If the output says ```conda: command not found```, install Conda via [condacolab](https://github.com/conda-incubator/condacolab) using pip: https://pypi.org/project/condacolab/

In [None]:
! pip install -q condacolab
import condacolab
condacolab.install()

After installation, Colab will raise a warning that the kernel has crashed (e.g., "Your session crashed for an unknown reason."). This crash is expected and is a result of the ```Restarting kernel...``` condacolab step.

2. Update the Conda base environment

In [None]:
! conda update -n base -c conda-forge conda -y

### Step 2 &mdash; install particle picking algorithms

1. Install Mamba package manager: https://github.com/mamba-org/mamba

In [None]:
! conda install -c conda-forge mamba -y

2. Get the CUDA version supported by the GPU driver using the NVIDIA System Management Interface (e.g., v12.0): https://developer.nvidia.com/nvidia-system-management-interface

In [None]:
! nvidia-smi

3. Get the version of the installed CUDA toolkit (e.g., v11.8)

In [None]:
! nvcc --version
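
When adjusting the install commands below, it can help to read the toolkit release programmatically. The following is a minimal sketch, assuming `nvcc --version` prints a line of the form `Cuda compilation tools, release 11.8, V11.8.89`. The helper name `parse_nvcc_release` is hypothetical and not part of REPIC.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(text):
    """Return the CUDA toolkit release (e.g., '11.8') found in `nvcc --version` output, or None."""
    # Assumption: the output contains "release <major>.<minor>"
    match = re.search(r"release\s+(\d+\.\d+)", text)
    return match.group(1) if match else None


# Only call nvcc when it is on PATH
if shutil.which("nvcc"):
    output = subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout
    print("CUDA toolkit release:", parse_nvcc_release(output))
else:
    print("nvcc not found on PATH")
```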

_(Optional)_ Update the following install instructions for particle picking algorithms (Steps 4-11) below to reflect the available CUDA versions

**SPHIRE-crYOLO install**

4. Install SPHIRE-crYOLO: https://cryolo.readthedocs.io/en/stable/

In [None]:
! mamba create -n cryolo -c conda-forge -c anaconda pyqt=5 python=3.7 cudatoolkit=10.0.130 cudnn=7.6.5 numpy=1.18.5 libtiff wxPython=4.1.1  adwaita-icon-theme -y
! source activate cryolo && pip install 'cryolo[gpu]'

5. Download pre-trained crYOLO LOWPASS model

In [None]:
! wget ftp://ftp.gwdg.de/pub/misc/sphire/crYOLO-GENERAL-MODELS/gmodel_phosnet_202005_N63_c17.h5

**Topaz install**

6. Install Topaz: https://topaz-em.readthedocs.io/en/latest/?badge=latest

In [None]:
! mamba create -n topaz -c tbepler -c nvidia -c pytorch pytorch torchvision torchaudio pytorch-cuda=11.8 topaz=0.2.5 -y

**REPIC install**

7. Install REPIC

In [None]:
! mamba install -c bioconda repic -y

8. Install Linux basic calculator (bc) package: https://linux.die.net/man/1/bc

In [None]:
! apt-get install -y bc

**DeepPicker install**

9. Clone DeepPicker GitHub repository

In [None]:
! git clone https://github.com/nejyeah/DeepPicker-python.git

10. Apply REPIC patch to DeepPicker Python scripts

In [17]:
! cp  $(pip show repic | grep -in "Location" | cut -f2 -d ' ')/../../../docs/patches/deeppicker/*.py DeepPicker-python/

11. Create DeepPicker Conda environment

In [None]:
! mamba create -n deep -c anaconda -c conda-forge -c pytorch python=3.7 tensorflow-gpu=2.4.1 cudatoolkit=10.1.243 cudnn=7.6.5 scipy matplotlib scikit-image mrcfile pytorch torchvision -y

**Cleanup**

12. Check that Conda environments can be found (cryolo, deep, topaz)

In [None]:
! conda env list
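
The same check can be scripted rather than read by eye. This is a minimal sketch, assuming `conda env list --json` returns an object whose `envs` key lists environment paths. The helper name `missing_envs` is hypothetical and not part of REPIC.

```python
import json
import os
import shutil
import subprocess

EXPECTED_ENVS = {"cryolo", "deep", "topaz"}


def missing_envs(env_list_json, expected=EXPECTED_ENVS):
    """Return the expected environment names absent from `conda env list --json` output."""
    env_paths = json.loads(env_list_json).get("envs", [])
    # Assumption: an environment's name is the last component of its path
    found = {os.path.basename(path.rstrip("/")) for path in env_paths}
    return sorted(set(expected) - found)


if shutil.which("conda"):
    result = subprocess.run(["conda", "env", "list", "--json"], capture_output=True, text=True)
    if result.returncode == 0:
        print("Missing environments:", missing_envs(result.stdout) or "none")
```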

13. Clean up installation files

In [None]:
! conda clean --all -y
! mamba clean --all -y

### Step 3 &mdash; run one-shot REPIC on example data _(optional)_

1. Clone the REPIC GitHub repo and obtain example data

In [None]:
! git clone https://github.com/ccameron/REPIC

2. Calculate particle overlap and enumerate cliques using [get_cliques.py](https://github.com/ccameron/REPIC/blob/main/repic/commands/get_cliques.py) (expected run time: <1 min)

In [None]:
! repic get_cliques /content/REPIC/examples/10017/ /content/REPIC/examples/10017/clique_files/ 180

3. Find optimal cliques and create consensus particle files using [run_ilp.py](https://github.com/ccameron/REPIC/blob/main/repic/commands/run_ilp.py) (expected run time: <1 min)

In [None]:
! repic run_ilp /content/REPIC/examples/10017/clique_files/ 180

A consensus particle coordinate file (in BOX format) for each micrograph will be produced in ```/content/REPIC/examples/10017/clique_files/```
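
To give a feel for what "particle overlap" means in the steps above, here is a toy sketch that scores two picks by the intersection over union (Jaccard index) of their square boxes. It assumes each pick is the (x, y) position of a box corner and that all boxes share one size. The function name `box_overlap` is hypothetical; this is an illustration only, not REPIC's implementation.

```python
def box_overlap(pick_a, pick_b, box_size):
    """Intersection over union of two equally sized, axis-aligned square boxes."""
    (ax, ay), (bx, by) = pick_a, pick_b
    # Overlap length along each axis, clipped at zero when the boxes are disjoint
    dx = max(0, box_size - abs(ax - bx))
    dy = max(0, box_size - abs(ay - by))
    intersection = dx * dy
    union = 2 * box_size ** 2 - intersection
    return intersection / union


print(box_overlap((100, 100), (100, 100), 180))  # identical picks
print(box_overlap((100, 100), (190, 100), 180))  # shifted by half a box along x
print(box_overlap((100, 100), (400, 100), 180))  # disjoint picks
```

Picks from different algorithms that overlap strongly are candidates for the same underlying particle, which is the intuition behind grouping them before selecting a consensus.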

### Step 4 &mdash; run iterative REPIC on example data _(optional)_

1. Download example data

In [34]:
! bash $(pip show repic | grep -in "Location" | cut -f2 -d ' ')/repic/iterative_particle_picking/get_examples.sh /content/REPIC/examples/10057/data/ &> aws_download.log

2. Create a configuration file for iterative ensemble particle picking using [iter_config.py](https://github.com/ccameron/REPIC/blob/main/repic/commands/iter_config.py) (expected run time: <1 min)

In [None]:
! repic iter_config /content/REPIC/examples/10057/ 176 224 /content/gmodel_phosnet_202005_N63_c17.h5 /content/DeepPicker-python 4 22

3. Pick particles by iterative ensemble learning using [iter_pick.py](https://github.com/ccameron/REPIC/blob/main/repic/commands/iter_pick.py), a wrapper of [run.sh](https://github.com/ccameron/REPIC/blob/main/repic/iterative_particle_picking/run.sh) (expected run time: 20-30 min/iteration):

In [None]:
! repic iter_pick /content/iter_config.json 4 100

The final set of consensus particles for the testing set will be found in: ```/content/REPIC/examples/10057/iterative_particle_picking/round_4/train_100/clique_files/test/*.box```
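
A quick way to inspect the result is to count picks per micrograph. The sketch below assumes each BOX file holds one particle per non-empty line. The helper name `count_particles` is hypothetical and not part of REPIC.

```python
from pathlib import Path


def count_particles(box_dir):
    """Map each BOX file name in box_dir to its number of non-empty lines."""
    counts = {}
    for box_file in sorted(Path(box_dir).glob("*.box")):
        with open(box_file) as handle:
            counts[box_file.name] = sum(1 for line in handle if line.strip())
    return counts


out_dir = Path("/content/REPIC/examples/10057/iterative_particle_picking/round_4/train_100/clique_files/test/")
if out_dir.is_dir():
    for name, n_particles in count_particles(out_dir).items():
        print(name, n_particles)
```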

### Step 5 &mdash; run REPIC on your data

**REPIC overview**

REPIC can be run in one of two modes:

1. One shot &mdash; find consensus particles from the output of multiple particle picking algorithms
2. Iterative &mdash; _ab-initio_ train an ensemble of particle picking algorithms using either one-shot REPIC output or manually picked particles

One-shot REPIC expects particle coordinate files (in BOX format) to be stored in separate subfolders for each particle picking algorithm:

```console
folder/
  | particle_picking_algorithm_1/
      | mrc001.box
      | mrc002.box
            ⋮
      | mrc999.box
  | particle_picking_algorithm_2/
      | mrc001.box
      | mrc002.box
            ⋮
      | mrc999.box
                ⋮
  | particle_picking_algorithm_n/
      | mrc001.box
      | mrc002.box
            ⋮
      | mrc999.box
```
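
Before running one-shot REPIC, it may be worth confirming that your folder matches this layout. The following sketch lists the BOX files found in each subfolder and reports micrographs that some algorithms picked but others did not. The function names and the `/content/my_picks` path are hypothetical. The sketch only reports differences and does not assume REPIC requires every algorithm to cover every micrograph.

```python
from pathlib import Path


def summarize_layout(folder):
    """Map each subfolder name to the set of BOX file stems it contains."""
    return {
        sub.name: {box.stem for box in sub.glob("*.box")}
        for sub in sorted(Path(folder).iterdir())
        if sub.is_dir()
    }


def micrographs_missing_picks(layout):
    """Map each subfolder to the micrograph stems that other subfolders have but it lacks."""
    all_stems = set().union(*layout.values()) if layout else set()
    return {name: sorted(all_stems - stems) for name, stems in layout.items() if all_stems - stems}


folder = Path("/content/my_picks")  # hypothetical path; replace with your own folder
if folder.is_dir():
    layout = summarize_layout(folder)
    print({name: len(stems) for name, stems in layout.items()})
    print(micrographs_missing_picks(layout))
```

Note that any output subfolder already inside `folder/` (such as a `clique_files/` directory) will be listed alongside the algorithm subfolders.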

Iterative REPIC expects micrograph (MRC) and particle coordinate (if using manually picked particles) files to be found in a single folder:
```console
folder/
  | mrc001.mrc
  | mrc001.box
  | mrc002.mrc
  | mrc002.box
        ⋮
  | mrc999.mrc
  | mrc999.box
```

REPIC will create training and validation subsets from files in ```folder/``` before performing _ab-initio_ model training.
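
When supplying manually picked particles, a quick pairing check can catch mismatched file names before training starts. This is a minimal sketch that assumes a micrograph and its coordinate file share the same file stem, as in the layout above. The function name `unpaired_files` and the `/content/my_data` path are hypothetical.

```python
from pathlib import Path


def unpaired_files(folder):
    """Return (micrograph stems without a BOX file, BOX stems without a micrograph)."""
    folder = Path(folder)
    mrc_stems = {path.stem for path in folder.glob("*.mrc")}
    box_stems = {path.stem for path in folder.glob("*.box")}
    return sorted(mrc_stems - box_stems), sorted(box_stems - mrc_stems)


folder = Path("/content/my_data")  # hypothetical path; replace with your own folder
if folder.is_dir():
    mrc_only, box_only = unpaired_files(folder)
    print("Micrographs without coordinates:", mrc_only)
    print("Coordinate files without a micrograph:", box_only)
```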

**Uploading data to Google Colab - zipped folder**

1. Upload a zipped folder of picked particles:
  
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Click folder icon in left panel of Colab ("Files") > Upload to session storage

2. Unzip folder:

In [None]:
! unzip <folder_name>.zip

3. Example command &mdash; calculate the particle overlap and enumerate cliques using [get_cliques.py](https://github.com/ccameron/REPIC/blob/main/repic/commands/get_cliques.py):

In [None]:
! repic get_cliques <folder_name> <folder_name>/clique_files/ <box_size>

Replace ```<box_size>``` with the expected detection box size (in pixels).
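
If you are unsure which box size your pickers used, the coordinate files themselves can offer a hint. The sketch below assumes BOX files are whitespace-separated with columns x, y, width, height (the EMAN BOX convention). The function name `box_sizes` is hypothetical and not part of REPIC.

```python
from collections import Counter


def box_sizes(box_path):
    """Count the (width, height) pairs found in columns 3 and 4 of a BOX file."""
    sizes = Counter()
    with open(box_path) as handle:
        for line in handle:
            fields = line.split()
            # Assumption: columns are x, y, width, height; shorter lines are skipped
            if len(fields) >= 4:
                sizes[(int(float(fields[2])), int(float(fields[3])))] += 1
    return sizes
```

For example, `box_sizes("<folder_name>/particle_picking_algorithm_1/mrc001.box").most_common(1)` would show the most frequent box dimensions in that file. Different algorithms may report different sizes, so treat the result as a starting point rather than a definitive value for ```<box_size>```.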

**Uploading data to Google Colab - Google Drive**

1. Upload your data to Google Drive and mount it:

In [None]:
# Load the Drive helper and mount
from google.colab import drive
drive.mount('/content/drive')

2. Example command &mdash; calculate the particle overlap and enumerate cliques using [get_cliques.py](https://github.com/ccameron/REPIC/blob/main/repic/commands/get_cliques.py):

In [None]:
! repic get_cliques /content/drive/MyDrive/<path_to_folder> /content/drive/MyDrive/<path_to_folder>/clique_files/ <box_size>

For more information about REPIC commands, see Steps 3 and 4 above or the documentation: https://repic.readthedocs.io/en/latest/

**Downloading REPIC results**

Folders containing REPIC output can be downloaded using the following:

In [None]:
! zip -r repic_out.zip <path_to_REPIC_output_folder>/
from google.colab import files
files.download("repic_out.zip")