A program interfacing Segment Anything with cryoSPARC via cryosparc-tools to efficiently identify membrane proteins and complexes in cryo-EM micrographs.

r-karimi/vesicle-picker

Installation

  1. Ensure Git and Anaconda (or Miniconda) are installed on your machine.

  2. To run the Segment Anything model on a GPU, ensure CUDA is installed on your machine. CUDA is not needed if you only run Segment Anything on your machine's CPU.

  3. Clone this repository:

    git clone https://github.com/r-karimi/vesicle-picker.git
    
  4. Enter this repository:

    cd vesicle-picker
    
  5. Create a clean conda virtual environment:

    conda create -n vesicle-picker
    conda activate vesicle-picker
    conda install pip
    
  6. Edit the pyproject.toml file in the base directory to install the correct versions of PyTorch, torchvision, and torchaudio for your machine. The instructions differ depending on whether you are installing PyTorch for CPU or GPU use.

    CPU Installation

    • Visit the PyTorch installation page and select the appropriate options, ensuring that Pip is selected as the package manager and CPU is selected as the compute platform. Note the given install command, but do not run it.

    • Modify install-pytorch in pyproject.toml with the install command noted above:

       # Example: CPU
       install-pytorch = "pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu"
      

    GPU Installation

    • Note your version of CUDA and Python by running:

       nvcc --version
       python --version
      
    • Browse the PyTorch wheels to find the appropriate versions of PyTorch, torchvision, and torchaudio for your installed versions of CUDA and Python (e.g. cu118 for CUDA 11.8 and cp39 for Python 3.9).

    • Modify the install-pytorch command in pyproject.toml to match these versions:

       # Example: Python 3.9.X and CUDA 11.8
       install-pytorch = "pip install torch==2.1.1+cu118 torchvision==0.16.1+cu118 torchaudio==2.1.1+cu118 -f https://download.pytorch.org/whl/torch_stable.html"
      
  7. Install vesicle-picker and dependencies:

    pip install .
    poe install-pytorch
    
  8. Download the Segment Anything model weights and place them in the vesicle-picker repository. We recommend trying with the ViT-L model weights first.

  9. Modify csparc_login.ini to match your active CryoSPARC instance from which micrographs will be imported into Vesicle Picker and into which particle locations will be exported.
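
As a rough aid for the GPU route in step 6, the mapping from installed versions to wheel tags can be sketched in Python. The helper names `cuda_tag` and `python_tag` are illustrative and not part of Vesicle Picker; the sketch only assumes the naming convention from the example in step 6 (cu118 for CUDA 11.8, cp39 for Python 3.9).

```python
import sys


def cuda_tag(cuda_version: str) -> str:
    """Turn a CUDA version such as '11.8' into a wheel tag such as 'cu118'."""
    major, minor = cuda_version.split(".")[:2]
    return f"cu{major}{minor}"


def python_tag(version_info=sys.version_info) -> str:
    """Turn a Python version such as (3, 9, ...) into a wheel tag such as 'cp39'."""
    return f"cp{version_info[0]}{version_info[1]}"


if __name__ == "__main__":
    # '11.8' stands in for the release reported by `nvcc --version`.
    print(cuda_tag("11.8"), python_tag())
```

The resulting tags are only a starting point for browsing the wheel index; whether a matching wheel exists for a given combination still has to be checked there.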

Usage

Before processing your own dataset, we recommend working through the introductory Jupyter notebook find_vesicles.ipynb. This notebook describes how the program imports data residing in CryoSPARC and describes each step of the processing pipeline.

To process your own dataset, follow the steps below:

CryoSPARC (Part 1)

  1. Import your movies, then perform patch motion correction and patch CTF estimation.

  2. Curate your motion corrected micrographs.

  3. Note the project ID, workspace ID, and job ID of your Curate Exposures job.
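
CryoSPARC refers to projects, workspaces, and jobs by short IDs such as P3, W1, and J42. The following is a small sketch for recording and sanity-checking the three IDs before entering them in a parameter file. The `CurateJobRef` class and the example IDs are hypothetical and are not part of Vesicle Picker or cryosparc-tools.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class CurateJobRef:
    """Hypothetical container for the IDs of a Curate Exposures job."""
    project: str    # e.g. "P3"
    workspace: str  # e.g. "W1"
    job: str        # e.g. "J42"

    def __post_init__(self):
        # Each ID is a one-letter prefix followed by a positive integer.
        for prefix, value in (("P", self.project), ("W", self.workspace), ("J", self.job)):
            if not re.fullmatch(rf"{prefix}[1-9]\d*", value):
                raise ValueError(f"expected an ID like {prefix}1, got {value!r}")


ref = CurateJobRef(project="P3", workspace="W1", job="J42")
print(ref)
```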

Python

  1. Find the optimal mask preprocessing and postprocessing parameters for your data by importing a test micrograph using the find_vesicles.ipynb Jupyter notebook. We note in our paper that a combination of roundness and area postprocessing filters is sufficient to obtain high precision and recall in the task of finding synaptic vesicles. If these filters are sufficient for your dataset as well, then the parameters that need to be set by the user are as follows:

    • $\sigma_{space}$, $\sigma_{colour}$, and $d$ for the bilateral filter.
    • $roundness_{min}$ for the roundness filter.
    • $area_{min}$ and $area_{max}$ for the area filter.
    • $r_{dilation}$ or $r_{erosion}$ for particle picks offset from the membrane edge.
    • Box size to control the density of the picks.

    There are a variety of other postprocessing filters that can be applied to your data as well. These filters are commented out in parameters/filter_vesicles.ini by default. More information about the various postprocessing methods implemented in this library can be found in vesicle_picker/postprocessing.py.

  2. Find vesicles by modifying the find_vesicles.ini parameter file with your desired parameters, making sure to fill in the correct CryoSPARC information. Also fill in your CryoSPARC login information, using csparc_login.ini as a template. Finally, indicate an appropriate output directory for the detected vesicles, which will be stored in Python .pkl files. We have pre-filled this parameter file with a reasonable set of starting parameters.

    When you're ready, run the find_vesicles.py script. The script takes a file path to the parameters file as its only argument:

    python find_vesicles.py parameters/find_vesicles.ini
    
  3. Filter the found vesicles by modifying the filter_vesicles.ini parameter file, uncommenting the types of filters that you want to use and setting their minimum and maximum values. Be sure to set the input directory for this script to the output directory of find_vesicles.py. Again, we have pre-filled this parameter file with a reasonable set of starting parameters.

    When you're ready, run filter_vesicles.py:

    python filter_vesicles.py parameters/filter_vesicles.ini
    
  4. Generate particle picks by modifying the generate_picks.ini parameter file. Set the workspace into which the vesicle picks will be exported. Set the dilation or erosion radius if desired, and set the box size parameter to control the density of picks. We recommend picking with a high density and removing duplicate particles later in CryoSPARC. Be sure to set the input directory for this script to the output directory of filter_vesicles.py.

    Run generate_picks.py:

    python generate_picks.py parameters/generate_picks.ini
    

    Once this script has finished executing, you should see a collection of .pkl files in the output directory of this script, as well as a new, completed job in CryoSPARC called Vesicle Picks. This job will be used for downstream processing in CryoSPARC.
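
To make the roundness and area filters from step 1 concrete, here is a minimal sketch in plain Python. It assumes roundness is defined as $4\pi A / P^2$ (1 for a perfect circle, smaller for irregular shapes), which is a common definition but may differ from the one used in vesicle_picker/postprocessing.py. The dictionary keys, the `keep_mask` function, and the threshold values are hypothetical and do not mirror the library's API or its default parameters.

```python
import math


def roundness(area: float, perimeter: float) -> float:
    """Isoperimetric roundness: 1.0 for a circle, lower for irregular shapes."""
    return 4.0 * math.pi * area / perimeter ** 2


def keep_mask(mask: dict, roundness_min: float, area_min: float, area_max: float) -> bool:
    """Return True if a mask passes both the roundness and the area filter."""
    round_enough = roundness(mask["area"], mask["perimeter"]) >= roundness_min
    right_size = area_min <= mask["area"] <= area_max
    return round_enough and right_size


# Two toy masks, measured in pixels: a circle of radius 50 and a 200 x 10 strip.
masks = [
    {"name": "circle", "area": math.pi * 50 ** 2, "perimeter": 2 * math.pi * 50},
    {"name": "strip", "area": 200 * 10, "perimeter": 2 * (200 + 10)},
]

# Illustrative thresholds only; tune them on a test micrograph.
kept = [m["name"] for m in masks
        if keep_mask(m, roundness_min=0.8, area_min=1000, area_max=20000)]
print(kept)
```

In this toy case the strip has an acceptable area but a roundness of roughly 0.14, so only the circular mask survives, which is the behaviour one would want when separating vesicles from elongated contaminants.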

CryoSPARC (Part 2)

  1. Extract particles from the micrographs that were used as input to find_vesicles.py. We recommend extracting with a box size 2x to 3x larger than the box size used to generate picks.

  2. Proceed with downstream analysis in CryoSPARC, such as 2D classification and Ab initio reconstruction.
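
The extraction box size recommendation in step 1 is simple arithmetic. As a sketch, assuming a hypothetical picking box size of 64 pixels (not a value taken from the repository):

```python
def extraction_box_range(pick_box_size: int) -> tuple[int, int]:
    """Return the 2x and 3x extraction box sizes for a given picking box size."""
    return 2 * pick_box_size, 3 * pick_box_size


low, high = extraction_box_range(64)
print(f"Extract with a box size between {low} and {high} pixels.")
```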

Tips

  • We recommend experimenting with different model architectures and downsampling factors to find a good trade-off between accuracy and speed when processing a full dataset. We have found that perfect recall when finding vesicles is usually unnecessary for obtaining a structure. A small set of high-quality vesicles is usually more informative than vesicles mixed in with junk, so don't be afraid to filter your vesicles stringently.

  • If you're able to generate good 2D classes of a membrane protein complex with Vesicle Picker, these particles can be used for template matching and training a Topaz model to obtain a larger and more well-centered particle stack for subsequent 3D reconstruction and refinement.

  • When performing 2D classification, particularly when searching for small membrane proteins and protein complexes, we found that it is important to perform at least 40 iterations of expectation-maximization. We also typically increase the batchsize per class to 150 or 200. Finally, we almost always see better results when we disable the Recenter 2D classes parameter.

  • We typically iterate 2D classification and selection of promising 2D classes several times. In early iterations, we enable the Force Max over poses/shifts parameter to efficiently classify large numbers of particles. In later iterations, where images of proteins in membranes are enriched, we typically disable the Force Max over poses/shifts parameter to better resolve low-SNR particles within membranes.

Reference

Karimi, R., Coupland, C. E. & Rubinstein, J. L. Vesicle Picker: A tool for efficient identification of membrane protein complexes in vesicles. bioRxiv 2024.07.15.603622 (2024) doi:10.1101/2024.07.15.603622.
