Kartik Ayyer edited this page Jan 2, 2019 · 24 revisions

Welcome to the Dragonfly wiki!

This software package implements the EMC (expand–maximize–compress) single-particle reconstruction algorithm, parallelized with MPI and OpenMP. The package also includes a data stream simulator that generates noisy single-particle diffraction patterns from a PDB file, as well as an experimental pattern-classification GUI to separate out single-particle diffraction patterns in experimental data. Please cite the following publication if you use Dragonfly in your work:

Ayyer, K., Lan, T. Y., Elser, V., & Loh, N. D. (2016). Dragonfly: an implementation of the expand–maximize–compress algorithm for single-particle imaging. Journal of Applied Crystallography, 49(4), 1320-1335.


Left: Flowchart for simulated data, including the data stream simulator. Right: Flowchart for experimental data. The photon data and detector geometry are presumed to come from the X-ray facility.

Simulating single-particle imaging

This workflow creates a scattering density from a PDB file, sets up a scattering geometry, simulates the collection of photon data, then prepares and starts a reconstruction of the 3D diffraction volume using the EMC algorithm.

The flowchart above (left) illustrates the pipeline to effect this workflow. A quick start guide is provided here. You might want to learn about Configuring your experiment and about the utilities in the Data stream simulator as well.

Experimental single-particle imaging

This workflow is used to reconstruct single-particle imaging data from elsewhere (e.g. actual experiments or undisclosed sources!). A quick start guide for processing data from the Single Particle Imaging (SPI) initiative at the LCLS is here. There is also additional information about experiment-specific configuration parameters as well as the experimental pattern classifier GUI.

The flowchart above (right) illustrates the processing pipeline typically employed once the single-particle data have been collected.


Repository structure and requirements

The parent directory of this repository contains:

  1. a Makefile to compile the C modules;
  2. a sample configuration file to start a simulation workflow;
  3. a dragonfly_init utility to spawn an isolated reconstruction instance;
  4. a src/ directory where EMC source code is kept;
  5. an aux/ directory that holds auxiliary input (e.g. detector masks, PDB files and scattering factors);
  6. a utils/ directory that contains Python modules for the data stream simulator and other C source files.

Python utilities

C requirements

  • A *NIX environment to compile code to binaries and execute them
  • gcc (or other C compiler)
  • OpenMP
  • MPI
  • GNU Scientific Library (gsl)

Python requirements

  • Core: numpy, pyqt, matplotlib
  • Data stream simulator: scipy, pyfftw
  • Pattern classifier: scikit-learn
  • Conversion utilities: h5py

If you have Anaconda, the dependencies can be installed in an environment with the following command: conda env create -f dragonfly_conda.yml
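
After setting up the environment, it can be useful to confirm that the Python dependencies listed above are importable. The following is a minimal sketch, not part of Dragonfly: the helper name missing_modules and the grouping dictionary are hypothetical, and the import names are assumptions based on the usual package-to-module mapping (for example, scikit-learn is imported as sklearn). pyqt is left out because its import name depends on the installed major version.

```python
import importlib.util

# Import names assumed from the requirement list above.
# pyqt is omitted: its import name (e.g. PyQt4 or PyQt5) depends on the version.
REQUIRED = {
    "Core": ["numpy", "matplotlib"],
    "Data stream simulator": ["scipy", "pyfftw"],
    "Pattern classifier": ["sklearn"],
    "Conversion utilities": ["h5py"],
}


def missing_modules(names):
    """Return the top-level module names in `names` that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Report the status of each group without importing the modules themselves.
for group, names in REQUIRED.items():
    missing = missing_modules(names)
    status = "ok" if not missing else "missing: " + ", ".join(missing)
    print(group + ": " + status)
```

A group reported as missing only affects the corresponding part of the package; for example, the pattern classifier is the only component listed as needing scikit-learn.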


Data formats

This package uses the following file extensions. Although the extension names can be changed in the configuration file, the data formats themselves are fixed by the functions in the package. The specifics of each data format are listed with the usage description of the corresponding modules/utilities in this wiki.

Binary formats

  • *.bin - Raw binary dumps, usually not for direct user access
  • *.emc - Custom binary format to store sparse photon data efficiently
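
To illustrate why a sparse representation suits photon data, here is a minimal sketch of the general idea: in a frame where most pixels record zero photons, it is enough to store the positions of the one-photon pixels plus the positions and counts of the multi-photon pixels. This is only an illustration of the concept and should not be read as the actual on-disk layout of *.emc files; the function and variable names are hypothetical.

```python
def to_sparse(frame):
    """Split a flat list of per-pixel photon counts into sparse parts.

    Returns (one_pixels, multi_pixels, multi_counts): indices of pixels with
    exactly one photon, indices of pixels with more than one photon, and the
    counts at those multi-photon pixels.
    """
    one_pixels = [i for i, count in enumerate(frame) if count == 1]
    multi_pixels = [i for i, count in enumerate(frame) if count > 1]
    multi_counts = [frame[i] for i in multi_pixels]
    return one_pixels, multi_pixels, multi_counts


def to_dense(num_pix, one_pixels, multi_pixels, multi_counts):
    """Rebuild the flat per-pixel photon counts from the sparse parts."""
    frame = [0] * num_pix
    for i in one_pixels:
        frame[i] = 1
    for i, count in zip(multi_pixels, multi_counts):
        frame[i] = count
    return frame


# A toy 10-pixel frame with three lit pixels.
frame = [0, 0, 1, 0, 3, 0, 0, 1, 0, 0]
one_pixels, multi_pixels, multi_counts = to_sparse(frame)
restored = to_dense(len(frame), one_pixels, multi_pixels, multi_counts)
```

In this toy frame only four integers need to be stored instead of ten, and the saving grows as the fraction of empty pixels increases.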

ASCII plain text

  • *.dat

FAQ and troubleshooting

For more information see FAQ. This will be updated as more questions are asked.

Email Duane Loh (duaneloh [at] nus.edu.sg) or Kartik Ayyer (kartik.ayyer [at] mpsd.mpg.de) with any questions.