The field of connectomics aims to reconstruct the wiring diagram of the brain by mapping neural connections at the level of individual synapses. Recent advances in electron microscopy (EM) have enabled the collection of large numbers of image stacks at nanometer resolution, but annotating them requires expertise and is very time-consuming. Here we provide a deep learning framework powered by PyTorch for automatic and semi-automatic data annotation in connectomics. This repository is under active development by the Visual Computing Group (VCG) at Harvard University.
- Multitask Learning
- Active Learning
- CPU and GPU Parallelism
If you want new features that are relatively easy to implement (e.g., loss functions, models), please open a feature request in the issues, or implement them yourself and submit a pull request. For other features that require a substantial amount of design and coding, please contact the author directly.
The code is developed and tested under the following configurations.
- Hardware: 1-8 Nvidia GPUs (with at least 12 GB of GPU memory) (change
- Software: CentOS Linux 7.4 (Core), CUDA>=9.0, Python>=3.6, PyTorch>=1.3.0
Create a new conda environment:
conda create -n py3_torch python=3.7
source activate py3_torch
conda install pytorch torchvision cudatoolkit=9.2 -c pytorch
Please note that this package is developed on the Harvard FASRC cluster, where the installed Nvidia driver is version 396.26, which supports CUDA 9. More information about GPU computing on the FASRC cluster can be found here.
Download and install the package:
git clone git@github.com:zudi-lin/pytorch_connectomics.git
cd pytorch_connectomics
pip install -r requirements.txt
pip install --editable .
- Visualize the training loss and validation images using tensorboardX.
- Use TensorBoard with `tensorboard --logdir runs` (requires TensorFlow to be installed).
- Visualize the affinity graph and segmentation using Neuroglancer.
We provide a data augmentation interface with several commonly used augmentation methods for EM images. The interface is pure Python; it operates on and outputs only NumPy arrays, so it can be easily incorporated into any Python-based deep learning framework (e.g., TensorFlow). For more details about the design of the data augmentation module, please check the documentation.
We provide several encoder-decoder architectures, which can be found here. These models can be applied to any semantic segmentation task on 3D image stacks. We also provide benchmark results on the SNEMI3D neuron segmentation challenge here, with detailed training specifications for users to reproduce.
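To illustrate what a NumPy-only augmentation looks like, here is a minimal sketch of a random flip applied jointly to an image volume and its label volume. The function name `random_flip` and its signature are hypothetical and chosen for illustration; this is not the interface of the package's augmentation module.

```python
import numpy as np

def random_flip(image, label, rng):
    """Flip image and label together along randomly chosen axes.

    Hypothetical helper for illustration; not the package's API.
    """
    for axis in range(image.ndim):
        # Flip each axis with probability 0.5, using the same decision
        # for image and label so they stay aligned.
        if rng.random() < 0.5:
            image = np.flip(image, axis=axis)
            label = np.flip(label, axis=axis)
    # np.flip returns views; return contiguous copies for downstream use.
    return np.ascontiguousarray(image), np.ascontiguousarray(label)

rng = np.random.default_rng(0)
image = rng.random((4, 8, 8)).astype(np.float32)  # toy (z, y, x) volume
label = (image > 0.5).astype(np.uint8)            # toy label derived from image
aug_image, aug_label = random_flip(image, label, rng)
```

Because the inputs and outputs are plain arrays, a function of this shape can sit in front of any framework's data loader without depending on it.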
Synchronized Batch Normalization on PyTorch
Previous work has suggested that a reasonably large batch size can improve the performance of detection and segmentation models. Here we use a synchronized batch normalization module that computes the mean and standard deviation across all devices during training. Please refer to Synchronized-BatchNorm-PyTorch for details. The implementation is pure Python, uses the unbiased variance to update the moving average, and uses `sqrt(max(var, eps))` instead of `sqrt(var + eps)`.
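The arithmetic described above can be sketched in plain Python for a single channel. This is an illustrative sketch only, not the Synchronized-BatchNorm-PyTorch implementation: the function names, the `momentum` default, and the two-device toy batch are assumptions made for the example.

```python
import math

def sync_batch_stats(device_batches):
    """Combine per-device sums into a global mean and biased variance."""
    # Each device contributes its element count, sum, and sum of squares.
    n = sum(len(b) for b in device_batches)
    total = sum(sum(b) for b in device_batches)
    total_sq = sum(sum(x * x for x in b) for b in device_batches)
    mean = total / n
    biased_var = total_sq / n - mean * mean
    return n, mean, biased_var

def normalize(x, mean, var, eps=1e-5, clamp=True):
    """Normalize x with sqrt(max(var, eps)) or, if clamp=False, sqrt(var + eps)."""
    std = math.sqrt(max(var, eps)) if clamp else math.sqrt(var + eps)
    return (x - mean) / std

def update_running_var(running_var, biased_var, n, momentum=0.1):
    """Update the moving average using the unbiased variance estimate."""
    unbiased_var = biased_var * n / (n - 1)
    return (1 - momentum) * running_var + momentum * unbiased_var

# Two hypothetical devices, each holding part of the batch for one channel.
batches = [[1.0, 2.0], [3.0, 4.0]]
n, mean, var = sync_batch_stats(batches)
running_var = update_running_var(1.0, var, n)
```

The two denominators differ only when the variance exceeds `eps`: clamping leaves the variance untouched in that case, whereas adding `eps` always shifts it slightly.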
This project is built upon numerous previous projects. In particular, we'd like to thank the contributors of the following GitHub repositories:
- pyGreenTea: Janelia FlyEM team
- DataProvider: Princeton SeungLab
- EM-affinity: Harvard Visual Computing Group
This project is licensed under the MIT License - see the LICENSE file for details.