Thank you to the original authors for creating this package. Friedmann D, Pun A, et al. https://github.com/AlbertPun/TRAILMAP.
This forked repo of TRAILMAP adds:
-
Entry point to generate random 3D crops of raw data in
prepare_data.py
Instead of manually cropping 3D subvolumes of raw data for annotation, run
python3 prepare_data.py "generate_annotation_subvolumes" PATH_TO_RAW_FOLDER PATH_TO_OUTPUT_FOLDER CUBE_LENGTH
where PATH_TO_RAW_FOLDER is the folder containing raw tiff volumes (data/training/training-raw) and PATH_TO_OUTPUT_FOLDER (data/training/training-original/volumes) is the target folder where subvolumes will be stored for annotation. The CUBE_LENGTH parameter defaults to 200px as specified in the original TrailMap documentation. This will output a csv file of the x, y coordinates used to crop these volumes.
and changes:
-
VolumeDataGenerator
No longer inherits from
tensorflow.keras.utils.Sequence
.flow
method modified to return aVolumeArrayIterator
which inherits fromtensorflow.keras.preprocessing.image.Iterator
. This ensures all images are presented once and only once per epoch. This allows many arguments passed tomodel.fit
to be removed. -
Remove augmentation of validation set. Training and validation sets now have unique data generators.
Our best model (download) was trained from scratch on a Nvidia RTX 3070 GPU with the following parameters:
Parameter | Value |
---|---|
batch_size |
6 |
epochs |
22 |
learning_rate |
1e-4 |
decay |
0 |
This is a software package to extract axonal data from cleared brains as demonstrated in Mapping Mesoscale Axonal Projections in the Mouse Brain Using A 3D Convolutional Network Friedmann D, Pun A, et al. While it was trained and tested on axons in iDISCO-cleared samples imaged by lightsheet microscopy, it works well at identifying many types of filamentous structures in 3D volumes. Instructions are included for segmenting your data with our best existing model, but we also provide guidance on transfer learning with your own annotated data.
There are two readmes—this one includes basic instructions and descriptions for getting TrailMap up and running. The second is written to be simple enough to be useful for novice users—not just of machine learning tools, but even if this is your first time using python or linux, you can hopefully follow along.
You must have git installed along with git-lfs
To download the required files, run
git clone https://github.com/AlbertPun/TRAILMAP.git
From Terminal, enter into the TRAILMAP directory
cd /home/USERNAME/Documents/TRAILMAP
Due to the large size of the model, you must enable git-lfs in this directory by running
git lfs install
If git-lfs does not work for you (e.g. error messages with the model being corrupted), we have also provided a google drive link to the model weights here. Drag the downloaded file into the TRAILMAP/data/model-weights folder.
- While the network portion of TrailMap operates on your 3D volume in chunks and has minimal hardware requirements, visualizations benefit from having sufficient RAM to hold the whole sample volume at once. Depending on your brain file size, TrailMap will take 8-16GB of RAM. Opening a 16-bit volume covering a half-brain (as seen in the publication) requires ~24 GB. In practice, 64GB is sufficient, but 128GB+ provides greater flexibility when opening multiple volumes at once.
- A Nvidia GPU with CUDA support. The network presented here was trained with a 1080Ti with 11GB GDDR5X Memory.
- You need to also install the Nvidia Driver Verision > 418 with CUDA 10.1 and CUDNN 7.6 to use your GPU. Guide on installation for CUDA and CUDNN
- Python 3.7
We highly recommend you use Anaconda to manage your packages because it is by far the easiest way to install cuda and cudnn with the correct versions. Refer to the Readme Extended Version for step by step instructions on how to do this with Anaconda
tensorflow-gpu==2.1
opencv==3.4
pillow==7.0
numpy==1.18
h5py==2.1
Data structure: Your brain volume must be in the form of 2D image slices in a folder where the slice order is determined by the alphanumerical order of the image file names. See TRAILMAP/data/testing/example-chunk for an example
To run the network cd into the TRAILMAP directory and run
python3 segment_brain_batch.py input_folder1 input_folder2 input_folder3
where “input_folderX” is the absolute path to the folder of the brain you would like to process. Note: Depending on the amount of GPU memory, you may need to lower the batch_size in the segment_brain.py file.
The program will create a folder named “seg-input_folderX " in the same directory as the input_folder
To segment an example chunk run
python3 segment_brain_batch.py data/testing/example-chunk
Note: Depending on the amount of GPU memory, you may need to lower the batch_size in the segment_brain.py file.
If you would like to do transfer learning with your own examples, you must label some of your own data. The network takes in cubes with length of 64 pixels, so the training examples must be of this size.
When labeling the cube, you must follow the following legend. You only need to label individual slices, spaced every 20-40 slices, with the labels:
- 1 - background
- 2 - axon
- 3 - artifacts (objects that look like axons and should be avoided)
- 4 - the edge of the axon (should be programmatically inserted)
All other slices should be labeled with
- 0 - unlabeled
Our strategy, with provided utilities programs:
The basic procedure is to hand label cubes of length ~128-200 px (placed in the folders in data/training/training-original/labels and data/training/training-original/volumes) and crop out many cubes of length 64px (into the folder /training-set/). While we recommend this strategy, you may use your own strategy of populating the training-set folder if you wish. We recommend using ImageJ's Segmentation Editor Plugin to label your data.
TrailMap will determine a volume's matching label by sorting files in the ‘volumes’ and ‘labels’ folders alphabetically and assuming the label and volume at the same index are pairs.
We have included a script to add the edge label next to an axon label. After you have labeled axons in your volumes, run
python3 prepare_data.py "process_labels" PATH_TO_LABEL_FOLDER
where PATH_TO_LABEL_FOLDER is the folder containing all your labeled tiff volumes. This will create a folder edge-ORIGINAL_FOLDER_NAME with the new edge label added to each labeled volume.
Fully labeled examples are show here:
- data/training/training-original/labels/training_example_label.tif
- data/training/training-original/volumes/training_example_volume.tif
After you have placed your labeled examples in training-original folder you can populate the training-set folder by running
python3 prepare_data.py "generate_training_set" NUM_EXAMPLES
NUM_EXAMPLES is an option parameter that determines the numbers of crops to make in total using a round robin strategy from the training-original folder. If not specified, the default value is set to 100 * NUM_TRAINING_ORIGINAL_EXAMPLES
You must also include a validation set to judge your network's performance on the new data. You must put your validation examples in the data/training/validation-original with the same labeling technique as the training. You can then populate the validation-set folder by running
python3 prepare_data.py "generate_validation_set" num_examples
After generate_training_set has populated the training-set folder, you may start the transfer learning. This will require you to tune the parameters in train.py
There are some default parameters for training, but you will likely need to tune this depending on how different your own training set is to our data and if you need to do any augmentation.
A VolumeDataGenerator class is provided that handles basic operations (the train.py contains this class and more specific information in the comments). This follows the same paradigm as Tensorflow's ImageDataGenerator.
After you have populated the training-set folder and tuned parameters, start training with:
python3 train.py
Note: Depending on the amount of GPU memory, you may need to reduce the batch_size in the train.py file. You may also change steps_per_epoch = floor(NUM_TRAINING_EXAMPLES/batch_size) and validation_steps = floor(NUM_VALIDATION_EXAMPLES/batch_size) This will load in the current model and start training the model on your own data. Checkpoints are saved to data/model-weights at the end of each epoch along with tensorboard logs in data/tf-logs
- Albert Pun
- Drew Friedmann
This project is licensed under the MIT License
- Research sponsored by Liqun Luo's Lab