Optional parameters for DANNCE

DANNCE has other optional parameters that can be put in the DANNCE base config (see configs/dannce_mouse_config.yaml) OR in the project-specific io.yaml (see, e.g., demo/markerless_mouse_1/io.yaml). Put these optional parameters in the base config if you will use them across many projects, or in the io.yaml file if they are specific to a single project.
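
For example, a hypothetical io.yaml fragment mixing a few of these optional parameters might look like the following (values are illustrative, not recommendations):

```yaml
# Hypothetical io.yaml fragment. These optional parameters sit alongside
# the required project-specific entries; values here are illustrative.
gpu_id: "0"        # pin DANNCE to the first GPU
interp: nearest    # 3D volume interpolation method
n_views: 6         # number of camera views the network expects
```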


verbose.

int. Either 1 or 0. When set to 1, TensorFlow will print training and prediction progress; when set to 0, it will not. (Default: 0)

gpu_id.

string. When using multiple GPUs, use this parameter to set which GPU will be used. If this is not set, TensorFlow will allocate all GPUs by default. (Default: "0")

interp.

string. Either nearest or linear. Sets the interpolation method when building 3D volumes from 2D images. (Default: nearest)

💡 Nearest-neighbor interpolation is faster, but linear interpolation can in principle be more accurate. In the DANNCE paper, we used nearest for our analyses. Note that it is best to use the same interpolation method for both training and prediction.
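
A minimal sketch: placing interp in the base config is one way to keep training and prediction consistent automatically.

```yaml
# Put interp in the base config (e.g. configs/dannce_mouse_config.yaml)
# so dannce-train and dannce-predict use the same interpolation method.
interp: nearest
```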

medfilt_window.

int. If not None, the 3D COM trace will be smoothed with a median filter of the size indicated here. (Default: None)

💡 Inspect your com3d.mat files after they are generated by the COMnet. You want them to look smooth, without any outliers. If there are many jumps or outliers, you should focus on improving the COMnet with additional training data or by adjusting COMnet training parameters. Using Label3D in COM-only mode (i.e. where only a single point is selected) can quickly generate more training data for this. If outliers or jumps are sparse, you can try to smooth them away by introducing this median filter. The larger the window, the more likely your COM will become inaccurate and not capture the entire animal in the DANNCE 3D volume.
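
As a sketch (the window size below is an assumption, not a recommendation; median filter windows are typically odd):

```yaml
# Hypothetical: smooth sparse COM outliers with a modest median filter.
# Larger windows raise the risk of the COM drifting off the animal.
medfilt_window: 11
```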

n_views.

int. Total number of views that the DANNCE CNN will expect as input. (Default: 6)

💡 DANNCE requires network inputs with the same number of cameras as were used to train the network. Thus, if you are using fewer than 6 cameras and are fine-tuning a 6-camera version of DANNCE, the default behavior of dannce-train and dannce-predict is to duplicate the cameras in *dannce.mat to meet this requirement. If you don't want your cameras duplicated, for instance if you are fine-tuning a DANNCE network initially trained with fewer cameras, then this parameter must be set to the number of cameras you are using.
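
A hedged example: suppose you recorded with three cameras and are fine-tuning a network that was itself trained with three cameras (the camera counts here are assumptions):

```yaml
# Hypothetical 3-camera rig, fine-tuning a 3-camera network.
# Setting n_views prevents DANNCE from duplicating cameras up to 6.
n_views: 3
```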

new_last_kernel_size.

list. The kernel size in the last convolutional layer, which is re-initialized during fine-tuning. (Default: [3, 3, 3])

💡 This is a tunable hyperparameter. We leave this at [3, 3, 3] for most analyses, although we also know that [1, 1, 1] works.

mono.

boolean. Set to True if using monochrome videos and a monochrome version of the DANNCE architecture.

Training Parameters

loss.

string. This is the loss function used for training. (Default: mask_nan_keep_loss)

💡 This can either be the name of a loss function inside dannce/engine/losses.py or the name of a Keras loss function. In most cases, you'll just use mask_nan_keep_loss, a mean squared error loss that ignores missing labels in a sample.

sigma.

int. When using the MAX network, this value sets the size of the spherical Gaussians (in mm) used as training targets for each landmark. (Default: 10)

💡 This should scale with the size of your species (and volume size), and should roughly match the deviation of your labeled landmark positions from the true anatomical position of the landmark.

lr.

float. The learning rate used by the Adam optimizer during training. (Default: 1e-3)

💡 The learning rate is often included in hyperparameter searches. If not performing a systematic search, you might still try a few different learning rates to see if this improves performance. We do not recommend learning rates larger than 1e-2.

n_layers_locked.

int. The number of layers in the pretrained model (starting from the input layer) whose weights are locked and will not update during training. (Default: 2)

💡 After training on a large database, early layers in a CNN have learned to capture features universal to all images (such as edges) and thus do not need to be fine-tuned for another domain. The rule of thumb is that the more data you are using for fine-tuning, the more layers you should unlock. However, in our experiments we find that locking just the first convolutional layer works well, even when fine-tuning with just 50 samples.

channel_combo.

string. Either None or random. If random, camera order is shuffled as batches are generated. (Default: None)

💡 When the first layers of the CNN are locked during fine-tuning, this parameter has no practical effect. However, if training from scratch (train_mode: new), consider channel_combo: random to make the network robust to different view configurations.
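
A sketch of a from-scratch configuration where the shuffling matters (train_mode: new comes from the tip above; this pairing is illustrative):

```yaml
# Hypothetical: training from scratch, so shuffle camera order per batch
# to make the network robust to different view configurations.
train_mode: new
channel_combo: random
```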


rotate.

boolean. Either True or False. When True, image volumes are rotated randomly, in 90-degree increments around the vertical axis, during training as a form of image augmentation. (Default: True)

💡 We leave this set to True for all of our analyses, including training over Rat 7M.

augment_continuous_rotation.

boolean. Either True or False. When True, image volumes are rotated randomly, in continuous angular units, during training as a form of image augmentation. (Default: False)

💡 Thus far, we have set this to False for all of our analyses, although groups implementing volumetric human pose have used this type of augmentation.

augment_rotation_val.

float. Sets the range of continuous rotations used, in degrees. (Default: 5).

augment_hue.

boolean. Either True or False. When True, image volume hue is randomly scaled during training as a form of image augmentation. (Default: False)

💡 Thus far, we have set this to False for all of our analyses, although hue augmentation is common in many image analysis tasks.

augment_hue_val.

float. Sets the range of hue scaling used, as a fraction of the full hue range. (Default: 0.05).

augment_brightness.

boolean. Either True or False. When True, image volume brightness is randomly scaled during training as a form of image augmentation. (Default: False)

💡 Thus far, we have set this to False for all of our analyses, although brightness augmentation is common in many image analysis tasks.

augment_bright_val.

float. Sets the range of brightness scaling used, as a fraction of the full image brightness range. (Default: 0.05).
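
A hedged sketch combining the augmentation switches above, with the booleans flipped on for illustration and the range parameters left at their documented defaults:

```yaml
# Hypothetical: enable all three continuous augmentations.
augment_continuous_rotation: True
augment_rotation_val: 5        # rotation range, in degrees
augment_hue: True
augment_hue_val: 0.05          # fraction of the full hue range
augment_brightness: True
augment_bright_val: 0.05       # fraction of the full brightness range
```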

data_split_seed.

int. Sets the seed for the random number generator used to split data into training and testing partitions. Useful for comparing validation across separate training conditions. (Default: None).

valid_exp.

list[int]. List of indices into the exp list in the io.yaml file indicating which experiments should be reserved entirely for validation. (Default: None).
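
A hedged io.yaml sketch (the label3d_file paths and the indexing convention are assumptions; check your demo io.yaml for the exact exp format):

```yaml
# Hypothetical: three experiments, reserving the third for validation.
exp:
- label3d_file: ./experiment1/label3d_dannce.mat
- label3d_file: ./experiment2/label3d_dannce.mat
- label3d_file: ./experiment3/label3d_dannce.mat
valid_exp: [2]   # index into the exp list (indexing convention assumed)
```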

Prediction Parameters

start_sample.

int. Sample index you want to start predicting from in dannce-predict. Useful for parallelizing prediction over multiple GPUs. (Default: 0)

max_num_samples.

int. If not None, number of samples you will predict over. Useful for parallelizing prediction over multiple GPUs. When None, predicts over all samples. (Default: None)

predict_mode.

string. The backend used for 3D volume generation during prediction. (Default: torch)

  • torch uses PyTorch (fastest).
  • tf uses TensorFlow (use if you do not have PyTorch installed for some reason).
  • numpy does not use the GPU for volume generation (slowest). This is a fallback for when the GPU runs out of memory (OOM).
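
As a sketch of the parallelization mentioned under start_sample and max_num_samples, you could launch two dannce-predict jobs with two io.yaml variants (the sample counts and two-config approach are assumptions):

```yaml
# Hypothetical worker 1 of 2: samples 0-9999 on GPU 0.
# Worker 2 would set gpu_id: "1" and start_sample: 10000.
gpu_id: "0"
start_sample: 0
max_num_samples: 10000
predict_mode: torch
```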