Skip to content

sailab-code/unsupervised-learning-feature-flow

Repository files navigation

CODE REPOSITORY CONTENTS

The datasets needed for reproducing experiments can be downloaded from the benchmark proposed by Tiezzi et al., which can be found at this link in the data folder: the content of the folder should be downloaded and inserted into the data folder of this repository. Similarly, for data of real-world videos, you can find prepared streams here.

confeat :               folder containing model definitions
data :                  folder containing the rendered streams
lve :                   source folder for handling the video processing and model creation and training  
run_conj.py :           experiments runner
run_conj_sparse.py :    experiments runner to be used when ground truth is provided in a small subset of frames (real world videos)
best_cmosfet_runs.txt : command lines (and parameters) to reproduce the main results

REPRODUCE PAPER RESULTS

We tested our code with PyTorch 1.10. Please install the other required dependencies by running:

pip install -r requirements.txt

In the best_cmosfet_runs.txt file there are the command lines (hence, the experiments parameters) required to reproduce the experiments of the main results (Table 1).

RUNNING EXPERIMENTS

We provide Python scripts to easily test the proposed model (run_conj.py is to be used when ground truth is provided for every frame, run_conj_sparse.py is to be used when ground truth is provided in a small subset of frames). The PyTorch device is chosen through the --device argument (cpu, cuda:0, cuda:1, etc.).

usage: run_conj.py [-h] [--laps_unsup LAPS_UNSUP] [--laps_sup LAPS_SUP] [--laps_metrics LAPS_METRICS] 
             [--step_size_features STEP_SIZE_FEATURES] [--step_size_displacements STEP_SIZE_DISPLACEMENTS] 
             [--force_gray {yes,no}]  [--dataset DATASET] 
             [--seed SEED] [--crops CROPS] [--flips FLIPS] [--jitters JITTERS]
             [--lambda_c_lower LAMBDA_C_LOWER] [--lambda_c_upper LAMBDA_C_UPPER] [--lambda_r LAMBDA_R]  [--lambda_s LAMBDA_S] 
             [--feature_planes FEATURE_PLANES]  [--device DEVICE] [--features_block FEATURES_BLOCK] [--displacement_block DISPLACEMENT_BLOCK]
             [--wandb WANDB]
             [--num_pairs NUM_PAIRS] [--lambda_sim LAMBDA_SIM]
             [--similarity_threshold SIMILARITY_THRESHOLD] [--dissimilarity_threshold DISSIMILARITY_THRESHOLD] [--moving_threshold MOVING_THRESHOLD] 
             [--sampling_type {plain,motion,features,motion_features}] [--kept_pairs_perc KEPT_PAIRS_PERC] 
             [--simdis_loss_tau SIMDIS_LOSS_TAU] [--teacher {yes,no}] [--teacher_ema_weight TEACHER_EMA_WEIGHT] 
             [--simdis_neg_avg SIMDIS_NEG_AVG]

Argument description/mapping with respect to the paper notation:

    --laps_unsup :  number of unsupervised laps where the coherence losses are minimized
    --laps_sup : number of laps on which the supervised templates are provided
    --laps_metrics : number of laps on which the metrics are computed (here the model weight are frozen, no learning is happening)
    --step_size_features : learning rate for the features branch
    --step_size_displacements : learning rate for the displacements branch
    --crops :  number of random crop augmentations
    --jitters :  number of jitters augmentations
    --flips :  number of horizontal flips  augmentations
    --lambda_c_lower : \lambda_{low} in the paper
    --lambda_c_upper : \lambda_{cur} in the paper
    --lambda_c_skip : \lambda_{skip} in the paper
    --lambda_r : \lambda_r in the paper
    --lambda_sim : \beta_f in the paper
    --lambda_s : \lambda_s in the paper 
    --simdis_loss_tau : \tau in the paper  
    --num_pairs: maximum number of points for the sampling procedure (corresponds to \eta * \eta from the paper)  
    --max_supervisions : number of supervisions per object
    --force_gray : if "yes", it corresponds to the "BW" of the paper. "no" requires an RGB stream
    --feature_planes : output dimension (number of features) of the features neural branch
    --displacement_block : architecture for the motion prediction branch [default: resunetblocknolastskip]
    --features_block : architecture for the feature branch [default: resunetblock_bias]
    --similarity_threshold : \tau_p in the paper
    --dissimilarity_threshold : \tau_n in the paper
    --moving_threshold : \tau_m in the paper
    --block_scheduling : T_{sched} in the paper
    --dataset  : specify the input stream
    --sampling_type : the nature of the sampling in {plain,motion,features,motion_features} (see ablation studies)
    --kept_pairs_perc : \aleph in the paper
    --teacher : activate the EMA net (default yes)
    --teacher_ema_weight  : \xi in the paper
    --seed : specify the seed

COLLECTING THE FINAL METRICS

The final metrics are dumped in the model_folder/final_metrics.json file.

This file contains a dictionary with multiple metrics. The metrics that are reported in Table 1 of the paper are under the key f1_window_whole_global. The F1-based on attention trajectory can be found under the key f1_window_foa_global.

COMPETITORS

For the competitors performances we re-runned the code by Tiezzi et al. We slightly updated the code to support the MOCO and PixPro models (pre-trained weights can be found in the respective repositories).

NOTICE: PyTorch does not guarantee Reproducibility is not guaranteed by PyTorch across different releases, platforms, hardware. Moreover, determinism cannot be enforced due to use of PyTorch operations for which deterministic implementations do not exist (e.g. bilinear upsampling).

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published