
# AnyTrack: Track any features

Track anything with minimal annotations using multiple views (synchronized video).

Based on MBW: Multiview Bootstrapping in the Wild (NeurIPS 2022). Citation details are at the end of this README.

AnyTrack provides a scalable way to generate labeled data in the wild. Through its novel self-supervision technique, it lowers the cost of data collection for a wide range of machine-learning-driven computer vision applications.


## Setup

  1. To install libmamba (a faster solver for conda) and set it as the default solver, run the following commands:

    conda update -n base conda
    conda install -n base conda-libmamba-solver
    conda config --set solver libmamba
    

    If you are new to `conda` package and environment management, please visit

    https://conda.io/projects/conda/en/latest/user-guide/getting-started.html

  2. Create the conda environment and activate it:

    conda env create -f environment_gpu.yml
    conda activate anytrack
    
  3. Fetch the pre-trained flow, detector, and MV-NRSfM models from Google Cloud using:

    gcloud storage cp gs://aiml-shop-anytrack-data/models.zip .
    unzip models.zip
    rm models.zip md5sums.txt
    
  4. Download the data and unzip it in the data directory.

    This project makes use of the Google Cloud CLI (Command Line Interface).

    To familiarise yourself with this tool, please visit: https://cloud.google.com/sdk/docs/install-sdk

    gcloud storage cp gs://aiml-shop-anytrack-data/data.zip .
    unzip data.zip
    rm data.zip
    

    The final directory after retrieving pre-trained models and sample data should look like this:

    ${anytrack}
    |-- data
    |   |-- Chimpanzee
    |   |   |-- annot/
    |   |   `-- images/
    |   |-- Clownfish
    |   |   |-- annot/
    |   |   `-- images/
    |   |-- Colobus_Monkey
    |   |   |-- annot/
    |   |   `-- images/
    |   |-- Fish
    |   |   |-- annot/
    |   |   `-- images/
    |   |-- Seahorse
    |   |   |-- annot/
    |   |   `-- images/
    |   `-- Tiger
    |       |-- annot/
    |       `-- images/
    `-- models
        |-- detector/
        |-- flow/
        `-- mvnrsfm/

 

## Unit tests

./scripts/unit_tests.sh

## Training (generate labels with AnyTrack)

./scripts/train.sh

Note: the scripts expect the view folders to be named CAM_1, CAM_2, ....

## Evaluation and visualization

./scripts/eval.sh
./scripts/visualize.sh    

## Demos

  1. Follow tutorial.ipynb for a walkthrough that goes from capturing videos of a robot dog from multiple views to training and visualizing the labels.

  2. Run visualize_demo.ipynb to explore visualizations of the labels in the provided Jupyter notebook. We provide tools to visualize the 2D predictions (along with a confidence flag of Accept or Reject), and the reconstructed 3D keypoints can be viewed in an interactive plotly tool.

jupyter notebook --ip 0.0.0.0 --no-browser

Note: the notebooks must be run in the conda environment described in Setup.
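For orientation, here is a minimal, self-contained sketch of the kind of interactive 3D keypoint plot that plotly enables. The coordinates and edge list below are made up for illustration; the notebook's own tooling is the supported path.

```python
import plotly.graph_objects as go

# Made-up 3D keypoints and skeleton edges, purely for illustration.
points = {"left_ear": (-1, 0, 1), "right_ear": (1, 0, 1), "nose": (0, 1, 0)}
edges = [("left_ear", "nose"), ("right_ear", "nose"), ("left_ear", "right_ear")]

fig = go.Figure()
xs, ys, zs = zip(*points.values())
fig.add_trace(go.Scatter3d(x=xs, y=ys, z=zs, mode="markers+text",
                           text=list(points), name="joints"))
for a, b in edges:
    (xa, ya, za), (xb, yb, zb) = points[a], points[b]
    # Draw each skeleton edge as its own line trace.
    fig.add_trace(go.Scatter3d(x=[xa, xb], y=[ya, yb], z=[za, zb],
                               mode="lines", showlegend=False))
fig.show()
```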

## Create a custom dataset

  1. Create a JSON file, as shown below, at configs/joints/Corgi.json:
    {
        "num_joints": 3,
        "joint_connections": [
            [0, 2],
            [1, 2],
            [0, 1]
        ],
        "joints_name": ["left_ear", "right_ear", "nose"],
        "range_scale": 1,
        "R": [
            [1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]
        ]
    }
  • Each entry in joint_connections is a line drawn between two joint indices, where an index refers to a position in joints_name. In this example, left_ear is the first joint (index 0) and nose is the last (index 2).

  • The first joint_connection is from left_ear (index 0) to nose (index 2).

  • R is the rotation (and scale) component of a 4x4 transformation matrix, used by the visualization as rigid_rotation. When left as the identity, it has no effect.

  • This example has three joints, as indicated by num_joints.

  • This example also happens to have three joint_connections, but the list can be any length; it is typically used to draw the skeleton of an articulated animal or piece of equipment.

  • The joint connections can also trace soft tissue, as in the clownfish example.
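As a quick sanity check on a new config, here is a minimal sketch. The helper below is our own illustration, not an AnyTrack utility; it loads a joints config and verifies it is internally consistent.

```python
import json

def check_joints_config(path):
    # Our own sanity check, not an AnyTrack utility.
    with open(path) as f:
        cfg = json.load(f)
    names = cfg["joints_name"]
    assert cfg["num_joints"] == len(names), "num_joints must match joints_name"
    for a, b in cfg["joint_connections"]:
        # Every connection must reference valid indices into joints_name.
        assert 0 <= a < len(names) and 0 <= b < len(names)
        print(f"{names[a]} -> {names[b]}")

check_joints_config("configs/joints/Corgi.json")
```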

  2. Create a dataset of frames extracted from multi-view synchronized videos of an object, laid out as shown below (one way to do the extraction is sketched after the tree). Check this tutorial on how to create a custom dataset.
    Corgi
    |-- CAM_1/
    |   |-- 0001.jpg
    |   |-- 0002.jpg
    |   |-- 0003.jpg
    |   `-- ...
    |-- CAM_2/
    |   |-- 0001.jpg
    |   |-- 0002.jpg
    |   |-- 0003.jpg
    |   `-- ...
    `-- CAM_3/
        |-- 0001.jpg
        |-- 0002.jpg
        |-- 0003.jpg
        `-- ...
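One way to produce this layout, assuming one synchronized video file per view, is a short OpenCV loop. This is our own sketch, not an AnyTrack script, and the video filenames are placeholders.

```python
import os
import cv2

def extract_frames(videos, out_root):
    # videos[i] becomes CAM_{i+1}; frames are written as 0001.jpg, 0002.jpg, ...
    for cam_idx, video_path in enumerate(videos, start=1):
        cam_dir = os.path.join(out_root, f"CAM_{cam_idx}")
        os.makedirs(cam_dir, exist_ok=True)
        cap = cv2.VideoCapture(video_path)
        frame_idx = 1
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            cv2.imwrite(os.path.join(cam_dir, f"{frame_idx:04d}.jpg"), frame)
            frame_idx += 1
        cap.release()

extract_frames(["view1.mp4", "view2.mp4", "view3.mp4"], "Corgi")
```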
  3. Next, pick images from this dataset to annotate. Run the command below to generate a folder called ann_images containing the images to be annotated.
python scripts/create_dataset.py --data /home/data/corgi_filtered --name Corgi --percent 2

This will create a data/Corgi folder.

  4. Annotate each view in the folder data/Corgi/ann_images separately in an annotation tool such as Label Studio. The labels should match the joints_name entries defined in step 1.

    Export the annotations in JSON-MIN format and rename each export to match its view name (e.g., CAM_1.json); a sketch for sanity-checking an export follows.
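The check below is our own helper, not part of AnyTrack. The exact key holding the keypoints in a Label Studio JSON-MIN export depends on your labeling config, so the sketch simply scans all list-valued fields for keypointlabels entries and compares them against joints_name.

```python
import json

def exported_labels(path):
    # Collect every keypoint label found in a Label Studio JSON-MIN export.
    with open(path) as f:
        tasks = json.load(f)
    labels = set()
    for task in tasks:
        for value in task.values():
            if isinstance(value, list):
                for item in value:
                    if isinstance(item, dict):
                        labels.update(item.get("keypointlabels", []))
    return labels

joints = set(json.load(open("configs/joints/Corgi.json"))["joints_name"])
assert exported_labels("data/Corgi/annot/CAM_1.json") <= joints, "unknown labels"
```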

  5. Copy the JSON files to data/Corgi/annot.


  6. To visualize the annotated images, run:

    Note: you need to install Voxel51's FiftyOne for this step.

    python scripts/visualize.py --images data/Corgi/images/CAM_1 --annot data/Corgi/annot/CAM_1.json
    
    
  7. To create the pickle files, run:

    python scripts/create_pckl.py --data path/to/data --joints configs/joints/Corgi.json --name Corgi
    
    

    --data: path to the data folder created in the steps above

    --name: name of the dataset

    --joints: path to the JSON file with joint information

This will create the .pkl files required to train on the dataset.

  8. Run the command below to verify that everything is in place:
python common/preprocess.py --dataset Corgi --percentage_if_gt 10 --host localhost --port 8080
  9. Train the model on the dataset:
source ./scripts/train.sh

 

## Citation

If you use our code, dataset, or models in your research, please cite:


@inproceedings{dabhi2022mbw,
	title={MBW: Multi-view Bootstrapping in the Wild},
	author={Dabhi, Mosam and Wang, Chaoyang and Clifford, Tim and Jeni, Laszlo and Fasel, Ian and Lucey, Simon},
	booktitle={Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
	year={2022},
	ee = {https://openreview.net/forum?id=i1bFPSw42W0},
	organization={NeurIPS}
}


@inproceedings{dabhi2021mvnrsfm,
	title={High Fidelity 3D Reconstructions with Limited Physical Views},
	author={Dabhi, Mosam and Wang, Chaoyang and Saluja, Kunal and Jeni, Laszlo and Fasel, Ian and Lucey, Simon},
	booktitle={2021 International Conference on 3D Vision (3DV)},
	year={2021},
	ee = {https://ieeexplore.ieee.org/abstract/document/9665845},
	organization={IEEE}
}

## Licensing

AnyTrack is available free of charge for non-commercial internal research use by academic institutions or not-for-profit organisations only. Please see the license for further details. To the extent permitted by applicable law, your use is at your own risk and our liability is limited. Interested in a commercial license? For commercial queries, please email aimlshop@adelaide.edu.au with the subject line "AnyTrack Commercial License".

This is an AIML Shop project.