
CrossMoST: Cross-Modal Self-Training: Aligning Images and Pointclouds to learn Classification without Labels

Official implementation of Cross-Modal Self-Training: Aligning Images and Point Clouds to learn Classification without Labels

What is CrossMoST

CrossMoST is an optimization framework that improves the label-free classification performance of a zero-shot 3D vision model by leveraging unlabeled 3D data and their accompanying 2D views. We implement a student-teacher framework that processes 2D views and 3D point clouds simultaneously and generates joint pseudo-labels, which are used to train a classifier and to guide cross-modal feature alignment.
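As a rough illustration of what "joint pseudo-labels" can mean, the sketch below fuses per-class teacher scores from the image branch and the point-cloud branch into a single label. The fusion rule (averaging the two softmax distributions) and the confidence `threshold` are assumptions made for this example, and `joint_pseudo_label` is a hypothetical helper, not a function from this repository.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def joint_pseudo_label(image_logits, point_logits, threshold=0.5):
    """Fuse teacher logits from both modalities into one pseudo-label.

    Assumption for illustration: the two class distributions are averaged,
    and the sample is skipped when the fused confidence is below `threshold`.
    Returns (class_index, confidence), or None if the sample is rejected.
    """
    if len(image_logits) != len(point_logits):
        raise ValueError("both modalities must score the same set of classes")
    p_image = softmax(image_logits)
    p_point = softmax(point_logits)
    joint = [(a + b) / 2 for a, b in zip(p_image, p_point)]
    confidence = max(joint)
    if confidence < threshold:
        return None
    return joint.index(confidence), confidence


if __name__ == "__main__":
    # Both branches favor class 0, so the fused label is accepted.
    print(joint_pseudo_label([3.0, 0.0, 0.0], [2.0, 0.0, 0.0]))
    # The branches disagree, so the fused confidence is low and the sample is skipped.
    print(joint_pseudo_label([2.0, 0.0, 0.0], [1.0, 0.0, 3.0]))
```

When the two modalities disagree, the averaged distribution flattens and the sample falls below the threshold, which is one simple way a shared label can filter out unreliable predictions from either branch.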


Overall Pipeline


[Install environments]

We trained our models on 4 Nvidia V100 GPUs. The code is tested with CUDA==11.0 and pytorch==1.10.1.
conda create -n crossmost python=3.7.15
conda activate crossmost
conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge
pip install -r requirements.txt
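After installation, a quick check can confirm that the pinned package versions are the ones actually importable in the environment. This is a minimal sketch: the `EXPECTED` table is copied from the conda command above, and `installed_version` and `version_matches` are hypothetical helpers written for this example.

```python
import importlib

# Versions pinned by the conda install command above.
EXPECTED = {"torch": "1.10.1", "torchvision": "0.11.2", "torchaudio": "0.10.1"}


def installed_version(name):
    """Return the package's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(name)
    except Exception:  # a broken install can raise more than ImportError
        return None
    return getattr(module, "__version__", None)


def version_matches(installed, expected):
    """Compare versions while ignoring a local build suffix such as '+cu113'."""
    return installed is not None and installed.split("+")[0] == expected


if __name__ == "__main__":
    for name, expected in EXPECTED.items():
        found = installed_version(name)
        status = "ok" if version_matches(found, expected) else "MISMATCH"
        print(f"{name}: expected {expected}, found {found} [{status}]")
```

Run it inside the activated `crossmost` environment; any line marked `MISMATCH` points to a package that should be reinstalled with the pinned version.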

[Download datasets and initialize models, put them in the right paths.]

Download the datasets and initialization models from here. For now, you ONLY need to download "modelnet40_normal_resampled" and "shapenet-55".
The data folder should have the following structure:

./data |
-- co3d |
-- modelnet40_rendered |
-- modelnet40_ply_hdf5_2048 |
-- redwood |
-- scanobjectnn |
-- [dataset].yaml
-- dataset_catalog.json 
-- labels.json 
-- templates.json 
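To catch path mistakes before launching a run, the layout above can be checked with a few lines of Python. This is a sketch under stated assumptions: the directory and file names are taken from the listing above, the `[dataset].yaml` placeholder is checked only as "at least one `.yaml` file exists", and `missing_entries` is a hypothetical helper, not part of this repository.

```python
from pathlib import Path

# Names copied from the folder listing above.
EXPECTED_DIRS = [
    "co3d",
    "modelnet40_rendered",
    "modelnet40_ply_hdf5_2048",
    "redwood",
    "scanobjectnn",
]
EXPECTED_FILES = ["dataset_catalog.json", "labels.json", "templates.json"]


def missing_entries(data_root):
    """Return the expected entries that are absent under data_root."""
    root = Path(data_root)
    missing = [name + "/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    missing += [name for name in EXPECTED_FILES if not (root / name).is_file()]
    # "[dataset].yaml" is a placeholder, so only require some .yaml file.
    if not any(root.glob("*.yaml")):
        missing.append("*.yaml")
    return missing


if __name__ == "__main__":
    problems = missing_entries("./data")
    if problems:
        for entry in problems:
            print("missing:", entry)
    else:
        print("data folder layout looks complete")
```

If you only downloaded a subset of the datasets, trim `EXPECTED_DIRS` to match before running the check.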

Once you have downloaded and unzipped the datasets, change the data paths in the config files.

Then, download the ShapeNet-pretrained backbones and the dVAE for the point transformer.

./checkpoints |
-- dVAE.pth 

[Zero-shot evaluation of Shapenet-pretrained backbones]

Please adjust the script to suit your system; by default it is configured for 4 GPUs. You can also change the output folder in the script.

# The scripts are named after their corresponding 3D backbone.
bash ./

Adjust the bash script accordingly to run evaluations for other datasets.

[Training CrossMoST]

bash ./

You can also run the baseline self-training:

bash ./

Adjust the bash script accordingly to run evaluations for other datasets.

Checkpoints for evaluating Baseline Self-training vs CrossMoST

You can download the checkpoints for CrossMoST and our baselines from here and put them in the corresponding directories.

./checkpoints |
-- dVAE.pth 
-- co3d_baseline |
    -- checkpoint-best.pth |
-- scobjwbg_crossmost |
    -- checkpoint-best.pth

To run the evaluation on the provided checkpoints:

bash ./

You can also run the baseline self-training:

bash ./

Adjust the bash script accordingly to run evaluations for other datasets.


Our code borrows heavily from the MUST repository. If you use our model, please consider citing them as well.


