Repository for the "Multi-Modal World Models in Autonomous Driving" project in the 3D Vision course at ETHZ, 2025. The project has two parts: depth distillation for IJEPA features has been conducted through ground-truth depth on one hand, and through DepthFM features on the other.
This main branch contains the code and setup instructions for ground-truth depth distillation, while the branch ijepa_image_features contains the code and setup instructions for DepthFM distillation. The two setups differ, and the most interesting results come from the DepthFM distillation.
In order to run the latest training with ground-truth depth prediction with a DPT head, you need to clone this repository and download the dataset(s). First, start by cloning this repo:

```shell
cd /path/to/desired/workspace
git clone git@github.com:Juan5713/MM_WM_AD.git
```

To keep things clean, it is good to remove git-related files from the repo, so as to freeze it and avoid git issues with nested repos. Run the following if you want to do so:
```shell
cd MM_WM_AD # now in MM_WM_AD
rm -rf .git .github .gitignore
```

We will now grab the NYUv2 dataset from https://cs.nyu.edu/~fergus/datasets/nyu_depth_v2.html. In the download section, select the labeled dataset (this will download a .mat file). Back in the repository, create a datasets folder and a subfolder for nyuv2:
```shell
# now back in MM_WM_AD
mkdir datasets && cd datasets && mkdir nyuv2
```

Then move the downloaded file into this subfolder. Your folder structure should now look something like this:
```
MM_WM_AD/
|----- datasets/
| |----- nyuv2/
| | |----- nyu_depth_v2_labeled.mat
| | |...
| |...
|----- ijepa/
| |...
|----- dataloaders/
| |...
|----- jobscripts/
| |...
|----- notebooks/
| |...
|----- scripts/
| |...
|----- utils/
| |...
|...
```

Keep in mind that the datasets folder has to be set up manually, because the datasets take up a considerable amount of space and it is not viable to keep them in a repo. The ijepa folder contains the adapted code from the original IJEPA (https://github.com/facebookresearch/ijepa/tree/main).
If you are running this on the cluster, you will need to make some changes to the IJEPA configuration files. In particular, open the file MM_WM_AD/ijepa/configs/in1k_vith14_ep300_GTDEPTH.yaml and replace the scratch_dir argument with the absolute path to your scratch directory, i.e. /cluster/scratch/[your-eth-id]. Moreover, adapt the wandb logging parameters in the same config file under the wandb argument.
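The relevant entries might look like the following excerpt (the wandb sub-keys shown here are illustrative assumptions; check the actual config file for its exact field names):

```yaml
# in1k_vith14_ep300_GTDEPTH.yaml (excerpt; adapt to the actual file)
scratch_dir: /cluster/scratch/[your-eth-id]
wandb:
  project: mm-wm-ad            # hypothetical value; use your own project
  entity: [your-wandb-entity]  # hypothetical key; use your own entity
```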
Now, ssh into Euler and make a directory in your cluster home directory, named as you please. For example:

```shell
ssh [your-eth-id]@euler.ethz.ch # ssh into the cluster
cd ~
mkdir 3dvision
```

Now copy over the datasets, ijepa, jobscripts, dataloaders, utils and scripts directories to the cluster via scp. Run the following to copy the required directories:
```shell
exit # in case you were still ssh'ed into the cluster
cd /path/to/MM_WM_AD
scp -r scripts [your-eth-id]@euler.ethz.ch:/cluster/home/[your-eth-id]/3dvision/
```

Repeat the scp command, replacing scripts with each of the other directory names. Moreover, you will need to head to your scratch directory and create directories to store outputs and checkpoints from the training process:

```shell
cd $SCRATCH
mkdir ijepa && cd ijepa && mkdir predictions
```

Finally, make sure that you run pip install -U -r requirements.txt. For this, the modules stack/2024-06, python/3.11.6 and eth-proxy need to be loaded, similarly to what is done in the jobscripts.
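Instead of repeating the scp command by hand, the per-directory copies above can be scripted in a small loop. This is a sketch that only prints the commands; remove the leading `echo` to actually run the transfers:

```shell
# Print one scp command per directory to copy to the cluster
# (drop the `echo` to actually perform the transfers).
for d in datasets ijepa jobscripts dataloaders utils scripts; do
  echo scp -r "$d" "[your-eth-id]@euler.ethz.ch:/cluster/home/[your-eth-id]/3dvision/"
done
```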
This should complete the required setup for running on the cluster. Now, to run the finetuning, execute:

```shell
ssh [your-eth-id]@euler.ethz.ch # ssh back into the cluster
cd ~/3dvision/jobscripts
sbatch < gt_depth.sh
```

To monitor the running job, execute watch squeue, which updates the queue status every 2 seconds by default; exit the watch window with Ctrl+C. The output of the run will be stored in the jobscripts folder, and the training graphs will appear in your selected wandb project.
DISCLAIMER: Running these training scripts requires a large amount of memory and processing power; we recommend running on cluster resources where possible.
If you are running this locally, you will need to make some changes to the IJEPA configuration files, similarly to the cluster setup. In particular, open the file MM_WM_AD/ijepa/configs/in1k_vith14_ep300_GTDEPTH.yaml and replace the scratch_dir argument with the absolute path to the directory where you want to store output depth maps and checkpoints. Moreover, adapt the wandb logging parameters in the same config file under the wandb argument.
We can skip the step of copying folders that we would have done on the cluster. Next, move to the directory you indicated earlier for storing output depth maps and checkpoints, and make the output folders there:

```shell
cd /path/to/desired/out/dir
mkdir ijepa && cd ijepa && mkdir predictions
```

Finally, make sure that you run pip install -r requirements.txt inside a virtual environment or another software environment manager of your choice. This should complete the setup for running locally. Next, launch the training script as follows:
```shell
cd /path/to/MM_WM_AD
PYTHONPATH=./ijepa:./ python3 ./scripts/gt_depth_pred.py --fname ./ijepa/configs/in1k_vith14_ep300_GTDEPTH.yaml --devices cuda:0
```