
Final Poster

First Presentation & Second Presentation

Scan2Cap

This project builds upon the Scan2Cap architecture.

Meshed Memory Transformer

We utilize the Meshed Memory Transformer architecture in this model; therefore, some of the files provided here originate from the Meshed Memory Transformer codebase.

Its license is provided in LICENSE_Meshed_Memory_Transformer.md.

Data

Please refer to Scan2Cap for instructions on obtaining the dataset.
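
After the Scan2Cap data preparation, the /data directory should roughly look as sketched below. This is an assumption based on the Scan2Cap pipeline and the files referenced later in this README; the exact contents depend on the Scan2Cap setup you follow.

    data/
        ScanRefer_filtered_train.json     # training split, referenced in the setup below
        ScanRefer_filtered_val.json       # validation split (assumed from the Scan2Cap pipeline)
        scannet/                          # preprocessed ScanNet scans (assumed from the Scan2Cap pipeline)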

Setup

Please also follow all setup steps provided in Scan2Cap for our project. Make sure to install our dependencies with pip install -r requirements.txt.
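
A minimal setup sketch, assuming the repository URL matches this GitHub page and that the Scan2Cap setup steps have already been completed:

    # clone this repository and enter it
    git clone https://github.com/antoniooroz/scan2cap-mmt.git
    cd scan2cap-mmt

    # install the Python dependencies for this project
    pip install -r requirements.txt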

Additionally, you will need to insert create_scanrefer_filtered_train_small.py into /data and execute it (see the sketch after the list below).

  • This will create a small subset of the training data for evaluation on items from the training set. This does not affect evaluation on items from the validation set.
  • If you wish to use the whole training set during evaluation, just copy ScanRefer_filtered_train.json to ScanRefer_filtered_train_small.json in /data.
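
A minimal sketch of these steps, assuming create_scanrefer_filtered_train_small.py sits in the repository root (a hypothetical starting location) and that /data refers to the data directory of this repository:

    # place the filtering script in data/ and run it to build the small training subset
    cp create_scanrefer_filtered_train_small.py data/
    cd data
    python create_scanrefer_filtered_train_small.py

    # alternative: evaluate on the whole training set by copying the full split
    cp ScanRefer_filtered_train.json ScanRefer_filtered_train_small.json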

Training our model

To reproduce our S2C-MMT results, first train the model for 50 epochs:

    python scripts/train.py --use_multiview --use_normal --use_orientation --use_relation --num_graph_steps 2 --num_locals 10 --batch_size=18 --epoch=50 --lr=0.001 --val_step=2000 --wd=0.0001 --transformer_dropout=0.1 --attention_module_memory_slots=20 --d_model=192 --transformer_d_ff=1024 --no_beam_search --transformer_d_k=32 --transformer_d_v=32 --no_encoder

Next, train the model for a further 5 epochs with the following settings:

    python scripts/train.py --use_multiview --use_normal --use_orientation --use_relation --num_graph_steps 2 --num_locals 10 --batch_size=18 --epoch=50 --lr=0.0001 --val_step=100 --wd=1e-4 --transformer_dropout=0.1 --attention_module_memory_slots=20 --d_model=192 --transformer_d_ff=1024 --no_beam_search --transformer_d_k=32 --transformer_d_v=32 --use_checkpoint=<model_name> --no_encoder --load_best

You will find your trained model in outputs/<model_name>.

Reproducing results

We provide a pretrained model in outputs/s2c_mmt to reproduce the results. You can also follow the steps in "Training our model" to train the model yourself.

Evaluating caption performance at 0.5 IoU

    python scripts/eval.py --use_multiview --use_normal --use_relation --num_graph_steps 2 --num_locals 10 --batch_size=8 --transformer_dropout=0 --attention_module_memory_slots=20 --d_model=192 --transformer_d_ff=1024 --transformer_d_k=32 --transformer_d_v=32 --folder <model_name> --min_iou=0.5 --eval_caption --beam_size 2 --no_encoder
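
For example, to evaluate the provided pretrained model, pass its folder name; this assumes --folder expects the directory name under outputs/, as the training section suggests:

    python scripts/eval.py --use_multiview --use_normal --use_relation --num_graph_steps 2 --num_locals 10 --batch_size=8 --transformer_dropout=0 --attention_module_memory_slots=20 --d_model=192 --transformer_d_ff=1024 --transformer_d_k=32 --transformer_d_v=32 --folder s2c_mmt --min_iou=0.5 --eval_caption --beam_size 2 --no_encoder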

Evaluating caption performance at 0.25 IoU

    python scripts/eval.py --use_multiview --use_normal --use_relation --num_graph_steps 2 --num_locals 10 --batch_size=8 --transformer_dropout=0 --attention_module_memory_slots=20 --d_model=192 --transformer_d_ff=1024 --transformer_d_k=32 --transformer_d_v=32 --folder <model_name> --min_iou=0.25 --eval_caption --beam_size 2 --no_encoder

Evaluating detection performance

    python scripts/eval.py --use_multiview --use_normal --use_relation --num_graph_steps 2 --num_locals 10 --batch_size=8 --transformer_dropout=0 --attention_module_memory_slots=20 --d_model=192 --transformer_d_ff=1024 --transformer_d_k=32 --transformer_d_v=32 --folder <model_name> --eval_detection --no_beam_search --no_encoder

License

Scan2Cap is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.

Meshed-Memory Transformer follows the BSD-3-Clause License.

Please refer to Scan2Cap and Meshed Memory Transformer for licensing details.
