yangyangxu0/DeMT

DeMT

This repo is the official implementation of "DeMT" as well as its follow-ups. It currently includes code and pretrained models for multi-task dense prediction on NYUD-v2 and PASCAL-Context.

Updates

02/10/2023

  1. We will release the code of DeMT at the end of February.

  2. Merged Code.

  3. Released a series of models. Please look into the data scaling paper for more details.

02/07/2023

News:

  1. The Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI 2023) will be held in Washington, DC, USA, from February 7-14, 2023.

02/01/2023

  1. DeMT was accepted by AAAI 2023.

Introduction

DeMT (Deformable Mixer Transformer for Multi-Task Learning of Dense Prediction) is initially described in the arXiv paper. It is built on a simple and effective encoder-decoder architecture: a deformable mixer encoder and a task-aware transformer decoder. First, the deformable mixer encoder contains two types of operators: a channel-aware mixing operator that allows communication among different channels (i.e., efficient channel location mixing), and a spatial-aware deformable operator that uses deformable convolution to efficiently sample more informative spatial locations (i.e., deformed features). Second, the task-aware transformer decoder consists of a task interaction block and a task query block. The former captures task-interaction features via self-attention. The latter combines the deformed features with the task-interacted features in a query-based transformer to generate the task-specific feature used for each task's prediction.
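As a rough illustration of the decoder described above (not the official implementation), the following NumPy sketch shows the two blocks with single-head attention. All shapes, function names, and the single-head simplification are assumptions for the sketch only:

```python
# Minimal sketch of the task-aware decoder idea, assuming single-head
# scaled dot-product attention and toy shapes (not the official DeMT code).
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def task_aware_decoder(deformed_feats, task_feats):
    # Task interaction block: self-attention over the stacked per-task
    # features, letting tasks exchange information.
    interacted = attention(task_feats, task_feats, task_feats)
    # Task query block: each task-interacted feature queries the deformed
    # features from the encoder to produce a task-specific feature.
    return attention(interacted, deformed_feats, deformed_feats)

# Toy example: 4 tasks, 16 spatial tokens, 8 channels.
rng = np.random.default_rng(0)
deformed = rng.normal(size=(16, 8))
tasks = rng.normal(size=(4, 8))
out = task_aware_decoder(deformed, tasks)
print(out.shape)  # one task-specific feature vector per task
```

In the real model each output feature would feed a task-specific prediction head (SemSeg, Depth, Normal, Boundary).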

DeMT achieves strong performance on PASCAL-Context (75.33 mIoU for semantic segmentation and 63.11 mIoU for human part segmentation on test) and on NYUD-v2 (54.34 mIoU for semantic segmentation on test), surpassing previous models by a large margin.


Main Results with Pretrained Models

DeMT on NYUD-v2 dataset

| model | backbone | #params | FLOPs | SemSeg | Depth | Normal | Boundary | checkpoint | log |
|---|---|---|---|---|---|---|---|---|---|
| DeMT | HRNet-18 | 4.76M | 22.07G | 39.18 | 0.5922 | 20.21 | 76.4 | Google Drive | log |
| DeMT | Swin-T | 32.07M | 100.70G | 46.36 | 0.5871 | 20.60 | 76.9 | Google Drive | log |
| DeMT (xd=2) | Swin-T | 36.6M | - | 47.45 | 0.5563 | 19.90 | 77.0 | Google Drive | log |
| DeMT | Swin-S | 53.03M | 121.05G | 51.50 | 0.5474 | 20.02 | 78.1 | Google Drive | log |
| DeMT | Swin-B | 90.9M | 153.65G | 54.34 | 0.5209 | 19.21 | 78.5 | Google Drive | log |
| DeMT | Swin-L | 201.64M | - | 56.94 | 0.5007 | 19.14 | 78.8 | Google Drive | log |

DeMT on PASCAL-Context dataset

| model | backbone | SemSeg | PartSeg | Sal | Normal | Boundary |
|---|---|---|---|---|---|---|
| DeMT | HRNet-18 | 59.23 | 57.93 | 83.93 | 14.02 | 69.80 |
| DeMT | Swin-T | 69.71 | 57.18 | 82.63 | 14.56 | 71.20 |
| DeMT | Swin-S | 72.01 | 58.96 | 83.20 | 14.57 | 72.10 |
| DeMT | Swin-B | 75.33 | 63.11 | 83.42 | 14.54 | 73.20 |

Citing DeMT multi-task method

@inproceedings{xyy2023DeMT,
  title={DeMT: Deformable Mixer Transformer for Multi-Task Learning of Dense Prediction},
  author={Xu, Yangyang and Yang, Yibo and Zhang, Lefei},
  booktitle={Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI)},
  year={2023}
}

Getting Started

Install

conda install pytorch==1.7.0 torchvision==0.8.1 cudatoolkit=10.1 -c pytorch
conda install pytorch-lightning==1.1.8 -c conda-forge
conda install opencv==4.4.0 -c conda-forge
conda install scikit-image==0.17.2

Data Prepare

wget https://data.vision.ee.ethz.ch/brdavid/atrc/NYUDv2.tar.gz
wget https://data.vision.ee.ethz.ch/brdavid/atrc/PASCALContext.tar.gz
tar xfvz ./NYUDv2.tar.gz 
tar xfvz ./PASCALContext.tar.gz

Train

To train DeMT model:

python ./src/main.py --cfg ./config/t-nyud/swin/siwn_t_DeMT.yaml --datamodule.data_dir $DATA_DIR --trainer.gpus 8

Evaluation

  • When training finishes, the boundary predictions are saved in the following directory: ./logger/NYUD_xxx/version_x/edge_preds/.
  • Boundary detection is evaluated with the MATLAB-based SEISM repository to obtain the optimal-dataset-scale F-measure (odsF) scores.
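For intuition only (SEISM itself is the tool to use), odsF picks the single edge-confidence threshold, shared across the whole dataset, that maximizes the F-measure F = 2PR/(P+R) computed from dataset-level precision and recall. A minimal sketch, with made-up precision/recall pairs:

```python
# Sketch of the odsF idea: best dataset-level F-measure over a shared
# threshold. The (precision, recall) pairs below are placeholders, not
# real evaluation numbers.
def f_measure(p, r):
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)

def ods_f(pr_per_threshold):
    # pr_per_threshold: dataset-level (precision, recall) at each threshold.
    return max(f_measure(p, r) for p, r in pr_per_threshold)

print(round(ods_f([(0.9, 0.5), (0.8, 0.7), (0.6, 0.8)]), 4))  # 0.7467
```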

Acknowledgement

This repository is based on ATRC. Thanks to ATRC!
