
DETR tutorials to use it on custom data :) #428

NielsRogge opened this issue Aug 25, 2021 · 21 comments

NielsRogge commented Aug 25, 2021

A few months ago, I added DETR to HuggingFace Transformers 🤗 I've replaced the original torchvision backbones (ResNets) so that you can use any backbone available in the timm repository (like EfficientNets, MobileNets, etc.) 🥳 The model is implemented using the same API as other HuggingFace models like BERT: DetrModel is the encoder-decoder Transformer without any head on top, DetrForObjectDetection adds the bounding box and class labels classifier heads on top, and DetrForSegmentation adds the mask head on top.

The model weights are hosted on the HuggingFace hub. The documentation can be found here: https://huggingface.co/transformers/model_doc/detr.html
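For example, loading the COCO-pretrained object detection checkpoint from the hub is a one-liner (a minimal sketch; facebook/detr-resnet-50 is the checkpoint also referenced later in this thread):

from transformers import DetrForObjectDetection

# Downloads the COCO-pretrained weights from the HuggingFace hub
model = DetrForObjectDetection.from_pretrained("facebook/detr-resnet-50")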

I've made 5 different notebooks illustrating how to use DETR for both inference and training on custom data, covering both object detection and panoptic segmentation. You can find them here: https://github.com/NielsRogge/Transformers-Tutorials

I've also made a notebook for evaluating the model on COCO. I hope it helps people use DETR easily!
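For reference, COCO-style evaluation usually boils down to pycocotools' COCOeval; here's a minimal sketch, assuming you've exported the model's predictions in the COCO results format (the file names are placeholders):

from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# Ground truth annotations and model predictions (COCO result format)
coco_gt = COCO("annotations/instances_val2017.json")
coco_dt = coco_gt.loadRes("predictions.json")

evaluator = COCOeval(coco_gt, coco_dt, iouType="bbox")
evaluator.evaluate()
evaluator.accumulate()
evaluator.summarize()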

Issues for which this might be helpful:

#421
#419
#378
#366
#341
#307
#205
#190
#179
#153
#152
#148
#141
#125
#111
#109
#89
#84
#53
#40
#28
#21
#14
#9

PS: the most epic thing was Yann LeCun himself tweeting about this: https://twitter.com/ylecun/status/1405640394143113219

@ducvuuit

[screenshot of a training run stalled at 5%]
When I train on my dataset, at epoch 0 it always stops at 5% without any error; RAM and memory usage stay within limits during training.
Does anyone have the same problem as me?

@NielsRogge (Author)

Hi, I've just re-run my notebook and it still works fine. That's weird. Perhaps restart the runtime and run it again?


ducvuuit commented Sep 4, 2021

Hi @NielsRogge, I fixed the error above by readjusting "max_steps". But training takes too long: one epoch takes over 7 hours, while the official source code takes only about 30 minutes per epoch. I adjusted the batch size, but it still takes over 7 hours. How can I solve this problem?
Thank you in advance.

@NielsRogge (Author)

Yes, it takes 30 minutes per epoch if you have 8 GPUs, as stated in the README. On a single GPU, it will take a bit longer. ;)

@mytk2012

It seems that the authors advise people to fine-tune the model to apply it to a custom dataset. If we have enough data (>10k images), do we still need to fine-tune it? It's weird that we must fine-tune the DETR model, while a CNN model can be applied to COCO-format data without fine-tuning.

@NielsRogge (Author)

Yes, I would fine-tune the whole model if your dataset is about 10k images. Only the class labels classifier would need to be trained from scratch.


ver0z commented Nov 18, 2021

Do you know if it's possible to use DeepSpeed with DETR? It could help speed up the training, couldn't it?

@NielsRogge (Author)

Hi,

DeepSpeed is mostly meant to fit very big models on one or more GPUs. Its use case is not to speed up training, IIRC.


ver0z commented Nov 18, 2021

So if I use a large batch size, could it be useful in this case?

@ohhenrylee

During training, is the CNN backbone trained at the same time as DETR or will pre-training be required?

@NielsRogge (Author)

The CNN is trained at the same time as the encoder-decoder Transformer; however, one starts from a pre-trained CNN and a randomly initialized encoder-decoder Transformer. One also typically uses different learning rates for the CNN backbone and the Transformer.
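A minimal sketch of that two-learning-rate setup, assuming a DetrForObjectDetection model whose backbone parameter names contain "backbone" (the 1e-4/1e-5 values are the defaults from the DETR paper):

import torch
from transformers import DetrForObjectDetection

model = DetrForObjectDetection.from_pretrained("facebook/detr-resnet-50")

# Lower learning rate for the pre-trained CNN backbone,
# higher learning rate for the randomly initialized Transformer
backbone_params = [p for n, p in model.named_parameters() if "backbone" in n]
other_params = [p for n, p in model.named_parameters() if "backbone" not in n]

optimizer = torch.optim.AdamW(
    [
        {"params": backbone_params, "lr": 1e-5},
        {"params": other_params, "lr": 1e-4},
    ],
    weight_decay=1e-4,
)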


NielsRogge commented Oct 16, 2023

Hi,

Replacing the backbone can be done as follows:

from transformers import ConvNextConfig, DetrConfig, DetrForObjectDetection

# Expose feature maps from all four ConvNeXt stages and bypass timm
backbone_config = ConvNextConfig(out_features=["stage1", "stage2", "stage3", "stage4"])
config = DetrConfig(backbone_config=backbone_config, use_timm_backbone=False)

model = DetrForObjectDetection(config)

This will work out-of-the-box for convolutional backbones like ConvNeXt:

import torch

# Smoke test with a random batch of one 224x224 image
pixel_values = torch.randn(1, 3, 224, 224)
outputs = model(pixel_values)

For a vision transformer-based backbone, I'd recommend using ViTDet:

from transformers import VitDetConfig, DetrConfig, DetrForObjectDetection

backbone_config = VitDetConfig(out_features=["stage1", "stage2", "stage3", "stage4"])
config = DetrConfig(backbone_config=backbone_config, use_timm_backbone=False)

model = DetrForObjectDetection(config)


truong2710-cyber commented Dec 1, 2023

Hi @NielsRogge, I have a question.
If I trained a DETR model with your notebook (in HuggingFace), can I convert the checkpoint back to the format of the original DETR repo https://github.com/facebookresearch/detr?
Thanks.

@truong2710-cyber

Btw, I have a task in which I have to train DETR on a remapped COCO dataset containing only 2 classes (class 1: person; class 2: vehicle, corresponding to 3 classes in the original COCO: car, truck, bus). What should I do?

@NielsRogge (Author)

@truong2710-cyber sure, you could do that. This is the conversion script used to rename the keys of the state dictionary from the original repo to the HF format, so you could also do it the other way around.
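A rough sketch of the reverse direction (hf_to_original_key is a hypothetical helper you'd write by inverting the rename rules from that conversion script):

import torch

hf_state_dict = model.state_dict()

# hf_to_original_key is hypothetical: invert the rename rules from the
# HF conversion script to recover the original repo's key names
original_state_dict = {hf_to_original_key(k): v for k, v in hf_state_dict.items()}

# The original repo stores the weights under a "model" key in its checkpoints
torch.save({"model": original_state_dict}, "detr_original_format.pth")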

In case you have 2 classes, then you can initialize the DETR model as follows:

from transformers import DetrForObjectDetection

model = DetrForObjectDetection.from_pretrained("facebook/detr-resnet-50", num_labels=2, ignore_mismatched_sizes=True)

This will reuse all existing layers of the DETR model pre-trained on COCO, and randomly initialize a new classification head. See also my notebook regarding fine-tuning.
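For the person/vehicle remapping itself, something like the following could work (a hedged sketch; the ids below are the standard COCO 2017 category ids, and the 0-indexed targets match num_labels=2):

# Standard COCO 2017 category ids: person=1, car=3, bus=6, truck=8
COCO_PERSON = 1
COCO_VEHICLES = {3, 6, 8}

def remap_category(coco_category_id):
    if coco_category_id == COCO_PERSON:
        return 0  # person
    if coco_category_id in COCO_VEHICLES:
        return 1  # vehicle
    return None  # drop annotations of all other classes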

@truong2710-cyber

Thanks a lot @NielsRogge. If I have an A100 GPU, could you estimate how long it would take to fine-tune the model on my remapped COCO dataset?

@SergiyShebotnov

> Btw, I have a task in which I have to train DETR on a remapped COCO dataset containing only 2 classes (class 1: person; class 2: vehicle, corresponding to 3 classes in the original COCO: car, truck, bus). What should I do?

Have you managed to convert a COCO dataset to the HuggingFace dataset format? If so, could you share the steps?

@truong2710-cyber

> Btw, I have a task in which I have to train DETR on a remapped COCO dataset containing only 2 classes (class 1: person; class 2: vehicle, corresponding to 3 classes in the original COCO: car, truck, bus). What should I do?

> Have you managed to convert a COCO dataset to the HuggingFace dataset format? If so, could you share the steps?

I think HuggingFace directly uses the COCO format, doesn't it?


SergiyShebotnov commented Dec 1, 2023

HF does not support datasets in the COCO format (see huggingface/transformers#25337); they use their own HF format. You have to write your own COCO-to-HF converter for COCO segmentation datasets.

Googling around, I did not find one; I have yet to see a public script that loads instance segmentations in COCO format into the HF format.
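A rough sketch of such a converter for detection-style annotations (the field names in the "objects" dict are assumptions, not an official schema):

import json
from datasets import Dataset

with open("annotations/instances_val2017.json") as f:
    coco = json.load(f)

# Group the COCO annotations by image id
anns_per_image = {}
for ann in coco["annotations"]:
    anns_per_image.setdefault(ann["image_id"], []).append(ann)

records = []
for img in coco["images"]:
    anns = anns_per_image.get(img["id"], [])
    records.append({
        "image_id": img["id"],
        "file_name": img["file_name"],
        "objects": {
            "bbox": [a["bbox"] for a in anns],  # COCO xywh boxes
            "category_id": [a["category_id"] for a in anns],
        },
    })

dataset = Dataset.from_list(records)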

@NielsRogge (Author)

I made this notebook to upload a COCO dataset to the hub.

The dataset lives here: https://huggingface.co/datasets/nielsr/coco-panoptic-val2017
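If you build a Dataset along the lines of the sketch above, pushing it to the hub is then a single call (the repo name is a placeholder):

dataset.push_to_hub("your-username/coco-remapped")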


Kilikia123 commented Feb 17, 2024

Please, can you show me your training and validation loss curves over the epochs?
