Custom dataset: class_error: 100 #100

Open
dethresearcher opened this issue May 4, 2023 · 6 comments

@dethresearcher
dethresearcher commented May 4, 2023

Hello,

I'm trying to train on a custom dataset using the private detection setting (since I plan to eventually swap out the default detector for my own). I would like to train the tracker and the detector jointly and from scratch, without any pretraining.

I am using the following command:

python -m torch.distributed.launch --nproc_per_node=2 --use_env src/train.py with \
    mot17 \
    deformable \
    multi_frame \
    tracking \
    resume=models/mot17_crowdhuman_deformable_multi_frame/checkpoint_epoch_40.pth \
    output_dir=models/custom_deformable_multi_frame \
    mot_path_train=data/xxx \
    mot_path_val=data/xxx \
    train_split=xxx_train_coco \
    val_split=xxx_test_coco \
    epochs=20

The training works until it starts the first evaluation:

  1. "datasets/tracking/factory.py", line 59, in __init__
    assert dataset in DATASETS, f"[!] Dataset not found: {dataset}"

I understand the error means the dataset is not defined under tracking/.*, but I wanted to confirm that this step is really required: it is not stated in the README, which says that a custom dataset can be used "without changing our codebase".

  2. class_error is 100, and most of the losses are 0. Is this expected?

Epoch: [1] [4900/6318] eta: 0:06:30 lr: 0.000100 class_error: 100.00 loss: 0.0000 (0.0376) loss_bbox: 0.0000 (0.0000) loss_bbox_0: 0.0000 (0.0000) loss_bbox_1: 0.0000 (0.0000) loss_bbox_2: 0.0000 (0.0000) loss_bbox_3: 0.0000 (0.0000) loss_bbox_4: 0.0000 (0.0000) loss_ce: 0.0000 (0.0078) loss_ce_0: 0.0000 (0.0013) loss_ce_1: 0.0000 (0.0049) loss_ce_2: 0.0000 (0.0078) loss_ce_3: 0.0000 (0.0077) loss_ce_4: 0.0000 (0.0081) loss_giou: 0.0000 (0.0000) loss_giou_0: 0.0000 (0.0000) loss_giou_1: 0.0000 (0.0000) loss_giou_2: 0.0000 (0.0000) loss_giou_3: 0.0000 (0.0000) loss_giou_4: 0.0000 (0.0000) cardinality_error_unscaled: 498.0000 (498.2397) cardinality_error_0_unscaled: 498.0000 (498.5926) cardinality_error_1_unscaled: 499.5000 (499.5661) cardinality_error_2_unscaled: 499.5000 (499.4653) cardinality_error_3_unscaled: 500.0000 (499.9463) cardinality_error_4_unscaled: 500.0000 (499.8592) class_error_unscaled: 100.0000 (100.0000) loss_bbox_unscaled: 0.0000 (0.0000) loss_bbox_0_unscaled: 0.0000 (0.0000) loss_bbox_1_unscaled: 0.0000 (0.0000) loss_bbox_2_unscaled: 0.0000 (0.0000) loss_bbox_3_unscaled: 0.0000 (0.0000) loss_bbox_4_unscaled: 0.0000 (0.0000) loss_ce_unscaled: 0.0000 (0.0039) loss_ce_0_unscaled: 0.0000 (0.0007) loss_ce_1_unscaled: 0.0000 (0.0025) loss_ce_2_unscaled: 0.0000 (0.0039) loss_ce_3_unscaled: 0.0000 (0.0038) loss_ce_4_unscaled: 0.0000 (0.0040) loss_giou_unscaled: 0.0000 (0.0000) loss_giou_0_unscaled: 0.0000 (0.0000) loss_giou_1_unscaled: 0.0000 (0.0000) loss_giou_2_unscaled: 0.0000 (0.0000) loss_giou_3_unscaled: 0.0000 (0.0000) loss_giou_4_unscaled: 0.0000 (0.0000) lr_backbone: 0.0000 (0.0000) time: 0.5473 data: 0.0034 max mem: 7175

  3. Does this repo train the detector + tracker together?

  4. I am trying to do evaluation with private detections, but it seems like the *_sequence.py files are calling the _sequence() method which requires public detections?

@HojinKimSIA

same issue...

@timmeinhardt
Owner

  1. The README says it can be trained without code changes. ;) But yes, to run an evaluation one has to add the dataset to the code in the factory.py file. A custom dataset might require individual evaluation code anyway. Hence, we did not give further details on that matter.
  2. The class error should go down. Also your losses are mostly zero. This does not look correct.
  3. Yes, the idea of the paper is to have a unified model which performs detection and tracking. So, I am not sure how you will swap your detector for our Deformable DETR. Unless it has a similar Transformer decoder architecture with cross-attention between image features and object queries.
  4. As mentioned in point 1 you need to add your individual eval code/file. You can copy paste one of the *_sequence.py files and adjust it to your needs, e.g., remove the requirement of public detection files.
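
For point 1, a minimal sketch of what registering a custom split might look like. It assumes the factory keeps a name-to-builder mapping called DATASETS, which is the only detail confirmed by the assertion in the traceback above. The builder function, the split key `xxx_test`, and the sequence names are hypothetical and stand in for whatever the real factory.py and a copied *_sequence.py file would provide.

```python
from typing import Callable, Dict, List

# Hypothetical registry; only the name DATASETS and the assertion message
# are taken from the traceback quoted in this thread.
DATASETS: Dict[str, Callable[[], List[str]]] = {}


def build_custom_test_split() -> List[str]:
    """Hypothetical builder returning the sequence names of a custom split."""
    return ["xxx-01", "xxx-02"]


# Registering the custom split is the code change that evaluation needs.
DATASETS["xxx_test"] = build_custom_test_split


class DatasetFactory:
    """Looks up a dataset by name, mirroring the assertion from the traceback."""

    def __init__(self, dataset: str) -> None:
        assert dataset in DATASETS, f"[!] Dataset not found: {dataset}"
        self.sequences = DATASETS[dataset]()


if __name__ == "__main__":
    print(DatasetFactory("xxx_test").sequences)
```

With a registry of this shape, an unregistered name fails with the same "Dataset not found" assertion, while a registered one resolves to its builder.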

@dethresearcher
Author

Thanks for the answer @timmeinhardt!

Regarding the losses mostly being zero -- do you have some ideas of what could be wrong? e.g. in the data annotation or maybe hyperparameters?

@timmeinhardt
Owner

The data annotation is definitely the right place to look. This could be due to wrong label indices for the background. The datasets/coco.py expects the person label to be at index 1 in the ground truth file. If your custom dataset has more than one label, you need to check the code as well. This is not supported out of the box and might need some adjustment here and there.
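
One way to check the label indices is to summarize the category ids in the COCO-style ground truth file. The sketch below assumes the standard COCO layout with top-level `categories` and `annotations` lists. The helper names are made up for illustration and are not part of the repo.

```python
import json
from collections import Counter
from typing import Dict


def load_coco(path: str) -> dict:
    """Read a COCO-style annotation file from disk."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def category_report(coco: dict) -> Dict[str, dict]:
    """Return the declared category names and the annotation count per id."""
    names = {c["id"]: c["name"] for c in coco.get("categories", [])}
    counts = Counter(a["category_id"] for a in coco.get("annotations", []))
    return {"names": names, "counts": dict(counts)}


def only_person_at_index(coco: dict, person_id: int = 1) -> bool:
    """True if there are annotations and all of them use `person_id`."""
    counts = category_report(coco)["counts"]
    return bool(counts) and set(counts) == {person_id}
```

Running `only_person_at_index(load_coco(path))` on the training annotation file gives a quick yes/no. `category_report` shows which ids are actually in use when the answer is no.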

@dethresearcher
Author

Thanks @timmeinhardt!

The problem was related to using the generate_coco_from_mot.py script with a custom dataset. By calling generate_coco_from_mot with mots=False, I had effectively set ignore: 1 for all the annotations, because row[8] = -1 in my case.
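
A quick sanity check for this failure mode, as a sketch: count how many generated annotations carry ignore: 1, and how many ground truth rows have -1 in column 8. The exact rule the script uses to derive `ignore` from row[8] is not shown in this thread, so the code only inspects the inputs and outputs. The function names are hypothetical, and the ground truth is assumed to be comma-separated text.

```python
import csv
import io
from typing import Tuple


def ignore_stats(coco: dict) -> Tuple[int, int]:
    """Return (number of annotations with ignore == 1, total annotations)."""
    annotations = coco.get("annotations", [])
    ignored = sum(1 for a in annotations if a.get("ignore", 0) == 1)
    return ignored, len(annotations)


def rows_with_unset_column(gt_text: str, column: int = 8) -> int:
    """Count comma-separated ground truth rows whose `column` entry is -1."""
    reader = csv.reader(io.StringIO(gt_text))
    return sum(
        1 for row in reader
        if len(row) > column and float(row[column]) == -1
    )
```

If `ignore_stats` reports that every annotation is ignored, no ground truth boxes are left to match, which would be consistent with the zero box losses in the log above.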

Could you clarify why python src/track.py with reid is private detection? It seems as though all evaluation processes call _sequence which loads detection results from det.txt. In particular, python src/track.py with reid uses tracks such as MOT17-01-FRCNN, etc. which I thought was public detection?

@timmeinhardt
Owner

The code might load the detection results, but it is not using them. For a quick fix you can put your ground truth files and detections and then run private mode. See track.yaml for the config entry which enables public detections. It is set to False by default.
