Strange behavior in detection reference code #5221

@rvandeghen

🐛 Describe the bug

Hi,

I am using your object detection training procedure (https://github.com/pytorch/vision/blob/main/references/detection/train.py) with a custom dataset. When I evaluate my model, the output looks correct for the very first epoch, but for the following epochs the metrics fall to 0.
However, this is not related to model performance: I can load a checkpoint and evaluate it in a separate process, which gives the expected values.

From the code, the difference between the COCO dataset and a custom dataset is handled here:

return convert_to_coco_api(dataset)

I suppose that the current behavior is not expected. Have you ever faced a similar issue, and how can I correct it?

Renaud

Versions

Collecting environment information...
PyTorch version: 1.10.1
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A

OS: CentOS Linux release 8.2.2004 (Core) (x86_64)
GCC version: (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5)
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.28

Python version: 3.9.7 (default, Sep 16 2021, 13:09:58) [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-4.18.0-193.6.3.el8_2.x86_64-x86_64-with-glibc2.28
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: GeForce RTX 2080 Ti
Nvidia driver version: 450.57
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.21.2
[pip3] torch==1.10.1
[pip3] torchvision==0.11.2
[conda] blas 1.0 mkl
[conda] cudatoolkit 11.3.1 h2bc3f7f_2
[conda] ffmpeg 4.3 hf484d3e_0 pytorch
[conda] mkl 2021.4.0 h06a4308_640
[conda] mkl-service 2.4.0 py39h7f8727e_0
[conda] mkl_fft 1.3.1 py39hd3c417c_0
[conda] mkl_random 1.2.2 py39h51133e4_0
[conda] numpy 1.21.2 py39h20f2e39_0
[conda] numpy-base 1.21.2 py39h79a1101_0
[conda] pytorch 1.10.1 py3.9_cuda11.3_cudnn8.2.0_0 pytorch
[conda] pytorch-mutex 1.0 cuda pytorch
[conda] torchvision 0.11.2 py39_cu113 pytorch

cc @datumbox
