
metrics are not displayed during training #9136

Open
DanilKonon opened this issue Aug 21, 2020 · 8 comments

Comments

@DanilKonon

Hi,

I installed TensorFlow 2.2 and I am using efficientdet_d2_coco17_tpu-32.
I managed to start training this model with this command:

!python3 ./models/research/object_detection/model_main_tf2.py \
    --pipeline_config_path=./pipeline.config \
    --model_dir=./efficient_det \
    --batch_size=4 \
    --num_train_steps=150_000 \
    --sample_1_of_n_eval_examples=4 \
    --alsologtostderr

But during training it only outputs the loss, without the metrics evaluated on the eval set that TensorFlow Object Detection 1 used to show:

I0821 09:23:17.842881 140282024908672 model_lib_v2.py:652] Step 16800 per-step time 1.223s loss=0.788
INFO:tensorflow:Step 16900 per-step time 1.076s loss=0.503
I0821 09:25:09.853582 140282024908672 model_lib_v2.py:652] Step 16900 per-step time 1.076s loss=0.503
INFO:tensorflow:Step 17000 per-step time 1.252s loss=1.163
I0821 09:27:01.702962 140282024908672 model_lib_v2.py:652] Step 17000 per-step time 1.252s loss=1.163
INFO:tensorflow:Step 17100 per-step time 1.072s loss=0.916
I0821 09:28:55.677012 140282024908672 model_lib_v2.py:652] Step 17100 per-step time 1.072s loss=0.916
INFO:tensorflow:Step 17200 per-step time 1.138s loss=0.819

How can I see my metrics?

Also, there are event files in the train folder. They are initialised at the beginning, and then nothing happens to them afterwards. How can I get the events updated so that I can see the model metrics and progress in TensorBoard?
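
(For reference: both the training and the evaluation summaries are written to subdirectories of --model_dir, so assuming the --model_dir=./efficient_det from the command above, TensorBoard only needs

tensorboard --logdir=./efficient_det

and the eval metrics will appear as soon as something actually writes them, which is the crux of this issue.)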

@saikumarchalla saikumarchalla self-assigned this Aug 22, 2020
@saikumarchalla saikumarchalla added the models:research models that come under research directory label Aug 22, 2020
@dinis-rodrigues

Yup, same issue here. In TF 1 this works properly, while in TF 2 it does not...

@TolgaBkm

I also have the same issue. I have to stop the training once in a while, run evaluation manually and resume the training process afterwards.
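
For reference, model_main_tf2.py switches into eval-only mode when --checkpoint_dir is passed; a minimal sketch of that manual evaluation step, assuming the paths from the original post:

python3 ./models/research/object_detection/model_main_tf2.py \
    --pipeline_config_path=./pipeline.config \
    --model_dir=./efficient_det \
    --checkpoint_dir=./efficient_det \
    --alsologtostderr

It evaluates the latest checkpoint found in --checkpoint_dir, writes the COCO metrics to an eval events file under --model_dir, and then keeps waiting for new checkpoints until --eval_timeout expires.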

@ecatkins

ecatkins commented Sep 1, 2020

Also hitting the same issue while trying to port my code over from TF1 to TF2. I was previously using Weights & Biases to sync to TensorBoard so that I could monitor training progress... and now I'm not sure what to do.

@dinis-rodrigues

Just checked the related issues; it seems that evaluation while training, as we did it in TF 1, is not supported by TF2's model_main_tf2.py.

@cl886699

cl886699 commented Sep 3, 2020

Me too.

@qraleq

qraleq commented Sep 16, 2020

Hi, any update on this issue?

@LaraNeves

I found this tutorial really useful for getting evaluation on TensorBoard while training the model with TF2. Check it for more details, but basically you have to run two instances of the model_main_tf2.py script in parallel: one training with the training dataset, the other evaluating with the validation dataset. You can either use 2 GPUs or, if you have only one, use the GPU for training and the CPU for evaluating; the tutorial explains how, and a minimal sketch of that setup is below.
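
A sketch reusing the paths from the original post; CUDA_VISIBLE_DEVICES=-1 hides the GPU from the second process so evaluation falls back to the CPU:

# Process 1: training on the GPU
python3 ./models/research/object_detection/model_main_tf2.py \
    --pipeline_config_path=./pipeline.config \
    --model_dir=./efficient_det \
    --alsologtostderr

# Process 2: continuous evaluation on the CPU
CUDA_VISIBLE_DEVICES=-1 python3 ./models/research/object_detection/model_main_tf2.py \
    --pipeline_config_path=./pipeline.config \
    --model_dir=./efficient_det \
    --checkpoint_dir=./efficient_det \
    --alsologtostderr

The eval process polls --checkpoint_dir for new checkpoints and writes its summaries under --model_dir, next to the training ones, so both appear in the same TensorBoard logdir.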

@DanilKonon
Author

I understand that we can run the two scripts in parallel. But what should I do if I run everything in Colab? I thought almost everyone here uses Colab...
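
One possible workaround in Colab is to background the eval process from a shell cell before starting training; a sketch under the same path assumptions as above (the nohup/& backgrounding is a generic shell trick, not part of the API, and note that both processes then share the single Colab runtime's RAM):

# Start continuous CPU-only evaluation in the background, logging to eval.log
!CUDA_VISIBLE_DEVICES=-1 nohup python3 ./models/research/object_detection/model_main_tf2.py \
    --pipeline_config_path=./pipeline.config \
    --model_dir=./efficient_det \
    --checkpoint_dir=./efficient_det > eval.log 2>&1 &

# Then train in the foreground as before
!python3 ./models/research/object_detection/model_main_tf2.py \
    --pipeline_config_path=./pipeline.config \
    --model_dir=./efficient_det \
    --alsologtostderr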

@jaeyounkim jaeyounkim added models:research:odapi ODAPI and removed models:research models that come under research directory labels Jun 25, 2021