
Conversation

Viktor-Nilsson

Added an optional parameter that allows passing a path to a checkpoint file when calling object_detector.create().
If a checkpoint path is passed, the underlying tf.keras.Model will load the model weights from the checkpoint before training starts.
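For illustration, a minimal usage sketch (the parameter name load_checkpoint_path is assumed from the branch name and the later discussion, and the data paths and checkpoint prefix are placeholders; the exact signature may differ):

from tflite_model_maker import model_spec, object_detector

# Standard Model Maker setup with placeholder paths.
spec = model_spec.get('efficientdet_lite0')
train_data = object_detector.DataLoader.from_pascal_voc(
    'images/', 'annotations/', label_map={1: 'person'})

# load_checkpoint_path is the new option proposed in this PR (name assumed);
# weights would be restored from the checkpoint before training starts.
model = object_detector.create(
    train_data,
    model_spec=spec,
    epochs=50,
    train_whole_model=True,
    load_checkpoint_path='model_dir/ckpt-100',  # hypothetical checkpoint prefix
)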


@bhack left a comment

Can you extend the test to cover this new option?

@Viktor-Nilsson
Author

@MarkDaoust
I'm not sure how to do that in a good way. The existing test uses a randomly generated .jpg (to avoid binary files in the git repo?).

I could add a new test case that loads one of my existing trained checkpoints, evaluates the model, and verifies that the weights were actually loaded by checking that the AP is high enough.
Adding a checkpoint file for EfficientDet-Lite0 to the git repo is not so nice, however, since it is ~33 MB of binary data.
Thoughts?
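Roughly, such a test could look like this (just a sketch meant to sit inside the existing test class; the checkpoint location, the AP threshold, the self.train_data / self.validation_data fixtures, and the load_checkpoint_path name are placeholders, not existing helpers):

from tflite_model_maker import model_spec, object_detector

def test_create_with_checkpoint(self):
    # Placeholder path; a real test would need a small checkpoint checked into testdata/.
    checkpoint_path = 'testdata/efficientdet_lite0/ckpt-100'
    spec = model_spec.get('efficientdet_lite0')
    model = object_detector.create(
        self.train_data,
        model_spec=spec,
        epochs=1,
        load_checkpoint_path=checkpoint_path,  # option under test (assumed name)
    )
    metrics = model.evaluate(self.validation_data)
    # If the pretrained weights were actually restored, AP should be well above random.
    self.assertGreater(metrics['AP'], 0.5)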

@MarkDaoust
Member

I'm not sure why the CODEOWNERS file didn't assign Khanh and Lu directly. They're the real owners here.

@bhack

bhack commented Jan 18, 2022

> I'm not sure why the CODEOWNERS file didn't assign Khanh and Lu directly. They're the real owners here.

Because the pattern is wrong. Here we are in a subdirectory that is only covered by your global rule.

@MarkDaoust
Member

Oh, right. I'll send a fix for that.

@khanhlvg
Member

@ziyeqinghan Could you take a look?

@ThuhinSatheesh

ThuhinSatheesh commented Feb 21, 2022

I tried training a model from a checkpoint, but the losses returned NaN values during training. Is there a way around this, or am I doing something wrong?
[screenshot attached: error]

@MarkDaoust MarkDaoust requested review from khanhlvg and removed request for MarkDaoust and wolffg February 22, 2022 20:33
@IvanColantoni

IvanColantoni commented Apr 7, 2022

Hi, and thanks for this.
I would add that loading weights with the model.load_weights() method didn't work in my case.
I restored the checkpoint from model_dir by importing restore_ckpt with
from tensorflow_examples.lite.model_maker.third_party.efficientdet.keras.util_keras import restore_ckpt
in the object_detector_spec.py file and calling it in the if block before the model.fit() call, as you suggested:

if load_checkpoint_path is not None:
    # restore_ckpt comes from efficientdet's util_keras; plain model.load_weights() did not work here.
    restore_ckpt(model, load_checkpoint_path)

From what I understand, this is because checkpoints for EfficientDetNetTrainHub are different and need a custom function to restore them correctly. Not sure about it, though.
Be sure that the load_checkpoint_path directory contains the ckpt-xx.data-* and ckpt-xx.index files, plus a plain-text file named checkpoint pointing at the checkpoint number you want to restore, e.g.:

From my terminal, in the model_dir path, cat checkpoint gives:

model_checkpoint_path: "ckpt-100"
all_model_checkpoint_paths: "ckpt-100"
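
As a quick sanity check (just a sketch using standard TensorFlow APIs; model_dir is a placeholder path), you can resolve and inspect the checkpoint before passing it on:

import tensorflow as tf

ckpt_dir = 'model_dir'  # placeholder: directory containing ckpt-xx.* and the checkpoint file

# Resolves the latest prefix by reading the plain-text checkpoint file, e.g. 'model_dir/ckpt-100'.
latest = tf.train.latest_checkpoint(ckpt_dir)
print(latest)

# List the variables stored in the checkpoint to confirm it is readable.
for name, shape in tf.train.list_variables(latest):
    print(name, shape)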

@imneonizer

Since there is no option to create issues, I just have a question: how do I do multi-GPU training using TFLite Model Maker?
https://github.com/tensorflow/examples/blob/master/tensorflow_examples/lite/model_maker/core/task/object_detector.py#L73-L75

@grewe

grewe commented Oct 26, 2022

This does not seem to be in the actual code, yet I see a commit here. What is the status?

@Viktor-Nilsson
Author

Closing this PR since it was reported not to work for others who attempted to use the code, and I have no capacity to investigate it further.

@Viktor-Nilsson Viktor-Nilsson deleted the vnilsson_load_checkpoint branch October 27, 2022 07:30
@Bede-sv

Bede-sv commented Jan 11, 2023

@Viktor-Nilsson This worked for me when I tried it.

@justingrayston

I'd be keen to get this supported too, as I'm sure many others would: the ability to keep improving your own custom model is key, without wasting GPU time retraining on data you've already trained on before.
