mAP scores are all zeros, but model is improving as the loss is small/decreasing #11060

Closed
ironllamagirl opened this issue Jul 25, 2023 · 10 comments
Labels: models:official, stale, stat:awaiting response, type:support

Comments

@ironllamagirl

ironllamagirl commented Jul 25, 2023

I am fine-tuning a RetinaNet with TensorFlow Orbit, following this tutorial, to perform object detection on a custom dataset.

Training seems to be going well: the loss is decreasing, but all mAP scores are 0.

Does anyone know why I'm getting all zeros when the loss is reasonable and still improving?

image

@ironllamagirl
Author

Update: when I save the latest checkpoint, reload it, and run the code below to visualize the predicted boxes, the images come out with nothing drawn on them.

```python
import matplotlib.pyplot as plt
import tensorflow as tf
from official.vision.utils.object_detection import visualization_utils

# tf_ex_decoder, build_inputs_for_object_detection, model_fn, category_index
# and test_ds are defined earlier, as in the tutorial.
input_image_size = (640, 640)
plt.figure(figsize=(40, 40))
min_score_thresh = 0.0  # Minimum score threshold; 0.0 draws every box regardless of confidence.

for i, serialized_example in enumerate(test_ds):
    plt.subplot(1, 3, i + 1)
    decoded_tensors = tf_ex_decoder.decode(serialized_example)
    image = build_inputs_for_object_detection(decoded_tensors['image'], input_image_size)
    image = tf.expand_dims(image, axis=0)
    image = tf.cast(image, dtype=tf.uint8)
    image_np = image[0].numpy()
    result = model_fn(image)
    # Draw the predicted boxes and labels directly onto image_np.
    visualization_utils.visualize_boxes_and_labels_on_image_array(
        image_np,
        result['detection_boxes'][0].numpy(),
        result['detection_classes'][0].numpy().astype(int),
        result['detection_scores'][0].numpy(),
        category_index=category_index,
        use_normalized_coordinates=False,
        max_boxes_to_draw=200,
        min_score_thresh=min_score_thresh,
        agnostic_mode=False,
        instance_masks=None,
        line_thickness=4)
    plt.imshow(image_np)
    plt.axis('off')

plt.show()
```

image

This is what `result` looks like; the `detection_boxes` are all zeros. What am I missing?
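For context on why all-zero boxes go together with an mAP of exactly 0: COCO-style mAP only counts a detection as correct when its IoU with a ground-truth box exceeds a threshold, and a zero-area box has an IoU of 0 with every box. The sketch below is a minimal pure-Python illustration. The `[ymin, xmin, ymax, xmax]` layout, the helper names `box_area`, `box_iou`, and `count_degenerate_boxes`, and the example coordinates are assumptions made for this example and are not taken from the tutorial.

```python
def box_area(box):
    """Area of a [ymin, xmin, ymax, xmax] box; 0 for empty or inverted boxes."""
    ymin, xmin, ymax, xmax = box
    return max(0.0, ymax - ymin) * max(0.0, xmax - xmin)


def box_iou(box_a, box_b):
    """Intersection over union of two [ymin, xmin, ymax, xmax] boxes."""
    intersection = box_area([
        max(box_a[0], box_b[0]),
        max(box_a[1], box_b[1]),
        min(box_a[2], box_b[2]),
        min(box_a[3], box_b[3]),
    ])
    union = box_area(box_a) + box_area(box_b) - intersection
    return intersection / union if union > 0 else 0.0


def count_degenerate_boxes(boxes):
    """Number of boxes with zero area (for example, all-zero boxes)."""
    return sum(1 for box in boxes if box_area(box) == 0.0)


# Hypothetical values: one ground-truth box and two all-zero predictions.
ground_truth = [100.0, 100.0, 300.0, 300.0]
predictions = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]

print(count_degenerate_boxes(predictions))                # 2
print([box_iou(p, ground_truth) for p in predictions])    # [0.0, 0.0]
```

Running the same count over the rows of `result['detection_boxes'][0].numpy()` should show whether every returned detection is degenerate, which would point at the model weights or export step rather than at the evaluation code.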

@sineeli sineeli added the models:official models that come under official repository label Aug 23, 2023
@laxmareddyp
Collaborator

Hi @ironllamagirl ,

Sorry for the delay in response.

There must be some mistake in the configuration. Can you tell us how many classes you are using and how many you specified in the config (generally it is original_num_classes + 1)? Also, check how you encoded the images as TFRecords: when you decode the TFRecords and plot the images, does everything look the same as the originals?
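The class-count check can be sketched roughly as follows. This is a minimal illustration of the `original_num_classes + 1` rule of thumb above (the extra class is typically the background). The function name `check_num_classes` and the example label IDs are hypothetical; in practice the IDs would be collected from the decoded TFRecords or from the label map.

```python
def check_num_classes(label_ids, configured_num_classes):
    """Compare the configured class count with the labels seen in the dataset.

    Returns (expected, matches), where expected is the number of distinct
    label IDs plus one, following the original_num_classes + 1 rule of thumb.
    """
    expected = len(set(label_ids)) + 1
    return expected, expected == configured_num_classes


# Hypothetical label IDs gathered from a dataset with three object classes.
label_ids = [1, 2, 3, 1, 2]

print(check_num_classes(label_ids, configured_num_classes=3))  # (4, False)
print(check_num_classes(label_ids, configured_num_classes=4))  # (4, True)
```

A mismatch here does not prove the config is the cause, but it is a cheap thing to rule out before inspecting the TFRecords themselves.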

@laxmareddyp laxmareddyp added the stat:awaiting response Waiting on input from the contributor label Aug 28, 2023
@MarcoPrassel

MarcoPrassel commented Aug 31, 2023

I am running into similar issues. Any news, @ironllamagirl?

@github-actions

github-actions bot commented Sep 8, 2023

This issue has been marked stale because it has had no activity in the past 7 days. It will be closed if no further activity occurs. Thank you.

@github-actions github-actions bot added the stale label Sep 8, 2023
@github-actions

This issue was closed due to lack of activity after being marked stale for the past 7 days.

@nguyenthekhoig7

nguyenthekhoig7 commented Apr 2, 2024

Hi @laxmareddyp, I am facing a similar issue when exporting a TF Vision model (ResNet50-FPN RetinaNet for object detection); the output is exactly the same as the result from @ironllamagirl.

I downloaded the checkpoint from here, loaded the config with `exp_config = exp_factory.get_exp_config('retinanet_resnetfpn_coco')`, and then exported the model with:

export_saved_model_lib.export_inference_graph(
    input_type='image_tensor',
    batch_size=1,
    input_image_size = [640, 640],
    params=exp_config_default,
    checkpoint_path=tf.train.latest_checkpoint(checkpoint_dir),
    export_dir=export_dir,
    log_model_flops_and_params = True
)

Once I had the SavedModel, I loaded it exactly as @ironllamagirl did.

Since the config and checkpoint come straight from the source, where should I look next?
Thank you for your support.

@laxmareddyp
Collaborator

Hi @nguyenthekhoig7 ,

Apologies for the inconvenience. Could you please submit a new ticket with additional details, such as a notebook or reproducible code, along with the checkpoint path you used? I tried to open the checkpoint link you mentioned, but it seems to redirect to the issues page. Thanks.

@nguyenthekhoig7

nguyenthekhoig7 commented Apr 2, 2024

Hi @laxmareddyp, sorry for my mistake: I added the wrong link. Here is the link to the checkpoint: https://github.com/tensorflow/models/blob/master/official/vision/MODEL_GARDEN.md#retinanet-imagenet-pretrained

I used the 72-epoch model (the bottom one in the table).
image

If this is still not reproducible, I will open a new ticket tomorrow morning, because it's night here.
Thank you very much for your timely support.

@nguyenthekhoig7

Hi, this morning I was able to solve the problem. The solution is to add just two lines of code:

exp_config.task.init_checkpoint = './resnet50-ckpt/ckpt-33264'
exp_config.task.init_checkpoint_modules = 'all'

These add the checkpoint path to the config (which feels a bit redundant, since the checkpoint path then has to be given both in the config and in the export function).
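To catch this before exporting, a small guard along the following lines could help. It is only a sketch: the helper name `missing_checkpoint_settings` is hypothetical, and the `SimpleNamespace` object is a stand-in for the real experiment config so that the example runs without TensorFlow. It assumes only that the config exposes `task.init_checkpoint` and `task.init_checkpoint_modules`, as in the two lines above; the defaults of the real config may differ from the stand-in.

```python
from types import SimpleNamespace


def missing_checkpoint_settings(exp_config):
    """Return the names of checkpoint fields on exp_config.task that are unset."""
    missing = []
    for field in ("init_checkpoint", "init_checkpoint_modules"):
        if not getattr(exp_config.task, field, None):
            missing.append(field)
    return missing


# Stand-in for the real experiment config (hypothetical, for illustration only).
exp_config = SimpleNamespace(
    task=SimpleNamespace(init_checkpoint=None, init_checkpoint_modules=None))
print(missing_checkpoint_settings(exp_config))
# ['init_checkpoint', 'init_checkpoint_modules']

# After applying the two-line fix, nothing is reported as missing.
exp_config.task.init_checkpoint = './resnet50-ckpt/ckpt-33264'
exp_config.task.init_checkpoint_modules = 'all'
print(missing_checkpoint_settings(exp_config))
# []
```

Calling such a check right before `export_saved_model_lib.export_inference_graph(...)` would make a forgotten `init_checkpoint` fail loudly instead of producing a model that returns all-zero boxes.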

For anyone who runs into this problem in the future, here is working code in Colab: https://colab.research.google.com/drive/1nJiTI9aikiSmcrEFkpDQf8NNs8IFXvO9?usp=sharing

Again, thank you @laxmareddyp for your timely support yesterday.
