Description
Prerequisites
Please answer the following questions for yourself before submitting an issue.
- I am using the latest TensorFlow Model Garden release and TensorFlow 2.
- I am reporting the issue to the correct repository. (Model Garden official or research directory)
- I checked to make sure that this issue has not been filed already.
1. The entire URL of the file you are using
https://github.com/tensorflow/models/tree/master/official/vision/image_classification/resnet
For pretrained checkpoints, I used the ones linked in the README (https://github.com/tensorflow/models/tree/master/official/vision/image_classification/resnet#pretrained-models)
2. Describe the bug
- I'm trying to evaluate and fine-tune the ResNet-50 model available at the URL above. However, I get near-zero accuracy when I evaluate. I would like to know how to evaluate and fine-tune using the existing ResNet-50 checkpoint.
The command I use to evaluate the existing model:
python3 resnet_ctl_imagenet_main.py --model_dir=checkpoints/ --num_gpus=1 --batch_size=32 --train_epochs=1 --train_steps=1 --use_synthetic_data=false --data_dir imagenet_tfr_data/
The model_dir is set to the checkpoints directory, which contains the downloaded checkpoint (from the README link). The checkpoint manager picks up this checkpoint, but it does not seem to load: I get many "Unresolved object" warnings, which suggests that the layer names do not match.
W0518 15:35:57.377599 139869265008448 util.py:144] Unresolved object in checkpoint: (root).layer_with_weights-4.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).layer_with_weights-5.axis
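To get an overview of which objects fail to resolve, the warnings can be collected from the log. The following is a minimal sketch that uses only the Python standard library; the helper name `unresolved_objects` and the regular expression are illustrative assumptions based on the two warning lines quoted above, not part of TensorFlow or the Model Garden.

```python
import re

# Assumed pattern, derived from the two warning lines quoted in this issue.
_UNRESOLVED = re.compile(r"Unresolved object in checkpoint: (\(root\)\S*)")


def unresolved_objects(log_text):
    """Return the unique unresolved object paths, in order of first appearance."""
    seen = []
    for line in log_text.splitlines():
        match = _UNRESOLVED.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


sample_log = """\
W0518 15:35:57.377599 139869265008448 util.py:144] Unresolved object in checkpoint: (root).layer_with_weights-4.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).layer_with_weights-5.axis
"""

for path in unresolved_objects(sample_log):
    print(path)
```

Running the same helper over the full training log would show whether every layer is unresolved or only a few, which helps narrow down where the mismatch starts.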
- Currently I can get this ResNet model running with TensorFlow 2.2. There were multiple errors with 2.0 and 2.1. One such error is
from tensorflow.python.keras.layers.preprocessing import image_preprocessing as image_ops
ImportError: cannot import name 'image_preprocessing'
This may not be relevant to the actual issue, but for me TF 2.0 and TF 2.1 give import errors and attribute-not-found errors, which drove me to try TF 2.2.
- Evaluation works when I do the following:
In resnet_runnable.py, I load the checkpoint the Keras way:
self.model.load_weights(flags_obj.pretrained_filepath)
This probably loads the checkpoint according to the network topology rather than by names (as used by tf.train.CheckpointManager). Is this the correct way of loading?
I then disable training manually and run self._evaluate_once(current_step), which gives an accuracy of 76.476. I just wanted to confirm whether this is the same accuracy that you obtained.
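One possible reading of the warnings, offered as an assumption and not confirmed here, is that the downloaded checkpoint stores the layers directly under the root object, while the training script expects them under an attribute such as `model`. The toy sketch below only illustrates that kind of path mismatch with plain Python lists; it does not call TensorFlow, and the `model.` prefix and the `unresolved` helper are hypothetical.

```python
def unresolved(saved_paths, expected_paths):
    """Return the saved entries that have no counterpart among the expected paths."""
    expected = set(expected_paths)
    return [path for path in saved_paths if path not in expected]


# Paths taken from the warnings quoted earlier (layers sit directly under the root).
saved = ["layer_with_weights-4.kernel", "layer_with_weights-5.axis"]

# Hypothetical: a training checkpoint that tracks the network under a "model" attribute.
expected_by_trainer = ["model." + path for path in saved]

# Hypothetical: loading straight into the model, so the model itself is the root.
expected_by_model = list(saved)

print(unresolved(saved, expected_by_trainer))  # every saved path is left unmatched
print(unresolved(saved, expected_by_model))    # nothing is left unmatched
```

If this reading is right, it would explain why loading the weights directly into the model works while restoring through the checkpoint manager leaves the objects unresolved.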
My questions are:
- Is there a plan to add a standalone eval script to this repo?
- If the evaluation approach described above (loading the weights with load_weights) is recommended, can it be added to the repo, and can the documentation be updated with the evaluation and fine-tuning steps?
I would be happy to make a PR if required :)
Thank you!