
NotFoundError finetune nasnet_large #2720

@DavidWiesner

Description


Hi,

I downloaded the NASNet-A_Large_331 checkpoint and tried to fine-tune it via train_image_classifier.py.

I get the following error when restoring the checkpoint:

INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.NotFoundError'>, 
Tensor name "cell_0/comb_iter_1/left/separable_5x5_1/depthwise_weights" not found in checkpoint files model.ckpt.index
	 [[Node: save/RestoreV2_37 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2_37/tensor_names, save/RestoreV2_37/shape_and_slices)]]

The information provided in #2648 and #2656 did not resolve this issue.
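For reference, the traceback names `model.ckpt.index` as the checkpoint file, while TF restore conventionally takes the checkpoint *prefix* (`model.ckpt`); the suffixed shard files (`.index`, `.meta`, `.data-*`) are resolved from it. A minimal sketch (the helper name `checkpoint_prefix` is mine, not part of slim) that derives the prefix from any of the shard files:

```python
import os

def checkpoint_prefix(path):
    # Strip the TF checkpoint shard suffixes (.index, .meta,
    # .data-XXXXX-of-YYYYY) so the remainder can be passed as the
    # checkpoint prefix that restore / --checkpoint_path expects.
    base = os.path.basename(path)
    for suffix in (".index", ".meta"):
        if base.endswith(suffix):
            return path[: -len(suffix)]
    if ".data-" in base:
        return path[: path.rindex(".data-")]
    return path  # already a bare prefix

print(checkpoint_prefix("model.ckpt.index"))               # model.ckpt
print(checkpoint_prefix("model.ckpt.data-00000-of-00001")) # model.ckpt
```

Whether the `.index` suffix is actually the root cause here is not confirmed; it is just the one discrepancy visible in the trace.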


System information

  • What is the top-level directory of the model you are using:
    /notebooks
  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow):
    no
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04):
    Linux Ubuntu 16.04.3 LTS (Xenial Xerus)
  • TensorFlow installed from (source or binary):
    tensorflow via docker 1.4.0-gpu-py3
  • TensorFlow version (use command below):
    v1.4.0-rc1-11-g130a514 1.4.0
  • Bazel version (if compiling from source):
  • CUDA/cuDNN version:
    CUDA 8.0
  • GPU model and memory:
    Tesla V100-SXM2 16152MiB
  • Exact command to reproduce:
export CUDA_VISIBLE_DEVICES=""
force=0
DATASET='cifar10'
DATASET_DIR='cifar10'
TRAIN_DIR='train_dir'
PRETRAINED_MODEL="model.ckpt.index"
MODEL_NAME='nasnet_large'

[[ -d "issue-nasnet-tf-models" ]] || git clone https://github.com/tensorflow/models.git "issue-nasnet-tf-models"
cd "issue-nasnet-tf-models/research/slim/"

if [ ${force} -ne 0 ] || [ ! -f "$PRETRAINED_MODEL" ] ; then
  wget https://storage.googleapis.com/download.tensorflow.org/models/nasnet-a_large_04_10_2017.tar.gz
  tar -xf nasnet-a_large_04_10_2017.tar.gz
  rm -f nasnet-a_large_04_10_2017.tar.gz
fi

[[ -d "${DATASET_DIR}" ]] || python3 download_and_convert_data.py --dataset_name=${DATASET} --dataset_dir=${DATASET_DIR}

# Fine-tune only the new layers for 1000 steps.
python3 train_image_classifier.py \
  --train_dir=${TRAIN_DIR} \
  --dataset_name=${DATASET} \
  --dataset_split_name=train \
  --dataset_dir=${DATASET_DIR} \
  --model_name=${MODEL_NAME} \
  --checkpoint_path=${PRETRAINED_MODEL} \
  --checkpoint_exclude_scopes=aux_11/aux_logits,final_layer/FC \
  --trainable_scopes=.*/aux_logits/FC,final_layer/FC \
  --max_number_of_steps=1000 \
  --learning_rate_decay_type=fixed \
  --clone_on_cpu=True \
  --moving_average_decay=0.99
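For clarity on what `--checkpoint_exclude_scopes` does in the command above: slim's init logic skips restoring any variable whose name falls under one of the listed scopes. A loose, hypothetical sketch of that filtering (my own simplification, not slim's actual code):

```python
def filter_restore_vars(var_names, exclude_scopes):
    # Loosely mimic slim's checkpoint_exclude_scopes handling:
    # drop every variable whose name starts with an excluded scope.
    excludes = [s.strip() for s in exclude_scopes.split(",")]
    return [n for n in var_names
            if not any(n.startswith(s) for s in excludes)]

names = [
    "cell_0/comb_iter_1/left/separable_5x5_1/depthwise_weights",
    "aux_11/aux_logits/FC/weights",
    "final_layer/FC/weights",
]
print(filter_restore_vars(names, "aux_11/aux_logits,final_layer/FC"))
# ['cell_0/comb_iter_1/left/separable_5x5_1/depthwise_weights']
```

Note the tensor named in the error is not under either excluded scope, so it would still be requested from the checkpoint.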
