Unable to train Custom model #74

jllarraz · 2019-10-02T11:19:06Z

Hi,

I have been trying to train a custom model with 2 classes, and I have modified the training script to do transfer learning from the trained YOLO model.
This is my train.py script (see attached file):
train.txt

Unfortunately, I can't make it work. It always fails with:
WARNING:tensorflow:Reduce LR on plateau conditioned on metric `val_loss` which is not available. Available metrics are: lr
W1002 12:07:19.268314 4404483520 callbacks.py:1824] Reduce LR on plateau conditioned on metric `val_loss` which is not available. Available metrics are: lr
WARNING:tensorflow:Early stopping conditioned on metric `val_loss` which is not available. Available metrics are:
W1002 12:07:19.268509 4404483520 callbacks.py:1250] Early stopping conditioned on metric `val_loss` which is not available. Available metrics are:
      1/Unknown - 8s 8s/step
Traceback (most recent call last):
  File "/Users/t230418/Downloads/TensorFlow2/train.py", line 187, in <module>
    app.run(main)
  File "/usr/local/lib/python3.7/site-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.7/site-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "/Users/t230418/Downloads/TensorFlow2/train.py", line 182, in main
    validation_data=val_dataset)
  File "/usr/local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training.py", line 728, in fit
    use_multiprocessing=use_multiprocessing)
  File "/usr/local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 324, in fit
    total_epochs=epochs)
  File "/usr/local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 123, in run_one_epoch
    batch_outs = execution_function(iterator)
  File "/usr/local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py", line 86, in execution_function
    distributed_function(input_fn))
  File "/usr/local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py", line 73, in distributed_function
    per_replica_function, args=(model, x, y, sample_weights))
  File "/usr/local/lib/python3.7/site-packages/tensorflow_core/python/distribute/distribute_lib.py", line 760, in experimental_run_v2
    return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
  File "/usr/local/lib/python3.7/site-packages/tensorflow_core/python/distribute/distribute_lib.py", line 1787, in call_for_each_replica
    return self._call_for_each_replica(fn, args, kwargs)
  File "/usr/local/lib/python3.7/site-packages/tensorflow_core/python/distribute/distribute_lib.py", line 2132, in _call_for_each_replica
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/tensorflow_core/python/autograph/impl/api.py", line 258, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py", line 264, in train_on_batch
    output_loss_metrics=model._output_loss_metrics)
  File "/usr/local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_eager.py", line 311, in train_on_batch
    output_loss_metrics=output_loss_metrics))
  File "/usr/local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_eager.py", line 252, in _process_single_batch
    training=training))
  File "/usr/local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_eager.py", line 166, in _model_loss
    per_sample_losses = loss_fn.call(targets[i], outs[i])
  File "/usr/local/lib/python3.7/site-packages/tensorflow_core/python/keras/losses.py", line 221, in call
    return self.fn(y_true, y_pred, **self._fn_kwargs)
  File "/Users/t230418/Downloads/TensorFlow2/yolov3_tf2/models.py", line 304, in yolo_loss
    true_class_idx, pred_class)
  File "/usr/local/lib/python3.7/site-packages/tensorflow_core/python/keras/losses.py", line 978, in sparse_categorical_crossentropy
    y_true, y_pred, from_logits=from_logits, axis=axis)
  File "/usr/local/lib/python3.7/site-packages/tensorflow_core/python/keras/backend.py", line 4549, in sparse_categorical_crossentropy
    labels=target, logits=output)
  File "/usr/local/lib/python3.7/site-packages/tensorflow_core/python/ops/nn_ops.py", line 3477, in sparse_softmax_cross_entropy_with_logits_v2
    labels=labels, logits=logits, name=name)
  File "/usr/local/lib/python3.7/site-packages/tensorflow_core/python/ops/nn_ops.py", line 3397, in sparse_softmax_cross_entropy_with_logits
    precise_logits, labels, name=name)
  File "/usr/local/lib/python3.7/site-packages/tensorflow_core/python/ops/gen_nn_ops.py", line 11838, in sparse_softmax_cross_entropy_with_logits
    _six.raise_from(_core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: Received a label value of -1 which is outside the valid range of [0, 2). Label values: 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 -1 0 0 0 ... 0 0 0 -1 0 0 0 ... 0 0 0 -1 0 0 0 ... 0 0 0 0 0 0 0 0 0 0 [Op:SparseSoftmaxCrossEntropyWithLogits]
WARNING:tensorflow:Unresolved object in checkpoint: (root).layer-6
W1002 12:07:19.918585 4404483520 util.py:144] Unresolved object in checkpoint: (root).layer-6
WARNING:tensorflow:Unresolved object in checkpoint: (root).layer-7
W1002 12:07:19.918745 4404483520 util.py:144] Unresolved object in checkpoint: (root).layer-7
WARNING:tensorflow:Unresolved object in checkpoint: (root).layer-8
W1002 12:07:19.918799 4404483520 util.py:144] Unresolved object in checkpoint: (root).layer-8
WARNING:tensorflow:Unresolved object in checkpoint: (root).layer-6.arguments
W1002 12:07:19.918851 4404483520 util.py:144] Unresolved object in checkpoint: (root).layer-6.arguments
WARNING:tensorflow:Unresolved object in checkpoint: (root).layer-6._variable_dict
W1002 12:07:19.918897 4404483520 util.py:144] Unresolved object in checkpoint: (root).layer-6._variable_dict
WARNING:tensorflow:Unresolved object in checkpoint: (root).layer-6._trainable_weights
W1002 12:07:19.918942 4404483520 util.py:144] Unresolved object in checkpoint: (root).layer-6._trainable_weights
WARNING:tensorflow:Unresolved object in checkpoint: (root).layer-6._non_trainable_weights
W1002 12:07:19.918987 4404483520 util.py:144] Unresolved object in checkpoint: (root).layer-6._non_trainable_weights
WARNING:tensorflow:Unresolved object in checkpoint: (root).layer-7.arguments
W1002 12:07:19.919031 4404483520 util.py:144] Unresolved object in checkpoint: (root).layer-7.arguments
WARNING:tensorflow:Unresolved object in checkpoint: (root).layer-7._variable_dict
W1002 12:07:19.919076 4404483520 util.py:144] Unresolved object in checkpoint: (root).layer-7._variable_dict
WARNING:tensorflow:Unresolved object in checkpoint: (root).layer-7._trainable_weights
W1002 12:07:19.919139 4404483520 util.py:144] Unresolved object in checkpoint: (root).layer-7._trainable_weights
WARNING:tensorflow:Unresolved object in checkpoint: (root).layer-7._non_trainable_weights
W1002 12:07:19.919196 4404483520 util.py:144] Unresolved object in checkpoint: (root).layer-7._non_trainable_weights
WARNING:tensorflow:Unresolved object in checkpoint: (root).layer-8.arguments
W1002 12:07:19.919239 4404483520 util.py:144] Unresolved object in checkpoint: (root).layer-8.arguments
WARNING:tensorflow:Unresolved object in checkpoint: (root).layer-8._variable_dict
W1002 12:07:19.919282 4404483520 util.py:144] Unresolved object in checkpoint: (root).layer-8._variable_dict
WARNING:tensorflow:Unresolved object in checkpoint: (root).layer-8._trainable_weights
W1002 12:07:19.919326 4404483520 util.py:144] Unresolved object in checkpoint: (root).layer-8._trainable_weights
WARNING:tensorflow:Unresolved object in checkpoint: (root).layer-8._non_trainable_weights
W1002 12:07:19.919369 4404483520 util.py:144] Unresolved object in checkpoint: (root).layer-8._non_trainable_weights
WARNING:tensorflow:A checkpoint was restored (e.g. tf.train.Checkpoint.restore or tf.keras.Model.load_weights) but not all checkpointed values were used. See above for specific issues. Use expect_partial() on the load status object, e.g. tf.train.Checkpoint.restore(...).expect_partial(), to silence these warnings, or use assert_consumed() to make the check explicit. See https://www.tensorflow.org/alpha/guide/checkpoints#loading_mechanics for details.
W1002 12:07:19.919421 4404483520 util.py:152] A checkpoint was restored (e.g. tf.train.Checkpoint.restore or tf.keras.Model.load_weights) but not all checkpointed values were used. See above for specific issues. Use expect_partial() on the load status object, e.g. tf.train.Checkpoint.restore(...).expect_partial(), to silence these warnings, or use assert_consumed() to make the check explicit. See https://www.tensorflow.org/alpha/guide/checkpoints#loading_mechanics for details.

Any ideas?

AnaRhisT94 · 2019-10-02T11:45:54Z

We'll need more info than that.
How did you create the tfrecord files for those two classes?

jllarraz · 2019-10-02T12:13:31Z

I use RectLabel.

jllarraz · 2019-10-02T12:36:49Z

This is the script that I am using
rectlabel_create_pascal_tf_record.txt

AnaRhisT94 · 2019-10-02T13:00:57Z

You haven't used the transfer flag here, which is defined as:

flags.DEFINE_enum('transfer', 'none',
                  ['none', 'darknet', 'no_output', 'frozen', 'fine_tune'],
                  'none: Training from scratch, '
                  'darknet: Transfer darknet, '
                  'no_output: Transfer all but output, '
                  'frozen: Transfer and freeze all, '
                  'fine_tune: Transfer all and freeze darknet only')

Try to use:

flags.DEFINE_enum('transfer', 'no_output',
                  ['none', 'darknet', 'no_output', 'frozen', 'fine_tune'],
                  'none: Training from scratch, '
                  'darknet: Transfer darknet, '
                  'no_output: Transfer all but output, '
                  'frozen: Transfer and freeze all, '
                  'fine_tune: Transfer all and freeze darknet only')

Then use classes=80, and just filter out the classes you don't need at inference time.
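The filtering step suggested above could look roughly like the sketch below. It assumes detections are available as (class_id, score, box) tuples, which is an illustrative layout rather than the actual output format of this repository; `filter_detections` and the ids in `KEEP_CLASS_IDS` are hypothetical placeholders.

```python
# Hypothetical set of class ids to keep out of the 80 classes the model predicts.
KEEP_CLASS_IDS = {0, 2}


def filter_detections(detections, keep_ids=KEEP_CLASS_IDS):
    """Return only the detections whose class id is in keep_ids.

    Each detection is assumed to be a (class_id, score, box) tuple.
    """
    return [det for det in detections if det[0] in keep_ids]


# Made-up detections, only to show the filtering behaviour.
detections = [
    (0, 0.91, (0.10, 0.20, 0.40, 0.60)),
    (16, 0.80, (0.50, 0.50, 0.70, 0.90)),
    (2, 0.55, (0.05, 0.10, 0.30, 0.35)),
]

kept = filter_detections(detections)
print([det[0] for det in kept])
```

This leaves the network untouched and only drops unwanted predictions afterwards, so it does not address the label error in the training run itself.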

I'm also struggling with training for 2 classes myself. I'll be trying this https://github.com/YunYang1994/tensorflow-yolov3 implementation soon.

jllarraz · 2019-10-02T13:07:12Z

I pass transfer as a command-line parameter. As far as I know, that bit of code just defines the possible values of the transfer flag; if you don't specify one, it uses none as the default (or no_output in your example).

This is how I run the script (copied from the website):
python train.py --batch_size 8 --dataset /mypath/train.tfRecord --val_dataset /mypath/val.tfRecord --classes /myPath/my_coco.names --epochs 10 --mode eager_fit --transfer fine_tune --weights ./checkpoints/yolov3-tiny.tf --tiny

I also tried:
python train.py --batch_size 8 --dataset /mypath/train.tfRecord --val_dataset /mypath/val.tfRecord --epochs 10 --mode eager_fit --transfer fine_tune --weights ./checkpoints/yolov3-tiny.tf --tiny
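The point about flag defaults can be illustrated with a small sketch. It uses the standard-library argparse as a stand-in for absl's `flags.DEFINE_enum` (an assumption made so the example is self-contained; it is not the code in train.py), and `parse_transfer` is a hypothetical helper.

```python
import argparse

TRANSFER_CHOICES = ['none', 'darknet', 'no_output', 'frozen', 'fine_tune']


def parse_transfer(argv, default='none'):
    """Parse --transfer from argv; the default applies only when the flag is absent."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--transfer', choices=TRANSFER_CHOICES, default=default)
    return parser.parse_args(argv).transfer


# No flag on the command line: the declared default is used.
print(parse_transfer([]))
# Flag given explicitly: the declared default no longer matters.
print(parse_transfer(['--transfer', 'fine_tune']))
print(parse_transfer(['--transfer', 'fine_tune'], default='no_output'))
```

Under this reading, editing the default in the flag definition should make no difference for the two commands above, since both pass --transfer fine_tune explicitly.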

jllarraz · 2019-10-02T13:22:15Z

I have created a sample TFRecord with just one example, in case it helps to find the problem:
train_example.tfRecord.zip

AnaRhisT94 · 2019-10-02T13:29:55Z

The error says that a label value of -1 was received, and indeed the output contains a few -1 values. Take a look at that.

jllarraz · 2019-10-02T13:36:31Z

That's the reason why I am asking, because I can't figure out why that happens.

AnaRhisT94 · 2019-10-02T13:43:33Z

Debug the creation of the TFRecord file; maybe some objects end up with class_id = -1.

Check this part in the debugger:

class_id = getClassId(obj['name'], label_map_dict)
if class_id < 0:
    continue

Also check whether class_id is a string rather than an int; if it is, convert it with int(class_id) before comparing it with 0.
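One way to run that check without a debugger is to count which object names get skipped. The sketch below is only an illustration: `get_class_id` is a hypothetical stand-in for the script's getClassId (whose real behaviour is not shown in this thread), and the label map and objects are made up.

```python
from collections import Counter


def get_class_id(name, label_map_dict):
    """Hypothetical stand-in for getClassId: returns -1 for unknown names."""
    return int(label_map_dict.get(name, -1))


def audit_objects(objects, label_map_dict):
    """Return the class ids that would be written and a count of skipped names."""
    kept, skipped = [], Counter()
    for obj in objects:
        class_id = get_class_id(obj['name'], label_map_dict)
        if class_id < 0:
            skipped[obj['name']] += 1
            continue
        kept.append(class_id)
    return kept, skipped


# Made-up data: 'Dog' differs from 'dog' only by case, so it is skipped.
label_map = {'cat': 0, 'dog': 1}
objects = [{'name': 'cat'}, {'name': 'Dog'}, {'name': 'dog'}]

kept, skipped = audit_objects(objects, label_map)
print(kept, dict(skipped))
```

If the skipped counter is empty and every kept id is in range, the -1 values are probably introduced later, when the record is read back.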

jllarraz · 2019-10-02T14:00:14Z

Thanks for the tip, but I had already checked that, and it's an int. The class ID also seems fine: it is always in range.

AnaRhisT94 · 2019-10-02T14:30:03Z

You're welcome.
If you don't mind, I'd also love some help:
#75

Vitor050291 · 2019-10-07T17:59:48Z

I am having the same trouble here

DanielWicz · 2019-10-09T09:25:34Z

I am also having this trouble

fabhau · 2019-10-09T14:06:26Z

I am also having this problem in combination with transfer learning.

Kuz-man · 2019-11-05T15:09:52Z

Basically, the labels are created in dataset.py, where each item in the tfrecord is passed to the parse_tfrecord method. These are the lines you're interested in:

class_text = tf.sparse.to_dense(
        x['image/object/class/text'], default_value='')
labels = tf.cast(class_table.lookup(class_text), tf.float32)

So, if a label under image/object/class/text does not exist in the value you pass to the --classes argument, the label will be -1, as assigned here:

class_table = tf.lookup.StaticHashTable(tf.lookup.TextFileInitializer(
        class_file, tf.string, 0, tf.int64, LINE_NUMBER, delimiter="\n"), -1)

In a nutshell, make sure your labels in the tfrecord match those inside the --classes argument (which defaults to ./data/coco.names). In one of your calls above, you're leaving the argument empty and in the other you're using: --classes /myPath/my_coco.names.
Also, if your tfrecord already contains class/label values, you might want to use those instead of looking them up.
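The lookup behaviour described above can be mimicked in plain Python to check a dataset before training. This is a sketch that emulates the table with a dict, not the TensorFlow code itself; `load_class_names`, `lookup`, and `missing_labels` are hypothetical helpers, and the names and labels are made up.

```python
def load_class_names(names_text):
    """Map each non-empty line of a names file to its line number."""
    return {name: idx for idx, name in enumerate(names_text.splitlines()) if name}


def lookup(class_texts, table, default=-1):
    """Emulate a hash-table lookup that falls back to a default value."""
    return [table.get(text, default) for text in class_texts]


def missing_labels(class_texts, table):
    """Return the label strings that have no entry in the table."""
    return sorted(set(class_texts) - set(table))


# Made-up names file and record labels for a 2-class setup.
table = load_class_names("cat\ndog\n")
record_labels = ["cat", "Dog", "dog", "bird"]

print(lookup(record_labels, table))
print(missing_labels(record_labels, table))
```

Any string reported by `missing_labels` would become -1 in the loss, which matches the error in the first post; the comparison is exact, so differences in case or stray whitespace count as mismatches.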

zzh8829 · 2019-12-21T06:51:30Z

See this tutorial on custom training https://github.com/zzh8829/yolov3-tf2/blob/master/docs/training_voc.md

escudero · 2020-11-22T06:11:10Z

I don't know if it was your case.
It happened to me because I was using accented labels.
I removed the accents and it worked correctly.
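One possible explanation, offered as an assumption since the comment does not say why accents caused the failure, is that the same accented text can be stored in different Unicode forms, so an exact string match fails even though the labels look identical. The sketch below uses the standard unicodedata module; `strip_accents` is a hypothetical helper and the label is made up.

```python
import unicodedata


def strip_accents(text):
    """Remove combining marks after decomposing the text (NFD)."""
    decomposed = unicodedata.normalize('NFD', text)
    return ''.join(ch for ch in decomposed if not unicodedata.combining(ch))


# The same visible label written in two Unicode forms.
composed = 'cami\u00f3n'      # precomposed o-acute
decomposed = 'camio\u0301n'   # 'o' followed by a combining acute accent

print(composed == decomposed)
print(unicodedata.normalize('NFC', decomposed) == composed)
print(strip_accents(composed))
```

Normalizing both the names file and the labels to the same form, or stripping accents from both as described above, makes the exact-match lookup succeed.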

zzh8829 added the "training" (Training Related Questions) label on Dec 20, 2019

zzh8829 closed this as completed Dec 21, 2019
