Mean of empty slice "dist.avg": np.nanmean(dists) warning - Top Down ID model #511

Closed
catubc opened this issue Mar 12, 2021 · 2 comments


catubc commented Mar 12, 2021

Hi,
I'm getting this crash during the save step after training (log below).

Any advice?
catubc


INFO:sleap.nn.training:Creating tf.data.Datasets for training data generation...
INFO:sleap.nn.training:Finished creating training datasets. [437.7s]
INFO:sleap.nn.training:Starting training loop...
Epoch 1/10
200/200 - 17s - loss: 0.0025 - CenteredInstanceConfmapsHead_loss: 0.0012 - ClassVectorsHead_loss: 1.3472 - ClassVectorsHead_accuracy: 0.2637 - val_loss: 0.0026 - val_CenteredInstanceConfmapsHead_loss: 0.0012 - val_ClassVectorsHead_loss: 1.3667 - val_ClassVectorsHead_accuracy: 0.2763
Epoch 2/10
200/200 - 18s - loss: 0.0025 - CenteredInstanceConfmapsHead_loss: 0.0011 - ClassVectorsHead_loss: 1.3315 - ClassVectorsHead_accuracy: 0.2612 - val_loss: 0.0023 - val_CenteredInstanceConfmapsHead_loss: 0.0011 - val_ClassVectorsHead_loss: 1.2187 - val_ClassVectorsHead_accuracy: 0.2368
Epoch 3/10
200/200 - 16s - loss: 0.0024 - CenteredInstanceConfmapsHead_loss: 0.0011 - ClassVectorsHead_loss: 1.3538 - ClassVectorsHead_accuracy: 0.2525 - val_loss: 0.0024 - val_CenteredInstanceConfmapsHead_loss: 0.0012 - val_ClassVectorsHead_loss: 1.2644 - val_ClassVectorsHead_accuracy: 0.2500
Epoch 4/10
200/200 - 16s - loss: 0.0026 - CenteredInstanceConfmapsHead_loss: 0.0012 - ClassVectorsHead_loss: 1.4481 - ClassVectorsHead_accuracy: 0.2587 - val_loss: 0.0025 - val_CenteredInstanceConfmapsHead_loss: 0.0012 - val_ClassVectorsHead_loss: 1.3283 - val_ClassVectorsHead_accuracy: 0.2368
Epoch 5/10
200/200 - 16s - loss: 0.0052 - CenteredInstanceConfmapsHead_loss: 0.0012 - ClassVectorsHead_loss: 3.9977 - ClassVectorsHead_accuracy: 0.2325 - val_loss: 0.0079 - val_CenteredInstanceConfmapsHead_loss: 0.0013 - val_ClassVectorsHead_loss: 6.6782 - val_ClassVectorsHead_accuracy: 0.2105
Epoch 6/10
200/200 - 16s - loss: 0.1655 - CenteredInstanceConfmapsHead_loss: 0.0017 - ClassVectorsHead_loss: 163.7638 - ClassVectorsHead_accuracy: 0.2412 - val_loss: 0.0026 - val_CenteredInstanceConfmapsHead_loss: 0.0012 - val_ClassVectorsHead_loss: 1.3679 - val_ClassVectorsHead_accuracy: 0.2895
Epoch 7/10
200/200 - 16s - loss: 0.1285 - CenteredInstanceConfmapsHead_loss: 0.0012 - ClassVectorsHead_loss: 127.2796 - ClassVectorsHead_accuracy: 0.2575 - val_loss: 0.0025 - val_CenteredInstanceConfmapsHead_loss: 0.0012 - val_ClassVectorsHead_loss: 1.2878 - val_ClassVectorsHead_accuracy: 0.3289
Epoch 8/10
200/200 - 16s - loss: 0.8476 - CenteredInstanceConfmapsHead_loss: 0.0012 - ClassVectorsHead_loss: 846.4465 - ClassVectorsHead_accuracy: 0.2625 - val_loss: 0.0047 - val_CenteredInstanceConfmapsHead_loss: 0.0012 - val_ClassVectorsHead_loss: 3.4568 - val_ClassVectorsHead_accuracy: 0.2105
Epoch 9/10
200/200 - 16s - loss: 0.0109 - CenteredInstanceConfmapsHead_loss: 0.0012 - ClassVectorsHead_loss: 9.6544 - ClassVectorsHead_accuracy: 0.2537 - val_loss: 0.0212 - val_CenteredInstanceConfmapsHead_loss: 0.0012 - val_ClassVectorsHead_loss: 19.9377 - val_ClassVectorsHead_accuracy: 0.1974
Epoch 10/10
200/200 - 16s - loss: 3.4853 - CenteredInstanceConfmapsHead_loss: 0.0012 - ClassVectorsHead_loss: 3484.0969 - ClassVectorsHead_accuracy: 0.2725 - val_loss: 6.5799 - val_CenteredInstanceConfmapsHead_loss: 0.0012 - val_ClassVectorsHead_loss: 6578.7217 - val_ClassVectorsHead_accuracy: 0.2895
INFO:sleap.nn.training:Finished training loop. [2.8 min]
INFO:sleap.nn.training:Saving evaluation metrics to model folder...
WARNING:sleap.nn.evals:Failed to compute metrics.
INFO:sleap.nn.evals:Saved predictions: models/gerbils.multiclass_topdown/labels_pr.train.slp

/usr/local/lib/python3.7/dist-packages/sleap/nn/evals.py:505: RuntimeWarning: Mean of empty slice
  "dist.avg": np.nanmean(dists),
/usr/local/lib/python3.7/dist-packages/sleap/nn/evals.py:538: RuntimeWarning: Mean of empty slice.
  mPCK = mPCK_parts.mean()
/usr/local/lib/python3.7/dist-packages/numpy/core/_methods.py:161: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
/usr/local/lib/python3.7/dist-packages/sleap/nn/evals.py:632: RuntimeWarning: Mean of empty slice.
  pair_pck = metrics["pck.pcks"].mean(axis=-1).mean(axis=-1)
/usr/local/lib/python3.7/dist-packages/numpy/core/_methods.py:154: RuntimeWarning: invalid value encountered in true_divide
  ret, rcount, out=ret, casting='unsafe', subok=False)
/usr/local/lib/python3.7/dist-packages/sleap/nn/evals.py:634: RuntimeWarning: Mean of empty slice.
  metrics["oks.mOKS"] = pair_oks.mean()

WARNING:sleap.nn.evals:Failed to compute metrics.
INFO:sleap.nn.evals:Saved predictions: models/gerbils.multiclass_topdown/labels_pr.val.slp

/usr/local/lib/python3.7/dist-packages/sleap/nn/evals.py:505: RuntimeWarning: Mean of empty slice
  "dist.avg": np.nanmean(dists),
/usr/local/lib/python3.7/dist-packages/sleap/nn/evals.py:538: RuntimeWarning: Mean of empty slice.
  mPCK = mPCK_parts.mean()
/usr/local/lib/python3.7/dist-packages/numpy/core/_methods.py:161: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
/usr/local/lib/python3.7/dist-packages/sleap/nn/evals.py:632: RuntimeWarning: Mean of empty slice.
  pair_pck = metrics["pck.pcks"].mean(axis=-1).mean(axis=-1)
/usr/local/lib/python3.7/dist-packages/numpy/core/_methods.py:154: RuntimeWarning: invalid value encountered in true_divide
  ret, rcount, out=ret, casting='unsafe', subok=False)
/usr/local/lib/python3.7/dist-packages/sleap/nn/evals.py:634: RuntimeWarning: Mean of empty slice.
  metrics["oks.mOKS"] = pair_oks.mean()

catubc changed the title on Mar 12, 2021
from: Mean of empty slice "dist.avg": np.nanmean(dists) crash - Top Down ID model
to: Mean of empty slice "dist.avg": np.nanmean(dists) warning - Top Down ID model

Collaborator

talmo commented Mar 12, 2021

Looks like the training diverged: the class vectors loss fluctuates wildly, going from about 1.3 in the first epochs to about 3484 by epoch 10. Since there was no improvement for some number of epochs, training stopped early and the resulting model failed to generate any predictions.
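
For context, `Mean of empty slice` is the warning NumPy emits when `np.nanmean` receives an empty array, which is consistent with there being no predictions to score. Below is a minimal sketch that reproduces it. The `dists` array and the `safe_nanmean` helper are hypothetical illustrations and are not taken from `sleap/nn/evals.py`.

```python
import warnings

import numpy as np

# Hypothetical stand-in for the distances computed during evaluation
# when no predicted instances are available to compare against.
dists = np.array([], dtype=float)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    avg = np.nanmean(dists)  # emits RuntimeWarning: Mean of empty slice

messages = [str(w.message) for w in caught]
print(avg, messages)


def safe_nanmean(values):
    """Return NaN quietly when there is nothing to average (illustrative helper)."""
    values = np.asarray(values, dtype=float)
    if values.size == 0 or np.all(np.isnan(values)):
        return float("nan")
    return float(np.nanmean(values))
```

The result is NaN either way, so the warning by itself is harmless; the thing to look into is why the input is empty in the first place.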

Maybe try increasing the class vectors loss weight a bit (x10), increasing the early stopping patience (+5), or decreasing the early stopping minimum improvement threshold (~1e-8 is good).
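
To illustrate how the patience and the minimum improvement threshold interact, here is a minimal sketch of a generic early stopping rule in plain Python. It is not SLEAP's or Keras's implementation. The function name `early_stop_epoch` and the patience and threshold values passed to it are assumptions for illustration; the validation losses are the `val_loss` values from the log above.

```python
def early_stop_epoch(val_losses, patience, min_delta):
    """Return the 1-based epoch at which a generic early stopping rule
    would halt training, or None if it never triggers.

    An epoch counts as an improvement only if its loss is lower than the
    best loss seen so far by more than min_delta.
    """
    best = float("inf")
    wait = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# val_loss per epoch, copied from the training log in this issue.
val_losses = [0.0026, 0.0023, 0.0024, 0.0025, 0.0079,
              0.0026, 0.0025, 0.0047, 0.0212, 6.5799]

# Hypothetical settings, chosen only to show the effect of each knob.
print(early_stop_epoch(val_losses, patience=5, min_delta=1e-8))
print(early_stop_epoch(val_losses, patience=10, min_delta=1e-8))
print(early_stop_epoch(val_losses, patience=5, min_delta=1e-3))
```

With these numbers, a larger patience lets the run continue through all 10 epochs, while a large threshold ignores the small improvement at epoch 2 and stops sooner. That is why a small threshold such as 1e-8 gives slowly improving models more room.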

Author

catubc commented Mar 12, 2021

Yeah, I think I trained for too few epochs.
I'll post again if this still occurs after 500+ epochs.
catubc

catubc closed this as completed on Mar 12, 2021