[bug] Model comparison tests are flaky WRT word_embeddings_weight gradients #304

@jlamypoirier

🐞 Describe the Bug

Model comparison tests (aka run_test_script) have recently started failing at random with an excessive diff on the word_embeddings_weight gradients (>>>> [train_2] Excessive diff for tensor Global gradient: layers.0.word_embeddings_weight). The diffs are only slightly above the threshold. We need to investigate whether there is an actual bug or regression behind this, or whether it is just random noise.

Example:

>>>> [train_2] Excessive diff for tensor Global gradient: layers.0.word_embeddings_weight:
  * Max diff scaled = 0.15082430839538574 > 0.15 (scale=0.001214031595736742, unregularized=0.0006883841124363244)
  Ref samples:    0.0000e+00  0.0000e+00  0.0000e+00  0.0000e+00  0.0000e+00  0.0000e+00  0.0000e+00  0.0000e+00  0.0000e+00  2.9449e-03
  Test samples:   0.0000e+00  0.0000e+00  0.0000e+00  0.0000e+00  0.0000e+00  0.0000e+00  0.0000e+00  0.0000e+00  0.0000e+00  2.9182e-03
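
To get a feel for how marginal this failure is, here is a minimal sketch that re-derives the logged numbers. It rests on two assumptions that are inferred from the log, not read from the test code: that the reported scale is the unregularized scale combined in quadrature with an epsilon of 1e-3, and that "Max diff scaled" is the maximum absolute difference divided by that scale. All function and constant names are hypothetical.

```python
import math

# Values copied verbatim from the log above.
SCALED_DIFF = 0.15082430839538574
SCALE = 0.001214031595736742
UNREGULARIZED = 0.0006883841124363244
THRESHOLD = 0.15

# Assumption: this epsilon is inferred from the logged numbers,
# not taken from the actual comparison code.
ASSUMED_EPSILON = 1e-3


def regularized_scale(unregularized: float, epsilon: float = ASSUMED_EPSILON) -> float:
    """Combine the unregularized scale and epsilon in quadrature (assumed formula)."""
    return math.hypot(unregularized, epsilon)


def implied_max_abs_diff(scaled_diff: float, scale: float) -> float:
    """Recover the absolute diff, assuming scaled_diff = max_abs_diff / scale."""
    return scaled_diff * scale


def relative_overshoot(scaled_diff: float, threshold: float) -> float:
    """Fraction by which the scaled diff exceeds the threshold."""
    return scaled_diff / threshold - 1.0


if __name__ == "__main__":
    print("reconstructed scale:", regularized_scale(UNREGULARIZED))
    print("logged scale:       ", SCALE)
    print("implied max abs diff:", implied_max_abs_diff(SCALED_DIFF, SCALE))
    print("overshoot over threshold:", relative_overshoot(SCALED_DIFF, THRESHOLD))
```

If those assumptions hold, the reconstructed scale agrees with the logged one and the scaled diff exceeds the threshold by roughly half a percent. That is consistent with noise near the tolerance boundary, but it does not rule out a regression.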
