XLM-roberta assert failure:
> np.testing.assert_allclose(golden_out, result, rtol=1e-01, atol=1e-02)
E AssertionError:
E Not equal to tolerance rtol=0.1, atol=0.01
E
E Mismatched elements: 5505 / 4000032 (0.138%)
E Max absolute difference: 0.09074688
E Max relative difference: 3171.7234
E x: array([[[ 2.683771, 0.183121, 10.453473, ..., 6.315439, 2.047505,
E 3.32532 ],
E [-0.482143, 0.061366, 9.494564, ..., 6.593861, 1.620899,...
E y: array([[[ 2.671124, 0.182537, 10.456981, ..., 6.322483, 2.051546,
E 3.322179],
E [-0.481575, 0.061454, 9.495419, ..., 6.59101 , 1.619549,...
roberta-base-tf assert failure:
> np.testing.assert_allclose(golden_out, result, rtol=1e-01, atol=1e-02)
E AssertionError:
E Not equal to tolerance rtol=0.1, atol=0.01
E
E Mismatched elements: 453 / 804240 (0.0563%)
E Max absolute difference: 0.04533577
E Max relative difference: 763.70135
E x: array([[[33.55235 , -3.827327, 18.863625, ..., 3.420343, 6.171632,
E 11.648125],
E [-0.598835, -4.141003, 14.904708, ..., -4.515923, -1.790529,...
E y: array([[[33.567413, -3.829913, 18.870962, ..., 3.422938, 6.174327,
E 11.656706],
E [-0.58585 , -4.141752, 14.913631, ..., -4.516505, -1.788759,...
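Both reports above pair a small maximum absolute difference (under 0.1) with a very large maximum relative difference (hundreds to thousands). A minimal sketch of how that can happen, assuming the documented `assert_allclose` criterion `|actual - desired| <= atol + rtol * |desired|`: when the reference value is close to zero, the `rtol` term contributes almost nothing, so a small absolute error exceeds the tolerance and produces a huge relative error. `mismatch_report` is a hypothetical helper written for illustration, the input values are made up, and which elements NumPy includes in its reported maxima may differ between NumPy versions.

```python
import numpy as np


def mismatch_report(actual, desired, rtol=1e-1, atol=1e-2):
    """Summarise differences using the criterion applied by assert_allclose.

    Hypothetical helper for illustration; not part of NumPy or SHARK.
    """
    actual = np.asarray(actual, dtype=np.float64)
    desired = np.asarray(desired, dtype=np.float64)
    abs_diff = np.abs(actual - desired)

    # An element passes when |actual - desired| <= atol + rtol * |desired|.
    mismatched = abs_diff > atol + rtol * np.abs(desired)

    # Relative difference is taken against `desired`, skipping exact zeros.
    nonzero = desired != 0
    rel_diff = abs_diff[nonzero] / np.abs(desired[nonzero])

    return {
        "num_mismatched": int(mismatched.sum()),
        "num_elements": int(mismatched.size),
        "max_abs_diff": float(abs_diff.max()),
        "max_rel_diff": float(rel_diff.max()) if rel_diff.size else 0.0,
    }


# Made-up values: one large logit and one logit that is almost zero.
actual = np.array([10.05, 0.03])   # plays the role of golden_out
desired = np.array([10.0, 1e-5])   # plays the role of result

report = mismatch_report(actual, desired)
print(report)
```

In this sketch the first element is off by about 0.05 and passes, because its tolerance is 0.01 + 0.1 * 10 = 1.01. The second is off by about 0.03 and fails, because its tolerance is only about 0.01, and its relative difference is roughly 3000. Note that the relative term is scaled by the second argument of `assert_allclose` (here `result`), not by `golden_out`.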
To reproduce:
On an A100 instance:
remove the xfail for the GPU case in tank/roberta-base_tf/roberta-base_tf_test.py
remove the xfail for the GPU case in tank/xlm-roberta-base_tf/xlm-roberta-base_tf.py
run: pytest tank/*roberta -k "gpu"
monorimet changed the title from "TF roberta and XLM roberta numerics issues on A100 without TF32" to "TF roberta/XLM roberta numerics issues on A100 if num_iterations >= 100" on Aug 16, 2022