Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attempting to fix onnx tests #1278

Merged
merged 5 commits into from Jan 12, 2023
Merged

Attempting to fix onnx tests #1278

merged 5 commits into from Jan 12, 2023

Conversation

corey-nm
Copy link
Contributor

@corey-nm corey-nm commented Dec 20, 2022

Background

This is attempting to fix a bug in python 3.9 where the test_one_shot_ks_perf_sensitivity is failing.

Two notes:

  1. I could not reproduce this failure locally
  2. These test pass in python 3.7/3.8/3.10 jenkins pipelines.

So the only failure is in jenkins pipeline on python 3.9.

My current hypothesis is that the results returned by model analysis can have multiple values where the x["index"] is the same:
image

This could make the sorted call be inconsistently ordering the actual results.

Another piece of information supporting this hypothesis is the error messages of the 2 failures:

failure 1:

expected_layers = [{'averages': OrderedDict([(0.0, 3.500000000000001e-05), (0.4, 3.2000000000000005e-05), (0.8, 3.300000000000001e-05), ...'baseline_average': 7.089999999999999e-05, 'baseline_measurement_index': 0, 'baseline_measurement_key': 0.0, ...}, ...]
actual_layers = [{'averages': OrderedDict([(0.0, 4.599999999999999e-05), (0.4, 4.499999999999999e-05), (0.8, 4.699999999999999e-05), (...0152823)]), 'baseline_average': 0.0137507, 'baseline_measurement_index': 0, 'baseline_measurement_key': 0.0, ...}, ...]
is_perf = True

failure 2:

expected_layers = [{'averages': OrderedDict([(0.0, 4.9999999999999996e-05), (0.4, 3.0000000000000004e-05), (0.8, 4.599999999999999e-05),...9999e-05)]), 'baseline_average': 4.07e-05, 'baseline_measurement_index': 0, 'baseline_measurement_key': 0.0, ...}, ...]
actual_layers = [{'averages': OrderedDict([(0.0, 3.4000000000000007e-05), (0.4, 3.2000000000000005e-05), (0.8, 3.500000000000001e-05),... 'baseline_average': 0.015620499999999997, 'baseline_measurement_index': 0, 'baseline_measurement_key': 0.0, ...}, ...]
is_perf = True

If you swap the actual_layers values from both of these failures, then it seems like both would pass.

Fix

If my hypothesis is correct, then adding a tie-breaker to the sorting should fix this bug. I changed the sorting key to sort based on:

  1. id
  2. if id's are equal, then index
  3. If id and index are equal, then name

This will break the ties, so if this doesn't solve the issue then I don't know what the problem is.

Copy link
Member

@bfineran bfineran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for digging in Corey. definitely seems like it could be a sorting consistency issue, however curious why this would be popping up now especially since it's a call into python sort which I believe is stable

were you able to have tests pass locally on this? looks like the onnx tests in GHA are failing on some weird numpy property exception

tests/sparseml/onnx/optim/test_sensitivity_ks.py Outdated Show resolved Hide resolved
@corey-nm
Copy link
Contributor Author

corey-nm commented Dec 20, 2022

@bfineran

were you able to have tests pass locally on this? looks like the onnx tests in GHA are failing on some weird numpy property exception

Yeah this is related to the numpy 1.24 breaking changes (ie rahul's numpy pins should fix this)

@bfineran bfineran merged commit 0b703ea into main Jan 12, 2023
@bfineran bfineran deleted the try-fix-onnx-tests branch January 12, 2023 17:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants