
[WIP] Add systematic tests #1072

Open
wants to merge 5 commits into main

Conversation

simsurace
Contributor

In order to find possible failure modes of Enzyme, the idea of this PR is to iterate over "all" functions in Base and stdlibs and test whether Enzyme gives the correct derivatives. Currently, it is probably noisy and pretty limited (also to save CI time):

  • It only includes LinearAlgebra
  • It is limited to 1- and 2-argument functions
  • It only checks functions that take scalars, vectors, or matrices and return real numbers or arrays of real numbers (excluding Bool).

Still, it may already catch some interesting failures.
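Roughly, the enumeration looks like this (a minimal sketch; the filtering predicate and the finite-difference comparison below are illustrative stand-ins, not the exact code in this PR):

```julia
using Enzyme, LinearAlgebra, FiniteDifferences

# Collect exported names of a module that are functions with a one-argument
# method (the PR also covers two-argument functions and applies more filters).
function candidate_functions(mod::Module)
    fs = Any[]
    for name in names(mod)
        isdefined(mod, name) || continue
        f = getfield(mod, name)
        f isa Function || continue
        # m.nargs counts the function itself, so a 1-argument method has nargs == 2
        any(m -> m.nargs == 2, methods(f)) && push!(fs, f)
    end
    return fs
end

# Compare Enzyme's reverse-mode gradient against a finite-difference estimate.
function gradient_matches(f, x::AbstractVector)
    g_enzyme = Enzyme.gradient(Reverse, f, x)  # newer Enzyme versions return a tuple here
    g_fd = FiniteDifferences.grad(central_fdm(5, 1), f, x)[1]
    return isapprox(g_enzyme, g_fd; rtol = 1e-4)
end
```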

@codecov-commenter

codecov-commenter commented Sep 23, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (8981a34) 74.34% compared to head (d530851) 93.28%.


Additional details and impacted files
@@             Coverage Diff             @@
##             main    #1072       +/-   ##
===========================================
+ Coverage   74.34%   93.28%   +18.93%     
===========================================
  Files          35        8       -27     
  Lines       10307      253    -10054     
===========================================
- Hits         7663      236     -7427     
+ Misses       2644       17     -2627     


@wsmoses
Member

wsmoses commented Sep 24, 2023

@simsurace you mentioned some functionality to automatically submit minimal tests (which were tracked) for current failures?

Is that possible to do (it would be immensely useful)?

@simsurace
Contributor Author

Yeah, I will think about how to do that. Basically, it should be easy to generate code for an MRE from the failed test. But first we may need to clarify whether the current way of detecting failures is the correct one: there seem to be hundreds of them even just in the current test set, and you probably don't want thousands of open issues to sift through by hand (assuming we test other modules as well). So for now the better option may still be to open issues by hand (e.g. the eigmax one would come first, because it is fatal, i.e. it stops the whole test suite).
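For the code-generation part, something like this hypothetical helper could work (the call form in the emitted snippet is just an example):

```julia
# Hypothetical helper: given the function and the input that triggered a
# failure, emit a self-contained MRE as a string that can be pasted into an issue.
function mre_string(f, x)
    """
    using Enzyme, LinearAlgebra

    x = $(repr(x))
    Enzyme.gradient(Reverse, $(nameof(f)), x)
    """
end

# e.g. mre_string(eigmax, [1.0 2.0; 2.0 1.0]) yields a runnable snippet.
```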

@wsmoses
Member

wsmoses commented Sep 24, 2023

Maybe we could have a single issue with tasks that are auto populated/updated?

@simsurace
Contributor Author

Great idea. I need to figure out how to do this.
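One possible route (a sketch; the issue number and helper name are placeholders) would be to keep the task list in a single tracking issue and rewrite its body from CI via the GitHub CLI:

```julia
# Sketch: maintain one tracking issue whose body is a markdown task list,
# rewritten on every CI run via the GitHub CLI. Issue number 9999 is a placeholder.
function update_tracking_issue(failures::Vector{String}; issue::Int = 9999)
    body = join(("- [ ] `$f`" for f in failures), "\n")
    path, io = mktemp()
    write(io, body)
    close(io)
    run(`gh issue edit $issue --body-file $path`)
end
```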

@wsmoses
Member

wsmoses commented Jan 27, 2024

Bump @simsurace, did you have a chance to work on this?

@simsurace
Contributor Author

I think the whole automation involving auto-updating of GitHub issues is beyond my current capacity. Would the tests as they are now be useful as a non-blocking CI step? Or could we maybe extract a bunch of MREs by hand?

@simsurace
Contributor Author

I've thought about it a bit. From my understanding, these are the steps that would need to be solved to make an automatic solution work:

  • Figure out how to extract all relevant information about the error as a string/text file
  • Figure out how to post/update an issue to the GitHub repo from within Julia or bash (including the string or text file from above). Maybe use an id that can be checked to avoid duplicate issues.

Then I think the loop over modules and functions could be changed such that it only runs until the first error, saves the whole error output to a file, and then updates the issue with a code block corresponding to the call that triggered the error, with the stacktrace file attached.

I don't think any of this is very hard, but it's stuff I need time to figure out. So if someone has done it before, they would certainly be in a much better position to get this done quickly. Anyway, I will try to remember this task when I have some time available.
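A rough sketch of the first step (names and the de-duplication scheme are assumptions, not settled design):

```julia
# Run one test call and, on error, capture the full stacktrace as text
# together with a stable id for de-duplicating issues.
function capture_failure(f, args...)
    try
        f(args...)
        return nothing
    catch err
        io = IOBuffer()
        showerror(io, err, catch_backtrace())
        trace = String(take!(io))
        id = string(hash((nameof(f), typeof.(args))))  # crude de-duplication key
        return (; id, trace)
    end
end
```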

@gdalle
Contributor

gdalle commented Jun 24, 2024

This might actually be easier with DifferentiationInterfaceTest: we could define scenarios for all Base functions and just use test_differentiation to run them.
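Very roughly (the exact constructors and keyword arguments are assumptions about the current DifferentiationInterfaceTest API and may differ between versions):

```julia
using DifferentiationInterfaceTest
using ADTypes: AutoEnzyme

# Run the package's built-in scenarios against Enzyme and check correctness;
# custom scenarios for Base functions would be appended to this list.
test_differentiation(AutoEnzyme(), default_scenarios(); correctness = true)
```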

See also #1563
