Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failing Unit Test Bugfixes and Performance Related Testing Refactors #372

Merged
merged 21 commits into from
Nov 16, 2023

Conversation

coreyostrove
Copy link
Contributor

This PR includes two main changes:

  1. Bugfixes for the failing notebook regression tests uncovered after merging in PR Feature globally germ aware fpr #350. The details of exactly what broke is complicated, but in a nutshell related to some new numerical stability problems with some of the older parts of the germ selection code (why this started randomly breaking now, who knows?). For more details see the thread at Nondeterministic Unit Test Failures #369.
  2. A combination of unit test performance tweaks and refactors to address issue test_fiducialpairreduction takes a very long time #360. Speedups are primarily due to tweaks to the algorithms and drivers testing modules. This includes switching from legacy modelpacks to modern ones in a number of places.

It looks like all of the unit tests, including the extras and the notebook regression tests (which I manually dispatched on this branch) are passing now. Let me know if you have any questions.

One thing worth noting, I had originally merged in the branch bugfix-for-0.9.12.0, but found that this resulted in a number of unit tests in the extras package failing. Check out 'build and run test extras' 211 for more details on that. So I opted to revert that merge on this branch to keep the changes separate (and save myself the effort of tracking that down).

sserita and others added 20 commits November 9, 2023 14:52
I am unable to reproduce the failure scenario locally on my windows machine, so try some remote debugging on the github runners.
Not the most graceful, but only import qibo if the correct version.
Will be removed next major release due to growing divergence
between actual and desired interface
The streamlines the tests in the algorithms test module and does a refactor of the test fixtures to use a more modern modelpack format. Also improves coverage on fiducial selection codebase.
Add the failing line of the germ selection notebook as a unit test to try and get more verbose output on the failure from the runner. Also temporarily tweak main.yml to only test against windows for faster turnaround.
Some more verbosity for debugging
minor typo fix
Try using the schur decomposition in the twirling superoperator construction instead. Also some unrelated fixes for the fiducial pair reduction unit tests.
To use a more modern modelpack, and also a smaller one which speeds up the testing a bit.
Streamlined the RB unit tests to test on a fewer number of qubits for a modest reduction in the runtime.
Change the distribution CLI option for xdist to ensure tests remain properly grouped across workers.
Reduce the number of parameterizations fit in the QutritGST demo notebook to reduce runtime (it was periodically timing out on certain cells on the github runners).
Update the implementation of the twirling superoperator to check the commutator of the input matrix and branch off of that to determine the decomposition to use. Also rename the new germ selection unit test.
Resume testing against the entire module.
Now that the evotype dependent serialization bug is resolved go ahead with disabling checkpointing on these tests.
…o feature-faster-algorithm-tests"

This reverts commit b4325dc, reversing
changes made to 38ca39d.
Copy link
Contributor

@rileyjmurray rileyjmurray left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly looks good! I have two substantive comments for the eig vs Schur issue. I also have a question about the level of user-exposure for code in pygsti/algorithms/directx.py.

Somewhat tangentially, I stumbled across this comment block for the Label class, which says that it's supposed to be a base class only. Based on how the Label class is used in practice, it seems like this comment is wrong. Corey, what's your take?

Comment on lines +398 to +400
sigma,
Label('GsigmaLbl') if sigma.line_labels == ('*',) else Label('GsigmaLbl', sigma.line_labels),
dataset, prep_fiducials, meas_fiducials, target_model,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting that you've made a change to this file, since Erik just marked it as minimal-exposure. @coreyostrove any chance you'd disagree with Erik's take on this file's user exposure? Here's the conventions we agreed to, for reference:

  • High exposure: API changes are likely to break workflows for some users, including "casual" users.
  • Low exposure: API changes are unlikely to affect casual users of pyGSTi but might affect some of pyGSTI's advanced users.
  • Minimal exposure: while it's conceivable that API changes could break user code, the possibility of this is sufficiently remote that we can change the API at will.

Copy link
Contributor Author

@coreyostrove coreyostrove Nov 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. I agree with Erik's marking of this as minimal-exposure. This change arose as a bugfix for some failures encountered in the corresponding test package (test_directx.py) as part of the refactor moving from legacy modelpacks to the modern implementation in testing. This particular line of code turned out to be fragile when it came to circuit line labeling, and so broke when using circuits from the new modelpacks that have different default circuit line labeling behavior than they do in the legacy modelpacks.

In reality I suspect this is likely due to a disjoint bug in the circuit line label handling code itself in the circuits module that I've run into in other contexts and mentioned to @sserita before, but haven't ever properly documented. (This is a bug related to properly reconciling line labels between circuits with a placeholder '*' and those with fully specified labels). In the moment it was easier to patch in a workaround in directx than to context switch and try to track the circuit bug. That said, I am in the correct headspace for this sort of thing now, so if nothing else I should be able to get a minimal reproduction of the aforementioned bug documented pretty soon.

While I was working in the directx code I did wonder to myself if we even need any of this anymore? I.e. insofar as it is minimal exposure I suspect that is because few (possibly 0) people use it in practice and there are now generally better (more pygstithonic) ways to produce the same results. I've personally never interacted with this code in practical usage, and don't know of anyone who does use this code. Given that, maybe this would be a decent candidate for a module we can retire to legacy as part of out spring cleaning process. @enielse and @sserita, thoughts on this last question?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: I just read through @enielse's latest commit on the code marking branch and saw that he has marked directx as:

user-exposure: minimal, forsaken (EGN - we don't use this anymore and never really did)

So that is two votes in favor of decommissioning this part of the codebase. @sserita, how do you feel about formally marking this as deprecated in 0.9.12.0 and then removing it (remove = move to the legacy folder) either in 0.9.13.0 or some point thereafter?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the minimal tag and we can formally mark this as deprecated. I will do so in the bugfix-for-0.9.12 branch before I merge that in.

pygsti/algorithms/germselection.py Outdated Show resolved Hide resolved
This commit adds a new kwarg for specifying the tolerance to use in determining whether a matrix is normal for the purposes of selecting the correct decomposition algorithm. Also adds a proper docstring explaining the various function arguments and outputs and removes some holdover print statements from debugging.
@coreyostrove
Copy link
Contributor Author

RE: Label. I use the base Label class directly pretty regularly, for what that is worth. I suspect that I am not alone in that regard, and that the comment block you mentioned probably accurately describes the initial design intentions, but not present behavior.

Copy link
Contributor

@sserita sserita left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all looks good, thanks for your hard work on this!

@sserita sserita merged commit f2c28b5 into develop Nov 16, 2023
13 checks passed
@sserita sserita deleted the feature-faster-algorithm-tests branch November 16, 2023 18:58
@sserita sserita mentioned this pull request Dec 2, 2023
86 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants