
Fix slow doctests or mark # long time #35443

Merged: 12 commits, Apr 23, 2023
Conversation

@tornaria (Contributor) commented Apr 5, 2023

📚 Description

A test is supposed to take < 1s or else be marked # long time.

Here we consider slow tests taking >> 10s. When possible we fix or change the test so that it takes less time; otherwise we just mark the test # long time. Occasionally we create a new, smaller test and keep the original one marked # long time.

After this and #35442, the slowest remaining tests are a few taking ~ 10s.
The total time to doctest everything goes down from 880 to 806 seconds (using -tp 8 --all).
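For readers unfamiliar with the convention, here is a minimal sketch (plain Python, not Sage's actual doctest machinery; LONG_TIME and run_example are made-up names for illustration) of how a # long time gate behaves: marked examples only run when long mode is enabled.

```python
import time

LONG_TIME = False  # stands in for sage -t's --long flag

def run_example(fn, marked_long):
    """Run one doctest-like example; skip it when it is marked
    '# long time' and long mode is off. Returns (result, seconds)
    or None when skipped."""
    if marked_long and not LONG_TIME:
        return None  # skipped, like an unselected # long time test
    t0 = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - t0
```

Sage's real runner does this selection while parsing doctest annotations; the sketch only shows the skip-unless-long behavior that makes the # long time marker cheap for normal runs.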

NOTE: there's a minor merge conflict with #35314 which I will resolve once that PR is merged.

📝 Checklist

  • The title is concise, informative, and self-explanatory.
  • The description explains in detail what this PR is about.

@tornaria (Contributor Author) commented Apr 6, 2023

  • rebased to 10.0.beta8
  • added a new commit which should fix some more tests taking > 5s

@tornaria (Contributor Author) commented:

Rebased to 10.0.beta9 and added more commits.

This is long, but it should be easy to review since it's mostly adding # long time labels here and there. In the gh "files changed" tab it's very easy to see these lines.

A few changes either reduce the size of a test, or fix it so that it takes less time and doesn't need to be marked # long time.

If it is easier to review, I could either
(a) separate the changes just adding # long time from the few other changes.
(b) go through the review process myself and add a comment explaining each change that is not just adding # long time.

With this PR + a few more changes that I will PR separately, I have no tests taking more than ~ 5s. This saves ~10-15% of total test time (from 215s to 187s with -tp 32). Bear in mind that some tests are much faster with -tp1 than with -tp32.

There are still ~ 700 tests taking more than ~ 1s, but I will stop here.

@@ -298,6 +298,7 @@ def __init__(self, n, q, D, secret_dist='uniform', m=None):

sage: from numpy import std
sage: while abs(std([e if e <= 200 else e-401 for e in S()]) - 3.0) > 0.01:
....: L = [] # reset L to avoid quadratic behaviour
Member:

isn't the idea of this test that by increasing the number of samples, the error bound will be hit?

Contributor Author:

I'm not sure. To be honest, I'm not sure what the role of this test is, but the previous implementation exhibits quadratic behavior, which is why this test is usually ok but sometimes very slow:

sage -t --warn-long --random-seed=110988274722243807127083377606682083581 src/sage/crypto/lwe.py
**********************************************************************
File "src/sage/crypto/lwe.py", line 300, in sage.crypto.lwe.LWE.__init__
Warning, slow doctest:
    while abs(std([e if e <= 200 else e-401 for e in S()]) - 3.0) > 0.01:
        add_samples()
Test ran for 16.66 s, check ran for 0.00 s
    [112 tests, 17.29 s]

vs.

sage -t --warn-long 0.4 --random-seed=1 src/sage/crypto/lwe.py
**********************************************************************
File "src/sage/crypto/lwe.py", line 300, in sage.crypto.lwe.LWE.__init__
Warning, slow doctest:
    while abs(std([e if e <= 200 else e-401 for e in S()]) - 3.0) > 0.01:
        add_samples()
Test ran for 0.47 s, check ran for 0.00 s
    [112 tests, 1.37 s]

As far as I understand, they want to show that these samples indeed have a normal distribution with standard deviation 3.0. They take 1000 samples and want the standard deviation of these to be close to 3.0. Otherwise they keep adding samples, etc. until the standard deviation of the samples is indeed close to 3.0.

However, the way this is implemented, it becomes O(n^2) when 1000n samples have to be tried.

With my change, instead of adding more samples, we take a new set of 1000 samples. This way, trying 1000n samples is O(n). So, even if we have to retry more times, this is better.

Moreover, now that this is linear instead of quadratic, it's even faster to try samples of 100.

Summary: the way the test is done now, it keeps computing the standard deviation of a fresh set of 100 samples until it's really close to 3.0.
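The accumulate-versus-resample difference can be sketched in plain Python (a rounded Gaussian stands in for the LWE error sampler S(); the batch size of 1000 and the 0.05 tolerance are arbitrary choices for illustration):

```python
import random
from statistics import pstdev

random.seed(0)

def sample(n=1000, sigma=3.0):
    # fresh batch of n draws; a rounded Gaussian stands in for the
    # discrete error distribution sampled in the real doctest
    return [round(random.gauss(0, sigma)) for _ in range(n)]

# Old approach (quadratic): keep one ever-growing list, so each retry
# recomputes the std over everything drawn so far -- trying 1000*n
# samples costs O(n^2) total work.
L = sample()
while abs(pstdev(L) - 3.0) > 0.05:
    L.extend(sample())

# New approach (linear): discard the batch and redraw, so each retry
# costs O(batch) no matter how many retries are needed.
L = sample()
while abs(pstdev(L) - 3.0) > 0.05:
    L = sample()
```

Each pass of the second loop does a constant amount of work, so even when more retries are needed the total cost stays linear in the number of samples tried.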

Contributor Author:

Let me know if you are happy with my explanation. Otherwise, I'll revert and place a # long time label (although I'd be more inclined to just nuke the test).

I think this was the only non-cosmetic objection you had (and the cosmetic ones are more or less all addressed).

Member:

I think I would be more comfortable if we just get rid of the while loop, compute and print the std, and mark the result as random.

Member:

That said, your solution is of course fine.

src/sage/rings/tests.py: review thread (outdated, resolved)
@orlitzky (Contributor) commented:

With this PR + a few more changes that I will PR separately, I have no tests taking more than ~ 5s. This saves ~10-15% of total test time (from 215s to 187s with -tp 32). Bear in mind that some tests are much faster with -tp1 than with -tp32.

It takes several hours for me to run the test suite without --long, which really emphasizes how much of a losing battle this is while the threshold is measured in wall time and not cpu time. And that's with many files timing out completely (#32973).

NB: now that we've moved to Github, our notifications are once again being sent through SendGrid who regularly and intentionally violate the mail RFCs to delete my notifications (https://www.mail-archive.com/sage-devel@googlegroups.com/msg88600.html). Please keep that in mind if you ever want to draw my attention to a ticket.

@tornaria (Contributor Author) commented:

@mkoeppe Thanks for your review; I added your suggestions. There is also a minor change to a doctest, suggested by codecov: it turns out I changed one line of code in src/sage/plot/animate.py because the method apng() was using an incorrect filename (tmp_filename('.png') instead of the correct tmp_filename(ext='.png')). As a matter of fact, the doctest in line 46 tests for this, but that doesn't seem to satisfy codecov, so I modified a doctest in line 1046 to test this change.

@tornaria (Contributor Author) commented:

With this PR + a few more changes that I will PR separately, I have no tests taking more than ~ 5s. This saves ~10-15% of total test time (from 215s to 187s with -tp 32). Bear in mind that some tests are much faster with -tp1 than with -tp32.

It takes several hours for me to run the test suite without --long, which really emphasizes how much of a losing battle this is while the threshold is measured in wall time and not cpu time. And that's with many files timing out completely (#32973).

For me it is now taking 4786 cpu seconds, or 187 seconds wall time (using -tp 32 on a 36-core / 72-thread box).
This is down from 5362 cpu seconds (211 seconds wall time) on a clean 10.0.beta8 checkout.

NB: now that we've moved to Github, our notifications are once again being sent through SendGrid who regularly and intentionally violate the mail RFCs to delete my notifications (https://www.mail-archive.com/sage-devel@googlegroups.com/msg88600.html). Please keep that in mind if you ever want to draw my attention to a ticket.

I'm sorry about that. EEE at work.

sage: L.<b> = K.extension(x^2 + 26) # optional - sage.rings.number_field
sage: EL = E.change_ring(L) # optional - sage.rings.number_field
sage: iso2 = EL.isogenies_prime_degree(2); len(iso2) # optional - sage.rings.number_field
sage: pol = NumberField(pol26,'a').optimized_representation()[0].polynomial() # optional - sage.rings.number_field, long time
Member:

Also here

Contributor Author:

Oh, thanks... I can do a grep and find all of those. Is there a style guide about this? I was often unsure which column to place the first # in, what separation to use between labels, etc. The only rule I know is that the first # needs to be preceded by two spaces. Other than that, every convention I could think of is represented in some part of the code...

E.g. some places do # optional - A # optional - B but other places do # optional - A B, etc.

I'm not even sure about the "legal" syntax, much less about the "preferred" style.

Member:

We should definitely add a style guide for this; and a linter/fixer for these would probably also be useful (see #35401).

Both forms are correct. The style # optional - A # optional - B in many places comes from using my simple editor macros.
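A sketch of what the first step of such a linter might look like (illustrative only; optional_tags is a made-up helper, not Sage's doctest parser), normalizing both annotation styles to one tag set:

```python
import re

def optional_tags(line):
    """Extract package tags from a doctest line, accepting both the
    '# optional - A B' and '# optional - A # optional - B' styles."""
    tags = []
    # each match captures the text after 'optional -' up to the next '#'
    for chunk in re.findall(r'#\s*optional\s*-\s*([^#]*)', line):
        tags.extend(chunk.split())
    return sorted(tags)
```

Both "sage: f()  # optional - A B" and "sage: f()  # optional - A # optional - B" yield ['A', 'B'], so a fixer could rewrite one style into the other mechanically.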

src/sage/plot/animate.py: review thread (outdated, resolved)
@tornaria (Contributor Author) commented:

@mkoeppe I did reorder almost all # long time labels as you suggested.

These are all the exceptions:

$ git diff upstream/develop -- | grep '^+.*#.*# long'
+            sage: sum(FM.plot({}, srange(-2, 2, 0.1), srange(-2, 2, 0.1), opacity=0.2)  # not tested    # long time     # optional - sage.symbolic  # optional - sage.plot  # optional - sage.rings.number_field
+            sage: for j in M.irange():  # check on M's default frame  # long time
+            sage: for j in M.irange():  # check on frame e  # long time
+            sage: F.relative_error(asy[0], alpha, [1, 2, 4, 8, 16], asy[1])  # abs tol 1e-10  # long time
+            sage: rho.non_surjective() # See Section 5.10 of [Ser1972].  # long time
+            sage: rho.isogeny_bound() # See Section 5.10 of [Ser1972].  # long time
+            sage: rho.isogeny_bound()  # No 7-isogeny, but...   # long time
+            sage: rho.reducible_primes() # See Section 5.10 of [Ser1972].  # long time
+            sage: rho.isogeny_bound()  # No 7-isogeny, but...   # long time
+        sage: sage.schemes.elliptic_curves.gal_reps_number_field._non_surjective(E) # See Section 5.10 of [Ser1972].  # long time
+        sage: (out, err, ret) = test_executable([  # optional - gdb # long time
+        sage: out.find('(gdb) ') >= 0              # optional - gdb # long time
+        sage: ret                                  # optional - gdb # long time

Only the last three seem like they could be reordered; however, the # long time labels in those three lines are aligned with other # long time lines in the same file, and it looks ok this way.

@tornaria (Contributor Author) commented:

I think I'm done here; unless something else is really necessary, I'd rather finish this PR.

Aside: this "codecov" check is quite annoying, since I don't know how to make it happy. It seems all of my latest PRs are marked as check failures because of this. A couple had actual errors, but since all of them have red crosses, it's not immediately obvious which ones.

I think we should be really serious about CI passing, with PRs reworked if some check fails. But it's quite frustrating to aim at a moving target whose workings we don't know.

Is it possible for the codecov checks to run and report their findings without being taken into account in the global "pass/fail" decision for a PR?

Also, maybe it's better to have a separate repo/branch where CI experiments are carried out before being pushed to develop?

Comment on lines 428 to 433
sage: K = NumberField(x**2 - 29, 'a'); a = K.gen()
sage: E = EllipticCurve([1, 0, ((5 + a)/2)**2, 0, 0])
sage: sage.schemes.elliptic_curves.gal_reps_number_field._non_surjective(E) # See Section 5.10 of [Ser1972].
sage: sage.schemes.elliptic_curves.gal_reps_number_field._non_surjective(E) # See Section 5.10 of [Ser1972]. # long time
[3, 5, 29]
sage: E = EllipticCurve_from_j(1728).change_ring(K) # CM
sage: sage.schemes.elliptic_curves.gal_reps_number_field._non_surjective(E)
Contributor Author:

This one is really far right (column 130). However, it seems the comment before may be more important.

Member:

In situations like this, I have often rewritten the test as from sage.schemes.elliptic_curves.gal_reps_number_field import _non_surjective (even if this import line is very long).

src/sage/tests/cmdline.py: review thread (outdated, resolved)
@mkoeppe (Member) commented Apr 15, 2023

this "codecov" check is quite annoying, since I don't know how to make it happy. It seems all of my latest PRs are marked check failure because of this. A couple had actual errors, but since all of them have red crosses, it's not immediate to tell which ones.

I haven't followed the recent work on codecov; maybe @tobiasdiez or @kwankyu can comment on this

@mkoeppe (Member) commented Apr 15, 2023

@mkoeppe I did reorder almost all # long time labels as you suggested.

These are all the exceptions:

$ git diff upstream/develop -- | grep '^+.*#.*# long'
+            sage: sum(FM.plot({}, srange(-2, 2, 0.1), srange(-2, 2, 0.1), opacity=0.2)  # not tested    # long time     # optional - sage.symbolic  # optional - sage.plot  # optional - sage.rings.number_field
+            sage: for j in M.irange():  # check on M's default frame  # long time
+            sage: for j in M.irange():  # check on frame e  # long time
+            sage: F.relative_error(asy[0], alpha, [1, 2, 4, 8, 16], asy[1])  # abs tol 1e-10  # long time
+            sage: rho.non_surjective() # See Section 5.10 of [Ser1972].  # long time
+            sage: rho.isogeny_bound() # See Section 5.10 of [Ser1972].  # long time
+            sage: rho.isogeny_bound()  # No 7-isogeny, but...   # long time
+            sage: rho.reducible_primes() # See Section 5.10 of [Ser1972].  # long time
+            sage: rho.isogeny_bound()  # No 7-isogeny, but...   # long time
+        sage: sage.schemes.elliptic_curves.gal_reps_number_field._non_surjective(E) # See Section 5.10 of [Ser1972].  # long time
+        sage: (out, err, ret) = test_executable([  # optional - gdb # long time
+        sage: out.find('(gdb) ') >= 0              # optional - gdb # long time
+        sage: ret                                  # optional - gdb # long time

Only the last three seem like they could be reordered, however, the # long time labels in those three lines are aligned with other # long time lines in the same file and it looks ok this way.

I don't really have a strong preference for the order of # long time and the # optional annotations that correspond to the traditional "optional packages"; just the new modularization annotations # optional - sage... should not start before column 88 (and preferably exactly at column 88) to avoid being too distracting.

In any case, aligning the annotations in a column certainly reduces the visual clutter and is a good thing to do when one makes changes to these lines anyway.

@tobiasdiez (Contributor) commented:

this "codecov" check is quite annoying, since I don't know how to make it happy. It seems all of my latest PRs are marked check failure because of this. A couple had actual errors, but since all of them have red crosses, it's not immediate to tell which ones.

From what I observed, it's actually not an issue with codecov, but rather that some tests have random input and thus trigger different code paths. I've opened #35522 for this. If you experience any other problems, please open a new issue and I'll have a look.

@tornaria (Contributor Author) commented:

this "codecov" check is quite annoying, since I don't know how to make it happy. It seems all of my latest PRs are marked check failure because of this. A couple had actual errors, but since all of them have red crosses, it's not immediate to tell which ones.

From what I observed its actually not an issue with codecov but that some tests have random input and thus trigger different code paths. I've opened #35522 for this. If you experience any other problems, please open a new issue and I'll have a look.

Please have a look at the codecov/patch check: it's very specific

Check warning on line 1064 in src/sage/plot/animate.py

Codecov / codecov/patch

src/sage/plot/animate.py#L1064

Added line #L1064 was not covered by tests

However, if you look at the diff, I added a test in lines 1046 and 1047 that would fail without the change I made to line 1064. If that is not covering this line, please tell me what would cover that change.

As for the other issue: maybe codecov could be run with --random-seed=0 so it is more deterministic.

@tornaria (Contributor Author) commented:

Looking at https://app.codecov.io/gh/sagemath/sage/pull/35443/blob/src/sage/plot/animate.py#L1064 I think I understand what is going on here. The codecov/patch test only runs the testsuite in normal (not long) mode. In this case, the method apng() is never called in a normal-mode test.

Maybe an option is to run long tests at least just for those files that are changed in the patch. In fact, it'd be nice to run the whole testsuite in "normal" mode and the changed files a second time in "long" mode. This could also catch cases when doctesting works in "long" mode but it doesn't in "normal" mode because of a missing # long time label.
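A hedged sketch of what that could look like from the command line (sage -t, --long, and --random-seed are real Sage options; the git-based selection of changed files is an assumption about how a CI job could pick them, and upstream/develop is assumed to be the comparison branch):

```shell
# Doctest only the Python files this PR touches, in long mode,
# with a fixed seed for reproducibility.
git diff --name-only upstream/develop -- 'src/sage/*.py' \
    | xargs sage -t --long --random-seed=0
```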


In this particular case, maybe there could be a test that calls apng() for a trivial animation, so it's fast but still tests that the filename is set correctly.

But before worrying about that, we should be clear about the expectation: do we aim for 100% code coverage with normal tests? With long tests? Is this a goal for the whole codebase, or just for lines that change?

Whatever the answers to those questions, IMO we must stick to them: either do not merge PRs that fail the coverage check (with few reasonable exceptions), or else don't make coverage failure part of PR failure.

Otherwise, we risk making the whole CI check useless.

@tornaria (Contributor Author) commented:

As per my previous comment, I added a small quick test that should satisfy codecov/patch.

@github-actions commented:
Documentation preview for this PR is ready! 🎉
Built with commit: 1021032

@tobiasdiez (Contributor) commented:

Maybe an option is to run long tests at least just for those files that are changed in the patch. In fact, it'd be nice to run the whole testsuite in "normal" mode and the changed files a second time in "long" mode. This could also catch cases when doctesting works in "long" mode but it doesn't in "normal" mode because of a missing # long time label.

I don't think such a hybrid mode is supported (yet) by our doctest framework, or is it? Maybe we can always run all long tests in CI, or would this take too long?

But before worrying about that, we should be clear about what is the expectation: do we aim for 100% code coverage on normal test? on long test? Is this an aim for the whole codebase, or just for lines that change?

As far as I understand it, Sage places a high priority on writing tests with high coverage. Striving for 100% coverage, however, is usually not a good idea, since the additional tests created to cover "trivial branches" add maintenance overhead without providing real value.

Maybe we should move this discussion to a new issue?

@mkoeppe (Member) commented Apr 18, 2023

Maybe an option is to run long tests at least just for those files that are changed in the patch. In fact, it'd be nice to run the whole testsuite in "normal" mode and the changed files a second time in "long" mode. This could also catch cases when doctesting works in "long" mode but it doesn't in "normal" mode because of a missing # long time label.

I don't think such a hybrid mode is supported (yet) by our doctest framework, or is it? Maybe we can always run all long tests in CI, or would this take too long?

I'd be +1 on running the long tests in CI.

And before running the long tests, perhaps we can run the changed files of the PR first (similar to sage -t --new) for a quick turnaround.

@vbraun vbraun merged commit ef68bee into sagemath:develop Apr 23, 2023
6 of 7 checks passed
@mkoeppe mkoeppe added this to the sage-10.0 milestone Apr 23, 2023
@tornaria tornaria deleted the slow_doctests branch November 27, 2023 02:14