Updated slow tests and removed a few XFAIL #16010

Merged: 16 commits into sympy:master on Feb 24, 2019

Conversation

@oscargus (Contributor)

References to other Issues or PRs

Closes #9807
Closes #7157 (see #9242 for discussion if it is a good idea)

Brief description of what is fixed or changed

Based on the recent runs I marked a number of tests as slow. I also removed XFAIL from tests that appear to be passing.

Other comments

Some of the slow tests have names identical to other tests. Since the output reports only the test name, not the file it is in, I renamed those tests so it is clear which one is which.

It would make sense to update the test time distribution at some stage (after this), since the slow-test splits currently run for 6, 9, and 18 minutes (and the imbalance will probably get worse after this).

Is there a way to change the threshold for reporting not-so-slow tests? Right now the runner only shows the skipped tests, but tests that (now) take only a few seconds should probably have their slow markers removed. It is not obvious how to identify them, though.

Release Notes

NO ENTRY

@oscargus added the Testing label on Feb 17, 2019
@sympy-bot

Hi, I am the SymPy bot (v137). I'm here to help you write a release notes entry. Please read the guide on how to write release notes.

  • No release notes entry will be added for this pull request.

Note: This comment will be updated with the latest check if you edit the pull request. You need to reload the page to see it.


@oscargus (Contributor Author)

I tried to generate new split densities, but the complete run took 25 seconds, so clearly something is not working. I used the following code:

# This list can be generated with the code:
# from time import time
# import sympy
#
# delays, num_splits = [], 30
# for i in range(1, num_splits + 1):
#     tic = time()
#     sympy.test(split='{}/{}'.format(i, num_splits), time_balance=False)
#     delays.append(time() - tic)
# tot = sum(delays)
# print([round(x / tot, 4) for x in delays]))

(Extra ending bracket on the final line, but that is not the problem...)

@oscargus (Contributor Author) commented Feb 17, 2019

Before this the time taken was 6/12/16/21 minutes for non-slow tests and 6/9/18 minutes for slow tests. New numbers are 4/8/14/12 minutes for non-slow tests and 7/14/26 minutes for slow tests.

With new split densities, the non-slow tests should finish in about 10 minutes and the slow tests in about 16 minutes.

@oscargus changed the title from "[WIP] Updated slow tests and removed a few XFAIL" to "Updated slow tests and removed a few XFAIL" on Feb 17, 2019
@oscargus (Contributor Author)

Btw, I changed the default threshold for warning that a slow-marked test is not actually slow from 0.1 seconds to 5 seconds. I think it makes sense to keep it like that. I also removed the slow decorator from most of the tests taking less than 5 seconds; where I didn't, the results differed between 2.7 and 3.7, or I simply missed them.

There is a test, test_files, that takes 45 seconds on 2.7 but less than 10 seconds on 3.X. I didn't mark it as slow, since it checks the code quality of the files, so it is better to have it run every time. (Not sure exactly when it is run, though...)

@oscarbenjamin (Contributor)

This all looks good to me.

Are you updating the splits as well?

I've just started running the split timer but realised that I should probably not use the computer while they're running. Need a spare computer really...

@oscargus (Contributor Author) commented Feb 17, 2019

I didn't manage to get the split code running (see above). The complete run took 25 seconds with a rather even distribution, so I really doubt the results...

Yeah, I recognize the multi-computer problem... But it should be OK to run it overnight, if only the code ran properly for me. Do you have any idea how to run it? I have no problem running it, if only it worked...

@oscarbenjamin (Contributor)

I'm using the same code and it seems to be working:

$ cat runsplits.py 
from time import time
import sympy

delays, num_splits = [], 30
for i in range(1, num_splits + 1):
    tic = time()
    sympy.test(split='{}/{}'.format(i, num_splits), time_balance=False)
    delays.append(time() - tic)
tot = sum(delays)
print([round(x / tot, 4) for x in delays])

It's been running for 30 minutes and is at split 25/30.

I'm not sure what to do about external dependencies though. Should I uninstall numpy, matplotlib, tensorflow etc and base the splits on that configuration?

@oscargus (Contributor Author)

OK! I ran it from IPython, but I'll try a separate file and see if that changes anything.

I don't think those tests change the timings much, but I really have no idea. Either way, it will probably improve on the current splits...

Any idea how to profile the slow tests?

@oscarbenjamin (Contributor)

> Any idea how to profile the slow tests?

I think you can pass slow=True to sympy.test, although I don't know whether that runs both slow and non-slow tests.
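For reference, a minimal sketch of timing the slow tests this way (slow=True is the same keyword used in runsplits.py later in this thread; whether it also runs the non-slow tests is, as noted, unclear):

from time import time
import sympy

tic = time()
sympy.test(slow=True)  # run the tests marked with @slow
print('slow tests took {:.1f} s'.format(time() - tic))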

@oscarbenjamin (Contributor)

The output I get is:

[0.0977, 0.0406, 0.038, 0.0077, 0.0081, 0.004, 0.0527, 0.0259, 0.0721, 0.0794, 0.0018, 0.0019, 0.0091, 0.0086, 0.0267, 0.0043, 0.004, 0.0219, 0.0529, 0.0018, 0.0118, 0.019, 0.0016, 0.0384, 0.0439, 0.0295, 0.2488, 0.0026, 0.0062, 0.039]

@oscarbenjamin (Contributor)

I also got 2 failures though:

______________________________________________________________________________________________________________________________________________________________
_________________________________________ sympy/utilities/tests/test_lambdify.py:test_tensorflow_logical_operations __________________________________________
Traceback (most recent call last):
  File "/Users/enojb/current/sympy/sympy/sympy/utilities/tests/test_lambdify.py", line 597, in test_tensorflow_logical_operations
    a = tensorflow.constant(False)
  File "/Users/enojb/current/sympy/venv/lib/python3.7/site-packages/tensorflow/python/framework/constant_op.py", line 208, in constant
    value, dtype=dtype, shape=shape, verify_shape=verify_shape))
  File "/Users/enojb/current/sympy/venv/lib/python3.7/site-packages/tensorflow/python/framework/tensor_util.py", line 542, in make_tensor_proto
    append_fn(tensor_proto, proto_values)
  File "tensorflow/python/framework/fast_tensor_util.pyx", line 134, in tensorflow.python.framework.fast_tensor_util.AppendBoolArrayToTensorProto
  File "/Users/enojb/current/sympy/venv/lib/python3.7/site-packages/numpy/lib/type_check.py", line 489, in asscalar
    return a.item()
UnboundLocalError: local variable 'a' referenced before assignment
______________________________________________________________________________________________________________________________________________________________
______________________________________________ sympy/utilities/tests/test_lambdify.py:test_tensorflow_piecewise ______________________________________________
Traceback (most recent call last):
  File "/Users/enojb/current/sympy/sympy/sympy/utilities/tests/test_lambdify.py", line 610, in test_tensorflow_piecewise
    assert func(a).eval(session=s, feed_dict={a: -1}) == -1
  File "<lambdifygenerated-78>", line 2, in _lambdifygenerated
    return (where((x == 0), 0, where((x < 0), -1, where((x > 0), 1, 0))))
  File "/Users/enojb/current/sympy/venv/lib/python3.7/site-packages/tensorflow/python/ops/array_ops.py", line 2624, in where
    return gen_math_ops.select(condition=condition, x=x, y=y, name=name)
  File "/Users/enojb/current/sympy/venv/lib/python3.7/site-packages/tensorflow/python/ops/gen_math_ops.py", line 6997, in select
    "Select", condition=condition, t=x, e=y, name=name)
  File "/Users/enojb/current/sympy/venv/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py", line 510, in _apply_op_helper
    preferred_dtype=default_dtype)
  File "/Users/enojb/current/sympy/venv/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 1146, in internal_convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/Users/enojb/current/sympy/venv/lib/python3.7/site-packages/tensorflow/python/framework/constant_op.py", line 229, in _constant_tensor_conversion_function
    return constant(v, dtype=dtype, name=name)
  File "/Users/enojb/current/sympy/venv/lib/python3.7/site-packages/tensorflow/python/framework/constant_op.py", line 208, in constant
    value, dtype=dtype, shape=shape, verify_shape=verify_shape))
  File "/Users/enojb/current/sympy/venv/lib/python3.7/site-packages/tensorflow/python/framework/tensor_util.py", line 542, in make_tensor_proto
    append_fn(tensor_proto, proto_values)
  File "tensorflow/python/framework/fast_tensor_util.pyx", line 134, in tensorflow.python.framework.fast_tensor_util.AppendBoolArrayToTensorProto
  File "/Users/enojb/current/sympy/venv/lib/python3.7/site-packages/numpy/lib/type_check.py", line 489, in asscalar
    return a.item()
UnboundLocalError: local variable 'a' referenced before assignment

@oscarbenjamin (Contributor)

For the slow tests you might need to simulate being on Travis. Some XFAIL slow tests take a really long time and are skipped on Travis. This one in particular I've never seen complete:

if ON_TRAVIS:

Maybe it's enough to set the environment variable ON_TRAVIS...
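For context, a sketch of the skip-on-Travis pattern being discussed (the test name and skip message here are illustrative; the skip helper is assumed to be the one in sympy.utilities.pytest, and the ON_TRAVIS line matches the getenv snippet quoted later in the thread):

import os
from sympy.utilities.pytest import skip

ON_TRAVIS = os.getenv('TRAVIS_BUILD_NUMBER', None)

def test_something_very_slow():
    if ON_TRAVIS:
        skip("Too slow for Travis.")
    # ... the actual (very slow) test body ...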

@asmeurer (Member)

I forgot about the slow tests in #15997. As long as the total time is well below 50 min., we can remove the split.

@oscargus (Contributor Author)

@asmeurer The total time for slow tests is 47 minutes at the moment, so maybe not "well below"... Maybe split both in two?

@oscarbenjamin Seems to make sense. My impression is that the assumptions and solvers take most of the time, so heavy first and last slices are to be expected. No idea where those failures come from. They seem tensorflow/numpy related?

@oscarbenjamin (Contributor)

Oh yeah, I've installed tensorflow in Python 3.7 but it's not officially supported in 3.7.

I just tried setting slow=True and running with the env var ON_TRAVIS=1, but it seems to be running test_bicycle from test_kane3.py, which should be skipped on Travis. So I'm not sure how to replicate the if ON_TRAVIS: skip effect.

Really I think we should get rid of the ON_TRAVIS skip effect. Those tests should just be marked as skip. If a test is too slow for Travis then it's too slow for any normal test run. If someone wants to run those tests they should import them and run them directly. As soon as a test becomes too slow for Travis it is effectively ignored, so it will slowly break and never get fixed, unless someone really cares and takes the time to run it regularly themselves (and puts in the effort to fix it, since the onus will be on them to fix it retrospectively).

@oscarbenjamin (Contributor)

When running the ODE tests with pytest I use -m 'not slow or not xfail', i.e. I skip all tests that are both slow and XFAIL. I think that should be the general rule: slow+xfail tests are often slow because they fail, and the benefit of running XFAIL tests does not justify routinely running slow XFAIL tests.
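As a concrete example (the test path is illustrative), the marker expression deselects exactly the tests carrying both markers, since not slow or not xfail is equivalent to not (slow and xfail):

$ pytest -m 'not slow or not xfail' sympy/solvers/tests/test_ode.py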

@asmeurer (Member)

A tensorflow release is coming for Python 3.7. Once it is out, the optional dependency tests should automatically move over to it, as they don't pin the Python version (see #15867).

> @asmeurer The total time for slow tests is 47 minutes at the moment, so maybe not "well below"... Maybe split both in two?

We can try putting it in one and if there are too many timeouts we can split it. Ideally the tests should be getting faster, as we improve the performance of SymPy.

@oscarbenjamin (Contributor)

I think you can simulate being on Travis with:

$ TRAVIS_BUILD_NUMBER=1 python runsplits.py 

From here:

ON_TRAVIS = os.getenv('TRAVIS_BUILD_NUMBER', None)

Note that I've added slow=True inside runsplits.py:

$ cat runsplits.py 
from time import time
import sympy

delays, num_splits = [], 30
for i in range(1, num_splits + 1):
    tic = time()
    sympy.test(split='{}/{}'.format(i, num_splits), time_balance=False, slow=True)
    delays.append(time() - tic)
tot = sum(delays)
print([round(x / tot, 4) for x in delays])

We should add instructions for the slow tests to the comment in runtests.py.
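A sketch of what such an instruction could look like (it simply combines the two invocations already shown in this thread):

# To time the slow-test splits, run the same loop as above with slow=True
# (as in runsplits.py) and simulate Travis so that the ON_TRAVIS-skipped
# tests are excluded:
#
#     $ TRAVIS_BUILD_NUMBER=1 python runsplits.py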

@oscarbenjamin (Contributor)

This is the output I get for the slow tests:

[0.156, 0.0006, 0.0006, 0.0007, 0.0008, 0.0014, 0.0304, 0.032, 0.0744, 0.0713, 0.0006, 0.0006, 0.0158, 0.003, 0.0078, 0.0195, 0.0013, 0.0122, 0.0008, 0.0006, 0.0078, 0.0008, 0.0007, 0.0267, 0.0007, 0.138, 0.2453, 0.0006, 0.001, 0.1479]

@asmeurer (Member)

I've removed the slow tests split in #15997. Perhaps we should merge that PR, then merge this with master, and see how long the slow tests build takes. If it's too long in this PR we can add it back.

@oscarbenjamin (Contributor)

I can see from here that the slow tests in this branch take 47 or 49 minutes for 3.7 and 2.7 respectively. That's close to the 50-minute cutoff. Do you think that the total time will be substantially less than with the 3 splits?

@oscargus (Contributor Author)

I will add the split times from @oscarbenjamin so that the probability of a somewhat even split increases.

I agree that a two-way split would probably improve things for the slow tests, especially since more tests are marked slow here. Without the split, only a few more slow tests can be added after that (it is not clear how much the common start-up time accounts for, but it is hard to see that it would be more than a few minutes).

@oscargus (Contributor Author)

I've added the splits from @oscarbenjamin. Thanks! It will be interesting to see the outcome.

@oscargus (Contributor Author)

Now: 8/11/8/10 minutes for non-slow and 18/15/14 minutes for slow, so clearly better!

@oscarbenjamin (Contributor)

Er maybe it would have been better if I had run the split times using this PR instead of master...

I don't have time to redo them for a few days unfortunately.

@oscarbenjamin (Contributor)

New times with this PR:

Non-slow:

$ python runsplits.py 
...
[0.0801, 0.0099, 0.0429, 0.0103, 0.0122, 0.0055, 0.0533, 0.0191, 0.0977, 0.0878, 0.0026, 0.0028, 0.0147, 0.0118, 0.0358, 0.0063, 0.0026, 0.0351, 0.0084, 0.0027, 0.0158, 0.0156, 0.0024, 0.0416, 0.0566, 0.0425, 0.2123, 0.0042, 0.0099, 0.0576]

@oscarbenjamin (Contributor)

Slow tests (this PR):

$ TRAVIS_BUILD_NUMBER=1 python runsplits.py
[0.1525, 0.0342, 0.0092, 0.0004, 0.0005, 0.0005, 0.0379, 0.0353, 0.0637, 0.0801, 0.0005, 0.0004, 0.0133, 0.0021, 0.0098, 0.0108, 0.0005, 0.0076, 0.0005, 0.0004, 0.0056, 0.0093, 0.0005, 0.0264, 0.0051, 0.0956, 0.2983, 0.0005, 0.0005, 0.0981]

@oscarbenjamin (Contributor)

I've just pushed those new split times here

@oscargus (Contributor Author)

Great! Looks like it had the desired effect!

@oscarbenjamin (Contributor)

I think we should bring back some split for the slow tests, or at least move the slow tests earlier in the job order. The slow tests take 45 minutes (close to the 50-minute cutoff) and are the only thing running at the end, so they delay the whole build from finishing.

@oscargus (Contributor Author)

I agree with @oscarbenjamin. Just a few more tests on the slower side may bring it up to 50 minutes. While speeding things up is naturally a good idea, there will also be a need to test quite complicated computations. (Right now some tests really are too slow for Travis; enabling any of those, which may be about as slow as the slowest tests currently running, could push the build over the limit. I haven't actually tried any, but it would be better to enable a few of them if possible and do a two-way split.)

@oscargus (Contributor Author)

There are 13 tests which are currently disabled on Travis (found by grepping for if ON_TRAVIS). The slowest test actually executed on Travis took about three minutes. I'm not sure whether all 13 of these take longer than that.

@oscarbenjamin (Contributor)

Looking at other PRs, it looks as if the slow tests take 28 minutes after #15997. They take 47 minutes in this PR because more tests are labelled as slow here. Maybe the split for slow tests should be added back here before this is merged.

@oscargus (Contributor Author)

Yes, that makes sense. I will try to figure out how that works; since there are several recent PRs to learn from, it should be feasible.

@oscargus (Contributor Author)

I think that the slow tests are now split in two.
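Locally the two halves can be reproduced with the same split parameter that runsplits.py uses (a sketch; on Travis the corresponding values would go into the build matrix):

import sympy

sympy.test(split='1/2', slow=True)  # first half of the slow tests
sympy.test(split='2/2', slow=True)  # second half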

@@ -1,7 +1,7 @@
-from sympy import Symbol, symbols, S, simplify
+from sympy import Symbol, symbols, S, simplify, Interval
Contributor

Maybe I'm missing something. Is this import of Interval needed?

Contributor

Oh, I see why this is needed now. The max_shear_force test below wasn't being run before, but now that you've renamed it, it runs and needs this import.

Contributor Author

Exactly! There are some similar cases where XFAILed tests were missing imports. Obviously they failed, but now they fail for the right reasons.

@oscarbenjamin (Contributor)

> I think that the slow tests are now split in two.

Yeah, it looks like they are.

I'll wait to see what the timings on Travis look like, but otherwise I think this is good to merge if no one objects.

@@ -20,7 +20,7 @@
 continued_fraction_reduce as cf_r, FiniteSet, elliptic_e, elliptic_f,
 powsimp, hessian, wronskian, fibonacci, sign, Lambda, Piecewise, Subs,
 residue, Derivative, logcombine, Symbol, Intersection, Union,
-EmptySet, Interval, Integral, idiff)
+EmptySet, Interval, Integral, idiff, ImageSet, acos)
Contributor Author

Here are two of those.

@oscarbenjamin (Contributor)

All good. The 2 slow splits run in about 23 minutes, which is better than master.

Thanks for this!

I'm going to merge this and see if I can get #16023 working on top.

@oscarbenjamin merged commit 33525fa into sympy:master on Feb 24, 2019