Reduce fragility of GCDiagnosis tests #1668
Conversation

I'm starting to think that many of our recent intermittent test failures may also be due to issues like this. Generally our test suite is not robust to irregular long pauses such as those caused by garbage collection. Do you have any thoughts on whether this is likely a problem and, if so, how we might address it?

By "long pauses", do you mean pauses long enough that GC collections might break timing-dependent tests?

It's a guess, but yes.
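One common way to make a timing-dependent test robust to this is to disable the cyclic collector for the duration of the measured region, so a collection pause cannot land in the middle of it. A minimal sketch (the test name and the 1-second bound are hypothetical, not from this project's suite):

```python
import gc
import time


def test_timing_without_gc_pauses():
    # Disable the cyclic GC while timing, so a collection pause
    # cannot inflate the measured duration.
    gc.disable()
    try:
        start = time.monotonic()
        # ... exercise the timing-sensitive code path here ...
        elapsed = time.monotonic() - start
        assert elapsed < 1.0  # hypothetical timing bound
    finally:
        gc.enable()  # always restore the collector for later tests
```

This only shields the test from pauses caused by the cyclic collector in the same process; it does nothing about a generally slow CI machine.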

Apparently a full …

I would not be surprised if the travis-ci machines were 10x slower at times.

Yes, that's certainly possible.

I may be mistaken, but I think this boils down to the fact that our test suite progressively leaks memory (i.e. Python objects)... I'm not sure why that is. My guess is that some objects (such as Scheduler, etc.) don't get properly terminated and are kept alive by a dangling thread. I'm unaware of other potential sources of leaks in our codebase.
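One way to check a suspicion like this is to count live instances of a class after the objects should have died, e.g. with `gc.get_objects()` plus a `weakref`. A sketch, using a hypothetical stand-in class rather than the project's real Scheduler:

```python
import gc
import weakref


class Scheduler:
    """Hypothetical stand-in for a long-lived object that might leak."""


def count_live(cls):
    # Count instances of cls still reachable by the cyclic GC.
    gc.collect()
    return sum(1 for obj in gc.get_objects() if type(obj) is cls)


s = Scheduler()
ref = weakref.ref(s)
assert count_live(Scheduler) == 1

del s
gc.collect()
# If a dangling thread (or any other hidden reference) still held the
# object, ref() would not be None here.
assert ref() is None
assert count_live(Scheduler) == 0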

Some stats at the end of the test suite: …
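Stats of this kind can be gathered by tallying live, GC-tracked objects by type at the end of the run; a progressive leak usually shows up as an outsized count for one type. A minimal sketch (the helper name is hypothetical):

```python
import gc
from collections import Counter


def top_object_types(n=5):
    # Tally live, GC-tracked objects by type name: the kind of
    # end-of-suite stat that can reveal a progressive leak.
    gc.collect()
    counts = Counter(type(obj).__name__ for obj in gc.get_objects())
    return counts.most_common(n)


for name, count in top_object_types():
    print(f"{name}: {count}")
```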

FWIW I'd be fine removing the threaded map/scatter/gather code.

Is it unimportant functionally?

Not really. It used to be interesting, but I think other solutions now exist that are more attractive. I don't think I've seen anyone use it recently.

I think the iterable form is fine (though …

I'd favour removing it FWIW.

In the meantime I'm also merging this PR.

Removing is fine with me.

Do you have thoughts on the other lingering issues?

There are still tests leaking processes: …

This tries to fix the sporadic failure reported in #966 (comment), where it seems a leftover from a previous test gets collected and triggers a large reduction in heap size.
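A common way to guard against that failure mode is to force a full collection before a heap-size-sensitive test begins, so leftovers from earlier tests cannot be reclaimed mid-test and shrink the heap underneath the check. A sketch under that assumption (the `Node` class and helper name are hypothetical, not this PR's actual fix):

```python
import gc
import weakref


class Node:
    """Hypothetical stand-in for a leftover object from a previous test."""
    def __init__(self):
        self.self_ref = self  # reference cycle: only the cyclic GC frees it


def clean_heap_baseline():
    # Run full collections up front so garbage left behind by earlier
    # tests cannot be collected mid-test. Repeat a few times in case
    # finalizers create new garbage on the first pass.
    for _ in range(3):
        gc.collect()


n = Node()
ref = weakref.ref(n)
del n  # unreachable, but still on the heap until the GC runs

clean_heap_baseline()
assert ref() is None  # the leftover was reclaimed before the test body
```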