Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace O(N) lookup with O(1) for task outputs (7.8.x) #3259

Merged
merged 4 commits into from Aug 5, 2019

Conversation

@trwhitcomb
Copy link
Collaborator

commented Jul 30, 2019

This speeds up suite validation significantly for tasks with
large numbers of outputs. The list used required an O(N)
check for membership which, when combined with being run for
each output resulted in O(N**2) complexity.

Using the builtin set should bypass any deprecated libraries in
Python 2.

These changes closes #3255

Requirements check-list

  • I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • Contains logically grouped changes (else tidy your branch by rebase).
  • Does not contain off-topic changes (use other PRs for other changes).
  • These changes are already covered by existing tests.
  • No change log entry required (e.g. change is small or internal only).
  • No documentation update required for this change.
Tim Whitcomb
Replace O(N) lookup with O(1) for task outputs.
This speeds up suite validation significantly for tasks with
large numbers of outputs.  The list used required an O(N)
check for membership which, when combined with being run for
each output resulted in O(N**2) complexity.

Using the builtin set should bypass any deprecated libraries in
Python 2.

Closes #3255
@kinow

This comment has been minimized.

Copy link
Member

commented Jul 31, 2019

Hmm, kicked the Travis builds twice, but build #1 and #2 refuse to pass. Won't kick them any more, in case it's a real issue introduced by this change.

If the builds have not been restarted, the logs should be here and here 👍

@hjoliver

This comment has been minimized.

Copy link
Contributor

commented Jul 31, 2019

From a quick look earlier, the failures aren't obviously related ... but need to investigate. I'll try to have a look after finishing current project management duties (an hour or so...)

@trwhitcomb

This comment has been minimized.

Copy link
Collaborator Author

commented Jul 31, 2019

@kinow I noticed that too - any hints on where to find the log of how it failed? All I saw in the links were the logs that said that a failure had occurred. Not sure how this projects onto failures in cylc review

@kinow

This comment has been minimized.

Copy link
Member

commented Jul 31, 2019

@kinow I noticed that too - any hints on where to find the log of how it failed? All I saw in the links were the logs that said that a failure had occurred. Not sure how this projects onto failures in cylc review

The recent Travis build should contain the tests that failed, and their output - though I normally struggle to find what went wrong/where. What I normally do, when a functional test fails for my change, is try to locate the test that is failing in the Travis log, then run locally with cylc test-battery $test.

If the test passes locally, then it probably means it is a flaky test that is failing for some reason unrelated to your change. In this case normally we kick Travis until is passes the build, or the reviewers simply ignore the Travis output.

@hjoliver

This comment has been minimized.

Copy link
Contributor

commented Jul 31, 2019

At the bottom of the Travis CI log for the chunk, click on "after_script" to expand the actual test errors (which are sometimes informative ... but sometimes not, with functional tests):

Screenshot from 2019-07-31 17-12-19

... expanded (just showing the first few lines here though):

Screenshot from 2019-07-31 17-12-24

@trwhitcomb

This comment has been minimized.

Copy link
Collaborator Author

commented Jul 31, 2019

@kinow Thanks - I ran a spot check of some of the tests locally (database/01-broadcast.t, database/00-simple.t) and they're passing here. Also it looks like the Travis build may have succeeded 3 or 4 retries later, but the PR hasn't updated.

@oliver-sanders oliver-sanders self-requested a review Aug 1, 2019

@oliver-sanders oliver-sanders added this to the next-release milestone Aug 1, 2019

@oliver-sanders

This comment has been minimized.

Copy link
Member

commented Aug 1, 2019

Profiling results positive, out of interest @trwhitcomb how many outputs do you have on your task(s)? Must be a pretty big number to see performance degrade!

cpu-time
memory

@oliver-sanders

This comment has been minimized.

Copy link
Member

commented Aug 1, 2019

Profiling experiment added in trwhitcomb#2

@oliver-sanders

This comment has been minimized.

Copy link
Member

commented Aug 1, 2019

Once approved could you re-raise against master (minus profiling experiment).

@trwhitcomb

This comment has been minimized.

Copy link
Collaborator Author

commented Aug 1, 2019

@oliver-sanders - thanks! Those profiling numbers are consistent with what we hoped to see. We have a task that performs an output for each hour the forecast model proceeds through - this didn't cause problems in our NWP suites but ground validation (and therefore the suite server) to a halt once we started running 45 day or 90 day forecasts.

when you say "re-raise against master", are you suggesting to make this change in master (i.e. Cylc 8) as well? If so, that'll take a little longer as we'll need to get up to speed with how things work in the new version. A quick glance at master shows that at least some of this won't be a drop-in replacement. I'm happy to do it, but it'll be a while: if it's desired sooner and someone else is available I have no issues with them taking it.

Merge pull request #2 from oliver-sanders/outputs-optimisation
profiling: add outputs scaling experiment
@oliver-sanders

This comment has been minimized.

Copy link
Member

commented Aug 1, 2019

when you say "re-raise against master", are you suggesting to make this change in master (i.e. Cylc 8)

Yes. I think it should be straight forward, but if you prefer we could port this for you.

@trwhitcomb

This comment has been minimized.

Copy link
Collaborator Author

commented Aug 1, 2019

Thanks - I agree that it should be pretty straightforward once I figure out how the code around the fix is working differently. If it drags on too long, feel free to port but this is a good excuse to look some at the cylc 8 internals.

@matthewrmshin

This comment has been minimized.

Copy link
Member

commented Aug 1, 2019

(I'll take a look at the failed tests tomorrow in my environment as well.)

@matthewrmshin

This comment has been minimized.

Copy link
Member

commented Aug 2, 2019

Test failures unrelated.

I have just tried cherry-picking the 2 changes in this branch to port to Cylc-8 master. It is a straightforward merge with no conflict. Resulting branch:
https://github.com/matthewrmshin/cylc-flow/tree/validate-set

Build OK as well:
https://travis-ci.org/matthewrmshin/cylc-flow/builds/566883705

Do you want me to raise the PR against master?

@hjoliver

This comment has been minimized.

Copy link
Contributor

commented Aug 2, 2019

@trwhitcomb - I would suggest that a change log entry is required here (contrary to your PR requirements check-list above) because this fixes a performance issue that may affect some users.

@matthewrmshin

This comment has been minimized.

Copy link
Member

commented Aug 2, 2019

@trwhitcomb - I would suggest that a change log entry is required here (contrary to your PR requirements check-list above) because this fixes a performance issue that may affect some users.

From 🐌 to 🚀!

@matthewrmshin

This comment has been minimized.

Copy link
Member

commented Aug 2, 2019

(Test failure here related to #3244 - Travis CI hostname issue.)

Tim Whitcomb
@trwhitcomb

This comment has been minimized.

Copy link
Collaborator Author

commented Aug 2, 2019

@hjoliver Done.

@matthewrmshin Thanks for checking that - I must have been reading from somewhere else! Will open the v8 PR shortly.

@trwhitcomb trwhitcomb referenced this pull request Aug 2, 2019
6 of 6 tasks complete

@trwhitcomb trwhitcomb changed the title Replace O(N) lookup with O(1) for task outputs. Replace O(N) lookup with O(1) for task outputs (7.8.x) Aug 2, 2019

@hjoliver
Copy link
Contributor

left a comment

Approving this one (for Cylc 7)

lib/cylc/config.py Show resolved Hide resolved
@hjoliver

This comment has been minimized.

Copy link
Contributor

commented Aug 5, 2019

Merging ... thanks @trwhitcomb 👍

@hjoliver hjoliver merged commit 523ce97 into cylc:7.8.x Aug 5, 2019

1 of 2 checks passed

continuous-integration/travis-ci/pr The Travis CI build failed
Details
Codacy/PR Quality Review Up to standards. A positive pull request.
Details
@hjoliver hjoliver referenced this pull request Aug 5, 2019
6 of 6 tasks complete
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.