Skip to content
This repository has been archived by the owner on Nov 1, 2022. It is now read-only.

Prune unnecessary tasks from ci tasks #7617

Merged
merged 1 commit into from Jul 28, 2020

Conversation

bhearsum
Copy link
Contributor

@bhearsum bhearsum commented Jul 3, 2020

This is a proof of concept for #7598. There's a number of things that need be improved & addressed (see below), but with commands like:
taskgraph decision --pushlog-id=0 --pushdate=0 --project=android-components --message= --owner=skaspari+mozlando@mozilla.com --level=3 --base-repository=https://github.com/mozilla-mobile/android-components --head-repository=https://github.com/mozilla-mobile/android-components --head-ref=refs/heads/master --head-rev=6131efe915b92ad10eec23620a71dc54510bd3e9 --head-tag= --repository-type=git --tasks-for=github-push

...it produces a significantly smaller set of tasks, like:

[u'build-concept-storage',
 u'build-samples-browser-beta',
 u'build-samples-browser-nightly',
 u'build-samples-browser-release',
 u'build-samples-browser-system',
 u'build-service-glean',
 u'build-support-base',
 u'build-support-ktx',
 u'build-support-sync-telemetry',
 u'build-support-utils',
 u'complete-push',
 u'complete-push-1',
 u'complete-push-2',
 u'lint-compare-locales',
 u'lint-detektAndKtlint',
 u'test-android-feature-containers',
 u'test-android-feature-pwa',
 u'test-android-feature-share',
 u'test-android-feature-sitepermissions',
 u'test-android-feature-top-sites',
 u'test-ui-browser',
 u'test-ui-glean',
 u'test-unit-browser-engine-gecko-nightly',
 u'toolchain-linux64-android-gradle-dependencies',
 u'toolchain-linux64-android-sdk-linux-repack']

Known TODOs include:

  • Use a Github API token to avoid getting rate limited - This probably isn't necessary in the end, because we can do 60 requests/hour/IP. Each decision task will make 2 requests, and most decision tasks won't run on the same workers, so the IPs will differ.
  • Figure out how to find all files changed for PRs (something like https://github.com/mozilla-mobile/fenix/blob/master/taskcluster/fenix_taskgraph/parameters.py probably)
  • Ensure we're using the right API endpoints for commits/pushes
  • Figure out what actual entries we need in FILE_PATTERNS_TO_AFFECTED_COMPONENTS
  • See if we can get dependencies dumped from gradle in JSON or another more parseable format
  • Determine whether or not the build-samples-browser tasks should be considered here, too.

@bhearsum bhearsum force-pushed the prune-unnecessary-builds branch 4 times, most recently from 5c8021a to 0b66328 Compare July 8, 2020 14:42
@bhearsum
Copy link
Contributor Author

bhearsum commented Jul 8, 2020

I think this is ready for a round of feedback/review. @JohanLorenzo - you are probably the best reviewer?

I'll note that I haven't gone to the trouble of setting up an entire other repo to push github-push behaviour in context, but I've done lots of local testing by hand with taskgraph for that and github-pull-request. This PR demonstrates the ability to filter away all of the build tasks, and my local tests have testing against other PRs with only some changed components.

I originally implemented this as a new target tasks filter for CI, but that didn't work out because the complete kind depends on all of the builds, and would re-add them to the final task graph even if they were filtered away. Adjusting those dependencies is really difficult, because they're set at task generation time. The easiest way to get rid of them was to not generate them at all. @tomprince suggested the alternative of something like https://hg.mozilla.org/ci/taskgraph/file/tip/src/taskgraph/morph.py#l187, but I think for this case it's best to keep the solution in repo. I could be convinced otherwise, though. The main downside to not generating the tasks at all is that they're totally unrunnable, but I don't see any reason we'd want to run them after the fact, so I think that's an acceptable trade off.

As noted above, there's still a few open questions. @pocmo - do you have any thoughts or insight to the following?

  • Figure out what actual entries we need in FILE_PATTERNS_TO_AFFECTED_COMPONENTS. That is, what files that are not part of a component's source tree imply that we need to do builds of one or more components?
  • See if we can get dependencies dumped from gradle in JSON or another more parseable format (not a blocker, but it would be nice to have).
  • Determine whether or not the build-samples-browser tasks should be subject to conditional building.

@bhearsum
Copy link
Contributor Author

bhearsum commented Jul 8, 2020

I'm also not sure why bors is not reporting in -- I'm wondering if it's being held up on the fact that not all of the tasks have run? (I don't know where to get more information on it.)

@JohanLorenzo JohanLorenzo self-requested a review July 15, 2020 13:23
Copy link
Contributor

@JohanLorenzo JohanLorenzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting this patch up! I'm eager to see smaller and pin-pointed graphs!

I understand the approaches you chose and I don't think I would have done better 🙂 I like @tomprince's proposal regarding morphs. I wouldn't have thought of it, but the more I reflect upon it, the more I like it. That said, I understand this is a major change in your PR. If you'd like to land it with the logic in the loader and then address the morph change in a followup, that's fine by me.

I'd like to hear @pocmo about the gradle part of it. I have the feeling the solution exists but I can't get a working piece of code. I'm happy to r+ this patch if there's nothing we can improve there.

taskcluster/ac_taskgraph/target_tasks.py Outdated Show resolved Hide resolved
taskcluster/ac_taskgraph/loader/build_config.py Outdated Show resolved Hide resolved
taskcluster/ac_taskgraph/loader/build_config.py Outdated Show resolved Hide resolved
taskcluster/ac_taskgraph/loader/build_config.py Outdated Show resolved Hide resolved
# TODO: This currently includes some tasks that have been removed from
# target tasks. How do we avoid including these here?
# Best ideas are:
# 1) Move this entire task to a morph (like https://hg.mozilla.org/ci/taskgraph/file/tip/src/taskgraph/morph.py#l187)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kinda like this idea, actually 🙂

The complete task as it is now, has this known bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1642977. When I first implemented it, I felt like I was fighting against the Taskcluster paradigm. Moving it to morph would actually solve bug 1642977, if I understand correctly. Moreover, it will be easier to maintain the dependency list of the complete task since we want to wait on all tasks. As of now, we have to manually update kind-dependencies.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds like good follow-up fodder :). Do you want a new bug or issue for it?

@JohanLorenzo
Copy link
Contributor

I'm also not sure why bors is not reporting in -- I'm wondering if it's being held up on the fact that not all of the tasks have run? (I don't know where to get more information on it.)

Oh, that's expected while somewhat confusing, I concede. bors only runs when someone requests it. You get a yellow badge because it's a required Github check. This means: even though bors did nothing, Github waits for it. If I remember correctly, we made this check required in order to prevent manual landings. Bors takes care of everything.

@JohanLorenzo JohanLorenzo requested a review from pocmo July 20, 2020 08:17
Copy link
Contributor

@pocmo pocmo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify and make sure I understand it correctly...

This patch will:

  • Schedule only the tasks for components that should be affected by a push

This patch will NOT:

  • Share build artifacts between tasks (when building component C depending on A and B then the task for C will also build A and B).
  • Rebuild the dependency tree in taskcluster, e.g. if A depends on B and B depends on C then we won't build C first, then B then A, but instead build A, B, C in parallel (which makes sense without sharing build artifacts)

@bhearsum
Copy link
Contributor Author

Just to clarify and make sure I understand it correctly...

This patch will:

* Schedule only the tasks for components that should be affected by a push

Correct.

This patch will NOT:

* Share build artifacts between tasks (when building component C depending on A and B then the task for C will also build A and B).

Correct - the internals of the build tasks and their task dependencies are totally unchanged. This patch only prunes out tasks from the existing list of tasks.

* Rebuild the dependency tree in taskcluster, e.g. if A depends on B and B depends on C then we won't build C first, then B then A, but instead build A, B, C in parallel (which makes sense without sharing build artifacts)

Right. We're extracting the dependency tree from gradle in the decision task. I have no desire to maintain a parallel dependency tree :).

I'd love to see the latter two things happen at some point, but I think it's much more work for much less of a win - so I have no intention to look at them in the near future.

@bhearsum
Copy link
Contributor Author

I think this is ready to go now. I fixed all of the review comments, and also a bug I found that caused forced rebuilds of specific components (based on file patterns) to not consider dependencies. We don't have any of those right now (all of the forced ones are ALL_COMPONENTS), but at least it won't bite us in the future.

Copy link
Contributor

@JohanLorenzo JohanLorenzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚢 it!

Thanks again for making this change. I know it's not going to have the impact you initially forecast, but I feel much about not spinning up so many machines for a change that impacts a handful components.

taskcluster/ac_taskgraph/target_tasks.py Outdated Show resolved Hide resolved
@pocmo pocmo self-assigned this Jul 27, 2020
@JohanLorenzo
Copy link
Contributor

bors r=pocmo,JohanLorenzo

@bors
Copy link

bors bot commented Jul 28, 2020

Build succeeded:

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants