Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression in primary task ordering #20062

Closed
DanielThomas opened this issue Mar 2, 2022 · 10 comments
Closed

Regression in primary task ordering #20062

DanielThomas opened this issue Mar 2, 2022 · 10 comments
Assignees
Labels
a:bug @core Issue owned by GBT Core in:scheduler execution plan, task graph, work lease, project lock
Milestone

Comments

@DanielThomas
Copy link
Contributor

DanielThomas commented Mar 2, 2022

We have projects that call a number of tasks on the command line, with the previous and documented contract being that those tasks would be executed in the order specified:

You can also specify multiple tasks. For example, the following will execute the test and deploy tasks in the order that they are listed on the command-line and will also execute the dependencies for each task.

https://docs.gradle.org/current/userguide/command_line_interface.html#executing_multiple_tasks

However in 7.4 the task order is no longer ensured, it appears we're seeing the later specified tasks execute first.

Expected Behavior

Task order should reflect the command line task request order (allowing for individual task dependencies of course).

Current Behavior

The order appears to be ignored, or at least, that order has changed. We assume this is due to the fixes for clean task ordering:

#18904

Context

Projects that previously implicitly on this task ordering (no declared dependencies or output/inputs tracked) are seeing failures due to missing inputs. This can also cause an artifact to be released without tests being successful.

Steps to Reproduce

I can't come up with a reduced scenario for this so far, as this contract does seem to hold for simple task graphs. Believe it occurs when clean appears in the task requests.

Your Environment

Gradle 7.4

@DanielThomas
Copy link
Contributor Author

For example, given these task requests:

clean build generate publishManPage

We see this at execution time:

Selected primary task 'clean' from project :
Selected primary task 'build' from project :
Selected primary task 'generate' from project :
Selected primary task 'publishManPage' from project :
Tasks to be executed: [task ':algo-commons-almanac-generator:clean', task ':algo-commons-almanac-generator:compileJava', task ':algo-commons-almanac-generator:compileScala', task ':algo-commons-almanac-generator:processResources', task ':algo-commons-almanac-generator:classes', task ':algo-commons-almanac-generator:createPropertiesFileForJar', task ':algo-commons-almanac-generator:writeManifestProperties', task ':algo-commons-almanac-generator:jar', task ':algo-commons-almanac-generator:sourceJar', task ':algo-commons-almanac-generator:assemble', task ':algo-commons-almanac-generator:buildInvocationInfo', task ':algo-commons-almanac-generator:compileTestJava', task ':algo-commons-almanac-generator:compileTestScala', task ':algo-commons-almanac-generator:processTestResources', task ':algo-commons-almanac-generator:testClasses', task ':algo-commons-almanac-generator:test', task ':algo-commons-almanac-generator:check', task ':algo-commons-almanac-generator:build', task ':writeVersionFiles', task ':algo-commons-almanac-generator:writeVersionFiles', task ':generate', task ':publishManPage']
Tasks that were excluded: []
:algo-commons-almanac-generator:clean (Thread[Execution worker for ':' Thread 4,5,main]) started.

> Task :algo-commons-almanac-generator:clean UP-TO-DATE
Caching disabled for task ':algo-commons-almanac-generator:clean' because:
  Deletion cannot be cached
Task ':algo-commons-almanac-generator:clean' is not up-to-date because:
  Task has not declared any outputs despite executing actions.
:algo-commons-almanac-generator:clean (Thread[Execution worker for ':' Thread 4,5,main]) completed. Took 0.057 secs.
:publishManPage (Thread[Execution worker for ':' Thread 4,5,main]) started.

> Task :publishManPage FAILED

@DanielThomas DanielThomas changed the title Regression in task graph ordering Regression in primary task ordering Mar 2, 2022
@big-guy big-guy added this to the 7.4.1 milestone Mar 2, 2022
@jbartok jbartok added in:invoking-gradle Running Executing in:scheduler execution plan, task graph, work lease, project lock and removed in:invoking-gradle Running Executing to-triage labels Mar 2, 2022
@ghale
Copy link
Member

ghale commented Mar 2, 2022

This smells like a missing task dependency. Why does publishManPage fail? Does it require something from build/generate? And if so, does it have a task dependency (explicit or implicit via some wiring of inputs)? And does publishManPage declare any outputs? Is there a build scan you can share with me (maybe DM via slack)?

In general, the task ordering on the command line is (and has always been) super weak and Gradle has always run things in the fastest order possible (regardless of command line order). And, it would run tasks in command line order if (and only if) other stronger relationships don't exist. In other words, if you specify "a b c" on the command line and b is delayed by task dependencies, but c is ready, we would always go ahead and run c before b. We've essentially added a new kind of relationship that says producers/destroyers on the command line must execute in the order that they appear. I suspect this is somehow delaying build/generate but not publishManPage.

@ljacomet ljacomet modified the milestones: 7.4.1, 7.5 RC1 Mar 7, 2022
@big-guy big-guy added the @core Issue owned by GBT Core label Mar 15, 2022
@big-guy
Copy link
Member

big-guy commented Mar 25, 2022

@ghale do you think fixing #20272 will affect this issue too?

@ghale
Copy link
Member

ghale commented Mar 25, 2022

On the surface, I don't think so, but I don't feel like I understand the problem well enough to say so with confidence.

I suspect this is something where publishManPage is not being classified as a destroyer or producer, so Gradle thinks it can be executed as soon as it's ready (i.e. command line order doesn't have any implied importance to it). On the other hand, the tasks from build are producers and so we're now delaying those until after some of the clean tasks. If publishManPage in fact actually depends on something from the build producers, then it might fail because it now ends up running earlier. That's pure speculation, of course.

@DanielThomas Can you tell us more about the relationships between tasks here? For instance, does publishManPage have any ordering or dependency relationships (real or implicit)? If we've broken something here, we want to fix it.

@DPUkyle
Copy link
Member

DPUkyle commented Apr 5, 2022

@DanielThomas do you mind retrying with v7.4.2? @ghale implemented a fix for a similar issue.

@DanielThomas
Copy link
Contributor Author

I think we can call this a missing task dependency. The primary task requests were the thing providing the undeclared order before, but if I follow correctly, that particular change in behaviour was intentional?

@DPUkyle
Copy link
Member

DPUkyle commented Apr 18, 2022

Hi Danny. Yes, it sounds like the documentation you cited was not 100% accurate to being with, and the fix for #2488 corrected the behavior you were implicitly relying upon. From what I can see, it seems our improvements to the CLI ordering behavior teased out a missing or undeclared task input for publishManPage.

I've created #20486 to clarify this change in the documentation.

@big-guy big-guy modified the milestones: 7.5 RC1, 7.6 RC1 Apr 28, 2022
@DPUkyle
Copy link
Member

DPUkyle commented May 16, 2022

Closing per slack thread.

@DPUkyle DPUkyle closed this as completed May 16, 2022
@hpuac
Copy link

hpuac commented May 17, 2022

Hey @DPUkyle, this is a private slack channel, right?
Could you maybe summarize in short what was discussed there to have it documented for the public as well?

@DPUkyle
Copy link
Member

DPUkyle commented Jun 2, 2022

Hey @DPUkyle, this is a private slack channel, right? Could you maybe summarize in short what was discussed there to have it documented for the public as well?

Indeed - apologies, as I linked for my own record-keeping purposes.

The resolution is that we've made subtle changes to task ordering and scheduling related to our work on the configuration cache feature and improved parallelism.

In this case, the existing documentation was incorrect in that tasks explicitly enumerated on the CLI imply an exact order of execution. This was technically never true but the docs suggested it was.

As for consumers who were affected by this change in behavior (the publishManPage task referenced above), this behavior change exposed some missing task dependencies. The workaround is to add explicit task dependencies. Ideally these would be via chaining providers (wiring task outputs to task inputs), but could also be simple dependsOn relationships too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
a:bug @core Issue owned by GBT Core in:scheduler execution plan, task graph, work lease, project lock
Projects
None yet
Development

No branches or pull requests

7 participants