Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use worker threads for module extension eval #22729

Closed
Wyverald opened this issue Jun 12, 2024 · 4 comments
Closed

Use worker threads for module extension eval #22729

Wyverald opened this issue Jun 12, 2024 · 4 comments
Labels
area-Bzlmod Bzlmod-specific PRs, issues, and feature requests P1 I'll work on this now. (Assignee required) team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. type: feature request

Comments

@Wyverald
Copy link
Member

As --experimental_worker_for_repo_fetching gets more testing in the wild, we can start considering using worker threads for module extension eval too. That would open up the possibility of completely getting rid of restarts. This would potentially mitigate issues such as #22710.

We'd ideally address some straggler issues like #22680 first.

@Wyverald Wyverald added type: feature request P2 We'll consider working on this in future. (Assignee optional) team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. labels Jun 12, 2024
@DavidZbarsky-at
Copy link

@Wyverald does it make sense to fix this as part of the bzlmod push for Bazel 8? This is one of the remaining areas where bzlmod is a regression over WORKSPACE (for example rules_js needs to convert the pnpm lockfile from yaml to json and then parse it to discover all the patches, and then each patch it touches triggers the restart behavior, so it's quite slow)

@Wyverald
Copy link
Member Author

Makes sense to me. I'll raise the priority of this.

@Wyverald Wyverald added P1 I'll work on this now. (Assignee required) area-Bzlmod Bzlmod-specific PRs, issues, and feature requests and removed P2 We'll consider working on this in future. (Assignee optional) labels Aug 28, 2024
copybara-service bot pushed a commit that referenced this issue Aug 29, 2024
SEEF has a few inner classes that can easily be broken out. This CL does just that, without changing any behavior. This should make the size of SEEF.java more manageable while we try to add worker threads to extension eval.

Also took this chance to migrate some AutoValues to records.

Work towards #22729

PiperOrigin-RevId: 668808611
Change-Id: I65a0cb7024ca49e2f4876478fbae16ebf5c325e6
copybara-service bot pushed a commit that referenced this issue Aug 29, 2024
Today, the code we have for using worker threads for repo fetching is tightly coupled with repo fetching. This doesn't need to be the case. This CL refactors some code in repo fetching so that eventually we can reuse the worker thread stuff for other use cases (for example, module extension eval).

Most notably, `RepoFetchingSkyKeyComputeState` has a `recordedInputValues` field. This is conceptually an "output" of the repo fetching stage. By making `RepositoryFunction#fetch` _return_ a map of `recordedInputValues`, rather than taking such a map as an argument and writing into it, we can completely remove the `recordedInputValues` field from `RepoFetchingSkyKeyComputeState`.

This allows us to parameterize the return type of the worker thread future, and thus rename `RepoFetchingSkyKeyComputeState` to just `WorkerSkyKeyComputeState`, and `RepoFetchingWorkerSkyFunctionEnvironment` to just `WorkerSkyFunctionEnvironment`. At this point, these two classes already have nothing to do with repo fetching, and can be moved to a shareable location.

In a follow-up CL, I plan to refactor the code further such that some of the code interacting with these two classes in `StarlarkRepositoryFunction#fetch` can be moved into the two `Worker*` classes as well, so that the user of worker threads can enjoy a cleaner API.

Work towards #22729

PiperOrigin-RevId: 669024825
Change-Id: Ic1529a39ca81096812eff5e0fd74622b2ef0b68f
copybara-service bot pushed a commit that referenced this issue Aug 30, 2024
Following up to d24f947, this CL consolidates the API surface of using worker threads down to one class and one method, `WorkerSkyKeyComputeState.startOrContinueWork`. See Javadoc for that method for more information.

Also took this chance to move these two "Worker"-related classes to jcg/devtools/build/skyframe, as they're now completely decoupled from repo fetching.

Work towards #22729

PiperOrigin-RevId: 669390115
Change-Id: I9f9a743c9f1a92bc65af8bccb98c53bc22439204
@criemen
Copy link
Contributor

criemen commented Aug 31, 2024

Is this something you'd feel comfortable to pull into 7.4.0?

Wyverald added a commit that referenced this issue Sep 3, 2024
SEEF has a few inner classes that can easily be broken out. This CL does just that, without changing any behavior. This should make the size of SEEF.java more manageable while we try to add worker threads to extension eval.

Also took this chance to migrate some AutoValues to records.

Work towards #22729

PiperOrigin-RevId: 668808611
Change-Id: I65a0cb7024ca49e2f4876478fbae16ebf5c325e6
Wyverald added a commit that referenced this issue Sep 3, 2024
Today, the code we have for using worker threads for repo fetching is tightly coupled with repo fetching. This doesn't need to be the case. This CL refactors some code in repo fetching so that eventually we can reuse the worker thread stuff for other use cases (for example, module extension eval).

Most notably, `RepoFetchingSkyKeyComputeState` has a `recordedInputValues` field. This is conceptually an "output" of the repo fetching stage. By making `RepositoryFunction#fetch` _return_ a map of `recordedInputValues`, rather than taking such a map as an argument and writing into it, we can completely remove the `recordedInputValues` field from `RepoFetchingSkyKeyComputeState`.

This allows us to parameterize the return type of the worker thread future, and thus rename `RepoFetchingSkyKeyComputeState` to just `WorkerSkyKeyComputeState`, and `RepoFetchingWorkerSkyFunctionEnvironment` to just `WorkerSkyFunctionEnvironment`. At this point, these two classes already have nothing to do with repo fetching, and can be moved to a shareable location.

In a follow-up CL, I plan to refactor the code further such that some of the code interacting with these two classes in `StarlarkRepositoryFunction#fetch` can be moved into the two `Worker*` classes as well, so that the user of worker threads can enjoy a cleaner API.

Work towards #22729

PiperOrigin-RevId: 669024825
Change-Id: Ic1529a39ca81096812eff5e0fd74622b2ef0b68f
Wyverald added a commit that referenced this issue Sep 3, 2024
Following up to d24f947, this CL consolidates the API surface of using worker threads down to one class and one method, `WorkerSkyKeyComputeState.startOrContinueWork`. See Javadoc for that method for more information.

Also took this chance to move these two "Worker"-related classes to jcg/devtools/build/skyframe, as they're now completely decoupled from repo fetching.

Work towards #22729

PiperOrigin-RevId: 669390115
Change-Id: I9f9a743c9f1a92bc65af8bccb98c53bc22439204
Wyverald added a commit that referenced this issue Sep 3, 2024
Now all the pieces are in place. Could it really be this simple??

Fixes #22729

PiperOrigin-RevId: 669391274
Change-Id: Idc58d24558e6db1331596ba33320a6f30c066b56
@Wyverald
Copy link
Member Author

Wyverald commented Sep 3, 2024

Is this something you'd feel comfortable to pull into 7.4.0?

Sent #23504

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-Bzlmod Bzlmod-specific PRs, issues, and feature requests P1 I'll work on this now. (Assignee required) team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. type: feature request
Projects
None yet
Development

No branches or pull requests

3 participants