Fix part of #59, part of #3926: Upgrade compute affected tests #4929

BenHenning · 2023-03-29T19:03:43Z

Explanation

Fix part of #59
Fix part of #3926

As part of developing downstream PRs for #59, it was discovered that PRs which change a LOT of files (such as #4937) can run into problems where ComputeAffectedTests simply times out trying to compute the entire list of affected targets.

Critical performance and compatibility fixes

There have been past efforts to optimize the affected tests workflows (bucketing, breaking up some of the computations), but it was discovered that the most expensive part of the process is running the rbuildfiles query to figure out which BUILD files were affected by changed files. It was known this was an expensive step in the past, but it wasn't clear until this PR exactly how to address it. This PR changes the script to now:

Filter rbuildfiles to only run under first-party targets (since it shouldn't be possible for third-party BUILD files to be affected by first-party changes). This reduces the search graph.
Introduce Bazel command sharding for this step like the script already does for others. This breaks up the command to run on a subset of files multiple times, combining the output rather than running a single command with a large input. It seems that rbuildfiles does not scale linearly with its input size, so this drastically improves the script's performance. It's thought that this approach is also more logically correct due to more correct sibling matching semantics, but it's a bit hard to reason about bazel query behavior at times so I can't be 100% confident in this. Nevertheless, the existing tests pass and I haven't seen any issues from using these changes in downstream PRs.

Separately, another issue was discovered wherein some commands (including certain Bazel commands) can actually cause CommandExecutorImpl to soft-lock and always time out. This was due to an issue in the previous implementation wherein it would wait to read a command's stdout until after the timeout has been completed (i.e. it assumed the process would finish). This isn't correct, however: stdout is blocking I/O, and some commands are implemented to only continue execution after their standard output is read. The new implementation makes use of coroutine actors to consume stdout and stderr at the same time as waiting for execution to ensure all commands can continue execution and that they finish within the desired timeout. Note that the new ScriptBackgroundCoroutineDispatcher was actually introduced (in #5313) specifically to support this new CommandExecutorImpl implementation, though it has since been found to have lots of other nice benefits by providing scripts with a reliable mechanism for performing asynchronous operations without having to manage their own execution dispatcher.

Command execution for Bazel commands has also been updated to time out after 5 minutes rather than the previous 1 minute. Despite the optimizations and robustness improvements above, some commands still take quite some time to run for especially large and complex cases. While this change may result in a slower failure turnaround in cases when commands are soft-locked, it should result in better CI and script robustness in the long-term.

Better support for chained PRs & possibly merge queues

ComputeAffectedTests was also updated to use a merge base commit rather than a reference to the develop branch (where this new commit hash is provided by the CI workflow). The idea behind this is that the merge base commit is:

More logically correct (as ComputeAffectedTests is meant to run in contexts where a branch wants to be merged into a destination).
Better compatible with chained PRs. This allows for significantly better CI performance in chained PR situations since now only the tests relevant to the child PR will be run rather than all tests for the child & its parental hierarchy (see downstream PRs' CI runs to see this working in action).
Hopefully better compatibility with merge queues (Integrate with GitHub merge queues #3926). This hasn't been verified, but the flexibility in customizing the destination for affected tests should be the main prerequisite to properly supporting merge queues.

Other changes

GitClient was updated to have a peace-of-mind check to ensure the base commit (provided as explained in the previous section) matches the merge base of the current branch. This should always be true (except maybe in merge queues--this will need to be verified). Note that this is only a soft warning, not an assertion failure.

RepositoryFile was cleaned up slightly to be a bit more consistent with other directory management approaches done in scripts. I can see this being refactored more in the future. Callsite behavior isn't expected to be affected by these changes.

Some script tests were updated to have consistent formatting (which required updating the TODO exemptions). TodoIssueResolvedCheckTest and TodoOpenCheckTest also had some of their test file management cleaned up a bit.

A note on testing

These are inherently difficult things to test. I've verified what I could via CI and general observation, but I've also largely relied on existing tests to catch regressions (and many were caught during changes to the script). Since these are mainly implementation and not behavioral changes, I'm comfortable with the level of testing that was done.

Essential Checklist

The PR title and explanation each start with "Fix #bugnum: " (If this PR fixes part of an issue, prefix the title with "Fix part of #bugnum: ...".)
Any changes to scripts/assets files have their rationale included in the PR explanation.
The PR follows the style guide.
The PR does not contain any unnecessary code changes from Android Studio (reference).
The PR is made from a branch that's not called "develop" and is up-to-date with "develop".
The PR is assigned to the appropriate reviewers (reference).

For UI-specific PRs only

N/A -- This is an infrastructure-only change.

This addresses an underlying bug with the command executor that can, in some cases, break compute_affected_tests. It also refines some of its internal mechanisms for much better performance on expensive PRs. It also prepares the base support needed for merge queues, but the CI workflows aren't being updated in this change.

This prepares for merge queues (but doesn't quite configure the workflow for them--that will happen in a different PR), and improves how tests are computed for stacked PRs.

Also, update TODO check script to have nicer output, and support generating the exemption textproto file for easier updates in the future.

oppiabot · 2023-04-08T03:24:52Z

Hi @BenHenning, I'm going to mark this PR as stale because it hasn't had any updates for 7 days. If no further activity occurs within 7 days, it will be automatically closed so that others can take up the issue.
If you are still working on this PR, please make a follow-up commit within 3 days (and submit it for review, if applicable). Please also let us know if you are stuck so we can help you!

The new proto target isn't used anywhere so this was missed.

oppiabot · 2023-05-18T23:50:42Z

Hi @BenHenning, I'm going to mark this PR as stale because it hasn't had any updates for 7 days. If no further activity occurs within 7 days, it will be automatically closed so that others can take up the issue.
If you are still working on this PR, please make a follow-up commit within 3 days (and submit it for review, if applicable). Please also let us know if you are stuck so we can help you!

BenHenning · 2023-05-22T18:49:03Z

Still active.

oppiabot · 2023-05-29T19:42:50Z

Hi @BenHenning, I'm going to mark this PR as stale because it hasn't had any updates for 7 days. If no further activity occurs within 7 days, it will be automatically closed so that others can take up the issue.
If you are still working on this PR, please make a follow-up commit within 3 days (and submit it for review, if applicable). Please also let us know if you are stuck so we can help you!

Conflicts: scripts/assets/todo_open_exemptions.textproto

These issues were found after I started using a new development environment.

ProfileAndDeviceIdFragmentTest had been updated to use a newer fragment initialization pattern, but that's no longer needed and seems to be causing what appears to be timing discrepancies between local dev and CI.

The issue ultimately arose from test parameters being initialized after they're needed in the launched UI. This type of change was tried earlier in the branch, but reverted since it didn't seem necessary. It is, however, necessary when there are environment differences (e.g. local vs. CI) or when running certain tests individually. Due to the difficulty in finding this issue, ActivityScenarioRule has been added as a prohibited pattern in the static regex checks (along with ActivityTestRule since that's deprecated and discouraged, anyway).

The test was suffering from some proto encoding inconsistencies that seem to occur between some development machines vs. on CI. The fix improves the test's robustness by extracting the raw encoded string, verifying that the other outputs in the intent message correctly correspond to that string, and that the string (as a parsed proto) contains the correct values. As a result, the test no longer depends on a hardcoded encoding value to be present for verification. This does result in a bit more logic than is generally good to have in a test (and it lengthened the test code quite a bit), but it seems necessary in this particular case.

…cted-tests Conflicts: scripts/assets/todo_open_exemptions.textproto scripts/src/javatests/org/oppia/android/scripts/ci/ComputeAffectedTestsTest.kt scripts/src/javatests/org/oppia/android/scripts/common/GitClientTest.kt scripts/src/javatests/org/oppia/android/scripts/todo/TodoIssueResolvedCheckTest.kt scripts/src/javatests/org/oppia/android/scripts/todo/TodoOpenCheckTest.kt

BenHenning

Self-reviewed latest changes.

BenHenning · 2024-05-17T00:01:54Z

@adhiamboperes PTAL.

adhiamboperes

Thanks @BenHenning! These chnages LGTM.

adhiamboperes · 2024-05-17T12:40:34Z

Please feel free to merge this PR.

oppiabot · 2024-05-17T12:42:58Z

Hi @BenHenning, this PR is ready to be merged. Please address any remaining comments prior to merging, and feel free to merge this PR once the CI checks pass and you're happy with it. Thanks!

BenHenning · 2024-05-22T21:24:05Z

Thanks @adhiamboperes! Will be merging this shortly.

## Explanation Fixes #5407 This introduces a proper fallback for the base branch used for computing affected tests when off-develop: ``origin/develop`` (which was the same base used prior to #4929 being merged). This can't easily be verified without being merged, but I have high confidence it'll work since we've used the JavaScript ``||`` fallback syntax in GitHub workflow ``if`` conditionals in the past (and the environment should be the same for ``env`` variables). Note that this is being fixed forward rather than reverted because the only verification is merging the PR, so it's actually faster and simpler to fix forward than to revert the original PR and re-submit a fix (which has the same likelihood of fixing CI as this fix forward). ## Essential Checklist - [x] The PR title and explanation each start with "Fix #bugnum: " (If this PR fixes part of an issue, prefix the title with "Fix part of #bugnum: ...".) - [x] Any changes to [scripts/assets](https://github.com/oppia/oppia-android/tree/develop/scripts/assets) files have their rationale included in the PR explanation. - [x] The PR follows the [style guide](https://github.com/oppia/oppia-android/wiki/Coding-style-guide). - [x] The PR does not contain any unnecessary code changes from Android Studio ([reference](https://github.com/oppia/oppia-android/wiki/Guidance-on-submitting-a-PR#undo-unnecessary-changes)). - [x] The PR is made from a branch that's **not** called "develop" and is up-to-date with "develop". - [x] The PR is **assigned** to the appropriate reviewers ([reference](https://github.com/oppia/oppia-android/wiki/Guidance-on-submitting-a-PR#clarification-regarding-assignees-and-reviewers-section)). ## For UI-specific PRs only N/A -- This only affects CI infrastructure.

## Explanation Fixes part of #1719 and #3709 This PR introduces a new script & CI workflow for computing build stats to compare both AABs and universal APKs between develop and the changes in a given PR, as part of fixing #1719 (though this PR doesn't cover everything outlined in that PR). This information is then detailed and uploaded as a CI build artifact, and summarized & posted as a comment in the PR. Some details included in the summary report: - APK file/download size differences - Method count differences - Feature/permission differences - New/removed resources & assets The script supports computing differences for multiple "profiles" at the same time, and the CI workflow has been set up to compute four: 1. dev 2. alpha 3. beta 4. GA This workflow will be optional since it's very expensive to run (it has to assemble 8 builds, 6 of which are Proguarded). It also doesn't really need to be run in order to approve a PR, though reviewers may insist on waiting for large or suspicious changes (such as PRs introducing new dependencies) to ensure the actual affected changes are as expected. In order to mitigate this expense, the CI workflow runs on a scheduled cron job off of develop across all open PRs and checks them in a group. It runs at most once per day (based on https://github.com/orgs/community/discussions/55768#discussioncomment-5941720) so multiple changes to a PR will be picked up with a single check in the next cron run. Currently, it will run even for a PR that hasn't changed since the last run (but this is something that can be improved in the future if it needs to be). It's being scheduled for 2:30am (02:30) UTC which seems to have a few specific benefits: - Per GitHub documentation, initiating the workflow outside the start of the hour should reduce likelihood of cancellation (since the start of the hour tends to use the most resources): https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#schedule. - This corresponds to 7:30pm PT, 2:30am GMT, 8:00am IST, 5:30am EAT, and 12:30pm AEST (just as a basis for a different part of the world). It's actually a very nice time that shouldn't overlap with almost any development in main locations around the world, so it hopefully won't impact Oppia organization GitHub CI resources. The example output from these workflows can be observed in a few places: - Later in this PR (for back when the PR was configured to run the new workflow per PR change). - In #4261 which demonstrates the large math PRs and how they changed the builds back before those were merged. - https://github.com/BenHenning/oppia-android/actions/workflows/stats.yml and https://github.com/BenHenning/oppia-android/pulls (specifically: BenHenning#14, BenHenning#13, BenHenning#12) which demonstrates the workflow running correctly from a scheduled cron (https://github.com/BenHenning/oppia-android/actions/runs/9232187176) and posting the updates to open PRs. Beyond that, implementing this utility involved several significant changes to various systems, including the build graph: - Three new utilities were added for the script: Aapt2Client, ApkAnalyzerClient, and BundleToolClient. Each of these correspond to Android CLI utilities, but each required special considerations: - Aapt2Client requires direct access to the Android SDK, but fortunately android_sdk_repository exposes this as a target so it's trivial to pass it in & call it. Some build information is needed, too (see next outer point). - ApkAnalyzerClient couldn't use the apkanalyzer CLI contained within the SDK since it's not exported by android_sdk_repository. Instead, we needed to depend on the CLI's internal implementation library (which I suspect is what Android Studio probably uses for its own APK Analyzer tool). This required some new implementation. - BundleToolClient fortunately can call right into the bundle tool library that we use when building AABs, but unfortunately that tool appears to not be designed to be called multiple times in the same process. Because Java doesn't support forking, we actually needed to fake a fork function by starting a new Java process using the current process's classpath in order to re-run bundle tool for each needed routine. Additionally, bundle tool required https://github.com/oppia/archive-patcher (which needed new BUILD files since it only supported Gradle building previously) and a non-Android version of Guava (see below for the changes this has caused). - A new build_vars.bzl was introduced to define the build SDK & build tools versions (this is done in a way where they can actually be passed to the new script's utilities since it needs to access aapt2). - rules_kotlin had a bug where resources wouldn't be pulled in properly for kt_jvm_library (see bazelbuild/rules_kotlin#281), but this was mitigated in a previous PR by upgrading rules_kotlin past alpha 2. - The new functionality required the JRE-compatible version of Guava (over the Android-constrained library used in the codebase today), but this introduces a one-version issue. The solution ended up being isolating the JRE-compatible Guava library to its own library with a slightly hacky direct reference to it in BundleToolClient. Some of the other attempts at solving this resulted in some Maven reference cleanups in existing script documentation. This functionality will be improved in downstream PRs, but other attempts that were originally made to isolate this cleanly were: - Introduce multiple maven_install files and isolate dependencies into: production, tests, scripts. This has a number of nice benefits (more correct licenses and faster Maven dependency fetches for production), but it results in very tricky one-version violations for test targets that cross dependencies between production and tests. - Isolated maven_install just for scripts. This is closer to the solution we'll want long-term, but it was too much complexity to fully introduce in this PR so it's been reworked into a downstream PR that can focus on cleaning up third-party dependency management across the whole codebase. This PR is introducing a few new dependencies that, in turn, pull in a *bunch* of transitive dependencies. These are all due to the new ``apkanalyzer`` dependency. While it will affect licenses for this specific PR, once third-party dependencies for scripts are cleaned up in a downstream PR they will be moved out (since they are script-only dependencies). Separately, also note that the AAPT2 utility requires stdout to be processed continuously in order for the process to finish. This was one of the primary reasons CommandExecutorImpl was reworked in #4929. For testing: most of the changes in this PR have been extensively manually tested. However, the new utilities are lacking significant automated tests. Since this utility is a nice-to-have for the rest of the Bazel PR chain, it's being prioritized to be merged in spite of lacking code coverage. #4971 has been filed to track adding these missing tests in the long-term. ## Essential Checklist - [x] The PR title and explanation each start with "Fix #bugnum: " (If this PR fixes part of an issue, prefix the title with "Fix part of #bugnum: ...".) - [x] Any changes to [scripts/assets](https://github.com/oppia/oppia-android/tree/develop/scripts/assets) files have their rationale included in the PR explanation. - [x] The PR follows the [style guide](https://github.com/oppia/oppia-android/wiki/Coding-style-guide). - [x] The PR does not contain any unnecessary code changes from Android Studio ([reference](https://github.com/oppia/oppia-android/wiki/Guidance-on-submitting-a-PR#undo-unnecessary-changes)). - [x] The PR is made from a branch that's **not** called "develop" and is up-to-date with "develop". - [x] The PR is **assigned** to the appropriate reviewers ([reference](https://github.com/oppia/oppia-android/wiki/Guidance-on-submitting-a-PR#clarification-regarding-assignees-and-reviewers-section)). ## For UI-specific PRs only N/A -- This only affects CI workflows & the build system. Technically, some dependency changes in the build system could have UI effects, but there should be no such changes in this PR. --------- Co-authored-by: Adhiambo Peres <59600948+adhiamboperes@users.noreply.github.com> Co-authored-by: Sean Lip <sean@seanlip.org>

BenHenning added 4 commits March 28, 2023 20:43

Use base SHA for computing affected tests.

c325bc9

This prepares for merge queues (but doesn't quite configure the workflow for them--that will happen in a different PR), and improves how tests are computed for stacked PRs.

Fix updated TODOs.

8ae5aa8

Also, update TODO check script to have nicer output, and support generating the exemption textproto file for easier updates in the future.

Merge branch 'develop' into upgrade-compute-affected-tests

aaa0299

BenHenning mentioned this pull request Apr 4, 2023

Fix part of #3926: Add support for CI checks in merge queues [Blocked: #4929] #4913

Closed

6 tasks

oppiabot bot added the stale Corresponds to items that haven't seen a recent update and may be automatically closed. label Apr 8, 2023

oppiabot bot closed this Apr 15, 2023

BenHenning added 2 commits May 11, 2023 15:12

Merge branch 'develop' into upgrade-compute-affected-tests

8a19b4e

Fix breakage from develop.

773f7ff

The new proto target isn't used anywhere so this was missed.

BenHenning reopened this May 11, 2023

oppiabot bot removed the stale Corresponds to items that haven't seen a recent update and may be automatically closed. label May 11, 2023

oppiabot bot added the stale Corresponds to items that haven't seen a recent update and may be automatically closed. label May 18, 2023

oppiabot bot removed the stale Corresponds to items that haven't seen a recent update and may be automatically closed. label May 22, 2023

oppiabot bot added the stale Corresponds to items that haven't seen a recent update and may be automatically closed. label May 29, 2023

BenHenning closed this Jun 2, 2023

BenHenning added the TODO(Ben): Revive label Jun 3, 2023

BenHenning and others added 10 commits June 5, 2023 22:47

Merge branch 'develop' into upgrade-compute-affected-tests

525a601

Conflicts: scripts/assets/todo_open_exemptions.textproto

Fix a variety of dev platform-specific issues.

667bf89

These issues were found after I started using a new development environment.

Tidy some things up, and revert fragment test.

fb59232

ProfileAndDeviceIdFragmentTest had been updated to use a newer fragment initialization pattern, but that's no longer needed and seems to be causing what appears to be timing discrepancies between local dev and CI.

Merge branch 'develop' into fix-platform-specific-issues

714f3ea

Merge branch 'develop' into fix-platform-specific-issues

e3d4091

Post-merge fix.

5671673

Merge branch 'develop' into fix-platform-specific-issues

bd1466d

oppiabot bot removed the stale Corresponds to items that haven't seen a recent update and may be automatically closed. label May 16, 2024

BenHenning changed the base branch from make-todo-check-locally-runnable to develop May 16, 2024 22:27

An error occurred while trying to automatically change base from make-todo-check-locally-runnable to develop May 16, 2024 22:27

BenHenning changed the title ~~Upgrade compute affected tests~~ Fix part of #59, part of #3926: Upgrade compute affected tests May 16, 2024

Address self-review comments.

4cf06d4

BenHenning commented May 17, 2024

View reviewed changes

BenHenning marked this pull request as ready for review May 17, 2024 00:01

BenHenning requested a review from a team as a code owner May 17, 2024 00:01

BenHenning requested a review from adhiamboperes May 17, 2024 00:01

BenHenning assigned adhiamboperes and unassigned BenHenning May 17, 2024

adhiamboperes approved these changes May 17, 2024

View reviewed changes

adhiamboperes assigned BenHenning and unassigned adhiamboperes May 17, 2024

oppiabot bot added the PR: LGTM label May 17, 2024

Merge branch 'develop' into upgrade-compute-affected-tests

3dc6b7b

BenHenning enabled auto-merge (squash) May 22, 2024 21:25

BenHenning disabled auto-merge May 22, 2024 21:25

BenHenning enabled auto-merge (squash) May 22, 2024 21:25

BenHenning merged commit 24b1721 into develop May 22, 2024
26 checks passed

BenHenning deleted the upgrade-compute-affected-tests branch May 22, 2024 21:56

This was referenced May 22, 2024

[BUG]: Bazel tests won't run on develop CI #5407

Closed

Fix #5407: Fix Bazel CI runs on develop #5408

Merged

BenHenning mentioned this pull request May 25, 2024

Fix part of #1719, part of #3709: Add build stats CI workflow #4092

Merged

6 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix part of #59, part of #3926: Upgrade compute affected tests #4929

Fix part of #59, part of #3926: Upgrade compute affected tests #4929

BenHenning commented Mar 29, 2023 •

edited

Loading

oppiabot bot commented Apr 8, 2023

oppiabot bot commented May 18, 2023

BenHenning commented May 22, 2023

oppiabot bot commented May 29, 2023

BenHenning left a comment

BenHenning commented May 17, 2024

adhiamboperes left a comment

adhiamboperes commented May 17, 2024

oppiabot bot commented May 17, 2024

BenHenning commented May 22, 2024

Fix part of #59, part of #3926: Upgrade compute affected tests #4929

Fix part of #59, part of #3926: Upgrade compute affected tests #4929

Conversation

BenHenning commented Mar 29, 2023 • edited Loading

Explanation

Essential Checklist

For UI-specific PRs only

oppiabot bot commented Apr 8, 2023

oppiabot bot commented May 18, 2023

BenHenning commented May 22, 2023

oppiabot bot commented May 29, 2023

BenHenning left a comment

Choose a reason for hiding this comment

BenHenning commented May 17, 2024

adhiamboperes left a comment

Choose a reason for hiding this comment

adhiamboperes commented May 17, 2024

oppiabot bot commented May 17, 2024

BenHenning commented May 22, 2024

BenHenning commented Mar 29, 2023 •

edited

Loading