Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only build the GC simulator tests in the GC simulator leg #68177

Merged
merged 7 commits into from
Apr 22, 2022

Conversation

jkoritzinsky
Copy link
Member

Only build the GC simluator tests instead of all Pri1 tests. When we build all Pri1 tests in this leg, nearly all tests are skipped as only the GCSimluator tests are actually run. These skipped tests leave a ton of files on disk, causing issues in Helix. By only building the actual tests we are going to run, we can avoid this additional infrastructure cost.

Contributes to #68176

cc: @dotnet/gc

@ghost
Copy link

ghost commented Apr 18, 2022

Tagging subscribers to this area: @dotnet/runtime-infrastructure
See info in area-owners.md if you want to be subscribed.

Issue Details

Only build the GC simluator tests instead of all Pri1 tests. When we build all Pri1 tests in this leg, nearly all tests are skipped as only the GCSimluator tests are actually run. These skipped tests leave a ton of files on disk, causing issues in Helix. By only building the actual tests we are going to run, we can avoid this additional infrastructure cost.

Contributes to #68176

cc: @dotnet/gc

Author: jkoritzinsky
Assignees: -
Labels:

area-Infrastructure

Milestone: -

Only build the GC simluator tests instead of all Pri1 tests. When we build all Pri1 tests in this leg, nearly all tests are skipped as only the GCSimluator tests are actually run. These skipped tests leave a ton of files on disk, causing issues in Helix. By only building the actual tests we are going to run, we can avoid this additional infrastructure cost.
@jkoritzinsky
Copy link
Member Author

/azp run runtime-coreclr gc-simulator

@azure-pipelines
Copy link

No pipelines are associated with this pull request.

Copy link
Member

@trylek trylek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks Jeremy for fixing this. As another way of attacking this problem I'm looking into modifying CoreclrTestWrapperLib to remove all zero-size error files and all stdout captures for successful tests, that should substantially reduce the number of low-value files.

@jkoritzinsky
Copy link
Member Author

Yeah I think a multi-pronged approach is the best way to solve the problem.

@jkoritzinsky
Copy link
Member Author

@trylek I can't remember the gesture to manually run a pipeline on a PR from the AzDO UI, and it looks like the GC simulator pipeline's configuration requires that. Can you trigger the pipeline for me?

@jkoritzinsky
Copy link
Member Author

Or maybe wait until after #68172 is merged as that fix will be important to get an accurate picture of what build perf will be like.

@trylek
Copy link
Member

trylek commented Apr 18, 2022

I have triggered the gc simulator run; it should let us assess the Helix results. My PR will attack the other side of the equation, the duration of the managed test build phase in AzDO.

@trylek
Copy link
Member

trylek commented Apr 18, 2022

(FWIW the entire trick is to use refs/pull/<PR ID>/head as the "Commit" in the "Run new" AzDO UI for the pipeline.)

@trylek
Copy link
Member

trylek commented Apr 18, 2022

(I don't expect the perf hit due to filtering to be that detrimental on the AzDO build machine as it should be mostly empty as opposed to my HDD full of crap ;-).)

@jkoritzinsky
Copy link
Member Author

The build filter looks like it is set up correctly, but I think we need your change in #68172 to fix the msbuild issue in the Send To Helix step

@trylek
Copy link
Member

trylek commented Apr 18, 2022

Not sure if that's gonna help. This error usually means that AllProjects item group is empty. I'm going to look at the artifacts to try to understand why.

@trylek
Copy link
Member

trylek commented Apr 18, 2022

OK, I think I see the problem. It's another missing Condition but not one addressed by my pending PR. As the pending PR is almost baked I'm inclined to merge it in as is. Would you mind if I pushed a commit fixing this on top of this PR?

@jkoritzinsky
Copy link
Member Author

You can push the fix to this PR as it's the one hitting the issue

@trylek
Copy link
Member

trylek commented Apr 19, 2022

I believe I have fixed the primary build error; the execution legs still failed a bunch of tests; I retried them but frankly speaking I haven't found any passing ones in the history of this run so I'm not super hopeful.

@jkoritzinsky
Copy link
Member Author

It looks like I need to add filtering on the "copy native project assets" step as that step also generates the execution scripts.

@jkoritzinsky
Copy link
Member Author

jkoritzinsky commented Apr 21, 2022

I believe this change is ready for merge as soon as CI completes. I've verified that the only tests in src/tests that we're building are GCSimulator tests, we're only generating the xunit wrapper for those tests, and we're passing the right flags to run them.

I still expect the gc-simulator leg to be red as I think some of the tests in the leg are regularly failing (which is why all of the files were being uploaded and causing #68176 in the first place.

Also, this PR should be ported to any branch where we run gc-simulator tests. I think we still run them on 3.1 in dotnet/coreclr, so this change should be backported there or the test leg should be disabled to further mitigate #68176.

@trylek I'm going to be out, so can you drive this if I don't get it merged by Friday morning Redmond time?

@trylek
Copy link
Member

trylek commented Apr 22, 2022

Thanks Jeremy for driving this. I have booked vacation for tomorrow as I need to leave with my children for the weekend but I will be available in early PST morning hours; after that, I'll be back at my desk on Sunday PST afternoon. For 3.1 dotnet/coreclr testing, I'm not sure what is the right way to proceed as I only added the test filtering parameters (test / dir / tree) last year and backporting them to 3.1 would be at the very least tricky due to the number of changes in the test build scripts we made in the last couple of years.

@jkoritzinsky
Copy link
Member Author

GC Simulator legs are now only running the GC Simulator tests. This PR only touches runtime test infra, so the libraries test failures are unrelated. Merging this in to help relieve the issues we’re causing Helix.

@jkoritzinsky jkoritzinsky merged commit a806647 into dotnet:main Apr 22, 2022
@jkoritzinsky jkoritzinsky deleted the gc-simulator-test-filter branch April 22, 2022 06:17
@ghost ghost locked as resolved and limited conversation to collaborators May 22, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants