This repository was archived by the owner on Jan 23, 2023. It is now read-only.

@ViktorHofer
Member Author

@wtgodbe is the branch open so that we can get this infra-only change in?

@wtgodbe
Member

wtgodbe commented Sep 11, 2020

is the branch open so that we can get this infra-only change in?

Yup, branches are open until next Friday.

@danmoseley
Member

Unrelated failure. Merging

@danmoseley danmoseley merged commit 2a4152d into release/3.1 Sep 11, 2020
@danmoseley danmoseley deleted the ViktorHofer-patch-1 branch September 11, 2020 17:46
@danmoseley
Member

@MattGal we've now removed this from master, 5.0-rc2, and 3.1 branches. Looks like it's not in 2.1. Should this fix all runs that start from now on?

So specifically, we can now close and reopen dotnet/aspnetcore#25010 to restart it, and it should receive machines that are unaffected? Then we can check that Ubuntu is giving us a dump.

@danmoseley
Member

I guess, put another way, clearly the machines are not "cleaned" (with respect to your scripts setting uname, etc.) between jobs. So I'm wondering how long it takes between us "stopping doing harm" here and the point at which they are reset/clean. Does it require explicit action by the engineering team to reset the queues?

cc @Chrisboh

@MattGal
Member

MattGal commented Sep 11, 2020

As soon as there's no work, and actually now any time work declines, the machines get effectively cleaned.

The good news for ubuntu.1804.amd64.open, at least, is that the oldest machine right now dates from 2020-09-11T18:25:07.064Z (about 9 minutes old as of this writing) and new ones were being created right when I looked, so at least for this queue (one of the ones called out) the machines are quite clean.

I'm happy to dig into this (and help move along the testing) for any other queues today. Once you've removed it from your main branches, I expect there may be a few stragglers here and there running private branch builds or submitting work from a local machine, and then it should subside.

@danmoseley
Member

Great, well let's look at what the latest iteration of dotnet/aspnetcore#25010 uncovers.

Incidentally, that job only verifies the queues that aspnet uses in its PR validation. Can you think of any way we can easily run that same test on other queues, ones that aren't wired up to PR validation but are only used in other runs? As opposed to just waiting to see whether we get test failures in rolling builds that are missing dumps.
