Bump Helix workitem timeout for libraries outerloop runs #49876

Merged


akoeplinger
Member

@akoeplinger akoeplinger commented Mar 19, 2021

System.IO.Compression.Brotli.Tests in the Debug configuration (which we use in PR builds) already takes 870 seconds, which is quite close to the existing 900s timeout.

=== TEST EXECUTION SUMMARY ===
   System.IO.Compression.Brotli.Tests  Total: 121, Errors: 0, Failed: 0, Skipped: 0, Time: 870.888s

This caused the timeout seen in https://helix.dot.net/api/2019-06-17/jobs/4f8db33f-41d8-43c4-97b9-a65dfd229437/workitems/System.IO.Compression.Brotli.Tests/console
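
To make the margin concrete, here is a minimal sketch that reads the summary line above and reports how much of the 900s timeout is left. The function name `timeout_headroom` and the regular expression are illustrative assumptions, not part of the Helix or xunit tooling.

```python
import re

# Hypothetical pattern for the summary line shown above; not an official format spec.
SUMMARY_RE = re.compile(
    r"Total: (\d+), Errors: (\d+), Failed: (\d+), Skipped: (\d+), Time: ([\d.]+)s"
)


def timeout_headroom(summary_line: str, timeout_s: float) -> float:
    """Return the fraction of the timeout that the run left unused."""
    match = SUMMARY_RE.search(summary_line)
    if match is None:
        raise ValueError("not a test execution summary line")
    elapsed_s = float(match.group(5))
    return (timeout_s - elapsed_s) / timeout_s


line = (
    "   System.IO.Compression.Brotli.Tests  "
    "Total: 121, Errors: 0, Failed: 0, Skipped: 0, Time: 870.888s"
)
headroom = timeout_headroom(line, timeout_s=900)
print(f"{headroom:.1%}")  # prints 3.2%
```

With roughly 3% of the budget left, a slightly slower machine is enough to push the work item over the limit.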

@ghost

ghost commented Mar 19, 2021

Tagging subscribers to this area: @Anipik, @safern, @ViktorHofer
See info in area-owners.md if you want to be subscribed.

Issue Details

System.IO.Compression.Brotli.Tests takes 870 seconds already which is quite close to the 900s timeout.

This caused the timeout seen in https://helix.dot.net/api/2019-06-17/jobs/4f8db33f-41d8-43c4-97b9-a65dfd229437/workitems/System.IO.Compression.Brotli.Tests/console

Author: akoeplinger
Assignees: -
Labels:

area-Infrastructure-libraries

Milestone: -

@ghost ghost added this to In Progress in Infrastructure Backlog Mar 19, 2021
@akoeplinger
Member Author

/azp run runtime-libraries-coreclr outerloop

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

System.IO.Compression.Brotli.Tests takes 870 seconds already which is quite close to the 900s timeout.
@akoeplinger akoeplinger force-pushed the bump-outerloop-workitem-timeout branch from 0868a90 to e3e8f84 Compare March 19, 2021 16:03
@akoeplinger
Member Author

/azp run runtime-libraries-coreclr outerloop

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@ViktorHofer
Member

cc @jeffhandley @adamsitnik @carlossanlop @Jozkee

Any idea why the Brotli tests take that long on Outerloop?

@akoeplinger
Member Author

> Any idea why the Brotli tests take that long on Outerloop?

I think the reason is that PR builds run with the Debug config, while rolling builds use Release (where these tests take about 230 seconds).

@ViktorHofer
Member

My concern was more general: are the Brotli tests intended to take that long, regardless of the configuration used?

@akoeplinger
Member Author

Can we move forward with this? The timeout bump is needed right now and the failed tests in this PR are already known/tracked by #44352.

@adamsitnik
Member

The Brotli tests predate me. Could you please tell me how I can check the execution times of individual tests using our infra? Once I have at least one troublemaking test name, I can take a look.

@ViktorHofer
Member

That should be possible, e.g. via a Kusto query. @safern @wfurt do you have such a query handy?

@wfurt
Member

wfurt commented Apr 30, 2021

Jobs
| project JobId, Name, Source, Type, Build, Started, Finished, QueueName
| where Started >= ago(10d)
| where Source == "ci/public/dotnet/runtime/refs/heads/main" and Type == "test/functional/cli/outerloop/"
| join kind=inner (
    WorkItems
    | project FriendlyName, JobId, Status, Finished - Started, ExitCode, FailCount
    | where FriendlyName == "System.IO.Compression.Brotli.Tests"
) on JobId

ARM seems 2x-3x slower than x64.

@akoeplinger
Member Author

@wfurt I think that's a bit misleading since from what I can see we run different build configs in main vs. PR outerloop runs.

E.g. if you look only at PR outerloop runs, all of the System.IO.Compression.Brotli.Tests runs that took longer than 12 minutes in the last 10 days were on amd64:

Jobs
| project JobId, Name, Source, Type, Build, Started, Finished, QueueName, Repository
| where Started >= ago(10d)
| where Source startswith "pr/public/dotnet/runtime" and Type == "test/functional/cli/outerloop/"
| join kind=inner (
    WorkItems
    | project FriendlyName, JobId, Status, Duration = Finished - Started, ExitCode, FailCount
    | where FriendlyName == "System.IO.Compression.Brotli.Tests"
) on JobId
| where Duration > 12m

| QueueName | Status | Duration |
| --- | --- | --- |
| windows.10.amd64.server19h1.es.open.rt | Pass | 00:12:04.0480000 |
| windows.10.amd64.serverrs5.open.rt | Pass | 00:13:16.4310000 |
| windows.81.amd64.open.rt | Pass | 00:12:16.1240000 |
| windows.10.amd64.server19h1.es.open.rt | Pass | 00:13:18.4820000 |
| windows.10.amd64.server20h1.open.rt | Pass | 00:13:51.4430000 |
| windows.10.amd64.serverrs5.open.rt | Pass | 00:12:13.1330000 |
| windows.81.amd64.open.rt | Pass | 00:12:23.2540000 |
| windows.10.amd64.server20h1.open.rt | Pass | 00:12:41.5990000 |
| windows.81.amd64.open.rt | Pass | 00:13:10.5980000 |
| windows.10.amd64.server20h1.open.rt | Pass | 00:13:11.5580000 |
| ubuntu.1804.amd64.open.rt | Pass | 00:12:07.4310000 |
| windows.10.amd64.server19h1.es.open.rt | Pass | 00:12:24.9460000 |
| windows.10.amd64.server20h1.open.rt | Pass | 00:13:30.9770000 |
| windows.10.amd64.serverrs5.open.rt | Pass | 00:13:18.4470000 |
| windows.81.amd64.open.rt | Pass | 00:12:47.8390000 |
| windows.10.amd64.server19h1.es.open.rt | Pass | 00:12:42.3510000 |
| windows.10.amd64.server20h1.open.rt | Pass | 00:12:57.0060000 |
| windows.10.amd64.serverrs5.open.rt | Pass | 00:12:41.2640000 |
| windows.10.amd64.server19h1.es.open.rt | Pass | 00:12:30.0630000 |

@adamsitnik unfortunately we don't track data about passing tests in Kusto, only in AzDO (and it's not easy to extract from there).
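
As a rough illustration of how close these PR runs come to the 900s limit, the sketch below converts two of the `Duration` values listed above (the shortest and the longest) into seconds and compares the slowest one with the timeout. The helper `parse_duration` is a hypothetical name and assumes the `hh:mm:ss.fffffff` shape seen in the query output.

```python
from datetime import timedelta


def parse_duration(text: str) -> timedelta:
    """Parse an hh:mm:ss.fffffff string (assumed format) into a timedelta."""
    hours, minutes, seconds = text.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes), seconds=float(seconds))


# Existing work item timeout mentioned in the PR description.
TIMEOUT = timedelta(seconds=900)

# Shortest and longest Duration values from the runs listed above.
durations = [parse_duration(d) for d in ("00:12:04.0480000", "00:13:51.4430000")]

slowest = max(durations)
remaining = TIMEOUT - slowest
print(f"slowest run: {slowest.total_seconds():.1f}s, "
      f"headroom: {remaining.total_seconds():.1f}s")
```

Under these assumptions the slowest listed run leaves about 68.6 seconds before the work item would be killed.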

@ericstj
Member

ericstj commented Jun 7, 2021

@akoeplinger are you still interested in this change?

@akoeplinger
Member Author

@ericstj Yeah I still think this is a good idea since the outerloop runs tend to take longer by design.

@ericstj
Member

ericstj commented Jun 9, 2021

OK, what are next steps here? My original comment was due to the PR age. Do you just need reviewers to approve?

@akoeplinger
Member Author

@ericstj yeah, just need someone to approve.

@ViktorHofer
Member

@adamsitnik does the data that @akoeplinger provided help?

@ViktorHofer
Member

This is good to go in, but our policy says the CI build should be recent before merging, so let me retrigger CI.

@ViktorHofer
Member

/azp run runtime-libraries-coreclr outerloop

@ViktorHofer
Member

/azp run runtime

@ViktorHofer
Member

/azp run dotnet-linker-tests

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@ViktorHofer
Member

/azp run runtime-dev-innerloop

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@ViktorHofer
Member

/azp run runtime-staging

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@akoeplinger akoeplinger merged commit 5bc8807 into dotnet:main Jun 15, 2021
Infrastructure Backlog automation moved this from In Progress to Done Jun 15, 2021
@akoeplinger akoeplinger deleted the bump-outerloop-workitem-timeout branch June 15, 2021 12:10
@ghost ghost locked as resolved and limited conversation to collaborators Jul 15, 2021