Regressions in System.Threading.Tests.Perf_Timer #79409

Closed
performanceautofiler bot opened this issue Dec 8, 2022 · 8 comments
Assignees: EgorBo
Labels: area-System.Threading.Tasks, tenet-performance, tenet-performance-benchmarks

Comments

@performanceautofiler

Run Information

Architecture arm64
OS Windows 10.0.25094
Baseline aa8fe36e105a1c0498c733b69685b12f820175df
Compare 843441ecc0831b0621bdf7561e73c5d77cd4301b
Diff Diff

Regressions in System.Threading.Tests.Perf_Timer

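| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ScheduleManyThenDisposeMany - Duration of single invocation | 595.44 ms | 779.50 ms | 1.31 | 0.22 | False | | | | | |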

graph
Test Report

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.Threading.Tests.Perf_Timer*'

Payloads

Baseline
Compare

Histogram

System.Threading.Tests.Perf_Timer.ScheduleManyThenDisposeMany


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 779.496296 > 631.6900120588236.
IsChangePoint: Marked as a change because one of 12/1/2022 7:29:59 PM, 12/8/2022 3:49:24 AM falls between 11/29/2022 2:55:01 PM and 12/8/2022 3:49:24 AM.
IsRegressionStdDev: Marked as regression because -27.275364789733967 (T) = (0 -760402180.7058823) / Math.Sqrt((287760363016342.4 / (18)) + (291508008401776.56 / (17))) is less than -2.034515297446192 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (18) + (17) - 2, .025) and -0.2601983022924083 = (603398829.6307381 - 760402180.7058823) / 603398829.6307381 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
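
The IsRegressionStdDev check above can be re-derived by hand. The following is a minimal sketch, not the autofiler's actual code: it assumes the two large numbers are sample variances, that 18 and 17 are the sample counts, and that the means are in nanoseconds. The helper name `welch_t` is hypothetical. Note that the reported T value is only reproduced when the baseline mean (603398829.6307381) is used in place of the leading `0` printed in the formula.

```python
import math


def welch_t(mean_base, var_base, n_base, mean_cmp, var_cmp, n_cmp):
    """t statistic: difference of the means divided by the combined standard error."""
    std_err = math.sqrt(var_base / n_base + var_cmp / n_cmp)
    return (mean_base - mean_cmp) / std_err


# Values copied from the IsRegressionStdDev line (assumed to be nanoseconds).
baseline_mean = 603398829.6307381
compare_mean = 760402180.7058823
baseline_var, baseline_n = 287760363016342.4, 18   # assumed: sample variance, count
compare_var, compare_n = 291508008401776.56, 17    # assumed: sample variance, count

t_stat = welch_t(baseline_mean, baseline_var, baseline_n,
                 compare_mean, compare_var, compare_n)
rel_change = (baseline_mean - compare_mean) / baseline_mean

# Thresholds as reported in the issue text.
T_CRITICAL = -2.034515297446192   # StudentT.InvCDF(0, 1, 18 + 17 - 2, .025)
REL_THRESHOLD = -0.05             # compare must be more than 5% slower

is_regression = t_stat < T_CRITICAL and rel_change < REL_THRESHOLD
print(f"T = {t_stat:.6f}, relative change = {rel_change:.6f}, regression = {is_regression}")
```

Both conditions must hold for the check to flag a regression: the difference has to be statistically significant and larger than 5% of the baseline.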

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

Run Information

Architecture arm64
OS Windows 10.0.25094
Baseline aa8fe36e105a1c0498c733b69685b12f820175df
Compare 843441ecc0831b0621bdf7561e73c5d77cd4301b
Diff Diff

Regressions in System.Tests.Perf_UInt32

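| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ToString - Duration of single invocation | 2.54 ns | 4.42 ns | 1.74 | 0.90 | False | | | | | |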

graph
Test Report

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.Tests.Perf_UInt32*'

Payloads

Baseline
Compare

Histogram

System.Tests.Perf_UInt32.ToString(value: 0)


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 4.421718289376452 > 2.6488778258769807.
IsChangePoint: Marked as a change because one of 10/3/2022 4:48:37 PM, 12/1/2022 7:29:59 PM, 12/8/2022 3:49:24 AM falls between 11/29/2022 2:55:01 PM and 12/8/2022 3:49:24 AM.
IsRegressionStdDev: Marked as regression because -11.30156067733412 (T) = (0 -4.295665548960617) / Math.Sqrt((0.3113295496751816 / (18)) + (0.2531362416139169 / (17))) is less than -2.034515297446192 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (18) + (17) - 2, .025) and -0.8939492296978264 = (2.2680996309737282 - 4.295665548960617) / 2.2680996309737282 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

Run Information

Architecture arm64
OS Windows 10.0.25094
Baseline aa8fe36e105a1c0498c733b69685b12f820175df
Compare 843441ecc0831b0621bdf7561e73c5d77cd4301b
Diff Diff

Regressions in System.Tests.Perf_UInt64

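| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ToString - Duration of single invocation | 2.33 ns | 5.05 ns | 2.17 | 0.84 | False | | | | | |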

graph
Test Report

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.Tests.Perf_UInt64*'

Payloads

Baseline
Compare

Histogram

System.Tests.Perf_UInt64.ToString(value: 0)


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 5.051131508520759 > 2.868795317741063.
IsChangePoint: Marked as a change because one of 9/12/2022 8:20:58 PM, 10/3/2022 10:42:38 PM, 12/1/2022 7:29:59 PM, 12/8/2022 3:49:24 AM falls between 11/29/2022 2:55:01 PM and 12/8/2022 3:49:24 AM.
IsRegressionStdDev: Marked as regression because -11.318325125981792 (T) = (0 -4.894545995643823) / Math.Sqrt((0.16085245528718428 / (18)) + (0.5745113894330254 / (17))) is less than -2.034515297446192 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (18) + (17) - 2, .025) and -0.9157635919576466 = (2.554879952928989 - 4.894545995643823) / 2.554879952928989 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

@performanceautofiler bot added the arm64 and untriaged labels Dec 8, 2022
@dakersnar removed the refs/heads/main and untriaged labels Dec 8, 2022
@dakersnar
Contributor

ToString regressions are by design.

@dakersnar transferred this issue from dotnet/perf-autofiling-issues Dec 8, 2022
@dotnet-issue-labeler

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@ghost added the untriaged label Dec 8, 2022
@dakersnar changed the title from "[Perf] Windows/arm64: 3 Regressions on 12/2/2022 2:38:36 AM" to "Regressions in System.Threading.Tests.Perf_Timer" Dec 8, 2022
@dakersnar added the tenet-performance and tenet-performance-benchmarks labels Dec 8, 2022
@dakersnar
Contributor

Commit range for System.Threading.Tests.Perf_Timer regression is 41f57b7...9d7ffb5.

Because it is threading related, my guess is it is caused by #79091. There is also technically a DateTime change that could be related: #79107.

@ghost

ghost commented Dec 8, 2022

Tagging subscribers to this area: @dotnet/area-system-threading-tasks
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: performanceautofiler[bot]
Assignees: EgorBo
Labels:

area-System.Threading.Tasks, tenet-performance, tenet-performance-benchmarks, untriaged

Milestone: -

@stephentoub
Member

> Commit range for System.Threading.Tests.Perf_Timer regression is 41f57b7...9d7ffb5. Because it is threading related, my guess is it is caused by #79091. There is also technically a DateTime change that could be related: #79107.

I don't think either of those PRs touch code that would be used by that test.

@dakersnar
Contributor

Fair enough. Could it be another "by design" regression from #79061? None of the other commits seem noteworthy to me.

@stephentoub
Member

I don't think that could be involved either.

@stephentoub
Member

I don't see anything in the diffs that should have been able to impact this test. All the test does is create a bunch of Timers (with a huge timeout such that they'll never fire) and then destroy them all, so really it's just creating a bunch of objects, adding them to a linked list (taking and releasing a lock each time), and then removing those objects from the linked list (again taking and releasing a lock each time). Looking at the history, the test was reasonably stable until around July, then it dipped down, and this cited regression is it coming back up to approximately where it was before:
[image: benchmark history graph]
Looking at where it dipped previously, I don't see anything in the diff to explain that "improvement" either. I think this is just noise.
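
The shape of the workload described in this comment can be sketched roughly as follows. This is only an illustration in Python, not the .NET timer implementation or the benchmark's source: the `Node` and `LockedList` classes and the `schedule_many_then_dispose_many` function are hypothetical names, and the sketch assumes a single doubly linked list guarded by one lock.

```python
import threading


class Node:
    """Stand-in for a timer that never fires; it only carries list links."""
    __slots__ = ("prev", "next")

    def __init__(self):
        self.prev = None
        self.next = None


class LockedList:
    """Doubly linked list where every add/remove takes and releases one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._head = None
        self.count = 0

    def add(self, node):
        with self._lock:
            # Push the node at the head of the list.
            node.prev = None
            node.next = self._head
            if self._head is not None:
                self._head.prev = node
            self._head = node
            self.count += 1

    def remove(self, node):
        with self._lock:
            # Unlink the node from its neighbours (or from the head slot).
            if node.prev is not None:
                node.prev.next = node.next
            else:
                self._head = node.next
            if node.next is not None:
                node.next.prev = node.prev
            node.prev = node.next = None
            self.count -= 1


def schedule_many_then_dispose_many(n):
    """Create n nodes, link them all, then unlink them all."""
    queue = LockedList()
    nodes = [Node() for _ in range(n)]
    for node in nodes:
        queue.add(node)
    scheduled = queue.count
    for node in nodes:
        queue.remove(node)
    return scheduled, queue.count


print(schedule_many_then_dispose_many(1000))
```

Under these assumptions the work is dominated by object allocation and uncontended lock acquire/release, which fits the comment's point that none of the cited PRs would obviously touch this path.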

@stephentoub closed this as not planned Jan 20, 2023
@ghost removed the untriaged label Jan 20, 2023
@dotnet locked as resolved and limited conversation to collaborators Feb 20, 2023