Suggestion: Public Compilation Performance Tracker #670

Open
chrisaut opened this Issue Feb 19, 2015 · 17 comments

@chrisaut

As performance is treated as a feature (http://blogs.msdn.com/b/csharpfaq/archive/2014/01/15/roslyn-performance-matt-gertz.aspx) I think it would be nice if there was a public site somewhere that tracked compilation performance of various solutions/projects over time (say nightly).

Something like Firefox's http://arewefastyet.com/

Every night it would get the latest source, and then time compilation of a handful of projects. Perhaps a simple Hello World, a medium-sized solution and a large solution to start. I was thinking it could time compiling Roslyn itself, but since code gets updated there it's probably not a good idea, except perhaps if it always compiled a fixed snapshot. I'm sure Microsoft has sample solutions that can be made open, or worst case the community could contribute here. A good source might also be the biggest/most popular C#/VB applications hosted on GitHub.

This would help catch performance regressions. For solutions that do not make use of new features, there could also be a baseline compilation measurement with the old/current compilers. Perhaps it could also measure memory usage and even other stats (IO activity, etc)?

If this were automated further, it could also be used to make sure pull requests don't cause performance regressions before they are merged.

@sharwell
Member

Anything that encourages people to pay attention to (and work to reduce) compilation times is great. The amount of developer time wasted daily on this task is disturbing.

One of the interesting things about the StyleCopAnalyzers project is one of the contributors created an independent project to track statistics about the project. Since the test you describe involves aggregating information available across multiple independent repositories, I imagine it could be implemented the same way.

@chrisaut

@sharwell I think you misunderstood. I don't mean get the latest Roslyn source to measure compiling it, but get the latest Roslyn source, compile that, and then use the output of that to measure compilation times of other solutions. I'm not phrasing this very well :-(

I agree with you that it's best if the exact same source is always compiled: "I was thinking it could time compiling roslyn itself, but since code gets updated there it's probably not a good idea, except perhaps if it always compiled a fixed snapshot."

@sharwell
Member

@chrisaut Yes I misunderstood, so I edited my comment now. 👍

@Pilchie
Member
Pilchie commented Feb 19, 2015

Tagging @jaredpar - we do generate perf results for a few solutions (including an older snapshot of Roslyn.sln) nightly. Would it be interesting to publish the results, even if we didn't publish the actual solutions and test scripts?

@chrisaut

Yeah sure. As long as a description is given. I don't think it's critical that those are open.

@pharring
Contributor

Thanks for the suggestion. It's right in line with what I've been discussing internally with my colleagues in CoreCLR.

Today, all our performance testing is 'closed' and that's unfortunate. Below is my 'backlog' of user stories. This isn't a commitment, because we've only just begun discussing it and we haven't made a firm plan, but I recognize it's a gap that we have to fill.

As a Roslyn/CoreCLR contributor, I should be able to:

  1. view daily performance metrics. What metrics are being tracked and how is the current build performing against those metrics?
  2. suggest and contribute new performance tests/scenarios/metrics to the regular runs.
  3. examine historical trends of performance metrics so I can see progress over time and spot long-term regressions or improvements.
  4. measure the performance impact of a pull request prior to merging with upstream/master.

Compiler throughput tests are the easiest ones to get going since they don't require a full Visual Studio install and they can be run from pretty much anywhere. We'll probably do those first.

@ManishJayaswal ManishJayaswal added this to the Unknown milestone Feb 26, 2015
@pharring
Contributor
pharring commented May 6, 2015

The first visible sign of this work is likely to be reporting of current performance on the CoreCLR and CoreFX builds. We expect to have something public by the end of May.

@chrisaut

Any news for this?

@pharring
Contributor

Thanks for the continued interest.

We experimented during May to get something running, but no-one in their right mind would call it anything more than a toy. You can see the "results" here: http://dotnet-ci.cloudapp.net/job/dotnet_roslyn_tp_windows/
In the "Azure artifacts" links, you'll see a "profile.etl" and "Report.xml" file. These are the two outputs of a single perf test (compiling "Hello, World!" in C# with Roslyn ten times). As you can see, this is running on every commit (continuous integration build). The only interesting metric in the report is the total GC bytes allocated. We're doing nothing with the data other than archiving at the moment.

Work stopped on this during June because the team was focused on shipping Dev14. In the last two weeks, we've picked this up again and I'm working full time on it for the next couple of sprints. I hope to have something to share in about 6 weeks (mid September). We'll build out the contributor experience first (how to author perf tests and how to run perf tests locally). Next, we'll figure out the PR workflow (hooking into Jenkins and using either cloud VMs or "perf lab" hardware). We probably won't get into detailed reporting (historical trend-lines, automated regression detection, and so on) until much later this year.

@chrisaut

Thanks, please keep us informed.

Anything the community can help with here? Maybe come up with test projects/cases that stress different parts of the compiler, etc.?

@pharring
Contributor

Thanks for the enthusiasm. Hold tight for now. Eventually, yes, we want authoring a new performance test to be as easy as adding a new unit test and use the same workflow. Once it's up and running, we'll be accepting pulls to add new performance workloads.

@pharring
Contributor

Quick update on this: I've been working with the .NET engineering infrastructure team on two projects that we'll be piloting with Roslyn in the next few weeks. The first is an extension to xunit that allows us to author performance tests ("Benchmarks") in xunit and run them while capturing under a profiler. The second is a distributed testing system which supports multiple platforms (Windows, Linux, MacOS) and allows us to run tests on arbitrary 'agents'. These agents may be running on physical or virtual hardware and may be pooled together to share work items.

In the next two weeks we'll begin authoring new performance tests for Roslyn and we'll integrate our Jenkins CI builds with the distributed test system.

There's one large missing piece, and that's reporting. We have a very simple tool that can compare the output of two runs and report on regressions/improvements. It's primitive, but it'll have to do for now. Eventually, I'd like to have a reporting system with historical trends (for daily runs) along with annotations. It's on the backlog, but not currently scheduled.
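
To make the idea of comparing two runs concrete, here is a minimal sketch. It is not the actual tool: the names (`RunComparer`, `Classify`, `Compare`), the 5% noise threshold and the sample numbers are all assumptions made up for illustration. Each run is modelled as a map from scenario name to a measured value, such as elapsed milliseconds or bytes allocated, where lower is better.

```csharp
using System;
using System.Collections.Generic;

public static class RunComparer
{
    // Hypothetical threshold: relative changes smaller than 5% are treated as noise.
    public const double NoiseThreshold = 0.05;

    // Classifies one metric, where a lower value is better.
    public static string Classify(double baseline, double current)
    {
        if (baseline <= 0) return "no baseline";
        double change = (current - baseline) / baseline;
        if (change > NoiseThreshold) return "regression";
        if (change < -NoiseThreshold) return "improvement";
        return "unchanged";
    }

    // Compares every scenario in the current run against the baseline run.
    public static Dictionary<string, string> Compare(
        IDictionary<string, double> baselineRun,
        IDictionary<string, double> currentRun)
    {
        var result = new Dictionary<string, string>();
        foreach (var entry in currentRun)
        {
            double baseline;
            result[entry.Key] = baselineRun.TryGetValue(entry.Key, out baseline)
                ? Classify(baseline, entry.Value)
                : "no baseline";
        }
        return result;
    }

    public static void Main()
    {
        // Made-up sample data, not real measurements.
        var before = new Dictionary<string, double> { { "HelloWorld", 100.0 }, { "LargeSolution", 2000.0 } };
        var after = new Dictionary<string, double> { { "HelloWorld", 120.0 }, { "LargeSolution", 1990.0 } };
        foreach (var entry in Compare(before, after))
            Console.WriteLine(entry.Key + ": " + entry.Value);
    }
}
```

A fixed percentage threshold is the crudest possible noise filter. A real reporting system would more likely compare distributions over several iterations.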

@pharring
Contributor

Progress on this has slowed significantly since the team has been focused on Update 1 bug fixing. However, the xunit-based performance testing was introduced in #6025 and I've been testing our Jenkins integration in #6201.

Reporting is still missing.

@AdamSpeight2008
Contributor

@pharring
What is the current status of this? Are reports available (especially for Pull Requests)?

@pharring pharring assigned KevinH-MS and unassigned pharring Nov 17, 2015
@pharring
Contributor

@AdamSpeight2008 I am no longer on the Roslyn team. @KevinH-MS please can you comment and/or redirect. Thanks.

@KevinH-MS
Contributor

We're still working on it, but it's still a ways out... @AdamSpeight2008, if you're looking to get some validation of scanner changes, I'd suggest just building some of the VB projects in Roslyn.sln repeatedly and timing it (msbuild prints time elapsed at the end of a build).
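
If reading the elapsed time off each build by hand gets tedious, the repeat-and-time loop can be sketched roughly as below. This is only an illustration: `BuildTimer` and its methods are hypothetical names, and the workload in `Main` is a stand-in sleep rather than a real build. The median is used so that one slow outlier run doesn't skew the number.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class BuildTimer
{
    // Runs the action the given number of times and returns the median wall-clock time in ms.
    public static double MedianMilliseconds(Action action, int runs)
    {
        if (runs < 1) throw new ArgumentOutOfRangeException("runs");
        var samples = new List<double>();
        for (int i = 0; i < runs; i++)
        {
            var stopwatch = Stopwatch.StartNew();
            action();
            stopwatch.Stop();
            samples.Add(stopwatch.Elapsed.TotalMilliseconds);
        }
        return Median(samples);
    }

    // Median of a non-empty list; the input list is left unmodified.
    public static double Median(List<double> samples)
    {
        var sorted = new List<double>(samples);
        sorted.Sort();
        int mid = sorted.Count / 2;
        return sorted.Count % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    public static void Main()
    {
        // Placeholder workload; swap in a delegate that launches the real build.
        double median = MedianMilliseconds(() => System.Threading.Thread.Sleep(10), 5);
        Console.WriteLine("Median: " + median.ToString("F1") + " ms");
    }
}
```

To time an actual build, the delegate could start msbuild with `Process.Start("msbuild", "/t:Rebuild " + projectPath)` and call `WaitForExit()` on the returned process, where `projectPath` is whichever VB project you pick.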

@mattwarren

@KevinH-MS @pharring I know you've done some work on implementing your own Benchmarking framework, but I'm just putting it out there that there are some open source alternatives that might already give you what you need.

For instance, BenchmarkDotNet (full disclosure: I'm one of the contributors) lets you write benchmarks like so:

using System.Linq; // needed for Enumerable.Range below

public class IL_Loops
{
    [Params(1, 5, 10, 100)]
    int MaxCounter = 0;

    private int[] initialValuesArray;

    [Setup]
    public void SetupData()
    {
        initialValuesArray = Enumerable.Range(0, MaxCounter).ToArray();
    }

    [Benchmark]
    public int ForLoop()
    {
        var counter = 0;
        for (int i = 0; i < initialValuesArray.Length; i++)
            counter += initialValuesArray[i];
        return counter;
    }

    [Benchmark]
    public int ForEachArray()
    {
        var counter = 0;
        foreach (var i in initialValuesArray)
            counter += i;
        return counter;
    }
}

The main difference from your version is that we have separate methods for Setup, rather than structuring everything in one method.

Also, BenchmarkDotNet doesn't require a modified test runner; it can run in any existing test runner. However, you do need to add a [Test]/[Fact] method as well as the [Benchmark] method, like so:

public class TestClass
{
    [Fact]
    public void Test()
    {
        // run all the benchmarks in this class
        var reports = new BenchmarkRunner().Run<TestClass>();
    }

    [Benchmark]
    [BenchmarkTask(mode: BenchmarkMode.SingleRun, processCount: 1, warmupIterationCount: 1, targetIterationCount: 1)]
    public void Benchmark()
    {
        // write the Benchmark code here
    }
}
@chrisaut chrisaut referenced this issue Aug 26, 2016
Open

Performance Testing Strategy #13336

6 of 9 tasks complete