
Using a benchmark target's return value in a column #784

Open
Warpten opened this issue Jun 7, 2018 · 18 comments

Comments

@Warpten

Warpten commented Jun 7, 2018

I'm writing a deserialization library that is basically taking a file stream and producing a List<T>:

        [Benchmark(Description = "Achievement (WDC1)")]
        public StorageList<WDC1.AchievementEntry> WDC1()
        {
            using (var fs = OpenFile("Achievement.WDC1.db2"))
                return new StorageList<WDC1.AchievementEntry>(fs);
        }

And it goes on for a couple versions.

However, the execution times don't make much sense: they measure the time to load every entry in a file, and the entry count varies per version (I have no control over the data). It would make a lot more sense to display the average time needed to load a single element of each version.

I'd love to be able to have an implementation of IColumn that is able to obtain the return value from a benchmark target and work on it.

I have a local project that is basically a dumbed-down performance meter, but it has its limitations, particularly with regard to results output - producing histograms is a chore.

Is there a way to do that, or am I out of luck? From a quick glance at the source through ILSpy, I don't see much.
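For context, the figure being asked for is simply the whole-file mean divided by the entry count; as a plain helper it might look like this (hypothetical `PerElementStats` type, not part of BenchmarkDotNet - the missing piece is getting the count out of the benchmark process):

```csharp
using System;

// Hypothetical helper illustrating the requested column value: the mean
// benchmark time for a whole file divided by the number of deserialized
// entries, so runs over files with different entry counts become comparable.
public static class PerElementStats
{
    // meanNs: mean time for the whole file, as reported by BenchmarkDotNet.
    // elementCount: number of entries deserialized from that file.
    public static double MeanPerElementNs(double meanNs, int elementCount)
    {
        if (elementCount <= 0)
            throw new ArgumentOutOfRangeException(nameof(elementCount));
        return meanNs / elementCount;
    }
}
```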

@adamsitnik
Member

hi @Warpten

I am aware of this limitation. The problem is that we run the benchmark in a separate process, so passing any data from one to another is not trivial.

I know that @KrzysztofCwalina has faced this issue in ML.NET

@Warpten @KrzysztofCwalina would it be enough if I added a mechanism to return a string and print it in a dedicated column? Something like:

[ExtraData]
public string Size() => Serializer(sth).Size.ToString();

The method would be executed just once.

@adamsitnik adamsitnik self-assigned this Jun 8, 2018
@Warpten
Author

Warpten commented Jun 8, 2018

Maybe exposing a method's result as an object would be less limiting? It would be great to have an extra column where I can just do

return (benchmark.Target.Result as IList).Count

Or even more complex things like

// Suppose this is a variation of MeanColumn
return /* mean time here */ / (benchmark.Target.Result as IList).Count;

This assumes, however, that Target.Result never changes between executions, since this would only retrieve the last result. I think this is fine, though, since you wouldn't typically want to do this sort of thing for a benchmark whose result varies (at least that's not obvious to me).

@adamsitnik
Member

Maybe exposing a method's result as an object would be less limiting?

But how would I then serialize it and pass it from one process to another without introducing any dependencies into BenchmarkDotNet?

@gulbanana

I think the string is fine; we could implement serialisation on top of that ourselves for advanced cases.
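For such advanced cases, the single string could carry several values with a simple escaped key=value encoding and be parsed back inside a custom column; a minimal sketch (the `ExtraDataEncoding` helper is hypothetical, not part of BenchmarkDotNet):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: pack multiple values into the single string that the
// proposed [ExtraData] method would return, and parse it back inside a custom
// IColumn. Uses a "key=value;key=value" encoding with percent-escaping so that
// '=' and ';' inside keys or values cannot break the format.
public static class ExtraDataEncoding
{
    public static string Encode(IReadOnlyDictionary<string, string> values) =>
        string.Join(";", values.Select(p =>
            $"{Uri.EscapeDataString(p.Key)}={Uri.EscapeDataString(p.Value)}"));

    public static Dictionary<string, string> Decode(string encoded) =>
        encoded.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries)
               .Select(part => part.Split(new[] { '=' }, 2))
               .ToDictionary(kv => Uri.UnescapeDataString(kv[0]),
                             kv => Uri.UnescapeDataString(kv[1]));
}
```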

@Warpten
Author

Warpten commented Oct 14, 2018

Coming back to this: I didn't know at the time that BenchmarkDotNet was spawning subprocesses (and I also only skimmed your answer, Adam - sorry about that!). So I guess strings are the best way out.

@Konard

Konard commented Sep 30, 2019

Well, something like this is possible even now. Look at the Config class and the SQLiteOutput and DoubletsOutput methods.

using System.IO;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Configs;
using Comparisons.SQLiteVSDoublets.Model;
using Comparisons.SQLiteVSDoublets.SQLite;
using Comparisons.SQLiteVSDoublets.Doublets;

namespace Comparisons.SQLiteVSDoublets
{
    [ClrJob, CoreJob]
    [MemoryDiagnoser]
    [WarmupCount(2)]
    [IterationCount(1)]
    [Config(typeof(Config))]
    public class Benchmarks
    {
        private class Config : ManualConfig
        {
            public Config() => Add(new SizeAfterCreationColumn());
        }

        [Params(1000, 10000, 100000)]
        public int N;
        private SQLiteTestRun _sqliteTestRun;
        private DoubletsTestRun _doubletsTestRun;

        [GlobalSetup]
        public void Setup()
        {
            BlogPosts.GenerateData(N);
            _sqliteTestRun = new SQLiteTestRun("test.db");
            _doubletsTestRun = new DoubletsTestRun("test.links");
        }

        [Benchmark]
        public void SQLite() => _sqliteTestRun.Run();

        [IterationCleanup(Target = "SQLite")]
        public void SQLiteOutput() => File.WriteAllText($"disk-size.sqlite.{N}.txt", _sqliteTestRun.Results.DbSizeAfterCreation.ToString());

        [Benchmark]
        public void Doublets() => _doubletsTestRun.Run();

        [IterationCleanup(Target = "Doublets")]
        public void DoubletsOutput() => File.WriteAllText($"disk-size.doublets.{N}.txt", _doubletsTestRun.Results.DbSizeAfterCreation.ToString());
    }
}

You will also need a custom column implementation to fit data from files into the report.

using System;
using System.IO;
using System.Linq;
using BenchmarkDotNet.Columns;
using BenchmarkDotNet.Reports;
using BenchmarkDotNet.Running;

namespace Comparisons.SQLiteVSDoublets
{
    public class SizeAfterCreationColumn : IColumn
    {
        public string Id => nameof(SizeAfterCreationColumn);

        public string ColumnName => "SizeAfterCreation";

        public string Legend => "Allocated memory on disk after all records are created (1KB = 1024B)";

        public UnitType UnitType => UnitType.Size;

        public bool AlwaysShow => true;

        public ColumnCategory Category => ColumnCategory.Metric;

        public int PriorityInCategory => 0;

        public bool IsNumeric => true;

        public bool IsAvailable(Summary summary) => true;

        public bool IsDefault(Summary summary, BenchmarkCase benchmarkCase) => false;

        public string GetValue(Summary summary, BenchmarkCase benchmarkCase) => GetValue(summary, benchmarkCase, SummaryStyle.Default);

        public string GetValue(Summary summary, BenchmarkCase benchmarkCase, SummaryStyle style)
        {
            var benchmarkName = benchmarkCase.Descriptor.WorkloadMethod.Name.ToLower();
            var parameter = benchmarkCase.Parameters.Items.FirstOrDefault(x => x.Name == "N");
            if (parameter == null)
            {
                return "no parameter";
            }
            var N = Convert.ToInt32(parameter.Value);
            var filename = $"disk-size.{benchmarkName}.{N}.txt";
            return File.Exists(filename) ? File.ReadAllText(filename) : "no file";
        }

        public override string ToString() => ColumnName;
    }
}

But there is a problem with that approach: it works only if ClrJob is used; for some reason CoreJob does not write the files.

The complete example: https://github.com/linksplatform/Comparisons.SQLiteVSDoublets

@sandersaares

sandersaares commented Sep 4, 2020

Being able to pass a simple string would be a great start for this functionality.

Being able to pass a Dictionary<string, string> (for multiple columns) would be even better.

Really, I don't care how I get the data from the benchmark into the table - as long as I can show it to the user in a more convenient way than having to search logs for it, I am happy.

A nice stretch goal might be something in line with @Konard's example - allow the benchmark to emit events that are serialized to a file and can then be processed by a custom column (e.g. to compute averages or count "interesting" events). But maybe this deviates too much from BenchmarkDotNet's core vision?

@scalablecory

A nice stretch goal might be something in line with @Konard's example - allow the benchmark to emit events that are serialized to a file and can then be processed by a custom column (e.g. to compute averages or count "interesting" events).

I'd love to see this. This would be very useful for networking benchmarks to show stats about the socket.

@timcassell
Collaborator

Reposting here since I didn't realize it was a duplicate of #1730. (Same idea as @sandersaares.)


I was thinking it could be done by utilizing a method returning a Dictionary<string, string> with a new attribute that is called as the final thing before GlobalCleanup.

public class Benchmark
{
    [Benchmark]
    public void MyBenchmark()
    {
        // ...
    }
    
    [Benchmark]
    public void MyBenchmark2()
    {
        // ...
    }
    
    [CustomTableResults(Target = nameof(MyBenchmark))]
    public Dictionary<string, string> GetCustomResults()
    {
        return new Dictionary<string, string>()
        {
            ["Example Result"] = "Custom result for " + nameof(MyBenchmark)
        };
    }
}

Would output like this:

|       Method |     Mean |     Error |    StdDev |                Example Result |
|------------- |---------:|----------:|----------:|------------------------------:|
|  MyBenchmark | 1.368 ns | 0.0038 ns | 0.0032 ns | Custom result for MyBenchmark |
| MyBenchmark2 | 1.364 ns | 0.0094 ns | 0.0083 ns |                             - |

Results could be calculated inside the method after the benchmark runs, or calculated in GlobalSetup, cached, and read in that method. It's quite versatile. Using a dictionary prevents duplicate keys (though some care would be needed around clashes with existing column names).


There would be no need to muck about with files with this method (though it lacks the post-processing after all the results are calculated that the current columns have).

@timcassell
Collaborator

timcassell commented Sep 13, 2021

We could also have a custom post-processor:

    [CustomTableResults(Target = nameof(MyBenchmark))]
    public Dictionary<string, string> GetCustomResults()
    {
        return new Dictionary<string, string>()
        {
            ["Example Result"] = "Custom result for " + nameof(MyBenchmark)
        };
    }
    
    [CustomTableResultsPostProcess]
    public static void PostProcessCustomResults(IReadOnlyDictionary<string, string[]> customResults)
    {
        foreach(var customColumn in customResults)
        {
            for (int i = 0; i < customColumn.Value.Length; ++i)
            {
                customColumn.Value[i] += " post-processed";
            }
        }
    }

@BryanEuton

Is there any expectation of adding this capability to BenchmarkDotNet?

@adamsitnik
Member

adamsitnik commented Oct 19, 2022

With #2092 we made it possible to serialize any data to bytes and send it from the benchmark to the host process over an anonymous pipe, which finally makes it possible to implement this issue without a lot of hassle.

Some hints for the contributor:

  • The IHost is created here and passed to the Engine ctor, so it's available during benchmark execution but not exposed. We could extend it with a new method that passes an IReadOnlyDictionary<string, string> from the benchmark to the host process.
  • We would need to somehow expose the possibility of passing the data to the end user. We could do that by introducing a new attribute similar to GlobalSetup/GlobalCleanup, as proposed by @timcassell above:
[AdditionalMetrics(Target = nameof(MyBenchmark))]
public IReadOnlyDictionary<string, string> GetCustomResults()
    => new Dictionary<string, string>()
    {
        ["Compressed Size"] = _output.Length.ToString()
    };

The disadvantage is that we would need to introduce yet another attribute and implement support for it in both the in-process and out-of-process toolchains. The alternative would be to extend [GlobalSetup]/[GlobalCleanup] methods with the possibility of accepting an IHost argument and just pass it in the engine:

[GlobalCleanup(Target = nameof(MyBenchmark))]
public void GlobalCleanup(IHost host)
{
    Dictionary<string, string> metrics = new()
    {
        ["Compressed Size"] = _output.Length.ToString()
    };

    host.ReportMetrics(metrics);

    _someField.Dispose(); // what typical cleanup does
}

Another alternative, which is easier to implement but which we have never done in BDN, is exposing a new public static property and making Engine initialize it before the GlobalCleanup call and null it after. Something like:

// Engine
Host.Instance = _host;
GlobalCleanup();
Host.Instance = null;

// GlobalCleanup:
Host.Instance.ReportMetrics(metrics);

  • Once the user passes the data, it needs to be handled by the Broker, which reads from the pipe. The Broker would need to store the data somewhere, and every executor that uses it would then need to store it in ExecuteResult next to the standard output, exit code, etc. From there it would need to be converted into metrics, which would make it work with everything else (MetricsColumnProvider etc.) out of the box.
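The dictionary itself could be carried over the pipe with a simple length-prefixed encoding; a minimal sketch (the `MetricSerializer` helper is hypothetical and not BDN's actual wire format):

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: serializes an IReadOnlyDictionary<string, string> to
// bytes so it could be sent from the benchmark process to the host process
// over the anonymous pipe introduced in #2092, and deserialized on the other
// side. BinaryWriter.Write(string) emits length-prefixed UTF-8 strings.
public static class MetricSerializer
{
    public static byte[] Serialize(IReadOnlyDictionary<string, string> metrics)
    {
        using var ms = new MemoryStream();
        using (var writer = new BinaryWriter(ms))
        {
            writer.Write(metrics.Count);   // entry count first
            foreach (var pair in metrics)
            {
                writer.Write(pair.Key);
                writer.Write(pair.Value);
            }
        }
        return ms.ToArray();
    }

    public static Dictionary<string, string> Deserialize(byte[] payload)
    {
        using var reader = new BinaryReader(new MemoryStream(payload));
        int count = reader.ReadInt32();
        var result = new Dictionary<string, string>(count);
        for (int i = 0; i < count; i++)
            result[reader.ReadString()] = reader.ReadString();
        return result;
    }
}
```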

@timcassell
Collaborator

timcassell commented Oct 19, 2022

It seems dangerous to encourage users to use IHost. It looks like it should actually be internal and is only public for the code-gen project to use, especially if we're going to be changing its contract. (See my other comment about that.) And exposing a static property just feels dirty and unsafe.

It may take more work, but I think the new attribute approach is the safest.

@ahmadi-ali

Hi All,
Any update regarding this topic?

@adamsitnik
Member

Any update regarding this topic?

The issue is still up-for-grabs; I've provided hints for a potential contributor in the comment above: #784 (comment). Until somebody grabs it, implements it, and sends a PR, there will be no updates.

@timcassell
Collaborator

I might take a look at implementing this soon.

Upon looking at your idea again, I kind of like it. But what would you think about exposing a new interface instead of IHost, @adamsitnik?

public interface IMetricReporter
{
    void ReportMetrics(IEnumerable<CustomMetric> metrics);
}
public class CustomMetric
{
    public CustomMetric(string name, string value) { }
    public CustomMetric(string name, double value, UnitType unitType, string numberFormat = "0.##") { }
}
[GlobalCleanup(Target = nameof(MyBenchmark))]
public void GlobalCleanup(IMetricReporter metricReporter)
{
    var metrics = new[]
    {
        new CustomMetric("Compressed Size", _output.Length, UnitType.Dimensionless)
    };

    metricReporter.ReportMetrics(metrics);
}

This would allow us to extend the functionality through the CustomMetric class if we need to, without breaking the interface. That's much more maintainable than updating all the toolchains for every change we want (an example here is adding numeric-type post-processing like the built-in metrics have).

@timcassell
Collaborator

I realized that GlobalCleanup can be called multiple times if memory randomization is active. So I think that it will have to be a separate method that runs after GlobalCleanup.

Also, this involves updating the code-gen, and I already have a lot of open PRs touching that and I don't want to add more before they're merged, so I'm holding off on starting the implementation for this for now.

@stonstad

Interim solution for presenting custom table results: https://gist.github.com/stonstad/43e027ecc580612c5abd5e1fcf1e30d8
