Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

You should use version 5 of the D/csv_test.d for benchmarking #1

Closed
Fatmice opened this issue May 27, 2017 · 15 comments
Closed

You should use version 5 of the D/csv_test.d for benchmarking #1

Fatmice opened this issue May 27, 2017 · 15 comments

Comments

@Fatmice
Copy link

@Fatmice Fatmice commented May 27, 2017

They said explicitly that version 5 dropped Append and used manual iteration through the fields. That version benchmark at 1.1 seconds, similar to your Nim code. While they never specify the entire version 5, they did say what it was. Basically version 2 with the associative array and loop exit. I've edited version two with those changes. I have not compile it yet so do compile with LDC or correct any mistakes I made.
https://dlang.org/blog/2017/05/24/faster-command-line-tools-in-d/

int main(string[] args)
{
    import std.algorithm : max, maxElement, splitter;
    import std.conv : to;
    import std.range : enumerate;
    import std.stdio;

    if (args.length < 4)
    {
        writefln ("synopsis: %s filename keyfield valuefield", args[0]);
        return 1;
    }

    string filename = args[1];
    size_t keyFieldIndex = args[2].to!size_t;
    size_t valueFieldIndex = args[3].to!size_t;
    size_t maxFieldIndex = max(keyFieldIndex, valueFieldIndex);
    string delim = "\t";

    long[string] sumByKey;

    foreach(line; filename.File.byLine)
    {
        string key;
        long value;
        bool allFound = false;

        foreach (i, field; line.splitter(delim).enumerate)
		{
			if (i == keyFieldIndex) key = field.to!string;
			if (i == valueFieldIndex) value = field.to!long;
			if (i == maxFieldIndex)
			{
				allFound = true;
				if (auto sumValuePtr = key in sumByKey) *sumValuePtr += value;
				else sumByKey[key] = value;
				break;
			}
		}
    }

    if (sumByKey.length == 0) writeln("No entries");
    else
    {
        auto maxEntry = sumByKey.byKeyValue.maxElement!"a.value";
        writeln("max_key: ", maxEntry.key, " sum: ", maxEntry.value);
    }
    return 0;
}
@euantorano

This comment has been minimized.

Copy link
Owner

@euantorano euantorano commented May 27, 2017

@Fatmice

This comment has been minimized.

Copy link
Author

@Fatmice Fatmice commented May 30, 2017

@euantorano

This comment has been minimized.

Copy link
Owner

@euantorano euantorano commented May 30, 2017

Thanks a lot, I'll give that a whirl tomorrow after work. I'm not sure I'll update the blog post with the optimised IO version as it seems to require a third party library rather than purely using the standard library.

@Fatmice

This comment has been minimized.

Copy link
Author

@Fatmice Fatmice commented May 30, 2017

Naturally, that was simply an example of further optimization on the io since approximately half of the time was spent reading the input file. I'm sure the io optimization can be done without the third party library and purely with the standard library, which is what the third party built upon.

@euantorano

This comment has been minimized.

Copy link
Owner

@euantorano euantorano commented May 31, 2017

Interestingly, the version you provided above actually seems slower on my machine! The stats for the above version built with LDC are:

MIN TIME: 1.70s
PEAK MEM: 3.3Mb

Nim:

MIN TIME: 1.17s
PEAK MEM: 1.9Mb

No idea why!

@euantorano

This comment has been minimized.

Copy link
Owner

@euantorano euantorano commented May 31, 2017

Hm, I'm not entirely sure that the code you provided is actually accurate to the version 5 in the blog post. I'm going to try contacting the author of the blog post for some clarification.

@jondegenhardt Any chance you can shed some light on the Version 5 mentioned in your original blog post?

@Fatmice

This comment has been minimized.

Copy link
Author

@Fatmice Fatmice commented May 31, 2017

I may have made a mistake then? I've not actually seen the Version 5 mentioned in the original blog post. Maybe the slow down is due to nested if/else. Try this one? I removed the bool allFound allocation as it was not needed.

int main(string[] args)
{
    import std.algorithm : max, maxElement, splitter;
    import std.conv : to;
    import std.range : enumerate;
    import std.stdio;

    if (args.length < 4)
    {
        writefln ("synopsis: %s filename keyfield valuefield", args[0]);
        return 1;
    }

    string filename = args[1];
    size_t keyFieldIndex = args[2].to!size_t;
    size_t valueFieldIndex = args[3].to!size_t;
    size_t maxFieldIndex = max(keyFieldIndex, valueFieldIndex);
    string delim = "\t";

    long[string] sumByKey;

    foreach(line; filename.File.byLine)
    {
        string key;
        long value;

        foreach (i, field; line.splitter(delim).enumerate)
		{
			if (i == keyFieldIndex) key = field.to!string;
			if (i == valueFieldIndex) value = field.to!long;
			if (auto sumValuePtr = key in sumByKey) *sumValuePtr += value;
			else sumByKey[key] = value;
			if (i == maxFieldIndex) break;
		}
    }

    if (sumByKey.length == 0) writeln("No entries");
    else
    {
        auto maxEntry = sumByKey.byKeyValue.maxElement!"a.value";
        writeln("max_key: ", maxEntry.key, " sum: ", maxEntry.value);
    }
    return 0;
}
@jondegenhardt

This comment has been minimized.

Copy link

@jondegenhardt jondegenhardt commented Jun 1, 2017

The inner loop should be one of
version a

foreach (i, field; line.splitter(delim).enumerate)
{
    if (i == keyFieldIndex) key = field.to!string;
    if (i == valueFieldIndex) value = field.to!long;
    if (i == maxFieldIndex)
    {
        if (auto sumValuePtr = key in sumByKey) *sumValuePtr += value;
        else sumByKey[key] = value;
        break;
    }
}

or
version b

bool allFound = false;
foreach (i, field; line.splitter(delim).enumerate)
{
    if (i == keyFieldIndex) key = field.to!string;
    if (i == valueFieldIndex) value = field.to!long;
    if (i == maxFieldIndex) 
    {
        allFound = true;
        break;
    }
}
if (allFound)
{
    if (auto sumValuePtr = key in sumByKey) *sumValuePtr += value;
    else sumByKey[key] = value;
}

The difference is purely stylistic. I tested using the latter.

@Fatmice

This comment has been minimized.

Copy link
Author

@Fatmice Fatmice commented Jun 1, 2017

Ah so I was right to put the AA lookup within if (i == maxFieldIndex). I think version b is cleaner due to my hate for nested if/else. I should get LDC running to compile this code. I have GDC but apparently it generates codes that runs a little slower than LDC.

{
    import std.algorithm : max, maxElement, splitter;
    import std.conv : to;
    import std.range : enumerate;
    import std.stdio;

    if (args.length < 4)
    {
        writefln ("synopsis: %s filename keyfield valuefield", args[0]);
        return 1;
    }

    string filename = args[1];
    size_t keyFieldIndex = args[2].to!size_t;
    size_t valueFieldIndex = args[3].to!size_t;
    size_t maxFieldIndex = max(keyFieldIndex, valueFieldIndex);
    string delim = "\t";

    long[string] sumByKey;

    foreach(line; filename.File.byLine)
    {
        string key;
        long value;
		bool allFound = false;
		foreach (i, field; line.splitter(delim).enumerate)
		{
			if (i == keyFieldIndex) key = field.to!string;
			if (i == valueFieldIndex) value = field.to!long;
			if (i == maxFieldIndex) 
			{
				allFound = true;
				break;
			}
		}
		if (allFound)
		{
			if (auto sumValuePtr = key in sumByKey) *sumValuePtr += value;
			else sumByKey[key] = value;
		}
    }

    if (sumByKey.length == 0) writeln("No entries");
    else
    {
        auto maxEntry = sumByKey.byKeyValue.maxElement!"a.value";
        writeln("max_key: ", maxEntry.key, " sum: ", maxEntry.value);
    }
    return 0;
}
@euantorano

This comment has been minimized.

Copy link
Owner

@euantorano euantorano commented Jun 1, 2017

@jondegenhardt

This comment has been minimized.

Copy link

@jondegenhardt jondegenhardt commented Jun 1, 2017

Oops, that's still wrong. It needs to be using char[] to capture the intermediate key value, not string. I shouldn't have tried correcting a small piece! Here's a correct version:

int main(string[] args)
{
    import std.algorithm : max, maxElement, splitter;
    import std.conv : to;
    import std.range : enumerate;
    import std.stdio;

    if (args.length < 4)
    {
        writefln ("synopsis: %s filename keyfield valuefield", args[0]);
        return 1;
    }

    string filename = args[1];
    size_t keyFieldIndex = args[2].to!size_t;
    size_t valueFieldIndex = args[3].to!size_t;
    size_t maxFieldIndex = max(keyFieldIndex, valueFieldIndex);
    string delim = "\t";

    long[string] sumByKey;

    foreach(line; filename.File.byLine)
    {
        char[] key;
        long value;
        bool allFound = false;

        foreach (i, field; line.splitter(delim).enumerate)
        {
            if (i == keyFieldIndex) key = field;
            if (i == valueFieldIndex) value = field.to!long;
            if (i == maxFieldIndex)
            {
                allFound = true;
                break;
            }
        }

        if (allFound)
        {
            if (auto sumValuePtr = key in sumByKey) *sumValuePtr += value;
            else sumByKey[key.to!string] = value;
        }
    }

    if (sumByKey.length == 0) writeln("No entries");
    else
    {
        auto maxEntry = sumByKey.byKeyValue.maxElement!"a.value";
        writeln("max_key: ", maxEntry.key, " sum: ", maxEntry.value);
    }
    return 0;
}
@euantorano

This comment has been minimized.

Copy link
Owner

@euantorano euantorano commented Jun 1, 2017

Initial testing with the above gives the following:

LDC:

MIN TIME: 1.26s
PEAK MEM: 2.4Mb

Nim:

MIN TIME: 1.31s
PEAK MEM: 1.9Mb

So D does now beat Nim in speed, but still loses on memory usage. I'll update my blog post 😄

@euantorano euantorano closed this Jun 1, 2017
@pftbest

This comment has been minimized.

Copy link

@pftbest pftbest commented Jun 8, 2017

On my setup, Nim code is 25% slower than D for some reason

Versions:
Nim Compiler Version 0.17.0 (2017-05-22) [Linux: amd64]
LDC - the LLVM D compiler (1.2.0): based on DMD v2.072.2 and LLVM 4.0.0

Results:

Nim:

MIN TIME: 1.12s
PEAK MEM: 2.1Mb

D:

MIN TIME: 0.82s
PEAK MEM: 7.1Mb

Rust:

MIN TIME: 0.67s
PEAK MEM: 5.6Mb
@euantorano

This comment has been minimized.

Copy link
Owner

@euantorano euantorano commented Jun 8, 2017

@pftbest Interesting. I'd guess that Nim is compiling via GCC on your system since you're on Linux? I wonder if I got such good results thanks to Clang... Thanks for sharing your Rust version too!

@pftbest

This comment has been minimized.

Copy link

@pftbest pftbest commented Jun 8, 2017

wonder if I got such good results thanks to Clang

Yes, probably. clang is a great compiler.

I have the gcc (GCC) 7.1.1 20170528 installed on my system.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.