Base64 Zig test has been added. #188

Closed
wants to merge 1 commit

Conversation

@nuald (Collaborator) commented Oct 20, 2019

Partially addressing #185. Used test.c as a reference, except using the standard base64 implementation within the Zig language. I've tried both 0.5.0 and the master version; no differences whatsoever, hence I've decided to use the master version, as it looks like we use nightly builds for non-stable releases (like for V).

@andrewrk left a comment

Here is an improved version:

const std = @import("std");
const time = std.time;
const Timer = time.Timer;

const str_size = 10000000;
const tries = 100;

// Fixed global buffers: the encoded buffer's size is computed at compile
// time via calcSize, so the hot path does no heap allocation.
var str1: [str_size]u8 = undefined;
var str2: [std.base64.Base64Encoder.calcSize(str1.len)]u8 = undefined;
var str3: [str_size]u8 = undefined;

pub fn main() !void {
    var stdout_file = try std.io.getStdOut();
    const stdout = &stdout_file.outStream().stream;

    std.mem.set(u8, &str1, 'a');
    var timer = try Timer.start();
    var s: usize = 0;

    var i: usize = 0;
    var start = timer.lap();
    while (i < tries) : (i += 1) {
        std.base64.standard_encoder.encode(&str2, &str1);
        s += str2.len;
    }
    var end = timer.read();
    var elapsed_s = @intToFloat(f64, end - start) / time.ns_per_s;
    try stdout.print("encode: {}, {d:.2}\n", s, elapsed_s);

    i = 0;
    s = 0;
    start = timer.lap();
    while (i < tries) : (i += 1) {
        std.base64.standard_decoder.decode(&str3, &str2) catch unreachable;
        s += str3.len;
    }
    end = timer.read();
    elapsed_s = @intToFloat(f64, end - start) / time.ns_per_s;
    try stdout.print("decode: {}, {d:.2}\n", s, elapsed_s);
}

Results:

direct_allocator:
encode: 1333333600, 1.13
decode: 1000000000, 0.12

no allocator:
encode: 1333333600, 0.46
decode: 1000000000, 0.14

var i: usize = 0;
var start = timer.lap();
while (i < tries) : (i += 1) {
const str2 = try allocator.alloc(u8, std.base64.Base64Encoder.calcSize(str1.len));
@andrewrk commented Oct 20, 2019

Using std.heap.direct_allocator inside the hot path is no good. It does an mmap per iteration; no wonder it is so slow.

https://ziglang.org/documentation/master/#Choosing-an-Allocator

@nuald (Collaborator, Author):

That's intentional: the benchmark is about measuring allocation plus encoding/decoding at once. Surely, low-level languages could allocate memory in advance, use memory pools and other advanced techniques, and that would give them an unfair advantage compared with programming languages with a GC.

Moreover, direct_allocator using mmap is an implementation detail that could change in the future. The same way malloc uses either sbrk or mmap, the allocator could implement more advanced methods later, thus making the general use case tested in this benchmark (allocate a lot of memory and encode the blob into it) faster.

@andrewrk:

That's intentionally - the benchmark is about measuring allocating + coding/decoding at once.

This is a poorly defined benchmark. Your malloc call in the C benchmark might or might not make a syscall, for example. The appropriate allocator for this code is none. If you want to measure allocation performance, then you need to come up with an example where allocating heap memory actually makes sense. And again, your C code is not guaranteed to make syscalls either, because there is code that runs before main() which uses malloc/free.

Moreover, direct_allocator using mmap is an implementation detail that could change in the future

No it's not, it's defined to make a syscall for every allocation and free.

@nuald (Collaborator, Author):

The benchmark is about how average programmers would code the given task. Undoubtedly, you would write the optimized version, but the average Joe uses an NPM package for left padding (https://www.davidhaney.io/npm-left-pad-have-we-forgotten-how-to-program/). Surely, I don't like the current state of the industry and would prefer everyone write optimized and carefully thought-out code.

The same goes for the other tests, like bf2. It's not an optimized version, but it resembles the code an average Joe would write. And good compilers try to optimize code like that (a good example is loop unrolling; in a perfect world the compiler wouldn't care about it, as decent developers could implement that on their own).

As for direct_allocator, sorry, it's the first day I'm using your language, so I don't know all the peculiarities yet. I meant that in the future you may have some smart allocator by default that has better heuristics than malloc's.

@andrewrk:

No, this is a disingenuous argument. I did not write an optimized version. My version is clearly simpler than yours, both to read and to write.

it's the first day I'm using your language, so I don't know all the peculiarities yet

The zig community welcomes newbies with open arms. However, newbies should not be writing benchmarks that publicly represent the language, if they are not willing to read the documentation for the functions they are calling.

I suggest you take my advice to fix the benchmark because if you merge this pull request as is, your reputation and this project's reputation will be harmed, being associated with dishonest benchmarks.

Another participant:

Not using unnecessary memory allocations in a loop isn't "fancy optimizing," it's basic logic. I was going to comment that there was no way that the other implementations were this bad but some of them are. As is, these benchmarks are entirely useless, because you're not measuring the speed of the language, you're measuring completely different variables for different languages.

The base64 benchmark currently is more of a test of the allocator (Zig, C, etc) and/or VM/GC (JS, etc) than it is of actual base64 performance, and I have no doubt the rest of the benchmarks are equally off-base from what they claim to be.

If you're going to do this, at least document it. (Edit: for instance, even call it a "base64/allocation" benchmark. That'd at least make it clear this isn't just raw performance. Preferably, if "the benchmark is about how average programmers would code the given task," just say so. If you only claim that on a PR for a specific language, it doesn't look very good. Literally just add "The benchmark is about how average programmers would code the given task, and does not reflect the performance of well-written, optimized code" to the README. Problem solved.)

@kostya (Owner) commented Oct 21, 2019

As @nuald said, we are measuring something like what a real task would look like. If you don't like this, I guess we are not going to merge this.

@andrewrk:

like what a real task would look like

A real task will not use std.heap.direct_allocator.

if you don't like this, I guess we are not going to merge this

Thank you.

while (i < tries) : (i += 1) {
const str3 = try allocator.alloc(u8, try std.base64.standard_decoder.calcSize(str2));
defer allocator.free(str3);
try std.base64.standard_decoder.decode(str3, str2);
@andrewrk commented Oct 20, 2019
Also, here we can use catch unreachable (which you can see in my proposed code): this is guaranteed not to fail, because the input just came from the base64 encoder and is therefore guaranteed to be valid. That provides additional guarantees to the optimizer. I did not notice a performance difference in this case, but it is still correct, and could improve performance with future versions of LLVM.

@nuald nuald closed this Oct 21, 2019
@nuald nuald mentioned this pull request Aug 17, 2020
4 participants