Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strange benchmark results with std::string #4

Closed
Morwenn opened this issue Jan 29, 2019 · 5 comments
Closed

Strange benchmark results with std::string #4

Morwenn opened this issue Jan 29, 2019 · 5 comments

Comments

@Morwenn
Copy link
Contributor

Morwenn commented Jan 29, 2019

Hi,

I recently tried to rerun the benchmarks to check another simpler Inv-adaptive algorithm against drop-merge-sort. I got expected results with int, but the std::string benchmark gives the following result:

results

In the graph above, pdq_sort is an algorithm similar to std::sort, split_sort is the algorithm I was working on and drop_merge_sort is similar to the implementation from this repository (I tried to use the implementation from here though, and it gave me similar results). The relative curves of pdq_sort and split_sort, while not as pretty as the ones in your README, are rather expected. However I can't explain the one for drop_merge_sort. Do you have any idea what might be happening here, keeping in mind that the benchmark for integers has the expected results?

EDIT: it says "sorting 10^6 int" but it's only because I forgot to change the graph title. It's effectively the results of the string benchmark.

@Morwenn
Copy link
Contributor Author

Morwenn commented Jan 29, 2019

I just investigated a bit and apparently when drop-merge-sort is benchmarked with std::string, the resulting dropped vector never contains a single element prior to the merge. I'd say that the strange behaviour comes from the data generation in the benchmark more than from the algorithm itself.

So far it seems that the condition if (begin != write && comp(*read, *(write - 1))) never evaluates to true in the benchmark. Check showed that randomize doesn't produce an already sorted collection? So I don't really know what's going on there.

@Morwenn
Copy link
Contributor Author

Morwenn commented Jan 30, 2019

Ok, I eventually found the issue: when I uncomment the "sanity check" in the benchmarks, it throws an exception. The issue does come from dmsort and not from the benchmark itself. It is in the following piece of code:

} else {
    *write = std::move(*read);
    ++read;
    ++write;
    num_dropped_in_row = 0;
}

Here *write = std::move(*read); actually triggers a self-move when write == read. Adding a self-assignment check solves the problem and I get coherent results again:

} else {
    if (read != write) {
        *write = std::move(*read);
    }
    ++read;
    ++write;
    num_dropped_in_row = 0;
}

For performance reasons, I don't think the algorithm needs to perform that check above for trivially copyable types.

better benchmark

@adrian17
Copy link
Owner

adrian17 commented Feb 4, 2019

Thank you for investigating this! Just curious, which compiler/flag version triggered the exception? I'd experiment with it a bit myself.

In the next weekend I'll test it out, rerun the benchmarks and submit the patch.

@Morwenn
Copy link
Contributor Author

Morwenn commented Feb 4, 2019

I'm using MinGW-w64, either 32 or 64 bits. I otherwise still have issues (not with your code this time) with libc++ on OSX with other std::string tests which don't fail on Linux with libstdc++, so... Maybe check either libc++ on OSX or MinGW-w64 depending on what you have.

@adrian17
Copy link
Owner

adrian17 commented Feb 4, 2019

Alright. If I remember correctly, back then I tested it on Linux with g++ 5.4 and clang++ 3.8. Will try to repeat tests on Windows and on newer Linux VMs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants