Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std.algorithm.cmp with default ordering can use memcmp for all size 1 unsigned types (instead of just char) #6776

Merged
merged 1 commit into from Apr 12, 2019

Conversation

n8sh
Copy link
Member

@n8sh n8sh commented Nov 19, 2018

No description provided.

@dlang-bot
Copy link
Contributor

dlang-bot commented Nov 19, 2018

Thanks for your pull request, @n8sh!

Bugzilla references

Auto-close Bugzilla Severity Description
19412 enhancement std.algorithm.cmp with default ordering can use memcmp for all size 1 unsigned types (instead of just char)

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub fetch digger
dub run digger -- build "master + phobos#6776"

std/algorithm/comparison.d Outdated Show resolved Hide resolved
@n8sh n8sh force-pushed the issue-19412 branch 5 times, most recently from ae8fa67 to 281bc3a Compare November 19, 2018 20:24
return threeWay(r1.length, r2.length);
}();
}
if (*p1 != *p2) return cast(int) *p1 - cast(int) *p2;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should cache *p1 and *p2
the additional dereference may be expensive

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stefan-koch-sociomantic I've verified that the compiler performs this optimization without needing any hint from the programmer. Both cmp1 and cmp1 in the example below are compiled to the same machine code by ldc -O2:

https://run.dlang.io/is/cz5hay

import std.algorithm : min;

int cmp1(const dchar[] r1, const dchar[] r2) @trusted
{
    auto p1 = r1.ptr, p2 = r2.ptr, pEnd = p1 + min(r1.length, r2.length);
    for (; p1 != pEnd; ++p1, ++p2)
    {
        if (*p1 != *p2)
            return cast(int)*p1 - cast(int)*p2;
    }
    return int(r1.length > r2.length) - int(r1.length < r2.length);
}

int cmp2(const dchar[] r1, const dchar[] r2) @trusted
{
    auto p1 = r1.ptr, p2 = r2.ptr, pEnd = p1 + min(r1.length, r2.length);
    for (; p1 != pEnd; ++p1, ++p2)
    {
        const v1 = *p1, v2 = *p2; // manually saving result of *p1 and *p2
        if (v1 != v2)
            return cast(int) v1 - cast(int) v2;
    }
    return int(r1.length > r2.length) - int(r1.length < r2.length);
}

void main()
{
}

@thewilsonator
Copy link
Contributor

(Blocked on dscanner - so I stop relooking at this PR)

@radcapricorn
Copy link
Contributor

I'm against this change. We should be removing C dependencies from Phobos, not finding more ways to plug them in. The language should be capable enough without them.

@wilzbach
Copy link
Member

I'm against this change. We should be removing C dependencies from Phobos, not finding more ways to plug them in. The language should be capable enough without them.

memcmp is already used in this function.

@radcapricorn
Copy link
Contributor

@wilzbach As I said, "removing, not finding more ways" ;)

@n8sh
Copy link
Member Author

n8sh commented Mar 9, 2019

Benchmark of this PR's change:

range compiler speedup
ubyte[16] dmd -O -inline 2.5x
ubyte[128] dmd -O -inline 10.5x
ubyte[16] ldc2 -O2 2.4x
ubyte[128] ldc2 -O2 13.5x

Note: in this benchmark ldc2 produced a 64-bit binary and dmd produced a 32-bit binary.

For wider types whether this was an improvement depended on length, so I've restricted it to just byte-width types.

@radcapricorn instead of using memcmp directly I changed this PR to use core.internal.string.dstrcmp which internally uses memcmp with a separate path for CTFE. Does that satisfy your concerns?

@n8sh n8sh changed the title std.algorithm.cmp with default ordering can use memcmp for all big-endian or size 1 unsigned types (instead of just char) std.algorithm.cmp with default ordering can use memcmp for all size 1 unsigned types (instead of just char) Mar 9, 2019
@n8sh n8sh removed the Blocked label Mar 9, 2019
…cmp for all size 1 unsigned types

... instead of just char.
Copy link
Contributor

@thewilsonator thewilsonator left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just address Stefan's comment and I think this is good.

@wilzbach wilzbach merged commit 3144226 into dlang:master Apr 12, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants