Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make group, chain, and choose compatible with RefRange #6346

Merged
merged 3 commits into from
Mar 26, 2019

Conversation

aG0aep6G
Copy link
Contributor

The first three of probably many functions that need to be fixed in order to resolve issue 18657.

Related forum discussion: https://forum.dlang.org/post/p96gs4$22fp$1@digitalmars.com.

I'm not sure if this is the right direction. The alternative would be to add assignment to the range interface, and remove opAssign from RefRange. That would be less busywork, and being able to use assignment with ranges might be better for everyone's sanity. But it would also be more of a breaking change.

@dlang-bot
Copy link
Contributor

Thanks for your pull request and interest in making D better, @aG0aep6G! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please verify that your PR follows this checklist:

  • My PR is fully covered with tests (you can see the annotated coverage diff directly on GitHub with CodeCov's browser extension
  • My PR is as minimal as possible (smaller, focused PRs are easier to review than big ones)
  • I have provided a detailed rationale explaining my changes
  • New or modified functions have Ddoc comments (with Params: and Returns:)

Please see CONTRIBUTING.md for more information.


If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

Bugzilla references

Auto-close Bugzilla Severity Description
18657 normal std.range and std.algorithm can't handle refRange

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub fetch digger
dub run digger -- build "stable + phobos#6346"

typeof(this) ret = this;
ret._input = this._input.save;
auto saved = this._input.save;
move(saved, ret._input);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this is the right direction.

I'm not sure either. If we're doing it this way though we could at least put this in a branch that checks for hasElaborateAssign and otherwise use the normal behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could at least put this in a branch that checks for hasElaborateAssign and otherwise use the normal behavior.

What would that accomplish? It looks like move boils down to target = source; anyway when there's nothing special to be done.

static if (hasElaborateAssign!T || !isAssignable!T)
memcpy(&target, &source, T.sizeof);
else
target = source;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would that accomplish?

It avoids unnecessary function calls and unnecessary data copying and potentially reduces template bloat. Compare the generated assembly for aaaCopyImplAAA and bbbCopyImplBBB with compile options "-O -inline": https://run.dlang.io/is/1aofWf

struct Inner
{
    uint meaninglessField;
    @property typeof(this) save() { return this; }
}

struct ExampleStruct
{
    Inner exampleField;
    @property typeof(this) aaaCopyImplAAA()
    {
        typeof(this) ret = this;
        ret.exampleField = this.exampleField.save;
        return ret;
    }
    @property typeof(this) bbbCopyImplBBB()
    {
        import std.algorithm.mutation : move;
        typeof(this) ret = this;
        auto saved = this.exampleField.save;
        move(saved, ret.exampleField);
        return ret;
    }        
}

void main()
{
}

aaaCopyImplAAA:

onlineapp.ExampleStruct.aaaCopyImplAAA():
		push	RBP
		mov	RBP,RSP
		mov	EAX,[RDI]
		pop	RBP
		ret

bbbCopyImplBBB:

onlineapp.ExampleStruct.bbbCopyImplBBB():
		push	RBP
		mov	RBP,RSP
		sub	RSP,010h
		mov	EAX,[RDI]
		mov	-8[RBP],EAX
		mov	ECX,[RDI]
		mov	-4[RBP],ECX
		lea	RSI,-4[RBP]
		lea	RDI,-8[RBP]
		call	  pure nothrow @nogc void std.algorithm.mutation.moveImpl!(onlineapp.Inner).moveImpl(ref onlineapp.Inner, ref onlineapp.Inner)@PLT32
		mov	EAX,-8[RBP]
		mov	RSP,RBP
		pop	RBP
		ret
		add	[RCX],AH

pure nothrow @nogc void std.algorithm.mutation.moveImpl!(onlineapp.Inner).moveImpl(ref onlineapp.Inner, ref onlineapp.Inner):
		push	RBP
		mov	RBP,RSP
		sub	RSP,010h
		mov	-010h[RBP],RBX
		mov	RCX,RDI
		mov	RDX,RSI
		cmp	RDX,RCX
		jne	L20
		mov	RBX,-010h[RBP]
		mov	RSP,RBP
		pop	RBP
		ret
L20:		cmp	RDX,RCX
		je	L32
		mov	EBX,[RDX]
		mov	[RCX],EBX
		mov	RBX,-010h[RBP]
		mov	RSP,RBP
		pop	RBP
		ret
L32:		mov	R8D,055Fh
		lea	RCX,_TMP0@PC32[RIP]
		mov	EAX,040h
		mov	RDX,RAX
		mov	-8[RBP],RDX
		lea	RDX,_TMP0@PC32[RIP]
		mov	EDI,027h
		mov	RSI,RDX
		mov	RDX,-8[RBP]
		call	  _d_assert_msg@PLT32
		mov	RBX,-010h[RBP]
		mov	RSP,RBP
		pop	RBP
		ret

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Always compare ASM from LDC and never from DMD unless you're working on DMD's backend. DMD's optimizer is out of date, and therefore should not be catered to.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, please don't "optimize" code based on what dmd does. Due to dmd's inferior optimizer, this tends to greatly uglify code needlessly, where ldc/gdc would produce code better than dmd anyway, even when written in the original "unoptimized" form.

We should rather improve dmd's optimizer than make Phobos a rats' nest of workarounds for it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha, and LDC handles this example nicely. (The output is still littered with unused functions produced by template expansion, but I believe that's being actively worked on.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The output is still littered with unused functions produced by template expansion

-L--gc-sections should be able to remove unused sections.
(see e.g. https://sourceware.org/binutils/docs/ld/Options.html)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should rather improve dmd's optimizer than make Phobos a rats' nest of workarounds for it.

No, we should just stop catering for dmd's optimizer and use our resources towards more useful work.

@aG0aep6G
Copy link
Contributor Author

This has been open for three weeks, and the only discussion has been about optimizing for DMD or not (consensus: not). Ping?

Copy link
Member

@JackStouffer JackStouffer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Albeit a small one, this is a breaking change, as opAssign is no longer called during these functions. It's possible behavior in range types with custom opAssign will be broken by not being called, although it's unlikely.

But, this is a band-aid over the issue of what to do with funky operator overloads in ranges. In conjunction with this issue, you also have to assume that one could do stuff in the ctor that would mess up existing range code in Phobos. What really needs to happen is the range spec needs to be updated with a clear list of assumptions that Phobos makes in regards to these functions. I'm pretty sure that RefRange would have been designed differently if the designer realized the results of overloading opAssign like that.

In a perfect world, we'd have a pre-constructed test suite that people could plug their range (with some data in it) into, where all the tests make sure all Phobos assumptions hold true.

For the changes in this PR, it fixes an underlying issue and keeps current behavior for ranges without elaborate assigns, but burdens a large maintenance cost to the rest of Phobos. In my opinion, it would be better to update the range spec and go from there.

typeof(this) ret = this;
ret._input = this._input.save;
auto saved = this._input.save;
move(saved, ret._input);
ret._current = this._current;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this line also invoke refRange's opAssign?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this line also invoke refRange's opAssign?

_current is the front element. It's not the range itself.

@aG0aep6G
Copy link
Contributor Author

In my opinion, it would be better to update the range spec and go from there.

I wouldn't mind that at all.

@MartinNowak
Copy link
Member

MartinNowak commented Apr 17, 2018

This has been open for three weeks, and the only discussion has been about optimizing for DMD or not (consensus: not). Ping?

This is because this is a complex and deep rooted topic that needs proper understanding, also related to unsupported unique ownership ranges.
FWIW using copy construction + move to avoid a broken opAssign is a hack that might bite us in the long-run. It's also problematic to leak such workarounds into the standard library, where people look up how to implement certain behaviors.
Whether the design issue is within refRange, the algorithms, the range spec, or the language someone still needs to be exactly pointed out. And btw, it's the purpose of refRange to mutate the original range, so maybe it's not even reasonable to implement save.

Thanks for your work on this fix, but I think a deeper understanding of the problem would be more appropriate than simply fixing a few symptoms.

@JackStouffer
Copy link
Member

I've reconsidered based on @jmdavis's comments.

I think the option that will cause the least pain will keeping in the spirt of what RefRange is supposed to do, is to remove RefRange's forward range capabilities.

The range semantics still need to be defined, but that doesn't solve the issue of breaking something about RefRange. By removing save we keep the current intended behavior and remove the buggy behavior.

@aG0aep6G
Copy link
Contributor Author

I think the option that will cause the least pain will keeping in the spirt of what RefRange is supposed to do, is to remove RefRange's forward range capabilities.

Maybe. I'm not so sure if the original goal of RefRange is worthwhile, though.

What the opAssign achieves is that the RefRange binds more tightly to its target than a pointer or class reference would. It's more like a ref parameter, which also cannot be rebound.

I think this behavior is actually somewhat surprising. I'd expect refRange(&foo) to work like &foo, just with the necessary adjustments to make it a proper range itself. I've had uses for that part of RefRange. I don't think I ever had a use for the stronger non-rebindable aspect. But it's of course possible that I just wasn't aware of relying on it.

Example:

import std.range: refRange;
import std.stdio: writeln;

void main()
{
    string s = "foo";
    
    auto p = &s;
    auto r = refRange(&s);
    
    auto p2 = p;
    auto r2 = r;
    
    p2 = ["bar"].ptr; /* Rebind p2. */
    writeln(*p); /* Still "foo". */
    
    r2 = refRange(["bar"].ptr); /* Can't actually rebind r2. */
    writeln(r); /* Surprisingly(?), it's "bar" now. */
}

The range semantics still need to be defined, but that doesn't solve the issue of breaking something about RefRange. By removing save we keep the current intended behavior and remove the buggy behavior.

You also lose the intended behavior of RefRange being a forward range. That's ok if it was a mistake in the first place, but does it buy us anything beyond that? Seems mostly a separate issue to me.

If non-surprising assignment becomes a requirement for ranges, we have to fix RefRange in that regard. And if funky opAssign overloads get deliberately allowed, generic code can't assume anything about assignment, and we have to do what this PR does. In both cases, it doesn't really matter if RefRange has save or not.

One scenario where RefRange.save would become relevant is if we say that forward ranges must have non-surprising assignment, but input ranges don't have to. Doesn't really make sense to me, but that's the one case I can think of where removing save from RefRange would have a meaningful impact.

@MartinNowak
Copy link
Member

I'm not so sure if the original goal of RefRange is worthwhile, though.

My point was, refRange is a very important primitive, in particular with unique ownership to circumvent the copy vs. move problem. It's prolly also in widespread use.
If we need to fix sth. about this then we need to consider the whole ecosystem, including fallout, future directions, and learning hurdle/complexity.

I'd expect refRange(&foo) to work like &foo, just with the necessary adjustments to make it a proper range itself.

It this a valid expectation though? Pointers to ranges are valid InputRanges (try isInputRange!(typeof(&rng)).
Ironically where that breaks down is for auto saved = ptrRng.save; ptrRng = saved;, so RefRange just seems to fix that bit, thus supporting isForwardRange!(typeof(refRange(&rng))).
If RefRange behaved like a pointer, why would we need it in the first place?

Intuitively it seems there is a need for at least 2 use-cases:

  • Using refRange to consume (i.e. empty) a range with a normally preserving algorithm.
    Particularly useful for forward ranges that only support O(N) popFrontN.
    e.g. refRange(&fwdRng).until!(v => v > 10).sum
  • Using refRange to workaround internal copying of ranges, expecting that it preserves the current range just as passing a normal copyable one.
    e.g. refRange(&uniqRng).sum; assert(!uniqRng.empty);

There might be other use-cases I'm not aware of right now!

Please re-read your bug report https://issues.dlang.org/show_bug.cgi?id=18657#c0, it's a bit one-sided and uses complex examples to argue for "doing the obvious thing".

Sorry to play bouncer here, I fully understand the urge to just fix sth., but this problem seems important enough to warrant an open mindset.

@aG0aep6G
Copy link
Contributor Author

Pointers to ranges are valid InputRanges (try isInputRange!(typeof(&rng)).

Not generally. A pointer to an array is not an input range:

import std.range: isInputRange;
static assert(isInputRange!(int[])); /* of course */
static assert(isInputRange!(int[]*)); /* nope */

With other ranges, it breaks down after one level:

struct R { bool empty; int front; void popFront() {}}
import std.range: isInputRange;
static assert(isInputRange!R); /* obviously */
static assert(isInputRange!(R*)); /* yup */
static assert(isInputRange!(R**)); /* nope */

RefRange works in those cases.

Ironically where that breaks down is for auto saved = ptrRng.save; ptrRng = saved;, so RefRange just seems to fix that bit, thus supporting isForwardRange!(typeof(refRange(&rng))).
If RefRange behaved like a pointer, why would we need it in the first place?

I'm not saying it should behave exactly like a pointer. It should behave like a pointer, plus whatever is needed to make it work as a range. RefRange!(int[]) is a range. RefRange!SomePointerThatIsARange is a range. And, as you point out, RefRange!SomeForwardRange is a forward range. That's what RefRange has over a pointer.

As far as I see, opAssign isn't needed for those.

Intuitively it seems there is a need for at least 2 use-cases:

Using refRange to consume (i.e. empty) a range with a normally preserving algorithm.
Particularly useful for forward ranges that only support O(N) popFrontN.
e.g. refRange(&fwdRng).until!(v => v > 10).sum

Took me a bit to understand what you're getting at here. Correct me if I'm wrong.

fwdRng is of a type that behaves the same when you copy it as when you save it. For example, it's an int[]. Now, when you call until etc. on that directly, it doesn't affect the original (because the copy was a de-facto save). But when you do it with refRange, the range gets emptied.

I agree that that's a use case.

But I'd like to point out that the algorithm is only "normally preserving" when the range meets that requirement. Generally, you must assume that until invalidates the original range. That's where save and refRange come in: Both guarantee that the original remains valid. With save, the original isn't affected by pops in until. With refRange, it is.

To me, that's the point of refRange (or it should be). It's the counterpart to .save.

Using refRange to workaround internal copying of ranges, expecting that it preserves the current range just as passing a normal copyable one.
e.g. refRange(&uniqRng).sum; assert(!uniqRng.empty);

I'm not sure if I understand this one. This is what I got:

uniqRng is of a type where copying invalidates the original. It is a forward range, though. We think that sum naively makes copies instead of calling save as it should. So we can't just do uniqRng.save.sum. With the RefRange, the original range doesn't get copied, only a pointer to it, so sum can't mess it up.

I don't think we should actively cater to that. If sum has bugs, they should be fixed. But I feel like I'm missing the point here.

There might be other use-cases I'm not aware of right now!

In particular, use cases for RefRange's opAssign would be interesting.

Please re-read your bug report https://issues.dlang.org/show_bug.cgi?id=18657#c0, it's a bit one-sided and uses complex examples to argue for "doing the obvious thing".

The five examples of broken behavior aren't being used to argue for any particular solution. They just show that stuff is broken. They need fixing one way or another.

The bug report stands, even when the decision is that RefRange shall remain exactly as it is right now. I have already accepted that as a very likely outcome. Which is why this PR is here.

Sorry to play bouncer here, I fully understand the urge to just fix sth., but this problem seems important enough to warrant an open mindset.

Sure. I'm in no hurry. A little ping after three weeks of silence isn't being pushy, is it?

@thewilsonator
Copy link
Contributor

A little ping after three weeks of silence isn't being pushy, is it?

Not at all ;) sorry I've only just got around to this. This needs a rebase.

@aG0aep6G
Copy link
Contributor Author

This needs a rebase.

Done. But this needs a decision more than it needs a rebase.

Part of a series on issue 18657.
Part of a series on issue 18657.
Part of a series on issue 18657.
@aG0aep6G
Copy link
Contributor Author

Happy birthday, PR #6346! 🎉

Still waiting on someone to decide if this is the way to fix issue 18657.

@aG0aep6G
Copy link
Contributor Author

Thank you for pulling this, @thewilsonator. I hope you're aware that this was just the first step towards fixing the Bugzilla issue.

Here's the next batch: #6935. More will be needed.

@aG0aep6G aG0aep6G deleted the 18657 branch March 26, 2019 20:50
@WalterBright
Copy link
Member

This sounds like exactly the sort of "Generality Creep" identified by Andrei https://digitalmars.com/d/archives/digitalmars/D/Generality_creep_324867.html

Perhaps RefRange needs to be fundamentally rethought instead of changing every range algorithm.

@jmdavis
Copy link
Member

jmdavis commented Mar 29, 2019

This sounds like exactly the sort of "Generality Creep" identified by Andrei https://digitalmars.com/d/archives/digitalmars/D/Generality_creep_324867.html

Perhaps RefRange needs to be fundamentally rethought instead of changing every range algorithm.

Honestly, I think that it was a huge mistake to add RefRange in the first place (and I'm the one who added it). It's trying to fight against how ranges work to make certain use cases work better, and while it works in some situations, it's ultimately a mess.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants