Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new lazy range: merge() #3315

Merged
merged 18 commits into from Jun 12, 2015
Merged

Add new lazy range: merge() #3315

merged 18 commits into from Jun 12, 2015

Conversation

nordlow
Copy link
Contributor

@nordlow nordlow commented May 27, 2015

In compliance with C++ STL std::merge: http://en.cppreference.com/w/cpp/algorithm/merge
but variadic and with bidirectional access, of course! :)

Implementation is a tweak of roundRobin extended to support bidirectional access iff all its input ranges does.

Respects CommonType of the ElementTypes of the input ranges.

Propagates infiniteness if any of the inputs is infinite.

  • How about putting CommonElementType in std.traits?
  • How do we assert that the inputs are sorted according to the template parameter pred?
  • How about putting isSortedRange in std/range/primitives.d?
  • There might be some clever optimization of the min calculations in successive calls to popFront for a large number of parameters. But that's perhaps outside the scope of this implementation.

@nordlow nordlow changed the title Add lazy range merge() Add new lazy range merge() May 27, 2015
@nordlow nordlow changed the title Add new lazy range merge() Add new lazy range: merge() May 27, 2015
@@ -8647,3 +8647,237 @@ if (is(typeof(fun) == void) || isSomeFunction!fun)

auto r = [1, 2, 3, 4].tee!func1.tee!func2;
}

template CommonElementType(Rs...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this package-qualified? If not, keep it package for now because it's not supposed to be public (not documented). Let's decide in a future pull whether this belongs to std.traits (or possibly std.range).

@andralex
Copy link
Member

andralex commented Jun 1, 2015

Great work! Please address these nits and then let's get this in. Thank you, really impressed!

@nordlow
Copy link
Contributor Author

nordlow commented Jun 1, 2015

How do we assert that the inputs are sorted according to the template parameter pred? I'm missing an extension of

package alias isSortedRange(R) = isInstanceOf!(SortedRange, R);

as

package alias isSortedRangeAs(R, alias pred) = isInstanceOf!(SortedRange, pred, R);

but this is not currently supported by isInstanceOf.

Could isInstanceOf be extended or do we need a new trait?

@nordlow
Copy link
Contributor Author

nordlow commented Jun 1, 2015

Added propagation of infiniteness.

@nordlow
Copy link
Contributor Author

nordlow commented Jun 1, 2015

@andralex Ping :)

@andralex
Copy link
Member

andralex commented Jun 5, 2015

Yah I think you use auto ref with front and back in a subsequent diff. We did the same with SortedRange, BinaryHeap etc - i.e. we do offer sorting and sorted access, but do not stop the user from breaking it if they so wish.

@andralex
Copy link
Member

andralex commented Jun 5, 2015

@nordlow ping :)

@nordlow
Copy link
Contributor Author

nordlow commented Jun 5, 2015

  • So, again
foreach (ref e; m)
{
    e = 100;
}

has currently no sideeffect on m. Can anybody tell me why?

  • What do you mean with your statement

"Yah I think you use auto ref with front and back in a subsequent diff."

I've always used auto reffor front and back in this PR.

  • And again:

How do we assert that the inputs are sorted according to the template parameter pred (see comment above)?


void popFront()
{
final switch (_lastFrontIndex)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably a bug that final switch is allowed here...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

roundRobin uses the final switch pattern aswell.

If I remove the final I get

Error: switch statement without a default; use 'final switch' or add 'default: assert(0);' or add 'default: break;'

I think we should leave it as is and remove all uses of final switch in one go if this is wrong.

@lomereiter
Copy link
Contributor

Hello,
I find it very confusing that now we'd have two very similar functions in Phobos - the other one is std.algorithm.setops.setUnion which seems to differ only in not offering bidirectionality.
Since no one even mentioned it, maybe it's just a documentation failure, but maybe functions from std.algorithm.setops should also mirror the respective C++ counterparts and not include duplicated elements in the output.

@nordlow
Copy link
Contributor Author

nordlow commented Jun 10, 2015

Is see two directions here:

  • Add support for bidirectional access in setUnion and make an alias merge to it.
  • The reverse; make setUnion be an alias for merge.

In both cases there is one problem however; what to do about the fact that setUnion returns a non-voldemort type SetUnion?

@andralex ?

@lomereiter I don't under stand why mention duplicate elements; setUnion doesn't deduplicate equal consecutive elements. That's the purpose of uniq. And neither does merge.

@andralex
Copy link
Member

@nordlow I'm not sure what the matter with that loop is without context. It should work if all ranges have the same type. I'll pull this because that matter doesn't seem to block it. If anything fishy, please submit a bug report as a runnable program using the already-pulled merge().

@andralex
Copy link
Member

Auto-merge toggled on

andralex added a commit that referenced this pull request Jun 12, 2015
@andralex andralex merged commit a99e1c4 into dlang:master Jun 12, 2015
@lomereiter
Copy link
Contributor

@nordlow I mentioned duplicates because you referred to std::merge from STL, and std::set_union performs deduplication.

@nordlow
Copy link
Contributor Author

nordlow commented Jun 12, 2015

@andralex Shouldn't I have done a rebase first?

@nordlow
Copy link
Contributor Author

nordlow commented Jun 12, 2015

@lomereiter Why doesn't setUnion do deduplication by default if std::set_union does that? An oversee? Anyhow the C++-style set-union can no be implemented in Phobos by reusing merge and uniq to reduce code duplication.

@lomereiter
Copy link
Contributor

@nordlow I agree, and I'd prefer to have setUnion calling merge and uniq (but that's hard now that there's a policy of non-breaking changes). git-blame tells that it was added long ago (2009) by @andralex so better ask his opinion on this

@nordlow
Copy link
Contributor Author

nordlow commented Jun 12, 2015

Ok, great. Then I know.

Thanks @andralex for a quick response! Made my day :)

@andralex
Copy link
Member

Wait, what are the commonalities and distinctions among setUnion, nWayUnion, and merge?

@andralex
Copy link
Member

The idea behind setUnion not doing deduplication is that it's technically easier to remove information than add it back. In this case uniq is the ticket.

Sadly it seems to me we should not add merge, but instead add its capabilities into setUnion. @nordlow what do you think?

@nordlow
Copy link
Contributor Author

nordlow commented Jun 13, 2015

I see the following directions forward:

  1. Rewrite setUnion to support the goodies introduced in PR and then make merge and alias for setUnion. However, this will make merge suffer from the fact that setUnion returns a non-Voldemort type This is a bad pattern and can be especially confusing for the user as its naming has nothing with merge to do.
  2. Keep merge and setUnion as is but add a deprection warning for setUnion that its return type will be changed to a Voldemort type in the future. Then after a year or so change setUnion to be an alias for merge. Or vice versa.
  3. Apply the change now possibly causing breakage for users who explictly refer to the struct SetUnion.

I vote for the second alternative.

@CyberShadow
Copy link
Member

Sadly it seems to me we should not add merge, but instead add its capabilities into setUnion. @nordlow what do you think?

So, any thoughts on this? The release is looming. If anything we should revert this pull or de-document the function to buy more time.

@nordlow
Copy link
Contributor Author

nordlow commented Jun 25, 2015

As already mentioned above I stand with the second alternative. @andralex?

@CyberShadow
Copy link
Member

@nordlow Since @andralex is busy with the move, what do you say about reverting this until everyone has more time for a more involved resolution?

@nordlow
Copy link
Contributor Author

nordlow commented Jun 27, 2015

Ok.

@CyberShadow
Copy link
Member

#3451

@wilzbach
Copy link
Member

Great work! Please address these nits and then let's get this in. Thank you, really impressed!
...
Sadly it seems to me we should not add merge, but instead add its capabilities into setUnion. @nordlow what do you think?

@nordlow would you have time to revive your work from the dead and incorporate the additional capabilities into merge?

@nordlow
Copy link
Contributor Author

nordlow commented Aug 29, 2016

I'm waiting for a decision from @andralex on which of the 3 directions to go (see above). Shall I give a ping?

Update: Ahh, I just saw your work on #4249 (comment)

I'll look into it.

Update: @wilzbach Is deduplication of consecutive elements the only thing that is missing in this PR? If so should we just add a parameter bool deduplicate?

@wilzbach
Copy link
Member

Update: @wilzbach Is deduplication of consecutive elements the only thing that is missing in this PR? If so should we just add a parameter bool deduplicate?

Are you referring to adding a parameter deduplicate to setUnion? I am not sure whether this is necessarily a good idea as the plan was to go through the deprecation road, remove it completely and later reintroduce it with changed behavior. Additionally SetUnion!(FilterDuplicates.yes) looks more verbose than setUnion.uniq

.. on which of the 3 directions to go (see above).

Rewrite setUnion to support the goodies introduced in PR and then make merge and alias for setUnion. However, this will make merge suffer from the fact that setUnion returns a non-Voldemort type This is a bad pattern and can be especially confusing for the user as its naming has nothing with merge to do.

You now have Merge as exposed struct. There are a lot of publicly exposed structs and afaik there is no plan to deprecate them because the breakage comes with no real benefit.
-> I believe that's the way to go

(I have tried to deprecate public structs before, it's really not worth it)

Keep merge and setUnion as is but add a deprection warning for setUnion that its return type will be changed to a Voldemort type in the future. Then after a year or so change setUnion to be an alias for merge. Or vice versa.

The deprecation warning is now triggered for setUnion, but apart from that we can't change the behavior.

Apply the change now possibly causing breakage for users who explictly refer to the struct SetUnion.

Breakage needs to be avoided.

@nordlow
Copy link
Contributor Author

nordlow commented Aug 29, 2016

Ok. AFAICT the plan is:

  • Make setUnion in the long run do deduplication by default. Which doesn't affect this PR.
  • Add my merge implementation without no adding nor removal of features.

Is this correct?

Further, I'm all for making Merge a Voldemort type (private static struct Result)

To resurrect this work, is a new PR the way to go?

@wilzbach
Copy link
Member

Make setUnion in the long run do deduplication by default. Which doesn't affect this PR.
Add my merge implementation without no adding nor removal of features.

I am not sure what you mean with removing features. I had a quick look (from my phone) and I think your PR contains at least these goodies:

  • infiniteness propagation
  • keeping track of the back index and thus bidirectional access
  • using size_t for the index
  • using final switch instead of adjustPosition

So yeah - you can basically paste it over the existing merge, but please don't remove existing public symbols like ElementType as that might cause breakage.

Further, I'm all for making Merge a Voldemort type (private static struct Result)

The symbol Merge hasn't been released yet, so this could happen without a deprecation.
-> Probably the best to attempt this in a separate PR that just sets the package visibility toMerge?

To resurrect this work, is a new PR the way to go?

Yes please (there is no way to reopen a merged PR)

@nordlow nordlow deleted the add-merge-range branch October 19, 2021 19:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
7 participants