Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue 13595: Extend groupBy to handle non-equivalence relations. #2654

Merged
merged 5 commits into from
Dec 1, 2014

Conversation

quickfur
Copy link
Member

@quickfur quickfur commented Nov 4, 2014

Update DDocs.

Add unittest for 13595.

Supporting non-equivalence relations (treated as a predicate between adjacent elements) allows for more interesting predicates, like maximum adjacent difference (added as ddoc example), word boundary testing, etc.. This comes at a performance cost, though, so the isEquivRelation parameter was added to fallback to the original implementation when the user is sure their predicate is an equivalence relation.

@quickfur quickfur changed the title Extend groupBy to handle non-equivalence relations. Issue 13595: Extend groupBy to handle non-equivalence relations. Nov 4, 2014
@bearophile
Copy link

Beware of "Boolean Blindness" (http://existentialtype.wordpress.com/2011/03/15/boolean-blindness/ ).

@quickfur
Copy link
Member Author

quickfur commented Nov 4, 2014

I thought about that, but recently Andrei seems ill-disposed towards the use of Flags for this purpose. It's trivial to add, though.

@quickfur
Copy link
Member Author

quickfur commented Nov 4, 2014

On second thoughts, using an enum for it (like the rest of std.algorithm) seems to be widespread, so I'll settle with that for now. It does make the code more readable, for sure.

@quickfur quickfur mentioned this pull request Nov 5, 2014
@yebblies
Copy link
Member

yebblies commented Nov 6, 2014

Why not a second name? eg groupByRelation? Then you can avoid the downsides of both a raw boolean and a single-use enum. They'd both forward to the same impl function anyway. Also, maybe normal groupBy should check that the first element compares equal to itself.

@quickfur
Copy link
Member Author

quickfur commented Nov 6, 2014

Hmm. That's definitely a possibility, but I don't know if groupByRelation is a good name. The distinction is more in whether the predicate is an equivalence relation or not; the current default is to assume it's not. Sadly, I can't think of a better name to use. Any other ideas?

@quickfur
Copy link
Member Author

Rebased.

@mihails-strasuns
Copy link

I like the changes, it is much less confusing for newcomers that way. No idea if groupByRelation is proper name though. @bearophile ?

@quickfur
Copy link
Member Author

Hmm. What about groupByEquivRelation? The idea being that by default, we assume nothing about the predicate, so the nice, short name groupBy handles that case, but if the user knows specifically that something is an equivalence relation, then they can get a slight performance boost by using groupByEquivRelation instead. What do you think?

@mihails-strasuns
Copy link

Yeah that makes sense - fits D principle "safe and powerful by default, extra performance if asked for explicitly". Again, won't judge naming choice :P

@MetaLang
Copy link
Member

groupByEquivRelation seems far too long to me, in the same realm as binaryReverseArgs, and so nobody will want to use it.

@bearophile
Copy link

I like to write, but I am not native English, so my choice of very short names is sometimes not the best.

I think groupByEquivRelation is a bad name. I agree that "binaryReverseArgs" is a too much long name and I am discouraged to use such function just because of its name, it should be renamed "flipTwo" or something similar.

I think that using an enum or Flag is better for this specific case because inventing a fitting name is not easy and it risks becoming a too much specialized function. As usual you need to balance different desires and constraints.

@mihails-strasuns
Copy link

Keeping groupBy with flag but making default behavior to not require equivalence is also a valid option.

@mihails-strasuns
Copy link

(misclick)

@quickfur
Copy link
Member Author

So everyone agrees that the previous version (two overloads of groupBy with compile-time flag) is better? Should I revert this last commit then?

@mihails-strasuns
Copy link

Well, unless some mathematician comes and proposes short and clear name for groupByEquivRelation :P

@quickfur
Copy link
Member Author

What about just groupbyEquiv? With Equiv being short for "equivalence", and an equivalence being shorthand of "equivalence relation". Or is that too much of a stretch? :D

@quickfur
Copy link
Member Author

ping

@mihails-strasuns
Copy link

With no new opinions I'd call for going back to version with flag (and using slower version that does not require equivalence as default)

@quickfur
Copy link
Member Author

Alright, reverted back to the version with the flag. Should Flag be used, or is the enum OK for now?

@bearophile
Copy link

In my code I'll use this function only rarely. What I usually look for is a hashGroup (that returns an associative array).

@mihails-strasuns
Copy link

Should Flag be used, or is the enum OK for now?

I have no preferences on this topic so whatever suits you.

@bearophile I think it should be possible to define .array overload that turns returned range in such associative array. What would be the key though?

@quickfur
Copy link
Member Author

Recently there were some PR's that changed single-use enums into aliases of Flag, like:

alias KeepTerminator = Flag!"keepTerminator";

I think it's a good idea, even though std.algorithm doesn't currently use it. It also lets you use the Yes and No templates -- of questionable value, but at least Phobos should dogfood its own utilities, otherwise what's the use, right?

H. S. Teoh added 2 commits November 27, 2014 10:20
Update DDocs.

Add unittest for 13595.
To prevent Boolean Blindness.
@quickfur
Copy link
Member Author

Rebased, use Flag instead of single-use enum.

@quickfur
Copy link
Member Author

Renamed.

@mihails-strasuns
Copy link

LGTM

@quickfur
Copy link
Member Author

Thanks!

@bearophile
Copy link

@Dicebot, the semantics is different so we'll probably need a different function (like hashGroup or something similar).

@DmitryOlshansky
Copy link
Member

Auto-merge toggled on

@DmitryOlshansky
Copy link
Member

Fine by me

DmitryOlshansky added a commit that referenced this pull request Dec 1, 2014
Issue 13595: Extend groupBy to handle non-equivalence relations.
@DmitryOlshansky DmitryOlshansky merged commit 5986e74 into dlang:master Dec 1, 2014
@quickfur quickfur deleted the issue13595b branch December 1, 2014 02:44
@bearophile
Copy link

@quickfur I've found a problem, but I am not sure if the cause is groupBy. I have put the code in Bugzilla, in issue 13595.

@andralex
Copy link
Member

andralex commented Jan 9, 2015

I think we need to revert this until https://issues.dlang.org/show_bug.cgi?id=13936 reaches a conclusion.

@quickfur
Copy link
Member Author

quickfur commented Jan 9, 2015

Then you'll have to take out #1453 as well, since that's the original implementation. This one is just a fixup.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants