Proposal: Ternary comparison operator #4108

Open
1 of 4 tasks
CyrusNajmabadi opened this issue Nov 5, 2020 · 37 comments
Assignees
Labels
Needs Approved Specification This issue needs an LDM-approved specification Proposal champion
Milestone

Comments

@CyrusNajmabadi
Member

CyrusNajmabadi commented Nov 5, 2020

Ternary comparison operator

  • Proposed
  • Prototype: Not Started
  • Implementation: Not Started
  • Specification: Not Started

Summary

This would allow users to write a simplified x < y < z test as shorthand for x < y && y < z.

Detailed design

This code is already parseable today and already has meaning. Indeed, here is an (albeit pathological) case where this would compile today:

using System;
public static class Program {
    public static void Main() {
        var (x, z) = (new C(), new C());
        Console.WriteLine(x < 5 < z);
    }
}

public struct C
{
    public static bool operator <(C x, C y) => true;
    public static bool operator >(C x, C y) => true;
    
    public static C operator <(C x, int y) => x;
    public static C operator >(C x, int y) => x;
}

In order for this type of code to compile today, you'd need to do very strange operator overloading which would not be expected in practice. However, because of back compat, we would likely have to support this.

As such, when processing a binary expression of the form expr1 op expr2 op expr3 we would have to bind in the same fashion as today. However, if that form failed to bind the operators successfully, we would now reinterpret it as follows:

expr1 op1 expr2 op2 expr3, where op1 and op2 are each one of >, <, >=, <=, is reinterpreted as:

var __t1 = expr1;
var __t2 = expr2;

var r = false;
if (__t1 op1 __t2)
{
    var __t3 = expr3;
    if (__t2 op2 __t3)
        r = true;
}

This should match intuition here and would mean code executes (including order of evaluation and short-circuiting) as expected.
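
To make the evaluation order concrete, here is a minimal hand-lowered sketch of the rewrite above for int operands. The `Eval` helper and the `Log` list are hypothetical and exist only to make the order of evaluation observable; a real compiler lowering would not introduce them.

```csharp
using System;
using System.Collections.Generic;

public static class ChainedComparisonDemo
{
    // Records the order in which operands are evaluated.
    public static readonly List<string> Log = new List<string>();

    // Hypothetical helper: logs that an operand was evaluated, then returns it.
    public static int Eval(string name, int value)
    {
        Log.Add(name);
        return value;
    }

    // Hand-written equivalent of `Eval("a", a) < Eval("b", b) < Eval("c", c)`
    // under the proposed rewrite.
    public static bool Lowered(int a, int b, int c)
    {
        var t1 = Eval("a", a);
        var t2 = Eval("b", b);

        var r = false;
        if (t1 < t2)
        {
            var t3 = Eval("c", c); // only evaluated when the first test passes
            if (t2 < t3)
                r = true;
        }
        return r;
    }

    public static void Main()
    {
        Console.WriteLine(Lowered(1, 2, 3));      // True
        Console.WriteLine(string.Join(",", Log)); // a,b,c

        Log.Clear();
        Console.WriteLine(Lowered(5, 2, 3));      // False
        Console.WriteLine(string.Join(",", Log)); // a,b
    }
}
```

Note that the middle operand is evaluated exactly once, which a literal `x < y && y < z` would not guarantee if `y` had side effects.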

Drawbacks

Potential confusion over a piece of code potentially having two different meanings. However, this is very unlikely, as no real codebases should ever have been using a < b < c. It just isn't a reasonable code pattern today, so no one uses it or expects it to work the way it does today. Practically all users looking at this would expect it to have the semantics this proposal is suggesting.

Design meetings

@HaloFour
Contributor

HaloFour commented Nov 5, 2020

It was noted that there is a NuGet package which uses an extension method and some overloaded operator chicanery to mimic this language feature: dotnet/roslyn#136 (comment)

However, if the compiler will only treat this expression as a range comparison if it wouldn't otherwise compile then I would imagine any existing code that might rely on this package would continue to compile and work as expected. But it should probably be included as a part of the test suite to ensure no unintended breaks.

@CyrusNajmabadi
Member Author

Good to know. This would fall under the portion of the design that says: we will try the existing semantics and always interpret the expression that way if that succeeds. We only use the new semantics if it doesn't.

That example is also interesting in that it's trying to provide these semantics. So a great part about this is that once we have this feature, you can stop using that lib :)

@YairHalberstadt
Contributor

we would have to bind in the same fashion as today.

How would this be defined so that further changes to binding do not become overly complex?

For example, if C# increases the number of places implicit operators can be used in the future, or changes lookup, or improves overload resolution, we would need to make sure all those things run only after we check if it matches this new ternary comparison operator. This might force the binding code to become very contorted to allow this.

@CyrusNajmabadi
Member Author

CyrusNajmabadi commented Nov 5, 2020

honestly, i would say: given a tree of the form

      op1
     /  \
expr1    op2
        /  \
   expr2    expr3

We have the existing rules. We then have a clause that says that if that produces an error (terminology we already use in lambda resolution) then rebind with such-and-such expected semantic rewrite. In general we discuss validity, and binding-time errors. This would apply here. If the normal interpretation is invalid (or produces a binding-time error), we try the new interpretation.

I'm not sure any of the cases mentioned so far necessarily matter. Or, if they do, they are starting to get to the cornerest corner of all corners :)

@alrz
Contributor

alrz commented Nov 5, 2020

Linking back to my comment here #4106 (reply in thread)

Since you're going to deal with some semantic ambiguities anyways, the postfix pattern does seem to be a viable alternative, except that there will be syntax ambiguities instead (but not in any useful scenarios mentioned here).

Opened a new discussion: #4110

@LRC-H

LRC-H commented Nov 6, 2020

Very good

@huoyaoyuan
Member

Can there be a warning wave when binding to the old behavior?

@CyrusNajmabadi
Member Author

Can there be a warning wave when binding to the old behavior?

I don't see any point in doing that

@hez2010

hez2010 commented Nov 9, 2020

There may be some ambiguities that need to be resolved, such as A < B < C > D > F.

@AdamSpeight2008

Wouldn't this potentially break compatibility? Those operators are not guaranteed to represent comparisons. The user is free to give them different semantics.

@AdamSpeight2008

AdamSpeight2008 commented Nov 22, 2020

This would have to be under a feature flag, so that the change in semantics doesn't change errors in previous versions.

This can work just via the binder and lowering.
e.g. a < b <= c
Requires all the operands a, b, c to be the same type,
and that the type has an operator < equivalent to Func<T, T, Bool>,
and that the type has an operator <= equivalent to Func<T, T, Bool>.
Then in lowering the compiler generates ((a < b) && (b <= c)).
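
As a rough illustration of this same-type restriction, the sketch below models `a < b <= c` with a generic helper. `LessThenLessOrEqual` is a hypothetical name, and `IComparable<T>` is used as an assumed stand-in for "the type has comparison operators"; being an ordinary method call, it evaluates all three operands eagerly, unlike the lowering in the original post.

```csharp
using System;

public static class SameTypeChain
{
    // Hypothetical helper: models `a < b <= c` when all operands share one type T.
    // IComparable<T> stands in for "T provides < and <= returning a boolean".
    public static bool LessThenLessOrEqual<T>(T a, T b, T c) where T : IComparable<T>
        => a.CompareTo(b) < 0 && b.CompareTo(c) <= 0;

    public static void Main()
    {
        Console.WriteLine(LessThenLessOrEqual(1, 2, 2));       // True
        Console.WriteLine(LessThenLessOrEqual(2, 2, 3));       // False
        Console.WriteLine(LessThenLessOrEqual(1.5, 2.5, 2.5)); // True
    }
}
```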

It was noted that there is a NuGet package which uses an extension method and some overloaded operator chicanery to mimic this language feature: dotnet/roslyn#136 (comment)

I wrote that nuget package. Note it uses IComparable(Of T) as the check to see if the type has "comparison operators".
It also introduces 2 new class types, _0(Of T) (the lifted type) and _1(Of T), to provide access to the second "comparison operator".

@canton7

canton7 commented Nov 22, 2020

Compatibility concerns are fully addressed in the OP, no?

@CyrusNajmabadi
Member Author

Wouldn't this potentially break compatibility?

No. The specification can effectively be read to say:

If this has legal semantics under C# 9, then use those semantics. Otherwise, try the new semantics. So nothing can break.

@AdamSpeight2008

AdamSpeight2008 commented Nov 22, 2020

OP doesn't mention if a b c can or can not be different types.
My addition to the proposal, restricting the operands to the same type, focuses it on the conventional comparison operations. Func<Func<T, T, Bool>, T, Bool>
Which is currently an error.

@CyrusNajmabadi
Member Author

I wrote that nuget package.

Your package will retain its semantics.

@CyrusNajmabadi
Member Author

OP doesn't mention if a b c can or can not be different types.

It can operate on different types as per the rewrite rule i've specified. The rule is agnostic to that.

@AdamSpeight2008

AdamSpeight2008 commented Nov 22, 2020

@CyrusNajmabadi Are you restricting to the "comparison" operator only if it returns boolean?
As if it wasn't, it would cause that nuget package to have incorrect semantics.

lowerValue.__() <= value <= upper
' types produced during evaluation.
T.__()  --> _0(Of T) ' The "Lifted" type
_0(Of T) <= T --> _1(Of T)  ' provides the first operators (<, <=)
_1(Of T) <= T --> Boolean   ' provides the second operators (<, <=)
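
For readers more familiar with C#, the type flow above can be sketched roughly as follows. This is not the package's actual API: `Lifted<T>`, `Pending<T>` and `Lift()` are hypothetical names standing in for `_0(Of T)`, `_1(Of T)` and `__()`, and only `<=`/`>=` are shown. It also illustrates the kind of existing code that already binds today and would therefore keep its current meaning.

```csharp
using System;

// Hypothetical stand-in for the "lifted" type _0(Of T).
public readonly struct Lifted<T> where T : IComparable<T>
{
    public readonly T Value;
    public Lifted(T value) { Value = value; }

    // First comparison: carries the middle operand and the result so far.
    public static Pending<T> operator <=(Lifted<T> lower, T value)
        => new Pending<T>(value, lower.Value.CompareTo(value) <= 0);

    // C# requires <= and >= to be declared as a pair.
    public static Pending<T> operator >=(Lifted<T> upper, T value)
        => new Pending<T>(value, upper.Value.CompareTo(value) >= 0);
}

// Hypothetical stand-in for _1(Of T): supplies the second comparison.
public readonly struct Pending<T> where T : IComparable<T>
{
    public readonly T Value;
    public readonly bool Ok;
    public Pending(T value, bool ok) { Value = value; Ok = ok; }

    public static bool operator <=(Pending<T> left, T upper)
        => left.Ok && left.Value.CompareTo(upper) <= 0;

    public static bool operator >=(Pending<T> left, T lower)
        => left.Ok && left.Value.CompareTo(lower) >= 0;
}

public static class LiftExtensions
{
    // Hypothetical counterpart of the package's __() extension method.
    public static Lifted<T> Lift<T>(this T value) where T : IComparable<T>
        => new Lifted<T>(value);
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(1.Lift() <= 5 <= 10);  // True
        Console.WriteLine(1.Lift() <= 50 <= 10); // False
    }
}
```

Unlike the proposed rewrite, this approach always evaluates all three operands, since each operator is an ordinary method call.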

@CyrusNajmabadi
Member Author

@CyrusNajmabadi Are you restricting to the "comparison" operator only if it returns boolean?

No

@CyrusNajmabadi
Member Author

CyrusNajmabadi commented Nov 22, 2020

As if it wasn't, it would cause that nuget package to have incorrect semantics.

Any existing semantics would be preserved.

the general intuition for the algorithm is:

  1. try existing C# 9.0 semantics. If that succeeds, those are the semantics to use.
  2. otherwise, if that fails, interpret a < b < c < d < e as a < b && b < c && c < d && d < e and try to bind. if that succeeds, emit in that fashion.
  3. otherwise, fail.
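
Step 2 can be modelled at runtime with a small sketch. `IsStrictlyIncreasing` is a hypothetical helper, not part of the proposal; operands are passed as delegates (an assumption made here purely so that evaluation can be counted) to show that each operand is evaluated at most once and that later operands are skipped once a pair fails.

```csharp
using System;

public static class ChainExpansion
{
    // Hypothetical helper: treats `a < b < c < ...` as a conjunction of adjacent pairs.
    public static bool IsStrictlyIncreasing(params Func<int>[] operands)
    {
        if (operands.Length < 2) return true;

        int left = operands[0]();
        for (int i = 1; i < operands.Length; i++)
        {
            int right = operands[i]();          // evaluated once, reused for the next pair
            if (!(left < right)) return false;  // short-circuit: remaining operands are skipped
            left = right;
        }
        return true;
    }

    public static void Main()
    {
        int evaluated = 0;
        Func<int> Op(int v) => () => { evaluated++; return v; };

        Console.WriteLine(IsStrictlyIncreasing(Op(1), Op(2), Op(3), Op(4))); // True
        Console.WriteLine(evaluated);                                        // 4

        evaluated = 0;
        Console.WriteLine(IsStrictlyIncreasing(Op(1), Op(5), Op(3), Op(4))); // False
        Console.WriteLine(evaluated);                                        // 3
    }
}
```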

@NN---

NN--- commented Dec 20, 2020

With C# 9 it can be written
a is > 5 and < 10 which doesn’t have ambiguity and allows to mix and match comparisons.

@FaustVX

FaustVX commented Dec 20, 2020

@NN---

With C# 9 it can be written
a is > 5 and < 10 which doesn’t have ambiguity and allows to mix and match comparisons.

No, because pattern matching works only with constants, whereas with this proposal it could work with any value (variables, properties, …)
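
A small sketch of the difference, using hypothetical method names: relational patterns accept constant bounds, so a range check against runtime values still has to be written out with `&&` today.

```csharp
using System;

public static class PatternVsChain
{
    // C# 9 relational patterns: the bounds must be constants.
    public static bool InFixedRange(int a) => a is > 5 and < 10;

    // With runtime bounds the comparison is spelled out explicitly.
    // `a is > lo and < hi` would not compile here, because lo and hi are not constants.
    public static bool InRange(int a, int lo, int hi) => lo < a && a < hi;

    public static void Main()
    {
        Console.WriteLine(InFixedRange(7));   // True
        Console.WriteLine(InRange(7, 5, 10)); // True
        Console.WriteLine(InRange(7, 8, 10)); // False
    }
}
```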

@333fred 333fred added this to the Any Time milestone Feb 2, 2021
@333fred 333fred added the Needs Approved Specification This issue needs an LDM-approved specification label Feb 2, 2021
@333fred 333fred removed this from TRIAGE NEEDED in Language Version Planning Feb 6, 2021
@Daynvheur

Daynvheur commented Jul 30, 2021

Little comment here (thanks to @alrz for the link) as I've opened a very similar discussion at #4980 , but with an or comparison.
Thus, a question: would this ternary comparison only work in and mode? If not, how would you express A < C [or] A > B?

@Xyncgas

Xyncgas commented Jan 13, 2022

Potential confusion over a piece of code potentially having two different meanings.

We have <= and => today, tell me they are the same thing.

@Korporal

Korporal commented Jan 20, 2023

There may be some ambiguities that need to be resolved, such as A < B < C > D > F.

What ambiguity is in that fragment? Creating the parse tree for that is not ambiguous so far as I can see.

@CyrusNajmabadi
Member Author

It would need appropriate lookahead to ensure that generics, variables, and comparisons were properly handled. It's likely not a full ambiguity, but a local one.

@333fred
Member

333fred commented Jan 20, 2023

What ambiguity is in that fragment?

Is it A<B<C>D> F (a declaration of a variable F with type A<B<C>D> and a syntax error on a missing ,) or a chained comparison operator?

@Korporal

Korporal commented Jan 20, 2023

What ambiguity is in that fragment?

Is it A<B<C>D> F (a declaration of a variable F with type A<B<C>D> and a syntax error on a missing ,) or a chained comparison operator?

Yes I see that when there's no context, but in the case of an if statement we have the context, the grammar rules of the if.

A declaration isn't legal (well I can't see it myself) inside the expression that's part of an if. The parser would only encounter this when parsing a conditional expression. C# already correctly distinguishes between generics and expressions.

C# seems to already parse this correctly too:

int A = 0;
int B = 0;
int C = 0;
int D = 0;
int F = 0;

if (A<B<C>D>F)
{
    ;
}

The diagnostic it reports is not about syntax but a semantic problem, the types that are used either side of the operator <.

This too suggests that if the semantic checking were replaced, the expression interpreted as a chained compare, it would work.

@FaustVX

FaustVX commented Jan 20, 2023

There may be some ambiguities that need to be resolved, such as A < B < C > D > F.

As far as I can test, the compiler is already happy with that syntax
sharplab.io

@CyrusNajmabadi
Member Author

Yes I see that when there's no context, but in the case of an if statement we have the context, the grammar rules of the if.

Yes. This is an example of us addressing the potential issue. The situations brought up are there so we ensure we think through it and make sure the language is spec'ed such that it is not an issue and so that the compiler operates as expected with appropriate tests.

@theking2

I really don't understand this. There is also ambiguity with another ternary operator ?:. It was addressed there, so why is op cmp op cmp op different?
Clearly there are ambiguities but with ?: we seem perfectly fine with those.

@CyrusNajmabadi
Member Author

I really don't understand this. There is also ambiguity with another ternary operator ?:

What ambiguity are you referring to?

Clearly there are ambiguities but with ?: we seem perfectly fine with those.

No one said we are not fine with ambiguities... As i said in the post above yours, we would just need to be cognizant so we can spec the language out properly, and ensure the compiler does the right thing.

@theking2

What does a? b? c: d: do, and what does a < b < c < d do? Both are ambiguous. () are needed to fix this.

@CyrusNajmabadi
Member Author

a? b? c: d:

The code is incomplete and will be a failure to parse. Did you miss an e at the end? I'm not sure if there's an accidental omission or a purposeful one.

Assuming it's accidental, and you meant: a? b? c: d: e, then there is no ambiguity here (syntactic or otherwise).
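
As a quick check of that claim, the sketch below compares the unparenthesized form with the grouping it parses to, `a ? (b ? c : d) : e`. The method names and string values are hypothetical placeholders.

```csharp
using System;

public static class NestedConditional
{
    // The unparenthesized form from the comment above.
    public static string Nested(bool a, bool b) => a ? b ? "c" : "d" : "e";

    // The same expression with its grouping made explicit.
    public static string Explicit(bool a, bool b) => a ? (b ? "c" : "d") : "e";

    public static void Main()
    {
        foreach (var a in new[] { false, true })
            foreach (var b in new[] { false, true })
                Console.WriteLine(Nested(a, b) == Explicit(a, b)); // True for every combination
    }
}
```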

Can you please clarify?

@CyrusNajmabadi
Member Author

and what does a < b < c < d do

Currently it is syntactically legal, and may have semantic meaning in esoteric cases. This proposal discusses giving it legal meaning in the case where it is illegal today.

@Daynvheur

Random concerns:

  • What would be the interest (if any) of a non-short-circuited version of this? (Making sure that a < b < c < d < e does evaluate the e part.)
  • Syntactic analysis simplifications: 3 > a < 1 could be simplified as a < 1? (a required to be less than 3 and less than 1 is satisfied with just a required to be less than 1.)
  • Warn about impossible conditions: 3 < a < 1? (a required to be more than 3 and less than 1 can never be satisfied.)

@CyrusNajmabadi
Member Author

What would be the interest (if any) of a non-short-circuited version of this? (Making sure that a < b < c < d < e does evaluate the e part.)

My thought here is that this is shorthand for a < x && x < b. So it should have the same short-circuiting behavior.

Syntactic analysis simplifications: 3 > a < 1 could be simplified as a < 1? (a required to be less than 3 and less than 1 is satisfied with just a required to be less than 1.)

Looks like the space for an optimization pass, or an analyzer message. Not really in scope for the language itself.

Warn about impossible conditions:

That seems reasonable. We already do constant analysis and requisite flow control for similar things today:

[image]

So this could similarly evaluate to always being false (if 'a' is an integer of course).

@MgSam

MgSam commented May 29, 2024

C++ is also looking to add this; albeit with disallowing the potentially ambiguous forms like a < b > c. I think if C# adds this the ambiguous forms should also be disallowed; they add no benefit.
