Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disabling string compairson for operators: < <= > >= #109

Merged
merged 1 commit into from
Jul 11, 2020

Conversation

HelgeffegleH
Copy link
Contributor

Split from #107.

Non-numeric operands causes a 'Type mismatch.' error.

Reason: Superseded by StrCompare which doesn't cause a loss of information. Simplifies the behaviour of the operators.
@Lexikos
Copy link
Collaborator

Lexikos commented Aug 18, 2018

From #107

Even being familiar with this type of comparison function, I have to stop and think (and refer to the documentation) every time I use it with < 0 or > 0 (because that isn't often).

Same for me, with emphasis on because that isn't often.

So do we agree that StrCompare </>/<=/>= 0 is not intuitive and must be learned through frequent repetition?

I think your implication is that you would be using it more often, and therefore would learn to recognize it more readily. I do not think it can be trivialized in that way. I do not believe I would be using StrCompare often for any kind of comparison, and by extension I do not believe the average user would become accustomed to this kind of comparison. Even if they did, there would be a period for each new user where it is hard to understand.

That aside, I do not believe that your arguments for removing the operators thus far are sufficient.

@HelgeffegleH
Copy link
Contributor Author

Hello 👋 .

I agree that StrCompare </>/<=/>= 0 is not intuitive. But it is not difficult to use.

I think your implication is that you would be using it more often, and therefore would learn to recognize it more readily.

No, I would use StrCompare as often as I use < >, and that isn't often. I never learn it, I look in the documentation when I use this kind of string compairson functions. I also always look in the documentation on what the sort callback should return, and every time I need to stop and think on how to translate, eg, stra < strb to < > 0, it is the exact same problem, and people manage it just fine.


That aside, the point isn't that StrCompare is better than the operators, although I think it is. The point is that the ability to do string compairson with the operators implies that the, by far most common, use of the operators gets more complicated and error-prone, i.e., numeric compairsons. If only numeric compairson is possible, one never has to look in the documentation on how to use this very common operator, and there is no risk of hard-to-find bugs due to accidently using a non-numeric string in a compairson, a likely real case would be a blank string.

The only reasonable cases for these operators, imo, are to only allow,

  • a) two pure numbers. numeric compairson,

together with,

  • b) two strings, numeric or not, string compairsion,

or only,

  • c) two numeric values, numeric compairson.

Compare to the current state which is,

  • d) pure number and numeric string, numeric compairson,

together with,

  • e) pure number and non-numeric string, string compairson,

plus a) and b), this is by far less intuitive and hard to keep track of than the use of StrCompare.

I'm fine with a) and b) instead of c) (this PR), but as I have understood, you value the ability to use numeric strings as numbers, hence I picked c), which is also consistent with math operators and functions. The StrCompare function is in line with the other string functions, that is, pure numeric input is converted to string. Hence, the changes #108 and #109 covers all possibilities, a-e, with non of the potential problems and better error detection 👍.

I'm not passionate about having it exactly this way, but just that a change is needed. I would be interested in seeing any other suggestions, from anyone.


A few more comments on #107

now you have to store the result of a (more verbose) function call and compare it to zero, instead of comparing A to B. CPU efficiency is improved at the expense of human efficiency.

With the function, reversing the result is as easy as multiply by -1, with the operators, you would also have to store the result or multiply each return value by -1 if you wanted to reverse the result.

For example, compare A_AhkVersion >= "2." with StrCompare(A_AhkVersion, "2.") > 0.

Even if you continue to support string compairson with operators, the better option for the above is, A_AhkVersion.Major >= 2.0, and similar for A_OsVersion.


Cheers ☕️ .

@Lexikos
Copy link
Collaborator

Lexikos commented Aug 18, 2018

But it is not difficult to use.

It is more verbose and less intuitive, and requires referring to the documentation more often. In other words, it is more difficult to write, read and use. "Difficult" is subjective, and there are degrees.

... a likely real case would be a blank string.

That reminds me I have utilized this. In a loop finding the smallest number, (n > min) is true for every numeric n if min is empty. Conversely, "N/A" is always greater than any number. It avoids the need to know and use the minimum or maximum numeric value (which is actually negative or positive infinity, which one can produce with NumPut).

The point is that [...] use of the operators gets more complicated and error-prone

I think you exaggerate the issue, and I disagree with this proposed solution. (But perhaps if we take this to the forum and a majority of other vocal users offer opinions aligned with yours, it will wear down my resistance to the idea.)

I chose the current behaviour because I felt it was intuitive at the time. I would not expect to have to refer to the documentation. Aside from the change regarding two numeric strings (which I've already explained and admitted might be a bad idea), v1 has had this behaviour for a long time, and I do not see it as a significant source of problems.

With the function, reversing the result is as easy as multiply by -1, with the operators, you would also have to store the result or multiply each return value by -1 if you wanted to reverse the result.

I don't understand. If you're not already storing the result, there is no need to store it to reverse it. I was talking about (in what you quoted) simple comparison of two values, giving a boolean result. Reversing the result is even easier than multiplying by -1.

If you are multiplying by -1, it is because you have a three-way result; for that case, it was already clear that using something which gives a three-way result is better than combining multiple operators to get it. But even so, reversing the result means exactly the same amount of additional code either way.

the better option for the above is, A_AhkVersion.Major >= 2.0

Hypothetically, this expression would always be true (and therefore pointless), since A_AhkVersion.Major is an empty string on v1. If there is any one feature that should be backward-compatible, it is A_AhkVersion.

This option implies A_AhkVersion is one of:

  • an object which has a property Major
  • a new type of value, version, which has a property Major
  • a string, where all strings have a Major property

If A_AhkVersion is not a string, it must be implicitly convertible to string. That isn't yet possible for objects. This isn't necessary in the first place, because string comparisons work for A_AhkVersion, and I take care to preserve this. For instance, I chose 2.0-a001 over 2.0.00.00-a001 because "2.0-a001" < "2.0." (and with any suffix on the latter) or even "2.0-a001" < "2.00".

Also, A_AhkVersion.Major >= 2 && A_AhkVersion.Minor >= 1 is somewhat more verbose than A_AhkVersion >= "2.1.". In theory, features like overloaded operators and custom literals can rectify this; but we don't have those luxuries yet. The alternative is a function for version comparison, which we don't need for A_AhkVersion so long as we can use string comparison.

We might even avoid the issue by using a floating-point version number (just major and minor), but that can't take into account alpha/test versions.

@nnnik
Copy link
Contributor

nnnik commented Aug 19, 2018

The fact that "" compares to numbers successfully does not seem like an advantage. How many times would you actually want to compare the number to a string and how many times did that happen just because you had a bug.

For me the latter is far greater and I can even accurately tell the amount of times I did the former: once

Is it really necessary to overload one of the most commonly used operators with a feature that is rarely used and heavily impairs the ability to find bugs related to that operator?

But I also disagree with Helgefs removal suggestion. The feature is already implemented and coded - its fairly easy to use and rather intuitive. I wouldn't want to completely remove it.

I think it might be for the best if we split the operator into 2 parts: one number comparison and one string comparison.

We already use the ~ prefix for regexmatch. It could also be used for this string comparison:

Result := var1 ~< var2

@HelgeffegleH
Copy link
Contributor Author

Hello, I'm not sure the following is containing anything new, it might just be a rephrasing (but maybe a clarification) of what I have already tried to say.

It is more verbose and less intuitive, and requires referring to the documentation more often. In other words, it is more difficult to write, read and use. "Difficult" is subjective, and there are degrees.

StrCompare is easier to use because it is clearly (not verbosely) doing string compairson, the operators are harder to use because of all the issues that have been mentioned. Again, it is not clear what type of compairson they do. For it to be clear you have to have at least one quoted non-numeric string as one of the operands, or use two literal numbers, in which case you know you do numeric compairson, hence their use is impaired with variables. So the operators are the opposite of verbose, which is much worse. Even if you clearly name your variables to indicate their type and/or value, mistakes are not causing exceptions. To be safe with the operators, much more verbose code is needed, eg,

my_less_than(var1, var2) {
	if (type(var1) == 'Integer' || type(var1) == 'Float') && (type(var2) == 'Integer' || type(var2) == 'Float')
		return var1 < var2
	throw
}

That reminds me I have utilized this

You have exemplified one of the only good candidates for built-in variables, constant values which can be resolved at load time, such as A_MinInt, ....

I think you exaggerate the issue

That is clear, it should also be clear that I think you underestimate the issue and exaggerate the usefulness of being able of to do string compairsons with the operators.

Aside from the change regarding two numeric strings (which I've already explained and admitted might be a bad idea), v1 has had this behaviour for a long time, and I do not see it as a significant source of problems.

It was not a bad idea, it solved the need to do 'str' . numeric_strA < 'str' . numeric_strB for string compairson, which I hope we do not revert to, it should be obvious that String(var1) < String(var2) is meant to do string compairson, but it might not do that with the v1 behaviour. Still, it didn't address the unnecessary complexity of the operators' behaviour depending on operand type and value. Only rarely one wants to do var < 'non_numeric_str' which is the only actual clear and reasonable way of using the operators with strings (with the current behaviour).

The greatest reliefs of leaving v1 in favour of v2 has been less of these kinds of quirks and smart behaviour / assumptions to keep track of, the addition of more exceptions, and the distinction between string and number types. My proposed changes are in line with this, imo.

But perhaps if we take this to the forum and a majority of other vocal users offer opinions aligned with yours, it will wear down my resistance to the idea.

The topic was touched on, in a recent thread, no-one agreed with me, one agreed with you. I am not trying to wear down your resistance, I recognise the fact that your view on the language is based on much more experience and knowledge of both implementation and usage, I respect that.


reverse, -1

I probably misunderstood what I quoted.

a string, where all strings have a Major property

So now you need verbosity 😉 ? I meant it to be an object with a Major property (returning a pure number).

For instance, I chose 2.0-a001 over 2.0.00.00-a001 because "2.0-a001" < "2.0." (and with any suffix on the latter) or even "2.0-a001" < "2.00".

It looks messy to me.

A_AhkVersion.Major >= 2 && A_AhkVersion.Minor >= 1

I like it very much 😉 .

string comparisons work for A_AhkVersion, and I take care to preserve this.

That is fine then, but A_OsVersion is still a problem, and probably much more relevant to compair. Being able to do string compairson with A_AhkVersion might only make users expect they can do the same things with the OS variable.

If there is any one feature that should be backward-compatible, it is A_AhkVersion

It needs only be backwards-compatible within ahk >=2.0, imo.


@nnnik, it is not a bad suggestion, I thought of having a separate set of operators too, but couldn't think of any good ones. It could also handle dependency on stringcasesense, where ~<< would be case sensitive compairson, the sacrifice is being less clear on first sight.

It also makes it harder to have compairsons with A_AhkVersion backwards-compatible with v1. I'm not concerned about that though 😄 👍 .

I am biased against the prefix ~, since ~= means != in another language I like, so ~< looks like not less than to me, there is ofc no problem in practice, once one have learned. Maybe @<, @lphabetically less, it looks a bit like a blob though 😄 .


Cheers.

@nnnik
Copy link
Contributor

nnnik commented Aug 19, 2018

I have not seen ~< been used in any way before therefore I dont think we should pay any mind to the collision. Also ~= already has a meaning in AHK. Would you value the meaning of ~= in another language over AHKs own ~=? Do you suggest to rename ~= to @=?

@HelgeffegleH
Copy link
Contributor Author

Hello 👋.

No I do not suggest that. Disregarding my initial associations, the fact that ~ is part of the regex short hand doesn't necessarily make it more suitable as a prefix for the string operators. Since ~ means bitwise-not, the negation association might be natural. But as I said, it is not a problem once learnt, but I wonder, is the ~ an important part of your suggestion? I doubt it is.

Btw, I do not value the meaning of ~= in ahk at all.

Cheers.

@jeeswg
Copy link
Contributor

jeeswg commented Aug 3, 2019

Since I started using:

  • StrCompare (by itself, and in comparator functions for use with Sort)
  • a custom VerCompare(ver1, ver2, depth:=-1, delim:=".") function
  • custom StrMax/StrMin functions

In theory, I wouldn't mind if < <= > >= stopped supporting string comparison, I need them so rarely for that purpose.

Keeping/removing string comparison; I consider both situations below.
(Either way, I think VerCompare functionality is necessary, and should not be used as an argument to maintain string comparison for the operators.)

VerCompare

I'd recommend VerCompare be a built-in function.
AKA VersionCompare/IPCompare/DSLCompare (delimiter-separated list).
Or NumCompare, possibly with a 'V' (version) option.

Relevant code: How to check if an IPv6 address belongs to a network? But for dot-delimited strings, not arrays.

Maintaining string comparison: possibilities

If string comparison were maintained for the operators, we should probably always ignore type, so that we have simple easy-to-remember rules.
(Also, it's confusing to consider whether returning a value via Format or SubStr etc, will force a numeric or string comparison.)

If two items look numeric, they should be compared numerically.
Else (if at least 1 item looked non-numeric), a string comparison should be done, based on A_StringCaseSense.
The types should be ignored.

The finer points are considered here:
numbers/strings in expressions - AutoHotkey Community

Note: re. numeric/string comparison, this issue remains: Bug: StrSplit.

(Since types would be ignored, StrCompare would be used to force a string comparison. Or perhaps StrOp or one of the other ideas below.)

Or if type were to be considered, when doing comparisons:

Int/Float/Num and Str functions could force numeric/string comparisons.
There is no 'Num' function at present.

Removing string comparison: possibilities

StrCompare:
StrCompare is not intuitive, but has a handy rule (when you compare with 0):
(a OP b) = (StrCompare(a, b) OP 0)
e.g. (a > b) = (StrCompare(a, b) > 0)

StrOp:
(a OP b) = StrOp(a, "OP", b)
e.g. (a > b) = StrOp(a, ">", b)
StrOp/NumOp functions, would be useful in complex workhorse functions.
Somewhat similar to BINOP, here, although I imagined XXXOp(arg1, op, arg2) and XXXOp(op, arg1) and XXXOp(arg1, op1, arg2, op2, arg3).
Arithmetic/Rational - Rosetta Code

StrMatch:
Something like this: StrMatch(var1, var2, 0, "<")

[EDIT:] Array functionality:
Return the index(es)/value(s) of the key(s) with the max/min string/numeric value(s).

Operators (not recommended, because the symbols are ugly and too numerous):
String versions of the existing operators, with a leading character:
Perhaps one of:
~< ~> ~<= ~>= ~<< ~>> ~<<= ~>>=
@< @> @<= @>= @<< @>> @<<= @>>=
$< $> $<= $>= $<< $>> $<<= $>>=
There are 32 possible leading chars (chars 33-126 minus [0-9A-Za-z]) to choose from:
!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
For reference:
String comparison - Rosetta Code

Further points

Other potential innovations:

StrEquals:
More intuitive than StrCompare for common comparisons.
More versatile and more easily portable to other languages than the = == != !== operators.
if !StrCompare(a, b, casesen) = if StrEquals(a, b, casesen)
More fully:
if var in abc,def,ghi ;AHK v1
if StrEquals(var, "abc,def,ghi", 0, ",")
if StrEquals(var, "abc,def,ghi", 0, A_Comma) ;A_Comma for added clarity
if StrEquals(haystack, needle string/array, case sen., delim.)
And likewise: StrContains, StrStarts, StrEnds.

Changing/removing A_StringCaseSense (not recommended):
I might consider removing A_StringCaseSense from AutoHotkey.
If so, I suppose 'in' and 'contains' would be case insensitive.
Although, A_StringCaseSense might still be useful, to indicate the mode of case insensitivity (0/Locale).

Redefining operators/operator overloading:
Being able to override binary operators with custom functions.

Modes for operators:
I'd mention, but don't especially want: a mode that warned of non-numeric strings/types being compared.

@Lexikos Lexikos merged commit fb1f58a into AutoHotkey:alpha Jul 11, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants