Chained comparison evaluates tied scalar twice #17692

briandfoy · 2020-04-04T08:31:13Z

Description

The perlop docs describe the current behavior, so this doesn't contradict anything.

A chained comparison in 5.31.10 appears to evaluate the middle condition twice for a tied scalar but doesn't warn about it. Since a programmer might not realize they are using a tied variable, they might see odd behavior.

Steps to Reproduce

use v5.31.10;

package UnstableScalar {
	use parent qw(Tie::Scalar);
	sub TIESCALAR { bless {}, $_[0] }
	sub FETCH {
		my $value = int rand 10;
		say "Fetching scalar with $value";
		return $value;
		}
	}

tie my $unstable, 'UnstableScalar';

if( 5 < $unstable < 9 ) {
	say "Found a value between 5 and 9";
	}

The comparison short circuits just fine, but if the first comparison is true, the tied scalar is evaluated again (so things such as Tie::Cycle will unexpectedly progress or skip values):

$ perl5.31.10 ~/Desktop/test.pl
Fetching scalar with 2

$ perl5.31.10 ~/Desktop/test.pl
Fetching scalar with 0

$ perl5.31.10 ~/Desktop/test.pl
Fetching scalar with 9
Fetching scalar with 2
Found a value between 5 and 9

$ perl5.31.10 ~/Desktop/test.pl
Fetching scalar with 6
Fetching scalar with 9

Expected behavior

I expect that the scalar would only call FETCH once and reuse the result in the second comparison. perlop says this may happen, but perhaps a warning would be useful here. Ideally, tied variables would be evaluated once.

Is there a chance that the internals could recognize this and use the tied object instead of the scalar itself? This works as expected:

	if( 5 < tied($unstable)->FETCH < 9 ) {
		say "Found a value between 5 and 9";
		}

Perl configuration

Summary of my perl5 (revision 5 version 31 subversion 10) configuration:

  Platform:
	osname=darwin
	osvers=19.3.0
	archname=darwin-2level
	uname='darwin otter.local 19.3.0 darwin kernel version 19.3.0: thu jan 9 20:58:23 pst 2020; root:xnu-6153.81.5~1release_x86_64 x86_64 '
	config_args='-des -Dprefix=/usr/local/perls/perl-5.31.10 -Dusedevel'
	hint=recommended
	useposix=true
	d_sigaction=define
	useithreads=undef
	usemultiplicity=undef
	use64bitint=define
	use64bitall=define
	uselongdouble=undef
	usemymalloc=n
	default_inc_excludes_dot=define
	bincompat5005=undef
  Compiler:
	cc='cc'
	ccflags ='-fno-common -DPERL_DARWIN -mmacosx-version-min=10.15 -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -DPERL_USE_SAFE_PUTENV'
	optimize='-O3'
	cppflags='-fno-common -DPERL_DARWIN -mmacosx-version-min=10.15 -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
	ccversion=''
	gccversion='4.2.1 Compatible Apple LLVM 11.0.0 (clang-1100.0.33.17)'
	gccosandvers=''
	intsize=4
	longsize=8
	ptrsize=8
	doublesize=8
	byteorder=12345678
	doublekind=3
	d_longlong=define
	longlongsize=8
	d_longdbl=define
	longdblsize=16
	longdblkind=3
	ivtype='long'
	ivsize=8
	nvtype='double'
	nvsize=8
	Off_t='off_t'
	lseeksize=8
	alignbytes=8
	prototype=define
  Linker and Libraries:
	ld='cc'
	ldflags =' -mmacosx-version-min=10.15 -fstack-protector-strong -L/usr/local/lib'
	libpth=/usr/local/lib /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/11.0.0/lib /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/lib /usr/lib
	libs=-lpthread -ldbm -ldl -lm -lutil -lc
	perllibs=-lpthread -ldl -lm -lutil -lc
	libc=
	so=dylib
	useshrplib=false
	libperl=libperl.a
	gnulibc_version=''
  Dynamic Linking:
	dlsrc=dl_dlopen.xs
	dlext=bundle
	d_dlsymun=undef
	ccdlflags=' '
	cccdlflags=' '
	lddlflags=' -mmacosx-version-min=10.15 -bundle -undefined dynamic_lookup -L/usr/local/lib -fstack-protector-strong'


Characteristics of this binary (from libperl): 
  Compile-time options:
	HAS_TIMES
	PERLIO_LAYERS
	PERL_COPY_ON_WRITE
	PERL_DONT_CREATE_GVSV
	PERL_MALLOC_WRAP
	PERL_OP_PARENT
	PERL_PRESERVE_IVUV
	PERL_USE_DEVEL
	PERL_USE_SAFE_PUTENV
	USE_64_BIT_ALL
	USE_64_BIT_INT
	USE_LARGE_FILES
	USE_LOCALE
	USE_LOCALE_COLLATE
	USE_LOCALE_CTYPE
	USE_LOCALE_NUMERIC
	USE_LOCALE_TIME
	USE_PERLIO
	USE_PERL_ATOF
  Built under darwin
  Compiled at Mar 27 2020 01:45:56
  %ENV:
	PERL="/Users/brian/bin/perls/perl-latest"
	PERL5_PATH="/Users/brian/bin/perls"
	PERL6_PATH="/Users/brian/bin/perl6s:/Applications/Rakudo/bin:/Applications/Rakudo/share/perl6/site/bin"
	PERLDOTCOM_AUTHOR="brian d foy"
  @INC:
	/usr/local/perls/perl-5.31.10/lib/site_perl/5.31.10/darwin-2level
	/usr/local/perls/perl-5.31.10/lib/site_perl/5.31.10
	/usr/local/perls/perl-5.31.10/lib/5.31.10/darwin-2level
	/usr/local/perls/perl-5.31.10/lib/5.31.10

The text was updated successfully, but these errors were encountered:

xsawyerx · 2020-04-05T06:34:06Z

I consider a single FETCH to be correct user behavior.

karenetheridge · 2020-04-06T18:27:37Z

should this be a 5.32 blocker?

jkeenan · 2020-04-06T18:57:57Z

should this be a 5.32 blocker?

This is already on the "Milestone 5.32.0" list -- which I take to be the 'blocker' list for 5.32.

toddr · 2020-04-08T07:26:16Z

Relevant mailing list thread: https://www.nntp.perl.org/group/perl.perl5.porters/2020/04/msg257322.html

xsawyerx · 2020-04-09T11:55:01Z

I'm going to revise my position. The semantics do dictate the FETCH be done twice. It's only correct. However, user expectation may fall short of that.

Instead, I am in favor of a documentation update as @davidnicol suggested.

briandfoy · 2020-04-09T12:23:19Z

David Nicol suggested for the tied case that you capture the variable:

my $x = ...
if( 1 < $x < 10 ) { ... }

But, that assumes you know that something is tied, and the point of tie is that you don't have to know that. It's a feature where it takes more time to explain the edge case than the common case, and historically that sort of wart has been a pain point for Perl adoption. To effectively guard against the edge case, you are now checking the input for every case, which means it's easier to not use this feature.

I'm not sure which semantics dictate what. The mailing list thread was all implementation details, and specifically not language details. The language can decide what it wants to do with a feature, although I take it the implementation is harder because this isn't tokenized differently and is actually just shoehorning it into old syntax.

khwilliamson · 2020-04-16T15:34:07Z

@xenu submitted a comment that didn't make it into this ticket. The operation of this new feature is not as clear-cut as people previously thought. Shouldn't we make it experimental in 5.32? Since I think all new features should be experimental, based on my previous school of hard knocks, I agree with him.

jkeenan · 2020-04-16T15:51:26Z

On 4/16/20 11:34 AM, Karl Williamson wrote: @xenu <https://github.com/xenu> submitted a comment that didn't make it into this ticket. The operation of this new feature is not as clear-cut as people previously thought. Shouldn't we make it experimental in 5.32? Since I think all new features should be experimental, based on my previous school of hard knocks, I agree with him.

I concur with Karl here; make it experimental.

demerphq · 2020-04-16T17:32:27Z

It is documented that it does this. Yves

…

On Thu, 9 Apr 2020 at 13:55, Sawyer X ***@***.***> wrote: I'm going to revise my position. The semantics do dictate the FETCH be done twice. It's only correct. However, user expectation may fall short of that. Instead, I am in favor of a documentation update as @davidnicol <https://github.com/davidnicol> suggested. — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#17692 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAAZ5R7UYUHG7CXOBVFA3LDRLWZSLANCNFSM4L5U4YIQ> .

-- perl -Mre=debug -e "/just|another|perl|hacker/"

demerphq · 2020-04-16T17:33:04Z

This how it is documented to behave, why should this be a bug? Yves

…

On Sat, 4 Apr 2020 at 10:31, brian d foy ***@***.***> wrote: *Description* A chained comparison in 5.31.10 appears to evaluate the middle condition twice for a tied scalar. *Steps to Reproduce* use v5.31.10; package UnstableScalar { use parent qw(Tie::Scalar); sub TIESCALAR { bless {}, $_[0] } sub FETCH { my $value = int rand 10; say "Fetching scalar with $value"; return $value; } } tie my $unstable, 'UnstableScalar'; say $unstable foreach ( 0 .. 2 ); say "Comparing----"; if( 5 < $unstable < 9 ) { say "Found a value between 5 and 9"; } The comparison short circuits just fine, but if the first comparison is true, the tied scalar is evaluated again (so things such as Tie::Cycle will unexpectedly progress or skip values): $ perl5.31.10 ~/Desktop/test.pl Fetching scalar with 2 $ perl5.31.10 ~/Desktop/test.pl Fetching scalar with 0 $ perl5.31.10 ~/Desktop/test.pl Fetching scalar with 9 Fetching scalar with 2 Found a value between 5 and 9 $ perl5.31.10 ~/Desktop/test.pl Fetching scalar with 6 Fetching scalar with 9 *Expected behavior* I expect that the scalar would only call FETCH once and reuse the result in the second comparison. *Perl configuration* Summary of my perl5 (revision 5 version 31 subversion 10) configuration: Platform: osname=darwin osvers=19.3.0 archname=darwin-2level uname='darwin otter.local 19.3.0 darwin kernel version 19.3.0: thu jan 9 20:58:23 pst 2020; root:xnu-6153.81.5~1release_x86_64 x86_64 ' config_args='-des -Dprefix=/usr/local/perls/perl-5.31.10 -Dusedevel' hint=recommended useposix=true d_sigaction=define useithreads=undef usemultiplicity=undef use64bitint=define use64bitall=define uselongdouble=undef usemymalloc=n default_inc_excludes_dot=define bincompat5005=undef Compiler: cc='cc' ccflags ='-fno-common -DPERL_DARWIN -mmacosx-version-min=10.15 -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -DPERL_USE_SAFE_PUTENV' optimize='-O3' cppflags='-fno-common -DPERL_DARWIN -mmacosx-version-min=10.15 -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include' ccversion='' gccversion='4.2.1 Compatible Apple LLVM 11.0.0 (clang-1100.0.33.17)' gccosandvers='' intsize=4 longsize=8 ptrsize=8 doublesize=8 byteorder=12345678 doublekind=3 d_longlong=define longlongsize=8 d_longdbl=define longdblsize=16 longdblkind=3 ivtype='long' ivsize=8 nvtype='double' nvsize=8 Off_t='off_t' lseeksize=8 alignbytes=8 prototype=define Linker and Libraries: ld='cc' ldflags =' -mmacosx-version-min=10.15 -fstack-protector-strong -L/usr/local/lib' libpth=/usr/local/lib /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/11.0.0/lib /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/lib /usr/lib libs=-lpthread -ldbm -ldl -lm -lutil -lc perllibs=-lpthread -ldl -lm -lutil -lc libc= so=dylib useshrplib=false libperl=libperl.a gnulibc_version='' Dynamic Linking: dlsrc=dl_dlopen.xs dlext=bundle d_dlsymun=undef ccdlflags=' ' cccdlflags=' ' lddlflags=' -mmacosx-version-min=10.15 -bundle -undefined dynamic_lookup -L/usr/local/lib -fstack-protector-strong' Characteristics of this binary (from libperl): Compile-time options: HAS_TIMES PERLIO_LAYERS PERL_COPY_ON_WRITE PERL_DONT_CREATE_GVSV PERL_MALLOC_WRAP PERL_OP_PARENT PERL_PRESERVE_IVUV PERL_USE_DEVEL PERL_USE_SAFE_PUTENV USE_64_BIT_ALL USE_64_BIT_INT USE_LARGE_FILES USE_LOCALE USE_LOCALE_COLLATE USE_LOCALE_CTYPE USE_LOCALE_NUMERIC USE_LOCALE_TIME USE_PERLIO USE_PERL_ATOF Built under darwin Compiled at Mar 27 2020 01:45:56 %ENV: PERL="/Users/brian/bin/perls/perl-latest" PERL5_PATH="/Users/brian/bin/perls" PERL6_PATH="/Users/brian/bin/perl6s:/Applications/Rakudo/bin:/Applications/Rakudo/share/perl6/site/bin" PERLDOTCOM_AUTHOR="brian d foy" @inc: /usr/local/perls/perl-5.31.10/lib/site_perl/5.31.10/darwin-2level /usr/local/perls/perl-5.31.10/lib/site_perl/5.31.10 /usr/local/perls/perl-5.31.10/lib/5.31.10/darwin-2level /usr/local/perls/perl-5.31.10/lib/5.31.10 — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#17692>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAAZ5R3YT56CGT33UJ4HXUDRK3V57ANCNFSM4L5U4YIQ> .

-- perl -Mre=debug -e "/just|another|perl|hacker/"

demerphq · 2020-04-16T17:33:58Z

The original documentation specified this. Yves

…

On Thu, 9 Apr 2020 at 13:55, Sawyer X ***@***.***> wrote: I'm going to revise my position. The semantics do dictate the FETCH be done twice. It's only correct. However, user expectation may fall short of that. Instead, I am in favor of a documentation update as @davidnicol <https://github.com/davidnicol> suggested. — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#17692 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAAZ5R7UYUHG7CXOBVFA3LDRLWZSLANCNFSM4L5U4YIQ> .

-- perl -Mre=debug -e "/just|another|perl|hacker/"

khwilliamson · 2020-04-17T22:03:27Z

I learned in my $work career that when the person charged with creating user documentation of my creations was troubled by some detail of how it worked, that meant it failed the DWIM test. And I was almost certainly wrong. (Sometimes I worked with requirements, and sometimes was given free reign, but usually the requirements failed to address the level of details involved.) Usually I was wrong because I had had my head down in the coding details and could not see the forest for the trees. I learned that by treating the documenter/coder-designer relationship as a collaboration that a better product resulted. (That was somewhat unusual in that company, as the writers were paid a lot less and considered lower class citizens by many.)

I think we owe great deference to brian's opinions. He has proven himself adept at explaining perl to a wide audience. Certainly dismissing them is done at our peril. Hedging our bets by making the feature experimental for the first release is the only prudent course of action.

I now can't remember a case where the documenter was wrong and I was right. Perhaps there were some where further explanation convinced them that it needed to operate the way I had designed it. I of course do remember the times when I was right but was overruled by management, not involving the documenter. (For example, I remember feeling satisfaction when we got sued by a famous singer who missed getting to a performance because he expected the feature to work as I originally had implemented it. They never told me the result of that lawsuit)

toddr · 2020-04-18T03:50:19Z

I wonder if we could warn when tied variables are used in chained comparisons? If so I wonder how often that would be more prophetic than annoying? We'd only need to do that for a middle one.

Leont · 2020-04-18T10:25:34Z

TBH I really don't see the problem. 1 < $x < 2 calls fetch magic exactly as many times as 1 < $x && $x < 2. The number of times fetch is called is not dependent on the number of times that variable is mentioned (there are plenty of exceptions to that), it's dependent on the number of operations on it.

Both behaviors are reasonable, and one's expectations depend on wether you see chained comparison as one or two operations.

Leont · 2020-04-18T11:21:43Z

I wonder if we could warn when tied variables are used in chained comparisons? If so I wonder how often that would be more prophetic than annoying? We'd only need to do that for a middle one.

That doesn't sound like a good idea. 99% of tied fetches are idempotent so this wouldn't be a problem anyway.

Quite frankly, this whole discussion feels rather theoretical. The combination of circumstances to run into this seem incredibly unlikely. Ties are uncommon, non-idempotent ties are really uncommon and accidentally dealing with such a non-idempotent ties seems even unlikelier (given that passing it as an argument does exactly the assigning to a variable that solves this).

I don't think this is a real-world problem.

briandfoy · 2020-04-18T14:16:04Z

I started thinking about this because the confusion was one of the first reactions in Reddit to my post of the new feature. I didn't think of the tie at first because it's rare and uncommon. But, that's not the issue. It's about what people expect and what they get. None of these people care about what the implementor has to do to satisfy the expectation.

Here's roughly how regular people are going to react to this feature:

Very cool!
Oh, it expands to two comparisons though—weird.
Oh, but it doesn't call subroutines or methods twice. Okay, cool again because that's the same as Python.
Oh, but if the subroutine or method returns a tied thingy or an lvalue, the value could be accessed twice.
Okay, so don't do that.

This is quite a number of curveballs for one feature. It would be conceptually much better if it did call subroutines or methods twice, but it doesn't because the implementation leaks through a little. A persons knowledge of the chaining logical operators interferes with their understanding of this chaining operator. That they aren't homologues and can't be doesn't matter to the person who's already read more than they intended..

The best feature would stop at the "Very cool!" step. In this case, there are two more conceptual levels the reader must journey through, and then an extra bonus confusion. Even if they never encounter a tie in their life, they know that the edges cases for this feature took longer to explain than the feature itself. They don't want to think about how rare or improbable this is, but if they read far enough (unlikely, which is a problem), they discover this feature isn't what they thought it was.

Documenting things (this or something else) tends to not to be helpful as we think it is. Virtually no one reads the docs (wouldn't that make life so much easier), and those who do tend to stop when they think they have the answer.

Leont · 2020-04-18T22:14:23Z

Oh, but if the subroutine or method returns a tied thingy or an lvalue, the value could be accessed twice.

That's not an or, but and and. You can't return a tied value from a sub, unless it's an lvalue sub.

Oh, it expands to two comparisons though—weird.

Why do you think this is weird?

briandfoy · 2020-04-18T22:46:08Z

Oh, it expands to two comparisons though—weird.

Why do you think this is weird?

It's not that I think this is weird. It's that I think most people won't think of it that way because that's not how they think of the concept. I've said about all I can say on the subject without repeating myself.

jkeenan · 2020-04-18T22:55:49Z

On 4/18/20 10:16 AM, brian d foy wrote: I started thinking about this because the confusion was one of the first reactions in Reddit to my post of the new feature <https://www.reddit.com/r/perl/comments/fu3auk/chain_comparisons_to_avoid_excessive_typing/fmb29k0/>. I didn't think of the |tie| at first because it's rare and uncommon. But, that's not the issue. It's about what people expect and what they get. None of these people care about what the implementor has to do to satisfy the expectation. Here's roughly how regular people are going to react to this feature: * Very cool! * Oh, it expands to two comparisons though—weird. * Oh, but it doesn't call subroutines or methods twice. Okay, cool again because that's the same as Python. * Oh, but if the subroutine or method returns a tied thingy or an lvalue, the value could be accessed twice. * Okay, so don't do that. This is quite a number of curveballs for one feature. It would be conceptually much better if it did call subroutines or methods twice, but it doesn't because the implementation leaks through a little. A persons knowledge of the chaining logical operators interferes with their understanding of this chaining operator. That they aren't homologues and can't be doesn't matter to the person who's already read more than they intended.. The best feature would stop at the "Very cool!" step. In this case, there are two more conceptual levels the reader must journey through, and then an extra bonus confusion. Even if they never encounter a |tie| in their life, they know that the edges cases for this feature took longer to explain than the feature itself. They don't want to think about how rare or improbable this is, but if they read far enough (unlikely, which is a problem), they discover this feature isn't what they thought it was.

I'm not sure whether brian is arguing for *not* including this new functionality in perl-5.32.0, but I do think it reinforces the argument for introducing it as experimental. Thank you very much. Jim Keenan

vpit · 2020-04-20T22:18:27Z

I think it should call FETCH twice since it should also call overloading twice (is that even implemented? I must admit I haven't followed this in detail). That said, making it experimental can't hurt.

davidnicol · 2020-04-20T23:38:21Z

imaginary complete starting over: Constraints: chained comparisons must all be numeric or all be string. They also have to be in the same order, that is, composed of operators taken from one of these sets: < <= ==

>= ==

lt le eq gt ge eq trying to mix comparators that are not all in the same set would be a parse-time syntax error. When the parser identifies a comparison chain, it recognizes this v0 c1 v1 c2 v2 [ ... cn vn] and does this with it: for i ( 1 .. n ){ when vi is magical or otherwise more dynamic than a plain old scalar, resolve it to either a plain old number or plain old string, depending. apply ci to v(i-1) and vi; when false RETURN FALSE } TRUE this approach would mean comparator overloading (is that a thing? wow) wouldn't work, only number or string overloading. Also, tied stuff would get called once. Short-circuiting would continue to work. What's the delta between what's in place now and the semantics I just described? I think it's just the highlighted bit, and that the loop is fully unrolled.

…

On Mon, Apr 20, 2020 at 5:18 PM Vincent Pit ***@***.***> wrote: I think it should call FETCH twice since it should also call overloading twice (is that even implemented? I must admit I haven't followed this in detail). That said, making it experimental can't hurt. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#17692 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAKCIPPF6TN4X6NT3N74JODRNTC4DANCNFSM4L5U4YIQ> .

-- "Amageddon, enthen armagedfunki."

xsawyerx · 2020-04-26T06:56:25Z

I understand the concerns of parties here, but after reviewing it once again - and then again - I consider the current behavior correct and expected from a consistency perspective.

I understand a person might expect an optimization, but what I think we're discussing is for the optimization to become the proper use. That is, just because we might be able to use the variable only once and "cache" it for the statement doesn't mean we should consider that the syntax.

Instead, the only consistency is understanding how the code should work without the optimization, which is in two steps. The effect of this is that consistency, a FETCH would be called twice.

This reminds me of the @x = sort @x discussion. An author depended on this accidentally not unweakening the references, which was a bug as part of the optimization to make this into an in-place sort. The author that depended on this considered this mishap in behavior (only due to optimization) to be now a requirement of the syntax itself.

(Even when you don't call it an "optimization", the practice of avoiding a double call is - essentially - an optimization.)

dur-randir · 2020-04-26T15:17:33Z

Making this feature experimental and removing that status the next release costs nothing, but it ships as non-experimental and it happens to be not that obvious for users and needs to change (remember smartmatch) - it'd be a much harder way.

xsawyerx · 2020-04-29T21:35:49Z

Once it is experimental, it requires two releases as experimental. This means it will only be available as non-experimental in three versions (that is, three years), assuming it's not changed in any way. Making it experimental is meant to allow us to change it when we're still trying to figure it out.

If we don't consider it changing, requiring people to wait for three releases (three years) to have it does not gain anything.

The fact that it can confuse people doesn't justify either experimental or change in behavior or implementation. Making it run a function twice instead of once would make it incorrect. Caching the value of the FETCH tie magic would make it incorrect.

While it can be confusing, it is correct and should stay put.

davidnicol · 2020-04-29T23:22:56Z

I don't know about the positions of anyone else involved in this discussion but I'm entirely satisfied with suggesting putting double-quotes around tied string values and adding zero to tied numeric values in the documentation of the feature as a workaround to avoid "double-fetch" issues. When there's a big long dereferencing expression, does the expression get evaluated twice or just the resolution of it? print "element $i is in range\n" if lower <= $outer->{$inner}{a}{b}{c}[$i] < upper; or print "element $i is in range\n" if lower <= $outer->{$inner}{a}{b}{c}[$i]+0 < upper; this seems like a less contrived scenario than explicitly tieing something by the coder, given dbx modules and so-on. Which only makes one database call?

…

On Wed, Apr 29, 2020 at 4:36 PM Sawyer X ***@***.***> wrote: While it can be confusing, it is correct and should stay put.

xenu · 2020-05-05T22:07:50Z

FYI: this issue made it to the frontpage of /r/programming: https://www.reddit.com/r/programming/comments/gdxe1p/should_x_foo_y_read_from_foo_once_or_twice_perl/

Of course, most of the commenters probably aren't Perl programmers but I think it's still interesting to hear what people think.

Grinnz · 2020-05-05T22:44:51Z

For completion's sake, here's what happens with two somewhat related situations: https://perl.bot/p/efhuah

Object overloads apply to the referenced object itself and so persist through any passing or copying of references. The numeric overload is called twice if both comparisons occur and can result in different comparisons each time.

Variable magic applies to the variable itself and does not persist through any passing or copying. It also cannot affect the resulting value directly but could have arbitrary side effects. This get magic is also called twice if both comparisons occur.

Grinnz · 2020-05-05T23:29:54Z

Given that, I think it would be easier to discuss this in terms of an object with overloading; it is both significantly more common, and can be the result of a more complex expression.

Grinnz · 2020-05-05T23:37:07Z

The complicating factor there being, of course, it's possible to overload every comparison operator individually.

toddr · 2020-05-06T03:05:08Z

Most of the reddit discussions don't seem to get that perl supports tied variables. They also seem to be quickly forgetting that in any language they smugly consider to be better, the right side isn't evaluated if the left side is false.

After reading reddit and some of the above examples, I think I've come to the conclusion that I would actually be more upset if $x < $y < $z behaved differently than $x < $y && $y < $z

Most of the examples I've seen so far are a little contrived to simplify the explanation. I get that. Can anyone think of a practical example of where this might actually be an issue? So far, the only things I've imagined are some sort of DBIC code or some sort of Math::BigInt object. However, I'm not sure I can imagine what the code would be that would unexpectedly misbehave.

In my day job, I honestly don't use many tied variables that change when accessed multiple times. If I did, I would probably emphasize to the coder that they better make the variable VERY clear and have a darn good reason to do so. i.e. $x > $a_variable_that_will_change_every_time_it_is_checked > $y. I think If I saw this code during review, I would probably ask questions. I would also be asking how many unit tests they had to confirm it worked as expected. No matter how this example is written, I would be concerned with the design of the tie.

briandfoy · 2020-05-06T06:53:33Z

tie is the red herring of this discussion, and if I had a time machine I'd go back to restate the issue. Forget about the tie. I've lost that fight and I'd be surprised if anyone is ever bit by it. But the docs still describe a situation that doesn't reflect reality.

As noted in the reddit thread, people are confused by the statement that the chained comparison is actually broken into two comparisons (a < b && b < c) and that statement is not true if b is a subroutine call. The actual Perl code is more complicated than that and something akin to this pseudocode:

 { temp = b; a < temp && temp < c }

Document it like that much of the problem goes away because there's not an exception for the subroutine case.

I think Python needs the same doc adjustment too.

briandfoy · 2020-05-06T07:03:08Z

$x > $a_variable_that_will_change_every_time_it_is_checked > $y. I think If I saw this code during review, I would probably ask questions.

You wouldn't see it in a code review. You wouldn't even think that it was happening because the person writing that code isn't thinking about tie or overloading. They provide this library that uses this feature for the limited situation they imagined.

Then someone else, far away, uses that library and does something a bit goofy because they have some task where black magic makes sense. It's not rare that people do unexpected and unforeseen things with the interfaces someone provides them.

Many people in the Reddit thread made the same conceptual mistake. They assume that they have control over their entire ecosystem and are responsible for all of the code, and that all code interacting with a feature will come from them. In reality, we know that we don't know how people are going to interact with our CPAN code, how other CPAN code might interact with our code, and so on. These effects can be far away from the code that we actually write. It's not about what we write and what we allow, but what the world writes and what they decide to tolerate.

gugod · 2020-05-06T09:11:17Z

I briefly played 41 < EXPR < 43 with EXPR being tied var, overloaded object, subroutine returning plain scalar, subroutine returning tied var, subroutine returning overladed object, arthmatic expression of those... interesting!

Now that I cannot "un-see" it... the meaning of 41 < EXPR < 43 is clear and unambigious to me but internal side-effect is not. In a sense that this chain is understood as a single operation. It doesn't seem matter that much that sometimes of chained comparison behaves differently than the un-chained version -- as long as the end-result is mathmatically correct.

Although I'm not a committer, I feel like it's a big promise to say that the side-effect must also be consistent with the unchained version and perhaps that will end up give us less resource in the future to iterate on the improvement of chained comparison.

When we see the unchained version like: EXPR_1 < EXPR_2 && EXPR_2 < EXPR_3, we are almost mentally hardwired and seeing: oh, there is a short-circuiting happenning and the second operant (EXPR_2 < EXPR_3) might not be evaluated, neither its side-effect.

Most of languages and their implementations do evaluate the EXPR_1 < EXPR_2 first then, optionally, EXPR_2 < EXPR_3. However, it should be logically equivalent to do the opposite: EXPR_2 < EXPR_3 first, then EXPR_1 < EXPR_2. (If an implementation like this exist, I'd like to know about it.)

And... if there is an language implementation that's not doing short-circuiting -- it is logically correct too. (Although I'll be yelling if my /bin/bash is not doing short-ciuiting on &&.)

To me, maintaining the mathmatical correctness of EXPR_1 < EXPR_2 < EXPR_3 is far more interesteing and useful then maintaining its side-effects. If we find another implementation to short-circuiting N-way chained comparison that is more performant/energy-saving than a traditional && short-circuiting, I don't see why we shouldn't give it a spin. Perhaps at some point we want to re-phrase the definition of this and avoid referrencing to the unchained version. If we fail on optimizing that, we learn something too.

jstaursky · 2020-05-06T09:58:59Z

The fact that it can confuse people doesn't justify either experimental or change in behavior or implementation...

While it can be confusing, it is correct and should stay put.

Isn't code that does not meeting user expectation by definition a bug?
Your initial position of agreement with the change is sort of indicative of that fact. Not trying to be inflammatory but I would hope the opinion of a prominent author would carry enough weight to be considered at least experimental. Do not see the harm in doing that.

matthewpersico · 2020-05-06T15:16:10Z

My USD 0.02 is this:
The current syntax for "between" is

$left < $value  and  $value < $right

In the extreme case, all the terms are function calls:

funcleft() < funcval() and funcval() < funcright()

For this case:

You probably do NOT want to call funcval() more than once. That leads to doing:

my $value = funcval();
funcleft() < $value and $value < funcright();

We expect, and don't need, funcright() to evaluate if funcleft() < $value is false.

I hope that the new syntax preserves these semantics. I see the whole point of this new syntax as eliminating the need to do the my $value = funcval(); setup above. Furthermore, I read the current code as:

funcleft() is less than $value and $value is less than funcright()

but i have to work hard to get that right (via my $value = funcval(); ). I read the new syntax as

funcval() is between funcleft() and funcright()

meaning that you "naturally" want to refer to the value once. If I have to worry about double eval of the thing I'm checking, I see no need to use the new syntax. And if you want the double eval, you can drop back to the current syntax with two compares joined by an and.

davidnicol · 2020-05-06T15:26:17Z

# flavor to taste package Acme::is_between; use Exporter 'import'; our @export = qw/is_between/ sub is_between{ my ($middle, $left, $right) = @_ ; return ($left <= $middle and $middle <= $right) if ($left <= $right); return ($left >= $middle and $middle >= $right) if ($left >= $right); return ($left le $middle and $middle le $right) if ($left le $right); return ($left ge $middle and $middle ge $right) if ($left ge $right); 0; } 1;

…

On Wed, May 6, 2020 at 10:16 AM Matthew O. Persico ***@***.***> wrote: funcval() is between funcleft() and funcright() meaning that you "naturally" want to refer to the value *once*. If I have to worry about double eval of the thing I'm checking, I see no need to use the new syntax. And if you *want* the double eval, you can drop back to the current syntax with two compares joined by an and. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#17692 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAKCIPKTMYXSRLS6EEWBOR3RQF5MZANCNFSM4L5U4YIQ> .

-- With great risk comes spectacular failure

Grinnz · 2020-05-06T15:37:21Z

@matthewpersico All of this is already specified to be true in the documentation.

matthewpersico · 2020-05-06T15:43:41Z

@Grinnz - quite frankly, I can't really tell if @briandfoy 's original concerns are being addressed or he code is being left as is from this long and twisty thread.

Update: I missed #17692 (comment)

But tie is still going to be eval'ed multiple times potentially?

nrdvana · 2020-05-06T18:53:02Z

If I see

EXPR1 < EXPR2 < EXPR3 < EXPR4

in a language, I expect it to be semantically equivalent to

tmp1= EXPR1
tmp2= EXPR2
if (tmp1 < tmp2) {
  tmp3= EXPR3
  if (tmp2 < tmp3) {
    tmp4= EXPR4
    if (tmp3 < tmp4) {

because this is the implementation that results in the least processing, and it is actually a pain to write so the syntax shortcut is of the most benefit.

(edit)
In terms of Perl, I would expect the temporaries to be scalars (possibly a ref to an object, but never directly tied scalars) and that all overloaded operators would happen according to the extrapolated code above.

Grinnz · 2020-05-06T18:57:43Z

All of this is already what happens. I suppose we can't do much about it now but anyone who sees this, please stop confusing the issue with the semantics of evaluating the expression only once, because that already the case. The question here is whether each side of the comparison should do a separate fetch if the central value is a tied variable.

briandfoy · 2020-05-06T19:02:21Z

I'm closing the issue. Sawyer made his ruling and the continued discussion hasn't been well-informed.

nrdvana · 2020-05-06T19:06:23Z

Well now I feel like my comment is being misread. If I write

my $tmp1= EXPR1

and EXPR 1 is a tied scalar, that does call fetch at that point and results in a single fetch, as per the original statement in this issue.

Grinnz · 2020-05-06T19:11:28Z

I did not see the part specifically mentioning tied variable semantics, my apologies. To respond to that, the difference here is that there isn't actually a temporary variable involved (as the documentation points out), only the same tied variable on the call stack, used in two operations. This would be a semantic explanation for it, to go with Zefram's technical explanation. Not that I have any opinion on which is the "right" answer.

xsawyerx · 2020-05-06T21:07:16Z

The fact that it can confuse people doesn't justify either experimental or change in behavior or implementation...
While it can be confusing, it is correct and should stay put.

Isn't code that does not meeting user expectation by definition a bug?

This is a loaded question.

First of all, I didn't say "this code does not meet user expectations." I said, "it can confuse people." Two operative words: can and people, meaning it has the possibility of confusing and not everyone.

Secondly, languages might confuse, but as long as they maintain consistency - on some level - it isn't a bug.

To put it in other terms, Perl could have been consistent here by saying that in $x < EXPR < $y, EXPR will only be run once and values are cached for the entire comparison expression. It could also be consistent by saying that EXPR is always evaluated from scratched. We assume this is a pendulum where it should be either here or there. They are certainly both consistent. However, consistency here is maintained elsewhere. The comparison is done on an SV. A subroutine will need to be executed and the SV it returns is used. Once you understand this compares scalars, it makes sense. (And yes, magic attacks again, but in a consistent and deterministic way.)

Consistency usually breaks perception. Sometimes one user and sometimes the other. This can be frustrating to either user, so it should be documented well. Not just the feature, but in which way is the feature consistent.

The example of @x = sort @x is still a good example of unexpected consistency (to at least some). We had what we considered consistent, but we had to deal with a developer who objected and said the consistency was in what we considered an optimization with a bug.

Your initial position of agreement with the change is sort of indicative of that fact.

It is not. I sometimes err in initial judgment. I consider it a human quality, though I wish I never made mistakes.

Not trying to be inflammatory but I would hope the opinion of a prominent author would carry enough weight to be considered at least experimental. Do not see the harm in doing that.

No offense was taken. I hope I addressed your comments fairly.

FWIW, all of the authors I know on this thread are prominent in one way or the other. I appreciate and respect each of them and their positions and opinions. This is why this thread had me going back to the code, to the implementation, original thread, and to other core developers to discuss this again and again and again.

I've tried explaining why I disagree with the experimental status in a different comment. It sums up to two reasons that go together:

Experimental status is meant for the case that we're not certain of the syntax or its implementation. This isn't the case.
Given that, experimental status is unlikely to provide a change in behavior and will delay people being able to use this feature for 3 years without turning off the experimental flag.

Feature flags - especially experimental ones - are not free. They come at a price to the user and we should only consider them when apt. In this case, I don't see it.

xsawyerx · 2020-05-06T21:14:15Z

tie is the red herring of this discussion, and if I had a time machine I'd go back to restate the issue. Forget about the tie. I've lost that fight and I'd be surprised if anyone is ever bit by it. But the docs still describe a situation that doesn't reflect reality.

As noted in the reddit thread, people are confused by the statement that the chained comparison is actually broken into two comparisons (a < b && b < c) and that statement is not true if b is a subroutine call. The actual Perl code is more complicated than that and something akin to this pseudocode:
 { temp = b; a < temp && temp < c }
Document it like that much of the problem goes away because there's not an exception for the subroutine case.

This is a fair point. I'll review the docs again tomorrow morning.

Grinnz · 2020-05-06T22:24:27Z

That is already how it's documented. https://perldoc.pl/blead/perlop#Operator-Precedence-and-Associativity

khwilliamson · 2020-05-06T22:37:58Z

On 5/6/20 4:24 PM, Dan Book wrote: That is already how it's documented. https://perldoc.pl/blead/perlop#Operator-Precedence-and-Associativity — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#17692 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAA2DH46DJ6UST2HOZQMH2TRQHPSTANCNFSM4L5U4YIQ>.

I was the one who added the commit for this feature to blead: commit 02b85d3 Author: Zefram <zefram@fysh.org> AuthorDate: Wed Feb 5 07:43:14 2020 +0000 Commit: Karl Williamson <khw@cpan.org> CommitDate: Thu Mar 12 22:34:26 2020 -0600 I did not know at the time that this would turn out to be a contentious feature. But it has. The freeze date for contentious additions was January 20. The PR also dates to after the freeze date. That means that this commit violates our long-standing policy, and doesn't belong in 5.32.

jstaursky · 2020-05-07T04:10:03Z

@xsawyerx apologies on the bluntness of my initial message, it was early and now that I've come back to read it--should have phrased it better. Anyway thanks for addressing the comment despite the initial loaded question.

briandfoy · 2020-05-07T06:57:33Z

That is already how it's documented. https://perldoc.pl/blead/perlop#Operator-Precedence-and-Associativity

No, it's not documented like that. Instead of a general case, it starts off with the common case and presents a rule, then states a slightly less common case that uses a different rule, and then a rare case that doesn't seem to follow either rule. There's a way to unify those.

The documentation starts as being two comparisons in the simple scalar case. That sets the frame for everything that follows. At that point, you have already told people how to think about this feature and it's the foundation for their interpretation of the following paragraphs. After that, it starts to list other cases and other behavior, all of which are framed as modifications to the initial case. Many people (as evidenced by the the reddit thread and this thread), aren't going to keep reading, and even if they did, they won't understand what they are reading or why it matters to their life.

I think if we change the frame by documenting it the same way for all cases, we lose several paragraphs of exceptions and explanations. That is, for all cases, it's like { temp = b; a < temp && temp < c }, even if the particular implementation doesn't match that. Now, the subroutine and scalar case are the same rule, and it's easy to explain the tie because temp is still the tied object. Now the tie and overload cases are much easier to grok.

davidnicol · 2020-05-07T15:25:48Z

I think if we change the frame by documenting it the same way for all cases, we lose several paragraphs of exceptions and explanations. That is, for all cases, it's like { temp = b; a < temp && temp < c }, even if the particular implementation doesn't match that. Now, the subroutine and scalar case are the

A chained comparison a < b < c essentially works like do { temp = b; a < temp && temp < c }. When b is a magical overloaded or tied value, the second fetch magic may be avoided by adding zero, for numeric comparisons, or quoting, for string comparisons, the middle value, so temp will be a plain r-value.

…

-- With great risk comes spectacular failure

tobyink · 2020-05-08T17:05:32Z

If the tied variable is fetched twice, it means this will sometimes be true:

( 6 < $unstable < 3 )

And that seems bad.

davidnicol · 2020-05-08T17:39:56Z

but this will keep the first result

…

( 6 < "$unstable" < 3 )

haarg · 2020-05-08T20:29:17Z

If the tied variable is fetched twice, it means this will sometimes be true:
( 6 < $unstable < 3 )
And that seems bad.

That would be possible with overloads even if tied variables were only fetched once.

briandfoy added the Needs Triage label Apr 4, 2020

jkeenan added affects-5.32 Bug labels Apr 4, 2020

khwilliamson added this to the 5.32.0 milestone Apr 4, 2020

jkeenan removed the Needs Triage label Apr 6, 2020

toddr added the Feature Request label May 6, 2020

briandfoy closed this as completed May 6, 2020

Chained comparison evaluates tied scalar twice #17692

Chained comparison evaluates tied scalar twice #17692

Comments

briandfoy commented Apr 4, 2020 • edited Loading

xsawyerx commented Apr 5, 2020

karenetheridge commented Apr 6, 2020

jkeenan commented Apr 6, 2020

toddr commented Apr 8, 2020

xsawyerx commented Apr 9, 2020

briandfoy commented Apr 9, 2020

khwilliamson commented Apr 16, 2020

jkeenan commented Apr 16, 2020 via email

demerphq commented Apr 16, 2020 via email

demerphq commented Apr 16, 2020 via email

demerphq commented Apr 16, 2020 via email

khwilliamson commented Apr 17, 2020

toddr commented Apr 18, 2020 • edited Loading

Leont commented Apr 18, 2020 • edited Loading

Leont commented Apr 18, 2020 • edited Loading

briandfoy commented Apr 18, 2020 • edited Loading

Leont commented Apr 18, 2020

briandfoy commented Apr 18, 2020

jkeenan commented Apr 18, 2020 via email

vpit commented Apr 20, 2020

davidnicol commented Apr 20, 2020 via email

xsawyerx commented Apr 26, 2020

dur-randir commented Apr 26, 2020

xsawyerx commented Apr 29, 2020

davidnicol commented Apr 29, 2020 via email

xenu commented May 5, 2020

Grinnz commented May 5, 2020

Grinnz commented May 5, 2020

Grinnz commented May 5, 2020

toddr commented May 6, 2020

briandfoy commented May 6, 2020 • edited Loading

briandfoy commented May 6, 2020

gugod commented May 6, 2020

jstaursky commented May 6, 2020 • edited Loading

matthewpersico commented May 6, 2020

davidnicol commented May 6, 2020 via email

Grinnz commented May 6, 2020

matthewpersico commented May 6, 2020

nrdvana commented May 6, 2020 • edited Loading

Grinnz commented May 6, 2020

briandfoy commented May 6, 2020

nrdvana commented May 6, 2020

Grinnz commented May 6, 2020

xsawyerx commented May 6, 2020

xsawyerx commented May 6, 2020

Grinnz commented May 6, 2020

khwilliamson commented May 6, 2020 via email

jstaursky commented May 7, 2020

briandfoy commented May 7, 2020

davidnicol commented May 7, 2020 via email

tobyink commented May 8, 2020

davidnicol commented May 8, 2020 via email

haarg commented May 8, 2020

briandfoy commented Apr 4, 2020 •

edited

Loading

toddr commented Apr 18, 2020 •

edited

Loading

Leont commented Apr 18, 2020 •

edited

Loading

Leont commented Apr 18, 2020 •

edited

Loading

briandfoy commented Apr 18, 2020 •

edited

Loading

briandfoy commented May 6, 2020 •

edited

Loading

jstaursky commented May 6, 2020 •

edited

Loading

nrdvana commented May 6, 2020 •

edited

Loading