Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Warning if regexp used as LHS of =~ instead of RHS #15622

Open
p5pRT opened this issue Sep 22, 2016 · 6 comments
Open

Warning if regexp used as LHS of =~ instead of RHS #15622

p5pRT opened this issue Sep 22, 2016 · 6 comments

Comments

@p5pRT
Copy link

@p5pRT p5pRT commented Sep 22, 2016

Migrated from rt.perl.org#129333 (status was 'open')

Searchable as RT129333$

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Sep 22, 2016

From @epa

Created by @epa

This program contains a mistake​:

  % perl -wE '$re = qr/a/; $s = "abc"; if ($re =~ $s) { say "hi" }'

The match operator is the wrong way round; it should be $s =~ $re.
This is the kind of mistake that can be hard to see even if you
stare at the code for a long time!

If warnings are enabled, and the LHS of the =~ operator is a regular
expression object (ref $foo eq 'Regexp'), and the RHS is not a literal
regexp or a regular expression object, then a warning should be produced.

  use warnings;
  my $re = qr/f/;
  my $re2 = qr/g/;
  my $s = 'foo';

  $s =~ $re; # OK
  $re =~ $s; # warns
  $re =~ /x/; # does not warn since RHS is a literal regexp
  $re =~ $re2; # does not warn since RHS is a regexp object

The second and third cases above *could* warn, but for the purpose of
this bug report I have left them out since I want to narrow down the
warning as much as possible and minimize false positives.

Perl Info

Flags:
    category=core
    severity=wishlist

Site configuration information for perl 5.22.2:

Configured by Red Hat, Inc. at Wed Aug  3 14:06:20 UTC 2016.

Summary of my perl5 (revision 5 version 22 subversion 2) configuration:
   
  Platform:
    osname=linux, osvers=4.6.3-300.fc24.x86_64, archname=x86_64-linux-thread-multi
    uname='linux buildvm-08.phx2.fedoraproject.org 4.6.3-300.fc24.x86_64 #1 smp fri jun 24 20:52:41 utc 2016 x86_64 x86_64 x86_64 gnulinux '
    config_args='-des -Doptimize=none -Dccflags=-O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches  -m64 -mtune=generic -Dldflags=-Wl,-z,relro  -Dccdlflags=-Wl,--enable-new-dtags -Wl,-z,relro  -Dlddlflags=-shared -Wl,-z,relro  -Dshrpdir=/usr/lib64 -DDEBUGGING=-g -Dversion=5.22.2 -Dmyhostname=localhost -Dperladmin=root@localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dprefix=/usr -Dvendorprefix=/usr -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl5 -Dsitearch=/usr/local/lib64/perl5 -Dprivlib=/usr/share/perl5 -Dvendorlib=/usr/share/perl5/vendor_perl -Darchlib=/usr/lib64/perl5 -Dvendorarch=/usr/lib64/perl5/vendor_perl -Darchname=x86_64-linux-thread-multi -Dlibpth=/usr/local/lib64 /lib64 /usr/lib64 -Duseshrplib -Dusethreads -Duseithreads -Dusedtrace=/usr/bin/dtrace -Duselargefiles -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl=n -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dd_gethostent_r_proto -Ud_endhostent_r_proto -Ud_sethostent_r_proto -Ud_endprotoent_r_proto -Ud_setprotoent_r_proto -Ud_endservent_r_proto -Ud_setservent_r_proto -Dscriptdir=/usr/bin -Dusesitecustomize'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=define, usemultiplicity=define
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -fwrapv -fno-strict-aliasing -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='  -g',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -fwrapv -fno-strict-aliasing -I/usr/local/include'
    ccversion='', gccversion='5.3.1 20160406 (Red Hat 5.3.1-6)', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678, doublekind=3
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16, longdblkind=3
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags ='-Wl,-z,relro  -fstack-protector-strong -L/usr/local/lib'
    libpth=/usr/local/lib64 /lib64 /usr/lib64 /usr/local/lib /usr/lib /lib/../lib64 /usr/lib/../lib64 /lib
    libs=-lpthread -lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc -lgdbm_compat
    perllibs=-lpthread -lresolv -lnsl -ldl -lm -lcrypt -lutil -lc
    libc=libc-2.22.so, so=so, useshrplib=true, libperl=libperl.so
    gnulibc_version='2.22'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,--enable-new-dtags -Wl,-z,relro '
    cccdlflags='-fPIC', lddlflags='-shared -Wl,-z,relro  -L/usr/local/lib -fstack-protector-strong'

Locally applied patches:
    Fedora Patch1: Removes date check, Fedora/RHEL specific
    Fedora Patch3: support for libdir64
    Fedora Patch4: use libresolv instead of libbind
    Fedora Patch5: USE_MM_LD_RUN_PATH
    Fedora Patch6: Skip hostname tests, due to builders not being network capable
    Fedora Patch7: Dont run one io test due to random builder failures
    Fedora Patch15: Define SONAME for libperl.so
    Fedora Patch16: Install libperl.so to -Dshrpdir value
    Fedora Patch22: Document Math::BigInt::CalcEmu requires Math::BigInt (CPAN RT#85015)
    Fedora Patch26: Make *DBM_File desctructors thread-safe (RT#61912)
    Fedora Patch27: Make PadlistNAMES() lvalue again (CPAN RT#101063)
    Fedora Patch28: Make magic vtable writable as a work-around for Coro (CPAN RT#101063)
    Fedora Patch29: Fix duplicating PerlIO::encoding when spawning threads (RT#31923)
    Fedora Patch30: Do not let XSLoader load relative paths (CVE-2016-6185)
    Fedora Patch31: Avoid loading optional modules from default . (CVE-2016-1238)
    Fedora Patch200: Link XS modules to libperl.so with EU::CBuilder on Linux
    Fedora Patch201: Link XS modules to libperl.so with EU::MM on Linux


@INC for perl 5.22.2:
    /home/eda/lib64/perl5/
    /usr/local/lib64/perl5
    /usr/local/share/perl5
    /usr/lib64/perl5/vendor_perl
    /usr/share/perl5/vendor_perl
    /usr/lib64/perl5
    /usr/share/perl5


Environment for perl 5.22.2:
    HOME=/home/eda
    LANG=en_GB.UTF-8
    LANGUAGE (unset)
    LC_COLLATE=C
    LC_CTYPE=en_GB.UTF-8
    LC_MESSAGES=en_GB.UTF-8
    LC_MONETARY=en_GB.UTF-8
    LC_NUMERIC=en_GB.UTF-8
    LC_TIME=en_GB.UTF-8
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/home/eda/bin:/home/eda/bin:/usr/local/bin:/usr/bin:/sbin:/usr/sbin:/sbin:/usr/sbin
    PERL5LIB=/home/eda/lib64/perl5/
    PERL_BADLANG (unset)
    SHELL=/bin/bash

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Sep 22, 2016

From @Abigail

On Thu, Sep 22, 2016 at 08​:38​:55AM -0700, Ed Avis wrote​:

This program contains a mistake​:

% perl \-wE '$re = qr/a/; $s = "abc"; if \($re =~ $s\) \{ say "hi" \}'

Who says it's a mistake? I've certainly written code which introspects
patterns by matching a pattern against it.

And I don't always write /$pattern/ on the RHS of a =~ operator.
After all, Perl is there for the programmer; the programmer isn't there
for Perl. Perl knows I mean a pattern, because I write =~, and will
duely treat a string as a pattern.

Abigail

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Sep 22, 2016

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Sep 26, 2016

From @ap

* Abigail <abigail@​abigail.be> [2016-09-22 22​:00]​:

On Thu, Sep 22, 2016 at 08​:38​:55AM -0700, Ed Avis wrote​:

This program contains a mistake​:

% perl \-wE '$re = qr/a/; $s = "abc"; if \($re =~ $s\) \{ say "hi" \}'

Who says it's a mistake? I've certainly written code which introspects
patterns by matching a pattern against it.

I’ve done that, but extremely rarely. I wonder if warning the user when
they use a qr object on the LHS might be worthwhile. You can always and
easily stringify it if that’s what you really want.

And I don't always write /$pattern/ on the RHS of a =~ operator.

strict.pm and warnings.pm themselves do that. :^) Courtesy of yours
truly. :^P (Because constant-folding.)

Not gonna fly.

Regards,
--
Aristotle Pagaltzis // <http​://plasmasturm.org/>

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Sep 26, 2016

From @ap

* Aristotle Pagaltzis <pagaltzis@​gmx.de> [2016-09-26 06​:48]​:

I wonder if warning the user when they use a qr object on the LHS
might be worthwhile. You can always and easily stringify it if that’s
what you really want.

No.

Any time a value crosses an authority boundary, like when a value is
passed from user code to a CPAN module, the code on the other side of
the boundary would have to defensively stringify the LHS in every last
pattern match against that value, or else go to the effort to propagate
the warning back to the boundary point.

Not gonna fly.

Not as part of perl proper. However​:

  package Regexp { use overload q[""] => sub { die } }

Looks like that does what you’d think it’ll do. And returning $_[0]
doesn’t seem to cause recursion, so getting a warning out of it is
just as easy.

So anyone who wants as much can already have it.

Regards,
--
Aristotle Pagaltzis // <http​://plasmasturm.org/>

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Sep 26, 2016

From @epa

Aristotle Pagaltzis wrote​:

Any time a value crosses an authority boundary, like when a value is
passed from user code to a CPAN module, the code on the other side of
the boundary would have to defensively stringify the LHS in every last
pattern match against that value,

This is true. But the proposal was to warn only if the LHS of =~ is a regexp and the RHS is not.
Take the following code

  sub match_and_report {
  my ($string, $regexp, $message) = @​_;
  if ($string =~ $regexp) { say "it matches, $message" }
  }

Now, if this code is called the wrong way round, transposing the string and regexp arguments, isn't it reasonable for it to warn?
Even if the warning is seen by the programmer as coming from inside the bowels of match_and_report, rather than being a nicely packaged 'carp' of the caller's mistake.
Even though the mistake is by the caller, and there is no bug in match_and_report itself, the warning is still helpful in practice to the programmer.

This is surely analogous to the warnings on uninitialized values. Every time a value crosses into a module's code, it 'would have to defensively check against undef'.
But in fact perl often gives 'Use of uninitialized value' coming from inside library code. The programmer then can peek inside the library to work out what's wrong.
High-quality libraries will often have defensive usage checks for the most common errors, but it's unlikely to be entirely free of warnings in all cases where the caller might pass undef.
Again, it's not a perfect diagnostic tool, since the line number reported in the warning is a long way off where the undef was passed, but nobody suggests perl shouldn't warn on undef values at run time for this reason.

I suggest that using =~ 'the wrong way round' is if anything more likely to be a mistake than doing something like undef()+5, and is at least as worthy of a runtime warning, bearing in mind that warnings can be turned on and off.

--
Ed Avis <eda@​waniasset.com>

This email is intended only for the person to whom it is addressed and may contain confidential information. Any retransmission, copying, disclosure or other use of, this information by persons other than the intended recipient is prohibited. If you received this email in error, please contact the sender and delete the material. This email is for information only and is not intended as an offer or solicitation for the purchase or sale of any financial instrument. Wadhwani Asset Management LLP is a Limited Liability Partnership registered in England (OC303168) with registered office at 9th Floor Orion House, 5 Upper St Martin’s Lane, London, WC2H 9EA. It is authorised and regulated by the Financial Conduct Authority.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
1 participant