Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fail to find a match with a regular expression #15158

Closed
p5pRT opened this issue Jan 30, 2016 · 6 comments
Closed

fail to find a match with a regular expression #15158

p5pRT opened this issue Jan 30, 2016 · 6 comments

Comments

@p5pRT
Copy link

p5pRT commented Jan 30, 2016

Migrated from rt.perl.org#127436 (status was 'rejected')

Searchable as RT127436$

@p5pRT
Copy link
Author

p5pRT commented Jan 30, 2016

From spz1st@gmail.com

This is a bug report for perl from spz1st@​gmail.com,
generated with the help of perlbug 1.39 running under perl 5.18.2.


Should the regular expression "\bmr\.\b" and "\bmr.\b" match "mr." in the string
"a mr. cbcmr"?

% perl -e '$str = "a mr. cmrbcmr"; $str =~ s/\bmr.\b/MR/g; print $str, "\n";'
a mr. cmrbcmr
% perl -e '$str = "a mr. cmrbcmr"; $str =~ s/\bmr\.\b/MR/g; print $str, "\n";'
a mr. cmrbcmr
% perl -e '$str = "a mr. cmrbcmr"; $str =~ s/\bmr\./MR/g; print $str, "\n";'
a MR cmrbcmr
% perl -e '$str = "a mr. cmrbcmr"; $str =~ s/\bmr\b/MR/g; print $str, "\n";'
a MR. cmrbcmr
% perl -e '$str = "a mr. cmrbcmr"; $str =~ s/\bmr/MR/g; print $str, "\n";'
a MR. cmrbcmr
% perl -e '$str = "a mr. cmrbcmr"; $str =~ s/mr\b/MR/g; print $str, "\n";'
a MR. cmrbcMR
% perl -e '$str = "a mr. cmrbcmr"; $str =~ s/mr/MR/g; print $str, "\n";'
a MR. cMRbcMR

% perl -v
This is perl 5, version 18, subversion 2 (v5.18.2) built for x86_64-linux-gnu-th
read-multi(with 41 registered patches, see perl -V for more detail)

Copyright 1987-2013, Larry Wall



Flags​:
  category=core
  severity=medium


Site configuration information for perl 5.18.2​:

Configured by Debian Project at Thu Mar 27 18​:28​:21 UTC 2014.

Summary of my perl5 (revision 5 version 18 subversion 2) configuration​:
 
  Platform​:
  osname=linux, osvers=3.2.0-58-generic, archname=x86_64-linux-gnu-thread-multi
  uname='linux brownie 3.2.0-58-generic #88-ubuntu smp tue dec 3 17​:37​:58 utc 2013 x86_64 x86_64 x86_64 gnulinux '
  config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -D_FORTIFY_SOURCE=2 -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -Dldflags= -Wl,-Bsymbolic-functions -Wl,-z,relro -Dlddlflags=-shared -Wl,-Bsymbolic-functions -Wl,-z,relro -Dcccdlflags=-fPIC -Darchname=x86_64-linux-gnu -Dprefix=/usr -Dprivlib=/usr/share/perl/5.18 -Darchlib=/usr/lib/perl/5.18 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.18.2 -Dsitearch=/usr/local/lib/perl/5.18.2 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Duse64bitint -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Ud_ualarm -Uusesfio -Uusenm -Ui_libutil -Uversiononly -DDEBUGGING=-g -Doptimize=-O2 -Duseshrplib -Dlibperl=libperl.so.5.18.2 -des'
  hint=recommended, useposix=true, d_sigaction=define
  useithreads=define, usemultiplicity=define
  useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
  use64bitint=define, use64bitall=define, uselongdouble=undef
  usemymalloc=n, bincompat5005=undef
  Compiler​:
  cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fstack-protector -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
  optimize='-O2 -g',
  cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fstack-protector -fno-strict-aliasing -pipe -I/usr/local/include'
  ccversion='', gccversion='4.8.2', gccosandvers=''
  intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
  d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
  ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
  alignbytes=8, prototype=define
  Linker and Libraries​:
  ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
  libpth=/usr/local/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib /usr/lib
  libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt
  perllibs=-ldl -lm -lpthread -lc -lcrypt
  libc=, so=so, useshrplib=true, libperl=libperl.so.5.18.2
  gnulibc_version='2.19'
  Dynamic Linking​:
  dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
  cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib -fstack-protector'

Locally applied patches​:
  DEBPKG​:debian/cpan_definstalldirs - Provide a sensible INSTALLDIRS default for modules installed from CPAN.
  DEBPKG​:debian/db_file_ver - http​://bugs.debian.org/340047 Remove overly restrictive DB_File version check.
  DEBPKG​:debian/doc_info - Replace generic man(1) instructions with Debian-specific information.
  DEBPKG​:debian/enc2xs_inc - http​://bugs.debian.org/290336 Tweak enc2xs to follow symlinks and ignore missing @​INC directories.
  DEBPKG​:debian/errno_ver - http​://bugs.debian.org/343351 Remove Errno version check due to upgrade problems with long-running processes.
  DEBPKG​:debian/libperl_embed_doc - http​://bugs.debian.org/186778 Note that libperl-dev package is required for embedded linking
  DEBPKG​:fixes/respect_umask - Respect umask during installation
  DEBPKG​:debian/writable_site_dirs - Set umask approproately for site install directories
  DEBPKG​:debian/extutils_set_libperl_path - EU​:MM​: Set location of libperl.a to /usr/lib
  DEBPKG​:debian/no_packlist_perllocal - Don't install .packlist or perllocal.pod for perl or vendor
  DEBPKG​:debian/prefix_changes - Fiddle with *PREFIX and variables written to the makefile
  DEBPKG​:debian/fakeroot - Postpone LD_LIBRARY_PATH evaluation to the binary targets.
  DEBPKG​:debian/instmodsh_doc - Debian policy doesn't install .packlist files for core or vendor.
  DEBPKG​:debian/ld_run_path - Remove standard libs from LD_RUN_PATH as per Debian policy.
  DEBPKG​:debian/libnet_config_path - Set location of libnet.cfg to /etc/perl/Net as /usr may not be writable.
  DEBPKG​:debian/mod_paths - Tweak @​INC ordering for Debian
  DEBPKG​:debian/module_build_man_extensions - http​://bugs.debian.org/479460 Adjust Module​::Build manual page extensions for the Debian Perl policy
  DEBPKG​:debian/prune_libs - http​://bugs.debian.org/128355 Prune the list of libraries wanted to what we actually need.
  DEBPKG​:fixes/net_smtp_docs - [rt.cpan.org #36038] http​://bugs.debian.org/100195 Document the Net​::SMTP 'Port' option
  DEBPKG​:debian/perlivp - http​://bugs.debian.org/510895 Make perlivp skip include directories in /usr/local
  DEBPKG​:debian/cpanplus_definstalldirs - http​://bugs.debian.org/533707 Configure CPANPLUS to use the site directories by default.
  DEBPKG​:debian/cpanplus_config_path - Save local versions of CPANPLUS​::Config​::System into /etc/perl.
  DEBPKG​:debian/deprecate-with-apt - http​://bugs.debian.org/702096 Point users to Debian packages of deprecated core modules
  DEBPKG​:debian/squelch-locale-warnings - http​://bugs.debian.org/508764 Squelch locale warnings in Debian package maintainer scripts
  DEBPKG​:debian/skip-upstream-git-tests - Skip tests specific to the upstream Git repository
  DEBPKG​:debian/patchlevel - http​://bugs.debian.org/567489 List packaged patches for 5.18.2-2ubuntu1 in patchlevel.h
  DEBPKG​:debian/skip-kfreebsd-crash - http​://bugs.debian.org/628493 [perl #96272] Skip a crashing test case in t/op/threads.t on GNU/kFreeBSD
  DEBPKG​:fixes/document_makemaker_ccflags - http​://bugs.debian.org/628522 [rt.cpan.org #68613] Document that CCFLAGS should include $Config{ccflags}
  DEBPKG​:debian/find_html2text - http​://bugs.debian.org/640479 Configure CPAN​::Distribution with correct name of html2text
  DEBPKG​:debian/hurd_test_skip_stack - http​://bugs.debian.org/650175 Disable failing GNU/Hurd tests dist/threads/t/stack.t
  DEBPKG​:fixes/manpage_name_Test-Harness - http​://bugs.debian.org/650451 [rt.cpan.org #73399] cpan/Test-Harness​: add NAME headings in modules with POD
  DEBPKG​:debian/makemaker-pasthru - http​://bugs.debian.org/660195 [rt.cpan.org #28632] Make EU​::MM pass LD through to recursive Makefile.PL invocations
  DEBPKG​:debian/perl5db-x-terminal-emulator.patch - http​://bugs.debian.org/668490 Invoke x-terminal-emulator rather than xterm in perl5db.pl
  DEBPKG​:debian/cpan-missing-site-dirs - http​://bugs.debian.org/688842 Fix CPAN​::FirstTime defaults with nonexisting site dirs if a parent is writable
  DEBPKG​:fixes/memoize_storable_nstore - [rt.cpan.org #77790] http​://bugs.debian.org/587650 Memoize​::Storable​: respect 'nstore' option not respected
  DEBPKG​:fixes/net_ftp_failed_command - [rt.cpan.org #37700] http​://bugs.debian.org/491062 Net​::FTP​: cope gracefully with a failed command
  DEBPKG​:fixes/perlbug-patchlist - [3541c11] http​://bugs.debian.org/710842 [perl #118433] Make perlbug look up the list of local patches at run time
  DEBPKG​:fixes/module_metadata_security_doc - [68cdd4b] CVE-2013-1437 documentation fix
  DEBPKG​:fixes/module_metadata_taint_fix - [bff978f] http​://bugs.debian.org/722210 [rt.cpan.org #88576] untaint version, if needed, in Module​::Metadata
  DEBPKG​:fixes/IPC-SysV-spelling - http​://bugs.debian.org/730558 [rt.cpan.org #86736] Fix spelling of IPC_CREAT in IPC-SysV documentation
  DEBPKG​:fixes/fix-undef-source -


@​INC for perl 5.18.2​:
  /etc/perl
  /usr/local/lib/perl/5.18.2
  /usr/local/share/perl/5.18.2
  /usr/lib/perl5
  /usr/share/perl5
  /usr/lib/perl/5.18
  /usr/share/perl/5.18
  /usr/local/lib/site_perl
  .


Environment for perl 5.18.2​:
  HOME=/home/sysadm
  LANG=en_US.UTF-8
  LANGUAGE (unset)
  LD_LIBRARY_PATH (unset)
  LOGDIR (unset)
  PATH=/usr/local/sbin​:/usr/local/bin​:/usr/sbin​:/usr/bin​:/sbin​:/bin​:/usr/games​:/usr/local/games​:/usr/local/tqs/bin​:/usr/local/3rd​:/home/sysadm/scripts
  PERL_BADLANG (unset)
  SHELL=/bin/bash

@p5pRT
Copy link
Author

p5pRT commented Jan 31, 2016

From @jkeenan

On Sat Jan 30 12​:50​:29 2016, spz1st@​gmail.com wrote​:

This is a bug report for perl from spz1st@​gmail.com,
generated with the help of perlbug 1.39 running under perl 5.18.2.

-----------------------------------------------------------------
Should the regular expression "\bmr\.\b" and "\bmr.\b" match "mr." in
the string
"a mr. cbcmr"?

% perl -e '$str = "a mr. cmrbcmr"; $str =~ s/\bmr.\b/MR/g; print $str,
"\n";'
a mr. cmrbcmr
% perl -e '$str = "a mr. cmrbcmr"; $str =~ s/\bmr\.\b/MR/g; print
$str, "\n";'
a mr. cmrbcmr
% perl -e '$str = "a mr. cmrbcmr"; $str =~ s/\bmr\./MR/g; print $str,
"\n";'
a MR cmrbcmr
% perl -e '$str = "a mr. cmrbcmr"; $str =~ s/\bmr\b/MR/g; print $str,
"\n";'
a MR. cmrbcmr
% perl -e '$str = "a mr. cmrbcmr"; $str =~ s/\bmr/MR/g; print $str,
"\n";'
a MR. cmrbcmr
% perl -e '$str = "a mr. cmrbcmr"; $str =~ s/mr\b/MR/g; print $str,
"\n";'
a MR. cmrbcMR
% perl -e '$str = "a mr. cmrbcmr"; $str =~ s/mr/MR/g; print $str,
"\n";'
a MR. cMRbcMR

You are asking a question about pattern matching, but you're confusing matters by providing examples using substitutions. Let's start by simply focusing on the pattern matching. See first attachment​: 127436-match.pl

Running this program gives this output​:
#####
NO
NO
YES
YES
YES
YES
YES
#####

The reason why there is no pattern matched in the first two examples lies in the way '\b' is precisely defined in 'perldoc perlre'.

#####
A word boundary ("\b") is a spot between two characters that has a "\w" on one side of it and a "\W" on the other side of it (in either order), counting the imaginary characters off the beginning and end of the string as matching a "\W".
#####

The '.' in the string is not a '\w'; it's a \W'. So what you have there is two consecutive '\W' characters -- not a '\w' followed by a '\W'. This can be seen more clearly in the second attached file​: 127436-reduced.pl. This file produces​:

#####
NO
NO

NO
YES
#####

No bug in Perl here. Thank you very much.

--
James E Keenan (jkeenan@​cpan.org)

@p5pRT
Copy link
Author

p5pRT commented Jan 31, 2016

From @jkeenan

127436-match.pl

@p5pRT
Copy link
Author

p5pRT commented Jan 31, 2016

From @jkeenan

127436-reduced.pl

@p5pRT
Copy link
Author

p5pRT commented Jan 31, 2016

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Jan 31, 2016

@jkeenan - Status changed from 'open' to 'rejected'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant