Filter::Simple $pod_or_DATA regex fails #11428

Closed
p5pRT opened this issue Jun 8, 2011 · 13 comments

@p5pRT p5pRT commented Jun 8, 2011

Migrated from rt.perl.org#92436 (status was 'resolved')

Searchable as RT92436$

@p5pRT p5pRT commented Jun 8, 2011

From chm@cpan.org

Created by chm@cpan.org

This is a bug report for perl from chm@cpan.org,
generated with the help of perlbug 1.39 running under perl 5.10.1.

-----------------------------------------------------------------
Using FILTER_ONLY with code_no_comments fails for the following
code because the $pod_or_DATA regex does not correctly capture the
pod sequence. It appears to capture up to the = of the last =cut,
leaving a naked cut behind, which is parsed as an undeclared sub.

my $dims = pdl($var->dims);
($t = $dims->(0)) .= 1;
$rpt = $dims->prod;

=begin WHENCOMPLEXVALUESWORK

if( UNIVERSAL::isa($var,'PDL::Complex') ) {
$rpt = $var->dim(1);
$t = 'complex'
} else {
$t = type $var;
}

=end WHENCOMPLEXVALUESWORK

=cut

barf "Error: wfits() currently can not handle PDL::Complex arrays
(column $colnames[$i])\n"
if UNIVERSAL::isa($var,'PDL::Complex');
$t = $var->type;

$t = $bintable_types{$t};

Changing the following line in the regex from

  | ^=begin \s* (\S+) .*? \n=end \s* \1 .*? $EOP

to

  | ^=begin \s+ (?:\S+) .*? \n=end \s* \1 .*? $EOP =cut \s*? $EOP

appears to fix the problem. I am not sure that the fix
completely handles all edge cases but I believe it is a
step to the correct solution.
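
The behaviour described above can be reproduced in isolation. The following is a minimal sketch, not the Filter::Simple source: it lifts only the `=begin` alternative out of $pod_or_DATA, and it assumes that $EOP is defined as "a blank line or end of input". The sample source text and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Assumed definition of the end-of-paragraph marker (not shown in the report).
my $EOP = qr/\n\n|\Z/;

# The =begin alternative quoted in the report, taken on its own.
my $begin_block = qr/^=begin \s* (\S+) .*? \n=end \s* \1 .*? $EOP/smx;

# A small stand-in for the failing code: a =begin/=end block closed by =cut.
my $source = join "\n",
    'my $x = 1;',
    '',
    '=begin NOTES',
    '',
    'hidden();',
    '',
    '=end NOTES',
    '',
    '=cut',
    '',
    'my $y = 2;',
    '';

# Remove whatever the pattern considers to be POD.
(my $stripped = $source) =~ s/$begin_block//g;

# The match ends at the paragraph break after =end, so the =cut line
# is still present in the text that would be treated as code.
print $stripped;
```

In this sketch the pattern stops at the blank line following `=end NOTES`, so the `=cut` directive survives in what is handed on as code.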

Regards,
Chris

Perl Info

Flags:
     category=library
     severity=high
     module=Filter::Simple

Site configuration information for perl 5.10.1:

Configured by rurban at Sat Aug 28 20:14:06 CEST 2010.

Summary of my perl5 (revision 5 version 10 subversion 1) configuration:

   Platform:
     osname=cygwin, osvers=1.7.5(0.22553), 
archname=i686-cygwin-thread-multi-64int
     uname='cygwin_nt-5.1 reini 1.7.5(0.22553) 2010-04-12 19:07 i686 
cygwin '
     config_args='-de -Dlibperl=cygperl5_10.dll -Dcc=gcc-4 -Dld=g++-4 
-Dmksymlinks -Dusethreads -Dmad=y -Doptimize=-O3 -Accflags=-g3'
     hint=recommended, useposix=true, d_sigaction=define
     useithreads=define, usemultiplicity=define
     useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
     use64bitint=define, use64bitall=undef, uselongdouble=undef
     usemymalloc=y, bincompat5005=undef
   Compiler:
     cc='gcc-4', ccflags ='-DPERL_USE_SAFE_PUTENV -U__STRICT_ANSI__ -g3 
-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include',
     optimize='-O3',
     cppflags='-DPERL_USE_SAFE_PUTENV -U__STRICT_ANSI__ -g3 
-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
     ccversion='', gccversion='4.3.4 20090804 (release) 1', gccosandvers=''
     intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=12345678
     d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
     ivtype='long long', ivsize=8, nvtype='double', nvsize=8, 
Off_t='off_t', lseeksize=8
     alignbytes=8, prototype=define
   Linker and Libraries:
     ld='g++-4', ldflags =' -Wl,--enable-auto-import 
-Wl,--export-all-symbols -Wl,--stack,8388608 
-Wl,--enable-auto-image-base -fstack-protector -L/usr/local/lib'
     libpth=/usr/local/lib /usr/lib /lib
     libs=-lgdbm -ldb -ldl -lcrypt -lgdbm_compat
     perllibs=-ldl -lcrypt
     libc=/usr/lib/libc.a, so=dll, useshrplib=true, libperl=cygperl5_10.dll
     gnulibc_version=''
   Dynamic Linking:
     dlsrc=dl_dlopen.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' '
     cccdlflags=' ', lddlflags=' --shared  -Wl,--enable-auto-import 
-Wl,--export-all-symbols -Wl,--stack,8388608 
-Wl,--enable-auto-image-base -L/usr/local/lib -fstack-protector'

Locally applied patches:
     CYG11 no-bs
     CYG12 no archlib in otherlibdirs
     CYG14 Dynaloader
     CYG15 static-Win32CORE
     CYG17 utf8-paths
     CYG21 LibList-Kid.patch
     CYG22 cygwin-1.7 hints
     CYG23 544-stat
     CYG24 build man pages
     CYG25 rebase_privlib
     Module-Build-0.36_13
     Bug#55162 CYG18 File::Spec::case_tolerant performance
     disable ExtUtils::MakeMaker::Coverage in Sys-Syslog


@INC for perl 5.10.1:
     /home/chm/perl/lib/i686-cygwin-thread-multi-64int
     /home/chm/perl/lib
     /usr/lib/perl5/5.10/i686-cygwin
     /usr/lib/perl5/5.10
     /usr/lib/perl5/site_perl/5.10/i686-cygwin
     /usr/lib/perl5/site_perl/5.10
     /usr/lib/perl5/vendor_perl/5.10/i686-cygwin
     /usr/lib/perl5/vendor_perl/5.10
     /usr/lib/perl5/vendor_perl/5.10
     /usr/lib/perl5/site_perl/5.8
     /usr/lib/perl5/vendor_perl/5.8
     .


Environment for perl 5.10.1:
     HOME=/home/chm
     LANG=C.UTF-8
     LANGUAGE (unset)
     LD_LIBRARY_PATH (unset)
     LOGDIR (unset)
 
PATH=/usr/local/pgplot:/usr/local/pgplot:/home/chm/perl/bin:/usr/lib/qt3/bin:/usr/local/bin:/usr/bin:/bin:/cygdrive/c/WINDOWS/system32:/cygdrive/c/WINDOWS:/cygdrive/c/WINDOWS/System32/Wbem:/cygdrive/c/Program 
Files/DisplayLink Core 
Software/:/cygdrive/c/strawberry/c/bin:/cygdrive/c/strawberry/perl/bin:/cygdrive/c/Program 
Files/QuickTime/QTSystem/:/usr/lib/lapack
     PERL5LIB=/home/chm/perl/lib
     PERLDOC_PAGER=less
     PERL_BADLANG (unset)
     SHELL (unset)


@p5pRT p5pRT commented Jun 14, 2011

From devel.chm.01@gmail.com

I wanted to follow up with a workaround I found for the
current problem. Adding an =pod line before the failing
=begin, =end, =cut section avoided the match against
the =begin/=end pattern and its failure.

One suggestion would be to simplify the $pod_or_DATA
regex for the POD part and only look for sequences of
command lines (=headN, =item, =begin, ...) followed by
an =cut line. It seems that would serve the same
purpose as the existing approach but be simpler to
maintain and debug.
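
A rough sketch of what such a simplified pattern might look like follows. It is an illustration of the suggestion only, not the regex Filter::Simple uses or the fix that was eventually committed; the name $pod_block and the sample text are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical simplified pattern: any POD command paragraph, followed by
# everything up to and including the next =cut line.
my $pod_block = qr/
    ^= [a-zA-Z] \w* \b    # a POD command at the start of a line
    .*?                   # the POD body, matched lazily
    ^=cut \b [^\n]*       # the closing =cut line
    (?: \n | \z )         # its newline, or end of input
/smx;

my $sample = join "\n",
    'my $x = 1;',
    '',
    '=begin NOTES',
    '',
    'hidden();',
    '',
    '=end NOTES',
    '',
    '=cut',
    '',
    'my $y = 2;',
    '';

# Strip the POD and keep only the code.
(my $code_only = $sample) =~ s/$pod_block//g;

print $code_only;
```

Because the pattern does not care which command opens the block, `=begin`/`=end` pairs need no special handling here. One known gap of this sketch: POD that runs to the end of the file without a closing `=cut` is not matched.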

@p5pRT p5pRT commented Jun 14, 2011

The RT System itself - Status changed from 'new' to 'open'

@p5pRT p5pRT commented Jun 14, 2011

From devel.chm.01@gmail.com

The $pod_or_DATA problem actually arose from source filter
development using FILTER_ONLY code_no_comments => ....

Further development showed that for this code, the FILTER_ONLY
code_no_comments fails to hide/fold the qq{ } string. Am I even
correct to assume that the qq string would be hidden or is the
FILTER_ONLY doing what it is supposed to?

$ cat showinput.pm # filter that just prints its input
package showinput;

use Filter::Simple;

FILTER_ONLY
  code_no_comments => sub { print; };

$ cat db-eg.pl # the code with the qq{} that is missed
use showinput;

sub show_nslice {
  my ($table, $yr, $schema) = ('mytable',2011,'myschema');
  $sth = $dbh->prepare(
  qq{
  CREATE TABLE $table ( CHECK ( yr = $yr ))
  INHERITS ($schema.master_table)
  }
  );
}

$ perl -c db-eg.pl

sub show_nslice {
  my ($table, $yr, $schema) = (,2011,);
  $sth = $dbh->prepare(
  qq{
  CREATE TABLE $table ( CHECK ( yr = $yr ))
  INHERITS ($schema.master_table)
  }
  );
}
db-eg.pl syntax OK

We can see in the output that the strings 'mytable' and 'myschema'
have been hidden (which is why they are not visible in the pasted
output). However, the qq{} inside the prepare method call has not
been hidden.

Is this expected?
If so, is there a way to make the qq{} be hidden?

@p5pRT p5pRT commented Sep 10, 2011

From @cpansprout

On Tue Jun 07 18:34:06 2011, chm wrote:

> Using the FILTER_ONLY with code_no_comments fails for the following
> code because the $pod_or_DATA regex does not correctly capture the
> pod sequence. It appears to capture up to the = of the last =cut
> leaving a naked cut behind which is parsed as an undeclared sub.
> [...]
> Changing the following line in the regex from
>
>   | ^=begin \s* (\S+) .*? \n=end \s* \1 .*? $EOP
>
> to
>
>   | ^=begin \s+ (?:\S+) .*? \n=end \s* \1 .*? $EOP =cut \s*? $EOP
>
> appears to fix the problem. I am not sure that the fix
> completely handles all edge cases but I believe it is a
> step to the correct solution.

I’ve fixed it with commit 0b2be16.

(The other issue you reported I have not fixed. I believe it’s a
Text::Balanced bug.)

@p5pRT p5pRT commented Sep 12, 2011

From @cpansprout

On Sat Sep 10 00:09:29 2011, sprout wrote:

> I’ve fixed it with commit 0b2be16.
>
> (The other issue you reported I have not fixed. I believe it’s a
> Text::Balanced bug.)

It is, and it has already been reported at
<https://rt.cpan.org/Public/Bug/Display.html?id=5722>.

I have just confirmed that CPAN is upstream for Text::Balanced, so I’m
marking this ticket as resolved.

@p5pRT p5pRT commented Sep 12, 2011

@cpansprout - Status changed from 'open' to 'resolved'

@p5pRT p5pRT commented Sep 12, 2011

From @cpansprout

On Sun Sep 11 20:21:00 2011, sprout wrote:

> It is, and it has already been reported at
> <https://rt.cpan.org/Public/Bug/Display.html?id=5722>.
>
> I have just confirmed that CPAN is upstream for Text::Balanced, so I’m
> marking this ticket as resolved.

Sorry, I was getting this ticket confused with #92438. This *is* a
Filter::Simple bug, in that it misuses Text::Balanced’s extract_variable
function, which is not appropriate here.
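
To illustrate why extract_variable is a poor fit for this job, here is a small sketch using only Text::Balanced's documented functions. extract_variable is documented to extract "variable-involved expressions", including method calls through objects, so a call such as `$dbh->prepare(...)` can be consumed as one unit together with its arguments. The statement below is a shortened, made-up version of the one in db-eg.pl, and the sketch does not reproduce how Filter::Simple itself calls these functions.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_variable extract_quotelike);

# Shortened stand-in for the statement in db-eg.pl.
my $stmt = '$dbh->prepare( qq{ CREATE TABLE $table ( CHECK ( yr = $yr )) } );';

# List context: (extracted text, remainder, skipped prefix).
# The method call and its argument list are part of the extracted text.
my ($var_text, $var_rest) = extract_variable($stmt);

# Starting at the quote itself, extract_quotelike recognises the qq{} string.
my $quote_src = 'qq{ CREATE TABLE $table ( CHECK ( yr = $yr )) } );';
my ($quote_text, $quote_rest) = extract_quotelike($quote_src);

printf "variable : %s\n", $var_text   // '(no match)';
printf "quotelike: %s\n", $quote_text // '(no match)';
```

If the qq{} string ends up inside the text matched as a "variable", a component-wise filter has no chance to see it as a separate quotelike, which would be consistent with the output shown earlier in this ticket.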

@p5pRT p5pRT commented Sep 12, 2011

@cpansprout - Status changed from 'resolved' to 'open'

@p5pRT p5pRT commented Sep 14, 2011

From @cpansprout

On Sun Sep 11 20:43:21 2011, sprout wrote:

> Sorry, I was getting this ticket confused with #92438. This *is* a
> Filter::Simple bug, in that it misuses Text::Balanced’s extract_variable
> function, which is not appropriate here.

I’ve now fixed it with commit 828d619.

@p5pRT p5pRT commented Sep 14, 2011

@cpansprout - Status changed from 'open' to 'resolved'
