Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minor bug fix and enhancement to to pp.c:pp_pack() #1886

Closed
p5pRT opened this issue Apr 25, 2000 · 9 comments
Closed

Minor bug fix and enhancement to to pp.c:pp_pack() #1886

p5pRT opened this issue Apr 25, 2000 · 9 comments

Comments

@p5pRT
Copy link

@p5pRT p5pRT commented Apr 25, 2000

Migrated from rt.perl.org#3154 (status was 'resolved')

Searchable as RT3154$

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Apr 25, 2000

From johnno@ms.com

Created by root@pland.cwp

This message contains a small set of patches to pp.c​:pp_pack/unpack().
This slightly rounds out the new /' modifier to support XDR
format more fully.

Included is a bug fix to the decode of unpack 'N/Z*' which
was not adding a byte to the encoded length as pack does.

There is also option to use 'N/a*4' to round the length of encoded
strings to four bytes as required by SUN XDR RFC 1832.

More contentiously perhaps the code also supports a new length
character '?' which steals the length from the next argument.
This allows multiple arrays to be encoded and decoded viz.​:

my @​arr = (1,2,3,4);
my $packed = pack "N/N?N/N?", scalar @​arr, @​arr, scalar @​arr, @​arr;

Using these basic changes it is possible to decode/encode SUN RPC calls,
indeed I have tested this code fairly extensively with an NFS server
implemented entirely in perl (less strange a thing to do than it sounds).

Included are patches to t/op/pack.t and pod/perlfunc.pod regression
testing the changes and documenting them.

I hope you find this interesting. I've tried to keep the changes
to the absolute minimum to try to perhaps slip under the 5.6.1 wire.

Regards,

John Holdsworth
Coldwave Programs
Ltd.

Patch file to perl-5.6.0 follows...

*** t/op/pack.t Mon Mar 13 21​:25​:37 2000
--- t/op/pack.new Wed Apr 19 23​:35​:31 2000
***************
*** 6,12 ****
  require Config; import Config;
  }
 
! print "1..156\n";
 
  $format = "c2 x5 C C x s d i l a6";
  # Need the expression in here to force ary[5] to be numeric. This avoids
--- 6,12 ----
  require Config; import Config;
  }
 
! print "1..160\n";
 
  $format = "c2 x5 C C x s d i l a6";
  # Need the expression in here to force ary[5] to be numeric. This avoids
***************
*** 405,407 ****
--- 405,427 ----
  w/A* # Count a BER integer
  EOP
  print 'not ' unless $z eq "\000\006string\003etc"; print "ok $test\n"; $test++;
+
+ # 157..160 test XDR N/? and unpack == pack
+
+ sub punp {
+ my $template = shift;
+ my $packed = pack $template, @​_;
+ my @​unpacked = unpack $template, $packed;
+
+ print "not " if join( '', @​unpacked ) ne join( '', @​_ );
+ print "ok $test\n";
+ $test++;
+ }
+
+ my @​arr = (11,22,33,44,55);
+
+ punp "N/a*4N/N?N/a?4@​?", ("hello", scalar @​arr, @​arr, 5, "there", 48);
+ punp "i/A*N/I?I/A*", ("hello", scalar @​arr, @​arr, "there");
+ punp "N/Z*w/a*N/f?N/i*", ("hello", "there", scalar @​arr, @​arr, @​arr);
+ punp "N/Z*4i/s?w/A?N/a*", ("hell", scalar @​arr, @​arr, 2, "hi", "john");
+
*** pod/perlfunc.pod Mon Mar 20 22​:13​:52 2000
--- pod/perlfunc.pod.new Wed Apr 19 23​:31​:29 2000
***************
*** 3006,3011 ****
--- 3006,3018 ----
  The repeat count for C<u> is interpreted as the maximal number of bytes
  to encode per line of output, with 0 and 1 replaced by 45.
 
+ The characterC<?> for the repeat count means the next argument will
+ be taken as the number of items to be encoded. This allows multiple
+ variable length arrays to be packed and unpacked in the same string
+ when using the C</> modifier. As packed results can now be variable
+ length the special case C<@​?> on unpack returns the number of bytes
+ into the string that have been processed up to that point.
+
  =item *
 
  The C<a>, C<A>, and C<Z> types gobble just one value, but pack it as a
***************
*** 3090,3104 ****
  and describes how the length value is packed.
  The ones likely to be of most use are integer-packing ones like
  C<n> (for Java strings), C<w> (for ASN.1 or SNMP)
! and C<N> (for Sun XDR).
 
- The I<string-item> must, at present, be C<"A*">, C<"a*"> or C<"Z*">.
  For C<unpack> the length of the string is obtained from the I<length-item>,
  but if you put in the '*' it will be ignored.
 
  unpack 'C/a', "\04Gurusamy"; gives 'Guru'
  unpack 'a3/A* A*', '007 Bond J '; gives (' Bond','J')
  pack 'n/a* w/a*','hello,','world'; gives "\000\006hello,\005world"
 
  The I<length-item> is not returned explicitly from C<unpack>.
 
--- 3097,3112 ----
  and describes how the length value is packed.
  The ones likely to be of most use are integer-packing ones like
  C<n> (for Java strings), C<w> (for ASN.1 or SNMP)
! and C<N/N*4> (for Sun XDR).
 
  For C<unpack> the length of the string is obtained from the I<length-item>,
  but if you put in the '*' it will be ignored.
 
  unpack 'C/a', "\04Gurusamy"; gives 'Guru'
  unpack 'a3/A* A*', '007 Bond J '; gives (' Bond','J')
  pack 'n/a* w/a*','hello,','world'; gives "\000\006hello,\005world"
+ pack 'N/a*4N/a*4','hello,','world'; gives (SUN XDR string format)
+ "\000\000\000\006hello,\000\000\000\000\000\005world\000\000\000"
 
  The I<length-item> is not returned explicitly from C<unpack>.
 
*** pp.c Mon Mar 20 15​:35​:44 2000
--- pp.c.new Wed Apr 19 11​:15​:26 2000
***************
*** 3284,3289 ****
--- 3284,3293 ----
  #define ISUUCHAR(ch) (memchr(PL_uuemap, (ch), sizeof(PL_uuemap)-1) || (ch) == ' ')
  #endif
 
+ #define NEXTPAT( _chr ) (pat < patend && *pat == (_chr) && pat++)
+
+
+
  PP(pp_unpack)
  {
  djSP;
***************
*** 3402,3410 ****
  goto reparse;
  break;
  case '@​'​:
  if (len > strend - strbeg)
  DIE(aTHX_ "@​ outside of string");
- s = strbeg + len;
  break;
  case 'X'​:
  if (len > s - strbeg)
--- 3406,3422 ----
  goto reparse;
  break;
  case '@​'​:
+ if (NEXTPAT('?')) {
+ EXTEND(SP, 1);
+ EXTEND_MORTAL(1);
+ sv = NEWSV(43, 0);
+ sv_setuv(sv, (UV)(s-strbeg));
+ PUSHs(sv_2mortal(sv));
+ }
+ else
+ s = strbeg + len;
  if (len > strend - strbeg)
  DIE(aTHX_ "@​ outside of string");
  break;
  case 'X'​:
  if (len > s - strbeg)
***************
*** 3420,3430 ****
  if (start_sp_offset >= SP - PL_stack_base)
  DIE(aTHX_ "/ must follow a numeric type");
  datumtype = *pat++;
- if (*pat == '*')
- pat++; /* ignore '*' for compatibility with pack */
  if (isDIGIT(*pat))
  DIE(aTHX_ "/ cannot take a count" );
! len = POPi;
  star = 0;
  goto redo_switch;
  case 'A'​:
--- 3432,3441 ----
  if (start_sp_offset >= SP - PL_stack_base)
  DIE(aTHX_ "/ must follow a numeric type");
  datumtype = *pat++;
  if (isDIGIT(*pat))
  DIE(aTHX_ "/ cannot take a count" );
! NEXTPAT('*'); /* ignore '*' for compatibility with pack */
! len = NEXTPAT('?') ? TOPi : POPi;
  star = 0;
  goto redo_switch;
  case 'A'​:
***************
*** 3436,3448 ****
  goto uchar_checksum;
  sv = NEWSV(35, len);
  sv_setpvn(sv, s, len);
- s += len;
  if (datumtype == 'A' || datumtype == 'Z') {
  aptr = s; /* borrow register */
  if (datumtype == 'Z') { /* 'Z' strips stuff after first null */
  s = SvPVX(sv);
  while (*s)
  s++;
  }
  else { /* 'A' strips both nulls and spaces */
  s = SvPVX(sv) + len - 1;
--- 3447,3460 ----
  goto uchar_checksum;
  sv = NEWSV(35, len);
  sv_setpvn(sv, s, len);
  if (datumtype == 'A' || datumtype == 'Z') {
  aptr = s; /* borrow register */
  if (datumtype == 'Z') { /* 'Z' strips stuff after first null */
  s = SvPVX(sv);
  while (*s)
  s++;
+ if (pat[-1] == '*')
+ len++;
  }
  else { /* 'A' strips both nulls and spaces */
  s = SvPVX(sv) + len - 1;
***************
*** 3454,3459 ****
--- 3466,3474 ----
  s = aptr; /* unborrow register */
  }
  XPUSHs(sv_2mortal(sv));
+ if (NEXTPAT('4'))
+ len = (len + 3 & ~3);
+ s += len;
  break;
  case 'B'​:
  case 'b'​:
***************
*** 4438,4451 ****
  DIE(aTHX_ "Repeat count in pack overflows");
  }
  }
  else
  len = 1;
! if (*pat == '/') {
! ++pat;
! if ((*pat != 'a' && *pat != 'A' && *pat != 'Z') || pat[1] != '*')
! DIE(aTHX_ "/ must be followed by a*, A* or Z*");
! lengthcode = sv_2mortal(newSViv(sv_len(items > 0
! ? *MARK : &PL_sv_no)));
  }
  switch(datumtype) {
  default​:
--- 4453,4472 ----
  DIE(aTHX_ "Repeat count in pack overflows");
  }
  }
+ else if (NEXTPAT('?')) {
+ fromstr = NEXTFROM;
+ len = SvIV(fromstr);
+ }
  else
  len = 1;
! if (NEXTPAT('/')) {
! lengthcode = items > 0 ? *MARK : &PL_sv_no;
! if (pat[1] != '?') {
! if (strchr("aAZ",*pat))
! lengthcode = sv_2mortal(newSViv(sv_len(lengthcode)));
! else
! lengthcode = sv_2mortal(newSViv(items));
! }
  }
  switch(datumtype) {
  default​:
***************
*** 4490,4495 ****
--- 4511,4518 ----
  if (datumtype == 'Z')
  ++len;
  }
+ if (NEXTPAT('4'))
+ len = (len + 3 & ~3);
  if (fromlen >= len) {
  sv_catpvn(cat, aptr, len);
  if (datumtype == 'Z')

Perl Info

Flags:
    category=core
    severity=medium

Site configuration information for perl v5.6.0:

Configured by root at Wed Apr 19 23:36:44 BST 2000.

Summary of my perl5 (revision 5.0 version 6 subversion 0) configuration:
  Platform:
    osname=linux, osvers=2.2.5-15, archname=i686-linux
    uname='linux pland.cwp 2.2.5-15 #1 mon apr 19 23:00:46 edt 1999 i686 unknown '
    config_args='-d'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
    useperlio=undef d_sfio=undef uselargefiles=define 
    use64bitint=undef use64bitall=undef uselongdouble=undef usesocks=undef
  Compiler:
    cc='cc', optimize='-O2', gccversion=egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)
    cppflags='-fno-strict-aliasing'
    ccflags ='-fno-strict-aliasing -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'
    stdchar='char', d_stdstdio=define, usevfork=false
    intsize=4, longsize=4, ptrsize=4, doublesize=8
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, usemymalloc=n, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lnsl -lndbm -lgdbm -ldb -ldl -lm -lc -lposix -lcrypt
    libc=/lib/libc-2.1.1.so, so=so, useshrplib=false, libperl=libperl.a
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic'
    cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib'

Locally applied patches:
    


@INC for perl v5.6.0:
    /usr/local/lib/perl5/5.6.0/i686-linux
    /usr/local/lib/perl5/5.6.0
    /usr/local/lib/perl5/site_perl/5.6.0/i686-linux
    /usr/local/lib/perl5/site_perl/5.6.0
    /usr/local/lib/perl5/site_perl
    .


Environment for perl v5.6.0:
    HOME=/root
    LANG (unset)
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/bin:/usr/bin:/usr/local/bin:/usr/bin/X11:/usr/X11R6/bin:.
    PERL_BADLANG (unset)
    SHELL=/bin/bash


@p5pRT
Copy link
Author

@p5pRT p5pRT commented Apr 25, 2000

From [Unknown Contact. See original ticket]

John Holdsworth writes​:

Included is a bug fix to the decode of unpack 'N/Z*' which
was not adding a byte to the encoded length as pack does.

Not clear. unpack 'N/Z*' with N unpacking to 14 should be the same as
Z14. Is it?

There is also option to use 'N/a*4' to round the length of encoded
strings to four bytes as required by SUN XDR RFC 1832.

I think a more general command is needed. Say, 'x!8' which would add
0-bytes until the next position which is multiple of 8. Then you
would write 'N/a*x!4'.

More contentiously perhaps the code also supports a new length
character '?' which steals the length from the next argument.
This allows multiple arrays to be encoded and decoded viz.​:

my @​arr = (1,2,3,4);
my $packed = pack "N/N?N/N?", scalar @​arr, @​arr, scalar @​arr, @​arr;

Again, something more general is needed. Something like

  my $packed = pack "(N//N)(N//N)", scalar @​arr, @​arr, scalar @​arr, @​arr;

or similar.

Ilya

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Apr 26, 2000

From [Unknown Contact. See original ticket]

Ilya Zakharevich wrote​:

John Holdsworth writes​:

Included is a bug fix to the decode of unpack 'N/Z*' which
was not adding a byte to the encoded length as pack does.

Hello IIlya, thanks for your comments.

Not clear. unpack 'N/Z*' with N unpacking to 14 should be the same as
Z14. Is it?

It seems not. There is some rather strange code in pp_pack()
case 'A'​:
case 'Z'​:
case 'a'​:
  fromstr = NEXTFROM;
  aptr = SvPV(fromstr, fromlen);
  if (pat[-1] == '*') {
  len = fromlen;
  if (datumtype == 'Z')
  ++len;
  }
which bumps the length. I copied this into unpack so unpack(pack("N/Z*")) worked.

There is also option to use 'N/a*4' to round the length of encoded
strings to four bytes as required by SUN XDR RFC 1832.

I think a more general command is needed. Say, 'x!8' which would add
0-bytes until the next position which is multiple of 8. Then you
would write 'N/a*x!4'.

I would agree with this. I was trying to make the absolute
minum change possible to what is very much core code.

More contentiously perhaps the code also supports a new length
character '?' which steals the length from the next argument.
This allows multiple arrays to be encoded and decoded viz.​:

my @​arr = (1,2,3,4);
my $packed = pack "N/N?N/N?", scalar @​arr, @​arr, scalar @​arr, @​arr;

Again, something more general is needed. Something like

my $packed = pack "(N//N)(N//N)", scalar @​arr, @​arr, scalar @​arr, @​arr;

I'm not so sure this would be more intuitive. '?' is in effect a new way to specify
the length of operands after '123' and '*'. It steals an argument for a length in
pack and leaves it on the stack on unpack. This is very useful for processing
SUN/RPC calls along with "@​?" which pushes the current position in the string
onto the stack so you know how much of the string has been processed.

or similar.

Ilya

Cheers,

john.

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Apr 26, 2000

From [Unknown Contact. See original ticket]

On Wed, Apr 26, 2000 at 12​:25​:08PM +0100, John Holdsworth wrote​:

Included is a bug fix to the decode of unpack 'N/Z*' which
was not adding a byte to the encoded length as pack does.

Not clear. unpack 'N/Z*' with N unpacking to 14 should be the same as
Z14. Is it?

It seems not. There is some rather strange code in pp_pack()

Hmm, I was asking about unpack(), not about pack().

case 'A'​:
case 'Z'​:
case 'a'​:
fromstr = NEXTFROM;
aptr = SvPV(fromstr, fromlen);
if (pat[-1] == '*') {
len = fromlen;
if (datumtype == 'Z')
++len;
}
which bumps the length.

???? This chunk calculates the *required* length basing on the length
of the input data. The calculatation looks correct. AFAICU, It
should have nothing to do with unpack() logic.

What is the problem you want to address?

There is also option to use 'N/a*4' to round the length of encoded
strings to four bytes as required by SUN XDR RFC 1832.

I think a more general command is needed. Say, 'x!8' which would add
0-bytes until the next position which is multiple of 8. Then you
would write 'N/a*x!4'.

I would agree with this. I was trying to make the absolute
minum change possible to what is very much core code.

Thanks, but for user-visible changes this is not the best criterion.

my @​arr = (1,2,3,4);
my $packed = pack "N/N?N/N?", scalar @​arr, @​arr, scalar @​arr, @​arr;

Again, something more general is needed. Something like

my $packed = pack "(N//N)(N//N)", scalar @​arr, @​arr, scalar @​arr, @​arr;

I'm not so sure this would be more intuitive. '?' is in effect a new way to specify
the length of operands after '123' and '*'.

Currently this can be done with interpolation into the pattern​:

  my $len = @​arr;
  my $packed = pack "N$len N$len", @​arr, @​arr;

I can easily believe that the feature you wanted to add is desirable,
but until we can invent a more intuitive and more general "API". I do
not think that what you did should go in.

Ilya

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Apr 26, 2000

From [Unknown Contact. See original ticket]

Created by root@pland.cwp

This message contains a small set of patches to pp.c​:pp_pack/unpack().
This slightly rounds out the new /' modifier to support XDR
format more fully.

Included is a bug fix to the decode of unpack 'N/Z*' which
was not adding a byte to the encoded length as pack does.

There is also option to use 'N/a*4' to round the length of encoded
strings to four bytes as required by SUN XDR RFC 1832.

More contentiously perhaps the code also supports a new length
character '?' which steals the length from the next argument.
This allows multiple arrays to be encoded and decoded viz.​:

my @​arr = (1,2,3,4);
my $packed = pack "N/N?N/N?", scalar @​arr, @​arr, scalar @​arr, @​arr;

Using these basic changes it is possible to decode/encode SUN RPC calls,
indeed I have tested this code fairly extensively with an NFS server
implemented entirely in perl (less strange a thing to do than it sounds).

Included are patches to t/op/pack.t and pod/perlfunc.pod regression
testing the changes and documenting them.

I hope you find this interesting. I've tried to keep the changes
to the absolute minimum to try to perhaps slip under the 5.6.1 wire.

Regards,

John Holdsworth
Coldwave Programs
Ltd.

Patch file to perl-5.6.0 follows...

*** t/op/pack.t Mon Mar 13 21​:25​:37 2000
--- t/op/pack.new Wed Apr 19 23​:35​:31 2000
***************
*** 6,12 ****
  require Config; import Config;
  }

! print "1..156\n";

  $format = "c2 x5 C C x s d i l a6";
  # Need the expression in here to force ary[5] to be numeric. This avoids
--- 6,12 ----
  require Config; import Config;
  }

! print "1..160\n";

  $format = "c2 x5 C C x s d i l a6";
  # Need the expression in here to force ary[5] to be numeric. This avoids
***************
*** 405,407 ****
--- 405,427 ----
  w/A* # Count a BER integer
  EOP
  print 'not ' unless $z eq "\000\006string\003etc"; print "ok $test\n"; $test++;
+
+ # 157..160 test XDR N/? and unpack == pack
+
+ sub punp {
+ my $template = shift;
+ my $packed = pack $template, @​_;
+ my @​unpacked = unpack $template, $packed;
+
+ print "not " if join( '', @​unpacked ) ne join( '', @​_ );
+ print "ok $test\n";
+ $test++;
+ }
+
+ my @​arr = (11,22,33,44,55);
+
+ punp "N/a*4N/N?N/a?4@​?", ("hello", scalar @​arr, @​arr, 5, "there", 48);
+ punp "i/A*N/I?I/A*", ("hello", scalar @​arr, @​arr, "there");
+ punp "N/Z*w/a*N/f?N/i*", ("hello", "there", scalar @​arr, @​arr, @​arr);
+ punp "N/Z*4i/s?w/A?N/a*", ("hell", scalar @​arr, @​arr, 2, "hi", "john");
+
*** pod/perlfunc.pod Mon Mar 20 22​:13​:52 2000
--- pod/perlfunc.pod.new Wed Apr 19 23​:31​:29 2000
***************
*** 3006,3011 ****
--- 3006,3018 ----
  The repeat count for C<u> is interpreted as the maximal number of bytes
  to encode per line of output, with 0 and 1 replaced by 45.

+ The characterC<?> for the repeat count means the next argument will
+ be taken as the number of items to be encoded. This allows multiple
+ variable length arrays to be packed and unpacked in the same string
+ when using the C</> modifier. As packed results can now be variable
+ length the special case C<@​?> on unpack returns the number of bytes
+ into the string that have been processed up to that point.
+
  =item *

  The C<a>, C<A>, and C<Z> types gobble just one value, but pack it as a
***************
*** 3090,3104 ****
  and describes how the length value is packed.
  The ones likely to be of most use are integer-packing ones like
  C<n> (for Java strings), C<w> (for ASN.1 or SNMP)
! and C<N> (for Sun XDR).

- The I<string-item> must, at present, be C<"A*">, C<"a*"> or C<"Z*">.
  For C<unpack> the length of the string is obtained from the I<length-item>,
  but if you put in the '*' it will be ignored.

  unpack 'C/a', "\04Gurusamy"; gives 'Guru'
  unpack 'a3/A* A*', '007 Bond J '; gives (' Bond','J')
  pack 'n/a* w/a*','hello,','world'; gives "\000\006hello,\005world"

  The I<length-item> is not returned explicitly from C<unpack>.

--- 3097,3112 ----
  and describes how the length value is packed.
  The ones likely to be of most use are integer-packing ones like
  C<n> (for Java strings), C<w> (for ASN.1 or SNMP)
! and C<N/N*4> (for Sun XDR).

  For C<unpack> the length of the string is obtained from the I<length-item>,
  but if you put in the '*' it will be ignored.

  unpack 'C/a', "\04Gurusamy"; gives 'Guru'
  unpack 'a3/A* A*', '007 Bond J '; gives (' Bond','J')
  pack 'n/a* w/a*','hello,','world'; gives "\000\006hello,\005world"
+ pack 'N/a*4N/a*4','hello,','world'; gives (SUN XDR string format)
+ "\000\000\000\006hello,\000\000\000\000\000\005world\000\000\000"

  The I<length-item> is not returned explicitly from C<unpack>.

*** pp.c Mon Mar 20 15​:35​:44 2000
--- pp.c.new Wed Apr 19 11​:15​:26 2000
***************
*** 3284,3289 ****
--- 3284,3293 ----
  #define ISUUCHAR(ch) (memchr(PL_uuemap, (ch), sizeof(PL_uuemap)-1) || (ch) == ' ')
  #endif

+ #define NEXTPAT( _chr ) (pat < patend && *pat == (_chr) && pat++)
+
+
+
  PP(pp_unpack)
  {
  djSP;
***************
*** 3402,3410 ****
  goto reparse;
  break;
  case '@​'​:
  if (len > strend - strbeg)
  DIE(aTHX_ "@​ outside of string");
- s = strbeg + len;
  break;
  case 'X'​:
  if (len > s - strbeg)
--- 3406,3422 ----
  goto reparse;
  break;
  case '@​'​:
+ if (NEXTPAT('?')) {
+ EXTEND(SP, 1);
+ EXTEND_MORTAL(1);
+ sv = NEWSV(43, 0);
+ sv_setuv(sv, (UV)(s-strbeg));
+ PUSHs(sv_2mortal(sv));
+ }
+ else
+ s = strbeg + len;
  if (len > strend - strbeg)
  DIE(aTHX_ "@​ outside of string");
  break;
  case 'X'​:
  if (len > s - strbeg)
***************
*** 3420,3430 ****
  if (start_sp_offset >= SP - PL_stack_base)
  DIE(aTHX_ "/ must follow a numeric type");
  datumtype = *pat++;
- if (*pat == '*')
- pat++; /* ignore '*' for compatibility with pack */
  if (isDIGIT(*pat))
  DIE(aTHX_ "/ cannot take a count" );
! len = POPi;
  star = 0;
  goto redo_switch;
  case 'A'​:
--- 3432,3441 ----
  if (start_sp_offset >= SP - PL_stack_base)
  DIE(aTHX_ "/ must follow a numeric type");
  datumtype = *pat++;
  if (isDIGIT(*pat))
  DIE(aTHX_ "/ cannot take a count" );
! NEXTPAT('*'); /* ignore '*' for compatibility with pack */
! len = NEXTPAT('?') ? TOPi : POPi;
  star = 0;
  goto redo_switch;
  case 'A'​:
***************
*** 3436,3448 ****
  goto uchar_checksum;
  sv = NEWSV(35, len);
  sv_setpvn(sv, s, len);
- s += len;
  if (datumtype == 'A' || datumtype == 'Z') {
  aptr = s; /* borrow register */
  if (datumtype == 'Z') { /* 'Z' strips stuff after first null */
  s = SvPVX(sv);
  while (*s)
  s++;
  }
  else { /* 'A' strips both nulls and spaces */
  s = SvPVX(sv) + len - 1;
--- 3447,3460 ----
  goto uchar_checksum;
  sv = NEWSV(35, len);
  sv_setpvn(sv, s, len);
  if (datumtype == 'A' || datumtype == 'Z') {
  aptr = s; /* borrow register */
  if (datumtype == 'Z') { /* 'Z' strips stuff after first null */
  s = SvPVX(sv);
  while (*s)
  s++;
+ if (pat[-1] == '*')
+ len++;
  }
  else { /* 'A' strips both nulls and spaces */
  s = SvPVX(sv) + len - 1;
***************
*** 3454,3459 ****
--- 3466,3474 ----
  s = aptr; /* unborrow register */
  }
  XPUSHs(sv_2mortal(sv));
+ if (NEXTPAT('4'))
+ len = (len + 3 & ~3);
+ s += len;
  break;
  case 'B'​:
  case 'b'​:
***************
*** 4438,4451 ****
  DIE(aTHX_ "Repeat count in pack overflows");
  }
  }
  else
  len = 1;
! if (*pat == '/') {
! ++pat;
! if ((*pat != 'a' && *pat != 'A' && *pat != 'Z') || pat[1] != '*')
! DIE(aTHX_ "/ must be followed by a*, A* or Z*");
! lengthcode = sv_2mortal(newSViv(sv_len(items > 0
! ? *MARK : &PL_sv_no)));
  }
  switch(datumtype) {
  default​:
--- 4453,4472 ----
  DIE(aTHX_ "Repeat count in pack overflows");
  }
  }
+ else if (NEXTPAT('?')) {
+ fromstr = NEXTFROM;
+ len = SvIV(fromstr);
+ }
  else
  len = 1;
! if (NEXTPAT('/')) {
! lengthcode = items > 0 ? *MARK : &PL_sv_no;
! if (pat[1] != '?') {
! if (strchr("aAZ",*pat))
! lengthcode = sv_2mortal(newSViv(sv_len(lengthcode)));
! else
! lengthcode = sv_2mortal(newSViv(items));
! }
  }
  switch(datumtype) {
  default​:
***************
*** 4490,4495 ****
--- 4511,4518 ----
  if (datumtype == 'Z')
  ++len;
  }
+ if (NEXTPAT('4'))
+ len = (len + 3 & ~3);
  if (fromlen >= len) {
  sv_catpvn(cat, aptr, len);
  if (datumtype == 'Z')

Perl Info

Flags:
    category=core
    severity=medium

Site configuration information for perl v5.6.0:

Configured by root at Wed Apr 19 23:36:44 BST 2000.

Summary of my perl5 (revision 5.0 version 6 subversion 0) configuration:
  Platform:
    osname=linux, osvers=2.2.5-15, archname=i686-linux
    uname='linux pland.cwp 2.2.5-15 #1 mon apr 19 23:00:46 edt 1999 i686 unknown '
    config_args='-d'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
    useperlio=undef d_sfio=undef uselargefiles=define
    use64bitint=undef use64bitall=undef uselongdouble=undef usesocks=undef
  Compiler:
    cc='cc', optimize='-O2', gccversion=egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)
    cppflags='-fno-strict-aliasing'
    ccflags ='-fno-strict-aliasing -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'
    stdchar='char', d_stdstdio=define, usevfork=false
    intsize=4, longsize=4, ptrsize=4, doublesize=8
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, usemymalloc=n, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lnsl -lndbm -lgdbm -ldb -ldl -lm -lc -lposix -lcrypt
    libc=/lib/libc-2.1.1.so, so=so, useshrplib=false, libperl=libperl.a
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic'
    cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib'

Locally applied patches:



@INC for perl v5.6.0:
    /usr/local/lib/perl5/5.6.0/i686-linux
    /usr/local/lib/perl5/5.6.0
    /usr/local/lib/perl5/site_perl/5.6.0/i686-linux
    /usr/local/lib/perl5/site_perl/5.6.0
    /usr/local/lib/perl5/site_perl
    .


Environment for perl v5.6.0:
    HOME=/root
    LANG (unset)
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/bin:/usr/bin:/usr/local/bin:/usr/bin/X11:/usr/X11R6/bin:.
    PERL_BADLANG (unset)
    SHELL=/bin/bash




@p5pRT
Copy link
Author

@p5pRT p5pRT commented Apr 26, 2000

From [Unknown Contact. See original ticket]

On Wed, Apr 26, 2000 at 08​:55​:42PM +0100, John Holdsworth wrote​:

Included is a bug fix to the decode of unpack 'N/Z*' which
was not adding a byte to the encoded length as pack does.

  DB<1> x pack "N/Z* a4", "hi there ", "1234";
0 "\c@​\c@​\c@​\cIhi there \c@​1234"
  DB<2> x chr 9
0 "\cI"
  DB<3> x length "hi there "
0 9

This is clearly a bug. Either \0 should not have been inserted, or
(better) the length should have been marked as 10.

But this is a bug in pack(). I do not see why any change should be
done to unpack().

I can easily believe that the feature you wanted to add is desirable,
but until we can invent a more intuitive and more general "API". I do
not think that what you did should go in.

No problem, how it is represented in the template is going to
require more discussion. I am mainly putting forward the idea.

I do not care which letter represents "it" in the template. I care
that "it" which you implemented looks ad hoc and may not generalize to
other useful situations.

Ilya

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Apr 27, 2000

From [Unknown Contact. See original ticket]

Ilya Zakharevich <ilya@​math.ohio-state.edu> wrote

This is clearly a bug. Either \0 should not have been inserted, or
(better) the length should have been marked as 10.

Since Z is defined to *always* insert a \0, the length should be 10.

But this is a bug in pack(). I do not see why any change should be
done to unpack().

Agreed. Patch attached.

Mike Guy

Inline Patch
--- ./pp.c.orig	Thu Apr 27 18:11:50 2000
+++ ./pp.c	Thu Apr 27 18:04:17 2000
@@ -4445,7 +4445,8 @@
 	    if ((*pat != 'a' && *pat != 'A' && *pat != 'Z') || pat[1] != '*')
 		DIE(aTHX_ "/ must be followed by a*, A* or Z*");
 	    lengthcode = sv_2mortal(newSViv(sv_len(items > 0
-						   ? *MARK : &PL_sv_no)));
+						   ? *MARK : &PL_sv_no)
+                                            + (*pat == 'Z' ? 1 : 0)));
 	}
 	switch(datumtype) {
 	default:
--- ./t/op/pack.t.orig	Thu Apr 27 18:11:50 2000
+++ ./t/op/pack.t	Thu Apr 27 18:12:05 2000
@@ -372,8 +372,9 @@
 
 eval { ($x) = pack '/a*','hello' };
 print 'not ' unless $@; print "ok $test\n"; $test++;
-$z = pack 'n/a* w/A*','string','etc';
-print 'not ' unless $z eq "\000\006string\003etc"; print "ok $test\n"; $test++;
+$z = pack 'n/a* N/Z* w/A*','string','hi there ','etc';
+print 'not ' unless $z eq "\000\006string\0\0\0\012hi there \000\003etc";
+print "ok $test\n"; $test++;
 
 eval { ($x) = unpack 'a/a*/a*', '212ab345678901234567' };
 print $@ eq '' && $x eq 'ab3456789012' ? "ok $test\n" : "#$x,$@\nnot ok $test\n";

End of patch
@p5pRT
Copy link
Author

@p5pRT p5pRT commented May 2, 2003

From @iabyn

The patch appears to have been applied.

@p5pRT
Copy link
Author

@p5pRT p5pRT commented May 2, 2003

@iabyn - Status changed from 'open' to 'resolved'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
1 participant