On RHS of s///, ${9} works but ${10} does not #12948
Comments
From @epa (created by @epa): I found this surprising: #!/usr/bin/perl Why is ${9} accepted on the RHS of a substitution but ${10} not? The same applies in ordinary code: say ${9}; # ok Perl Info
|
From @iabyn: On Tue, May 07, 2013 at 07:14:03AM -0700, Ed Avis wrote:
It's just the braced variant of the variable name (${10} versus $10, and $ perl589 -e'use strict; I don't know if there is any rationale behind this, but at first glance it -- |
The RT System itself - Status changed from 'new' to 'open' |
From @ikegami: On Tue, May 7, 2013 at 10:35 AM, Dave Mitchell <davem@iabyn.com> wrote:
No other type of variable seems to generate a strict error.
|
From @nwc10: On Wed, May 08, 2013 at 01:47:48AM -0400, Eric Brine wrote:
Knowing how the implementation has to handle the multi-character numeric Still, I agree with Dave that it feels like a bug. It suggests that the Nicholas Clark |
From @epa: Another interesting wrinkle to this bug is that $010 = 'a'; would be a syntax error. -- |
From @nwc10: On Wed, May 08, 2013 at 01:10:37PM +0000, Ed Avis wrote:
Yes, particularly as they don't seem to offer anything other than $ perl -le 'use strict; $00 = 2; $000 = 3; print foreach $0, $00, $000' And, they aren't even octal :-) Nicholas Clark |
From @cpansprout: On Wed May 08 06:20:01 2013, nicholas wrote:
It has long been documented that variables can start with a digit, in -- Father Chrysostomos |
From @epa: FTR, the behaviour is still the same with 5.28.2. |
I'm willing to work on this issue, but I don't understand enough about how things are tokenized, etc. to get started efficiently. Can someone give me some tips? |
Karl asked me for some help investigating #12948; this is what I came up with. I have not even run "make test", so this is likely broken in important ways, but it should be enough of a thread for Karl to pull on that hopefully the whole shirt comes apart. So to speak. :-) What it does do is make ${10} parse the same way as $10.
${10} and $10 were handled differently; this patch makes them be handled the same. It also forbids multi-digit numeric variables from starting with 0. Thus $00 now throws a new fatal exception: "Numeric variables with more than one digit may not start with '0'"
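The rule stated in this commit message can be sketched in C. This is only an illustration under stated assumptions: `classify_numeric_var` and the `numvar_status` enum are hypothetical names, not taken from toke.c, and the sketch operates on a name that has already been extracted (the text between `$` and the next token, or inside `${ }`).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification, not part of the Perl source. */
enum numvar_status {
    NUMVAR_OK,            /* all digits, acceptable: "0", "9", "10" */
    NUMVAR_LEADING_ZERO,  /* all digits, more than one, starts with '0' */
    NUMVAR_NOT_NUMERIC    /* empty, or contains a non-digit */
};

/* Classify a candidate numeric variable name such as "10" or "00". */
enum numvar_status classify_numeric_var(const char *name)
{
    assert(name != NULL);

    size_t len = strlen(name);
    if (len == 0)
        return NUMVAR_NOT_NUMERIC;

    for (size_t i = 0; i < len; i++) {
        if (!isdigit((unsigned char)name[i]))
            return NUMVAR_NOT_NUMERIC;
    }

    /* "0" alone is fine ($0); "00", "010", "001" are rejected. */
    if (len > 1 && name[0] == '0')
        return NUMVAR_LEADING_ZERO;

    return NUMVAR_OK;
}
```

Under this sketch, `"10"` and `"0"` are accepted while `"00"` and `"010"` map to the case that the commit turns into the fatal "Numeric variables with more than one digit may not start with '0'" error. Because the check runs on the name alone, the braced and unbraced spellings get the same answer.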
This was fixed by the commit above |
Looking at this bug report (because it recently came up on p5p) it wasn't entirely clear to me what the issues were. What one should be aware of: when using
An example (using an older perl):
Running the code with
It appears (I did not verify in code) to turn
What commit 60267e1 does:
Forbidding octal (leading numbers with '0') in
Note: even today you can still use hex/binary inside
Footnotes
|
In 60267e1 I patched toke.c to refuse $00 but did not properly handle ${00} and related cases when the code was under 'use utf8'. Part of the reason was the confusing macro VALID_LEN_ONE_IDENT(), which despite its name does not restrict what it matches to things which are one character long. Since the VALID_LEN_ONE_IDENT() macro is used in only one place and its name and placement are confusing, I have moved it back into the code inline as part of this fix. I have also added more comments about what is going on, and moved the related comment directly next to the code that it affects. If it is moved out of this code then we should think of a better name and be more careful and clear about checking things like length. I would argue the logic is used to parse what might be called a variable "description", and thus it is not identical to code which might validate an actual parsed variable name. E.g., ${^Var} is a description of the variable whose "name" is "\026ar". The exception of course is $^ whose name actually is "^". A byproduct of this change is that the logic to detect duplicated leading zeros is now quite a bit simpler. This includes more tests for leading zero checks. See Issue #12948, Issue #19986, and Issue #19989.
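The distinction between a variable "description" and its "name" can be illustrated with a small C sketch. It assumes the caret form maps its first letter to the corresponding control character by flipping bit 6 (`'V' ^ 64` is octal 026), which matches the "\026ar" example in the message. `caret_description_to_name` is a hypothetical helper, not a function from toke.c, and it only handles ASCII uppercase letters after the caret.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: turn a description such as "^Var" into the
 * internal name "\026ar". A bare "^" (as in $^) is copied unchanged,
 * since its name really is "^". Returns 0 on success, -1 if the
 * output buffer is too small.
 */
int caret_description_to_name(const char *desc, char *out, size_t outsz)
{
    assert(desc != NULL && out != NULL);

    size_t len = strlen(desc);

    if (len >= 2 && desc[0] == '^' && desc[1] >= 'A' && desc[1] <= 'Z') {
        /* The name is one byte shorter than the description, so the
         * name plus its terminating NUL needs exactly len bytes. */
        if (len > outsz)
            return -1;
        out[0] = (char)(desc[1] ^ 64);      /* 'V' (0x56) -> 0x16 */
        memcpy(out + 1, desc + 2, len - 2); /* rest of the identifier */
        out[len - 1] = '\0';
        return 0;
    }

    /* Not a caret description: the name is the description itself. */
    if (len + 1 > outsz)
        return -1;
    memcpy(out, desc, len + 1);
    return 0;
}
```

The point of the sketch is that a length check written against the description ("^Var", four characters) gives a different answer than one written against the name (three bytes), which is why a macro named for "length one" identifiers was easy to misread.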
See #20000 |
@bram-perl I added a bunch of tests to validate things. One thing you missed in your analysis was that my patch (somewhat accidentally) changes S_scan_ident so it parses an entire var more often than it used to, which is what is responsible for the change from a run time var for |
I did not miss it, but maybe I didn't spell it out clearly enough. As for the hex/binary identifiers: that's a separate discussion, so it might be best to leave it out of the discussion of this issue. |
On Thu, 28 Jul 2022 at 14:25, Bram ***@***.***> wrote:
I did not miss it but maybe I didn't spell it out clear enough;
As for the hex/binary identifiers: that's a separate discussion so might
be best to leave it out of the discussion of this issue.
Heh. Too late. Running make test right now. ;-p
Yves
…--
perl -Mre=debug -e "/just|another|perl|hacker/"
|
…strict. Executive summary: in ${ .. } style notation, consistently forbid octal and allow multi-digit decimal values under strict. The vars ${1} through ${9} have always been allowed under strict, but ${10} threw an error, unlike its equivalent variable $10. In 60267e1 I patched toke.c to refuse octal like $001 but did not properly handle ${001} and related cases when the code was under 'use utf8'. Part of the reason was the confusing macro VALID_LEN_ONE_IDENT(), which despite its name does not restrict what it matches to things which are one character long. Since the VALID_LEN_ONE_IDENT() macro is used in only one place and its name and placement are confusing, I have moved it back into the code inline as part of this fix. I have also added more comments about what is going on, and moved the related comment directly next to the code that it affects. If it is moved out of this code then we should think of a better name and be more careful and clear about checking things like length. I would argue the logic is used to parse what might be called a variable "description", and thus it is not identical to code which might validate an actual parsed variable name. E.g., ${^Var} is a description of the variable whose "name" is "\026ar". The exception of course is $^ whose name actually is "^". This includes more tests for allowed vars and forbidden var names. See Issue #12948, Issue #19986, and Issue #19989.
Migrated from rt.perl.org#117907 (status was 'open')
Searchable as RT117907$