-
Notifications
You must be signed in to change notification settings - Fork 553
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Peculiar debugger 'list' command behavior: "floating point" line numbers #21350
Comments
|
I spent some time looking at this last week, the regexp for matching a range is overly permissive, and there's some strange processing of the matched values I don't understand. This week I've been sick and haven't been able to follow-up on it. The permissive regexp and the strange processing have been there long before @shlomif 's re-org of the debugger. |
|
I will take a look this week if Tony can't get to it. I think I remember
where it is from back when I wrote the comments.
…On Thu, Aug 17, 2023 at 4:57 PM Tony Cook ***@***.***> wrote:
I spent some time looking at this last week, the regexp for matching a
range is overly permissive, and there's some strange processing of the
matched values I don't understand.
This week I've been sick and haven't been able to follow-up on it.
The permissive regexp and the strange processing have been there long
before @shlomif <https://github.com/shlomif> 's re-org of the debugger.
—
Reply to this email directly, view it on GitHub
<#21350 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAAO6GY44MQR6AN632GMIYTXV2VWRANCNFSM6AAAAAA3JAZEZY>
.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
|
hi all. No promises, but I also would like to investigate this issue |
|
We should be able to get it between us!
…On Thu, Aug 17, 2023 at 11:27 PM Shlomi Fish ***@***.***> wrote:
hi all.
No promises, but I also would like to investigate this issue
—
Reply to this email directly, view it on GitHub
<#21350 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAAO6G7K4PLF3X3VSPREQHDXV4DOLANCNFSM6AAAAAA3JAZEZY>
.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
|
Have fun, I'll get pinged on updates to this ticket. |
|
In order to write a test assertion, I/we need to know what is the desirable behaviour. I support rejecting the inputs with an error. "explicit is better than implicit" |
|
That seems reasonable. |
|
Agreed.
…On Sun, Aug 20, 2023 at 6:43 PM Tony Cook ***@***.***> wrote:
That seems reasonable.
—
Reply to this email directly, view it on GitHub
<#21350 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAAO6G6QLXRCL7LB6D2GNWLXWK4NNANCNFSM6AAAAAA3JAZEZY>
.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
|
Okay, finally working on this, and the bug is caused by the regex that tries to extract the list range, and is being too clever; it's trying to parse a couple different things at once, and messing up because of that. This is trying to handle Last, the I am going to break this match up into several different matches and match the pure line-number behavior documented in |
Joe, thanks for plugging away on this. Could you identify the relevant line numbers in the codebase so that others can look at them as well? Thanks. |
|
Sure, I'll edit the previous comment with that. |
|
Scanning up from the regex in question at line 2805, several of these cases are already handled and shouldn't be matched here at all:
But the line 2805 regex needs to be simplified to (probably) |
|
I dug back through the commit history, and at dad0832 (Feb 2002), lines 905 to 912 show us that a line number of So these should be acceptable range specs:
And so on. So that regex should have a case for matching |
|
At that same commit, I can see that the same regex we've been using is still there. No special casing for anything other than I'm going to jump back to before Ilya's commit that added this line and see if there was more complex code for parsing the regex results earlier. |
|
Larry's March 1991 commit contains the same regex without the leading I think the only special case that is actually supposed to be there is This is the earliest commit where the debugger actually exists, so I think we've dug back as far as we can go. |
|
Pattern cleaned up and passing current tests; $ was also removed. Going to add a few more tests to verify that the formerly-accepted but invalid strings are now rejected -- there's a catchall that print "Invalid line range specification" at the end. Should have a PR tomorrow. |
The pattern which matches valid line number ranges in the debugger is far too complaisant when it comes to strings it will accept. A number of completely nonsensical values are cheerfully accepted and produce results that range from nothing at all to very odd results indeed: - specifying a float for a line number appends the fractional part of the float to ALL line numbers until something else sets the starting line number to an integer again.[1] - Strings like `.$.$.$`, `.....`, and `22.$` are all acccepted. Some do nothing, some generate errors. This tracks back to a catchall line range pattern that was added in the Perl *3* debugger at commit a687059. This pattern looks as if it were added to eventually implement a number of things, including lines relative to the current line, negative indexes into the magic source code array, variable references, and more. Several of these were implemented elsewhere in the debugger, but the regex was never simplified. This change breaks up that regex and cleans up a few others (including code that internally generates possible negative line numbers). Changes to the old catchall regex: - Remove the conditional match of a leading `-`. Negative line numbers are not permitted. - Remove `$` as a valid character in a linespec in the catchall regex. `$scalar` is handled explicitly earlier in the _cmd_l code. There is no other documented/legitimate use for '$' in a list linespec. - The linespec character class treated `.`, and digits, as equally valid characters in a linespec. This led to the linespecs matching floating point numbers, IPv4 addresses, and various nonnumeric nonsense that did not translate to a valid line number. The combined character set was split into either one period (handled already as "the current line") or a series of digits _only_ (a possible line number). This had to be done for both the starting and ending linespec. - \A and \z were added to the catchall regex to ensure that it matched the whole of the remaining line, or failed. This prevents things like `...` from being matched as equivalent to `.`. - Make sure the 'v' command creates good linespecs. As previously coded, this could generate linespecs that started with a negative number. Since we've outlawed negative numbers in linespecs, a `v` too close to the start of the file (as occurs in the tests for perl5db.pl) would cause the more-stringent range check to throw an error. Instead, the 'v' command now checks the generated step back and sets it to 1 if it is less than 1. The existing test is sufficient to confirm that the new code in cmd_v fixes the issue. - Verify that float line numbers are illegal: This was the original bug that triggered this change: the original regex accepted floating-point numbers, and because indexing an array in Perl with a float works because of implicit type conversions, and the increment of a float by (integer) 1 doesn't downgrade the float to an int, the line numbers continued to be displayed with the fractional portion of the original number until something intervened with an explicit or implicit integer linespec. We simply don't permit them to be specified as line numbers now. - Use \w for valid variable name characters: The debugger should eventually be UTF-8 compatible, but is not; this is a step along the way to get there. It makes the variable match in the cmd_l reference parsing code match proper Perl variable names, including UTF-8 "word" characters, and now also properly matches regex result variables ($1, $2, etc.). It explicitly rules out bare $, which the old regex would match, triggering an error. - A new perldb/t test file was added to verify that all the range cleanups work properly, along with new stanzas in the perl5db.pl test suite.
|
#21488 fixes the egregious behavior. We still have some issues that are outside the scope of this bug, and I'll open new issues for those. They're related to UTF-8 and are a lot more challenging. |
|
Fixed by #21488 |
Module: perl5db.pl
Description
Accidentally entering a line range with one or more periods on a list command causes the debugger to produce anomalous line numbers
Steps to Reproduce
perl -dany script.l 1.10.lcommand.l 1.2.3, the debugger will list line 1 as line 1.2.3.l 127.0.0.1will list line 127 as127.0.0.1 ....lcommands will not show decimal points, so the logic is detecting that floating point numbers with no decimals can be displayed as integersExpected behavior
The debugger should either reject numbers containing periods on a list command; automatically truncate them to integers (probably the best option); or interpret the first
.as a comma, then truncate anything after a second., and print the resulting range of lines.Perl configuration
The text was updated successfully, but these errors were encountered: