-
Notifications
You must be signed in to change notification settings - Fork 37
🐛 Parse resp-text with invalid resp-text-code
#601
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This can be used to catche parse errors, restore the parser state, and then either call a fallback or re-raise the error. NOTE: Reckless backtracking can lead to O(n**2) situations, so it should very rarely be used. Fallbacks should not backtrack.
`resp-text` is defined as: ```abnf resp-text = ["[" resp-text-code "]" SP] text ; RFC3501 resp-text = ["[" resp-text-code "]" SP] [text] ; RFC9051 ``` And our even more lenient interpretation (which incorporates RFC9051 errata), is this: ```abnf resp-text = ["[" resp-text-code "]" [SP [text]] / [text] ``` And, although the `resp-text-code` grammar is elaborate, the final alternative is a superset of every other alternative (in both RFCs): ```abnf resp-text-code /= atom [SP 1*<any TEXT-CHAR except "]">] ``` Notice that `text` is a superset of `atom`, `SP`, `"["`, `"]"`, and every other allowed character in `resp-text-code`. So, even if `resp-text-code` fails, `resp-text` may still succeed by parsing everything as `text`. However, (prior to this commit) the parser commits to `resp-text-code` as soon as `"["` is encountered at the beginning of `resp-text`. If what follows is not `resp-text-code`, maybe it doesn't begin with an `atom` or it doesn't have a closing `"]"`, then we fail to parse `resp-text` correctly. Fixing this requires either looking further ahead more than a single token or backtracking when `resp-text-code` fails to parse. In this case, I think backtracking is the best approach. We don't need to worry about backtracking taking exponential time, because the fallback (`text`) is exceptionally simple (does not call any other productions), will parse all the way up to the next CRLF, and so nothing backtrackable can be nested. This uses `ResponseParser#current_state` that was added for #599 and relies on #600 to ensure that debug error messages are not printed when backtracking. Fixes #597.
675b663 to
7fb8b57
Compare
|
Note that this still preserves
|
|
Another follow-up PR is still needed to handle On the other hand, this might be considered a breaking change: it would break code that expects |
| rescue ResponseParseError => error | ||
| raise if /\buid-set\b/i.match? error.message | ||
| restore_state state | ||
| text |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops. This is a bug. It should be ResponseText.new(nil, text? || "")
This bug was introduced in #601, which hasn't been released yet.
resp-textis defined as:And our even more lenient interpretation (which incorporates RFC9051 errata), is this:
And, although the
resp-text-codegrammar is elaborate, in both RFC3501 and RFC9051, the final alternative is a superset of every other alternative:Notice that
textis a superset ofatom,SP,"[","]", and every other allowed character inresp-text-code. So, ifresp-text-codefails,resp-textshould backtrack and fallback to just parsing everything astext.However, (prior to this PR) the parser commits to
resp-text-codeas soon as"["is encountered at the beginning ofresp-text. If what follows is notresp-text-code, maybe it doesn't begin with anatomor it doesn't have a closing"]", then we fail to parseresp-textcorrectly.Fixing this requires either looking further ahead more than a single token or backtracking when
resp-text-codefails to parse. In this case, I think backtracking is the best approach. We don't need to worry about backtracking taking exponential time, because the fallback (text) is exceptionally simple (does not call any other productions), will parse all the way up to the next CRLF, and so nothing backtrackable can be nested.This uses
ResponseParser#current_statethat was added for #599 and relies on #600 to ensure that debug error messages are not printed when backtracking.Fixes #597.