Regexp problem with unterminated strings #89
Comments
Okay, I think adding a question mark after the final double-quote in the regex (making it optional) should do the trick. Then the string check in read_atom needs to raise an exception if the first character is a double-quote but the last is not. I think that will work as long as the language's regex engine is properly greedy.
Sound reasonable? I probably won't be able to get to this for a while. Feel free to send me a PR if you feel up to it :-) |
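A minimal Python sketch of the proposed regex change, for illustration only. It assumes the tokenizer pattern from the mal guide and differs only in the `?` after the closing double quote of the string alternative; `tokenize` is a hypothetical helper name, not taken from any particular implementation.

```python
import re

# Tokenizer pattern modelled on the mal guide. The string alternative ends in
# `"?`, so an unterminated string is still captured as a single token that
# begins with a double quote (assumption: this is the change proposed above).
TOKEN_RE = re.compile(
    r"""[\s,]*(~@|[\[\]{}()'`~^@]|"(?:\\.|[^\\"])*"?|;.*|[^\s\[\]{}('"`,;)]*)"""
)


def tokenize(source):
    """Return the non-empty tokens found in `source`."""
    # The last alternative can match the empty string (e.g. at end of input),
    # so empty tokens are filtered out here.
    return [tok for tok in TOKEN_RE.findall(source) if tok]


# The unterminated string keeps its opening quote, so the reader can spot it.
print(tokenize('(prn "123'))
# A properly closed string still includes its closing quote, because `"?` is greedy.
print(tokenize('"abc"'))
```

With the opening quote preserved in the token, the remaining work is the check in read_atom that rejects a string token with no closing quote.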
Yeah, using the new one works, at least for Common Lisp. I believe any Perl-compatible regexp library should just work. |
Now that #90 is implemented, I'm finally getting back around to this. I pushed a "test_unclosed_string" branch with step1 tests to catch this. Here is the list of broken implementations and whether a fix has been implemented yet:
|
Hm, I've implemented a check in my readers by testing for the first char of the token and if it's a double quote, test the last char of the token as well. Depending on that check either an error is thrown or a string object is returned. |
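The check described above could look roughly like this in Python. This is a sketch under stated assumptions: `read_string`, `UnterminatedString` and the error message are hypothetical, and the error text that the step1 tests match against is not reproduced here. Instead of comparing only the first and last characters, the sketch matches the whole token against the closed-string pattern, which also rejects a token such as `"abc\"` whose final quote is escaped.

```python
import re

# A complete string token: opening quote, then escaped pairs or ordinary
# characters, then an unescaped closing quote.
STRING_RE = re.compile(r'"(?:\\.|[^\\"])*"')


class UnterminatedString(Exception):
    """Hypothetical error type raised for a string token with no closing quote."""


def read_string(token):
    """Return the raw body of a string token, or raise if it is unterminated."""
    if not token.startswith('"'):
        raise ValueError("not a string token")
    # fullmatch fails for '"123', for a lone '"', and for '"abc\\"' (escaped
    # final quote), all of which lack a real closing quote.
    if STRING_RE.fullmatch(token) is None:
        raise UnterminatedString("unbalanced string: missing closing double quote")
    # Escape sequences are left unprocessed in this sketch.
    return token[1:-1]
```

A plain first/last character comparison is shorter, but it needs an extra length check for a token consisting of a single `"` and would accept the escaped-quote case.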
@wasamasa yeah, there are a number of implementations where the check just doesn't match the error text. The fixes are pretty simple. I was just planning to do them myself instead of creating a branch for fixes since the fixes tend to take less than a minute per implementation. FYI, the run with the fixed up test is here: https://travis-ci.org/kanaka/mal/builds/484027719 |
Okay, I pushed fixes for the 47 implementations that were complaining. We'll see if everything passes before closing: https://travis-ci.org/kanaka/mal/builds/484570489 |
The regular expression provided in the guide is really impressive. However, when the input string is `"123`, it filters out the first double quote. So how can we detect the unmatched `"` then? I found that several implementations that use regular expressions for lexical analysis have this problem, including C, Go and OCaml.
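The behaviour can be reproduced with a short Python sketch. It assumes the guide's pattern at the time required the closing double quote in the string alternative; `tokenize_old` is a hypothetical helper name.

```python
import re

# Assumed original pattern: the string alternative requires a closing quote,
# so it cannot match '"123' at all.
OLD_TOKEN_RE = re.compile(
    r"""[\s,]*(~@|[\[\]{}()'`~^@]|"(?:\\.|[^\\"])*"|;.*|[^\s\[\]{}('"`,;)]*)"""
)


def tokenize_old(source):
    """Return the non-empty tokens found in `source` using the old pattern."""
    return [tok for tok in OLD_TOKEN_RE.findall(source) if tok]


tokens = tokenize_old('"123')
# No alternative can consume the opening quote, so the scan skips over it and
# the reader only ever sees the characters that follow.
print(tokens)
print(any(tok.startswith('"') for tok in tokens))
```

Because the quote never reaches the reader, nothing downstream can tell that a string was started and left unclosed.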