-
-
Notifications
You must be signed in to change notification settings - Fork 5.7k
[REPL] More accurate ends_with_semicolon
#43374
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This comment has been minimized.
This comment has been minimized.
ends_with_semicolonends_with_semicolon
|
(For fully robust semicolon detection, it seems better to extend |
|
Thanks for the suggestions! I implemented both. It's true that this still has corner cases, and that is unfortunate, but to turn it around: this is exactly why I made this small PR! The new approach from this PR has fewer corner cases than the current approach, so we should both be happy :) Is this PR not an objective improvement over the current state? The old test cases still pass, the new test cases now also pass. |
|
This doesn't seem right: julia> ends_with_semicolon("foo; \"\" ")
trueProbably instead of stripping strings entirely you should replace them with empty strings. |
Co-authored-by: Steven G. Johnson <stevenj@mit.edu>
Thanks! Also implemented now |
|
This returns |
|
@stevengj True! I don't think it's worth our time to fix this edge case, but please close the PR if you disagree. |
|
How about an improved finite state machine that handles strings etc: let matchend = Dict("\"" => r"\"", "\"\"\"" => r"\"\"\"", "'" => r"'",
"`" => r"`", "```" => r"```", "#" => r"$"m, "#=" => r"=#|#=")
global _rm_strings_and_comments
function _rm_strings_and_comments(code::Union{String,SubString{String}})
buf = IOBuffer(sizehint=sizeof(code))
pos = 1
while true
i = findnext(r"\"(?!\"\")|\"\"\"|'|`(?!``)|```|#(?!=)|#=", code, pos)
isnothing(i) && break
match = SubString(code, i)
j = findnext(matchend[match]::Regex, code, last(i)+1)
if match == "#=" # possibly nested
nested = 1
while j !== nothing
nested += SubString(code, j) == "#=" ? +1 : -1
iszero(nested) && break
j = findnext(r"=#|#=", code, last(j)+1)
end
elseif match[1] != '#' # quote match: check non-escaped
while j !== nothing
notbackslash = findprev(!=('\\'), code, first(j)-1)::Int
isodd(first(j) - notbackslash) && break # not escaped
j = findnext(matchend[match]::Regex, code, first(j)+1)
end
end
isnothing(j) && break
if match[1] == '#'
print(buf, SubString(code, pos, first(i)-1))
else
print(buf, SubString(code, pos, last(i)), ' ', SubString(code, j))
end
pos = last(j)+1
end
print(buf, SubString(code, pos, lastindex(code)))
return String(take!(buf))
end
end
ends_with_semicolon(code::Union{String,SubString{String}}) =
contains(_rm_strings_and_comments(code), r";\s*$")
ends_with_semicolon(code::AbstractString) = ends_with_semicolon(String(code))This passes all of the tests, except for the invalid Update: fixed to correctly identify escaped quotes, which are preceded by an odd number of backslashes. |
|
Even better, thanks a million! I will implement this and update the tests tomorrow 👍 |
Co-Authored-By: Steven G. Johnson <2913679+stevengj@users.noreply.github.com>
|
What should I do to get this PR through? |
|
Thanks for your help @stevengj , I really admire that you resolved our discussion about a temporary 'hack' by writing a proper implementation :) |
* More accurate `ends_with_semicolon` Co-authored-by: Steven G. Johnson <stevenj@mit.edu>
* More accurate `ends_with_semicolon` Co-authored-by: Steven G. Johnson <stevenj@mit.edu>
* More accurate `ends_with_semicolon` Co-authored-by: Steven G. Johnson <stevenj@mit.edu>
REPL.ends_with_semicolonis used to determine whether or not to display output in the REPL, but IJulia and Pluto also use it. This PR uses some regex replacements instead of a stateful search for a more accurate heuristic.I did read and agree with this comment, but it turned out to be fairly easy to implement a "parser" good enough for this purpose. The new implementation is actually shorter, albeit with more "regex magic".
The new strategy requires a bit more calculation, but I think that this problem is not performance sensitive, and accuracy is more important? Let me know if you agree.
Fix #28743 and some others