Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix unicode indexing in parse(Complex, string) #51758

Merged
merged 1 commit into from Oct 20, 2023

Conversation

stevengj
Copy link
Member

@stevengj stevengj commented Oct 18, 2023

This fixes a string-indexing bug introduced in #24713 (Julia 0.7). Sometimes, this would cause parse(Complex{T}, string) to throw a StringIndexError rather than an ArgumentError, e.g. for parse(ComplexF64, "3 β+ 4im") or parse(ComplexF64, "3 + 4αm"). (As far as I can tell, it can never cause parsing to fail for valid strings.)

The source of the error is that if i is the index of an ASCII character in a string s, then s[i+1] is valid (even if the next character is non-ASCII) but s[i-1] is invalid if the previous character is non-ASCII.

@stevengj stevengj added kind:bugfix This change fixes an existing bug backport 1.6 Change should be backported to release-1.6 backport 1.8 Change should be backported to release-1.8 backport 1.9 Change should be backported to release-1.9 backport 1.10 Change should be backported to the 1.10 release labels Oct 18, 2023
@@ -321,14 +321,14 @@ function tryparse_internal(::Type{Complex{T}}, s::Union{String,SubString{String}
if i₊ == i # leading ± sign
i₊ = something(findnext(in(('+','-')), s, i₊+1), 0)
end
if i₊ != 0 && s[i₊-1] in ('e','E') # exponent sign
if i₊ != 0 && s[prevind(s, i₊)] in ('e','E') # exponent sign
Copy link
Member Author

@stevengj stevengj Oct 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self: Checks like this could be sped up further if we used codeunits(s) and did byte indexing/comparisons rather than Char comparisons. It would uglify the code and maybe the speedup would be small, however?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(In any case, speedups to this function should go into a separate PR, not a bugfix PR, since speedups need not be backported.)

@stevengj
Copy link
Member Author

Unrelated CI failures, it looks like: Error in testset Pkg in one test, Error in testset Downloads in another test, fatal error allocating signal stack: mmap: Cannot allocate memory in another test, JIT session error: Duplicate section in asan test, and fatal: could not read Username for 'https://github.com' in llvmpasses test

@JeffBezanson JeffBezanson merged commit f71228d into master Oct 20, 2023
3 of 6 checks passed
@JeffBezanson JeffBezanson deleted the sgj/fixparsecomplex branch October 20, 2023 20:24
KristofferC pushed a commit that referenced this pull request Oct 23, 2023
This fixes a string-indexing bug introduced in #24713 (Julia 0.7).
Sometimes, this would cause `parse(Complex{T}, string)` to throw a
`StringIndexError` rather than an `ArgumentError`, e.g. for
`parse(ComplexF64, "3 β+ 4im")` or `parse(ComplexF64, "3 + 4αm")`. (As
far as I can tell, it can never cause parsing to fail for valid
strings.)

The source of the error is that if `i` is the index of an ASCII
character in a string `s`, then `s[i+1]` is valid (even if the next
character is non-ASCII) but `s[i-1]` is invalid if the previous
character is non-ASCII.

(cherry picked from commit f71228d)
@KristofferC KristofferC removed the backport 1.10 Change should be backported to the 1.10 release label Nov 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 1.6 Change should be backported to release-1.6 backport 1.8 Change should be backported to release-1.8 backport 1.9 Change should be backported to release-1.9 kind:bugfix This change fixes an existing bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants