inconsistent .op parsing for unicode operators #9684

stevengj · 2015-01-08T21:44:44Z

As discussed on the mailing list, there is an annoying inconsistency in the parser:

julia> parse("5.≤x")
:(5.0 ≤ x) 

julia> parse("5.<=x") 
:(5 .<= x)

It would be nice to fix this by modifying the parser to treat . followed by any Unicode operator in the same way, so that we get "dot" versions of all the Unicode operators rather than having to manually add each operator twice. See also #6929 (comment)


jakebolewski · 2015-01-08T22:42:54Z

As noted on the mailing list, this particular example is due to Julia's space-sensitive parsing, but operations like:

julia> parse("5 .⊕  10")
ERROR: ParseError("extra token \"10\" after end of expression")
 in parse at string.jl:1257
 in parse at string.jl:1267

could be parsed correctly.

stevengj · 2015-01-08T23:26:03Z

@jakebolewski, look at the example more closely. The problem is not that it is space-sensitive, the problem is that .≤ and .<= are treated differently.

.⊕ is different because it is not treated as an operator at all, regardless of spacing. Although this is a separate problem, my feeling is that the fix is to parse .op consistently regardless of op, and that solution will kill two birds with one stone.
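To make the "parse .op consistently regardless of op" idea concrete, here is a toy model in Python. It is only a sketch under assumptions: the operator sets and the function names `is_dot_op_hand_listed` and `is_dot_op_uniform` are hypothetical and are not taken from Julia's parser, which is implemented differently.

```python
# Toy model only: these sets and function names are hypothetical,
# not taken from Julia's parser.
BASE_OPS = {"<", "<=", "≤", "+", "⊕"}

# Hand-maintained approach: every dotted form is listed separately, so a
# missing entry (here ".≤" and ".⊕") is silently not seen as a dot operator.
HAND_LISTED_DOT_OPS = {".<", ".<=", ".+"}


def is_dot_op_hand_listed(token: str) -> bool:
    """True only if the dotted form was added to the list by hand."""
    return token in HAND_LISTED_DOT_OPS


def is_dot_op_uniform(token: str) -> bool:
    """True for '.' followed by any known base operator."""
    return token.startswith(".") and token[1:] in BASE_OPS


if __name__ == "__main__":
    for tok in (".<=", ".≤", ".⊕"):
        print(tok, is_dot_op_hand_listed(tok), is_dot_op_uniform(tok))
```

With the uniform rule, adding a new base operator gives its dotted form automatically, so the two forms cannot drift apart.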

jakebolewski · 2015-01-09T00:08:47Z

I've added the missing unicode operators that were special cased by the parser in 1101086. I agree a more general solution would be nice.

tkelman · 2015-01-09T06:09:22Z

backport pending label for 1101086 - it's harder to lose track of as an issue label than a commit comment

stevengj · 2015-01-09T14:11:24Z

Closing this, as the immediate inconsistency is fixed. If people want more dot parsing of unicode operators (not to mention operators with combining characters like +̂), that can be a separate issue.


tkelman · 2015-01-16T09:23:14Z

backported in 68d11e4, and tests in a09b2fa

jiahao added the parser and domain:unicode labels and removed the domain:unicode label Jan 8, 2015

stevengj added the domain:unicode label Jan 8, 2015

tkelman added the backport pending label Jan 9, 2015

tkelman referenced this issue Jan 9, 2015

add some missing unicode operators to dot-opchar?

1101086

stevengj closed this as completed Jan 9, 2015

jakebolewski added a commit that referenced this issue Jan 9, 2015

add parser test for issue #9684

9322f20

jakebolewski added a commit that referenced this issue Jan 16, 2015

add parser test for issue #9684

a09b2fa

(cherry picked from commit 9322f20) Conflicts: test/runtests.jl

tkelman removed the backport pending label Jan 16, 2015
