Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Syntax for non-standard identifiers #32408

Merged
merged 5 commits into from Aug 16, 2019
Merged

Conversation

@c42f
Copy link
Contributor

c42f commented Jun 25, 2019

Introduce syntax for non-standard identifiers. Features:

  • Have a direct way to print hidden symbols (eg, from macro expansion and closure conversion)
  • Directly write "weird" identifiers from other languages (for example R allows . as an identifier character, XRef JuliaInterop/RCall.jl#71 (comment))
  • Possibly distinguish between identifiers vs keywords in the future (see #26819)

Various syntax choices were explored (see below), with var"ident" currently chosen. Though this has the familiar visual form of a string macro, special parser support was required to make it valid in every context where a normal identifier is allowed. Therefore, this is technically a breaking change. However, the use here has exactly the same semantics as the use in RCall which is the only package I could find on github which defines @var_str.

show has been changed to use this syntax as necessary when printing expressions:

  • Fixes #32354 by printing macro identifiers using var"@ident" rather than as a macro call @ident.
  • Fixes printing of symbols generated by macro hygiene, for example var"#89#err" rather than the bare #89#err
  • Simplify printing of functions and type names which are generated by the compiler.
@c42f c42f added the bugfix label Jun 25, 2019
@c42f

This comment has been minimized.

Copy link
Contributor Author

c42f commented Jun 25, 2019

Ugh, tests are failing due to a somewhat dubious Symbol hack in Test.

base/show.jl Show resolved Hide resolved
@JeffBezanson

This comment has been minimized.

Copy link
Member

JeffBezanson commented Jun 27, 2019

Maybe it's time to add:

macro sym_str(str)
    Symbol(str)
end

🎉
I don't really see any downside to that, except that people will also want to write quoted symbols, which you can do with the subtly-different syntax Symbol("..."). See also #9945. So the only point of this would be nicer printing of Exprs containing non-identifier symbols.

@c42f

This comment has been minimized.

Copy link
Contributor Author

c42f commented Jun 27, 2019

🎉 ... wait, what :-)

That's a neat idea I think we just need a better name for it: when used in a block of code, it doesn't produce a (quoted) Symbol, but rather an identifier [edit: yes, an unquoted symbol is an identifier. But I was really expecting the quoted version there]. So we could have ident"##", name"##" var"##" or some such thing? Ugly is kind of a feature here, but the current version of $(Symbol("##")) is a bit too ugly.

@JeffBezanson

This comment has been minimized.

Copy link
Member

JeffBezanson commented Jul 2, 2019

Agreed; var" " seems nice. Hopefully doesn't conflict with anything in a package.

@c42f

This comment has been minimized.

Copy link
Contributor Author

c42f commented Jul 3, 2019

I tried searching github for this and it turns out the RCall defines @var_str... to be exactly the macro we want: https://github.com/JuliaInterop/RCall.jl/blob/271d9e2ace285417666b731cda3f7b33d60f6aea/src/macros.jl#L79

It seems @simonbyrne added this in JuliaInterop/RCall.jl#71 (comment)

@c42f c42f force-pushed the cjf/fix-show-symbols-in-exprs branch from 8862925 to 8be46c4 Jul 3, 2019
@c42f

This comment has been minimized.

Copy link
Contributor Author

c42f commented Jul 3, 2019

@var_str is added. Still need to clean up the old Test code which is abusing Symbol to store strings.

@c42f

This comment has been minimized.

Copy link
Contributor Author

c42f commented Jul 3, 2019

I think the look of @var_str in macro expansions is quite satisfying -

julia> @macroexpand @info "msg" x=1 y=2
quote
    #= logging.jl:305 =#
    var"#25#level" = Base.CoreLogging.Info
    #= logging.jl:306 =#
    var"#26#std_level" = Base.CoreLogging.convert(Base.CoreLogging.LogLevel, var"#25#level")
    #= logging.jl:307 =#
    if var"#26#std_level" >= Base.CoreLogging.getindex(Base.CoreLogging._min_enabled_level)
        #= logging.jl:308 =#
        var"#27#group" = $(QuoteNode("REPL[23]"))
        #= logging.jl:309 =#
        var"#28#_module" = Main
        #= logging.jl:310 =#
        var"#29#logger" = Base.CoreLogging.current_logger_for_env(var"#26#std_level", var"#27#group", var"#28#_module")
        #= logging.jl:311 =#
        if !(var"#29#logger" === Base.CoreLogging.nothing)
            #= logging.jl:312 =#
... etc etc
@KristofferC

This comment has been minimized.

Copy link
Contributor

KristofferC commented Jul 3, 2019

Not sure if this matters but e.g.

try catch var"#89#err" end

is currently a syntax error and this gets printed from macro expansions.

@c42f

This comment has been minimized.

Copy link
Contributor Author

c42f commented Jul 3, 2019

@KristofferC oh good catch. This particular case is easily fixed in the parser; not ideal but I think acceptable given that $ interpolation is already special cased in the same way.

@simonbyrne

This comment has been minimized.

Copy link
Contributor

simonbyrne commented Jul 3, 2019

I seem to recall that there are some edge cases where it acts a little unexpectedly. The only one I can think of at the moment is when it is used after a ., e.g.

julia> macro var_str(str)
           esc(Symbol(str))
       end
@var_str (macro with 1 method)

julia> x = 1+2im
1 + 2im

julia> x.im
2

julia> x.var"im"
ERROR: LoadError: type Complex has no field @var_str
Stacktrace:
 [1] getproperty(::Any, ::Symbol) at ./sysimg.jl:18
in expression starting at REPL[4]:1
@KristofferC

This comment has been minimized.

Copy link
Contributor

KristofferC commented Jul 3, 2019

But wouldn't you have to write getproperty(x, var"im") anyway?

@JeffBezanson

This comment has been minimized.

Copy link
Member

JeffBezanson commented Jul 3, 2019

That would be getproperty(x, Symbol("im")).

True, without parser support var"" can't be a full drop-in replacement for identifiers.

@c42f

This comment has been minimized.

Copy link
Contributor Author

c42f commented Jul 3, 2019

I've added some parser support to fix the try catch var"exc" end case pointed out by @KristofferC but there's a few other problematic cases still. I'm not quite sure how much of a game of whack-a-mole that's going to turn out to be but the fact that $-interpolation works makes me hopeful.

And there I was, thinking it would be a guilty pleasure to tackle an easy "good first issue" like #32354 and get it sorted out within the hour :-)

@c42f c42f force-pushed the cjf/fix-show-symbols-in-exprs branch from 1fcfcc0 to 947efbc Jul 6, 2019
@c42f

This comment has been minimized.

Copy link
Contributor Author

c42f commented Jul 6, 2019

Good example @simonbyrne. I think I've got most of the cases in the parser sorted out now.

For the x.var"im" case I had to expand the syntax in the parser as it expects the symbol im to be quoted (whereas var"im" expands to an unquoted name im). Supporting printing of the @var_str macros in unusual places like import also causes some surprises in show which I haven't finished sorting out yet.

Taken together, I think it might be better to declare that var"str" is a special syntax which directly interpolates $(Symbol("str")) at parse time. Given that we already need parser support that would seem to be simpler and more consistent.

@JeffBezanson

This comment has been minimized.

Copy link
Member

JeffBezanson commented Jul 6, 2019

Makes sense; we can just add one case to parse-atom instead of lots of special cases everywhere..

@JeffBezanson

This comment has been minimized.

Copy link
Member

JeffBezanson commented Jul 6, 2019

We could try to find a non-breaking syntax for this (i.e. something that's currently an error). One option is @"name".

@c42f

This comment has been minimized.

Copy link
Contributor Author

c42f commented Jul 7, 2019

we can just add one case to parse-atom instead of lots of special cases everywhere

Agreed, that will be a lot cleaner.

We could try to find a non-breaking syntax for this (i.e. something that's currently an error). One option is @"name".

Ok that's an interesting idea. I wouldn't want to steal useful syntax for this but @"name" is kind of odd looking enough that it's not obvious that it should mean something.

@StefanKarpinski

This comment has been minimized.

Copy link
Member

StefanKarpinski commented Jul 7, 2019

$”string” is valid syntax but pretty useless. I had proposed :”#funky name” as syntax for Symbol(“#funky name”) which would mean that $:”#funky name” would be the syntax for the variable. Perhaps too odd looking? The syntax for funky symbols would be slightly breaking since :”foo” is currently a convoluted way to write ”foo”.

Perhaps something that is really non-breaking like Jeff’s @“#funky name” would be better after all—it more directly solves the problem here which is writing strange variables, not strange symbols. Would that mean that :@“foo” would become an alternate way to write :foo?

@c42f

This comment has been minimized.

Copy link
Contributor Author

c42f commented Jul 8, 2019

Would that mean that :@“foo” would become an alternate way to write :foo?

Yes that's a natural consequence of the way the parser works. Good point.

So comparing two alternatives for unquoted / quoted symbols:

  • @"funky" / :@"funky" (variable name / quote a variable name)
  • $:"funky" / :"funky" (unquote a quoted variable name / quote a variable name)

The second is a bit double-negative-y, not backward compatible and less natural in the parser. On the plus side it gives a neat syntax for quoted symbol literals and an uglier syntax for funky variable names.

@StefanKarpinski

This comment has been minimized.

Copy link
Member

StefanKarpinski commented Jul 8, 2019

Additional considerations which are kind of subjective:

  • :"funky" looks suggestive (to me) of what it is proposed to mean—a way of writing a symbol with strange content. This is also the syntax for unusual symbols in Ruby and since the :foo syntax was Ruby-inspired in the first place, it seems fitting to also have :"foo" mean the same thing. Note that Ruby allows interpolation in such a case, which is also handy.

  • $:"funky" looks suggestive of splicing something into the code, which is what is going on here. If I see quote $:foo = 1 end then I know that produces quote foo = 1 end and quote $:"funky" = 1 end looks suggestive of something similar. The fact that :"funky" looks like it might mean :funky means that I can guess what's going on here.

  • On the other hand, @"funky" looks like a macro call and :@"funky" looks like a quoted macro call.

Regarding breakingness of :"funky" for :funky, I really doubt that anyone in the wild is using :"funky" as a syntax for writing a string literal. So while this is technically breaking, I think it's very technical. Splicing a string value immediately after a : is already a syntax error:

julia> x = "funky"; :$x
ERROR: syntax: extra token "x" after end of expression

julia> x = "funky"; :$(x)
ERROR: MethodError: objects of type Symbol are not callable

So we don't really even have a referential transparency type of situation here—it would only break the literal use of :"foo" to write an immediate string literal.

@JeffBezanson

This comment has been minimized.

Copy link
Member

JeffBezanson commented Jul 8, 2019

I share the concern that @"a" looks like it has something to do with calling a macro. So I'd default to picking var"a". Colons and dollar signs can already get quite confusing. In particular, the most common use of interpolation looks like $identifier. So if a $ is required to write the identifier itself, you need an extra layer of @eval, which gets very annoying to deal with.

@c42f

This comment has been minimized.

Copy link
Contributor Author

c42f commented Jul 8, 2019

I tried out @"foo", it's a bit visually confusing.

var"str" has the advantage that it's visually familiar to anyone who already knows about string macros. The fact that it starts with an identifier makes it look more natural in various syntactic contexts. Some examples:

# R identifiers
foo(x, y, var"draw.segments" = 1)
foo(x, y, @"draw.segments" = 1)
foo(x, y, $:"draw.segments" = 1) 

# Constructing weirdly named types
struct var"##"
    var"hey-"
end
x = var"##"(1)
x.var"hey-"

struct @"##"
    @"hey-"
end
x = @"##"(1)
x.@"hey-"

struct $:"##"
    $:"hey-"
end
x = $:"##"(1)
x.$:"hey-"

We don't get a very natural way to write quoted symbol literals with it; x = :var"foo" looks odd. We could just have sym"foo" or :"foo" separately.

So if a $ is required to write the identifier itself, you need an extra layer of @eval, which gets very annoying to deal with.

Would it be a problem for the parser to eagerly interpolate $:"foo"? That doesn't seem different from declaring that var"foo" is special syntax and expanding it in the parser.

@c42f c42f changed the title Fix show of non-identifier Symbols in Exprs Syntax for unparsable identifiers and quoted symbol literals Jul 9, 2019
@c42f

This comment has been minimized.

Copy link
Contributor Author

c42f commented Jul 9, 2019

Well, I continue to make a mess here, but I'll rebase all that rubbish away once the design converges. As of now, the parser changes include two modifications to parse-atom:

  • Parse var"foo" as the identifier name foo
  • Implement Stefan's proposed symbol literal syntax :"foo" to produce the quoted symbol :foo. I also translated :"foo$x" to Core.Symbol("foo$x") as I expect people would be surprised if this doesn't work. (Using Core feels icky... perhaps using Expr(:symbol, "foo", x) in analogy to Expr(:string, "foo", x) would be better?)

Subjectively I think these are the least surprising syntaxes which have been proposed so far in this thread.

@JeffBezanson

This comment has been minimized.

Copy link
Member

JeffBezanson commented Jul 9, 2019

I'm not a big fan of :"a", even though it is visually nice and concise, since it's not consistent with colon as a quoting operator, and since Symbol("a") can already be used for that. We can tackle these separately.

@JeffBezanson

This comment has been minimized.

Copy link
Member

JeffBezanson commented Aug 16, 2019

Ok, hopefully the keyword thing is fixed. I think the remaining item is pretty printing for named tuples with reserved words as names, but that can be done as a separate fix.

@JeffBezanson JeffBezanson added this to the 1.3 milestone Aug 16, 2019
@c42f

This comment has been minimized.

Copy link
Contributor Author

c42f commented Aug 16, 2019

Awesome, the keyword fix looks good to me.

@JeffBezanson JeffBezanson merged commit fb04178 into master Aug 16, 2019
15 of 16 checks passed
15 of 16 checks passed
buildbot/tester_freebsd64 Run complete
Details
buildbot/analyzegc_linux64 Run complete
Details
buildbot/doctest_linux64 Run complete
Details
buildbot/llvmpasses_linux64 Run complete
Details
buildbot/package_freebsd64 Run complete
Details
buildbot/package_linux32 Run complete
Details
buildbot/package_linux64 Run complete
Details
buildbot/package_macos64 Run complete
Details
buildbot/package_win32 Run complete
Details
buildbot/package_win64 Run complete
Details
buildbot/tester_linux32 Run complete
Details
buildbot/tester_linux64 Run complete
Details
buildbot/tester_macos64 Run complete
Details
buildbot/tester_win32 Run complete
Details
buildbot/tester_win64 Run complete
Details
buildbot/whitespace_linux32 Run complete
Details
@delete-merged-branch delete-merged-branch bot deleted the cjf/fix-show-symbols-in-exprs branch Aug 16, 2019
@marius311

This comment has been minimized.

Copy link
Contributor

marius311 commented Aug 16, 2019

I think this commit broke the following:

julia> var="foo"; run(`echo "$var"`)
ERROR: LoadError: unterminated double quote
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] #shell_parse#350(::String, ::typeof(Base.shell_parse), ::String, ::Bool) at ./shell.jl:106
 [3] #shell_parse at ./none:0 [inlined] (repeats 2 times)
 [4] @cmd(::LineNumberNode, ::Module, ::Any) at ./cmd.jl:387
in expression starting at REPL[46]:1

Any other variable name besides var works.

In any case surely no important package has the bad luck of triggering such an obscure bug :)

marius311 added a commit to marius311/julia that referenced this pull request Aug 17, 2019
@c42f

This comment has been minimized.

Copy link
Contributor Author

c42f commented Aug 17, 2019

Good catch, though I'm not sure why you refer to PyCall. That particular use of $var is terminated by a closing parenthesis and should not be affected by this bug? [Edit: it turns out that Meta.parse() in positional mode is particularly sensitive to underlying stream position.]

In any case we do have a problem:

julia> Meta.parse("\"\$var\"")
:($(Expr(:incomplete, "incomplete: invalid string syntax")))

Looking at the parser code I think this particular case can be fixed (thankfully we don't support arbitrary atoms in string interpolation syntax). Your example hits the shell_parse code path which is slightly different as it calls Meta.parse().

@c42f

This comment has been minimized.

Copy link
Contributor Author

c42f commented Aug 17, 2019

Ah, the error you're seeing in PyCall is a few lines down (https://github.com/JuliaPy/PyCall.jl/blob/6dcf5c2e1ac1399f0ad4f7d4c7cdadfd21faa524/deps/buildutils.jl#L8); it's unrelated to shell_parse and basically the same root cause as my Meta.parse example:

julia> dump(Meta.parse("\"'\$var'\""))
Expr
  head: Symbol '
  args: Array{Any}((1,))
    1: Expr
      head: Symbol string
      args: Array{Any}((2,))
        1: String "'"
        2: Symbol var
@c42f

This comment has been minimized.

Copy link
Contributor Author

c42f commented Aug 18, 2019

So there's three problems here:

  • var syntax shouldn't be allowed in string interpolations (only identifiers are allowed). This seems easy to fix by not calling parse-atom while parsing string interpolations.

  • var syntax shouldn't be allowed in command string interpolations. This is more awkward to fix in isolation because Base.shell_parse already accepts some pretty weird things due to its use of Meta.parse in non-greedy mode, which ultimately calls into parse-atom. This behavior is inconsistent with string interpolation (see #3150).

  • The use of Meta.parse non-greedy mode combined with peek-token is inconsistent and returns the wrong position for the end of the parsed atom. This would cause Base.shell_parse("\"(\$var)\"") to miss the trailing ) character. Arguably this is an existing bug which interacts in a bad way with the changes here. For example, in julia 1.x

    julia> Meta.parse(":,", 1, greedy=false)
    (:(:), 3)  # should return (:(:), 2)
    
@c42f

This comment has been minimized.

Copy link
Contributor Author

c42f commented Aug 18, 2019

Ok, I believe I've got a fix for the PyCall problems at #32948.

@marius311

This comment has been minimized.

Copy link
Contributor

marius311 commented Aug 18, 2019

Awesome, @c42f, thanks for taking a quick look sorry to throw you off a bit with my alternate example!

@StefanKarpinski

This comment has been minimized.

Copy link
Member

StefanKarpinski commented Aug 18, 2019

This is proving a bit more problematic than expected. Perhaps we got a little overeager and revert?

@c42f

This comment has been minimized.

Copy link
Contributor Author

c42f commented Aug 19, 2019

Thanks @marius311 for the quick reporting. It turned out that your example was also an important bug. I was confused because cmd interpolations are parsed quite differently from string interpolations.

@StefanKarpinski I think all the problems we know about should now be fixed in #32948 (awaiting CI) so I'd like to see how that pans out. This did have a lot more fallout than anticipated 😬

@StefanKarpinski

This comment has been minimized.

Copy link
Member

StefanKarpinski commented Aug 19, 2019

Yes, if that's the last of it then we can proceed.

@chriscoey

This comment has been minimized.

Copy link

chriscoey commented Aug 20, 2019

DynamicPolynomials.jl fails to precompile on Julia master - perhaps due to this PR? Please see JuliaAlgebra/DynamicPolynomials.jl#50 (comment)

dkarrasch added a commit to dkarrasch/julia that referenced this pull request Aug 24, 2019
Allow identifiers like Symbol("#example#") to be represented in julia
code with the syntax `var"#example#"`. This needed support in the parser
rather than being a string macro so that it could be used in any context
where a normal identifier is allowed.

Show nonstandard identifiers in Exprs with var"ident" syntax
findmyway added a commit to JuliaReinforcementLearning/ReinforcementLearningEnvironments.jl that referenced this pull request Aug 28, 2019
findmyway added a commit to JuliaReinforcementLearning/ReinforcementLearningEnvironments.jl that referenced this pull request Aug 28, 2019
* unify interfaces

* update screen after interact

* add minor comment

* add version check

* update docker

* revert to juali v1.2 due to JuliaLang/julia#32408

* update README
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants
You can’t perform that action at this time.