Dict syntax is getting me down #6739

Closed
JeffBezanson opened this Issue May 4, 2014 · 57 comments

@JeffBezanson
Member

I have not been enjoying that => is not first-class. It works well for simple dict literals ([a=>b, c=>d]), but the other forms cause trouble. For example I defined

typealias AbstractValue Dict{Symbol,LatticeElement}

but one cannot make a Dict of this type using this definition. You have to write (Symbol=>LatticeElement)[...] every time.

In the near future it will be possible to write Dict( (a,b) for i in x ), which makes a pretty good dict comprehension syntax without any special code in the front end. For literals this would be Dict([(a,b), (c,d), ...]), which starts to suffer from too much punctuation. One way to solve this is to make => a type, so you could write Dict(a=>b, c=>d, ...). => would not be iterable, so this would not be ambiguous. This would eliminate all special dict syntax. It would also make it more obvious how to write a typed empty Dict: you just pass 0 arguments.
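The constructor forms proposed here can be sketched as follows. This is a minimal sketch assuming a current Julia 1.x release, where `a => b` builds a `Pair`; `Int` stands in for the `LatticeElement` type from the example above, and `const` replaces the older `typealias` keyword.

```julia
# Literal form: each `a => b` argument is a Pair.
d = Dict(:a => 1, :b => 2)

# An iterable of 2-tuples is also accepted.
t = Dict([(:a, 1), (:b, 2)])

# A typed empty Dict: pass zero arguments to the parameterized constructor.
e = Dict{Symbol,Int}()

# An alias can be used as a constructor directly.
# `Int` is a stand-in for `LatticeElement` (an assumption for this sketch).
const AbstractValue = Dict{Symbol,Int}
v = AbstractValue(:x => 10)
```

With pairs as ordinary values, the alias works as a constructor without any dedicated dict syntax.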

@StefanKarpinski
Member

I've often wanted this too. What would the type of => be? Pair? Having a tuple of pairs as a lightweight Dict-like type would be really handy too. Or something like that.

@quinnj
Member
quinnj commented Jul 2, 2014

+1, it'd be great to have a=>b be =>(a,b) --> Pair.

@quinnj quinnj added this to the 0.4-projects milestone Aug 14, 2014
@quinnj
Member
quinnj commented Aug 14, 2014

It'd be great to address this with #7941.

@JeffBezanson
Member

Absolutely. Personally I want a syntax that generalizes, as described above, but all the better if it plays along well with array syntax.

@JeffBezanson
Member

I propose

  1. Make a=>b syntax for (a,b).
  2. Deprecate the syntax (a=>b)[...] and [ a=>b, ... ] and { a=>b, ... }.
  3. Continue to allow { a=>b for a in ... } a bit longer, but it should be deprecated too.
  4. Use Dict(a=>b, ...) and Dict{A,B}( ... ).

Dict(a=>b) is sketchy since Dict can also accept a single iterable argument, and tuples are iterable. It may be that the single best way to resolve this is to make Dict(xs...) fast. Otherwise you could run into problems if you have function f(pairs...) and you call Dict(pairs) inside. However we already use Dict(pairs) and Set(elts), so that might be too much change.

Another option is to make a=>b give a different type, but that could cause issues elsewhere, like iterating with for (k,v) in dict.

@vtjnash
Member
vtjnash commented Sep 27, 2014

Shouldn't that be Dict(a=b, ...)? Otherwise, what would the syntax in 4 mean generally?

@quinnj
Member
quinnj commented Sep 27, 2014

A couple other things to think about in accordance with other proposed changes:

  • #4470: Implementing generators could also provide a Dict(i=>v for i in ...) type syntax
  • #8470: Using { } for tuple types could open up Dict constructions/initialization to Dict({K,V}) or some other tricks
  • Actually, what about {K=>V} be shorthand for Dict{K,V}? It would kind of go hand in hand with the { } for tuple types change and writing {K=>V}[] seems pretty convenient.
    I kind of like the idea of a=>b returning a Pair(a,b) object that kind of represents a single association. I think it could be useful in a slew of other places apart from associatives. I don't understand how that would mess with iterating with for (k,v) in dict, @JeffBezanson?
@JeffBezanson
Member

Dict(a=b) could be allowed, but only works for symbol keys.

I really don't want special syntax using various combinations of brackets
and =>. Those don't generalize and are invariably hard to remember.
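A rough illustration of why a `Dict(a=b)` form would only work for symbol keys: keyword names always arrive as symbols. `symdict` below is a hypothetical helper, not a Base function, and the sketch assumes Julia 1.x keyword-argument behavior.

```julia
# Hypothetical helper: collect keyword arguments into a Dict.
# Keyword names are Symbols, so the keys can only be Symbols.
symdict(; kwargs...) = Dict(kwargs)

d = symdict(a = 1, b = 2)
```

Keys of any other type (strings, integers, tuples) cannot be written this way, which is what the `a => b` form covers.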

@toivoh
Member
toivoh commented Sep 28, 2014

I really think that a=>b should produce instances of a type distinct from Tuple, such as Pair. This will allow to distinguish uses such as Dict(a=>b) and Dict(iterable) based on syntax. Pair could still be iterable, or iteration over dicts could still return tuples.

@eschnett
Contributor

If a=>b produces a special type, then why not make it DictPair? Then the
distinction would be unambiguous. DictPair would not be iterable.

-erik


Erik Schnetter schnetter@cct.lsu.edu
http://www.perimeterinstitute.ca/personal/eschnetter/

@JeffBezanson
Member

I think it could be OK for a=>b to give a Pair for the purpose of writing Dict(a=>b). In all other cases --- Dict(iter), iteration --- we can still use tuples. This is needed for composability with zip etc.

Another advantage @jakebolewski (IIRC) pointed out is that a Pair can be printed as a=>b.
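The distinction between pair arguments and a single iterable argument can be sketched with ordinary dispatch. `describe` is a hypothetical function that only mirrors the two constructor forms; the sketch assumes Julia 1.x, where `a => b` is a `Pair`.

```julia
# Hypothetical function mirroring the two constructor forms.
describe(ps::Pair...) = "pairs"      # like Dict(a=>b, c=>d)
describe(itr) = "iterable"           # like Dict(iter)

r1 = describe(:a => 1)               # one pair
r2 = describe(:a => 1, :b => 2)      # several pairs
r3 = describe([(:a, 1), (:b, 2)])    # a collection of tuples
r4 = describe((:a, 1))               # a lone tuple is still just a collection
```

Because a tuple takes the iterable method, a one-entry literal would be ambiguous if `a=>b` produced a tuple; a separate `Pair` type avoids that.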

@JeffBezanson JeffBezanson added a commit that referenced this issue Sep 28, 2014
@JeffBezanson JeffBezanson make `a=>b` syntax for `Pair(a,b)`
use Dict(iter) or Dict(::Pair...) as Dict constructors

ref #6739
034b8f5
@JeffBezanson
Member

Under development here: https://github.com/JuliaLang/julia/tree/jb/dictsyntax

After trying this briefly, I am instantly in love with it.

@jakebolewski
Member

Another use for the Pair type which was discussed was using pair syntax for keyword arguments:

func(a,b,c=>1) = ....

A breaking change which offhand doesn't bring real benefits, but does seem more consistent.

@johnmyleswhite
Member

I'm excited about the pair idea.

@timholy
Member
timholy commented Sep 28, 2014

Oh, I see, you already introduced another Pair type. Probably best to have only one Pair, though.

@JeffBezanson
Member

@timholy Ha, yes, of course. Should have thought of that.

Keyword arguments are not just pairs, but passed differently, by name instead of position. #7704 is related: in a=>b both a and b are evaluated, unlike in a=b keyword arg syntax. So f(x; a=>b) makes sense, and after this would actually work if the syntax were simply permitted (again, #7704).
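A small sketch of the `f(x; a=>b)` form, assuming a Julia 1.x release where pairs are accepted after the semicolon. `scale` and `factor` are hypothetical names chosen for illustration.

```julia
# Hypothetical function with one keyword argument.
scale(x; factor = 1) = x * factor

key = :factor           # the keyword name is an ordinary value here
y = scale(3; key => 2)  # both sides of => are evaluated; same as scale(3; factor = 2)
```

This is the difference from `a=b`: the left side is evaluated, so the keyword name can be computed at run time.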

@vtjnash
Member
vtjnash commented Sep 28, 2014

@JeffBezanson is it possible for => to be just another normal binary operator, so that it doesn't require adding Pair to Base.Operators?

@JeffBezanson
Member

Yes that is possible. Doing that would be a bit easier if we got rid of the
old dict syntax entirely, but we can still do it. There's also a chance you
wouldn't want => redefined or overloaded. In fact you do not want to add
definitions to this. Do we want => to be the name of the type instead of
Pair?

@Jutho
Contributor
Jutho commented Sep 29, 2014

I am sure there would be many uses of a Pair type in Base, several of which would rely on kind of an equivalent status of / reflexive relationship between both members, rather than the kind of (label,value) status / implied relationship that is used in Dict entries and is contained in the => notation. Not saying that this is incompatible, just something to consider.

@johnmyleswhite
Member

A Pair type might be nice for specifying graphs in terms of their edges.

@Jutho
Contributor
Jutho commented Sep 29, 2014

Indeed, or the nodes of a binary tree, where every pair contains new pairs, up to the nodes at the lowest level, which contain the leaves.

@JeffBezanson
Member

This Pair is like a tuple in that the types of both elements are tracked --- you wouldn't want to use it to represent a dynamic tree structure.
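To illustrate how the element types are tracked, here is a minimal sketch assuming Julia 1.x. The values are arbitrary examples.

```julia
p = 1 => "one"
tp = typeof(p)               # Pair{Int, String}

# Nesting pairs nests the type parameters as well, so every distinct
# tree shape gets its own concrete type.
tree = (1 => 2) => (3 => (4 => 5))
tt = typeof(tree)

# A literal edge list, by contrast, has one uniform element type.
edges = [1 => 2, 2 => 3, 3 => 1]
te = eltype(edges)           # Pair{Int, Int}
```

The nested case shows why a dynamically shaped tree is a poor fit, while a flat list of edges is not affected.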

@johnmyleswhite
Member

I don't think people are suggesting this be the final data structure, just that it's convenient sugar for specifying literal graphs.

@StefanKarpinski
Member

Yes, that's a fair point. It could well be useful to use the syntax for other data types.

@eschnett
Contributor

The Pair for a Dict should not be iterable. If you want a reflexive data structure, then a tuple would be best, no need to invent a new type Pair for this.

What we want here for Dict (and other, similar cases) is a Pair with a bit more structure, a type that implies a "from--to" relation or somesuch. Maybe we could use => as syntax for this, i.e. use => as the name for the type? If function names can be symbols, then why not types? If we want an English name for this type, then Arrow comes to mind.

=>{Key, Value} (shorthand: Key => Value)
longhand: Arrow{Key, Value}

@JeffBezanson
Member

It's fine for it to be iterable since it is well distinguished from other types that might be used as a single iterable argument to Dict. The same thing would be possible with Tuple; we could simply define that Tuple arguments to Dict are special. However tuples are already used as general length-n containers, especially for varargs, so that special case would not be very nice.

Keyword arguments use iteration as the interface for destructuring name-value pairs, since that's what's used by (k,v) = iter syntax. Therefore Pairs pretty much have to be iterable.

=> could indeed be the name of the type.
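A short sketch of the iteration-based destructuring described above, assuming Julia 1.x. The last line reflects how current Julia treats the name `=>`, which may differ from the version under discussion.

```julia
p = :a => 1

# Destructuring uses the iteration protocol, so a Pair unpacks like a tuple.
k, v = p

# The two fields are also available by name.
f, s = p.first, p.second

# Iterating a Dict yields pairs, which the (key, val) pattern unpacks.
d = Dict(:a => 1, :b => 2)
total = sum(val for (key, val) in d)

# In current Julia, `=>` is another name for the Pair type itself.
same = (=>) === Pair
```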

@JeffBezanson JeffBezanson closed this in #8521 Oct 6, 2014
@skumagai skumagai added a commit to skumagai/StatsBase.jl that referenced this issue Oct 8, 2014
@skumagai skumagai (K=>V)[] => Dict{K,V}[] (JuliaLang/julia#6739) 1a8fef7
@imanuelcostigan imanuelcostigan referenced this issue in imanuelcostigan/FinancialMarkets.jl Oct 13, 2014
Open

Replace dict literal syntax to functional syntax #18

@elcritch
elcritch commented Jul 9, 2015

Wait, what?!! It seems this has been in the makings for a while, but personally I use dict literal syntax quite heavily for embedding coding-information using JSON-like format. It's quite prevalent coming from almost any non-Julia scripting language, especially when porting from Javascript/JSON/Python code.

Is there a possibility of allowing quoted usage of { and } so I could write, say, a "@json" macro? Compare (I know... another overloading of the colon):

data = @json { "key1": { "subkey": [ 1; {"a":4} ] } }

vs

data = Dict( "key1" => Dict( "subkey" => [ 1, Dict("a" => 4) ] ) )

or worse:

data = Dict{String,Any}( "key1" => Dict{String,Any}( "subkey" => Any[ 1, Dict{String, Any}("a" => 4) ] ) )
@elcritch
elcritch commented Jul 9, 2015

Perhaps a better question: is there any kind of written guide for porting v0.3 to v0.4 code? I'm still picking up "Julian" style. Having these types of invasive syntax changes without a guide of some sort ends up complicating the learning process. I'm willing to help write some documentation, but reading the linked issues leaves me more baffled and the full Julia docs are too long for quick reference. Suggestions?

@ScottPJones
Contributor

Maybe a json"""...""" macro would be easier? You'd be able to use string interpolation. (I don't know if you could make sure that the string syntax within the """ """ would be totally JSON compatible, with the limitation that """ has special meaning to end the string.)

@IainNZ
Member
IainNZ commented Jul 9, 2015

@elcritch I think your macro may actually work. Also the typed version doesn't really seem a relevant comparison.

As for a written guide for porting v0.3 to v0.4 code, https://github.com/JuliaLang/Compat.jl is both a way to make code run deprecation-warning-free on both versions and, at this point, the de facto guide. I think once we have a 0.4 prerelease there will be interest in a transition guide. End-users shouldn't be playing with 0.4, it's too unstable, so it's kind of a low priority.

@JeffBezanson
Member

Yes I think such a @json macro would work fine. Also NEWS.md is supposed to be a quick guide to what changed from 0.3 to 0.4.

@ScottPJones
Contributor

@JeffBezanson you mean the @json { ... } style macro? Would that get the "..." strings already parsed in Julia fashion, instead of JSON fashion? (that's partly why I was thinking a string macro instead might work better)

@elcritch
elcritch commented Jul 9, 2015

@ScottPJones , a json""" ... """ macro is nice for larger copy-and-paste situations. For generic code I prefer the native Julia as it is syntax highlighted in most editors and you don't need to prepend $ to variables. It could be added to the json.jl package easily. PS What do you mean when you ask if the "..." strings would be parsed in Julian fashion? AFAIK, they'd be parsed as normal strings after the macro expansion.

@elcritch
elcritch commented Jul 9, 2015

@IainNZ , Thanks, I'll look into it. Does that module also show how to allow a macro to suppress syntax warnings? @JeffBezanson , the macro is really straightforward, but I'm unsure whether it'd be preferable to use traditional JSON notation, e.g. { "a": 1, "b": { "bb": 2} }, by overriding the : operator (I tried it, works) or to stick with the => operator. Also, would it be appropriate to create an issue with example code and/or a pull request? I'll look into the relevant community docs if so.

@elcritch
elcritch commented Jul 9, 2015

@JeffBezanson , thanks, the NEWS.md pointed me to the syntax change. Unfortunately it was unclear if the { "a" => 1, ... } syntax was being deprecated in addition to the Dict list comprehension syntax.

@JeffBezanson
Member

I'm confused --- Dict comprehension syntax ([ a=>b for x in y ]) is not deprecated. NEWS says:

  * `Dict` literal syntax `[a=>b,c=>d]` is replaced by `Dict(a=>b,c=>d)`,
    `{a=>b}` is replaced by `Dict{Any,Any}(a=>b)`, and
@elcritch

Ooops! My bad for skimming, I must have interpreted it backwards. Here's a gist to an example macro json_macro.

@kmsquire
Member

@elcritch, it would be great if you submitted a PR to JSON.jl.

@ScottPJones
Contributor

@elcritch The problem is that Julia's string syntax is not really compatible with JSON string syntax. (I had to deal with this myself a year ago, writing a JSON parser for another language).
JSON requires exactly 4 hex digits after \u, and characters > 0xffff must be represented by surrogate pairs, i.e. \uXXXX\uYYYY. String interpolation will eat $ characters (I don't know if there is any way of turning that off for string literals that are going to be interpreted by a macro). There can't be any characters < 0x20 (control characters) either in a conforming JSON string. Numbers should be preserved (not converted to a floating point representation; this is a common mistake with JSON parsers, because people confuse what JavaScript does with the JSON standard). It is up to the consumer of the JSON string to convert numbers into whatever numeric format is desired, which could be an arbitrary precision decimal floating point number, for example. There is no "Inf" or "NaN" either.
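The escape rules above can be sketched as follows. `json_unicode_escape` and `has_raw_control` are hypothetical helpers written for illustration (they are not JSON.jl functions), assuming Julia 1.x.

```julia
# Hypothetical helper: render one character as JSON \u escapes.
function json_unicode_escape(c::Char)
    cp = codepoint(c)                        # Unicode code point as UInt32
    hex4(x) = string(x, base = 16, pad = 4)  # exactly four hex digits
    if cp > 0xffff
        # Outside the Basic Multilingual Plane: split into a surrogate pair.
        v  = cp - 0x10000
        hi = 0xd800 + (v >> 10)              # high surrogate
        lo = 0xdc00 + (v & 0x3ff)            # low surrogate
        return "\\u" * hex4(hi) * "\\u" * hex4(lo)
    end
    return "\\u" * hex4(cp)
end

# Control characters (< 0x20) may not appear unescaped in a JSON string.
has_raw_control(s::AbstractString) = any(c -> codepoint(c) < 0x20, s)
```

For example, `json_unicode_escape('\U1F600')` gives the pair `\ud83d\ude00`, whereas Julia's own string syntax would write the same character as `\U1F600`.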

@kmsquire
Member

That has little to do with @elcritch's macro or request. AFAIK, we don't validate many of the strings/inputs we currently convert to JSON either, and in either case, that could (and should) be handled by a separate (optional) validation step in JSON.jl.

@ScottPJones
Contributor

@kmsquire It all depends on how similar to JSON syntax @elcritch wants to get. If all he wants is just the {, }, [, ], and : to act like JSON, then that's fine. If he expected the stuff following the @json macro to really be JSON syntax, then the issues I raised would be important.

@kmsquire
Member

What I was suggesting is that both cases would (and should) be handled by adding validation to JSON.jl (assuming that's where his macro would live).

@ScottPJones
Contributor

OK, yes, I fully agree with adding this all to JSON.jl. I'll start looking at what can be done to handle JSON string constants (possibly just within a json"""...""" string macro, if that's not already done)

@ivarne
Contributor
ivarne commented Jul 10, 2015

@ScottPJones JSON.jl doesn't have any macro.

@elcritch

@ScottPJones , that'd be great! Having a JSON string macro would be useful for the cases you mention (e.g. direct compatibility). It sounds like you have experience with parsing the JSON spec; would it be possible for you to check the JSON.jl package and see if they're handling some of the more obscure cases correctly? I'll work on creating a PR for them.
@kmsquire , agreed. My main intent is just to embed json-esque data into my code so normal Julia string handling is what I'd prefer for that macro. A proper string_JSON macro would be more appropriate for the other cases.

@StefanKarpinski
Member
@ScottPJones
Contributor

@StefanKarpinski I agree, people tried to do the same thing to the language I worked on. You don't see a problem with having a json"""...""" string macro in JSON.jl though for convenience, do you?

@StefanKarpinski
Member
@ScottPJones
Contributor

At least the @json macro that @elcritch wants is doable in Julia, without (AFAIK) even touching the base language, unlike the nasty language ambiguities it would have added to ObjectScript, 👏 👏 👏 to the people who made things like that possible in Julia!

@elcritch

While not having a special dictionary syntax is annoying at first (primarily since I've been using Python for so long), I completely agree that it actually ends up being a very annoying problem. For example, try adding a first class OrderedDict to Python. :/ @ScottPJones I could work on adding a macro string in a PR using the standard parsing method if you wanted to focus on the validation parts. I'd like to be able to contribute something, even if it's small. :)

@elcritch

And fantastic point @ScottPJones regarding how Julia makes it possible. Great work @StefanKarpinski and @JeffBezanson and the rest of the team!

@ScottPJones
Contributor

https://github.com/JuliaLang/JSON.jl seems to be mostly the work of @WestleyArgentum and @aviks, very nice people I met at JuliaCon, so hopefully they will help us out! We should probably move this discussion to a PR under JSON.jl (can you open one there, @elcritch?) Macros are not (yet 😀) my forté, so I'd be happy just to deal with the validation parts.

@elcritch

Great, will do!

@axsk
Contributor
axsk commented Dec 27, 2015

What was the reason for deprecating the {k=>v} syntax? I liked it and having to write Dict(..) every time seems kind of clunky.
Being such a basic structure justifies the extra syntax in my eyes, we also wouldn't want to drop []...

@Ismael-VC
Contributor

@axsk that ship sailed long ago; your question is better suited to julia-users. Long story short: consistency. Julia is not Python and it uses { } for other things. Please let us continue this at julia-users or at the Gitter chat room, where you can link to this page and keep discussing this if you want.

@quinnj
Member
quinnj commented Jan 7, 2016

@axsk, note that you can still do

julia> [a=>b for (a,b) in zip(1:10,11:20)]
Dict{Int64,Int64} with 10 entries:
  7  => 17
  4  => 14
  9  => 19
  10 => 20
  2  => 12
  3  => 13
  5  => 15
  8  => 18
  6  => 16
  1  => 11
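
For comparison, the generator form mentioned earlier in the thread (`Dict(i=>v for i in ...)`) can be sketched like this. It assumes a Julia 1.x release with generator expressions, where the bracketed comprehension collects pairs into a vector rather than a Dict.

```julia
# Build a Dict from a generator of pairs.
d = Dict(a => b for (a, b) in zip(1:10, 11:20))

# In Julia 1.x the bracketed form collects the pairs into a Vector instead.
ps = [a => b for (a, b) in zip(1:10, 11:20)]

# The vector of pairs can still be turned into a Dict afterwards.
d2 = Dict(ps)
```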