New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add NamedTuples #22194

Merged
merged 1 commit into from Nov 2, 2017

Conversation

@JeffBezanson
Member

JeffBezanson commented Jun 2, 2017

This adds a built-in type very similar to the type in the NamedTuples package, with the syntax (x=1, y=2). Also owes quite a bit to #16580. I tried to keep this fairly minimal for now; we will probably need more functions.

The primary difference with NamedTuples is that the type itself is not used to construct a NamedTuple from field values. In general we seem to have moved to more "convert-like" constructors; for example Tuple(itr) will convert an iterable to a tuple, and Tuple(a, b, c) is not used to construct (a, b, c). (A bit of discussion on this in #15120 and #20969.)

One big issue is that while the syntax for instances is really nice, the current (implementation-based) syntax for NamedTuple types is nasty: NamedTuple{(:a, :b), Tuple{Int, Int}}. Maybe we're ready to use {a=Int, b=Int}, and {Int, Int} for tuple types? Although there are still other potential uses for that syntax.


Here's a summary of the justification and thinking behind this:

Relational tables are collections of tuples. Experience with tabular data in julia has shown that the main problem with tuples is not being able to refer to components by name. NamedTuples.jl adds that ability, and has worked very well, but suffers from (1) poor syntax, (2) relying on eval and thus breaking precompilation and generated functions in various ways. At the same time, we need a new approach to keyword arguments, and there are obvious parallels between the requirements for keyword arguments and the tabular data use cases. Here are some design criteria:

  1. Orderedness: both the tabular case and keyword arguments want ordered-dict-like containers.
  2. Analogy to tuples: tuples correspond to positional arguments, and there should be a corresponding structure for keyword arguments. It seems more elegant if this other structure is as much like tuples as possible. Syntactically, it's natural for (a=1, b=2) to make a keyword argument container the same way (1, 2) makes a positional argument container.
  3. Specialization: for performance, we need specialization on the type of each element and on the list of field names. This is needed to accelerate table operations and makes it easier to implement static keyword sorting and dispatch.
  4. Special syntax: a.b, (a=1,), (; x, y), (; a.x, a.b) are all desirable for both purposes, but only really work with symbol keys. It's generally difficult to justify a container where the keys have to be symbols, but our structs already have this property, and it's also an established property of record types and object types in other languages. So everything makes sense if we just say this is an anonymous struct type, and reuse as much of that internal machinery as possible.
  5. Covariance: we need to be able to join named tuple types and obtain types like Tuple{Union{T,Null}, Union{T,Null}} instead of producing a union with 2^n components. Adding another built-in type that shares infrastructure with tuple types is by far the easiest way to get this. EDIT: Covariance is not essential, and may in fact be a bad idea. With invariance, it's possible for a table type that knows it has nullable columns to specifically construct named tuples with Union{T,Null}-typed fields and have that be a concrete type, which avoids blowup in the number of named tuple types in at least some cases.
  6. Iterating values. The tabular use case wants to ignore the names as much as possible, for things like lexicographic ordering, and code that is generic over tuples and named tuples. Admittedly this feature is not a perfect match for keyword arguments, which would be happy with pair iteration, but this is not a significant problem since context (being after a semicolon) indicates when to treat a container as a keyword argument container.

There are different perspectives on dict-like types: sometimes they are more like relations, containing pairs, sometimes they are primarily collections of keys, and sometimes they are primarily collections of values. Iteration cannot do all of these at once; indeed, iteration has little to do with the concept of "keys". This becomes an issue for splatting during named tuple construction. So far, the interface to key-value splatting has been iterating pairs. But iteration isn't the whole story, since we also compute a merge (union-join) of the splatted pairs. Therefore it makes sense to use a slightly different protocol, as merge is more intrinsically tied to keyed collections than iteration is. So here I've made merge(::NamedTuple, other) the protocol for splatting in named tuples. There is a fallback definition for merge that expects an iterator of pairs, operating like we do now for e.g. keyword arguments. I think this makes sense: if an associative collection iterates pairs it doesn't have to do anything special, but if it wants to iterate keys or values instead, it can still participate by implementing merge.

@quinnj

This comment has been minimized.

Member

quinnj commented Jun 2, 2017

I'm excited for this. Thoughts:

  • The need for covariance is to match Tuple type behavior/covariance? What are the main advantages/uses of covariance here (since this isn't use for function arguments)?
  • I'm definitely on board with curly brace tuple type syntax
getindex(t::NamedTuple, i::Symbol) = getfield(t, i)
function getindex(t::NamedTuple, I::AbstractVector)
names = unique( Symbol[ isa(i,Symbol) ? i : fieldname(typeof(t),i) for i in I ] )

This comment has been minimized.

@quinnj

quinnj Jun 2, 2017

Member

Are the fields in a NamedTuple required to be unique?

This comment has been minimized.

@JeffBezanson

JeffBezanson Jun 2, 2017

Member

Yes, but I haven't yet done much to enforce that.

This comment has been minimized.

@quinnj

quinnj Jun 16, 2017

Member

So this getindex method is kind of a bummer; I played around with a definition like:

@generated function Base.getindex(nt::NamedTuple{names}, inds::Int...) where {names}
    nms = [:(names[inds[$i]]) for i = 1:length(inds)]
    args = [:(getfield(nt, inds[$i])) for i = 1:length(inds)]
    return :(Base.namedtuple(NamedTuple{tuple($(nms...))}, $(args...)))
end

But the problem is you still don't know the values of inds, so it has a hard time knowing the exact return type for the NamedTuple. Is there any other way we could get a smarter multi-getfield-like type method?

@JeffBezanson

This comment has been minimized.

Member

JeffBezanson commented Jun 2, 2017

Covariance isn't super important, particularly without nice syntax for types. With nicer syntax, I can imagine using {a=Any, b=Any} as an argument type or element type.

@ararslan ararslan added the collections label Jun 2, 2017

@andyferris

This comment has been minimized.

Contributor

andyferris commented Jun 2, 2017

Super cool!

@quinnj I thought the idea is to use these for keyword arguments, so I'm not sure if covariance is useful to make f(;a::Real=1, b::Real=0.0) strongly typed (specialised on the types of a and b).

As for curly {} syntax - I'm not sure what I think yet but I do know that if inlined immutables, Buffer, non-resizable Vector and a new resizable list type are still on the roadmap for v1.0, then we definitely need syntax for both lists and vectors (there is already [1, 2] vs [1; 2], but I'm guessing {1, 2} might be considered a contender - until we rule it out at least?)

@andyferris

This comment has been minimized.

Contributor

andyferris commented Jun 2, 2017

Could we maybe just have lowering turn Tuple{a=Int} into NamedTuple{(:a,), Tuple{Int}}? Or perhaps from NamedTuple{a=Int}, if the first is too confusing?

And then similarly for show of the type?

@JeffBezanson

This comment has been minimized.

Member

JeffBezanson commented Jun 2, 2017

Yes, we could allow T{a=b, c=d} to pass keyword arguments to apply_type, which could eventually have other uses as well. It seems a bit roundabout, since we'd be using keyword arguments to create the type of container used to pass keyword arguments, but it's doable. Or we could lower T{a=b, c=d} to T{(:a,:c),Tuple{b,d}}, which would be more efficient but a bit ad-hoc.

@@ -0,0 +1,91 @@
# This file is a part of Julia. License is MIT: https://julialang.org/license
@generated function namedtuple(::Type{NamedTuple{names,T} where T}, args...) where names

This comment has been minimized.

@quinnj

quinnj Jun 2, 2017

Member

This could perhaps just be simplified to

@generated function Base.namedtuple(syms::NTuple{N, Symbol}, args...) where N
    if length(args) == N
        Expr(:new, :(NamedTuple{syms,$(Tuple{args...})}), Any[ :(args[$i]) for i in 1:N ]...)
    else
        :(throw(ArgumentError("wrong number of arguments to named tuple constructor")))
    end
end

since it's a bit redundant to pass NamedTuple to the namedtuple function. Also can this be exported? I'm finding it the handiest way to generate non-literal NamedTuples. It has a nice companion in tuple(args...), you just have to pass in the fieldnames as well.

This comment has been minimized.

@JeffBezanson

JeffBezanson Jun 2, 2017

Member

My version is useful since it causes specialization on the names tuple. But we could add

namedtuple(syms::NTuple{N, Symbol}, args...) where N = namedtuple(NamedTuple{syms}, args...)

Though also I believe @vtjnash is working on inference improvements that could make my original method unnecessary.

This comment has been minimized.

@quinnj

quinnj Jun 3, 2017

Member

That would depend on having some kind of dedicated syntax thought, right? Like ((x=y for (x, y) in zip(syms, vals))...) or curly braces or something?

@davidanthoff

This comment has been minimized.

Contributor

davidanthoff commented Jun 3, 2017

Have we resolved the question of inferred names that I brought up here? Having a syntax that infers the names like in C# and VB is really quite key for my use of named tuples in Query.jl. I guess I could keep using {} as the named tuple syntax in Query.jl, but I would much prefer to just use whatever is the julia base convention.

I also hope to enable the following kind of generator/query syntax very soon, and that would only work if the normal named tuple syntax supports inferred names:

df = DataFrame(<somedata>)
df2 = DataFrame({i.Name, i.Address} for i in df if i.Age > 3)

I think that kind of generator syntax is super julian and it essentially can cover any filtering and projection of tables with almost the shortest syntax I can think of. But it really only works if we have inferred names in the named tuple syntax.

@davidanthoff

This comment has been minimized.

Contributor

davidanthoff commented Jun 3, 2017

Here are some other syntax ideas that would be cool to have. I'm using {} for the named tuple syntax here.

x = {a=1, b=2}
y = {c=3, d=4}
z = {x...,y...} # Constructs a named tuple with fields a,b,c and d
z = {e=5, x..., f=8, y.d} # Constructs a named tuple with fields e, a, b, f, d

One thing that would be really nice is if one could use [] to select a subset of fields. But I'm not entirely sure what a good scoping rule for that would be... One (problematic) approach would be:

z[a:f] # Constructs a named tuple with fields a,b,f

This would essentially mean that one can no longer index with a variable into a named tuple, and I'm not sure that is a good idea... But if someone has some other idea how to easily select a subset of fields from a named tuple in a concise way it would be great.

The use case for this last example might be something like this:

df = DataFrame(<somedata>)
df2 = DataFrame({i[b:m]...,x=3} for i in d)

This would create a new table that has columns b:m from the original table plus a column x.

@JeffBezanson

This comment has been minimized.

Member

JeffBezanson commented Jun 3, 2017

No, we haven't resolved this, but { } is still available.

Hopefully it's ok for the range indexing case to need a few extra characters. For instance i[:b .. :m] would work.

@quinnj

This comment has been minimized.

Member

quinnj commented Jun 3, 2017

@davidanthoff, note that subsetting & merging are already included in this PR

julia> t = (a=1, b=2, c=3, d=4, e=5, f=6)
(a = 1, b = 2, c = 3, d = 4, e = 5, f = 6)

julia> t[[:a, :c, :e]]
(a = 1, c = 3, e = 5)

julia> r = (g=7, h=8, i=9)
(g = 7, h = 8, i = 9)

julia> merge(t, r)
(a = 1, b = 2, c = 3, d = 4, e = 5, f = 6, g = 7, h = 8, i = 9)

I can see how a dedicated syntax for named tuples would be convenient, given that you can currently do (itr...) as super convenient syntax for creating a tuple, but no corresponding syntax exists for named tuples, though I'm not sure exactly how convenient that syntax could be: {(a=>b for (a,b) in zip(A, B))...} isn't too bad, but not great, and that's assuming we make Pairs work for this.

I don't think I'm a fan of having {x.a, x.b} mean {a=x.a, b=x.b} though; there's nothing else like that in the language and it's not intuitive, too "magick-y". I also think there's got to be other ways of accomplishing what you're after. One option that immediately seems like it would work would be to do something like:

@generated function nt(x::T) where {T}
    names = tuple(T.name.names...)
    vals = map(names) do n
        :(getfield(x, $(Expr(:quote, n))))
    end
    return :(Base.namedtuple(NamedTuple{$names}, $(vals...)))
end

which allows you to do

julia> struct A
         a::Int
         b::Int
       end

julia> a = A(1, 2)
A(1, 2)

julia> nt(a)
(a = 1, b = 2)

in fact, that's so handy, it might be worth including in this PR (with a better name of course). Though unfortunately I'm not sure there's a way to get the reverse (i.e. auto-constructor of any T given a NamedTuple of field=values).

@davidanthoff

This comment has been minimized.

Contributor

davidanthoff commented Jun 3, 2017

@quinnj Yeah, the merge function was in NamedTuples.jl already. The syntactic sugar for that would be nice but I could certainly emulate that for the time being in Query.jl, so I think that is less high priority from my point of view. I am really excited that this merge function can be type stable now, which was not feasible with the old NamedTuples.jl.

Having said this, there is precedence for that kind of syntax in Javascript and Typescript, they call this spread properties. The main use case in Query.jl for this spread syntax is in joins. You typically end up with two range variables for the two tables you join, and then you might want to construct one table out of the columns of both tables. With this spread syntax you could do this as @select {i..., j...} (where i and j refer to the two tables that are joined respectively). Right now that is one of the cases that is really not elegant in Query.jl (or the original LINQ).

I don't think I'm a fan of having {x.a, x.b} mean {a=x.a, b=x.b} though; there's nothing else like that in the language and it's not intuitive, too "magick-y".

I guess one counter data-point is that C# and VB have had this for over a decade and I have not seen any complaints about this. And while it might not seem like a huge thing, it is really quite key for the whole LINQ design (which Query.jl copies heavily from).

I also think there's got to be other ways of accomplishing what you're after. One option that immediately seems like it would work would be to do something like:

The example you give seems useful, but I don't understand how it maps to the projection case in a query like statement. If we stick with the example I gave (DataFrame({i.Name, i.Address} for i in df)), how does the nt function come into play?

@davidanthoff

This comment has been minimized.

Contributor

davidanthoff commented Jun 3, 2017

Hopefully it's ok for the range indexing case to need a few extra characters. For instance i[:b .. :m] would work.

Yeah, that would work.

This is also generally a larger design area where I'm just not sure what I should do in Query.jl. On some level I'd like to somehow support something similar to all the select helper functions like starts_with etc. from dplyr, but on the other hand I'm no sure whether that kind of stuff belongs in a named tuple syntax... So ideally something like this would work in a type stable way:

x = {fielda=1, fieldb=2, c=3}
y = {starts_with(x, :field)...} # Creates a named tuple with fields fielda and fieldb
# Or maybe
y = {x[starts_with(:field)]...}
# One could even think about set like operations on the names
y = {x[starts_with(:field) - :fieldb]...} # would create a named tuple with fielda

But I'm not sure this is any good... In dplyr there are lots of other utility functions equivalent to starts_with. I don't have a good handle on that hold area, I probably just need to think a lot more about this before things clear up. I just wanted to mention it here, no need to come up with a solution, but this is the next problem I'm trying to solve in Query.jl, and it is pretty closely linked to the whole named tuple story.

@davidanthoff

This comment has been minimized.

Contributor

davidanthoff commented Jun 3, 2017

For instance i[:b .. :m] would work.

Oh, and one more things: I also still have hope for #21875, so whenever I see multiple dots now, I start to wonder whether it can all be combined or not :)

@JeffBezanson

This comment has been minimized.

Member

JeffBezanson commented Jun 3, 2017

My sense is that something interval-like is the most useful application of ... It's really useful for queries, and the other proposed uses seem more obscure.

Splatting is a good topic here. Obviously (a..., b...) is already taken, so we need to do something else for named tuples. In analogy to keyword arguments, we could use (; a..., b...) (everything before ; is "positional" and everything after ; is "named"). However, some might find this weird, as it is utterly different from (a=1; b=2), and you can't put anything before the ; --- we don't have any kind of object to return for (a, b; c=1, d=2). So {a..., b...} might be a viable option.

On the other hand, since (; a=1, ...) is already used for keyword arguments, it's possible that syntax could do all the work. Currently in (; a=1, x), the x is a "computed keyword argument", meaning it gets evaluated and is expected to yield a symbol-value pair. I will hazard that this feature is almost never used. Furthermore, it could easily be replaced by requiring => for computed keywords, i.e. (; a=1, x[1]=>x[2]). That frees up some syntax. (; a.x) could mean (; x=a.x). Currently, (; a.x) means that a.x evaluates to a symbol-value pair, which is bizarre and I would bet has never been used. In contrast, if we use named tuples for keyword arguments and a refers to some existing keyword argument container, then f(; a.name1, a.name3) is a natural way to forward a subset of keywords to another function, which is fairly common.

That would also fix #9972, since f(; x) could be a straightforward syntax error.

@davidanthoff

This comment has been minimized.

Contributor

davidanthoff commented Jun 4, 2017

The (; a.x) thing would work, but aesthetically I don't find it pleasing at all... I think for the Query.jl use case it would just look odd.

It might make sense to think where else named tuples will pop up and how often we will look at them going forward. If they are rare, maybe it doesn't matter that much if they look a bit odd, but if they will be used a lot in a lot of contexts it would be nice to have something that is aesthetically nice.

One major use case might be return values of functions. Essentially I can imagine that any function that currently returns multiple values via a tuple might in the future return a named tuple instead. This would especially make sense in combination tuple deconstruction syntax and with rest properties like syntax. Say we have a function like this:

optimize() = {maxium=5, opt_x=2, iterations=5, algorithm=:myown}

One way to call this is obviously x = optimize() and then x would be that named tuple. But it would be great if one could also do this: a, b, c, d = optimize(), i.e. if the normal tuple deconstruction syntax would work with named tuples. And then the rest properties type thing might be a, b, c... = optimize() and in this case c would be {iterations=5, algorithm=:myown}, i.e. c would collect all the remaining tuple elements. If this kind of tuple deconstruction and rest properties would work for named tuples, one could essentially change any function that currently returns a tuple to return a named tuple instead, without breaking any client code, which would be great, after all it would just make a lot of APIs more self-documented.

@JeffBezanson

This comment has been minimized.

Member

JeffBezanson commented Jun 4, 2017

All of that already works with the parenthesized syntax (maxium=5, opt_x=2, iterations=5, algorithm=:myown) (except a, b, c... = which we don't have yet). A problem only appears when we want a.b to have a non-standard interpretation as b=a.b.

(a=1, a.b) can work since there's a context clue that we're in a named tuple, but without an explicit named element we'd need something else. Maybe prefix =, basically just allowing the name to be omitted and inferred: (=a.x, =b.y) ? That's currently a syntax error.

Or if we use { }, maybe we can get some extra mileage out of it by also using {obj | field=val} for #21912-style non-mutating updates.

@JeffBezanson

This comment has been minimized.

Member

JeffBezanson commented Jun 4, 2017

Also: while (; a.x) doesn't look so great on its own, I still think it's worth considering making it work for passing keyword arguments, i.e. f(a, b; c.x) means f(a, b, x=c.x).

@JeffBezanson JeffBezanson force-pushed the jb/namedtuples branch from 45b2b81 to d22ab50 Jun 4, 2017

@davidanthoff

This comment has been minimized.

Contributor

davidanthoff commented Jun 4, 2017

I like (=a.x, =b.y) better than (;a.x,), but I would still prefer {a.x, b.y} :) I think one of the reasons that LINQ works really well is that the named tuple syntax really has very minimal overhead over a normal SQL SELECT statement. If I think about a projection statement that selects a lot of columns, the =a.x syntax just gets cumbersome:

@from i in df begin
@select (=i.a, =i.b, =i.c, =i.d, =i.e, =i.f, =i.g)
end

just is less nice than the same thing without the =. Having said that, I kind of like the idea outside of a query, though...

@davidanthoff

This comment has been minimized.

Contributor

davidanthoff commented Jun 4, 2017

Here is another syntax idea: {a::Int=x} for explicit typing of fields.

@JeffBezanson

This comment has been minimized.

Member

JeffBezanson commented Jun 4, 2017

Well, inside a macro argument you can have any syntax you want :)

@JeffBezanson

This comment has been minimized.

Member

JeffBezanson commented Jun 4, 2017

Here is another syntax idea: {a::Int=x} for explicit typing of fields.

Yes, that works fine.

We only need a single context clue to know we're in a named tuple, so multiple = might not be necessary. You could use (=i.a, i.b, i.c).

Given that brackets are some of the most valuable ASCII real estate, it doesn't seem likely that we'll end up spending { } on this, though many more people will need to weigh in.

@davidanthoff

This comment has been minimized.

Contributor

davidanthoff commented Jun 4, 2017

Well, inside a macro argument you can have any syntax you want :)

Yes, but a) it won't help with my plan to make things like this DataFrame({i.a, i.b} for i in df) work, i.e. using generator syntax for simple queries that only use filtering and projection and b) I'd very much like to avoid redefining base syntax inside queries, one of the nice things in LINQ is that it is really built up of more basic language features without many extra things. If named tuples in base get (), then I assume {} will be used for something else, and whatever that is, I'd want folks to be able to generate that from a query as well, which won't work if I redefine {} for named tuples in queries.

Here is another (maybe wacky) idea:

(i.a, b;) always generates a tuple, i.e. a trailing ; inside () forces things to be a tuple. (;i.a, b) always generates a named tuple, i.e. a ; at the beginning of () forces things to be a named tuple. The rules for () without any ; become more complicated: if the compiler can infer names for all elements, a named tuple will be generated, otherwise a tuple. This would obviously be breaking. It would also push people to more named tuple usage because things would turn into a named tuple more often when folks don't make an explicit choice between a tuple and a named tuple via a ; placement.

@davidanthoff

This comment has been minimized.

Contributor

davidanthoff commented Jun 4, 2017

Another idea.

Having thought a bit more about (;a), I think I actually don't mind the extra character so much, but my aesthetic objection is more about the optical asymmetry that this introduces. ; to me separates two things, and so I'm kind of expecting some content to the left of it, and so the lack of content to the left of ; is what feels odd to me.

But what about putting an extra symbol in front of () to denote named tuples? Say we use @ (that might be a bad choice, I haven't thought about it, just think any special character). Then a named tuple would be constructed with @(a.b, c = 5) etc. To me that looks much nicer than the (; story. So the generator table story would look like DataFrame(@(i.a, i.b) for i in df), and a Query.jl query would look like:

@from i in df begin
@select @(i.a, i.b)
end

Both seem ok to me. This might also nest nicely with #8470: {Int, String} would be used for constructing tuple types, and @{name::String, age::Float64} would construct a named tuple type. At that point @(a::Int=5, b::String="test") would essentially be a shortcut for @{a::Int, b::String}(5, "test").

Optically this also is more similar to normal constructor syntax, i.e. essentially the special syntax would be kind of the name of NamedTuple.

I think this would also work with the various splatting ideas, right? And one could have nice tuple and named tuple deconstruction:

@(a, b, c) = foo() # Deconstructs into a named tuple with names a, b, c

Now, if there was no opportunity cost in using {} for named tuples, I would still prefer that over @(), but clearly there is. I think I would actually prefer if the decision about the use of {} was just deferred until we have a) a better sense of all the alternative potential use cases and b) more real world experience whether e.g. something like @() is acceptable in things like queries.

So, what about the following strategy: for now use some syntax for named tuples that is not {}. Maybe @() or something else. Leave {} unused for julia 1.0, but have an issue where we collect all the potential use cases for {}. I'll use the julia base syntax for named tuples in Query.jl starting in julia 1.0, and then we get some feedback on all of this during the julia 1.0 use cycle, and maybe make a decision about {} for julia 2.0?

@JeffBezanson

This comment has been minimized.

Member

JeffBezanson commented Jun 4, 2017

I can totally understand the aversion to (; , but we already have f(; ...) for passing keyword arguments, so the main question is why (; ...) wouldn't be a way to construct named tuples --- should it be a syntax error? @( ) is available (as is &( )) but just seems a bit random to me. The @ is also not needed to use { } for tuple and named tuple types.

@quinnj

This comment has been minimized.

Member

quinnj commented Jun 5, 2017

I really like the idea of having (; ...) for named tuples and the connection between f(...)/(...) and f(; ...)/(; ...).

@andyferris

This comment has been minimized.

Contributor

andyferris commented Jun 5, 2017

I agree.

I'm wondering if the "full" analogy between function signatures and tuples might allow for mixtures of positional and keyword arguments like (1, true; a = 2.0, b = [1,2,3]) and possibly even just be supported by the built-in Tuple (e.g. something like Tuple{Int, Bool; a=Float64, b=Vector{Int}}, but I'm not sure if that would be a bit impractical...

namedtuple_fieldtype_a(x) = fieldtype(typeof(x), :a)
@test Base.return_types(namedtuple_fieldtype_a, (NamedTuple,)) == Any[Type]
@test Base.return_types(namedtuple_fieldtype_a, (typeof((b=1,a="")),)) == Any[Type{String}]

This comment has been minimized.

@vtjnash

vtjnash Oct 26, 2017

Member

tests the secondary code path (test(x, y) = fieldtype(typeof(x), y))?

@test NamedTuple{(:a,:b),Tuple{Int8,Int16}}((1,2)) === (a=Int8(1), b=Int16(2))
@test convert(NamedTuple{(:a,:b),Tuple{Int8,Int16}}, (a=3,b=4)) === (a=Int8(3), b=Int16(4))
@test_throws MethodError convert(NamedTuple{(:a,:b),Tuple{Int8,Int16}}, (x=3,y=4)) === (a=Int8(3), b=Int16(4))

This comment has been minimized.

@vtjnash

vtjnash Oct 26, 2017

Member

should test for which method throws the error, and move most of the computations (the ones that should succeed – e.g. the constructors) into a let block / outside of the try/catch block

@test_throws MethodError convert(NamedTuple{(:a,:b),Tuple{Int8,Int16}}, (x=3,y=4)) === (a=Int8(3), b=Int16(4))
@test NamedTuple{(:a,:c)}((b=1,z=2,c=3,aa=4,a=5)) === (a=5, c=3)
@test NamedTuple{(:a,)}(NamedTuple{(:b,:a),Tuple{Int,Union{Int,Void}}}((1,2))) === NamedTuple{(:a,),Tuple{Union{Int,Void}}}((2,))

This comment has been minimized.

@vtjnash

vtjnash Oct 26, 2017

Member

I'm pretty sure more spaces would never be amiss in improving readability

@vtjnash

add a test for (a = 1, b = 2, a = 3) (possibly I just missed seeing this) and equivalently for NamedTuple{(:a, :b, :a), NTuple{3, Int}}((1, 2, 3))

@JeffBezanson JeffBezanson force-pushed the jb/namedtuples branch 3 times, most recently from d08bf3b to 6a08d12 Oct 26, 2017

names = Symbol[an...]
for n in bn
if !sym_in(n, an)
push!(names, n)

This comment has been minimized.

@vtjnash

vtjnash Oct 30, 2017

Member

This is not preserving of the order of the values. Need to switch an and bn in the implementation:

names = Symbol[]
for n in an
    if !sym_in(n, bn)
        push!(names, n)
    end
end
append!(names, bn)
return names

This comment has been minimized.

@JeffBezanson

JeffBezanson Oct 30, 2017

Member

I think either behavior could be useful but I went with how merge works for ordered dicts and keyword arguments.

for (k,v) in itr
oldind = get(inds, k, 0)
if oldind > 0
vals[oldind] = v

This comment has been minimized.

@vtjnash

vtjnash Oct 30, 2017

Member

same here

@quinnj

This comment has been minimized.

Member

quinnj commented Nov 1, 2017

bump :)

@StefanKarpinski

This comment has been minimized.

Member

StefanKarpinski commented Nov 1, 2017

It seems like we've reached the point where it would be best if this was merged and then iterated upon after merging.

implement named tuples
Based on #16580, also much work done by quinnj.

`(a=1, ...)` syntax is implemented, and `(; ...)` syntax is
implemented but not yet enabled.

@JeffBezanson JeffBezanson force-pushed the jb/namedtuples branch from 6a08d12 to ebfb307 Nov 1, 2017

@quinnj

This comment has been minimized.

Member

quinnj commented Nov 1, 2017

CI failures look unrelated

@StefanKarpinski

This comment has been minimized.

Member

StefanKarpinski commented Nov 1, 2017

Now we just have to wait a few days for Travis' macOS queue to get to this PR.

@JeffBezanson JeffBezanson merged commit 37d47ba into master Nov 2, 2017

4 of 5 checks passed

ci/circleci: build-x86_64 Your tests failed on CircleCI
Details
ci/circleci: build-i686 Your tests passed on CircleCI!
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
julia freebsd ci Build done
Details

@JeffBezanson JeffBezanson deleted the jb/namedtuples branch Nov 2, 2017

@andyferris

This comment has been minimized.

Contributor

andyferris commented Nov 2, 2017

Thanks Jeff and Jacob and everyone else involved in getting this done, really nice feature :)

@stevengj

This comment has been minimized.

Member

stevengj commented Nov 2, 2017

Does this make dot overloading easier to implement, or is it orthogonal? Or will named tuples have to be re-implemented once dot overloading lands?

@JeffBezanson

This comment has been minimized.

Member

JeffBezanson commented Nov 2, 2017

It's orthogonal. Once we have dot overloading we'll probably re-implement named tuples to use it, but it's not necessary.

@nalimilan

This comment has been minimized.

Contributor

nalimilan commented Nov 4, 2017

Is there a plan for Julia 0.6 support, e.g. via Compat? NamedTuples.jl could be used on 0.6, but currently it has different semantics for merge (it keeps the first value rather than the last in case of conflicts). I guess it could be adapted and included in Compat with different semantics.

@timholy

This comment has been minimized.

Member

timholy commented Nov 4, 2017

I have a sense that there's enough to do for 1.0 already that backporting shouldn't be a priority. Certainly not this implementation, since it touches a fair bit of the *.scm and *.c code in src/. I'd base any 0.6 efforts on pure-julia implementations like NamedTuples.jl.

Or better yet, use this as an excuse for trying to switch to 0.7 for real work (including package development). I know it's not easy, as I haven't succeeded in pulling this off myself. (On the majority of days over the last two months I've been trying to put in ~30 minutes on Images and its dependencies, but things keep breaking faster than I can keep up.) But it seems important since this is going to be what we use, and there are so many goodies that the potential benefits are huge.

@tknopp

This comment has been minimized.

Contributor

tknopp commented Nov 4, 2017

Thanks for this! @JeffBezanson will keyword arguments be based on this? (in order to make them fast)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment