type promotion of missing inside tuples #31077

Open
clarkevans opened this issue Feb 15, 2019 · 6 comments

Comments

@clarkevans
Member

clarkevans commented Feb 15, 2019

This is a low-priority feature request. Within a Vector{Tuple} or a Vector{NamedTuple}, Julia does not infer a Union{Missing, T} data type from the presence of missing values, as it does within a plain Vector. A more specific type inference would be more helpful.

julia> typeof([("A", 3), ("B", missing)])
Array{Tuple{String,Any},1}

julia> typeof([(k="A", v=3), (k="B", v=missing)])
Array{NamedTuple{(:k, :v),T} where T<:Tuple,1}

Within a plain vector or tuple, Union{Missing, T} is inferred.

julia> typeof([3, missing])
Array{Union{Missing, Int64},1}

Without missing, inference within a vector of tuples is specific:

julia> typeof([("A", 3), ("B", 4)])
Array{Tuple{String,Int64},1}

julia> typeof([(k="A", v=3), (k="B", v=4)])
Array{NamedTuple{(:k, :v),Tuple{String,Int64}},1}

Hence, combining these two precedents, I would prefer to see the following type inference:

# a future version of Julia
julia> typeof([("A", 3), ("B", missing)])
Array{Tuple{String,Union{Missing, Int64}},1}

julia> typeof([(k="A", v=3), (k="B", v=missing)])
Array{NamedTuple{(:k, :v),Tuple{String,Union{Missing, Int64}}},1}
@simonbyrne
Contributor

Duplicate of #25925?

@martinholters
Member

This has nothing to do with inference; it's a matter of promotion:

julia> Base.promote_typeof(("A", 3), ("B", missing))
Tuple{String,Any}

julia> Base.promote_typeof(3, missing)
Union{Missing, Int64}

We don't have a specific promote_rule for the former, so we fall back on typejoin.
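
The difference between the two paths can be seen directly on the types. This is a small sketch rather than a description of how array literals are implemented internally; it only uses the exported `promote_type` and `typejoin`, and the variable names are chosen for illustration.

```julia
# Scalars: a promote_rule exists for Missing, so promotion yields a small Union.
scalar_promoted = promote_type(Int, Missing)

# The fallback mentioned above: typejoin returns the closest common supertype,
# which for Int and Missing is Any.
scalar_joined = typejoin(Int, Missing)

# Tuples: typejoin works field by field, so the second field widens to Any.
tuple_joined = typejoin(Tuple{String,Int}, Tuple{String,Missing})

@show scalar_promoted scalar_joined tuple_joined
```

On the Julia versions discussed in this thread, the last result matches the `Tuple{String,Any}` shown above.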

@nalimilan
Member

AFAICT that's #25924. TBH I'm still not sure why it was rejected...

@clarkevans
Member Author

clarkevans commented Feb 15, 2019

@JeffBezanson wrote on #25924 (a specific implementation of #25925?):

The union type issue is better solved upstream, by making named tuples with the element types you want in the first place

I think missing feels like a special case, even though the general rule makes sense. Our use case is querying native Julia data structures, specifically vectors of tuples containing missing values. While possible, it's inconvenient to ask users to provide specific type information on deeply nested structures. It's not intuitive at the user level that mixing in missing values changes specific data types to Any when the Julia community specifically recommends the Union{Missing, T} pattern for this general problem.
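
For illustration, a user-level workaround might look like the sketch below. `narrow_tuples` is a hypothetical helper, not part of Base or any package; it assumes a non-empty vector of equal-length tuples and relies on the unexported `Base.promote_typejoin`, an internal function whose behavior could change between Julia versions.

```julia
# Hypothetical helper (not in Base): recompute the element type of a vector of
# tuples field by field, so a field mixing T and missing becomes Union{Missing,T}.
function narrow_tuples(rows::AbstractVector{<:Tuple})
    isempty(rows) && return rows
    n = length(first(rows))
    all(r -> length(r) == n, rows) ||
        throw(ArgumentError("all tuples must have the same length"))
    # Base.promote_typejoin is internal: it behaves like typejoin but keeps
    # small unions with Missing/Nothing instead of widening to Any.
    fields = ntuple(i -> mapreduce(r -> typeof(r[i]), Base.promote_typejoin, rows), n)
    T = Tuple{fields...}
    return Vector{T}(rows)   # copies into a vector with the narrower eltype
end

rows = [("A", 3), ("B", missing)]
narrowed = narrow_tuples(rows)
@show eltype(narrowed)
```

This only moves the type computation into a helper; it does not change what the array literal itself produces, and it makes a copy of the data.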

@nalimilan
Member

As noted by @martinholters, #25925 is different since it's only about inference. Here you're talking about the actual type of the result (for which inference isn't involved at all, contrary to what you wrote in the description).

@clarkevans
Member Author

clarkevans commented Feb 15, 2019

So, this is a duplicate (or perhaps a special case) of #25924, which was rejected. I'm coming at this from a user perspective, and I see two conflicting desires: (a) recommending that missing be used for empty cells, and (b) wanting to make it convenient to create native Julia data structures and query them. Here is an exact data structure, for example. The type information seems... an unnecessary "implementation" detail.

Emp = NamedTuple{(:name,:position,:salary,:rate),
                  Tuple{String,String,Union{Int,Missing},
                        Union{Float64,Missing}}}
Dep = NamedTuple{(:name, :employee), Tuple{String,Vector{Emp}}}

my_data = Dep[
   (name = "POLICE", employee = Emp[
     (name = "JEFFERY A", position = "SERGEANT",
      salary = 101442, rate = missing),
     (name = "NANCY A", position = "POLICE OFFICER",
      salary = 80016, rate = missing)]),
   (name = "FIRE", employee = Emp[
     (name = "JAMES A", position = "FIRE ENGINEER-EMT",
      salary = 103350, rate = missing),
     (name = "DANIEL A", position = "FIRE FIGHTER-EMT",
      salary = 95484, rate = missing)]),
   (name = "OEMC", employee = Emp[
     (name = "LAKENYA A", position = "CROSSING GUARD",
      salary = missing, rate = 17.68),
     (name = "DORIS A", position = "CROSSING GUARD",
      salary = missing, rate = 19.38)])]

# ... generic querying code ...

@JeffBezanson changed the title from "type inference of vector-of-tuples having missing cells" to "type promotion of missing inside tuples" on Mar 7, 2019