Reinterpreting Vector{Union{T,Missing}} as Vector{T} #27831

@haampie

A general and hopefully elegant way to solve the problem of sorting a vector with missing data (#27781) is to preprocess it by partitioning the vector as

# Vector{Union{T,Missing}} --> [Vector{T}; Vector{Missing}]
#                                        ^-- mid
mid = partition!(v)

# Sort the non-missing part using the standard best sorting algorithm for T's
sort!(reinterpret(T, v), 1, mid, ...)

Partitioning is just an O(n) operation, and in principle equivalent to the first step of QuickSort when the initial pivot value is fixed to be missing.
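
For concreteness, the partitioning step could look roughly like the sketch below. The name `partition_missing!` is hypothetical (the `partition!` above is not an existing Base function), and the final `sort!` on a view still goes through the `Union` element type, so it only illustrates the data movement, not the overhead-free sort this issue is after.

```julia
# Hypothetical helper: move all non-missing entries to the front (keeping
# their relative order), fill the tail with `missing`, and return the index
# of the last non-missing entry.
function partition_missing!(v::AbstractVector)
    mid = firstindex(v) - 1
    for i in eachindex(v)
        x = v[i]
        if x !== missing
            mid += 1
            v[mid] = x   # mid <= i, so no unread entry is overwritten
        end
    end
    for i in mid+1:lastindex(v)
        v[i] = missing
    end
    return mid
end

v = Union{Float64,Missing}[3.0, missing, 1.0, missing, 2.0]
mid = partition_missing!(v)

# Sort only the non-missing prefix. This works today, but every access
# still dispatches on Union{Float64,Missing}.
sort!(view(v, 1:mid))
```

This is a single pass over `v` plus a fill of the tail, so it stays O(n).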

Unfortunately the idea breaks down because

julia> reinterpret(Float64, Vector{Union{Float64,Missing}}(rand(10)))
ERROR: ArgumentError: cannot reinterpret `Union{Missing, Float64}` `Float64`, type `Union{Missing, Float64}` is not a bits type

fails.

Is it technically possible to allow this kind of reinterpret(T, Vector{Union{T,Missing}}) in such a way that getindex and setindex! are without overhead?
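
One way to approximate this without `reinterpret` might be a thin wrapper type, sketched below under assumptions. `NonMissingPrefix` is a hypothetical name, not an existing API, and the type assertion in `getindex` is a runtime check, so whether this counts as "without overhead" would have to be measured.

```julia
# Hypothetical wrapper: presents the first `len` entries of a
# Vector{Union{T,Missing}} as an AbstractVector{T}.
struct NonMissingPrefix{T} <: AbstractVector{T}
    data::Vector{Union{T,Missing}}
    len::Int
end

Base.size(w::NonMissingPrefix) = (w.len,)
Base.IndexStyle(::Type{<:NonMissingPrefix}) = IndexLinear()

# The type assertion throws if a `missing` ended up inside the prefix.
Base.@propagate_inbounds Base.getindex(w::NonMissingPrefix{T}, i::Int) where {T} =
    w.data[i]::T

Base.@propagate_inbounds function Base.setindex!(w::NonMissingPrefix{T}, x, i::Int) where {T}
    w.data[i] = convert(T, x)
    return w
end

# Usage: the vector is assumed to be partitioned already.
v = Union{Float64,Missing}[2.0, 1.0, 3.0, missing, missing]
w = NonMissingPrefix{Float64}(v, 3)
sort!(w)   # dispatches on eltype Float64; writes go through to `v`
```

Because `w` shares storage with `v`, sorting `w` reorders the non-missing prefix of `v` in place and leaves the trailing `missing` entries untouched.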

Labels: arrays, missing data
