Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

materializer for matrix tables #281

Open
ablaom opened this issue May 3, 2022 · 6 comments
Open

materializer for matrix tables #281

ablaom opened this issue May 3, 2022 · 6 comments

Comments

@ablaom
Copy link
Contributor

ablaom commented May 3, 2022

This is not currently overloaded, ie, falls back to columntable.

In MLJ (and elsewhere) core models operate on matrices. An MLJ interface for a model accepts a table X, converts it to a matrix (using Tables.matrix) and reports any output as a table using materializer(X)(Xout). This is currently leading to unnecessary copying and computation in the last step.

Is there an overloading of materializer for matrix tables that Tables.jl would be willing to implement to address this use case?

@ablaom ablaom changed the title materializer for matrix tables materializer for matrix tables May 3, 2022
@nalimilan
Copy link
Member

You mean something like Tables.materializer(X::AbstractMatrix) = Tables.table?

I must be missing something, as Tables.columntable doesn't support matrices:

julia> X = [1 2; 3 4]
2×2 Matrix{Int64}:
 1  2
 3  4

julia> Tables.materializer(X)
columntable (generic function with 5 methods)

julia> Tables.materializer(X)(X)
ERROR: ArgumentError: a 'Matrix{Int64}' is not a table; see `?Tables.table` for ways to treat an AbstractVecOrMat as a table
Stacktrace:
 [1] columns(m::Matrix{Int64})
   @ Tables ~/.julia/packages/Tables/M26tI/src/matrix.jl:5
 [2] columntable(itr::Matrix{Int64})
   @ Tables ~/.julia/packages/Tables/M26tI/src/namedtuples.jl:171
 [3] top-level scope
   @ REPL[10]:1

@ablaom
Copy link
Contributor Author

ablaom commented May 15, 2022

You mean something like Tables.materializer(X::AbstractMatrix) = Tables.table?

No. Currently, we have:

using Tables
matrix_table = Tables.table(rand(2,3))
some_table = [(x=1, y=2), (x=3, y=4)]
julia> Tables.materializer(matrix_table)(some_table)
(x = [1, 3], y = [2, 4])

I'm getting a column table for the last call because that's the fallback behaviour for materializer. But I want Tables.materializer(matrix_table)(some_table) to be a matrix table.

@nalimilan
Copy link
Member

OK, makes sense. So that would mean Tables.materializer(::Type{<:Tables.MatrixTable}) = Tables.table ∘ Tables.matrix?

@ablaom
Copy link
Contributor Author

ablaom commented May 16, 2022

Syntactically, that would work for my use-case. However, I'm not sure that would be ideal as an implementation. I was hoping that Tables.materializer(matrix_table), when applied to Tables.table(some_matrix), would compile to a no-op. But that doesn't appear to be the case, although it does seem that some_matrix is not copied.

@nalimilan
Copy link
Member

Yeah we could define a separate function with a fallback method calling Tables.table ∘ Tables.matrix and add a no-op method for MatrixTable inputs.

@ablaom
Copy link
Contributor Author

ablaom commented May 17, 2022

Perfect.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants