Avoid Union{Missing,T} in columns after DropMissing #35

Merged
merged 10 commits into from
Apr 9, 2022

@eliascarv eliascarv commented Apr 4, 2022

This PR adds a post-processing step to the apply function of the DropMissing transformation and a pre-processing step to the revert function.
This is necessary so that the columns to which DropMissing is applied end up with element type T instead of Union{Missing,T}.
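
To illustrate the motivation with plain Base Julia (a minimal sketch, not code from this PR; the variable names are made up for the example): filtering out the missing values of a vector keeps the Union{Missing,T} element type, and an explicit collect is needed to narrow it.

```julia
# A column that contains a missing value; its eltype is Union{Missing,Int}
x = [1, missing, 3]

# Dropping the missing entries by logical indexing keeps the wide eltype
y = x[.!ismissing.(x)]

# Narrowing the eltype requires copying into a vector of the non-missing type
z = collect(nonmissingtype(eltype(y)), y)

println(eltype(y))  # still allows Missing
println(eltype(z))  # no longer allows Missing
```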

@eliascarv eliascarv requested a review from juliohm April 4, 2022 17:08

juliohm commented Apr 4, 2022

I think this current implementation has the issue that it forces a Union{Missing,T} in the revert even when the original table doesn't have missing values. As we discussed over Zulip, we just need to save the original column eltype during the apply and reuse it in the revert.
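
A rough sketch of that suggestion, using only Base Julia and a NamedTuple of vectors as a stand-in for a columnar table. The names dropmissing_apply and dropmissing_revert are hypothetical and are not the functions in src/transforms/filter.jl; the sketch only shows the eltype bookkeeping and does not try to recover dropped rows.

```julia
# Hypothetical "apply": drop rows containing missing, narrow the eltypes,
# and return the original eltypes as a cache for the revert step.
function dropmissing_apply(cols::NamedTuple)
  n = length(first(cols))
  keep = [all(!ismissing(cols[nm][i]) for nm in keys(cols)) for i in 1:n]
  types = map(eltype, cols)  # original eltype of each column
  newcols = map(cols) do c
    collect(nonmissingtype(eltype(c)), c[keep])
  end
  newcols, types
end

# Hypothetical "revert": restore the cached eltype of each column, so a
# column that never allowed Missing is not widened to Union{Missing,T}.
function dropmissing_revert(newcols::NamedTuple, types::NamedTuple)
  map((c, T) -> collect(T, c), newcols, types)
end

cols = (a = [1, missing, 3], b = [1.0, 2.0, 3.0])
newcols, types = dropmissing_apply(cols)
orig = dropmissing_revert(newcols, types)
```

Here column a comes back with eltype Union{Missing,Int} while column b stays Float64, which is the behavior the comment asks for.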

@juliohm juliohm left a comment

Need to fix the revert so that it restores the original eltype, and also add tests.

@codecov-commenter

codecov-commenter commented Apr 5, 2022

Codecov Report

Merging #35 (3634072) into master (cb72d48) will increase coverage by 0.06%.
The diff coverage is 96.00%.

@@            Coverage Diff             @@
##           master      #35      +/-   ##
==========================================
+ Coverage   92.51%   92.57%   +0.06%     
==========================================
  Files          16       16              
  Lines         414      431      +17     
==========================================
+ Hits          383      399      +16     
- Misses         31       32       +1     
Impacted Files Coverage Δ
src/transforms/filter.jl 98.00% <96.00%> (-2.00%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update cb72d48...3634072.

@eliascarv eliascarv requested a review from juliohm April 5, 2022 12:37
@eliascarv

Done. I made the requested changes and added tests @juliohm.

eliascarv commented Apr 5, 2022

@juliohm, I moved the post- and pre-processing into apply and revert, which avoids creating the helper functions.
The _nonmissing function has these two extra methods to avoid unnecessary calls to the collect function:

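# eltype does not include Missing: return the column as is, without copying
_nonmissing(::Type{T}, c) where {T} = c
# eltype is Union{Missing,T}: copy into a vector with eltype T
_nonmissing(::Type{Union{Missing,T}}, c) where {T} =
  collect(T, c)

# entry point: fetch the column and dispatch on its eltype
function _nonmissing(columns, col)
  c = Tables.getcolumn(columns, col)
  _nonmissing(eltype(c), c)
end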

I could simplify this function, but then collect would be called on every column of the table:

function _nonmissing(columns, col)
  c = Tables.getcolumn(columns, col)
  collect(nonmissingtype(eltype(c)), c)
end

Which of the two options do you prefer?
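
For comparison, the trade-off between the two options can be checked with a small Base-only sketch. The names narrow_dispatch and narrow_collect are illustrative stand-ins for the two variants above, and plain vectors replace the Tables.getcolumn call.

```julia
# Option 1: dispatch on the eltype so only Union{Missing,T} columns are copied
narrow_dispatch(::Type{T}, c) where {T} = c
narrow_dispatch(::Type{Union{Missing,T}}, c) where {T} = collect(T, c)
narrow_dispatch(c) = narrow_dispatch(eltype(c), c)

# Option 2: always collect, which is simpler but copies every column
narrow_collect(c) = collect(nonmissingtype(eltype(c)), c)

a = Union{Missing,Int}[1, 2, 3]  # eltype allows Missing, no missing values present
b = [1.0, 2.0, 3.0]              # eltype already excludes Missing

same_dispatch = narrow_dispatch(b) === b  # option 1 returns the same array
same_collect = narrow_collect(b) === b    # option 2 allocates a new array
```

Both options narrow a to a Vector{Int}; they differ only in whether columns like b are copied.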

@eliascarv eliascarv requested a review from juliohm April 5, 2022 16:32
@eliascarv eliascarv requested a review from juliohm April 6, 2022 18:28
@eliascarv eliascarv requested a review from juliohm April 7, 2022 12:32
@eliascarv eliascarv requested a review from juliohm April 8, 2022 12:41

@juliohm juliohm left a comment

Just a few minor changes now and I think we are ready

@eliascarv

This PR is ready for a final review @juliohm!

@eliascarv eliascarv requested a review from juliohm April 9, 2022 20:09
@juliohm juliohm merged commit 68e89ba into JuliaML:master Apr 9, 2022
@eliascarv eliascarv deleted the dropmissing-op1 branch April 9, 2022 21:21