improve push! and categorical! test coverage #1375
src/dataframe/dataframe.jl
Outdated
@@ -33,7 +33,7 @@ DataFrame(ds::Vector{AbstractDict})
* `column_eltypes` : elemental type of each column
* `categorical` : `Vector{Bool}` indicating which columns should be converted to
  `CategoricalVector`
* `ds` : a vector of Associatives
* `ds` : `AbstractDict`
Specify "of columns" or something like that?
fixed
@@ -952,23 +952,6 @@ Base.convert(::Type{DataFrame}, d::AbstractDict) = DataFrame(d)
##
##############################################################################

function Base.push!(df::DataFrame, associative::AbstractDict{Symbol,Any})
AFAICT this method was provided separately because it only works with `Symbol` keys, while the one below only works with `AbstractString` keys. The latter is unusual since I think we generally require symbols, so it could be deprecated (and its `get` call is really weird).
OK. But, as noted above, this `get` has the same performance as `getindex`, and `push!` for `Symbol` keys was implemented in a bad way, as it required `Any` values. I will revert and fix it and deprecate the other one.
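The `Any` requirement mentioned here comes from Julia's invariant type parameters. A minimal sketch in plain Julia, assuming nothing from DataFrames itself (`strict_push` and `loose_push` are hypothetical stand-ins for the two signatures, not functions from the PR):

```julia
# Hypothetical stand-ins for the two kinds of signature under discussion.
strict_push(d::AbstractDict{Symbol,Any}) = "matched"
loose_push(d::AbstractDict{Symbol}) = "matched"   # value type left free

d = Dict(:a => 1, :b => 2)   # inferred as Dict{Symbol,Int}

# Type parameters are invariant: Dict{Symbol,Int} is not an AbstractDict{Symbol,Any}.
@assert !(d isa AbstractDict{Symbol,Any})
@assert !applicable(strict_push, d)

# Leaving the value parameter free accepts every value type.
@assert d isa AbstractDict{Symbol}
@assert loose_push(d) == "matched"

# Only a dictionary explicitly typed with Any values reaches the strict method.
@assert strict_push(Dict{Symbol,Any}(:a => 1)) == "matched"
```

So a signature with `AbstractDict{Symbol,Any}` would silently skip the most common case, a dictionary whose value type was inferred.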
I am removing the tests for the deprecated version of the method.
Actually, I had to use another approach for the deprecation, as we can have a `Dict{Any}` or `Dict{Union{Symbol, Missing}}` that actually contains only `Symbol` keys.
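To illustrate why dispatching on the declared key type is not enough here, a small sketch in plain Julia (`only_symbol_keys` is a hypothetical helper introduced for illustration, not part of the PR):

```julia
d = Dict{Any,Any}(:a => 1, :b => 2)

# The declared key type is Any, so a method restricted to Symbol keys is not selected...
@assert keytype(d) == Any
@assert !(d isa AbstractDict{Symbol})

# ...even though every key actually stored is a Symbol.
only_symbol_keys(d::AbstractDict) = all(k -> k isa Symbol, keys(d))   # hypothetical helper
@assert only_symbol_keys(d)

# The same holds for a key type that merely allows Symbol.
m = Dict{Union{Symbol,Missing},Int}(:a => 1)
@assert !(m isa AbstractDict{Symbol})
@assert only_symbol_keys(m)
```

A runtime check on the keys, or a lookup that tries the `Symbol` key first, handles these dictionaries where a type-based method split would not.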
test/dataframe.jl
Outdated
@@ -405,6 +427,18 @@ module TestDataFrame
@test findfirst(c -> typeof(c) <: CategoricalVector{Union{Int, Missing}},
                categorical!(deepcopy(df), 1).columns) == 1

@testset "categorical!" begin
    df = DataFrame([["a", "b"], ['a', 'b'], [true, false], 1:2, ["x", "y"]])
    @test eltypes(categorical!(df)) == [CategoricalArrays.CategoricalString{UInt32},
Better not hardcode `UInt32` as the default could change in the future. I think in other tests we used another approach based on `isa.(eltypes(...), [CategoricalValue{Char}, ...])`.
fixed
Actually, this pattern did not go through, so I used the `all(map(<:, eltypes(...), [...]))` approach.
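One plausible reason the `isa.` pattern fails is that `isa` tests whether a value is an instance of a type, while `eltypes` returns types, which need a subtype test. A sketch in plain Julia, with ordinary Base types standing in for the categorical ones (an assumption made to keep the example self-contained):

```julia
actual   = [Int, String, Vector{Int}]
expected = [Integer, AbstractString, Vector{<:Integer}]

# isa(Int, Integer) asks "is the type Int an instance of Integer?", which is false.
@assert !any(isa.(actual, expected))

# <: asks "is Int a subtype of Integer?", which is the intended check.
@assert all(map(<:, actual, expected))

# Leaving a parameter open (Vector{<:Integer}) avoids hardcoding a concrete
# parameter, in the same spirit as not hardcoding UInt32 in the test.
@assert Vector{Int} <: Vector{<:Integer}
@assert !(Vector{Int} <: Vector{Integer})
```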
src/dataframe/dataframe.jl
Outdated
function Base.push!(df::DataFrame, associative::AbstractDict)
    i = 1
    for nm in _names(df)
        try
            val = get(() -> associative[string(nm)], associative, nm)
            val = get(() -> (Base.depwarn("push!(::DataFrame, ::AbstractDict) with"*
Should add a space at the end of the string, otherwise with the concatenation you'll get `withAbstractDict`.
fixed
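The concatenation issue is easy to reproduce in isolation. In this sketch the second fragment is a hypothetical placeholder, since the rest of the real message is not shown in the hunk:

```julia
first_part  = "push!(::DataFrame, ::AbstractDict) with"
second_part = "AbstractDict"   # hypothetical start of the continuation line

# `*` joins strings verbatim, so the words run together.
@assert endswith(first_part * second_part, "withAbstractDict")

# A trailing space on the first fragment fixes it.
@assert endswith(first_part * " " * second_part, "with AbstractDict")
```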
116692f to 1939677
src/dataframe/dataframe.jl
Outdated
function Base.push!(df::DataFrame, associative::AbstractDict)
    i = 1
    for nm in _names(df)
        try
            val = get(() -> associative[string(nm)], associative, nm)
            val = get(() -> (Base.depwarn("push!(::DataFrame, ::AbstractDict) with "*
Won't this warning be printed when `associative` doesn't contain a key for one of `df`'s columns? Then it will be confusing to print this before an error.
BTW, you can use the `do` syntax. Also `associative` could be renamed to `d` or `dict`.
good catch. will fix
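A rough sketch of how both suggestions could look together, using the `do` form of `get` and checking for the fallback key before warning. The helper name `lookup` and the warning text are hypothetical, and this is not the code that was eventually merged:

```julia
# Hypothetical helper, not the DataFrames.jl implementation.
function lookup(d::AbstractDict, nm::Symbol)
    get(d, nm) do
        # This block runs only when the Symbol key is absent.
        key = string(nm)
        # Check first, so a missing column raises KeyError without a misleading warning.
        haskey(d, key) || throw(KeyError(nm))
        Base.depwarn("string keys are deprecated, use Symbol keys instead", :lookup)
        d[key]
    end
end

@assert lookup(Dict(:a => 1), :a) == 1     # Symbol key found directly
@assert lookup(Dict("a" => 2), :a) == 2    # falls back to the string key

# A key missing in both forms produces only the error.
err = try
    lookup(Dict(:a => 1), :b)
catch e
    e
end
@assert err isa KeyError
```

The `do` block is just another way of passing the zero-argument fallback function as the first argument of `get`, which reads more easily than a nested anonymous function.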
Part three of implementation of #1372.
Changes:
* `push!`
* `categorical!`
* `push!` method that was not needed (duplicate in special case with no extra functionality and the same performance)
* `Associative` references
* `DataFrame(ds::AbstractDict)` constructor (it does not accept a vector but a dictionary)