I really like the tidyr::fill() function, which fills in missing values with the previous or next value. I need this frequently when I build balanced data sets. I found a workaround that implements the logic for single variables using Impute.jl:
using DataFrames, Impute, Tidier

df = DataFrame(dt1=[0.2, missing, missing, 1, missing, 5, 6], dt2=[0.3, missing, missing, 3, missing, 5, 6])

@chain df begin
    @mutate(dt1 = ~Impute.locf(dt1))
end

@chain df begin
    @mutate(dt1 = ~Impute.nocb(dt1))
end
I suppose it might be rather low-hanging fruit to build a @fill function for Tidier.jl.
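For reference, the two strategies used above, last observation carried forward (LOCF) and next observation carried backward (NOCB), can be sketched in plain Julia without any packages. This is only an illustration of the intended semantics: the names `carry`, `locf_sketch`, and `nocb_sketch` are hypothetical and are not part of Impute.jl or Tidier.jl.

```julia
# Hypothetical helper names; not part of Impute.jl or Tidier.jl.

# Keep the previous value whenever the current one is missing.
carry(prev, x) = ismissing(x) ? prev : x

# LOCF: scan left to right, carrying the last non-missing value forward.
locf_sketch(v::AbstractVector) = accumulate(carry, v)

# NOCB: the same scan applied to the reversed vector, then reversed back.
nocb_sketch(v::AbstractVector) = reverse(accumulate(carry, reverse(v)))

v = [0.2, missing, missing, 1.0, missing, 5.0, 6.0]

@assert isequal(locf_sketch(v), [0.2, 0.2, 0.2, 1.0, 1.0, 5.0, 6.0])
@assert isequal(nocb_sketch(v), [0.2, 1.0, 1.0, 1.0, 5.0, 5.0, 6.0])

# A leading missing has nothing to carry forward, so it stays missing.
@assert isequal(locf_sketch([missing, 2.0, missing]), [missing, 2.0, 2.0])
```

Both sketches return a new vector and leave the input untouched, which is probably the behaviour one would want from a @fill in a data pipeline.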
Below is the code for a fill function. Two helper functions were added so that Impute.jl does not need to be loaded as a dependency.
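# Last observation carried forward: each missing value is replaced by the
# most recent non-missing value before it. Leading missings stay missing.
function locf(column::AbstractVector)
    filled = copy(column)  # work on a copy so the caller's vector is not mutated
    last_observation = missing
    for i in eachindex(filled)
        if ismissing(filled[i])
            filled[i] = last_observation
        else
            last_observation = filled[i]
        end
    end
    return filled
end

# Next observation carried backward: each missing value is replaced by the
# next non-missing value after it. Trailing missings stay missing.
function nocb(column::AbstractVector)
    filled = copy(column)  # work on a copy so the caller's vector is not mutated
    next_observation = missing
    for i in reverse(eachindex(filled))
        if ismissing(filled[i])
            filled[i] = next_observation
        else
            next_observation = filled[i]
        end
    end
    return filled
end

# Dispatch on the requested method name.
# Note: Base also exports `fill`, so this definition shadows (or conflicts
# with) Base.fill in the current module; a distinct name may be safer.
function fill(column::AbstractVector, method::String)
    if method == "locf"
        return locf(column)
    elseif method == "nocb"
        return nocb(column)
    else
        error("Unsupported fill method. Choose either 'locf' or 'nocb'.")
    end
end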

df = DataFrame(dt1=[missing, 0.2, missing, missing, 1, missing, 5, 6], dt2=[0.3, missing, missing, 3, missing, 5, 6, missing])

# apply the fill function in the DataFrame chain
@chain df begin
    @mutate(dt1 = ~fill(dt1, "locf"))
    @mutate(dt2 = ~fill(dt2, "nocb"))
end