In the Discord usage-question channel, replicating the following pandas code in Polars seemed over-complicated and did not scale well in terms of performance. Can this be considered as a feature that Polars should have?
Just to be clear, the performance remark was about a "horizontal fill" with cum_reduce (and not the approach shown above)
import polars as pl

# This does not "scale"
backward_fill_horizontal = lambda cols: (
    pl.cum_reduce(function=lambda l, r: pl.coalesce(l, r), exprs=map(str, reversed(cols)))
    .struct.field("*")
)
I believe another way to do this in pandas could be to use Categoricals, i.e. without a horizontal bfill() / reindex:
import pandas as pd

df = pd.DataFrame({
    "instu_id": [9000098, 9000099],
    "COLL_END_DATE": ["2023-10-31", "2023-10-31"],
    "Period": [1, 3],
    "outstanding": [1.188779e+08, 1231231]
})
# Declare every expected period so that groupby emits a row for each one.
df["Period"] = pd.Categorical(df["Period"], categories=[0, 1, 2, 3], ordered=True)
(df.groupby(["instu_id", "COLL_END_DATE", "Period"], dropna=False, observed=False)
   .agg(["any", "sum"])
)
#                                outstanding
#                                        any          sum
# instu_id COLL_END_DATE Period
# 9000098  2023-10-31    0               NaN          0.0
#                        1              True  118877900.0
#                        2               NaN          0.0
#                        3               NaN          0.0
# 9000099  2023-10-31    0               NaN          0.0
#                        1               NaN          0.0
#                        2               NaN          0.0
#                        3              True    1231231.0
Description
Polars:
Or is there any simpler way?