Broken aggregate function in main branch #211

candalfigomoro · 2023-06-15T15:44:13Z

What happened + What you expected to happen

If I use hierarchicalforecast v0.3.0 there's a problem with "unbalanced" time series, see: #189

However, if you exclude this specific issue, it works fine.

If I use the latest version from the main branch (maybe after this commit? c107217) I have huge issues.

I have unbalanced time series.

If I set the new "is_balanced" parameter to False (the default behavior), the returned dataframe has many missing dates (ds). For some unique_ids I expect many years of data but I just get 2 rows.

If I set the new "is_balanced" parameter to True, I get a reshaping error.

The same code works with version 0.3.0. I had to revert to the released 0.3.0 version because the version in the main branch is not usable for me.
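A rough way to quantify the "many missing dates" symptom is to count rows per unique_id before and after calling aggregate and compare the two summaries. The sketch below uses only pandas; the helper name `rows_per_series` and the toy data are made up for illustration and are not part of hierarchicalforecast.

```python
import pandas as pd

def rows_per_series(df: pd.DataFrame) -> pd.DataFrame:
    """Summarize each series: first date, last date, number of rows.

    Hypothetical helper; assumes the long format with columns
    unique_id, ds and y used throughout this thread.
    """
    n_dates = df["ds"].nunique()  # number of distinct dates in the whole frame
    summary = df.groupby("unique_id")["ds"].agg(["min", "max", "count"])
    # A series is "full" when it has one row for every date in the frame.
    summary["is_full"] = summary["count"] == n_dates
    return summary

# Toy data: "a" covers three days, "b" only the last one (unbalanced).
Y_df = pd.DataFrame({
    "unique_id": ["a", "a", "a", "b"],
    "ds": pd.to_datetime(["2023-01-01", "2023-01-02", "2023-01-03", "2023-01-03"]),
    "y": [1.0, 2.0, 3.0, 4.0],
})

summary = rows_per_series(Y_df)
print(summary)
```

Running the same helper on the frame returned by aggregate makes it easy to see which unique_ids lost rows.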

Versions / Dependencies

hierarchicalforecast main/master branch
Python 3.10
Linux
pandas 2.0.2

Reproduction script

Y_df, S_df, tags = aggregate(Y_df, spec)

Unfortunately, I can't share my data set.

Issue Severity

None


candalfigomoro · 2023-06-15T16:00:05Z

Just to add some context:

I have time series with the same ending date but different starting dates.

I think it's ok for reconciliation to fill missing dates with zeros.

However, for model training data I don't want to add leading zeros to my time series.
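One possible workaround, sketched here under assumptions, is to let the data be zero-filled for reconciliation and then drop the leading zeros again before training. The function name `drop_leading_zeros` is hypothetical, and the approach assumes that a zero at the start of a series is always padding: a genuine zero observation at the start would be removed too.

```python
import pandas as pd

def drop_leading_zeros(df: pd.DataFrame) -> pd.DataFrame:
    """Remove rows that precede the first non-zero y of each series.

    Hypothetical helper, not part of hierarchicalforecast. Assumes the
    long format with columns unique_id, ds and y.
    """
    df = df.sort_values(["unique_id", "ds"])
    # Running count of non-zero values per series; it stays at 0
    # until the first non-zero observation is reached.
    nonzero_so_far = df["y"].ne(0).astype(int).groupby(df["unique_id"]).cumsum()
    return df[nonzero_so_far.gt(0)].reset_index(drop=True)

# Toy zero-filled data: "b" really starts on 2023-01-03.
dates = pd.to_datetime(["2023-01-01", "2023-01-02", "2023-01-03"])
filled = pd.DataFrame({
    "unique_id": ["a"] * 3 + ["b"] * 3,
    "ds": list(dates) * 2,
    "y": [1.0, 2.0, 3.0, 0.0, 0.0, 4.0],
})

trimmed = drop_leading_zeros(filled)
print(trimmed)
```

Zeros that appear after the first non-zero value are kept, so gaps in the middle of a series stay filled.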

NudnikShpilkis · 2023-06-15T18:46:29Z

Doesn't aggregate have an is_balanced parameter, not an is_unbalanced one? The function assumes the data is not balanced by default.

candalfigomoro · 2023-06-15T20:01:28Z

> Doesn't aggregate have an is_balanced parameter, not an is_unbalanced one? The function assumes the data is not balanced by default.

Yes, I misspelled it in the message above. The parameter is called is_balanced and its default value is False. I edited the message above. The issue is the same.

patleg78 · 2023-08-03T09:49:35Z

I compared the source code of the new version with the old one and found that the order of two steps has changed.

The new version, with sparse (call it snippet 1):

    Y_bottom_df = Y_bottom_df.groupby(['unique_id', 'ds'])['y'].sum().reset_index()
    Y_bottom_df.unique_id = Y_bottom_df.unique_id.astype('category')
    Y_bottom_df.unique_id = Y_bottom_df.unique_id.cat.set_categories(S_df.columns)

Here we aggregate first and only afterwards convert unique_id to a categorical whose categories are the columns of S_df.

The old version (call it snippet 2):

    Y_bottom_df.unique_id = Y_bottom_df.unique_id.astype('category')
    Y_bottom_df.unique_id = Y_bottom_df.unique_id.cat.set_categories(S_df.columns)
    Y_bottom_df = Y_bottom_df.groupby(['unique_id', 'ds'])['y'].sum().reset_index()

Here unique_id is made categorical first, so the groupby covers every unique_id, which gives us the complete time series for all the unique_ids.

This small change resolved the issue for me: in the new version, just replace snippet 1 with snippet 2.
Thanks
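The effect of the ordering can be reproduced with pandas alone. The sketch below uses toy data, with a plain list `bottom_ids` standing in for S_df.columns, and the two function names are made up for illustration. It passes observed=False explicitly, because newer pandas versions change the default for categorical groupers; the snippets quoted above rely on the older default.

```python
import pandas as pd

# Two bottom-level series with the same end date but different start dates.
Y_bottom_df = pd.DataFrame({
    "unique_id": ["a", "a", "a", "b"],
    "ds": pd.to_datetime(["2023-01-01", "2023-01-02", "2023-01-03", "2023-01-03"]),
    "y": [1.0, 2.0, 3.0, 4.0],
})
bottom_ids = ["a", "b"]  # assumption: stand-in for S_df.columns

def group_then_categorize(df: pd.DataFrame) -> pd.DataFrame:
    # Order of snippet 1: only (unique_id, ds) pairs present in the data survive.
    out = df.groupby(["unique_id", "ds"])["y"].sum().reset_index()
    out["unique_id"] = out["unique_id"].astype("category").cat.set_categories(bottom_ids)
    return out

def categorize_then_group(df: pd.DataFrame) -> pd.DataFrame:
    # Order of snippet 2: with a categorical key and observed=False, the result
    # covers every category x every date, and empty groups sum to 0.
    out = df.copy()
    out["unique_id"] = out["unique_id"].astype("category").cat.set_categories(bottom_ids)
    return out.groupby(["unique_id", "ds"], observed=False)["y"].sum().reset_index()

sparse_result = group_then_categorize(Y_bottom_df)
dense_result = categorize_then_group(Y_bottom_df)
print(len(sparse_result), len(dense_result))
```

In this toy case the first ordering keeps the 4 original rows, while the second returns 6 rows (2 ids x 3 dates) with zeros for the dates where "b" has no data. That matches the missing-dates symptom reported above, and it also means the second ordering introduces the leading zeros mentioned earlier in the thread.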

candalfigomoro added the bug label Jun 15, 2023

candalfigomoro mentioned this issue Jul 25, 2023

[utils] aggregate doesn't work correctly with most recent version of the code #224

Closed

mcsqr mentioned this issue Aug 23, 2023

Fixes for large datasets #229

Merged

jmoralez mentioned this issue Sep 14, 2023

fix aggregate function #232

Merged

jmoralez closed this as completed in #232 Oct 3, 2023
