Ensembles fail to work with HierarchicalPipeline: they lose the hierarchical structure during forecast and don't use the hierarchical structure in backtest
You can run fit/forecast/backtest without errors and without losing the hierarchical structure.
```python
import pandas as pd

from etna.metrics import SMAPE
from etna.models import SeasonalMovingAverageModel
from etna.pipeline import HierarchicalPipeline
from etna.ensembles import VotingEnsemble
from etna.datasets import TSDataset
from etna.reconciliation import BottomUpReconciliator

# Download data
# curl "https://robjhyndman.com/data/hier1_with_names.csv" --ssl-no-revoke -o "hier1_with_names.csv"
df = pd.read_csv("hier1_with_names.csv")

# Prepare Dataframe
periods = len(df)
city_segments = list(filter(lambda name: name.count("-") == 2, df.columns))
df["timestamp"] = pd.date_range("2006-01-01", periods=periods, freq="MS")
df.set_index("timestamp", inplace=True)

hierarchical_df = []
for segment_name in city_segments:
    segment = df[segment_name]
    region, reason, city = segment_name.split(" - ")
    seg_df = pd.DataFrame(
        data={
            "timestamp": segment.index,
            "target": segment.values,
            "city_level": [city] * periods,
            "region_level": [region] * periods,
            "reason_level": [reason] * periods,
        },
    )
    hierarchical_df.append(seg_df)
hierarchical_df = pd.concat(hierarchical_df, axis=0)

# Create Dataset
hierarchical_df, hierarchical_structure = TSDataset.to_hierarchical_dataset(
    df=hierarchical_df, level_columns=["reason_level", "region_level", "city_level"]
)
hierarchical_ts = TSDataset(df=hierarchical_df, freq="MS", hierarchical_structure=hierarchical_structure)

# Create pipeline
pipeline_1 = HierarchicalPipeline(
    model=SeasonalMovingAverageModel(window=1, seasonality=1),
    reconciliator=BottomUpReconciliator(target_level="region_level", source_level="city_level"),
)
pipeline_2 = HierarchicalPipeline(
    model=SeasonalMovingAverageModel(window=1, seasonality=2),
    reconciliator=BottomUpReconciliator(target_level="region_level", source_level="city_level"),
)
pipeline_vote = VotingEnsemble(pipelines=[pipeline_1, pipeline_2])

bottom_up_metrics, _, _ = pipeline_vote.backtest(
    ts=hierarchical_ts, metrics=[SMAPE()], n_folds=3, aggregate_metrics=True
)
```
ValueError: There are segments in y_pred that are not in y_true, for example: vfr_NT, vfr_QLD, hol_WA, hol_QLD, vfr_VIC
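One possible reading of this error, sketched below, is that the forecasts come back at the reconciled `region_level` while the backtest compares them against ground truth that is still at `city_level`. This is only an illustration of that kind of segment check, not etna's actual code: the helper `find_unexpected_segments` and the city-level segment names are hypothetical, and only `hol_WA` and `vfr_NT` are taken from the error message.

```python
def find_unexpected_segments(y_true_segments, y_pred_segments):
    """Return the segments present in y_pred but absent from y_true, sorted."""
    return sorted(set(y_pred_segments) - set(y_true_segments))


# Hypothetical city-level names standing in for the un-reconciled ground truth.
y_true_segments = ["hol_WA_city_a", "vfr_NT_city_a"]
# Region-level names as they appear in the reported error message.
y_pred_segments = ["hol_WA", "vfr_NT"]

unexpected = find_unexpected_segments(y_true_segments, y_pred_segments)
if unexpected:
    # Mirrors the shape of the reported message; the real check may differ.
    print(f"Segments in y_pred that are not in y_true: {', '.join(unexpected)}")
```

If this reading is right, the mismatch would go away once both sides of the comparison are at the same hierarchy level.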
No response
alex-hse-repository
🐛 Bug Report
Ensembles fail to work with `HierarchicalPipeline`: they lose the hierarchical structure during `forecast` and don't use the hierarchical structure in `backtest`.
Expected behavior
You can run fit/forecast/backtest without errors and without losing the hierarchical structure.
How To Reproduce
Code
Error
Environment
No response
Additional context
Checklist