Order of estimators for a multivariate target series with multi_models=True #2217

Closed
dwolffram opened this issue Feb 7, 2024 · 3 comments · Fixed by #2246
Labels
question Further information is requested

Comments

@dwolffram

Hi there,

I was wondering about the order in which estimators are stored, specifically in an XGBoost model. Say I train a model called xgb on a multivariate target series with two components, A and B, using output_chunk_length=2 and multi_models=True.
Are the estimators in xgb.model.estimators_ stored as [A1, A2, B1, B2] or as [A1, B1, A2, B2] (where, e.g., B2 denotes the estimator for component B at horizon 2)?

Alternatively, is there maybe something similar to xgb.lagged_feature_names to retrieve the "target names" of all estimators?

Thank you!

@madtoinou
Collaborator

Hi @dwolffram,

Based on your example, the order is [A1, B1, A2, B2]. You can access an individual estimator with the helper method model.get_multioutput_estimator(horizon, target_dim). For example, to access the estimator for A1, you would pass horizon=0, target_dim=0.
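To make the stated ordering concrete, here is a minimal sketch in plain Python that lists the labels in the order described above (all components for the first horizon step, then all components for the next). The function name estimator_labels is hypothetical and not part of Darts; it only assumes the horizon-major order given in this reply.

```python
from typing import List


def estimator_labels(components: List[str], output_chunk_length: int) -> List[str]:
    """Hypothetical helper: label estimators in horizon-major order.

    Assumes the order described in the thread: every component for
    step 1 first, then every component for step 2, and so on.
    """
    return [
        f"{name}{step}"
        for step in range(1, output_chunk_length + 1)  # outer loop: horizon step
        for name in components                         # inner loop: target component
    ]


labels = estimator_labels(["A", "B"], output_chunk_length=2)
print(labels)  # ['A1', 'B1', 'A2', 'B2']
```

Under that assumption, the position of a label in this list is the position of the corresponding estimator in model.estimators_.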

Oh, so you would like an additional method that returns, as a list, the target name and horizon step forecast by each estimator?

@madtoinou added the question label (Further information is requested) on Feb 7, 2024
@dwolffram
Author

dwolffram commented Feb 7, 2024

Thank you, knowing the order already helps me!

But sure, it would be convenient if you could retrieve the target and horizon of an estimator.

For example, I'm currently looking at the feature importance of my estimators, e.g.:

from xgboost import plot_importance

x = xgb.model.estimators_[20]  # one fitted estimator out of the multi-output wrapper
x.get_booster().feature_names = xgb.lagged_feature_names
plot_importance(x, max_num_features=30)

But I'd also like to know which target this estimator refers to, which is not straightforward, but I can now figure it out by knowing the order.
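Given the order from the reply above, a flat index such as 20 can be decoded back into a horizon and a component with integer division. This is only a sketch: locate_estimator is a hypothetical helper, not a Darts method, and the value of three components in the example is made up for illustration.

```python
from typing import Tuple


def locate_estimator(flat_index: int, n_components: int) -> Tuple[int, int]:
    """Hypothetical helper: map a position in estimators_ to (horizon, target_dim).

    Assumes horizon-major storage, i.e. [A1, B1, A2, B2, ...].
    Both returned values are zero-based.
    """
    if n_components <= 0:
        raise ValueError("n_components must be positive")
    if flat_index < 0:
        raise ValueError("flat_index must be non-negative")
    # Quotient = horizon step, remainder = component within that step.
    horizon, target_dim = divmod(flat_index, n_components)
    return horizon, target_dim


# Example with an assumed three-component target series.
print(locate_estimator(20, n_components=3))  # (6, 2)
```

With three components, index 20 would therefore be the third component at the seventh horizon step.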

I just looked at the method you referred to, but I'm not sure it's doing the right thing? It returns self.model.estimators_[horizon + target_dim]. Shouldn't there be some multiplication in there (something like horizon*max_target_dim + target_dim)? Currently, horizon=0 and target_dim=1 returns the same as horizon=1 and target_dim=0, or am I missing something? 🤔
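The collision is easy to check in isolation. The sketch below compares the index expression quoted above with the strided one suggested in this comment, for two targets and two horizon steps. Both function names are hypothetical, and this is not the code that was eventually merged, which the thread does not show.

```python
from itertools import product


def index_as_quoted(horizon: int, target_dim: int) -> int:
    """Index expression quoted in the comment: horizon + target_dim."""
    return horizon + target_dim


def index_as_proposed(horizon: int, target_dim: int, n_targets: int) -> int:
    """Strided index suggested in the comment: horizon * n_targets + target_dim."""
    return horizon * n_targets + target_dim


n_targets, output_chunk_length = 2, 2
# All (horizon, target_dim) pairs, both zero-based.
pairs = list(product(range(output_chunk_length), range(n_targets)))

quoted = [index_as_quoted(h, t) for h, t in pairs]
proposed = [index_as_proposed(h, t, n_targets) for h, t in pairs]

print(pairs)     # [(0, 0), (0, 1), (1, 0), (1, 1)]
print(quoted)    # [0, 1, 1, 2]  -> (0, 1) and (1, 0) collide
print(proposed)  # [0, 1, 2, 3]  -> one distinct index per pair
```

The strided form matches the [A1, B1, A2, B2] order given earlier: each horizon step skips over n_targets estimators.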

@madtoinou
Collaborator

Well spotted! I noticed it as well and am almost ready to submit a PR to fix this bug (and add a docstring).

I will also implement a method to get the "lagged_features_names", and I'll make sure to link this issue.
