Problems interpreting variable importances in multivariate time series forest #736
Unanswered
flying-scotsman asked this question in Q&A
Replies: 0 comments
Hi sktime community!
I'm trying to understand variable importances in a multivariate time series forest. I've attached a diagram showing the 3 standard feature importances for a time series forest fitted on a dataset with 103 instances, 2 variables (here 1 & 2) and 28 time points. What confuses me is that the curves appear continuous at the boundary between the two variables, where the importance for the last time point of variable 1 sits directly before the importance for the first time point of variable 2. I've also observed that the order in which the variables are placed in the long (concatenated) time series affects the feature importances.
Can anyone explain what's happening here? What I would expect: discontinuous (but normalized) importance curves, one per variable.
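A minimal sketch of one possible explanation, using NumPy only and not the actual forest: it assumes the two variables are concatenated into a single series of 2 x 28 = 56 points, that random intervals are drawn over that whole axis, and that each interval's importance is spread over every time point it covers. None of this is confirmed for the setup above. The names `n_intervals` and `min_length` and their values are hypothetical, and the "importance" is a placeholder of one unit per interval.

```python
import numpy as np

n_vars, n_timepoints = 2, 28            # shape taken from the question
series_length = n_vars * n_timepoints   # 56 points after concatenation
boundary = n_timepoints                 # first index that belongs to variable 2

rng = np.random.default_rng(0)
n_intervals = 1000                      # hypothetical count, for illustration only
min_length = 3                          # hypothetical minimum interval length

# Draw random [start, end) intervals over the whole concatenated axis.
starts = rng.integers(0, series_length - min_length + 1, size=n_intervals)
ends = np.array(
    [rng.integers(s + min_length, series_length + 1) for s in starts]
)

# An interval straddles the boundary if it starts in variable 1
# and ends inside variable 2.
straddles = (starts < boundary) & (ends > boundary)

# Spread one unit of placeholder importance over every covered time point.
coverage = np.zeros(series_length)
for s, e in zip(starts, ends):
    coverage[s:e] += 1.0

print(f"{straddles.mean():.0%} of intervals cross the variable boundary")
print("coverage just before / after the boundary:",
      coverage[boundary - 1], coverage[boundary])
```

Under these assumptions, every straddling interval contributes equally to the last point of variable 1 and the first point of variable 2, which would make the curve look continuous there. It would also make the result depend on which variables end up adjacent in the concatenated series.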
Here's the code I'm using to extract the importances (df is a pandas.DataFrame with pandas.Series as cells and labels is a pandas.Series):