Permutation importance calculations of multilevel models #138

Closed
BELONOVSKII opened this issue Oct 23, 2023 · 0 comments · Fixed by #139
Labels
bug Something isn't working

🐛 Bug

Problem

The functions calc_one_feat_imp and calc_feats_permutation_imps in lightautoml/automl/presets/utils.py fail with a KeyError on multilevel models: the feature name they try to permute is not a column of the input data.

To Reproduce

Fit a TabularAutoML with a multiclass Task, then call get_feature_scores('accurate', df).

Traceback

KeyError Traceback (most recent call last)
Cell In[63], line 1
----> 1 accurate_fi = automl.get_feature_scores('accurate', test_data, silent=True)
2 accurate_fi.set_index('Feature')['Importance'].plot.bar(figsize = (30, 10), grid = True)

File ~/LightAutoML/lightautoml/automl/presets/tabular_presets.py:837, in TabularAutoML.get_feature_scores(self, calc_method, data, features_names, silent)
835 data, _ = read_data(data, features_names, self.cpu_limit, read_csv_params)
836 used_feats = self.collect_used_feats()
--> 837 fi = calc_feats_permutation_imps(
838 self,
839 used_feats,
840 data,
841 self.reader.target,
842 self.task.get_dataset_metric(),
843 silent=silent,
844 )
845 return fi

File ~/LightAutoML/lightautoml/automl/presets/utils.py:38, in calc_feats_permutation_imps(model, used_feats, data, target, metric, silent)
35 feat_imp = []
36 for it, f in enumerate(used_feats):
37 feat_imp.append(
---> 38 calc_one_feat_imp(
39 (it + 1, n_used_feats),
40 f,
41 model,
42 data,
43 norm_score,
44 target,
45 metric,
46 silent,
47 )
48 )
49 feat_imp = pd.DataFrame(feat_imp, columns=["Feature", "Importance"])
50 feat_imp = feat_imp.sort_values("Importance", ascending=False).reset_index(drop=True)

File ~/LightAutoML/lightautoml/automl/presets/utils.py:14, in calc_one_feat_imp(iters, feat, model, data, norm_score, target, metric, silent)
13 def calc_one_feat_imp(iters, feat, model, data, norm_score, target, metric, silent):
---> 14 initial_col = data[feat].copy()
15 data[feat] = np.random.permutation(data[feat].values)
17 preds = model.predict(data)

File ~/LAMA_venv3_8/lib/python3.8/site-packages/pandas/core/frame.py:3807, in DataFrame.__getitem__(self, key)
3805 if self.columns.nlevels > 1:
3806 return self._getitem_multilevel(key)
-> 3807 indexer = self.columns.get_loc(key)
3808 if is_integer(indexer):
3809 indexer = [indexer]

File ~/LAMA_venv3_8/lib/python3.8/site-packages/pandas/core/indexes/base.py:3804, in Index.get_loc(self, key, method, tolerance)
3802 return self._engine.get_loc(casted_key)
3803 except KeyError as err:
-> 3804 raise KeyError(key) from err
3805 except TypeError:
3806 # If we have a listlike key, _check_indexing_error will raise
3807 # InvalidIndexError. Otherwise we fall through and re-raise
3808 # the TypeError.
3809 self._check_indexing_error(key)

KeyError: 'Lvl_0_Pipe_0_Mod_0_LinearL2_prediction_0'
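
The key in the error looks like the name of a level-0 model prediction rather than a column of the input DataFrame, which would explain why data[feat] fails. The sketch below illustrates one possible guard under that assumption: permute only the features that exist as columns in the data and report the rest. It is not the code from lightautoml/automl/presets/utils.py and not necessarily how #139 resolved the issue. The function permutation_imps_guarded, the toy predict and accuracy callables, and the sample data are all hypothetical.

```python
import numpy as np
import pandas as pd


def accuracy(y_true, y_pred):
    """Toy metric: fraction of matching labels (higher is better)."""
    return float(np.mean(y_true == y_pred))


def permutation_imps_guarded(predict, used_feats, data, target, metric, seed=0):
    """Permutation importance restricted to features present in `data`.

    Hypothetical helper: `predict` is any callable mapping a DataFrame to
    predictions. Returns (importances, skipped), where `skipped` lists the
    names in `used_feats` that are not columns of `data`.
    """
    rng = np.random.default_rng(seed)
    present = [f for f in used_feats if f in data.columns]
    skipped = [f for f in used_feats if f not in data.columns]

    y_true = data[target].values
    base_score = metric(y_true, predict(data))

    rows = []
    for feat in present:
        initial_col = data[feat].copy()
        data[feat] = rng.permutation(data[feat].values)  # shuffle one column
        score = metric(y_true, predict(data))
        data[feat] = initial_col  # restore the original column
        rows.append((feat, base_score - score))

    imps = pd.DataFrame(rows, columns=["Feature", "Importance"])
    imps = imps.sort_values("Importance", ascending=False).reset_index(drop=True)
    return imps, skipped


# Toy data: the "model" only looks at x1; x2 is constant and unused.
data = pd.DataFrame(
    {
        "x1": [-2, -1, 1, 2, 3, -3],
        "x2": [5, 5, 5, 5, 5, 5],
        "target": [0, 0, 1, 1, 1, 0],
    }
)


def predict(df):
    return (df["x1"].values > 0).astype(int)


# The third name mimics the key from the traceback; it is not a column of `data`.
used_feats = ["x1", "x2", "Lvl_0_Pipe_0_Mod_0_LinearL2_prediction_0"]
imps, skipped = permutation_imps_guarded(predict, used_feats, data, "target", accuracy)
print(imps)
print("Skipped (not in data):", skipped)
```

Silently skipping names may hide real problems, so returning the skipped list (or raising a clearer error) is a design choice; a proper fix may instead need to collect the input features used across all levels of the model.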

@BELONOVSKII BELONOVSKII added the bug Something isn't working label Oct 23, 2023