Skip to content

Refactor permutation importance method and add per-column permutation importance method #2302

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 21 commits into from
Jun 1, 2021

Conversation

angela97lin
Copy link
Contributor

@angela97lin angela97lin commented May 25, 2021

Closes #2299

  • Also updates calculate_permutation_importance to only return dict with mean. Previously, the slow impl would return the std / the feature importances but that was different from the fast impl which only returned the mean. Hence, standardizing to return just the mean, since we don't use the std/feature importances to begin with.

Question for discussion: We no longer use the scikit-learn calculate_permutation_importance method as the "slow" method. Should we still compare outputs? I originally wrote this test and confirmed that it passed, but don't think it's necessary to add to codebase. (It also takes 5 minutes for the test locally 😱). Since we're moving away from the sklearn method, I think its fine.



@pytest.mark.parametrize('pipeline_class, parameters', test_cases)
@patch('evalml.pipelines.PipelineBase._supports_fast_permutation_importance', new_callable=PropertyMock)
def test_slow_permutation_importance_matches_sklearn_output(mock_supports_fast_importance, pipeline_class, parameters,
                                                            has_minimal_dependencies, fraud_100):
    if has_minimal_dependencies and pipeline_class == LinearPipelineWithTargetEncoderAndOHE:
        pytest.skip("Skipping test_fast_permutation_importance_matches_sklearn_output for target encoder cause "
                    "dependency not installed.")
    from evalml.demos import load_fraud
    from evalml.objectives import get_objective
    from sklearn.inspection import permutation_importance as sk_permutation_importance

    X, y = fraud_100

    if pipeline_class == LinearPipelineWithTextFeatures:
        X.ww.set_types(logical_types={'provider': 'NaturalLanguage'})

    # Do this to make sure we use the same int as sklearn under the hood
    random_state = np.random.RandomState(0)
    random_seed = random_state.randint(np.iinfo(np.int32).max + 1)

    mock_supports_fast_importance.return_value = False
    parameters['Random Forest Classifier'] = {'n_jobs': 1}
    pipeline = pipeline_class(pipeline_class.component_graph, parameters=parameters)
    pipeline.fit(X, y)

    random_state = np.random.RandomState(0)
    random_seed = random_state.randint(np.iinfo(np.int32).max + 1)
    def sklearn_impl(pipeline, X, y, objective, n_repeats, n_jobs, random_seed):
        X = infer_feature_types(X)
        y = infer_feature_types(y)
        objective = get_objective(objective, return_instance=True)

        def scorer(pipeline, X, y):
            scores = pipeline.score(X, y, objectives=[objective])
            return scores[objective.name] if objective.greater_is_better else -scores[objective.name]
        perm_importance = sk_permutation_importance(pipeline, X, y, n_repeats=n_repeats, scoring=scorer, n_jobs=n_jobs,
                                                    random_state=random_seed)
        mean_perm_importance = perm_importance["importances_mean"]
        feature_names = list(X.columns)
        mean_perm_importance = list(zip(feature_names, mean_perm_importance))
        mean_perm_importance.sort(key=lambda x: x[1], reverse=True)
        return pd.DataFrame(mean_perm_importance, columns=["feature", "importance"])

    sklearn_scores = sklearn_impl(pipeline, X, y, objective='Log Loss Binary', n_repeats=5, n_jobs=None,
                                                   random_seed=0)
    mock_supports_fast_importance.return_value = False
    slow_scores = calculate_permutation_importance(pipeline, X, y, objective='Log Loss Binary',
                                                   random_seed=random_seed)
    pd.testing.assert_frame_equal(sklearn_scores, slow_scores)

@angela97lin angela97lin self-assigned this May 25, 2021
@@ -293,113 +293,9 @@ def graph_roc_curve(y_true, y_pred_proba, custom_class_names=None, title_additio
return _go.Figure(layout=layout, data=graph_data)


def _calculate_permutation_scores_fast(pipeline, precomputed_features, y, objective, col_name,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved everything to a permutation_importance file since we were starting to add more methods that were just to calculate permutation importance. Left the graphing permutation importance method here, though.

@codecov
Copy link

codecov bot commented May 25, 2021

Codecov Report

Merging #2302 (9431939) into main (275ea65) will increase coverage by 0.1%.
The diff coverage is 100.0%.

Impacted file tree graph

@@           Coverage Diff           @@
##            main   #2302     +/-   ##
=======================================
+ Coverage   99.9%   99.9%   +0.1%     
=======================================
  Files        280     281      +1     
  Lines      24492   24566     +74     
=======================================
+ Hits       24464   24538     +74     
  Misses        28      28             
Impacted Files Coverage Δ
evalml/model_understanding/__init__.py 100.0% <100.0%> (ø)
evalml/model_understanding/graphs.py 100.0% <100.0%> (ø)
...alml/model_understanding/permutation_importance.py 100.0% <100.0%> (ø)
...understanding_tests/test_permutation_importance.py 100.0% <100.0%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 275ea65...9431939. Read the comment docs.

@freddyaboulton
Copy link
Contributor

@angela97lin What i like about that test is that it makes sure our permutation importance implementation is producing correct results (at least by sklearn standards). I'm not sure what you have in mind for how the new tests would look, but I think it's worth ensuring we're producing correct values (especially since we're venturing out with our own impl now). We don't have to call sklearn's permutation importance to do that though.

@angela97lin
Copy link
Contributor Author

@freddyaboulton I converted the previous test which checked against sklearn's impl to compare the new "slow" vs fast method, as a way of checking for correctness. This test which I added above is to check that the "slow" produces the same as sklearn's... which I think is fine to check for this PR but since this PR is also breaking away from the public sklearn interface, is perhaps not necessary? What do you think? 🤔
I guess TLDR is, how tied are we to checking for correctness via external library :d

@freddyaboulton
Copy link
Contributor

Got it! I don't think we have to use sklearn to establish "correctness"!

@angela97lin
Copy link
Contributor Author

@freddyaboulton Sounds good!! 😁

@angela97lin angela97lin marked this pull request as ready for review May 26, 2021 03:53
Copy link
Contributor

@bchen1116 bchen1116 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! I left a few nit-pick comments, but the test coverage is solid, and I like how you broke the methods apart!

pipeline (PipelineBase or subclass): Fitted pipeline
X (pd.DataFrame): The input data used to score and compute permutation importance
y (pd.Series): The target data
objective (str, ObjectiveBase): Objective to score on
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick, but can you sort these args based on the order of the args in the method? Also, can you add col_name to the args here?

return pd.DataFrame(mean_perm_importance, columns=["feature", "importance"])


def calculate_permutation_importance_one_column(X, y, pipeline, col_name, objective,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: to make this and the previous method more similar, can you normalize the order of X, y, pipeline vs pipeline, X, y?

Copy link
Contributor

@chukarsten chukarsten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Angela, this looks good. I think there are a few doctstring adjustments to make. It also seems that the per-column and base perm importance functions are pretty similar so I was wondering if its possible to reuse the function more or less as-is. I had grand ideas to try it out, but I got distracted :P. Anyways, I'm sure it's the way it is for a reason, so take it or leave it if you think it's worth exploring.

@chukarsten chukarsten merged commit e323993 into main Jun 1, 2021
@chukarsten chukarsten deleted the 2299_per_col branch June 1, 2021 23:22
@chukarsten chukarsten mentioned this pull request Jun 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add per-column permutation importance methods
4 participants