Added ability to create a bar plot of feature importances #133
Codecov Report
@@ Coverage Diff @@
## master #133 +/- ##
==========================================
+ Coverage 97.3% 97.32% +0.01%
==========================================
Files 95 95
Lines 2712 2729 +17
==========================================
+ Hits 2639 2656 +17
Misses 73 73
Continue to review full report at Codecov.
How does this handle features that weren't selected during feature selection? Are they still shown on the x-axis?
It only shows features that were selected for that particular pipeline. Features that were dropped in this pipeline are not shown.
Actually, it looks like this functionality already exists as feature_importances_plot() in evalml/models/render.py. Does this plot need any changes, @kmax12 @jeremyliweishih?
evalml/pipelines/pipeline_base.py
Outdated
    def plot_feature_importances(self):
        feat_imp = self.feature_importances
        feat_imp['importance'] = abs(feat_imp['importance'])
        feat_imp = feat_imp.iloc[::-1]
Can you add a comment explaining why we need to reverse the list? I would think feature_importances already returns it in the right order.
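One possible reason for the reversal, offered here as an assumption rather than the author's stated rationale: horizontal bar charts in libraries such as Plotly draw the first row at the bottom of the y-axis, so a list sorted from most to least important has to be reversed for the most important feature to end up on top. A minimal stdlib-only sketch of the ordering (the feature names and values are made up for illustration):

```python
# Hypothetical importances; names and values are illustrative only.
importances = [("age", 0.10), ("income", -0.55), ("zip_code", 0.30)]

# Same ordering as the feature_importances property in the diff:
# descending by absolute importance.
importances.sort(key=lambda x: -abs(x[1]))

# Take absolute values, then reverse so the largest value comes last.
# A horizontal bar chart that draws rows bottom-to-top would then show
# the most important feature at the top.
plot_rows = [(name, abs(value)) for name, value in importances][::-1]

print(plot_rows)
```

If the chart were vertical instead, the reversal would presumably be unnecessary, which is why a short comment in the code helps.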
evalml/pipelines/pipeline_base.py
Outdated
@@ -252,3 +253,28 @@ def feature_importances(self):
        importances.sort(key=lambda x: -abs(x[1]))
        df = pd.DataFrame(importances, columns=["feature", "importance"])
        return df

    def plot_feature_importances(self):
We should wait for #169 to get merged and then change the API to pipeline.plot.feature_importances.
I see this might be a bit tricky, since the feature importances are stored on the pipeline object rather than in the results dictionary.
I currently have it so that the plot class takes in a data dictionary that can be updated to store whatever (aka more than just results) for this purpose. I wonder if there is a better implementation?
Hmm. One thought I have is that we could initialize the class every time .plot is accessed. When we initialize it, we just pass the pipeline object to the constructor. That would also allow us to override __call__ so that pipeline.plot() renders the pipeline graphic in #211.
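The suggestion above could be sketched roughly as follows. This is only an illustration of the pattern (a property that builds a helper on each access, plus `__call__`), not the code that was merged; the class names `PipelinePlots` and `Pipeline`, the `name` attribute, and the string return values are all hypothetical stand-ins.

```python
class PipelinePlots:
    """Hypothetical plotting helper bound to a single pipeline."""

    def __init__(self, pipeline):
        # The helper keeps a reference to the pipeline, so it can read
        # attributes that are not part of any results dictionary.
        self.pipeline = pipeline

    def __call__(self, to_file=None):
        # Stand-in for rendering the pipeline graphic.
        return "graph of {}".format(self.pipeline.name)

    def feature_importances(self):
        # Stand-in for building the bar plot: rank by absolute importance.
        return sorted(self.pipeline.feature_importances, key=lambda x: -abs(x[1]))


class Pipeline:
    def __init__(self, name, feature_importances):
        self.name = name
        self.feature_importances = feature_importances

    @property
    def plot(self):
        # A fresh helper is created on every access.
        return PipelinePlots(self)


pipeline = Pipeline("demo", [("a", 0.1), ("b", -0.7)])
graph = pipeline.plot()                        # calls PipelinePlots.__call__
ranked = pipeline.plot.feature_importances()   # calls the named plot method
```

With this shape, both `pipeline.plot()` and `pipeline.plot.feature_importances()` work, and the helper never holds stale data because it is rebuilt on each access.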
This function is called by using
@christopherbunn I think the naming is good!
LGTM other than some clarifying questions. One small thing: would it look better to center the title and subtitle?
LGTM! :D
Naming sounds good to me! (Honestly, might be worth reducing generate_roc() to just roc() but if anything we can address that in a future PR :))
Only one question; otherwise looks good.
@@ -75,3 +75,32 @@ def __call__(self, to_file=None):
        graph.render(output_path, cleanup=True)

        return graph

    def feature_importances(self):
        import plotly.graph_objects as go
why do we import this inside of the function?
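The thread does not record an answer to this question. One common reason for a function-level import, stated here as an assumption, is to keep a plotting library optional: the package can then be imported without it, and only the plotting method fails if it is missing. A minimal sketch of that pattern (the helper names are hypothetical, and `json` stands in for `plotly.graph_objects` so the example runs without extra packages):

```python
import importlib


def lazy_import(module_name):
    """Import a module only when it is needed (hypothetical helper)."""
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        # Re-raise with a message that tells the user what to install.
        raise ImportError(
            "{} is required for plotting; install it to use this method".format(module_name)
        ) from err


def feature_importances_plot_backend():
    # In the PR this would be `import plotly.graph_objects as go`;
    # `json` is used here purely as a stand-in.
    backend = lazy_import("json")
    return backend.__name__
```

The trade-off is that a missing dependency is only discovered when the method is first called rather than at import time.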
Addressed conf.py issues
Resolves #24