Add a plot function for gains/lift in R and Python #7271

Closed
exalate-issue-sync bot opened this issue May 11, 2023 · 5 comments

Comments

@exalate-issue-sync

We don't have this in R or Python; however, we do have the plots in Flow.

Here's how you'd have to do it in R:

{code:r}# gain_table is assumed to be a gains/lift table, e.g. from h2o.gainsLift(model)
plot(gain_table$cumulative_data_fraction,
     gain_table$cumulative_capture_rate, type = "l",
     ylim = c(0, 1.5), col = "dodgerblue3",
     xlab = "cumulative data fraction",
     ylab = "cumulative capture rate, cumulative lift",
     main = "Gains/Lift")
# Overlay cumulative lift on the same axes
lines(gain_table$cumulative_data_fraction,
      gain_table$cumulative_lift, col = "orange"){code}

Python code:

{code:python}import h2o
from h2o.estimators import H2OGradientBoostingEstimator
from h2o.utils.ext_dependencies import get_matplotlib_pyplot

# Start (or connect to) an H2O cluster
h2o.init()

# Import the airlines dataset
airlines = h2o.import_file("https://s3.amazonaws.com/h2o-public-test-data/smalldata/testng/airlines_train.csv")

# Build and train the model

model = H2OGradientBoostingEstimator(ntrees=1, gainslift_bins=20)
model.train(x=["Origin","Distance"], y="IsDepDelayed", training_frame=airlines)

gl = model.gains_lift()

X = gl['cumulative_data_fraction']
Y = gl['cumulative_capture_rate']
YC = gl['cumulative_lift']

plt = get_matplotlib_pyplot(server=False, raise_if_not_available=True)
plt.figure(figsize=(10,10))
plt.grid(True)
plt.plot(X, Y, zorder=10, label='cumulative capture rate')
plt.plot(X, YC, zorder=10, label='cumulative lift')
plt.legend(loc=4, fancybox=True, framealpha=0.5)
plt.xlim(0, 1)
plt.ylim(0, 1.5)
plt.xlabel('cumulative data fraction')
plt.ylabel('cumulative capture rate, cumulative lift')
plt.title('Gains/Lift')
fig = plt.gcf()
plt.show(){code}

Functions should be something like:

R:

{code:r}h2o.plot_gainslift(model, xval = TRUE){code}

Python:

{code:python}model.plot_gainslift(xval=True){code}
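
One way the proposed Python helper might be laid out is sketched below. The names `gainslift_series` and the free function `plot_gainslift` are hypothetical, and the sketch assumes that `model.gains_lift(xval=True)` returns the cross-validation gains/lift table and that the table can be indexed by column name, as in the snippet above. It is an illustration under those assumptions, not the library's implementation.

```python
def gainslift_series(model, xval=False):
    """Return (data_fraction, capture_rate, lift) columns for a model.

    Assumption: model.gains_lift(xval=...) returns a table that supports
    lookup by column name, as in the snippet above.
    """
    table = model.gains_lift(xval=xval)
    return (
        list(table["cumulative_data_fraction"]),
        list(table["cumulative_capture_rate"]),
        list(table["cumulative_lift"]),
    )


def plot_gainslift(model, xval=False):
    """Hypothetical helper: draw both curves on one set of axes."""
    # Imported lazily so gainslift_series has no plotting dependency.
    import matplotlib.pyplot as plt

    x, capture, lift = gainslift_series(model, xval=xval)
    fig, ax = plt.subplots(figsize=(10, 10))
    ax.grid(True)
    ax.plot(x, capture, label="cumulative capture rate")
    ax.plot(x, lift, label="cumulative lift")
    ax.set_xlim(0, 1)
    ax.set_xlabel("cumulative data fraction")
    ax.set_ylabel("cumulative capture rate, cumulative lift")
    ax.set_title("Gains/Lift")
    ax.legend(loc="lower right")
    return fig
```

Keeping the data extraction separate from the drawing code means the `xval` handling can be checked without a display or a running cluster.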

!Screen Shot 2021-10-20 at 5.49.08 PM.png|width=541,height=524!

@exalate-issue-sync
Author

Erin LeDell commented: I think there’s a standard blue to use in our R plots, but I just guessed a shade of blue to use.

@exalate-issue-sync
Author

Erin LeDell commented: We can also consider whether we want to allow the user to plot just one of the two curves at a time, maybe by passing an arg, and have the default show both at once. Check out this ticket for more info and discussion: https://github.com//pull/5845
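
The single-curve option could be exposed through one argument whose default keeps both curves. A minimal sketch follows; the function name `select_curves` and the argument `kind` with values "both", "gains", and "lift" are hypothetical choices made for illustration, not names taken from the ticket.

```python
def select_curves(capture_rate, lift, kind="both"):
    """Return the (label, values) pairs to draw for the requested kind."""
    curves = {
        "gains": ("cumulative capture rate", capture_rate),
        "lift": ("cumulative lift", lift),
    }
    if kind == "both":
        # Default: keep both curves on one plot.
        return [curves["gains"], curves["lift"]]
    if kind not in curves:
        raise ValueError("kind must be 'both', 'gains', or 'lift', got %r" % (kind,))
    return [curves[kind]]
```

A plotting function would then loop over the returned pairs and draw one line per pair, so the default output stays the same as the two-curve plot.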

@h2o-ops-ro
Collaborator

JIRA Issue Details

Jira Issue: PUBDEV-8388
Assignee: Tomas Fryda
Reporter: Erin LeDell
State: Resolved
Fix Version: 3.36.1.1
Attachments: Available (Count: 3)
Development PRs: Available

@h2o-ops-ro
Collaborator

Attachments From Jira

Attachment Name: Screen Shot 2021-10-20 at 5.49.08 PM.png
Attached By: Erin LeDell
File Link:https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-8388/Screen Shot 2021-10-20 at 5.49.08 PM.png

Attachment Name: Screen Shot 2021-11-15 at 3.31.27 PM.png
Attached By: Erin LeDell
File Link:https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-8388/Screen Shot 2021-11-15 at 3.31.27 PM.png

Attachment Name: Screen Shot 2021-11-15 at 3.37.06 PM.png
Attached By: Erin LeDell
File Link:https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-8388/Screen Shot 2021-11-15 at 3.37.06 PM.png

@h2o-ops-ro
Collaborator

Linked PRs from JIRA

#5907
