Plot for time series regression #1483

Merged
freddyaboulton merged 10 commits into main from 1258-plot-for-ts-regression on Dec 2, 2020

Contributor

@freddyaboulton freddyaboulton commented Dec 1, 2020

Pull Request Description

Fixes #1258

Example with a year's worth of (synthetic) data:
image

Example with a month's worth of data:
image


After creating the pull request: in order to pass the release_notes_updated check you will need to update the "Future Release" section of docs/source/release_notes.rst to include this pull request by adding :pr:123.

@freddyaboulton freddyaboulton changed the title from "Adding graph_prediction_vs_target_over_time function to model_underst…" to "Plot for time series regression" Dec 1, 2020

codecov bot commented Dec 1, 2020

Codecov Report

Merging #1483 (de127f0) into main (e3acef6) will increase coverage by 0.1%.
The diff coverage is 100.0%.

@@            Coverage Diff            @@
##             main    #1483     +/-   ##
=========================================
+ Coverage   100.0%   100.0%   +0.1%     
=========================================
  Files         223      223             
  Lines       15100    15139     +39     
=========================================
+ Hits        15093    15132     +39     
  Misses          7        7             
Impacted Files                                          Coverage Δ
evalml/model_understanding/__init__.py                  100.0% <ø> (ø)
evalml/model_understanding/graphs.py                    100.0% <100.0%> (ø)
...lml/tests/model_understanding_tests/test_graphs.py   100.0% <100.0%> (ø)

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Collaborator

@jeremyliweishih jeremyliweishih left a comment

Sweet, looks good to me. The only extra thing I would mention is that we should use this in the docs when we get to it.

Contributor

@angela97lin angela97lin left a comment

Sweet, the graphs look amazing! Just left a tiny comment about the axis name :d

# Let plotly pick the best date format.
layout = _go.Layout(title={'text': "Prediction vs Target over time"},
                    xaxis={'title': 'Time'},
                    yaxis={'title': 'Target Values'})
Contributor

Really nit-picky: could we name the y-axis something else since we're graphing both target and prediction values?

Contributor Author

Hehe good point. I'll change it to "Target Values and Predictions"!

@freddyaboulton freddyaboulton force-pushed the 1258-plot-for-ts-regression branch from 6c120e8 to c2d1b05 on December 1, 2020 22:03
Contributor Author

Good point, @jeremyliweishih! I updated #1384 to mention showing how to use the model understanding module for time series.

Contributor

@dsherry dsherry left a comment

@freddyaboulton I left a couple of suggestions and some thoughts, but once those are resolved, LGTM!

     graph_partial_dependence,
-    graph_prediction_vs_actual
+    graph_prediction_vs_actual,
+    graph_prediction_vs_target_over_time
Contributor

@freddyaboulton do you like "graph_prediction_vs_actual" or "graph_prediction_vs_target" more? We should use the same name for both of these.

I don't feel strongly about it, but I am more used to "prediction vs actual".

Contributor

Now that I've seen predicted vs actual, I agree that that sounds much better :o

Contributor Author

Sounds good! I preferred using "target" in the function name since that's what we mention in the docstrings and axis labels, but I'll change the name to "actual" for consistency.

return _go.Figure(layout=layout, data=data)


def graph_prediction_vs_target_over_time(pipeline, X, y, dates):
Contributor

I commented about this on the issue, but to reiterate: it's fine to leave this as-is and expect dates as a 4th arg, but heads up that @angela97lin is in the process of standardizing all the util/graph methods to use DataTables, in which case X.time_index should get you the dates.

    y (pd.Series): Target values to compare predictions against.

Returns:
    plotly.Figure showing the prediction vs actual over time.
Contributor

@freddyaboulton could we separate this into a data generation method and a graphing method? That way callers with a different UI can use the data generation method, and people who want plotly figs can use this method.

For this particular plot the data generated is fairly simple, just three vectors: predictions, actual values, and dates.

Contributor Author

Yep, I thought about doing that at first but decided it wasn't worth it since it was a one-liner. I'll push this up now!

class NotTSPipeline:
    problem_type = ProblemTypes.REGRESSION

error_msg = "graph_prediction_vs_target_over_time only supports time series regression pipelines! Received regression."
Contributor

👍
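The test fragment above implies a guard that rejects pipelines that are not time series regression. A minimal sketch of such a check might look like the following; the `ProblemTypes` enum here is a local stand-in with assumed values, and `check_time_series_regression` is a hypothetical helper name, so treat this as an illustration rather than the code in the PR.

```python
from enum import Enum


class ProblemTypes(Enum):
    """Stand-in for the library's enum; member values are assumed for illustration."""
    REGRESSION = "regression"
    TIME_SERIES_REGRESSION = "time series regression"


def check_time_series_regression(pipeline) -> None:
    """Hypothetical guard: raise if the pipeline is not a time series regression pipeline."""
    if pipeline.problem_type != ProblemTypes.TIME_SERIES_REGRESSION:
        raise ValueError(
            "graph_prediction_vs_target_over_time only supports time series regression pipelines! "
            f"Received {pipeline.problem_type.value}."
        )


class NotTSPipeline:
    problem_type = ProblemTypes.REGRESSION


class TSPipeline:
    problem_type = ProblemTypes.TIME_SERIES_REGRESSION
```

Calling `check_time_series_regression(NotTSPipeline())` raises a `ValueError` whose message matches the `error_msg` string quoted in the test, while `check_time_series_regression(TSPipeline())` returns without error.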

@freddyaboulton freddyaboulton force-pushed the 1258-plot-for-ts-regression branch from c2d1b05 to fd2732c on December 2, 2020 15:53
@freddyaboulton freddyaboulton force-pushed the 1258-plot-for-ts-regression branch from af55d46 to 6164e95 on December 2, 2020 17:02
@freddyaboulton freddyaboulton merged commit 3d69967 into main Dec 2, 2020
@freddyaboulton freddyaboulton deleted the 1258-plot-for-ts-regression branch December 2, 2020 18:01

Development

Successfully merging this pull request may close these issues.

Add predicted-vs-actual plot for timeseries regression

4 participants