Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add optional prophet forecasting functionality to chart data api #10324

Merged
merged 7 commits into from
Jul 20, 2020

Conversation

villebro
Copy link
Member

SUMMARY

This PR adds Facebook Prophet as an optional dependency to enable time series forecasting in viz plugins. For each series (=column), it will add three new columns with the following suffix:

  • __yhat: forecast
  • __yhat_lower: lower confidence level
  • __yhat_upper: uper confidence level
    In addition, it extrapolates as many time grain steps into the future as have been specified in the periods parameter. For instance, 52 periods with weekly time grains will create a forecast of 52 weekly observations = 1 year into the future. The forecast along with confidence intervals will be made available for the whole time period, while the observations (without a suffix) will be null for the forecasted dates. For more details please refer to the excellent tutorial by the Prophet team.

This treats each series as an individual time series model, hence will be computationally very expensive in the case of multiple columns. However, since the new chart data endpoint caches the end result, only the first calculation will be slow.

I've tried to add as many different tests as possible (schema validation, post processing, e2e API test) to ensure functionality. The end-to-end tests require fbprophet to be installed, however I decided against adding it to requirements-dev.txt, as it will make CI much more sluggish. However, anyone interested in developing this feature further can install the dependency locally and activate the e2e tests that are otherwise skipped.

TEST PLAN

CI + new tests

ADDITIONAL INFORMATION

  • Has associated issue:
  • Changes UI
  • Requires DB Migration.
  • Confirm DB Migration upgrade and downgrade tested.
  • Introduces new feature or API
  • Removes existing feature or API

@villebro villebro force-pushed the villebro/prophet branch 4 times, most recently from c57a8e1 to aeeb919 Compare July 15, 2020 20:15
Copy link
Member

@willbarrett willbarrett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple code structure suggestions, but other than that looks good.

@@ -544,3 +563,105 @@ def contribution(
if temporal_series is not None:
contribution_df.insert(0, DTTM_ALIAS, temporal_series)
return contribution_df


def prophet( # pylint: disable=too-many-arguments,too-many-locals
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this function be broken up? Generally comments in a large function like this are a good place to draw divisions

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These unfortunately need to be one single function with too-many-arguments, as they are called based on instructions provided in the chart data request. But you're right, too-many-locals doesn't need to be part of the design here, so will try to break out some internal logic.

@@ -508,3 +509,119 @@ def test_contribution(self):
self.assertListEqual(df.columns.tolist(), ["a", "b"])
self.assertListEqual(series_to_list(column_df["a"]), [0.25, 0.75])
self.assertListEqual(series_to_list(column_df["b"]), [0.1, 0.9])

def test_prophet_incorrect_values(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Breaking this up into a test for each incorrect value type may make debugging test failures easier down the road.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, will break this up.

@villebro
Copy link
Member Author

@willbarrett this is ready for re-review. I split out the training part into a private method to make the main method simpler (was able to drop the disable too-many-locals), and also made each unit test a separate function. However, the too-many-arguments can't really be changed due to the nature of how post processing operations are called from the chart data API.

Copy link
Member

@willbarrett willbarrett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, one suggestion for avoiding another pylint disable, but it's non-blocking.

@villebro villebro merged commit 7af8b2b into apache:master Jul 20, 2020
@villebro villebro deleted the villebro/prophet branch July 27, 2020 18:20
@enjineerMan
Copy link

Hi, this might be a dumb question but where can we find this forecasting function in Superset?

@villebro
Copy link
Member Author

villebro commented Nov 9, 2020

Hi @enjineerMan this feature is available on master branch and on the forthcoming 0.38 release of Apache Superset. on the new ECharts based timeseries viz type. To enable, you need to make sure the fbprophet package is installed, after which the Predictive Analytics feature can be used in the "Time-series chart" visualization type. Please keep in mind that this functionality is still experimental and has known performance issues that are gradually being worked on.

auxten pushed a commit to auxten/incubator-superset that referenced this pull request Nov 20, 2020
apache#10324)

* feat: add prophet post processing operation

* add tests

* lint

* whitespace

* remove whitespace

* address comments

* add note to UPDATING.md
@@ -23,6 +23,8 @@ assists people when migrating to a new version.

## Next

* [10324](https://github.com/apache/incubator-superset/pull/10324): Facebook Prophet has been introduced as an optional dependency to add support for timeseries forecasting in the chart data API. To enable this feature, install Superset with the optional dependency `prophet` or directly `pip install fbprophet`.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how to install Superset with the optional dependency prophet? i have already installed superset. Now how to install prophet to my existing superset?

Copy link

@muchemwal muchemwal Jul 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @wakilkhan96

If you haven't managed to solve installing prophet, access the terminal for your superset_app and run the following commands:

  1. docker exec -it superset_app /bin/bash
  2. pip uninstall fbprophet pystan
  3. pip --no-cache-dir install pystan==2.19.1.1
  4. pip install prophet

That should solve that issue.
pystan>=3.0 is currently not supported for prophet that why you should specify that version.

@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.38.0 labels Mar 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/L 🚢 0.38.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants