_Lambda School Data Science — Regression 1_

# Making Forecasts

#### Objectives
- acquire time series data
- begin with baselines for time series
- use Prophet to forecast time series

## Acquire time series data



Use the [Wikimedia Foundation's pageviews tool](https://tools.wmflabs.org/pageviews/) to explore and download Wikipedia pageviews data.

- To learn how to get data from an API, follow along with the [Requests library quickstart](https://2.python-requests.org/en/master/user/quickstart/), or [_Automate the Boring Stuff with Python_, Chapter 14](https://automatetheboringstuff.com/chapter14/) by Al Sweigart.
- Then, refer to the [Wikipedia Pageviews API quickstart](https://wikitech.wikimedia.org/wiki/Analytics/AQS/Pageviews#Quick_start) and [documentation](https://wikimedia.org/api/rest_v1/#/Pageviews%20data).
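
As a rough sketch of what a request to the Pageviews API might look like, the helpers below build a per-article URL and pull `(timestamp, views)` pairs out of the JSON response. The function names, default parameters, and the `user_agent` argument are illustrative choices, not part of the API; check the documentation linked above for the options that apply to your case.

```python
from urllib.parse import quote

# Per-article endpoint, as described in the Pageviews API documentation.
BASE_URL = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def build_pageviews_url(article, start, end, project="en.wikipedia",
                        access="all-access", agent="user", granularity="daily"):
    """Build the per-article URL; start and end are YYYYMMDD strings."""
    title = quote(article.replace(" ", "_"), safe="")
    return "/".join([BASE_URL, project, access, agent, title,
                     granularity, start, end])


def parse_pageviews(payload):
    """Turn the decoded JSON response into (timestamp, views) pairs."""
    return [(item["timestamp"], item["views"]) for item in payload["items"]]


def fetch_pageviews(article, start, end, user_agent):
    """Download daily pageviews; needs network access and the Requests library."""
    import requests  # imported here so the helpers above work without it

    url = build_pageviews_url(article, start, end)
    response = requests.get(url, headers={"User-Agent": user_agent}, timeout=30)
    response.raise_for_status()
    return parse_pageviews(response.json())
```

A call such as `fetch_pageviews("Albert Einstein", "20190101", "20190131", user_agent="my-notebook (me@example.com)")` would then return one pair per day. The article and contact string here are placeholders.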

## Begin with baselines for time series


#### [Will Koehrsen:](https://twitter.com/koehrsen_will/status/1088863527778111488)

> One of the most important steps in a machine learning project is establishing a common sense baseline. If your model can't beat the baseline, then maybe you don't really need machine learning.

> A baseline for classification can be the most common class in the training dataset.

> A baseline for regression can be the mean of the training labels. 

> A baseline for time-series regressions can be the value from the previous timestep.
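
To make the regression and time-series baselines above concrete, here is a minimal sketch in plain Python. The helper names and the numbers are made up for illustration; mean absolute error is used as the metric only because it is easy to read, not because the quote prescribes it.

```python
from statistics import mean


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def mean_baseline(train, test):
    """Predict the mean of the training values for every test point."""
    return [mean(train)] * len(test)


def previous_value_baseline(train, test):
    """Predict each test point with the value observed one step earlier."""
    return [train[-1]] + list(test[:-1])


# Made-up daily counts, for illustration only.
train = [100, 110, 120, 130]
test = [140, 150, 160]

mae_mean = mean_absolute_error(test, mean_baseline(train, test))
mae_previous = mean_absolute_error(test, previous_value_baseline(train, test))
print(mae_mean, mae_previous)
```

On a trending series like this one, the previous-value baseline has the lower error, which is the number a forecasting model would then need to beat.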

#### Rob Hyndman & George Athanasopoulos, [_Forecasting: Principles and Practice_, Chapter 3.1](https://otexts.com/fpp2/simple-methods.html), Some simple forecasting methods:

> Some forecasting methods are extremely simple and surprisingly effective. We will use the following methods as benchmarks throughout this book.

> **Average method:** the forecasts of all future values are equal to the average (or “mean”) of the historical data.

> **Naïve method:** we simply set all forecasts to be the value of the last observation. This method works remarkably well for many economic and financial time series.

> **Drift method:** This is equivalent to drawing a line between the first and last observations, and extrapolating it into the future.

> Sometimes one of these simple methods will be the best forecasting method available; but in many cases, these methods will serve as benchmarks rather than the method of choice. That is, any forecasting methods we develop will be compared to these simple methods to ensure that the new method is better than these simple alternatives. If not, the new method is not worth considering.
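
The three benchmark methods quoted above can be sketched in a few lines of plain Python. The function names and the example series are assumptions made for illustration; `horizon` is the number of future steps to forecast.

```python
def average_forecast(history, horizon):
    """Average method: every forecast equals the mean of the history."""
    return [sum(history) / len(history)] * horizon


def naive_forecast(history, horizon):
    """Naive method: every forecast equals the last observation."""
    return [history[-1]] * horizon


def drift_forecast(history, horizon):
    """Drift method: extend the line through the first and last observations."""
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * h for h in range(1, horizon + 1)]


# Made-up observations, for illustration only.
history = [10, 12, 11, 16]

print(average_forecast(history, 3))
print(naive_forecast(history, 3))
print(drift_forecast(history, 3))
```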

## Use Prophet to forecast time series

We will follow the [Prophet Quick Start tutorial](https://facebook.github.io/prophet/docs/quick_start.html#python-api).

> The input to Prophet is always a dataframe with two columns: `ds` and `y`. The `ds` (datestamp) column should be of a format expected by Pandas, ideally YYYY-MM-DD for a date or YYYY-MM-DD HH:MM:SS for a timestamp. The `y` column must be numeric, and represents the measurement we wish to forecast.
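
One possible way to get pageviews data into that two-column shape is sketched below with pandas. The `items` list imitates the `timestamp` and `views` fields of a Pageviews API response, but the values are made up, and the variable names are only illustrative.

```python
import pandas as pd

# Hypothetical rows shaped like Pageviews API items (values are made up).
items = [
    {"timestamp": "2019010100", "views": 1200},
    {"timestamp": "2019010200", "views": 1350},
    {"timestamp": "2019010300", "views": 1280},
]

raw = pd.DataFrame(items)

# Prophet expects a datestamp column named `ds` and a numeric column named `y`.
df = pd.DataFrame({
    "ds": pd.to_datetime(raw["timestamp"], format="%Y%m%d%H"),
    "y": raw["views"].astype(float),
})
print(df)
```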

> We fit the model by instantiating a new `Prophet` object. Any settings to the forecasting procedure are passed into the constructor. Then you call its `fit` method and pass in the historical dataframe. Fitting should take 1-5 seconds.

```
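# Prophet 1.0 and later is imported as `prophet`; older releases use `fbprophet`.
from fbprophet import Prophet

m = Prophet()  # settings for the forecasting procedure go in the constructor
m.fit(df)      # df holds the historical data in columns `ds` and `y`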
```

> Predictions are then made on a dataframe with a column `ds` containing the dates for which a prediction is to be made. You can get a suitable dataframe that extends into the future a specified number of days using the helper method `Prophet.make_future_dataframe`. By default it will also include the dates from the history, so we will see the model fit as well.

```
# Dates from the history plus 365 future days
future = m.make_future_dataframe(periods=365)
future.tail()
```

> The `predict` method will assign each row in `future` a predicted value which it names `yhat`. If you pass in historical dates, it will provide an in-sample fit. The `forecast` object here is a new dataframe that includes a column `yhat` with the forecast, as well as columns for components and uncertainty intervals.

```
forecast = m.predict(future)
```
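
To check whether a forecast beats a baseline, one option is to hold out the most recent dates, forecast them, and compare errors on those dates. The sketch below uses only pandas. `holdout_mae` is a hypothetical helper, and the small frames stand in for real held-out data, the output of `predict`, and a naive forecast; all values are made up.

```python
import pandas as pd


def holdout_mae(forecast, actual):
    """Mean absolute error of `yhat` against `y` on the dates both frames share."""
    merged = actual.merge(forecast[["ds", "yhat"]], on="ds", how="inner")
    return (merged["y"] - merged["yhat"]).abs().mean()


# Made-up frames, for illustration only.
dates = pd.to_datetime(["2019-02-01", "2019-02-02", "2019-02-03"])
actual = pd.DataFrame({"ds": dates, "y": [100.0, 110.0, 120.0]})
model_forecast = pd.DataFrame({"ds": dates, "yhat": [105.0, 108.0, 126.0]})
naive_baseline = pd.DataFrame({"ds": dates, "yhat": [95.0, 95.0, 95.0]})

print(holdout_mae(model_forecast, actual))
print(holdout_mae(naive_baseline, actual))
```

If the model's error is not clearly lower than the baseline's, the simpler method is probably good enough.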

> You can plot the forecast by calling the `Prophet.plot` method and passing in your forecast dataframe.

```
fig1 = m.plot(forecast)
```

> If you want to see the forecast components, you can use the `Prophet.plot_components` method. By default you’ll see the trend, yearly seasonality, and weekly seasonality of the time series. If you include holidays, you’ll see those here, too.

```
fig2 = m.plot_components(forecast)
```

> More details about the options available for each method are available in the docstrings, for example, via `help(Prophet)` or `help(Prophet.fit)`.