[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ourownstory/neural_prophet/blob/main/tutorials/feature-use/sub_daily_data_yosemite_temps.ipynb)

# Sub-daily data
NeuralProphet can forecast time series with sub-daily observations when you pass in a dataframe whose `ds` column holds timestamps. The timestamps should be formatted as `YYYY-MM-DD HH:MM:SS`; see the example CSV [here](https://github.com/ourownstory/neuralprophet-data/blob/main/datasets/yosemite_temps.csv). When sub-daily data are used, daily seasonality is fit automatically.
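As a minimal sketch of the expected layout, the snippet below builds a tiny dataframe with a `ds` timestamp column and a `y` value column and checks its sampling step. The three rows are hypothetical values made up for illustration; they are not taken from the Yosemite dataset.

```python
import pandas as pd

# Hypothetical toy values; the tutorial itself loads yosemite_temps.csv instead.
toy = pd.DataFrame(
    {
        "ds": ["2017-05-01 00:00:00", "2017-05-01 00:05:00", "2017-05-01 00:10:00"],
        "y": [27.8, 27.0, 26.8],
    }
)

# Parse the strings using the YYYY-MM-DD HH:MM:SS layout described above.
toy["ds"] = pd.to_datetime(toy["ds"], format="%Y-%m-%d %H:%M:%S")

# The spacing between consecutive timestamps is the sampling resolution.
steps = toy["ds"].diff().dropna()
print(steps.unique())
```

If every step is the same, that value is what you would pass as the frequency string (here, 5 minutes).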

Here we fit NeuralProphet to data with 5-minute resolution (daily temperatures at Yosemite).

In [1]:
if "google.colab" in str(get_ipython()):
    !pip install git+https://github.com/ourownstory/neural_prophet.git # may take a while
    #!pip install neuralprophet # much faster, but may not have the latest upgrades/bugfixes

import pandas as pd
from neuralprophet import NeuralProphet, set_log_level

set_log_level("ERROR")

In [2]:
data_location = "https://raw.githubusercontent.com/ourownstory/neuralprophet-data/main/datasets/"

df = pd.read_csv(data_location + "yosemite_temps.csv")

Now we will attempt to forecast the next 7 days. The `5min` data resolution means that we have `60/5*24=288` values per day. Thus, we want to forecast `7*288` periods ahead.
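The same arithmetic can be derived from the sampling resolution rather than typed by hand. This is only a sketch using the standard library; the variable names are illustrative.

```python
from datetime import timedelta

resolution = timedelta(minutes=5)   # sampling step of the data
horizon = timedelta(days=7)         # how far ahead we want to forecast

# Integer division of two timedeltas counts how many steps fit in the span.
steps_per_day = timedelta(days=1) // resolution
periods = horizon // resolution

print(steps_per_day, periods)
```

Computing `periods` this way keeps the forecast horizon correct if the resolution ever changes, for example to 15 minutes.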

Using some common sense, we make two choices:
* We disable weekly seasonality, as nature does not follow the human week's calendar.
* We disable changepoints, as the dataset only contains two months of data.

In [3]:
m = NeuralProphet(
    n_changepoints=0,
    weekly_seasonality=False,
)
metrics = m.fit(df, freq="5min")
future = m.make_future_dataframe(df, periods=7 * 288, n_historic_predictions=True)
forecast = m.predict(future)
m.plot(forecast)

*[Figure: forecast plot showing the `yhat1` line together with a black marker series, spanning 2017-05-01 to 2017-07-12.]*

In [4]:
m.plot_parameters()

*[Figure: model parameters, with a `Trend` panel (2017-05-01 to 2017-07-05) and a `daily` seasonality panel (x axis from 0 to 287).]*

The daily seasonality seems to make sense once we account for the timestamps being recorded in GMT, while Yosemite local time is GMT-8 (GMT-7 during daylight saving time).
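To read the daily seasonality plot in local time, each position on its x axis can be converted to a local clock time. The sketch below assumes that the axis (0 to 287) counts 5-minute steps starting at midnight GMT, and it uses a fixed offset of -8 hours as stated in the text; swap in -7 for daylight saving time. The helper `local_clock_time` is a hypothetical name introduced here.

```python
from datetime import datetime, timedelta, timezone

STEP = timedelta(minutes=5)
# Assumption: a fixed GMT-8 offset, without daylight saving adjustments.
LOCAL = timezone(timedelta(hours=-8))


def local_clock_time(index):
    """Return the local HH:MM for a position on the 0..287 daily axis."""
    # The date is arbitrary; only the time of day matters here.
    gmt = datetime(2017, 5, 1, tzinfo=timezone.utc) + index * STEP
    return gmt.astimezone(LOCAL).strftime("%H:%M")


print(local_clock_time(0))    # midnight GMT in local time
print(local_clock_time(96))   # 08:00 GMT in local time
```

Under these assumptions, index 96 on the plot corresponds to local midnight.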

## Improving trend and seasonality
As we have `288` daily values recorded, we can increase the flexibility of `daily_seasonality`, without danger of overfitting.
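To give a feel for what more flexibility means, the sketch below assumes that an integer passed to `daily_seasonality` sets the number of Fourier terms, as in Prophet-style models. It is a generic illustration of Fourier features, not NeuralProphet's internal implementation, and `fourier_features` is a hypothetical helper.

```python
import math


def fourier_features(t_days, order):
    """Return sine/cosine pairs with a period of one day, up to `order` harmonics."""
    feats = []
    for k in range(1, order + 1):
        angle = 2.0 * math.pi * k * t_days
        feats.append(math.sin(angle))
        feats.append(math.cos(angle))
    return feats


# One feature row for 06:00, i.e. a quarter of the way through the day.
row = fourier_features(t_days=6 / 24, order=10)
print(len(row))
```

Each extra harmonic adds one sine and one cosine coefficient to fit, so 10 terms give 20 coefficients, which is small compared with 288 observations per day.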

Further, we may want to re-visit our decision to disable changepoints, as the data clearly shows changes in trend, as is typical with the weather. We make the following changes:
* increase the `changepoints_range`, as we are doing a short-term prediction
* increase the `n_changepoints` to allow the model to fit the sudden changes in trend
* carefully regularize the trend changepoints by setting `trend_reg` in order to avoid overfitting

In [5]:
m = NeuralProphet(
    changepoints_range=0.95,
    n_changepoints=50,
    trend_reg=1,
    weekly_seasonality=False,
    daily_seasonality=10,
)
metrics = m.fit(df, freq="5min")
future = m.make_future_dataframe(df, periods=60 // 5 * 24 * 7, n_historic_predictions=True)
forecast = m.predict(future)
m.plot(forecast)

*[Figure: forecast plot showing the `yhat1` line together with a black marker series, spanning 2017-05-01 to 2017-07-12.]*

In [6]:
m.plot_parameters()

*[Figure: model parameters, starting with a `Trend` panel (2017-05-01 to 2017-07-05); the remaining output is truncated.]*