# Data Format

<a href="https://colab.research.google.com/github/Nixtla/neuralforecast/blob/main/nbs/examples/Data_Format.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This example shows the data format required by NeuralForecast.

## Long format

### Multiple time series

Store your time series in a pandas dataframe in long format, that is, each row represents an observation for a specific series and timestamp. Let's see an example using the `datasetsforecast` library.

In [None]:
%%capture
!pip install datasetsforecast

In [None]:
from datasetsforecast.m3 import M3

In [None]:
Y_df, *_ = M3.load('./data', group='Other')

In [None]:
Y_df.head()

Unnamed: 0,unique_id,ds,y
0,O1,1970-01-01,3060.42
1,O1,1970-01-02,3021.19
2,O1,1970-01-03,3301.13
3,O1,1970-01-04,3287.03
4,O1,1970-01-05,3080.71


In [None]:
Y_df.tail()

Unnamed: 0,unique_id,ds,y
13320,O99,1970-03-08,27437.0
13321,O99,1970-03-09,27136.0
13322,O99,1970-03-10,26714.0
13323,O99,1970-03-11,26407.0
13324,O99,1970-03-12,26265.0


`Y_df` is a dataframe with three columns: `unique_id` with a unique identifier for each time series, a column `ds` with the datestamp and a column `y` with the values of the series.
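
If your data starts out in wide format (one column per series), it can be reshaped into the long layout described above. The following is a minimal sketch using `pandas.melt`; the column names `series_a` and `series_b` and all values are made up for illustration and do not come from the M3 data.

```python
import pandas as pd

# Hypothetical wide-format data: one row per timestamp, one column per series.
wide_df = pd.DataFrame({
    'ds': pd.date_range('2020-01-01', periods=3, freq='D'),
    'series_a': [1.0, 2.0, 3.0],
    'series_b': [10.0, 20.0, 30.0],
})

# Reshape so that each row holds one observation for one series and timestamp.
long_df = wide_df.melt(id_vars='ds', var_name='unique_id', value_name='y')

# Put the columns in the conventional order and sort by series, then time.
long_df = (
    long_df[['unique_id', 'ds', 'y']]
    .sort_values(['unique_id', 'ds'])
    .reset_index(drop=True)
)
```

The former column names become the `unique_id` values, so each series keeps a distinct identifier after reshaping.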

### Single time series

Even if you have only one time series, you still have to include the `unique_id` column. Consider, for example, the [AirPassengers](https://github.com/Nixtla/transfer-learning-time-series/blob/main/datasets/air_passengers.csv) dataset.

In [None]:
import pandas as pd

Y_df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/air_passengers.csv')

In [None]:
Y_df

Unnamed: 0,timestamp,value
0,1949-01-01,112
1,1949-02-01,118
2,1949-03-01,132
3,1949-04-01,129
4,1949-05-01,121
...,...,...
139,1960-08-01,606
140,1960-09-01,508
141,1960-10-01,461
142,1960-11-01,390


In this example `Y_df` contains only two columns: `timestamp` and `value`. To use `NeuralForecast` we have to add the `unique_id` column and rename the existing ones to `ds` and `y`.

In [None]:
Y_df['unique_id'] = 1. # Any constant works as the identifier; here it is the float 1.0
Y_df = Y_df.rename(columns={'timestamp': 'ds', 'value': 'y'})
Y_df = Y_df[['unique_id', 'ds', 'y']]

In [None]:
Y_df

Unnamed: 0,unique_id,ds,y
0,1.0,1949-01-01,112
1,1.0,1949-02-01,118
2,1.0,1949-03-01,132
3,1.0,1949-04-01,129
4,1.0,1949-05-01,121
...,...,...,...
139,1.0,1960-08-01,606
140,1.0,1960-09-01,508
141,1.0,1960-10-01,461
142,1.0,1960-11-01,390
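
Because `read_csv` loads dates as plain strings unless told otherwise, it may be worth parsing `ds` into a datetime type and running a couple of sanity checks before modeling. The sketch below uses a small stand-in dataframe with the first three rows shown above; the checks are a suggestion, not something the library is known to require in this exact form.

```python
import pandas as pd

# Stand-in for the renamed AirPassengers dataframe (first three rows only).
Y_df = pd.DataFrame({
    'unique_id': [1.0, 1.0, 1.0],
    'ds': ['1949-01-01', '1949-02-01', '1949-03-01'],
    'y': [112, 118, 132],
})

# Parse the date strings into pandas datetime values.
Y_df['ds'] = pd.to_datetime(Y_df['ds'])

# Sanity checks: the three expected columns are present, and no
# (series, timestamp) pair appears more than once.
assert {'unique_id', 'ds', 'y'}.issubset(Y_df.columns)
assert not Y_df.duplicated(subset=['unique_id', 'ds']).any()
```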

