### Pandas time series
A time series is a Series or DataFrame that uses a DatetimeIndex instead of a RangeIndex.

Used for storing events or data that fit on a timeline:
- Weather data.
- Temperature readings.
- Heart rate monitoring (EKG).
- Quarterly sales.
- Stock prices.

In [1]:
import pandas as pd
import numpy as np

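### Creating a DatetimeIndex
Use date_range() with exactly three of its four parameters:
- start: the first timestamp.
- end: the last timestamp.
- periods: the number of timestamps to generate. Combined with start and end, the timestamps are evenly spaced between the two; combined with freq, they are counted forward from start (or backward from end).
- freq: the step size between consecutive timestamps, for example 'D' for one day.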

In [2]:
datetimeindex = pd.date_range(start='2018-01-01', end='2019-12-31', freq='D')
print(datetimeindex)

DatetimeIndex(['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-04',
               '2018-01-05', '2018-01-06', '2018-01-07', '2018-01-08',
               '2018-01-09', '2018-01-10',
               ...
               '2019-12-22', '2019-12-23', '2019-12-24', '2019-12-25',
               '2019-12-26', '2019-12-27', '2019-12-28', '2019-12-29',
               '2019-12-30', '2019-12-31'],
              dtype='datetime64[ns]', length=730, freq='D')
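
The cell above uses start, end, and freq. As a minimal sketch of the other combinations, the following builds the same five days in three ways; the variable names are illustrative only and are not used elsewhere in this notebook.

```python
import pandas as pd

# start + periods + freq: 5 consecutive days beginning on 2018-01-01
by_start = pd.date_range(start="2018-01-01", periods=5, freq="D")

# end + periods + freq: 5 consecutive days ending on 2018-01-05
by_end = pd.date_range(end="2018-01-05", periods=5, freq="D")

# start + end + periods: 5 evenly spaced timestamps, both endpoints included
evenly_spaced = pd.date_range(start="2018-01-01", end="2018-01-05", periods=5)

print(by_start)
print(by_end)
print(evenly_spaced)
```

Passing all four parameters at once raises an error, since any three already determine the fourth.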


In [3]:
df = pd.DataFrame(
    dict(
        n = range(len(datetimeindex)),
        rand = np.random.random(len(datetimeindex))
    ),
    index=datetimeindex
)

df

| | n | rand |
|---|---|---|
| 2018-01-01 | 0 | 0.126925 |
| 2018-01-02 | 1 | 0.093664 |
| 2018-01-03 | 2 | 0.705756 |
| 2018-01-04 | 3 | 0.274367 |
| 2018-01-05 | 4 | 0.751243 |
| ... | ... | ... |
| 2019-12-27 | 725 | 0.671867 |
| 2019-12-28 | 726 | 0.036130 |
| 2019-12-29 | 727 | 0.282953 |
| 2019-12-30 | 728 | 0.859586 |


In [4]:
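df.loc["2018-03-15"]                # a single day (returned as a Series)
df.loc[:"2018-01-15"]               # everything up to and including 2018-01-15
df.loc["2018-03-15":"2018-03-20"]   # a date range, both endpoints included
df.loc["2018-03"]                   # all of March 2018
df.loc["2019"]                      # all of 2019
df.loc["2019-03-29":"2019-05"]      # 2019-03-29 through the end of May 2019 (only this last result is displayed)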

| | n | rand |
|---|---|---|
| 2019-03-29 | 452 | 0.332843 |
| 2019-03-30 | 453 | 0.820084 |
| 2019-03-31 | 454 | 0.491686 |
| 2019-04-01 | 455 | 0.712248 |
| 2019-04-02 | 456 | 0.796318 |
| ... | ... | ... |
| 2019-05-27 | 511 | 0.973269 |
| 2019-05-28 | 512 | 0.378365 |
| 2019-05-29 | 513 | 0.502481 |
| 2019-05-30 | 514 | 0.199627 |
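
The selections above rely on partial string indexing: a date string that is less specific than the index selects every row it covers. A minimal sketch on a smaller, deterministic frame (the name `small` is only for illustration) makes the row counts easy to check:

```python
import pandas as pd

# Daily index covering January through March 2018
idx = pd.date_range(start="2018-01-01", end="2018-03-31", freq="D")
small = pd.DataFrame({"n": range(len(idx))}, index=idx)

one_day = small.loc["2018-01-15"]          # exact daily label: one row, returned as a Series
one_month = small.loc["2018-02"]           # partial string: every row in February 2018
span = small.loc["2018-01-30":"2018-02"]   # slice: both endpoints included, through end of February

print(type(one_day).__name__)  # Series
print(len(one_month))          # 28
print(len(span))               # 30 (two days of January plus 28 days of February)
```

Unlike positional slicing, label slices on a DatetimeIndex include the end label.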


### Resampling 
Resampling is the practice of creating new samples with a lower or higher frequency than the original.

### Downsampling
When the new sample frequency is lower than the original, several original rows fall into each new bin, so we aggregate them (for example with sum, mean, or max).

In [10]:
# hdisplay is a local helper that could not be imported here,
# so the two frames are shown with plain print() instead.
resample_method = "2D"  # two-day bins

# For each two-day bin: keep the largest n and the sum of rand
resampled = df.resample(resample_method).agg({"n": "max", "rand": "sum"})

print("Original")
print(df.head(10))
print(f"Resampled using '{resample_method}'")
print(resampled.head(10))
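
Because the rand column is random, the aggregated numbers change on every run. As a minimal sketch with fixed values (the frame `demo` and its `value` column are made up for illustration), the effect of downsampling to two-day bins can be checked by hand:

```python
import pandas as pd

# Six daily rows with known values
idx = pd.date_range(start="2018-01-01", periods=6, freq="D")
demo = pd.DataFrame(
    {"n": range(6), "value": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]},
    index=idx,
)

# Three two-day bins; each column gets its own aggregation
two_day = demo.resample("2D").agg({"n": "max", "value": "sum"})
print(two_day)
# n becomes 1, 3, 5 and value becomes 3.0, 7.0, 11.0
```

Each bin is labelled with its first day, so the result is indexed by 2018-01-01, 2018-01-03, and 2018-01-05.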

### Upsampling 
When the new sample frequency is higher than the original, the added rows have no values, so we fill the gaps with one of:
- ffill
- bfill
- nearest
- interpolate
- fillna
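
As a minimal sketch of three of these methods, the following upsamples a series recorded every second day to a daily frequency. The series `sparse` and its values are invented for illustration.

```python
import pandas as pd

# One value every second day: 2018-01-01, 2018-01-03, 2018-01-05
idx = pd.date_range(start="2018-01-01", periods=3, freq="2D")
sparse = pd.Series([0.0, 2.0, 4.0], index=idx)

# Upsample to daily; the two new days have no value until a fill method is chosen
daily = sparse.resample("D")

forward = daily.ffill()       # repeat the previous known value
backward = daily.bfill()      # use the next known value
linear = daily.interpolate()  # linear interpolation between known values

print(forward.tolist())   # [0.0, 0.0, 2.0, 2.0, 4.0]
print(backward.tolist())  # [0.0, 2.0, 2.0, 4.0, 4.0]
print(linear.tolist())    # [0.0, 1.0, 2.0, 3.0, 4.0]
```

Forward and backward fill only copy existing values, while interpolation produces new values that were never observed, so the right choice depends on what the data represents.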