---
#image: dataloading.png
title: Data plotting
subtitle: Time series with Python
date: '2024-03-01'
categories: [Python, Pandas, time_series, ]
author: Kunal Khurana
jupyter: python3
toc: True
---

# Time Series

## Date and Time Data types and tools
- Converting between String and Datetime

## Time series basics
- Indexing, selection, subsetting
- Time series with duplicate indices

## Data Ranges, Frequencies, and Shifting
- Generating Date Ranges
- Frequencies and Date offsets
- Shifting (Leading and Lagging) Data
- Shifting dates with offsets

## Time Zone Handling
- Time Zone Localization and Conversion
- Operations with Time Zone-Aware Timestamp Objects
- Operations between different time zones

## Periods and Period Arithmetic
- Period frequency conversion
- Quarterly period frequencies
- Converting timestamps to periods (and back)
- Creating a PeriodIndex from Arrays

## Resampling and Frequency Conversion
- Downsampling
- Open-high-low-close (OHLC) resampling
- Upsampling and Interpolation
- Resampling with periods
- Grouped time resampling

## Moving window functions
- Exponentially weighted functions
- Binary moving window functions
- User-defined window functions


In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from datetime import datetime


In [4]:
## Date and Time Data Types and Tools

now = datetime.now()

now

datetime.datetime(2024, 2, 28, 11, 9, 33, 283067)

In [5]:
now.year, now.month, now.day

(2024, 2, 28)

In [7]:
delta = datetime(2011, 1, 7) - datetime(2008, 6, 3, 4, 1, 4)

delta

datetime.timedelta(days=947, seconds=71936)

In [8]:
delta.days

947

In [9]:
delta.seconds

71936
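
Note that `delta.seconds` is only the remainder left over after whole days are removed, not the full duration. The sketch below reuses the same two dates as above and uses the standard-library method `timedelta.total_seconds()` to get the whole span in seconds; the variable names `total` and `rebuilt` are illustrative only.

```python
from datetime import datetime

delta = datetime(2011, 1, 7) - datetime(2008, 6, 3, 4, 1, 4)

# .days and .seconds are separate components of the same duration
print(delta.days, delta.seconds)

# total_seconds() returns the full duration as a float
total = delta.total_seconds()

# The same number can be rebuilt from the two components
rebuilt = delta.days * 86400 + delta.seconds
print(total == rebuilt)
```

Use `total_seconds()` whenever the duration may exceed one day; reading `.seconds` alone silently drops the day component.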

In [11]:
# add or subtract a timedelta to yield a new shifted object

from datetime import timedelta

start = datetime(2024, 2, 28)

start + timedelta(12)


datetime.datetime(2024, 3, 11, 0, 0)

In [12]:
start - 2 * timedelta(12)

datetime.datetime(2024, 2, 4, 0, 0)

In [None]:
help(datetime)

In [15]:
### Converting between String and Datetime

stamp = datetime(2024, 1, 30)

str(stamp)

'2024-01-30 00:00:00'

In [16]:
stamp.strftime("%Y-%m-%d")

'2024-01-30'

In [20]:
value = '2024-2-28'
datetime.strptime(value, '%Y-%m-%d')

datetime.datetime(2024, 2, 28, 0, 0)

In [26]:
datestrs = ["2/23/2024", "2/28/2024"]

[datetime.strptime(x, "%m/%d/%Y") for x in datestrs]

[datetime.datetime(2024, 2, 23, 0, 0), datetime.datetime(2024, 2, 28, 0, 0)]

In [27]:
# using pandas
datestrs = ["2024-2-23 12:00:00", "2024-2-28 00:00:00"]

In [28]:
pd.to_datetime(datestrs)

DatetimeIndex(['2024-02-23 12:00:00', '2024-02-28 00:00:00'], dtype='datetime64[ns]', freq=None)

In [29]:
idx = pd.to_datetime(datestrs + [None])

In [30]:
idx

DatetimeIndex(['2024-02-23 12:00:00', '2024-02-28 00:00:00', 'NaT'], dtype='datetime64[ns]', freq=None)

In [31]:
idx[2]

NaT

In [34]:
pd.isna(idx)

array([False, False,  True])
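
`NaT` also appears when a string cannot be parsed, provided `pd.to_datetime` is told to coerce errors instead of raising. A minimal sketch, assuming a hypothetical input list `raw` with one unparseable entry:

```python
import pandas as pd

# Hypothetical input: one valid date string and one that cannot be parsed
raw = ["2024-02-23", "not a date"]

# errors="coerce" turns unparseable entries into NaT instead of raising
parsed = pd.to_datetime(raw, errors="coerce")
print(parsed)

# NaT behaves like a missing value, so it can be dropped
clean = parsed.dropna()
print(clean)
```

Without `errors="coerce"`, the same call raises an exception on the bad entry, which is often the safer default while exploring unfamiliar data.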

## Time series basics

In [37]:
dates = [datetime(2024,2,23), datetime(2024,2,24),
        datetime(2024,2,25), datetime(2024,2,26),
        datetime(2024,2,27), datetime(2024,2,28)]

In [38]:
ts = pd.Series(np.random.standard_normal(6),
              index = dates)

In [39]:
ts

2024-02-23    1.711187
2024-02-24   -1.511684
2024-02-25   -0.891120
2024-02-26    0.525799
2024-02-27   -1.431700
2024-02-28    0.166120
dtype: float64

In [40]:
ts.index

DatetimeIndex(['2024-02-23', '2024-02-24', '2024-02-25', '2024-02-26',
               '2024-02-27', '2024-02-28'],
              dtype='datetime64[ns]', freq=None)

In [41]:
ts + ts[::2]

2024-02-23    3.422373
2024-02-24         NaN
2024-02-25   -1.782240
2024-02-26         NaN
2024-02-27   -2.863400
2024-02-28         NaN
dtype: float64
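
The `NaN` values above come from index alignment: `ts[::2]` has no entry for every second date, so the sum is undefined there. If missing dates should count as zero instead, `Series.add` with `fill_value` is one option. The sketch below uses deterministic values rather than random ones so the result is easy to check; that substitution is an assumption made for illustration.

```python
import numpy as np
import pandas as pd

# Deterministic stand-in for the random series used above
ts = pd.Series(np.arange(6, dtype=float),
               index=pd.date_range("2024-02-23", periods=6))

# Plain addition aligns on the dates; unmatched dates become NaN
aligned = ts + ts[::2]

# fill_value=0 treats a date missing from one operand as 0
filled = ts.add(ts[::2], fill_value=0)

print(aligned)
print(filled)
```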

In [42]:
ts.index.dtype

dtype('<M8[ns]')

In [43]:
stamp = ts.index[0]

In [44]:
stamp

Timestamp('2024-02-23 00:00:00')

### Indexing, Selection, Subsetting

In [45]:
stamp = ts.index[2]

In [46]:
ts[stamp]

-0.8911199836482235

In [48]:
stamp

Timestamp('2024-02-25 00:00:00')

In [51]:
ts["2024-02-25"]

-0.8911199836482235

In [53]:
longer_ts = pd.Series(np.random.standard_normal(1000),
                     index = pd.date_range("2000-01-01", 
                                           periods=1000))

In [55]:
longer_ts

2000-01-01   -0.082193
2000-01-02    0.326903
2000-01-03   -1.096137
2000-01-04   -0.719159
2000-01-05   -0.267210
                ...   
2002-09-22    0.589006
2002-09-23   -1.248204
2002-09-24    0.499634
2002-09-25   -0.878508
2002-09-26   -0.126318
Freq: D, Length: 1000, dtype: float64

In [57]:
longer_ts["2001"]

2001-01-01    2.014188
2001-01-02   -0.343596
2001-01-03    0.376814
2001-01-04   -1.255311
2001-01-05   -2.419849
                ...   
2001-12-27    0.390457
2001-12-28    0.734758
2001-12-29   -1.082129
2001-12-30   -1.775678
2001-12-31   -1.224774
Freq: D, Length: 365, dtype: float64

In [58]:
longer_ts["2001-05"]

2001-05-01   -1.144894
2001-05-02    2.971409
2001-05-03    0.548497
2001-05-04   -1.250213
2001-05-05    0.200735
2001-05-06   -1.326385
2001-05-07    1.086268
2001-05-08    0.265488
2001-05-09    0.770084
2001-05-10    0.199019
2001-05-11   -1.069022
2001-05-12   -0.039820
2001-05-13    0.017193
2001-05-14   -0.307260
2001-05-15   -0.371033
2001-05-16    0.387411
2001-05-17   -0.640516
2001-05-18    1.024957
2001-05-19    0.880015
2001-05-20    0.692794
2001-05-21   -0.911137
2001-05-22    0.579512
2001-05-23    1.211284
2001-05-24    0.816900
2001-05-25   -1.654473
2001-05-26   -1.113558
2001-05-27    0.877119
2001-05-28   -3.241814
2001-05-29    0.244903
2001-05-30   -0.335317
2001-05-31    2.499217
Freq: D, dtype: float64
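
Partial strings such as `"2001"` or `"2001-05"` select a whole year or month. Two date strings can also be combined into a slice, and both endpoints are included. A minimal sketch, again with deterministic values in place of random ones and with `.loc` used as a stylistic assumption:

```python
import numpy as np
import pandas as pd

longer_ts = pd.Series(np.arange(1000, dtype=float),
                      index=pd.date_range("2000-01-01", periods=1000))

# A partial string selects every row in that month
may_2001 = longer_ts.loc["2001-05"]

# A string slice includes both endpoints
first_week = longer_ts.loc["2001-05-01":"2001-05-07"]

print(len(may_2001), len(first_week))
```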

In [59]:
# slicing with datetime objects works as well

ts[datetime(2011,1,7):]

2024-02-23    1.711187
2024-02-24   -1.511684
2024-02-25   -0.891120
2024-02-26    0.525799
2024-02-27   -1.431700
2024-02-28    0.166120
dtype: float64
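
The slice above returns every row because its lower bound, 2011-01-07, falls before the first date in `ts`. For a sorted index, a slice bound does not need to be present in the index at all. A small sketch with a bound that falls between two existing dates (the values are deterministic stand-ins chosen for illustration):

```python
from datetime import datetime

import numpy as np
import pandas as pd

ts = pd.Series(np.arange(6, dtype=float),
               index=pd.date_range("2024-02-23", periods=6))

# Bound earlier than every date in the index: all rows are returned
everything = ts.loc[datetime(2011, 1, 7):]

# Bound between 2024-02-24 and 2024-02-25: the slice starts at the 25th
from_midday = ts.loc[datetime(2024, 2, 24, 12):]

print(len(everything), from_midday.index[0])
```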