# CH 12 Time Series Analysis from Pandas Cookbook
- pandas includes functionality to manipulate dates, aggregate over different time periods, sample different periods of time, and more

---
## Python vs pandas date tools
1. Python
    - The module is datetime
    - Its object types are date, time, datetime, timedelta, timezone, tzinfo
        - **date** is a calendar day - year, month, and day (eg : June 21, 2022)
        - **time** holds hours, minutes, seconds, and microseconds (one-millionth of a second) and is unattached to any date (eg : 12 hrs and 30 mins)
        - **datetime** combines both **date** and **time**
2. pandas
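    - A single point in time is a Timestamp, which has nanosecond precision (one-billionth of a second) and is derived from NumPy's datetime64 data type
    - An amount of time is a Timedelta
    - Both Timestamp and Timedelta have all the functionality of their datetime module counterparts (datetime and timedelta)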

3. Both
    - Have a timedelta object (timedelta in Python, Timedelta in pandas) that's useful for date arithmetic: +, -, *, /
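
A minimal sketch of the comparison above, assuming only the standard datetime module and pandas. The variable names (`py_dt`, `ts`, `one_day`) are illustrative and not from the book.

```python
import datetime
import pandas as pd

# Python's datetime stops at microseconds
py_dt = datetime.datetime(2022, 6, 21, 12, 30, 0, 123456)

# A pandas Timestamp carries an extra nanosecond field
ts = pd.Timestamp("2022-06-21 12:30:00.123456789")
print(ts.microsecond, ts.nanosecond)  # 123456 789

# Timestamp and Timedelta subclass their datetime counterparts
print(isinstance(ts, datetime.datetime))                     # True
print(isinstance(pd.Timedelta(days=1), datetime.timedelta))  # True

# timedelta arithmetic: +, -, *, /
one_day = datetime.timedelta(days=1)
print((py_dt + one_day) - py_dt == one_day)  # True
print(one_day * 3)                           # 3 days, 0:00:00
print(one_day / 2)                           # 12:00:00
```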
 
---

# How to do it...

---
## <center> 1. Exploring Python's datetime module</center>

- SYNTAX : module.datatype(parameters)

In [3]:
import pandas as pd
import numpy as np
import datetime

In [2]:
date = datetime.date(year=2013, month=6, day=7)
date

datetime.date(2013, 6, 7)

In [3]:
time = datetime.time(hour=12, minute=30, second=19, microsecond=463198)
time

datetime.time(12, 30, 19, 463198)

In [4]:
dt = datetime.datetime(year=2013, month=6, day=7, hour=12, minute=30, second=19, microsecond=463198)
dt

datetime.datetime(2013, 6, 7, 12, 30, 19, 463198)

In [18]:
td = datetime.timedelta(weeks=2, days=5, hours=10, minutes=20, seconds=6.73, milliseconds=99, microseconds=8)
td

datetime.timedelta(days=19, seconds=37206, microseconds=829008)

### How to properly format datetime module objects

In [19]:
print(f"date is {date}")

date is 2013-06-07


In [20]:
print(f"time is {time}")

time is 12:30:19.463198


In [21]:
print(f"dt is {dt}")

dt is 2013-06-07 12:30:19.463198
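
The f-strings above fall back to each object's default string form. When a different layout is needed, one option is an explicit format string. This is a small sketch using the same `dt` value; the chosen layouts are only examples.

```python
import datetime

dt = datetime.datetime(year=2013, month=6, day=7, hour=12, minute=30,
                       second=19, microsecond=463198)

# strftime takes an explicit format string
stamp = dt.strftime("%Y-%m-%d %H:%M")
print(stamp)  # 2013-06-07 12:30

# The same directives work after the colon in an f-string
short = f"{dt:%d/%m/%Y}"
print(short)  # 07/06/2013

# isoformat gives the ISO 8601 form, with "T" between date and time
iso = dt.isoformat()
print(iso)  # 2013-06-07T12:30:19.463198
```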


In [25]:
# timedelta can handle (+) and (-)
print(f'new date is {date + td}')

new date is 2013-06-26


In [26]:
print(f'new datetime is {dt + td}')

new datetime is 2013-06-26 22:50:26.292206


In [27]:
# Adding a timedelta to a time object is NOT possible
time + td

TypeError: unsupported operand type(s) for +: 'datetime.time' and 'datetime.timedelta'
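
One possible workaround, sketched here as an assumption and not something the book prescribes, is to attach the time to a date with `datetime.datetime.combine`, do the arithmetic, then take the time part back out. The date chosen (`anchor_day`) is arbitrary.

```python
import datetime

t = datetime.time(hour=12, minute=30, second=19, microsecond=463198)
shift = datetime.timedelta(hours=10, minutes=20)

# time has no date, so borrow one just for the arithmetic
anchor_day = datetime.date(2013, 6, 7)
combined = datetime.datetime.combine(anchor_day, t)

# Add the timedelta to the datetime, then keep only the time part
shifted = (combined + shift).time()
print(shifted)  # 22:50:19.463198
```

Note that any rollover past midnight is silently dropped when only the time part is kept.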

## <center>2. pandas Timestamp object</center>

- SYNTAX : library.object(parameters)

In [32]:
# How to create my own Timestamp object
pd.Timestamp(year=2012, month=12, day=21, hour=5, minute=10, second=8, microsecond=99)

Timestamp('2012-12-21 05:10:08.000099')

In [33]:
# Another way to create a Timestamp object
pd.Timestamp('2016/1/10')

Timestamp('2016-01-10 00:00:00')

In [38]:
# Another way to create a Timestamp object
pd.Timestamp('2014-5/10')

Timestamp('2014-05-10 00:00:00')

In [39]:
# Another way to create a Timestamp object
pd.Timestamp('Jan 3, 2019 20:45:56')

Timestamp('2019-01-03 20:45:56')

In [44]:
# Another way to create a Timestamp object
# The "T" is the ISO 8601 separator between the date and the time
pd.Timestamp('2016-01-05T05:34:43.123456789')

Timestamp('2016-01-05 05:34:43.123456789')

In [43]:
# Another way to create a Timestamp object
pd.Timestamp('2016-01-05 05:34:43.123456789')

Timestamp('2016-01-05 05:34:43.123456789')
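
A quick check that the "T" form and the space form describe the same instant. The names `with_t` and `with_space` are illustrative.

```python
import pandas as pd

with_t = pd.Timestamp("2016-01-05T05:34:43.123456789")
with_space = pd.Timestamp("2016-01-05 05:34:43.123456789")

# Both strings parse to the same Timestamp
print(with_t == with_space)  # True

# isoformat writes the "T" separator back out
print(with_t.isoformat())    # 2016-01-05T05:34:43.123456789
```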

In [51]:
# Can pass in a single integer or float to Timestamp constructor
# returns a date equivalent to the # of nanoseconds after the Unix epoch (Jan 1, 1970)
pd.Timestamp(500000000000000000)

Timestamp('1985-11-05 00:53:20')

In [52]:
# Another ex when passing in an int
pd.Timestamp(2)

Timestamp('1970-01-01 00:00:00.000000002')

In [54]:
# Another ex when passing in an int
# Use unit to say what the number counts: 'D', 'h', 'm', 's', 'ms', 'us', or 'ns'
pd.Timestamp(5000, unit='D')

Timestamp('1983-09-10 00:00:00')
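
To see what the integer form means, it can be compared with plain epoch arithmetic. This is a sketch; `epoch`, `from_days`, `from_seconds`, and `from_nanos` are made-up names.

```python
import pandas as pd

epoch = pd.Timestamp("1970-01-01")

# An integer with unit='D' counts whole days from the epoch
from_days = pd.Timestamp(5000, unit="D")
print(from_days == epoch + pd.Timedelta(days=5000))  # True

# The same instant given in seconds and in (default) nanoseconds
from_seconds = pd.Timestamp(500_000_000, unit="s")
from_nanos = pd.Timestamp(500_000_000_000_000_000)
print(from_seconds == from_nanos)  # True
```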

### to_datetime pandas function

- Works similarly to the Timestamp constructor
- Comes with different parameters
- WHY/PURPOSE : For converting string columns in DataFrames to dates 
    - Capable of converting entire lists or Series of strings or integers to Timestamp objects
    - Use the Timestamp constructor for single scalar values
    - Timedelta (constructor) and to_timedelta (function) represent an amount of time; a single value $\rightarrow$ single Timedelta object
    - to_datetime and to_timedelta can convert entire lists or Series: to_datetime into Timestamp objects, to_timedelta into Timedelta objects
- SYNTAX : library.function(parameters)
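
The main use case named above, converting a string column, might look like the following sketch. The DataFrame and its `visit` column are hypothetical.

```python
import pandas as pd

# Hypothetical frame with dates stored as strings
df = pd.DataFrame({"visit": ["2015-05-13", "2016-01-10", "2019-01-03"]})

# Convert the whole column in one call
df["visit"] = pd.to_datetime(df["visit"])

# The .dt accessor is now available on the column
years = df["visit"].dt.year.tolist()
print(years)  # [2015, 2016, 2019]

# Each element comes back as a Timestamp
first = df["visit"].iloc[0]
print(type(first).__name__)  # Timestamp
```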

In [55]:
pd.to_datetime('2015-5-13')

Timestamp('2015-05-13 00:00:00')

In [56]:
# Two bool params set the parse order: (1) dayfirst and (2) yearfirst
pd.to_datetime('2015-13-5', dayfirst=True)

Timestamp('2015-05-13 00:00:00')

In [59]:
pd.to_datetime('10/11/12', yearfirst=True)

Timestamp('2010-11-12 00:00:00')

In [62]:
# If both are used, then yearfirst takes precedence
pd.to_datetime('2015-13-5', dayfirst=True, yearfirst=True)

Timestamp('2015-05-13 00:00:00')

In [63]:
# If both are used, then yearfirst takes precedence
pd.to_datetime('10/11/12', dayfirst=True, yearfirst=True)

Timestamp('2010-12-11 00:00:00')
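
The four flag combinations can be compared side by side on one ambiguous string. This sketch calls dateutil directly, on the assumption that pandas follows dateutil's rules for these flags (the pandas documentation describes the combined case as "same as dateutil"). dateutil is installed alongside pandas.

```python
from dateutil import parser  # assumed importable; it ships as a pandas dependency

ambiguous = "10/11/12"

# Default: month/day/year
month_first = parser.parse(ambiguous)
print(month_first)  # 2012-10-11 00:00:00

# dayfirst: day/month/year
day_first = parser.parse(ambiguous, dayfirst=True)
print(day_first)    # 2012-11-10 00:00:00

# yearfirst: year/month/day
year_first = parser.parse(ambiguous, yearfirst=True)
print(year_first)   # 2010-11-12 00:00:00

# Both: the year is taken first, then day before month
both = parser.parse(ambiguous, dayfirst=True, yearfirst=True)
print(both)         # 2010-12-11 00:00:00
```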

In [71]:
pd.to_datetime('Start Date: Sep 30, 2017 Start Time: 130 PM', format='Start Date: %b %d, %Y Start Time: %I%M %p')

Timestamp('2017-09-30 13:30:00')
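
The `format` parameter also applies to a whole Series. As a sketch, it can be paired with `errors="coerce"` so that strings that do not match become NaT and no exception is raised. The sample values are invented.

```python
import pandas as pd

# Invented sample: two valid dates and one bad entry
raw = pd.Series(["Sep 30, 2017", "Oct 01, 2017", "not a date"])

# %b = abbreviated month name, %d = day, %Y = four-digit year
parsed = pd.to_datetime(raw, format="%b %d, %Y", errors="coerce")

print(parsed.isna().tolist())  # [False, False, True]
print(parsed.iloc[0])          # 2017-09-30 00:00:00
```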

In [73]:
# Get x days from origin
pd.to_datetime(1, unit='D', origin='2013-1-1')

Timestamp('2013-01-02 00:00:00')

In [74]:
# Get x days from origin
pd.to_datetime(365, unit='D', origin='2013-1-1')

Timestamp('2014-01-01 00:00:00')

In [5]:
family_gathering_dates = pd.Series([10, 100, 1000, 10000])
family_gathering_dates

0       10
1      100
2     1000
3    10000
dtype: int64

In [8]:
pd.to_datetime(family_gathering_dates, unit='D')

0   1970-01-11
1   1970-04-11
2   1972-09-27
3   1997-05-19
dtype: datetime64[ns]

In [9]:
pd.Timestamp(family_gathering_dates)

TypeError: Cannot convert input [0       10
1      100
2     1000
3    10000
dtype: int64] of type <class 'pandas.core.series.Series'> to Timestamp
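
Since the Timestamp constructor rejects a Series, vectorised conversion goes through to_datetime. The sketch below also passes `origin`, then checks the result against manual arithmetic. The names `offsets`, `since_2013`, and `manual` are illustrative.

```python
import pandas as pd

offsets = pd.Series([10, 100, 1000, 10000])

# Count days from a custom origin, not the Unix epoch
since_2013 = pd.to_datetime(offsets, unit="D", origin="2013-01-01")

# Same thing by hand: origin plus a Series of Timedeltas
manual = pd.Timestamp("2013-01-01") + pd.to_timedelta(offsets, unit="D")

print((since_2013 == manual).all())  # True
print(since_2013.iloc[0])            # 2013-01-11 00:00:00
```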

In [13]:
pd.Timedelta('12 days 5 hours 3 minutes 123456789 nanoseconds')

Timedelta('12 days 05:03:00.123456')

In [14]:
pd.to_timedelta('12 days 5 hours 3 minutes 123456789 nanoseconds')

Timedelta('12 days 05:03:00.123456')
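
For a single string the two calls above behave alike. The difference shows up with lists and Series, which only to_timedelta accepts. A short sketch with invented values:

```python
import pandas as pd

# Strings in a list become a TimedeltaIndex
spans = pd.to_timedelta(["1 days", "2 hours", "30 minutes"])
print(spans[1] == pd.Timedelta(hours=2))  # True

# Numbers need a unit, just like to_datetime
hours = pd.to_timedelta(pd.Series([1, 24, 10000]), unit="h")
print(hours.iloc[2])                      # 416 days 16:00:00
print(hours.dt.total_seconds().tolist())  # [3600.0, 86400.0, 36000000.0]
```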

#### Attributes and methods of Timestamp and Timedelta objects

- [Entire list of attributes and methods for Timestamp](https://pandas.pydata.org/docs/reference/api/pandas.Timestamp.html?highlight=timestam#pandas.Timestamp)

- [Entire list of attributes and methods for Timedelta](https://pandas.pydata.org/docs/reference/api/pandas.Timedelta.html?highlight=timedelta#pandas.Timedelta)

In [16]:
ts = pd.Timestamp('2016-10-1 4:23:23.9')
ts

Timestamp('2016-10-01 04:23:23.900000')

In [19]:
# A date string is not an amount of time, so the Timedelta constructor raises
td = pd.Timedelta('2016-10-1 4:23:23.9')
td

ValueError: only leading negative signs are allowed

In [35]:
td = pd.Timedelta(10000, "h")
td

Timedelta('416 days 16:00:00')

In [37]:
td.components

Components(days=416, hours=16, minutes=0, seconds=0, milliseconds=0, microseconds=0, nanoseconds=0)

In [38]:
ts.year, ts.month, ts.day, ts.hour, ts.minute, ts.second

(2016, 10, 1, 4, 23, 23)
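
A few more attributes and methods from the linked reference pages, sketched on the same `ts` and `td` values. Which ones matter in practice is a judgment call; these are just examples.

```python
import pandas as pd

ts = pd.Timestamp("2016-10-1 4:23:23.9")
td = pd.Timedelta(10000, "h")

# Calendar information derived from the Timestamp
print(ts.day_name(), ts.dayofyear, ts.is_leap_year)  # Saturday 275 True

# A Timedelta expressed as a single number of seconds
print(td.total_seconds())  # 36000000.0

# Timestamp + Timedelta gives a new Timestamp
later = ts + td
print(later)  # 2017-11-21 20:23:23.900000
```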