## Working with Dates and Times in Datasets

1. Review of Python's `datetime` Module
2. The pandas `Timestamp` Object
3. The pandas `DatetimeIndex` Object
4. The `pd.to_datetime()` Method
5. Create Range of Dates with the `pd.date_range()` Method, Part 1
6. Create Range of Dates with the `pd.date_range()` Method, Part 2
7. Create Range of Dates with the `pd.date_range()` Method, Part 3
8. The `.dt` Accessor
9. Import Financial Data Set with `pandas_datareader` Library
10. Selecting from a `DataFrame` with a `DatetimeIndex` (including `truncate()`)
11. `Timestamp` Object Attributes
12. The `pd.DateOffset` Objects
13. Timeseries Offsets
14. The `Timedelta` Object
15. `Timedelta` Values in a Dataset

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Timestamp.html
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.tseries.offsets.DateOffset.html
https://pandas.pydata.org/pandas-docs/stable/reference/offset_frequency.html
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Timedelta.html

In [1]:
import pandas as pd
import datetime as dt
import warnings
warnings.filterwarnings("ignore", category=FutureWarning)
%config Completer.use_jedi = False

### Review of Python's `datetime` Module

In [2]:
someday = dt.date(2019, 1, 26)

In [3]:
someday.year
someday.month
someday.day

26

In [4]:
str(someday)

'2019-01-26'

In [5]:
str(dt.datetime(2010, 1, 10, 17, 13, 57))

'2010-01-10 17:13:57'

In [6]:
sometime = dt.datetime(2010, 1, 10, 17, 13, 57)

In [7]:
sometime.year
sometime.month
sometime.day
sometime.hour
sometime.minute
sometime.second

57
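A notebook cell displays only the value of its last expression, which is why the cells above show a single number each. The following is a minimal sketch, using only the standard library, that collects every component explicitly; the names `date_parts`, `time_parts`, and `midnight` are illustrative and not part of the original notebook.

```python
import datetime as dt

someday = dt.date(2019, 1, 26)
sometime = dt.datetime(2010, 1, 10, 17, 13, 57)

# Collect each component explicitly instead of relying on cell output.
date_parts = (someday.year, someday.month, someday.day)
time_parts = (sometime.hour, sometime.minute, sometime.second)

print(date_parts)  # (2019, 1, 26)
print(time_parts)  # (17, 13, 57)

# A datetime created without time arguments defaults to midnight.
midnight = dt.datetime(2010, 1, 10)
print(midnight)  # 2010-01-10 00:00:00
```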

### The pandas `Timestamp` Object

In [8]:
pd.Timestamp("2015-03-31")

Timestamp('2015-03-31 00:00:00')

In [9]:
pd.Timestamp("2015/03/31")

Timestamp('2015-03-31 00:00:00')

In [10]:
pd.Timestamp("2013, 11, 04")

Timestamp('2013-11-04 00:00:00')

In [11]:
pd.Timestamp("1/1/2015")

Timestamp('2015-01-01 00:00:00')

In [12]:
pd.Timestamp("19/12/2015")

Timestamp('2015-12-19 00:00:00')

In [13]:
pd.Timestamp("12/19/2015")

Timestamp('2015-12-19 00:00:00')

In [14]:
pd.Timestamp("4/3/2000")

Timestamp('2000-04-03 00:00:00')
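Slash-separated strings are ambiguous: `"19/12/2015"` can only be read day-first, while `"4/3/2000"` is read month-first by default, as the output above shows. One way to avoid guessing is to pass an explicit format string to `pd.to_datetime()`. The sketch below assumes the input is meant to be day/month/year; the variable names are illustrative.

```python
import pandas as pd

ambiguous = "4/3/2000"

# Default parsing treats the first number as the month.
month_first = pd.Timestamp(ambiguous)

# An explicit format string removes the ambiguity.
day_first = pd.to_datetime(ambiguous, format="%d/%m/%Y")

print(month_first)  # 2000-04-03 00:00:00
print(day_first)    # 2000-03-04 00:00:00
```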

In [15]:
pd.Timestamp("2021-03-08 08:35:15")

Timestamp('2021-03-08 08:35:15')

In [16]:
pd.Timestamp("2021-03-08 6:13:29 PM")

Timestamp('2021-03-08 18:13:29')

In [17]:
pd.Timestamp(dt.date(2015, 1, 1))

Timestamp('2015-01-01 00:00:00')

In [18]:
pd.Timestamp(dt.datetime(2000, 2, 3, 21, 35, 22))

Timestamp('2000-02-03 21:35:22')

### The pandas `DatetimeIndex` Object

In [19]:
dates = ["2016/01/02", "2016/04/12", "2009/09/07"]
pd.DatetimeIndex(dates)

DatetimeIndex(['2016-01-02', '2016-04-12', '2009-09-07'], dtype='datetime64[ns]', freq=None)

In [20]:
dates = [dt.date(2016, 1, 10), dt.date(1994, 6, 13), dt.date(2003, 12, 29)]
dtIndex = pd.DatetimeIndex(dates)

In [21]:
values = [100, 200, 300]
pd.Series(data = values, index = dtIndex)

2016-01-10    100
1994-06-13    200
2003-12-29    300
dtype: int64

### The `pd.to_datetime()` Method

In [22]:
pd.to_datetime("2001-04-19")

Timestamp('2001-04-19 00:00:00')

In [23]:
pd.to_datetime(dt.date(2015, 1, 1))

Timestamp('2015-01-01 00:00:00')

In [24]:
pd.to_datetime(dt.datetime(2015, 1, 1, 14, 35, 20))

Timestamp('2015-01-01 14:35:20')

In [25]:
pd.to_datetime(["2015-01-03", "2014/02/08", "2016", "July 4th, 1996"])

DatetimeIndex(['2015-01-03', '2014-02-08', '2016-01-01', '1996-07-04'], dtype='datetime64[ns]', freq=None)

In [26]:
# A Series can hold date strings written in several different formats.

times = pd.Series(["2015-01-03", "2014/02/08", "2016", "July 4th, 1996"])
times

0        2015-01-03
1        2014/02/08
2              2016
3    July 4th, 1996
dtype: object

In [27]:
# pd.to_datetime() converts every element of the Series to datetime64.

pd.to_datetime(times)

0   2015-01-03
1   2014-02-08
2   2016-01-01
3   1996-07-04
dtype: datetime64[ns]

In [28]:
dates = pd.Series(["Hello","July 4th, 1996", "10/04/1991", "2015-02-31"])
dates

0             Hello
1    July 4th, 1996
2        10/04/1991
3        2015-02-31
dtype: object

In [29]:
pd.to_datetime(dates, errors = "coerce")

0          NaT
1   1996-07-04
2   1991-10-04
3          NaT
dtype: datetime64[ns]
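With `errors = "coerce"`, values that cannot be parsed (such as `"Hello"`) or that name an impossible date (such as February 31) become `NaT` instead of raising an exception. A short sketch of how the failures could be counted and traced back to their inputs follows; the sample strings and the explicit `format` are assumptions chosen for illustration.

```python
import pandas as pd

raw = pd.Series(["2015-02-28", "2015-02-31", "not a date"])

# With errors="coerce", unparseable or impossible dates become NaT
# instead of raising an exception.
parsed = pd.to_datetime(raw, format="%Y-%m-%d", errors="coerce")

# Count the failures and recover the original strings that failed.
n_failed = int(parsed.isna().sum())
failed_inputs = raw[parsed.isna()].tolist()

print(n_failed)       # 2
print(failed_inputs)  # ['2015-02-31', 'not a date']
```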

In [30]:
pd.to_datetime([1349720105, 1349806505, 1349892905, 1349979305, 1350065705], unit = "s")

DatetimeIndex(['2012-10-08 18:15:05', '2012-10-09 18:15:05',
               '2012-10-10 18:15:05', '2012-10-11 18:15:05',
               '2012-10-12 18:15:05'],
              dtype='datetime64[ns]', freq=None)
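The integers above are Unix timestamps, and the `unit` argument tells pandas how to scale them. As a small sketch, the same instant can be expressed in seconds or in milliseconds, provided `unit` matches; the variable names here are illustrative.

```python
import pandas as pd

# Unix timestamps count time elapsed since 1970-01-01 00:00:00.
epoch = pd.to_datetime(0, unit="s")
first = pd.to_datetime(1349720105, unit="s")

# The same instant expressed in milliseconds needs unit="ms".
first_ms = pd.to_datetime(1349720105 * 1000, unit="ms")

print(epoch)     # 1970-01-01 00:00:00
print(first)     # 2012-10-08 18:15:05
print(first_ms)  # 2012-10-08 18:15:05
```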

In [31]:
pd.Period("2016-01-08", freq = "10D")

Period('2016-01-08', '10D')

In [32]:
dates = ["2016-01-01", "2016-02-01", "2016-03-01"]
pd.Series([1, 2, 3], index = pd.PeriodIndex(dates, freq = "2M"))

2016-01    1
2016-02    2
2016-03    3
Freq: 2M, dtype: int64

In [33]:
pd.Period("2016-01-08", freq = "W")

Period('2016-01-04/2016-01-10', 'W-SUN')

In [34]:
pd.Period("2016-01-08", freq = "W-SUN")
pd.Period("2016-01-08", freq = "W-WED")
pd.Period("2015-12-10", freq = "10D")

Period('2015-12-10', '10D')

In [35]:
dates = ["2016-01-01", "2016-02-01", "2016-02-01"]
pd.PeriodIndex(dates, freq = "W-MON")
weeks = pd.PeriodIndex(dates, freq = "W-MON")

pd.Series([999, 500, 325], index = weeks, name = "Weekly Revenue")

2015-12-29/2016-01-04    999
2016-01-26/2016-02-01    500
2016-01-26/2016-02-01    325
Freq: W-MON, Name: Weekly Revenue, dtype: int64

### Create Range of Dates with the `pd.date_range()` Method, Part 1

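Common values for the `freq` parameter:

- `D`: every calendar day
- `2D`: every second day
- `B`: business days (Monday to Friday)
- `W`: weekly, anchored on Sunday by default
- `W-FRI`, `W-SUN`: weekly, anchored on the named day
- `M`: last day of each month
- `MS`: first day of each month
- `H`: every hour
- `6H`: every six hours
- `A`: last day of each year

pandas 2.2 and later deprecate the aliases `M`, `H`, and `A` in favor of `ME`, `h`, and `YE`.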
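To compare a few of these frequencies side by side, the sketch below counts how many timestamps each one yields over the same two-week window. It sticks to aliases that behave the same across pandas versions; the window and the `counts` dictionary are illustrative choices.

```python
import pandas as pd

start, end = "2016-01-01", "2016-01-15"

# Count how many timestamps each frequency produces over the same window.
counts = {
    freq: len(pd.date_range(start=start, end=end, freq=freq))
    for freq in ["D", "2D", "B", "W-FRI", "MS"]
}

print(counts)  # {'D': 15, '2D': 8, 'B': 11, 'W-FRI': 3, 'MS': 1}
```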

In [36]:
times = pd.date_range(start = "2016-01-01", end = "2016-01-10", freq = "D") # D is a day

In [37]:
times

DatetimeIndex(['2016-01-01', '2016-01-02', '2016-01-03', '2016-01-04',
               '2016-01-05', '2016-01-06', '2016-01-07', '2016-01-08',
               '2016-01-09', '2016-01-10'],
              dtype='datetime64[ns]', freq='D')

In [38]:
type(times)

pandas.core.indexes.datetimes.DatetimeIndex

In [39]:
type(times[0])

pandas._libs.tslibs.timestamps.Timestamp

In [40]:
pd.date_range(start = "2016-01-01", end = "2016-01-10", freq = "2D") # 2D is two days

DatetimeIndex(['2016-01-01', '2016-01-03', '2016-01-05', '2016-01-07',
               '2016-01-09'],
              dtype='datetime64[ns]', freq='2D')

In [41]:
pd.date_range(start = "2016-01-01", end = "2016-01-10", freq = "B") # B is business days

DatetimeIndex(['2016-01-01', '2016-01-04', '2016-01-05', '2016-01-06',
               '2016-01-07', '2016-01-08'],
              dtype='datetime64[ns]', freq='B')

In [42]:
pd.date_range(start = "2016-01-01", end = "2016-01-15", freq = "W") # W is weekly, anchored on Sunday by default
pd.date_range(start = "2016-01-01", end = "2016-01-15", freq = "W-FRI") # W-FRI is weekly, anchored on Friday

DatetimeIndex(['2016-01-01', '2016-01-08', '2016-01-15'], dtype='datetime64[ns]', freq='W-FRI')

In [43]:
pd.date_range(start = "2016-01-01", end = "2050-01-01", freq = "A")

DatetimeIndex(['2016-12-31', '2017-12-31', '2018-12-31', '2019-12-31',
               '2020-12-31', '2021-12-31', '2022-12-31', '2023-12-31',
               '2024-12-31', '2025-12-31', '2026-12-31', '2027-12-31',
               '2028-12-31', '2029-12-31', '2030-12-31', '2031-12-31',
               '2032-12-31', '2033-12-31', '2034-12-31', '2035-12-31',
               '2036-12-31', '2037-12-31', '2038-12-31', '2039-12-31',
               '2040-12-31', '2041-12-31', '2042-12-31', '2043-12-31',
               '2044-12-31', '2045-12-31', '2046-12-31', '2047-12-31',
               '2048-12-31', '2049-12-31'],
              dtype='datetime64[ns]', freq='A-DEC')

### Create Range of Dates with the `pd.date_range()` Method, Part 2

In [44]:
pd.date_range(start = "2012-09-09", periods = 15, freq = "W")

DatetimeIndex(['2012-09-09', '2012-09-16', '2012-09-23', '2012-09-30',
               '2012-10-07', '2012-10-14', '2012-10-21', '2012-10-28',
               '2012-11-04', '2012-11-11', '2012-11-18', '2012-11-25',
               '2012-12-02', '2012-12-09', '2012-12-16'],
              dtype='datetime64[ns]', freq='W-SUN')

In [45]:
pd.date_range(start = "2012-09-09", periods = 15, freq = "12H")

DatetimeIndex(['2012-09-09 00:00:00', '2012-09-09 12:00:00',
               '2012-09-10 00:00:00', '2012-09-10 12:00:00',
               '2012-09-11 00:00:00', '2012-09-11 12:00:00',
               '2012-09-12 00:00:00', '2012-09-12 12:00:00',
               '2012-09-13 00:00:00', '2012-09-13 12:00:00',
               '2012-09-14 00:00:00', '2012-09-14 12:00:00',
               '2012-09-15 00:00:00', '2012-09-15 12:00:00',
               '2012-09-16 00:00:00'],
              dtype='datetime64[ns]', freq='12H')

### Create Range of Dates with the `pd.date_range()` Method, Part 3

In [46]:
pd.date_range(end = "1999-12-31", periods = 15, freq = "7H")

DatetimeIndex(['1999-12-26 22:00:00', '1999-12-27 05:00:00',
               '1999-12-27 12:00:00', '1999-12-27 19:00:00',
               '1999-12-28 02:00:00', '1999-12-28 09:00:00',
               '1999-12-28 16:00:00', '1999-12-28 23:00:00',
               '1999-12-29 06:00:00', '1999-12-29 13:00:00',
               '1999-12-29 20:00:00', '1999-12-30 03:00:00',
               '1999-12-30 10:00:00', '1999-12-30 17:00:00',
               '1999-12-31 00:00:00'],
              dtype='datetime64[ns]', freq='7H')

In [47]:
pd.date_range(end = "1999-12-31", periods = 15, freq = "2W")

DatetimeIndex(['1999-06-13', '1999-06-27', '1999-07-11', '1999-07-25',
               '1999-08-08', '1999-08-22', '1999-09-05', '1999-09-19',
               '1999-10-03', '1999-10-17', '1999-10-31', '1999-11-14',
               '1999-11-28', '1999-12-12', '1999-12-26'],
              dtype='datetime64[ns]', freq='2W-SUN')

In [48]:
pd.date_range(end = "1999-12-31", periods = 15, freq = "MS")

DatetimeIndex(['1998-10-01', '1998-11-01', '1998-12-01', '1999-01-01',
               '1999-02-01', '1999-03-01', '1999-04-01', '1999-05-01',
               '1999-06-01', '1999-07-01', '1999-08-01', '1999-09-01',
               '1999-10-01', '1999-11-01', '1999-12-01'],
              dtype='datetime64[ns]', freq='MS')
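The three parts above differ only in which two of `start`, `end`, and `periods` are supplied alongside `freq`. A minimal sketch showing that the three combinations can describe the same range:

```python
import pandas as pd

# Three equivalent ways to describe the same ten daily timestamps.
by_bounds = pd.date_range(start="2016-01-01", end="2016-01-10", freq="D")
from_start = pd.date_range(start="2016-01-01", periods=10, freq="D")
from_end = pd.date_range(end="2016-01-10", periods=10, freq="D")

print(by_bounds.equals(from_start))  # True
print(by_bounds.equals(from_end))    # True
```

When `end` does not fall on the frequency's anchor, as with `freq = "2W"` and `end = "1999-12-31"` above, the range stops at the last anchor on or before `end`.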

### The `.dt` Accessor

In [49]:
bunch_of_dates = pd.date_range(start = "2000-01-01", end = "2010-12-31", freq = "24D")

In [50]:
s = pd.Series(bunch_of_dates)
s.head()

0   2000-01-01
1   2000-01-25
2   2000-02-18
3   2000-03-13
4   2000-04-06
dtype: datetime64[ns]

In [51]:
s.dt.day.head()

0     1
1    25
2    18
3    13
4     6
dtype: int64

In [52]:
# Note: s.dt.week was removed in pandas 2.0; s.dt.isocalendar().week is the replacement.
s.dt.week.head()

0    52
1     4
2     7
3    11
4    14
dtype: int64

In [53]:
s.dt.day_name().head()

0    Saturday
1     Tuesday
2      Friday
3      Monday
4    Thursday
dtype: object

In [54]:
a = s.dt.is_quarter_start
s[a]

0     2000-01-01
19    2001-04-01
38    2002-07-01
137   2009-01-01
dtype: datetime64[ns]

In [55]:
mask = s.dt.is_month_end
s[mask]

5     2000-04-30
57    2003-09-30
71    2004-08-31
90    2005-11-30
123   2008-01-31
161   2010-07-31
dtype: datetime64[ns]
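For week numbers, `Series.dt.isocalendar()` is available from pandas 1.1 onward and returns the ISO year, week, and weekday together. The sketch below uses the first three dates from the series above; note that 2000-01-01 belongs to ISO week 52 of 1999.

```python
import pandas as pd

s = pd.Series(pd.to_datetime(["2000-01-01", "2000-01-25", "2000-02-18"]))

# isocalendar() returns a DataFrame with year, week and day columns.
iso = s.dt.isocalendar()
weeks = iso["week"].tolist()
iso_years = iso["year"].tolist()

print(weeks)      # [52, 4, 7]
print(iso_years)  # [1999, 2000, 2000]
```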

### Import Financial Data Set with `pandas_datareader` Library

In [56]:
import pandas as pd
import datetime as dt
from pandas_datareader import data

In [57]:
company = "MSFT"
start = "2010-01-01"
end = "2020-12-31"

stocks = data.DataReader(name = company, data_source = "yahoo", start = start, end = end)
stocks.head(3)

Date,High,Low,Open,Close,Volume,Adj Close
2010-01-04,31.1,30.59,30.620001,30.950001,38409100.0,24.10536
2010-01-05,31.1,30.639999,30.85,30.959999,49749600.0,24.113148
2010-01-06,31.08,30.52,30.879999,30.77,58182400.0,23.965164


In [58]:
stocks.values

array([[3.11000004e+01, 3.05900002e+01, 3.06200008e+01, 3.09500008e+01,
        3.84091000e+07, 2.41053600e+01],
       [3.11000004e+01, 3.06399994e+01, 3.08500004e+01, 3.09599991e+01,
        4.97496000e+07, 2.41131477e+01],
       [3.10799999e+01, 3.05200005e+01, 3.08799992e+01, 3.07700005e+01,
        5.81824000e+07, 2.39651642e+01],
       ...,
       [2.27179993e+02, 2.23580002e+02, 2.26309998e+02, 2.24149994e+02,
        1.74032000e+07, 2.24149994e+02],
       [2.25630005e+02, 2.21470001e+02, 2.25229996e+02, 2.21679993e+02,
        2.02723000e+07, 2.21679993e+02],
       [2.23000000e+02, 2.19679993e+02, 2.21699997e+02, 2.22419998e+02,
        2.09269000e+07, 2.22419998e+02]])

In [59]:
stocks.columns

Index(['High', 'Low', 'Open', 'Close', 'Volume', 'Adj Close'], dtype='object')

In [60]:
stocks.index[0]

Timestamp('2010-01-04 00:00:00')

In [61]:
stocks.axes

[DatetimeIndex(['2010-01-04', '2010-01-05', '2010-01-06', '2010-01-07',
                '2010-01-08', '2010-01-11', '2010-01-12', '2010-01-13',
                '2010-01-14', '2010-01-15',
                ...
                '2020-12-17', '2020-12-18', '2020-12-21', '2020-12-22',
                '2020-12-23', '2020-12-24', '2020-12-28', '2020-12-29',
                '2020-12-30', '2020-12-31'],
               dtype='datetime64[ns]', name='Date', length=2769, freq=None),
 Index(['High', 'Low', 'Open', 'Close', 'Volume', 'Adj Close'], dtype='object')]

### Selecting from a `DataFrame` with a `DatetimeIndex`

In [62]:
stocks = data.DataReader(name = "MSFT", data_source = "yahoo", start = "2010-01-01", end = "2020-12-31")
stocks.head(3)

Date,High,Low,Open,Close,Volume,Adj Close
2010-01-04,31.1,30.59,30.620001,30.950001,38409100.0,24.10536
2010-01-05,31.1,30.639999,30.85,30.959999,49749600.0,24.113148
2010-01-06,31.08,30.52,30.879999,30.77,58182400.0,23.965164


In [63]:
stocks.loc["2010-01-04"]
# or 
stocks.loc[pd.Timestamp("2010-01-04")]

High         3.110000e+01
Low          3.059000e+01
Open         3.062000e+01
Close        3.095000e+01
Volume       3.840910e+07
Adj Close    2.410536e+01
Name: 2010-01-04 00:00:00, dtype: float64

In [64]:
stocks.iloc[3] # select a row by integer position

High         3.070000e+01
Low          3.019000e+01
Open         3.063000e+01
Close        3.045000e+01
Volume       5.055970e+07
Adj Close    2.371593e+01
Name: 2010-01-07 00:00:00, dtype: float64

In [65]:
stocks.loc["2013-10-01" : "2013-10-07"]

Date,High,Low,Open,Close,Volume,Adj Close
2013-10-01,33.610001,33.299999,33.349998,33.580002,36718700.0,28.817102
2013-10-02,34.029999,33.290001,33.360001,33.919998,46946800.0,29.108875
2013-10-03,34.0,33.419998,33.880001,33.860001,38703800.0,29.057384
2013-10-04,33.990002,33.619999,33.689999,33.880001,33008100.0,29.074543
2013-10-07,33.709999,33.200001,33.599998,33.299999,35069300.0,28.576813


In [66]:
# Alternatively, slice with Timestamp objects:
stocks.loc[pd.Timestamp("2013-10-01") : pd.Timestamp("2013-10-07")]

Date,High,Low,Open,Close,Volume,Adj Close
2013-10-01,33.610001,33.299999,33.349998,33.580002,36718700.0,28.817102
2013-10-02,34.029999,33.290001,33.360001,33.919998,46946800.0,29.108875
2013-10-03,34.0,33.419998,33.880001,33.860001,38703800.0,29.057384
2013-10-04,33.990002,33.619999,33.689999,33.880001,33008100.0,29.074543
2013-10-07,33.709999,33.200001,33.599998,33.299999,35069300.0,28.576813


In [67]:
# Alternatively, use truncate() to drop rows outside the range:
stocks.truncate(before = "2013-10-01", after = "2013-10-07")

Date,High,Low,Open,Close,Volume,Adj Close
2013-10-01,33.610001,33.299999,33.349998,33.580002,36718700.0,28.817102
2013-10-02,34.029999,33.290001,33.360001,33.919998,46946800.0,29.108875
2013-10-03,34.0,33.419998,33.880001,33.860001,38703800.0,29.057384
2013-10-04,33.990002,33.619999,33.689999,33.880001,33008100.0,29.074543
2013-10-07,33.709999,33.200001,33.599998,33.299999,35069300.0,28.576813


In [68]:
stocks.iloc[ : 2]

Date,High,Low,Open,Close,Volume,Adj Close
2010-01-04,31.1,30.59,30.620001,30.950001,38409100.0,24.10536
2010-01-05,31.1,30.639999,30.85,30.959999,49749600.0,24.113148


In [69]:
stocks.iloc[[2,16,22,85]]

Date,High,Low,Open,Close,Volume,Adj Close
2010-01-06,31.08,30.52,30.879999,30.77,58182400.0,23.965164
2010-01-27,29.82,29.02,29.35,29.67,63949500.0,23.108438
2010-02-04,28.5,27.809999,28.379999,27.84,77850000.0,21.683146
2010-05-06,29.879999,27.91,29.59,28.98,128613000.0,22.676584


In [70]:
birthdays = pd.date_range(start = "2010-06-29", end = "2019-12-31", freq = pd.DateOffset(years = 1))

In [71]:
birthdays_stocks = stocks.index.isin(birthdays)

In [72]:
stocks[birthdays_stocks]
# or
stocks.loc[birthdays_stocks]

Date,High,Low,Open,Close,Volume,Adj Close
2010-06-29,24.200001,23.110001,24.129999,23.309999,119882100.0,18.322161
2011-06-29,25.709999,25.360001,25.709999,25.620001,66051000.0,20.624397
2012-06-29,30.690001,30.139999,30.450001,30.59,55227200.0,25.296597
2015-06-29,45.23,44.360001,45.040001,44.369999,34081700.0,39.954479
2016-06-29,50.720001,49.799999,49.91,50.540001,31304000.0,46.775375
2017-06-29,69.489998,68.089996,69.379997,68.489998,28918700.0,64.973892
2018-06-29,99.910004,98.330002,98.93,98.610001,28053200.0,95.37394


### `Timestamp` Object Attributes

In [73]:
stocks = data.DataReader(name = "MSFT", data_source = "yahoo", start = "2010-01-01", end = "2020-12-31")
stocks.head(3)

Date,High,Low,Open,Close,Volume,Adj Close
2010-01-04,31.1,30.59,30.620001,30.950001,38409100.0,24.10536
2010-01-05,31.1,30.639999,30.85,30.959999,49749600.0,24.113148
2010-01-06,31.08,30.52,30.879999,30.77,58182400.0,23.965164


In [74]:
someday = stocks.index[500]
someday

Timestamp('2011-12-27 00:00:00')

In [75]:
someday.day
someday.month
someday.year
someday.is_month_end
someday.is_month_start

False

In [76]:
someday.month_name()
someday.day_name()

'Tuesday'

In [77]:
stocks.index.day_name()

Index(['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Monday',
       'Tuesday', 'Wednesday', 'Thursday', 'Friday',
       ...
       'Thursday', 'Friday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday',
       'Monday', 'Tuesday', 'Wednesday', 'Thursday'],
      dtype='object', name='Date', length=2769)

In [78]:
# We can add day names as a new column:

stocks.insert(0, "Day of Week",stocks.index.day_name())

In [79]:
stocks.head()

Date,Day of Week,High,Low,Open,Close,Volume,Adj Close
2010-01-04,Monday,31.1,30.59,30.620001,30.950001,38409100.0,24.10536
2010-01-05,Tuesday,31.1,30.639999,30.85,30.959999,49749600.0,24.113148
2010-01-06,Wednesday,31.08,30.52,30.879999,30.77,58182400.0,23.965164
2010-01-07,Thursday,30.700001,30.190001,30.629999,30.450001,50559700.0,23.715933
2010-01-08,Friday,30.879999,30.24,30.280001,30.66,51197400.0,23.879499


In [80]:
stocks.insert(1,"Is Start of Month", stocks.index.is_month_start)
stocks.head()

Date,Day of Week,Is Start of Month,High,Low,Open,Close,Volume,Adj Close
2010-01-04,Monday,False,31.1,30.59,30.620001,30.950001,38409100.0,24.10536
2010-01-05,Tuesday,False,31.1,30.639999,30.85,30.959999,49749600.0,24.113148
2010-01-06,Wednesday,False,31.08,30.52,30.879999,30.77,58182400.0,23.965164
2010-01-07,Thursday,False,30.700001,30.190001,30.629999,30.450001,50559700.0,23.715933
2010-01-08,Friday,False,30.879999,30.24,30.280001,30.66,51197400.0,23.879499


In [82]:
stocks[stocks["Is Start of Month"]].head()

Date,Day of Week,Is Start of Month,High,Low,Open,Close,Volume,Adj Close
2010-02-01,Monday,True,28.48,27.92,28.389999,28.41,85931100.0,22.127089
2010-03-01,Monday,True,29.049999,28.530001,28.77,29.02,43805400.0,22.707878
2010-04-01,Thursday,True,29.540001,28.620001,29.35,29.16,74768100.0,22.817423
2010-06-01,Tuesday,True,26.309999,25.52,25.530001,25.889999,76152400.0,20.350094
2010-07-01,Thursday,True,23.32,22.73,23.09,23.16,92239400.0,18.204264


### The `pd.DateOffset` Objects

In [83]:
stocks = data.DataReader(name = "MSFT", data_source = "yahoo", start = "2010-01-01", end = "2020-12-31")
stocks.head(3)

Date,High,Low,Open,Close,Volume,Adj Close
2010-01-04,31.1,30.59,30.620001,30.950001,38409100.0,24.10536
2010-01-05,31.1,30.639999,30.85,30.959999,49749600.0,24.113148
2010-01-06,31.08,30.52,30.879999,30.77,58182400.0,23.965164


In [85]:
stocks.index + pd.DateOffset(days = 5)

DatetimeIndex(['2010-01-09', '2010-01-10', '2010-01-11', '2010-01-12',
               '2010-01-13', '2010-01-16', '2010-01-17', '2010-01-18',
               '2010-01-19', '2010-01-20',
               ...
               '2020-12-22', '2020-12-23', '2020-12-26', '2020-12-27',
               '2020-12-28', '2020-12-29', '2021-01-02', '2021-01-03',
               '2021-01-04', '2021-01-05'],
              dtype='datetime64[ns]', name='Date', length=2769, freq=None)

In [86]:
stocks.index - pd.DateOffset(days = 5)

DatetimeIndex(['2009-12-30', '2009-12-31', '2010-01-01', '2010-01-02',
               '2010-01-03', '2010-01-06', '2010-01-07', '2010-01-08',
               '2010-01-09', '2010-01-10',
               ...
               '2020-12-12', '2020-12-13', '2020-12-16', '2020-12-17',
               '2020-12-18', '2020-12-19', '2020-12-23', '2020-12-24',
               '2020-12-25', '2020-12-26'],
              dtype='datetime64[ns]', name='Date', length=2769, freq=None)

In [87]:
stocks.index = stocks.index - pd.DateOffset(days = 5)

In [94]:
stocks.index + pd.DateOffset(months = 8, years = 5, days = 12, hours = 3, minutes = 42)

DatetimeIndex(['2015-09-11 03:42:00', '2015-09-12 03:42:00',
               '2015-09-13 03:42:00', '2015-09-14 03:42:00',
               '2015-09-15 03:42:00', '2015-09-18 03:42:00',
               '2015-09-19 03:42:00', '2015-09-20 03:42:00',
               '2015-09-21 03:42:00', '2015-09-22 03:42:00',
               ...
               '2026-08-24 03:42:00', '2026-08-25 03:42:00',
               '2026-08-28 03:42:00', '2026-08-29 03:42:00',
               '2026-08-30 03:42:00', '2026-08-31 03:42:00',
               '2026-09-04 03:42:00', '2026-09-05 03:42:00',
               '2026-09-06 03:42:00', '2026-09-07 03:42:00'],
              dtype='datetime64[ns]', name='Date', length=2769, freq=None)

### Timeseries Offsets

In [95]:
stocks = data.DataReader(name = "MSFT", data_source = "yahoo", start = "2010-01-01", end = "2020-12-31")
stocks.head(3)

Date,High,Low,Open,Close,Volume,Adj Close
2010-01-04,31.1,30.59,30.620001,30.950001,38409100.0,24.10536
2010-01-05,31.1,30.639999,30.85,30.959999,49749600.0,24.113148
2010-01-06,31.08,30.52,30.879999,30.77,58182400.0,23.965164


In [101]:
stocks.index + pd.tseries.offsets.MonthEnd()
stocks.index - pd.tseries.offsets.MonthEnd()

stocks.index + pd.tseries.offsets.MonthBegin()
stocks.index - pd.tseries.offsets.MonthBegin()

DatetimeIndex(['2010-01-01', '2010-01-01', '2010-01-01', '2010-01-01',
               '2010-01-01', '2010-01-01', '2010-01-01', '2010-01-01',
               '2010-01-01', '2010-01-01',
               ...
               '2020-12-01', '2020-12-01', '2020-12-01', '2020-12-01',
               '2020-12-01', '2020-12-01', '2020-12-01', '2020-12-01',
               '2020-12-01', '2020-12-01'],
              dtype='datetime64[ns]', name='Date', length=2769, freq=None)

In [102]:
from pandas.tseries import offsets

In [104]:
stocks.index + offsets.MonthEnd()

DatetimeIndex(['2010-01-31', '2010-01-31', '2010-01-31', '2010-01-31',
               '2010-01-31', '2010-01-31', '2010-01-31', '2010-01-31',
               '2010-01-31', '2010-01-31',
               ...
               '2020-12-31', '2020-12-31', '2020-12-31', '2020-12-31',
               '2020-12-31', '2020-12-31', '2020-12-31', '2020-12-31',
               '2020-12-31', '2021-01-31'],
              dtype='datetime64[ns]', name='Date', length=2769, freq=None)

In [98]:
stocks.tail(3)

Date,High,Low,Open,Close,Volume,Adj Close
2020-12-29,227.179993,223.580002,226.309998,224.149994,17403200.0,224.149994
2020-12-30,225.630005,221.470001,225.229996,221.679993,20272300.0,221.679993
2020-12-31,223.0,219.679993,221.699997,222.419998,20926900.0,222.419998


In [105]:
stocks.index + offsets.BMonthEnd()

DatetimeIndex(['2010-01-29', '2010-01-29', '2010-01-29', '2010-01-29',
               '2010-01-29', '2010-01-29', '2010-01-29', '2010-01-29',
               '2010-01-29', '2010-01-29',
               ...
               '2020-12-31', '2020-12-31', '2020-12-31', '2020-12-31',
               '2020-12-31', '2020-12-31', '2020-12-31', '2020-12-31',
               '2020-12-31', '2021-01-29'],
              dtype='datetime64[ns]', name='Date', length=2769, freq=None)
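The last entry in the outputs above jumps to January 2021 because adding `MonthEnd()` always moves forward, even from a date that is already a month end. The sketch below contrasts that with `MonthEnd(0)` and `rollforward()`, which leave such a date in place; the variable names are illustrative.

```python
import pandas as pd
from pandas.tseries import offsets

mid_month = pd.Timestamp("2020-12-30")
month_end = pd.Timestamp("2020-12-31")

# MonthEnd() always moves forward to the next month end.
next_from_mid = mid_month + offsets.MonthEnd()
next_from_end = month_end + offsets.MonthEnd()

# MonthEnd(0) leaves a date that is already a month end unchanged.
stay_put = month_end + offsets.MonthEnd(0)

# rollforward() also keeps a date that is already on the anchor.
rolled = offsets.MonthEnd().rollforward(month_end)

print(next_from_mid)  # 2020-12-31 00:00:00
print(next_from_end)  # 2021-01-31 00:00:00
print(stay_put)       # 2020-12-31 00:00:00
print(rolled)         # 2020-12-31 00:00:00
```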

### The `Timedelta` Object

In [106]:
timeA = pd.Timestamp("2016-03-31 04:35:16 PM")
timeB = pd.Timestamp("2016-03-20 02:16:49 AM")

In [112]:
timeA - timeB

Timedelta('11 days 14:18:27')

In [107]:
timeB - timeA

Timedelta('-12 days +09:41:33')

In [108]:
type(timeA - timeB)

pandas._libs.tslibs.timedeltas.Timedelta

In [109]:
type(timeA)

pandas._libs.tslibs.timestamps.Timestamp

In [113]:
pd.Timedelta(days = 3)

Timedelta('3 days 00:00:00')

In [114]:
timeA + pd.Timedelta(days = 3)

Timestamp('2016-04-03 16:35:16')

In [110]:
pd.Timedelta(weeks = 8, days = 3, hours = 12, minutes = 45)

Timedelta('59 days 12:45:00')

In [116]:
timeB - pd.Timedelta(weeks = 8, days = 3, hours = 12, minutes = 45)

Timestamp('2016-01-20 13:31:49')

In [111]:
pd.Timedelta("14 days 6 hours 12 minutes 49 seconds")

Timedelta('14 days 06:12:49')
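A `Timedelta` can also be broken down or converted to plain numbers. The sketch below reuses `timeA` and `timeB` from the cells above; `delta` and `hours` are illustrative names.

```python
import pandas as pd

timeA = pd.Timestamp("2016-03-31 04:35:16 PM")
timeB = pd.Timestamp("2016-03-20 02:16:49 AM")
delta = timeA - timeB

# .days is the whole-day part; total_seconds() covers the full duration.
print(delta.days)             # 11
print(delta.total_seconds())  # 1001907.0

# A negative Timedelta is stored as negative days plus a positive remainder.
print((timeB - timeA).days)   # -12

# Dividing by another Timedelta gives a plain float.
hours = delta / pd.Timedelta(hours=1)
```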

### `Timedelta` Values in a Dataset

In [120]:
shipping = pd.read_csv("files/ecommerce.csv", index_col = "ID", parse_dates = ["order_date","delivery_date"])
shipping.head(3)

ID,order_date,delivery_date
1,1998-05-24,1999-02-05
2,1992-04-22,1998-03-06
4,1991-02-10,1992-08-26


In [121]:
shipping["Delivery Time"] = shipping["delivery_date"] - shipping["order_date"]

In [122]:
shipping.head()

ID,order_date,delivery_date,Delivery Time
1,1998-05-24,1999-02-05,257 days
2,1992-04-22,1998-03-06,2144 days
4,1991-02-10,1992-08-26,563 days
5,1992-07-21,1997-11-20,1948 days
7,1993-09-02,1998-06-10,1742 days


In [123]:
shipping["Twice As Long"] = shipping["delivery_date"] + shipping["Delivery Time"]

In [124]:
shipping.head()

ID,order_date,delivery_date,Delivery Time,Twice As Long
1,1998-05-24,1999-02-05,257 days,1999-10-20
2,1992-04-22,1998-03-06,2144 days,2004-01-18
4,1991-02-10,1992-08-26,563 days,1994-03-12
5,1992-07-21,1997-11-20,1948 days,2003-03-22
7,1993-09-02,1998-06-10,1742 days,2003-03-18


In [125]:
shipping.dtypes

order_date        datetime64[ns]
delivery_date     datetime64[ns]
Delivery Time    timedelta64[ns]
Twice As Long     datetime64[ns]
dtype: object

In [129]:
mask = shipping["Delivery Time"] > "3423 days"
shipping[mask]

ID,order_date,delivery_date,Delivery Time,Twice As Long
314,1990-03-07,1999-12-25,3580 days,2009-10-13
884,1990-01-20,1999-11-12,3583 days,2009-09-03
904,1990-02-13,1999-11-15,3562 days,2009-08-16


In [130]:
shipping["Delivery Time"].max()

Timedelta('3583 days 00:00:00')

In [131]:
shipping["Delivery Time"].min()

Timedelta('8 days 00:00:00')
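The same column arithmetic can be tried without `files/ecommerce.csv`. The sketch below rebuilds the first three rows shown in the output above as an in-memory frame, then converts the durations to integers with `.dt.days` and filters against a `pd.Timedelta` object rather than a string; the 500-day threshold is an arbitrary illustration.

```python
import pandas as pd

# Three rows copied from the output above; the real file is not required.
shipping = pd.DataFrame(
    {
        "order_date": pd.to_datetime(["1998-05-24", "1992-04-22", "1991-02-10"]),
        "delivery_date": pd.to_datetime(["1999-02-05", "1998-03-06", "1992-08-26"]),
    },
    index=pd.Index([1, 2, 4], name="ID"),
)
shipping["Delivery Time"] = shipping["delivery_date"] - shipping["order_date"]

# .dt.days converts each Timedelta to a whole number of days.
days = shipping["Delivery Time"].dt.days.tolist()

# Comparing against a Timedelta object avoids relying on string parsing.
slow = shipping[shipping["Delivery Time"] > pd.Timedelta(days=500)]

print(days)                 # [257, 2144, 563]
print(slow.index.tolist())  # [2, 4]
```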