## Dates and Times

## Table of Contents

<ul>
    <li><a href="#1">1. Intro to the Working with Dates and Times Module</a></li>
    <li><a href="#2">2. Review of Python's datetime Module</a></li>
    <li><a href="#3">3. The pandas Timestamp Object</a></li>
    <li><a href="#4">4. The pandas DatetimeIndex Object</a></li>
    <li><a href="#5">5. The pd.to_datetime() Method</a></li>
    <li><a href="#6">6. Create Range of Dates with the pd.date_range() Method</a></li>
    <li><a href="#7">7. The .dt Accessor</a></li>
    <li><a href="#8">8. Install pandas-datareader Library</a></li>
    <li><a href="#9">9. Import Financial Data Set with pandas_datareader Library</a></li>
    <li><a href="#10">10. Selecting Rows from a DataFrame with DatetimeIndex</a></li>
    <li><a href="#11">11. Timestamp Object Attributes</a></li>
    <li><a href="#12">12. The .truncate() Method</a></li>
    <li><a href="#13">13. pd.DateOffset Objects</a></li>
    <li><a href="#14">14. More Fun with pd.DateOffset Objects</a></li>
    <li><a href="#15">15. The pandas Timedelta Object</a></li>
    <li><a href="#16">16. Timedeltas in a Dataset</a></li>
</ul>

In [1]:
%%HTML
<style type="text/css">
table.dataframe td, table.dataframe th {
    border: 1.5px black solid !important;
    color: black !important;
}
</style>

In [2]:
# Show every expression's output in a cell, not just the last one
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
from IPython.display import display, HTML  # importing these from IPython.core.display is deprecated
display(HTML("<style>.container { width:78% !important; }</style>"))
import warnings
warnings.filterwarnings("ignore")

# Load packages
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import datetime as dt
from pandas_datareader import data
from pandas.tseries import *
pd.__version__

'1.5.2'

<a id='1'></a>
### 1. Intro to the Working with Dates and Times Module

<a id='2'></a>
### 2. Review of Python's `datetime` Module

In [3]:
birthday = dt.date(1984, 6, 27)  # year, month, day

In [4]:
birthday.year
birthday.month
birthday.day

1984

6

27

In [5]:
birthday = dt.datetime(1984, 6, 27, 18, 25, 59)  # year, month, day[, hour[, minute[, second[, microsecond[, tzinfo]]]]]

In [6]:
birthday.year
birthday.month
birthday.day
birthday.hour
birthday.minute
birthday.second

1984

6

27

18

25

59

In [7]:
dt.date(1984, 6, 27)
dt.datetime(1984, 6, 27, 18, 25, 59)
str(dt.datetime(1984, 6, 27, 18, 25, 59))
str(dt.date(1984, 6, 27))

datetime.date(1984, 6, 27)

datetime.datetime(1984, 6, 27, 18, 25, 59)

'1984-06-27 18:25:59'

'1984-06-27'
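
The string forms shown above can be parsed back into `datetime` objects. The following is a minimal sketch, using only the standard library, of a round trip through text; the variable names (`moment`, `as_text`, `parsed`, `date_only`) are illustrative and not part of the original notebook.

```python
import datetime as dt

moment = dt.datetime(1984, 6, 27, 18, 25, 59)

# Format as text; with sep=" " this matches what str(moment) produces
as_text = moment.isoformat(sep=" ")

# Parse the text back using an explicit format string
parsed = dt.datetime.strptime(as_text, "%Y-%m-%d %H:%M:%S")

# Drop the time component to get a plain date
date_only = moment.date()
```

An explicit format string makes the parsing rule visible, which is the same idea `pd.to_datetime(..., format=...)` relies on later in this module.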

<a id='3'></a>
### 3. The pandas `Timestamp` Object

In [8]:
pd.Timestamp('2015-03-31')
pd.Timestamp('2015/03/31')
pd.Timestamp('2015, 03, 31')
pd.Timestamp('03, 31, 2015')
pd.Timestamp('2015-03-31 08:35:15')
pd.Timestamp('2015-03-31 08:35:15 PM')

Timestamp('2015-03-31 00:00:00')

Timestamp('2015-03-31 00:00:00')

Timestamp('2015-03-31 00:00:00')

Timestamp('2015-03-31 00:00:00')

Timestamp('2015-03-31 08:35:15')

Timestamp('2015-03-31 20:35:15')

In [9]:
pd.Timestamp(dt.date(1984, 6, 27))
pd.Timestamp(dt.datetime(1984, 6, 27, 18, 25, 59))

Timestamp('1984-06-27 00:00:00')

Timestamp('1984-06-27 18:25:59')
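
A `Timestamp` can generally be used wherever a `datetime.datetime` is expected. The sketch below checks that relationship and converts back to the standard-library type; the names `ts`, `is_datetime`, `back`, and `same` are illustrative only.

```python
import datetime as dt
import pandas as pd

ts = pd.Timestamp("2015-03-31 08:35:15")

# Timestamp is a subclass of datetime.datetime
is_datetime = isinstance(ts, dt.datetime)

# Convert back to a plain standard-library datetime
back = ts.to_pydatetime()

# Timestamps compare equal to the matching datetime
same = ts == dt.datetime(2015, 3, 31, 8, 35, 15)
```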

<a id='4'></a>
### 4. The pandas `DatetimeIndex` Object

In [10]:
dates = ["2016-01-02", "2016-04-12", "2009-09-07", "2019-10-30"]
pd.DatetimeIndex(data=dates)

DatetimeIndex(['2016-01-02', '2016-04-12', '2009-09-07', '2019-10-30'], dtype='datetime64[ns]', freq=None)

In [11]:
dates = [dt.date(1984, 6, 27), dt.date(1985, 6, 27), dt.date(1986, 6, 27)]
pd.DatetimeIndex(data=dates)

DatetimeIndex(['1984-06-27', '1985-06-27', '1986-06-27'], dtype='datetime64[ns]', freq=None)

In [12]:
values = [100, 200, 300]
dates = [dt.date(1984, 6, 27), dt.date(1985, 6, 27), dt.date(1986, 6, 27)]
dt_index = pd.DatetimeIndex(data=dates)
pd.Series(data=values, index=dt_index)

1984-06-27    100
1985-06-27    200
1986-06-27    300
dtype: int64
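
A `DatetimeIndex` keeps dates in the order they were given, so chronological order is not guaranteed. As a small sketch with made-up values, `sort_index()` puts a Series into date order and `.loc` accepts a `Timestamp` label; `unsorted` and `ordered` are hypothetical names.

```python
import pandas as pd

# Dates deliberately out of chronological order
index = pd.DatetimeIndex(["1986-06-27", "1984-06-27", "1985-06-27"])
unsorted = pd.Series(data=[100, 200, 300], index=index)

# Sort rows by date
ordered = unsorted.sort_index()

# Label-based lookup with a Timestamp
value_1984 = ordered.loc[pd.Timestamp("1984-06-27")]
```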

<a id='5'></a>
### 5. The `pd.to_datetime()` Method

In [13]:
pd.to_datetime('2015-03-31')
pd.to_datetime('2015-03-31 08:35:15 PM')
pd.to_datetime(dt.date(1984, 6, 27))
pd.to_datetime(dt.datetime(1984, 6, 27, 18, 25, 59))
pd.to_datetime(["2015-03-31", "2015/02/08", "2016", "July 4th 1996"])

Timestamp('2015-03-31 00:00:00')

Timestamp('2015-03-31 20:35:15')

Timestamp('1984-06-27 00:00:00')

Timestamp('1984-06-27 18:25:59')

DatetimeIndex(['2015-03-31', '2015-02-08', '2016-01-01', '1996-07-04'], dtype='datetime64[ns]', freq=None)

In [14]:
times = pd.Series(["2015-03-31", "2015/02/08", "2016", "July 4th 1996 8PM"])
times

0           2015-03-31
1           2015/02/08
2                 2016
3    July 4th 1996 8PM
dtype: object

In [15]:
pd.to_datetime(times)

0   2015-03-31 00:00:00
1   2015-02-08 00:00:00
2   2016-01-01 00:00:00
3   1996-07-04 20:00:00
dtype: datetime64[ns]

In [16]:
times = pd.Series(["July 4th 1996", "10/12/1991", "Hello", "2015-02-31"])
times

0    July 4th 1996
1       10/12/1991
2            Hello
3       2015-02-31
dtype: object

In [17]:
pd.to_datetime(times, errors='coerce')
pd.to_datetime(times, errors='ignore')

0   1996-07-04
1   1991-10-12
2          NaT
3          NaT
dtype: datetime64[ns]

0    July 4th 1996
1       10/12/1991
2            Hello
3       2015-02-31
dtype: object

In [18]:
pd.to_datetime(arg=[1578519366, 1578599999], unit='s')

DatetimeIndex(['2020-01-08 21:36:06', '2020-01-09 19:59:59'], dtype='datetime64[ns]', freq=None)
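
In the output above, `"10/12/1991"` was read as October 12, because ambiguous strings default to month-first. When the source data is day-first, passing an explicit `format` removes the ambiguity. This is a minimal sketch; the sample strings and variable names are assumptions for illustration.

```python
import pandas as pd

text = "10/12/1991"

# Default parsing treats the first number as the month
month_first = pd.to_datetime(text)

# An explicit format reads the same text as day/month/year
day_first = pd.to_datetime(text, format="%d/%m/%Y")

# format and errors="coerce" can be combined: values that do not match become NaT
mixed = pd.Series(["31/12/1991", "Hello"])
parsed = pd.to_datetime(mixed, format="%d/%m/%Y", errors="coerce")
```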

<a id='6'></a>
### 6. Create Range of Dates with the `pd.date_range()` Method

In [19]:
pd.date_range(start='2016-01-01', end='2016-01-10', freq='D')

DatetimeIndex(['2016-01-01', '2016-01-02', '2016-01-03', '2016-01-04',
               '2016-01-05', '2016-01-06', '2016-01-07', '2016-01-08',
               '2016-01-09', '2016-01-10'],
              dtype='datetime64[ns]', freq='D')

In [20]:
times = pd.date_range(start='2016-01-01', end='2016-01-10', freq='D')
type(times)
type(times[0])

pandas.core.indexes.datetimes.DatetimeIndex

pandas._libs.tslibs.timestamps.Timestamp

In [21]:
pd.date_range(start='2016-01-01', end='2016-01-10', freq='H')   # hourly
pd.date_range(start='2016-01-01', end='2016-01-10', freq='20H') # every 20 hours
pd.date_range(start='2016-01-01', end='2016-01-10', freq='1D')  # daily, same as 'D'
pd.date_range(start='2016-01-01', end='2016-01-10', freq='2D')  # every 2 days; 01-10 is not on the grid, so the range stops at 01-09
pd.date_range(start='2016-01-01', end='2016-01-11', freq='2D')  # every 2 days; 01-11 is on the grid and is included
pd.date_range(start='2016-01-01', end='2016-01-10', freq='B')   # business days (Mon-Fri)

DatetimeIndex(['2016-01-01 00:00:00', '2016-01-01 01:00:00',
               '2016-01-01 02:00:00', '2016-01-01 03:00:00',
               '2016-01-01 04:00:00', '2016-01-01 05:00:00',
               '2016-01-01 06:00:00', '2016-01-01 07:00:00',
               '2016-01-01 08:00:00', '2016-01-01 09:00:00',
               ...
               '2016-01-09 15:00:00', '2016-01-09 16:00:00',
               '2016-01-09 17:00:00', '2016-01-09 18:00:00',
               '2016-01-09 19:00:00', '2016-01-09 20:00:00',
               '2016-01-09 21:00:00', '2016-01-09 22:00:00',
               '2016-01-09 23:00:00', '2016-01-10 00:00:00'],
              dtype='datetime64[ns]', length=217, freq='H')

DatetimeIndex(['2016-01-01 00:00:00', '2016-01-01 20:00:00',
               '2016-01-02 16:00:00', '2016-01-03 12:00:00',
               '2016-01-04 08:00:00', '2016-01-05 04:00:00',
               '2016-01-06 00:00:00', '2016-01-06 20:00:00',
               '2016-01-07 16:00:00', '2016-01-08 12:00:00',
               '2016-01-09 08:00:00'],
              dtype='datetime64[ns]', freq='20H')

DatetimeIndex(['2016-01-01', '2016-01-02', '2016-01-03', '2016-01-04',
               '2016-01-05', '2016-01-06', '2016-01-07', '2016-01-08',
               '2016-01-09', '2016-01-10'],
              dtype='datetime64[ns]', freq='D')

DatetimeIndex(['2016-01-01', '2016-01-03', '2016-01-05', '2016-01-07',
               '2016-01-09'],
              dtype='datetime64[ns]', freq='2D')

DatetimeIndex(['2016-01-01', '2016-01-03', '2016-01-05', '2016-01-07',
               '2016-01-09', '2016-01-11'],
              dtype='datetime64[ns]', freq='2D')

DatetimeIndex(['2016-01-01', '2016-01-04', '2016-01-05', '2016-01-06',
               '2016-01-07', '2016-01-08'],
              dtype='datetime64[ns]', freq='B')

In [22]:
pd.date_range(start='2016-01-01', end='2016-01-10', freq='W')     # weekly, anchored on Sunday (alias for 'W-SUN')
pd.date_range(start='2016-01-01', end='2016-01-10', freq='W-FRI') # weekly, anchored on Friday
pd.date_range(start='2016-01-01', end='2016-12-31', freq='M')     # month ends
pd.date_range(start='2016-01-01', end='2016-12-31', freq='MS')    # month starts
pd.date_range(start='2016-01-01', end='2050-12-31', freq='A')     # year ends (alias for 'A-DEC')

DatetimeIndex(['2016-01-03', '2016-01-10'], dtype='datetime64[ns]', freq='W-SUN')

DatetimeIndex(['2016-01-01', '2016-01-08'], dtype='datetime64[ns]', freq='W-FRI')

DatetimeIndex(['2016-01-31', '2016-02-29', '2016-03-31', '2016-04-30',
               '2016-05-31', '2016-06-30', '2016-07-31', '2016-08-31',
               '2016-09-30', '2016-10-31', '2016-11-30', '2016-12-31'],
              dtype='datetime64[ns]', freq='M')

DatetimeIndex(['2016-01-01', '2016-02-01', '2016-03-01', '2016-04-01',
               '2016-05-01', '2016-06-01', '2016-07-01', '2016-08-01',
               '2016-09-01', '2016-10-01', '2016-11-01', '2016-12-01'],
              dtype='datetime64[ns]', freq='MS')

DatetimeIndex(['2016-12-31', '2017-12-31', '2018-12-31', '2019-12-31',
               '2020-12-31', '2021-12-31', '2022-12-31', '2023-12-31',
               '2024-12-31', '2025-12-31', '2026-12-31', '2027-12-31',
               '2028-12-31', '2029-12-31', '2030-12-31', '2031-12-31',
               '2032-12-31', '2033-12-31', '2034-12-31', '2035-12-31',
               '2036-12-31', '2037-12-31', '2038-12-31', '2039-12-31',
               '2040-12-31', '2041-12-31', '2042-12-31', '2043-12-31',
               '2044-12-31', '2045-12-31', '2046-12-31', '2047-12-31',
               '2048-12-31', '2049-12-31', '2050-12-31'],
              dtype='datetime64[ns]', freq='A-DEC')

In [23]:
pd.date_range(start='2016-01-01', periods=12, freq='1d')
len(pd.date_range(start='2016-01-01', periods=12, freq='1d'))

DatetimeIndex(['2016-01-01', '2016-01-02', '2016-01-03', '2016-01-04',
               '2016-01-05', '2016-01-06', '2016-01-07', '2016-01-08',
               '2016-01-09', '2016-01-10', '2016-01-11', '2016-01-12'],
              dtype='datetime64[ns]', freq='D')

12

In [24]:
pd.date_range(start='2016-01-01', periods=12, freq='B')

DatetimeIndex(['2016-01-01', '2016-01-04', '2016-01-05', '2016-01-06',
               '2016-01-07', '2016-01-08', '2016-01-11', '2016-01-12',
               '2016-01-13', '2016-01-14', '2016-01-15', '2016-01-18'],
              dtype='datetime64[ns]', freq='B')

In [25]:
pd.date_range(start='2016-01-01', periods=12, freq='W')

DatetimeIndex(['2016-01-03', '2016-01-10', '2016-01-17', '2016-01-24',
               '2016-01-31', '2016-02-07', '2016-02-14', '2016-02-21',
               '2016-02-28', '2016-03-06', '2016-03-13', '2016-03-20'],
              dtype='datetime64[ns]', freq='W-SUN')

In [26]:
pd.date_range(start='2016-01-01', periods=12, freq='W-FRI')

DatetimeIndex(['2016-01-01', '2016-01-08', '2016-01-15', '2016-01-22',
               '2016-01-29', '2016-02-05', '2016-02-12', '2016-02-19',
               '2016-02-26', '2016-03-04', '2016-03-11', '2016-03-18'],
              dtype='datetime64[ns]', freq='W-FRI')

In [27]:
pd.date_range(start='2016-01-01', periods=12, freq='MS')

DatetimeIndex(['2016-01-01', '2016-02-01', '2016-03-01', '2016-04-01',
               '2016-05-01', '2016-06-01', '2016-07-01', '2016-08-01',
               '2016-09-01', '2016-10-01', '2016-11-01', '2016-12-01'],
              dtype='datetime64[ns]', freq='MS')

In [28]:
pd.date_range(start='2016-01-01', periods=12, freq='M')

DatetimeIndex(['2016-01-31', '2016-02-29', '2016-03-31', '2016-04-30',
               '2016-05-31', '2016-06-30', '2016-07-31', '2016-08-31',
               '2016-09-30', '2016-10-31', '2016-11-30', '2016-12-31'],
              dtype='datetime64[ns]', freq='M')

In [29]:
pd.date_range(start='2016-01-01', periods=12, freq='A')

DatetimeIndex(['2016-12-31', '2017-12-31', '2018-12-31', '2019-12-31',
               '2020-12-31', '2021-12-31', '2022-12-31', '2023-12-31',
               '2024-12-31', '2025-12-31', '2026-12-31', '2027-12-31'],
              dtype='datetime64[ns]', freq='A-DEC')

In [30]:
pd.date_range(start='2016-01-01', periods=12, freq='A-JUN')

DatetimeIndex(['2016-06-30', '2017-06-30', '2018-06-30', '2019-06-30',
               '2020-06-30', '2021-06-30', '2022-06-30', '2023-06-30',
               '2024-06-30', '2025-06-30', '2026-06-30', '2027-06-30'],
              dtype='datetime64[ns]', freq='A-JUN')

In [31]:
pd.date_range(end='2016-01-01', periods=12, freq='1D')

DatetimeIndex(['2015-12-21', '2015-12-22', '2015-12-23', '2015-12-24',
               '2015-12-25', '2015-12-26', '2015-12-27', '2015-12-28',
               '2015-12-29', '2015-12-30', '2015-12-31', '2016-01-01'],
              dtype='datetime64[ns]', freq='D')

In [32]:
pd.date_range(end='2016-01-01', periods=12, freq='B')

DatetimeIndex(['2015-12-17', '2015-12-18', '2015-12-21', '2015-12-22',
               '2015-12-23', '2015-12-24', '2015-12-25', '2015-12-28',
               '2015-12-29', '2015-12-30', '2015-12-31', '2016-01-01'],
              dtype='datetime64[ns]', freq='B')

In [33]:
pd.date_range(end='2016-01-01', periods=12, freq='W-SUN')

DatetimeIndex(['2015-10-11', '2015-10-18', '2015-10-25', '2015-11-01',
               '2015-11-08', '2015-11-15', '2015-11-22', '2015-11-29',
               '2015-12-06', '2015-12-13', '2015-12-20', '2015-12-27'],
              dtype='datetime64[ns]', freq='W-SUN')

In [34]:
pd.date_range(end='2016-01-01', periods=12, freq='MS')

DatetimeIndex(['2015-02-01', '2015-03-01', '2015-04-01', '2015-05-01',
               '2015-06-01', '2015-07-01', '2015-08-01', '2015-09-01',
               '2015-10-01', '2015-11-01', '2015-12-01', '2016-01-01'],
              dtype='datetime64[ns]', freq='MS')

In [35]:
pd.date_range(end='2016-01-01', periods=12, freq='a')

DatetimeIndex(['2004-12-31', '2005-12-31', '2006-12-31', '2007-12-31',
               '2008-12-31', '2009-12-31', '2010-12-31', '2011-12-31',
               '2012-12-31', '2013-12-31', '2014-12-31', '2015-12-31'],
              dtype='datetime64[ns]', freq='A-DEC')
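
The anchoring behaviour of the frequency aliases can be checked directly instead of being read off the printed dates. The sketch below, with illustrative variable names, confirms that `'W'` produces Sundays and that `'B'` never produces a weekend.

```python
import pandas as pd

weekly = pd.date_range(start="2016-01-01", periods=4, freq="W")
business = pd.date_range(start="2016-01-01", periods=12, freq="B")

# Every date generated by 'W' falls on the anchor day
weekly_days = set(weekly.day_name())

# dayofweek is 0 for Monday through 6 for Sunday; 5 and 6 are the weekend
has_weekend = (business.dayofweek >= 5).any()
```

Note that the first weekly date is 2016-01-03 rather than the `start` value, because 2016-01-01 is a Friday and the range begins at the first Sunday on or after `start`.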

<a id='7'></a>
### 7. The `.dt` Accessor

In [36]:
bunch_of_dates = pd.date_range(start="2000-01-01", end="2010-12-31", freq="24D")

In [37]:
series1 = pd.Series(bunch_of_dates)
series1.head(n=3)

0   2000-01-01
1   2000-01-25
2   2000-02-18
dtype: datetime64[ns]

In [38]:
series1.dt.day.head(n=2)
series1.dt.month.head(n=2)
series1.dt.day_name().head(n=2)
series1.dt.year.head(n=2)
series1.dt.is_quarter_start.head(n=2)

0     1
1    25
dtype: int64

0    1
1    1
dtype: int64

0    Saturday
1     Tuesday
dtype: object

0    2000
1    2000
dtype: int64

0     True
1    False
dtype: bool

In [39]:
filter1 = series1.dt.is_quarter_start
series1[filter1]

0     2000-01-01
19    2001-04-01
38    2002-07-01
137   2009-01-01
dtype: datetime64[ns]

In [40]:
filter1 = series1.dt.is_month_end
series1[filter1]

5     2000-04-30
57    2003-09-30
71    2004-08-31
90    2005-11-30
123   2008-01-31
161   2010-07-31
dtype: datetime64[ns]
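
Beyond boolean filters, the values returned by `.dt` can serve as grouping keys. The following sketch, restricted to the year 2000 to keep it small, counts dates per month and keeps only weekend dates; `dates`, `per_month`, and `weekend_dates` are hypothetical names.

```python
import pandas as pd

dates = pd.Series(pd.date_range(start="2000-01-01", end="2000-12-31", freq="24D"))

# Group by the month number extracted with .dt and count rows per month
per_month = dates.groupby(dates.dt.month).size()

# Build a boolean mask from .dt.dayofweek (5 = Saturday, 6 = Sunday)
weekend_mask = dates.dt.dayofweek >= 5
weekend_dates = dates[weekend_mask]
```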

<a id='8'></a>
### 8. Install `pandas-datareader` Library

<a id='9'></a>
### 9. Import Financial Data Set with `pandas_datareader` Library

In [41]:
import yfinance as yf

company = "MSFT"
start = "2013-01-01"
end = "2020-12-31"

yf.pdr_override()  # route pandas_datareader's Yahoo calls through yfinance
# Legacy call, kept for reference:
# stocks = data.DataReader(name=company, data_source="yahoo", start=start, end=end)
stocks = data.get_data_yahoo([company], start=start, end=end)
stocks.head(n=3)

[*********************100%***********************]  1 of 1 completed


| Date | Open | High | Low | Close | Adj Close | Volume |
| --- | --- | --- | --- | --- | --- | --- |
| 2013-01-02 | 27.25 | 27.73 | 27.15 | 27.620001 | 22.774685 | 52899300 |
| 2013-01-03 | 27.629999 | 27.65 | 27.16 | 27.25 | 22.469589 | 48294400 |
| 2013-01-04 | 27.27 | 27.34 | 26.73 | 26.74 | 22.049057 | 52521100 |


In [42]:
stocks.values

array([[2.72500000e+01, 2.77299995e+01, 2.71499996e+01, 2.76200008e+01,
        2.27746849e+01, 5.28993000e+07],
       [2.76299992e+01, 2.76499996e+01, 2.71599998e+01, 2.72500000e+01,
        2.24695892e+01, 4.82944000e+07],
       [2.72700005e+01, 2.73400002e+01, 2.67299995e+01, 2.67399998e+01,
        2.20490570e+01, 5.25211000e+07],
       ...,
       [2.24449997e+02, 2.26029999e+02, 2.23020004e+02, 2.24960007e+02,
        2.21018997e+02, 1.79335000e+07],
       [2.26309998e+02, 2.27179993e+02, 2.23580002e+02, 2.24149994e+02,
        2.20223221e+02, 1.74032000e+07],
       [2.25229996e+02, 2.25630005e+02, 2.21470001e+02, 2.21679993e+02,
        2.17796463e+02, 2.02723000e+07]])

In [43]:
stocks.columns

Index(['Open', 'High', 'Low', 'Close', 'Adj Close', 'Volume'], dtype='object')

In [44]:
stocks.index

DatetimeIndex(['2013-01-02', '2013-01-03', '2013-01-04', '2013-01-07',
               '2013-01-08', '2013-01-09', '2013-01-10', '2013-01-11',
               '2013-01-14', '2013-01-15',
               ...
               '2020-12-16', '2020-12-17', '2020-12-18', '2020-12-21',
               '2020-12-22', '2020-12-23', '2020-12-24', '2020-12-28',
               '2020-12-29', '2020-12-30'],
              dtype='datetime64[ns]', name='Date', length=2014, freq=None)

In [45]:
stocks.index[0]

Timestamp('2013-01-02 00:00:00')

In [46]:
stocks.axes

[DatetimeIndex(['2013-01-02', '2013-01-03', '2013-01-04', '2013-01-07',
                '2013-01-08', '2013-01-09', '2013-01-10', '2013-01-11',
                '2013-01-14', '2013-01-15',
                ...
                '2020-12-16', '2020-12-17', '2020-12-18', '2020-12-21',
                '2020-12-22', '2020-12-23', '2020-12-24', '2020-12-28',
                '2020-12-29', '2020-12-30'],
               dtype='datetime64[ns]', name='Date', length=2014, freq=None),
 Index(['Open', 'High', 'Low', 'Close', 'Adj Close', 'Volume'], dtype='object')]

<a id='10'></a>
### 10. Selecting Rows from a DataFrame with `DatetimeIndex`

In [47]:
import yfinance as yf

company = "SCOR"
start = "2013-01-01"
end = "2020-12-31"

yf.pdr_override()  # route pandas_datareader's Yahoo calls through yfinance
# Legacy call, kept for reference:
# stocks = data.DataReader(name=company, data_source="yahoo", start=start, end=end)
stocks = data.get_data_yahoo([company], start=start, end=end)
stocks.head(n=3)

[*********************100%***********************]  1 of 1 completed


| Date | Open | High | Low | Close | Adj Close | Volume |
| --- | --- | --- | --- | --- | --- | --- |
| 2013-01-02 | 14.01 | 14.4 | 13.96 | 14.1 | 14.1 | 214600 |
| 2013-01-03 | 14.09 | 14.17 | 13.9 | 13.99 | 13.99 | 139800 |
| 2013-01-04 | 14.08 | 14.22 | 13.91 | 14.16 | 14.16 | 219900 |


In [48]:
stocks.loc["2014-03-04"]

Open             31.860001
High             32.910000
Low              31.540001
Close            32.279999
Adj Close        32.279999
Volume       361700.000000
Name: 2014-03-04 00:00:00, dtype: float64

In [49]:
stocks.iloc[0]

Open             14.01
High             14.40
Low              13.96
Close            14.10
Adj Close        14.10
Volume       214600.00
Name: 2013-01-02 00:00:00, dtype: float64

In [50]:
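# stocks.loc[0]             # would raise an error: .loc is label-based, and 0 is not a date label
# stocks.loc["2013-01-02"]  # works: selects the row for that date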

In [51]:
stocks.loc["2013-10-01":"2013-10-07"]

| Date | Open | High | Low | Close | Adj Close | Volume |
| --- | --- | --- | --- | --- | --- | --- |
| 2013-10-01 | 28.879999 | 29.110001 | 28.76 | 29.09 | 29.09 | 141900 |
| 2013-10-02 | 28.709999 | 29.040001 | 28.34 | 28.700001 | 28.700001 | 189300 |
| 2013-10-03 | 28.559999 | 28.709999 | 28.209999 | 28.51 | 28.51 | 140400 |
| 2013-10-04 | 28.450001 | 29.059999 | 28.35 | 28.83 | 28.83 | 82100 |
| 2013-10-07 | 28.549999 | 28.68 | 28.16 | 28.370001 | 28.370001 | 153600 |


In [53]:
pd.date_range(start="1991-06-27", end="2020-06-27", freq=pd.DateOffset(years=1))

DatetimeIndex(['1991-06-27', '1992-06-27', '1993-06-27', '1994-06-27',
               '1995-06-27', '1996-06-27', '1997-06-27', '1998-06-27',
               '1999-06-27', '2000-06-27', '2001-06-27', '2002-06-27',
               '2003-06-27', '2004-06-27', '2005-06-27', '2006-06-27',
               '2007-06-27', '2008-06-27', '2009-06-27', '2010-06-27',
               '2011-06-27', '2012-06-27', '2013-06-27', '2014-06-27',
               '2015-06-27', '2016-06-27', '2017-06-27', '2018-06-27',
               '2019-06-27', '2020-06-27'],
              dtype='datetime64[ns]', freq='<DateOffset: years=1>')

In [54]:
birthdays = pd.date_range(start="1991-06-27", end="2020-06-27", freq=pd.DateOffset(years=1))

In [55]:
filter1 = stocks.index.isin(birthdays)
stocks.loc[filter1]

| Date | Open | High | Low | Close | Adj Close | Volume |
| --- | --- | --- | --- | --- | --- | --- |
| 2013-06-27 | 23.469999 | 23.99 | 23.42 | 23.940001 | 23.940001 | 130700 |
| 2014-06-27 | 35.0 | 35.48 | 34.810001 | 35.34 | 35.34 | 548700 |
| 2016-06-27 | 30.129999 | 30.209999 | 28.42 | 29.4 | 29.4 | 749100 |
| 2017-06-27 | 26.4 | 26.42 | 26.25 | 26.25 | 26.25 | 53400 |
| 2018-06-27 | 21.690001 | 22.059999 | 20.549999 | 21.120001 | 21.120001 | 119500 |
| 2019-06-27 | 5.68 | 5.84 | 5.18 | 5.32 | 5.32 | 1882000 |
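
The selection patterns in this section do not depend on downloaded stock data. The sketch below rebuilds them on a synthetic business-day frame, so it runs offline; `prices`, `idx`, and the `Close` values are made up for illustration. It also shows partial-string indexing, where a string such as `"2013-10"` selects a whole month.

```python
import numpy as np
import pandas as pd

# Synthetic frame indexed by the business days of 2013 (values are placeholders)
idx = pd.date_range(start="2013-01-01", end="2013-12-31", freq="B")
prices = pd.DataFrame({"Close": np.arange(len(idx), dtype=float)}, index=idx)

# A full date string selects one row, returned as a Series
one_day = prices.loc["2013-10-01"]

# Slicing with date strings includes both endpoints
inclusive = prices.loc["2013-10-01":"2013-10-07"]

# A partial date string selects every row in that month
whole_month = prices.loc["2013-10"]

# isin() builds a mask for an arbitrary set of dates
wanted = pd.to_datetime(["2013-06-27", "2013-06-29"])  # the 29th is a Saturday
matches = prices.loc[prices.index.isin(wanted)]
```

Dates that are absent from the index, such as the Saturday above, are simply skipped by `isin()`, which is why the birthday filter returns fewer rows than there are birthdays.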


<a id='11'></a>
### 11. Timestamp Object Attributes

In [56]:
import yfinance as yf

company = "MSFT"
start = "2013-01-01"
end = "2020-12-31"

yf.pdr_override()  # route pandas_datareader's Yahoo calls through yfinance
# Legacy call, kept for reference:
# stocks = data.DataReader(name=company, data_source="yahoo", start=start, end=end)
stocks = data.get_data_yahoo([company], start=start, end=end)
stocks.head(n=3)

[*********************100%***********************]  1 of 1 completed


| Date | Open | High | Low | Close | Adj Close | Volume |
| --- | --- | --- | --- | --- | --- | --- |
| 2013-01-02 | 27.25 | 27.73 | 27.15 | 27.620001 | 22.774691 | 52899300 |
| 2013-01-03 | 27.629999 | 27.65 | 27.16 | 27.25 | 22.469601 | 48294400 |
| 2013-01-04 | 27.27 | 27.34 | 26.73 | 26.74 | 22.049065 | 52521100 |


In [57]:
some_day = stocks.index[500]
some_day

Timestamp('2014-12-26 00:00:00')

In [58]:
some_day.day
some_day.month
some_day.year
some_day.day_name()
some_day.is_month_end
some_day.is_month_start

26

12

2014

'Friday'

False

False

In [59]:
stocks.insert(loc=0, column="Day of Week", value=stocks.index.day_name())
stocks.head(n=3)

| Date | Day of Week | Open | High | Low | Close | Adj Close | Volume |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2013-01-02 | Wednesday | 27.25 | 27.73 | 27.15 | 27.620001 | 22.774691 | 52899300 |
| 2013-01-03 | Thursday | 27.629999 | 27.65 | 27.16 | 27.25 | 22.469601 | 48294400 |
| 2013-01-04 | Friday | 27.27 | 27.34 | 26.73 | 26.74 | 22.049065 | 52521100 |


In [60]:
stocks.insert(loc=1, column="Is Start of Month", value=stocks.index.is_month_start)
stocks.head(n=3)

| Date | Day of Week | Is Start of Month | Open | High | Low | Close | Adj Close | Volume |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2013-01-02 | Wednesday | False | 27.25 | 27.73 | 27.15 | 27.620001 | 22.774691 | 52899300 |
| 2013-01-03 | Thursday | False | 27.629999 | 27.65 | 27.16 | 27.25 | 22.469601 | 48294400 |
| 2013-01-04 | Friday | False | 27.27 | 27.34 | 26.73 | 26.74 | 22.049065 | 52521100 |


In [61]:
stocks[stocks["Is Start of Month"]].head(n=3)

| Date | Day of Week | Is Start of Month | Open | High | Low | Close | Adj Close | Volume |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2013-02-01 | Friday | True | 27.67 | 28.049999 | 27.549999 | 27.93 | 23.030298 | 55565900 |
| 2013-03-01 | Friday | True | 27.719999 | 27.98 | 27.52 | 27.950001 | 23.237606 | 34849700 |
| 2013-04-01 | Monday | True | 28.639999 | 28.66 | 28.360001 | 28.610001 | 23.786331 | 29201100 |


<a id='12'></a>
### 12. The `.truncate()` Method

In [62]:
import yfinance as yf

company = "MSFT"
start = "2013-01-01"
end = "2020-12-31"

yf.pdr_override()  # route pandas_datareader's Yahoo calls through yfinance
# Legacy call, kept for reference:
# stocks = data.DataReader(name=company, data_source="yahoo", start=start, end=end)
stocks = data.get_data_yahoo([company], start=start, end=end)
stocks.head(n=3)

[*********************100%***********************]  1 of 1 completed


| Date | Open | High | Low | Close | Adj Close | Volume |
| --- | --- | --- | --- | --- | --- | --- |
| 2013-01-02 | 27.25 | 27.73 | 27.15 | 27.620001 | 22.774685 | 52899300 |
| 2013-01-03 | 27.629999 | 27.65 | 27.16 | 27.25 | 22.469587 | 48294400 |
| 2013-01-04 | 27.27 | 27.34 | 26.73 | 26.74 | 22.049065 | 52521100 |


In [63]:
stocks.truncate(before='2014-02-05', after='2018-02-01')

| Date | Open | High | Low | Close | Adj Close | Volume |
| --- | --- | --- | --- | --- | --- | --- |
| 2014-02-05 | 36.290001 | 36.470001 | 35.799999 | 35.820000 | 30.429918 | 55814400 |
| 2014-02-06 | 35.799999 | 36.250000 | 35.689999 | 36.180000 | 30.735748 | 35351800 |
| 2014-02-07 | 36.320000 | 36.590000 | 36.009998 | 36.560001 | 31.058567 | 33260500 |
| 2014-02-10 | 36.630001 | 36.799999 | 36.290001 | 36.799999 | 31.262447 | 26767000 |
| 2014-02-11 | 36.880001 | 37.259998 | 36.860001 | 37.169998 | 31.576771 | 32141400 |
| ... | ... | ... | ... | ... | ... | ... |
| 2018-01-26 | 93.120003 | 94.059998 | 92.580002 | 94.059998 | 88.577705 | 29172200 |
| 2018-01-29 | 95.139999 | 95.449997 | 93.720001 | 93.919998 | 88.445847 | 31569900 |
| 2018-01-30 | 93.300003 | 93.660004 | 92.099998 | 92.739998 | 87.334641 | 38635100 |
| 2018-01-31 | 93.750000 | 95.400002 | 93.510002 | 95.010002 | 89.472336 | 48756300 |
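
On a sorted `DatetimeIndex`, `.truncate()` keeps both the `before` and `after` dates, which makes it equivalent to an inclusive `.loc` slice. Here is a minimal sketch on a small synthetic Series; `values`, `trimmed`, and `sliced` are hypothetical names.

```python
import pandas as pd

idx = pd.date_range(start="2014-02-03", periods=10, freq="D")
values = pd.Series(range(10), index=idx)

# Keep only rows from 'before' through 'after', both inclusive
trimmed = values.truncate(before="2014-02-05", after="2014-02-08")

# The same selection written as a label slice
sliced = values.loc["2014-02-05":"2014-02-08"]
```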


<a id='13'></a>
### 13. `pd.DateOffset` Objects

In [64]:
import yfinance as yf

company = "SCOR"
start = "2000-01-01"  # earlier than the first trading day, so the full history is returned
end = "2020-12-31"

yf.pdr_override()  # route pandas_datareader's Yahoo calls through yfinance
# Legacy call, kept for reference:
# stocks = data.DataReader(name=company, data_source="yahoo", start=start, end=end)
stocks = data.get_data_yahoo([company], start=start, end=end)
stocks.head(n=3)
stocks.tail(n=3)

[*********************100%***********************]  1 of 1 completed


| Date | Open | High | Low | Close | Adj Close | Volume |
| --- | --- | --- | --- | --- | --- | --- |
| 2007-06-27 | 22.0 | 24.389999 | 19.700001 | 23.469999 | 23.469999 | 0 |
| 2007-06-28 | 23.889999 | 26.27 | 22.5 | 25.59 | 25.59 | 804300 |
| 2007-06-29 | 26.200001 | 26.200001 | 22.26 | 23.15 | 23.15 | 604000 |


| Date | Open | High | Low | Close | Adj Close | Volume |
| --- | --- | --- | --- | --- | --- | --- |
| 2020-12-28 | 2.2 | 2.34 | 2.15 | 2.28 | 2.28 | 405000 |
| 2020-12-29 | 2.34 | 2.4 | 2.21 | 2.28 | 2.28 | 525200 |
| 2020-12-30 | 2.34 | 2.52 | 2.32 | 2.47 | 2.47 | 2620800 |


In [65]:
stocks.index

DatetimeIndex(['2007-06-27', '2007-06-28', '2007-06-29', '2007-07-02',
               '2007-07-03', '2007-07-05', '2007-07-06', '2007-07-09',
               '2007-07-10', '2007-07-11',
               ...
               '2020-12-16', '2020-12-17', '2020-12-18', '2020-12-21',
               '2020-12-22', '2020-12-23', '2020-12-24', '2020-12-28',
               '2020-12-29', '2020-12-30'],
              dtype='datetime64[ns]', name='Date', length=3403, freq=None)

In [66]:
stocks.index + pd.DateOffset(days=5)
stocks.index + pd.DateOffset(weeks=2)
stocks.index - pd.DateOffset(years=20)

DatetimeIndex(['2007-07-02', '2007-07-03', '2007-07-04', '2007-07-07',
               '2007-07-08', '2007-07-10', '2007-07-11', '2007-07-14',
               '2007-07-15', '2007-07-16',
               ...
               '2020-12-21', '2020-12-22', '2020-12-23', '2020-12-26',
               '2020-12-27', '2020-12-28', '2020-12-29', '2021-01-02',
               '2021-01-03', '2021-01-04'],
              dtype='datetime64[ns]', name='Date', length=3403, freq=None)

DatetimeIndex(['2007-07-11', '2007-07-12', '2007-07-13', '2007-07-16',
               '2007-07-17', '2007-07-19', '2007-07-20', '2007-07-23',
               '2007-07-24', '2007-07-25',
               ...
               '2020-12-30', '2020-12-31', '2021-01-01', '2021-01-04',
               '2021-01-05', '2021-01-06', '2021-01-07', '2021-01-11',
               '2021-01-12', '2021-01-13'],
              dtype='datetime64[ns]', name='Date', length=3403, freq=None)

DatetimeIndex(['1987-06-27', '1987-06-28', '1987-06-29', '1987-07-02',
               '1987-07-03', '1987-07-05', '1987-07-06', '1987-07-09',
               '1987-07-10', '1987-07-11',
               ...
               '2000-12-16', '2000-12-17', '2000-12-18', '2000-12-21',
               '2000-12-22', '2000-12-23', '2000-12-24', '2000-12-28',
               '2000-12-29', '2000-12-30'],
              dtype='datetime64[ns]', name='Date', length=3403, freq=None)

In [67]:
stocks.index + pd.DateOffset(days=5, hours=3)
stocks.index + pd.DateOffset(weeks=2, day=2)
stocks.index - pd.DateOffset(years=20, months=4, hours=2, minutes=32)

DatetimeIndex(['2007-07-02 03:00:00', '2007-07-03 03:00:00',
               '2007-07-04 03:00:00', '2007-07-07 03:00:00',
               '2007-07-08 03:00:00', '2007-07-10 03:00:00',
               '2007-07-11 03:00:00', '2007-07-14 03:00:00',
               '2007-07-15 03:00:00', '2007-07-16 03:00:00',
               ...
               '2020-12-21 03:00:00', '2020-12-22 03:00:00',
               '2020-12-23 03:00:00', '2020-12-26 03:00:00',
               '2020-12-27 03:00:00', '2020-12-28 03:00:00',
               '2020-12-29 03:00:00', '2021-01-02 03:00:00',
               '2021-01-03 03:00:00', '2021-01-04 03:00:00'],
              dtype='datetime64[ns]', name='Date', length=3403, freq=None)

DatetimeIndex(['2007-06-16', '2007-06-16', '2007-06-16', '2007-07-16',
               '2007-07-16', '2007-07-16', '2007-07-16', '2007-07-16',
               '2007-07-16', '2007-07-16',
               ...
               '2020-12-16', '2020-12-16', '2020-12-16', '2020-12-16',
               '2020-12-16', '2020-12-16', '2020-12-16', '2020-12-16',
               '2020-12-16', '2020-12-16'],
              dtype='datetime64[ns]', name='Date', length=3403, freq=None)

DatetimeIndex(['1987-02-26 21:28:00', '1987-02-27 21:28:00',
               '1987-02-27 21:28:00', '1987-03-01 21:28:00',
               '1987-03-02 21:28:00', '1987-03-04 21:28:00',
               '1987-03-05 21:28:00', '1987-03-08 21:28:00',
               '1987-03-09 21:28:00', '1987-03-10 21:28:00',
               ...
               '2000-08-15 21:28:00', '2000-08-16 21:28:00',
               '2000-08-17 21:28:00', '2000-08-20 21:28:00',
               '2000-08-21 21:28:00', '2000-08-22 21:28:00',
               '2000-08-23 21:28:00', '2000-08-27 21:28:00',
               '2000-08-28 21:28:00', '2000-08-29 21:28:00'],
              dtype='datetime64[ns]', name='Date', length=3403, freq=None)
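
The second result above looks odd because `pd.DateOffset` treats plural and singular keywords differently: plural keywords such as `days` or `weeks` are added to the date, while singular keywords such as `day` replace that field. The sketch below isolates the effect on a single `Timestamp`; the variable names are illustrative.

```python
import pandas as pd

day = pd.Timestamp("2007-06-27")

# Plural keyword: add two days
shifted = day + pd.DateOffset(days=2)

# Singular keyword: set the day-of-month to 2
replaced = day + pd.DateOffset(day=2)

# Mixed: the day is first set to 2, then two weeks are added
combined = day + pd.DateOffset(weeks=2, day=2)
```

This matches the first entry of the `weeks=2, day=2` output, where 2007-06-27 becomes 2007-06-16.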

<a id='14'></a>
### 14. More Fun with `pd.DateOffset` Objects

In [68]:
import yfinance as yf

company = "SCOR"
start = "2000-01-01"  # earlier than the first trading day, so the full history is returned
end = "2020-12-31"

yf.pdr_override()  # route pandas_datareader's Yahoo calls through yfinance
# Legacy call, kept for reference:
# stocks = data.DataReader(name=company, data_source="yahoo", start=start, end=end)
stocks = data.get_data_yahoo([company], start=start, end=end)
stocks.head(n=3)
stocks.tail(n=3)

[*********************100%***********************]  1 of 1 completed


Date,Open,High,Low,Close,Adj Close,Volume
2007-06-27,22.0,24.389999,19.700001,23.469999,23.469999,0
2007-06-28,23.889999,26.27,22.5,25.59,25.59,804300
2007-06-29,26.200001,26.200001,22.26,23.15,23.15,604000


Date,Open,High,Low,Close,Adj Close,Volume
2020-12-28,2.2,2.34,2.15,2.28,2.28,405000
2020-12-29,2.34,2.4,2.21,2.28,2.28,525200
2020-12-30,2.34,2.52,2.32,2.47,2.47,2620800


In [69]:
stocks["new_date"] = stocks.index + pd.tseries.offsets.MonthEnd(n=1)
stocks

Date,Open,High,Low,Close,Adj Close,Volume,new_date
2007-06-27,22.000000,24.389999,19.700001,23.469999,23.469999,0,2007-06-30
2007-06-28,23.889999,26.270000,22.500000,25.590000,25.590000,804300,2007-06-30
2007-06-29,26.200001,26.200001,22.260000,23.150000,23.150000,604000,2007-06-30
2007-07-02,23.490000,23.490000,22.000000,22.410000,22.410000,324100,2007-07-31
2007-07-03,22.770000,22.770000,22.000000,22.389999,22.389999,86300,2007-07-31
...,...,...,...,...,...,...,...
2020-12-23,2.280000,2.340000,2.170000,2.210000,2.210000,775800,2020-12-31
2020-12-24,2.230000,2.330000,2.100000,2.120000,2.120000,445600,2020-12-31
2020-12-28,2.200000,2.340000,2.150000,2.280000,2.280000,405000,2020-12-31
2020-12-29,2.340000,2.400000,2.210000,2.280000,2.280000,525200,2020-12-31
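
In the rows displayed above, each date is rolled forward to the end of its own month. None of those rows already sits on a month end, so the output does not show what happens in that case: adding `MonthEnd(n=1)` to a date that is already a month end moves it to the next month end. A minimal sketch with made-up dates (not rows from `stocks`); `MonthEnd(0)` is the variant that leaves an on-anchor date in place.

```python
import pandas as pd

mid_month = pd.Timestamp("2020-06-27")   # illustrative dates, not taken from `stocks`
on_anchor = pd.Timestamp("2020-06-30")   # already a month end

month_end = pd.offsets.MonthEnd(n=1)

rolled_mid = mid_month + month_end       # rolls forward to the end of June
rolled_anchor = on_anchor + month_end    # already on the anchor, so it moves to the end of July

# MonthEnd(0) rolls forward only when the date is not already a month end
kept_anchor = on_anchor + pd.offsets.MonthEnd(0)

print(rolled_mid.date(), rolled_anchor.date(), kept_anchor.date())
```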


In [70]:
stocks["new_date2"] = stocks.index - pd.tseries.offsets.MonthEnd(n=1)
stocks

Date,Open,High,Low,Close,Adj Close,Volume,new_date,new_date2
2007-06-27,22.000000,24.389999,19.700001,23.469999,23.469999,0,2007-06-30,2007-05-31
2007-06-28,23.889999,26.270000,22.500000,25.590000,25.590000,804300,2007-06-30,2007-05-31
2007-06-29,26.200001,26.200001,22.260000,23.150000,23.150000,604000,2007-06-30,2007-05-31
2007-07-02,23.490000,23.490000,22.000000,22.410000,22.410000,324100,2007-07-31,2007-06-30
2007-07-03,22.770000,22.770000,22.000000,22.389999,22.389999,86300,2007-07-31,2007-06-30
...,...,...,...,...,...,...,...,...
2020-12-23,2.280000,2.340000,2.170000,2.210000,2.210000,775800,2020-12-31,2020-11-30
2020-12-24,2.230000,2.330000,2.100000,2.120000,2.120000,445600,2020-12-31,2020-11-30
2020-12-28,2.200000,2.340000,2.150000,2.280000,2.280000,405000,2020-12-31,2020-11-30
2020-12-29,2.340000,2.400000,2.210000,2.280000,2.280000,525200,2020-12-31,2020-11-30


In [71]:
stocks["new_date3"] = stocks.index - pd.tseries.offsets.MonthBegin(n=1)
stocks

Date,Open,High,Low,Close,Adj Close,Volume,new_date,new_date2,new_date3
2007-06-27,22.000000,24.389999,19.700001,23.469999,23.469999,0,2007-06-30,2007-05-31,2007-06-01
2007-06-28,23.889999,26.270000,22.500000,25.590000,25.590000,804300,2007-06-30,2007-05-31,2007-06-01
2007-06-29,26.200001,26.200001,22.260000,23.150000,23.150000,604000,2007-06-30,2007-05-31,2007-06-01
2007-07-02,23.490000,23.490000,22.000000,22.410000,22.410000,324100,2007-07-31,2007-06-30,2007-07-01
2007-07-03,22.770000,22.770000,22.000000,22.389999,22.389999,86300,2007-07-31,2007-06-30,2007-07-01
...,...,...,...,...,...,...,...,...,...
2020-12-23,2.280000,2.340000,2.170000,2.210000,2.210000,775800,2020-12-31,2020-11-30,2020-12-01
2020-12-24,2.230000,2.330000,2.100000,2.120000,2.120000,445600,2020-12-31,2020-11-30,2020-12-01
2020-12-28,2.200000,2.340000,2.150000,2.280000,2.280000,405000,2020-12-31,2020-11-30,2020-12-01
2020-12-29,2.340000,2.400000,2.210000,2.280000,2.280000,525200,2020-12-31,2020-11-30,2020-12-01


In [72]:
stocks["new_date4"] = stocks.index - pd.tseries.offsets.BMonthEnd()
stocks

Date,Open,High,Low,Close,Adj Close,Volume,new_date,new_date2,new_date3,new_date4
2007-06-27,22.000000,24.389999,19.700001,23.469999,23.469999,0,2007-06-30,2007-05-31,2007-06-01,2007-05-31
2007-06-28,23.889999,26.270000,22.500000,25.590000,25.590000,804300,2007-06-30,2007-05-31,2007-06-01,2007-05-31
2007-06-29,26.200001,26.200001,22.260000,23.150000,23.150000,604000,2007-06-30,2007-05-31,2007-06-01,2007-05-31
2007-07-02,23.490000,23.490000,22.000000,22.410000,22.410000,324100,2007-07-31,2007-06-30,2007-07-01,2007-06-29
2007-07-03,22.770000,22.770000,22.000000,22.389999,22.389999,86300,2007-07-31,2007-06-30,2007-07-01,2007-06-29
...,...,...,...,...,...,...,...,...,...,...
2020-12-23,2.280000,2.340000,2.170000,2.210000,2.210000,775800,2020-12-31,2020-11-30,2020-12-01,2020-11-30
2020-12-24,2.230000,2.330000,2.100000,2.120000,2.120000,445600,2020-12-31,2020-11-30,2020-12-01,2020-11-30
2020-12-28,2.200000,2.340000,2.150000,2.280000,2.280000,405000,2020-12-31,2020-11-30,2020-12-01,2020-11-30
2020-12-29,2.340000,2.400000,2.210000,2.280000,2.280000,525200,2020-12-31,2020-11-30,2020-12-01,2020-11-30
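
In the `new_date4` column, 2007-07-02 maps to 2007-06-29 rather than 2007-06-30. The sketch below checks why for that one date: the calendar month end falls on a weekend, and `BMonthEnd` picks the last weekday instead. As far as this sketch goes, `BMonthEnd` only looks at weekdays and does not know about exchange holidays.

```python
import pandas as pd

day = pd.Timestamp("2007-07-02")              # a date that appears in the index above

calendar_end = day - pd.offsets.MonthEnd()    # last calendar day of June 2007
business_end = day - pd.offsets.BMonthEnd()   # last weekday of June 2007

print(calendar_end.date(), calendar_end.day_name())
print(business_end.date(), business_end.day_name())
```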


In [73]:
stocks["new_date5"] = stocks.index - pd.tseries.offsets.QuarterEnd()
stocks

Date,Open,High,Low,Close,Adj Close,Volume,new_date,new_date2,new_date3,new_date4,new_date5
2007-06-27,22.000000,24.389999,19.700001,23.469999,23.469999,0,2007-06-30,2007-05-31,2007-06-01,2007-05-31,2007-03-31
2007-06-28,23.889999,26.270000,22.500000,25.590000,25.590000,804300,2007-06-30,2007-05-31,2007-06-01,2007-05-31,2007-03-31
2007-06-29,26.200001,26.200001,22.260000,23.150000,23.150000,604000,2007-06-30,2007-05-31,2007-06-01,2007-05-31,2007-03-31
2007-07-02,23.490000,23.490000,22.000000,22.410000,22.410000,324100,2007-07-31,2007-06-30,2007-07-01,2007-06-29,2007-06-30
2007-07-03,22.770000,22.770000,22.000000,22.389999,22.389999,86300,2007-07-31,2007-06-30,2007-07-01,2007-06-29,2007-06-30
...,...,...,...,...,...,...,...,...,...,...,...
2020-12-23,2.280000,2.340000,2.170000,2.210000,2.210000,775800,2020-12-31,2020-11-30,2020-12-01,2020-11-30,2020-09-30
2020-12-24,2.230000,2.330000,2.100000,2.120000,2.120000,445600,2020-12-31,2020-11-30,2020-12-01,2020-11-30,2020-09-30
2020-12-28,2.200000,2.340000,2.150000,2.280000,2.280000,405000,2020-12-31,2020-11-30,2020-12-01,2020-11-30,2020-09-30
2020-12-29,2.340000,2.400000,2.210000,2.280000,2.280000,525200,2020-12-31,2020-11-30,2020-12-01,2020-11-30,2020-09-30


In [74]:
stocks["new_date6"] = stocks.index - pd.tseries.offsets.YearEnd()
stocks

Date,Open,High,Low,Close,Adj Close,Volume,new_date,new_date2,new_date3,new_date4,new_date5,new_date6
2007-06-27,22.000000,24.389999,19.700001,23.469999,23.469999,0,2007-06-30,2007-05-31,2007-06-01,2007-05-31,2007-03-31,2006-12-31
2007-06-28,23.889999,26.270000,22.500000,25.590000,25.590000,804300,2007-06-30,2007-05-31,2007-06-01,2007-05-31,2007-03-31,2006-12-31
2007-06-29,26.200001,26.200001,22.260000,23.150000,23.150000,604000,2007-06-30,2007-05-31,2007-06-01,2007-05-31,2007-03-31,2006-12-31
2007-07-02,23.490000,23.490000,22.000000,22.410000,22.410000,324100,2007-07-31,2007-06-30,2007-07-01,2007-06-29,2007-06-30,2006-12-31
2007-07-03,22.770000,22.770000,22.000000,22.389999,22.389999,86300,2007-07-31,2007-06-30,2007-07-01,2007-06-29,2007-06-30,2006-12-31
...,...,...,...,...,...,...,...,...,...,...,...,...
2020-12-23,2.280000,2.340000,2.170000,2.210000,2.210000,775800,2020-12-31,2020-11-30,2020-12-01,2020-11-30,2020-09-30,2019-12-31
2020-12-24,2.230000,2.330000,2.100000,2.120000,2.120000,445600,2020-12-31,2020-11-30,2020-12-01,2020-11-30,2020-09-30,2019-12-31
2020-12-28,2.200000,2.340000,2.150000,2.280000,2.280000,405000,2020-12-31,2020-11-30,2020-12-01,2020-11-30,2020-09-30,2019-12-31
2020-12-29,2.340000,2.400000,2.210000,2.280000,2.280000,525200,2020-12-31,2020-11-30,2020-12-01,2020-11-30,2020-09-30,2019-12-31


In [75]:
stocks["new_date7"] = stocks.index + pd.tseries.offsets.YearEnd()
stocks

Date,Open,High,Low,Close,Adj Close,Volume,new_date,new_date2,new_date3,new_date4,new_date5,new_date6,new_date7
2007-06-27,22.000000,24.389999,19.700001,23.469999,23.469999,0,2007-06-30,2007-05-31,2007-06-01,2007-05-31,2007-03-31,2006-12-31,2007-12-31
2007-06-28,23.889999,26.270000,22.500000,25.590000,25.590000,804300,2007-06-30,2007-05-31,2007-06-01,2007-05-31,2007-03-31,2006-12-31,2007-12-31
2007-06-29,26.200001,26.200001,22.260000,23.150000,23.150000,604000,2007-06-30,2007-05-31,2007-06-01,2007-05-31,2007-03-31,2006-12-31,2007-12-31
2007-07-02,23.490000,23.490000,22.000000,22.410000,22.410000,324100,2007-07-31,2007-06-30,2007-07-01,2007-06-29,2007-06-30,2006-12-31,2007-12-31
2007-07-03,22.770000,22.770000,22.000000,22.389999,22.389999,86300,2007-07-31,2007-06-30,2007-07-01,2007-06-29,2007-06-30,2006-12-31,2007-12-31
...,...,...,...,...,...,...,...,...,...,...,...,...,...
2020-12-23,2.280000,2.340000,2.170000,2.210000,2.210000,775800,2020-12-31,2020-11-30,2020-12-01,2020-11-30,2020-09-30,2019-12-31,2020-12-31
2020-12-24,2.230000,2.330000,2.100000,2.120000,2.120000,445600,2020-12-31,2020-11-30,2020-12-01,2020-11-30,2020-09-30,2019-12-31,2020-12-31
2020-12-28,2.200000,2.340000,2.150000,2.280000,2.280000,405000,2020-12-31,2020-11-30,2020-12-01,2020-11-30,2020-09-30,2019-12-31,2020-12-31
2020-12-29,2.340000,2.400000,2.210000,2.280000,2.280000,525200,2020-12-31,2020-11-30,2020-12-01,2020-11-30,2020-09-30,2019-12-31,2020-12-31
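
Subtracting or adding `YearEnd()` always moves the date, even when it is already a year end. When the goal is to snap to the nearest anchor and keep dates that are already on it, the offset's `rollback` and `rollforward` methods may be a better fit. A small sketch on single timestamps (the dates are illustrative):

```python
import pandas as pd

day = pd.Timestamp("2020-12-29")
year_end = pd.offsets.YearEnd()

prev_year_end = year_end.rollback(day)      # matches day - YearEnd() for an off-anchor date
next_year_end = year_end.rollforward(day)   # matches day + YearEnd() for an off-anchor date

on_anchor = pd.Timestamp("2020-12-31")
stays = year_end.rollback(on_anchor)        # already a year end, so it is left unchanged
moves = on_anchor - year_end                # plain subtraction still steps back a full year

print(prev_year_end.date(), next_year_end.date(), stays.date(), moves.date())
```

`pd.offsets` is the same module as `pd.tseries.offsets` used in the cells above, just a shorter spelling.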


<a id='15'></a>
### 15. The pandas `Timedelta` Object

In [76]:
time_a = pd.Timestamp("1984-06-27")
time_a

Timestamp('1984-06-27 00:00:00')

In [77]:
time_b = pd.Timestamp.now()
time_b

Timestamp('2022-12-22 17:27:45.779497')

In [78]:
time_b - time_a

Timedelta('14057 days 17:27:45.779497')

In [79]:
time_a - time_b

Timedelta('-14058 days +06:32:14.220503')
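
The negative result reads oddly at first: pandas keeps the day count negative and the clock part positive, so `-14058 days +06:32:14` is 14058 days back and then about six and a half hours forward. The sketch below reproduces this with a fixed timestamp standing in for `pd.Timestamp.now()` (an assumption, so that the numbers are repeatable).

```python
import pandas as pd

time_a = pd.Timestamp("1984-06-27")
time_b = pd.Timestamp("2022-12-22 17:27:45")   # fixed stand-in for pd.Timestamp.now()

negative = time_a - time_b

print(negative)                  # negative day count, positive clock part
print(negative.days, negative.seconds)
print(abs(negative))             # same magnitude as time_b - time_a
print(negative.total_seconds())  # one signed number, often easier to work with
```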

In [80]:
pd.Timedelta(days=3)
pd.Timedelta(days=3, hours=14, minutes=45) # no `years` keyword; the largest unit is weeks
pd.Timedelta("5 minutes")
pd.Timedelta("14 days 6 hours 12 minutes 49 seconds") # year units are not accepted in strings either

Timedelta('3 days 00:00:00')

Timedelta('3 days 14:45:00')

Timedelta('0 days 00:05:00')

Timedelta('14 days 06:12:49')
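
A `Timedelta` is a fixed duration, which is why it has no notion of a year. Two related points, sketched below: `weeks` is accepted as a keyword, and calendar-aware shifts such as "one year later" can go through `pd.DateOffset` instead. The leap-day date is only an illustration.

```python
import pandas as pd

two_weeks = pd.Timedelta(weeks=2)                  # weeks is a valid keyword
span = pd.Timedelta(days=3, hours=14, minutes=45)

print(two_weeks)
print(span.components)         # named fields: days, hours, minutes, ...
print(span.total_seconds())

# A calendar shift of one year uses pd.DateOffset, not Timedelta
leap_day = pd.Timestamp("2020-02-29")
one_year_later = leap_day + pd.DateOffset(years=1)
print(one_year_later.date())
```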

<a id='16'></a>
### 16. Timedeltas in a Dataset

In [81]:
shipping = pd.read_csv("ecommerce.csv", index_col="ID", 
                       parse_dates=["order_date", "delivery_date"])
shipping.head(n=3)

ID,order_date,delivery_date
1,1998-05-24,1999-02-05
2,1992-04-22,1998-03-06
4,1991-02-10,1992-08-26


In [82]:
shipping["delivery_time"] = shipping["delivery_date"] - shipping["order_date"]
shipping.head(n=3)

ID,order_date,delivery_date,delivery_time
1,1998-05-24,1999-02-05,257 days
2,1992-04-22,1998-03-06,2144 days
4,1991-02-10,1992-08-26,563 days
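
The `delivery_time` column holds `Timedelta` values. When plain numbers are more convenient, for plotting or aggregation for example, `.dt.days` gives whole days and division by a `Timedelta` gives a float in any unit. The sketch rebuilds the three rows shown above instead of reading `ecommerce.csv`; the `delivery_days` and `delivery_weeks` column names are made up for the example.

```python
import pandas as pd

# Three rows copied from the shipping.head() output above
sample = pd.DataFrame(
    {
        "order_date": pd.to_datetime(["1998-05-24", "1992-04-22", "1991-02-10"]),
        "delivery_date": pd.to_datetime(["1999-02-05", "1998-03-06", "1992-08-26"]),
    },
    index=pd.Index([1, 2, 4], name="ID"),
)
sample["delivery_time"] = sample["delivery_date"] - sample["order_date"]

# .dt.days gives whole days as integers; dividing by a Timedelta gives floats
sample["delivery_days"] = sample["delivery_time"].dt.days
sample["delivery_weeks"] = sample["delivery_time"] / pd.Timedelta(weeks=1)

print(sample)
```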


In [83]:
shipping["twice as long"] = shipping["delivery_date"] + shipping["delivery_time"]
shipping.head(n=3)
shipping.info()

ID,order_date,delivery_date,delivery_time,twice as long
1,1998-05-24,1999-02-05,257 days,1999-10-20
2,1992-04-22,1998-03-06,2144 days,2004-01-18
4,1991-02-10,1992-08-26,563 days,1994-03-12


<class 'pandas.core.frame.DataFrame'>
Int64Index: 501 entries, 1 to 997
Data columns (total 4 columns):
 #   Column         Non-Null Count  Dtype          
---  ------         --------------  -----          
 0   order_date     501 non-null    datetime64[ns] 
 1   delivery_date  501 non-null    datetime64[ns] 
 2   delivery_time  501 non-null    timedelta64[ns]
 3   twice as long  501 non-null    datetime64[ns] 
dtypes: datetime64[ns](3), timedelta64[ns](1)
memory usage: 19.6 KB


In [84]:
shipping[shipping["delivery_time"] > "365 days"].sort_values("delivery_time", ascending=False)

ID,order_date,delivery_date,delivery_time,twice as long
884,1990-01-20,1999-11-12,3583 days,2009-09-03
314,1990-03-07,1999-12-25,3580 days,2009-10-13
904,1990-02-13,1999-11-15,3562 days,2009-08-16
130,1990-04-02,1999-08-16,3423 days,2008-12-29
331,1990-09-18,1999-12-19,3379 days,2009-03-20
...,...,...,...,...
326,1998-05-12,1999-05-29,382 days,2000-06-14
445,1993-02-11,1994-02-24,378 days,1995-03-09
76,1997-05-26,1998-06-05,375 days,1999-06-15
457,1991-06-17,1992-06-18,367 days,1993-06-20
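
The filter above compares a timedelta column against the string `"365 days"`, which pandas parses into a `Timedelta` before comparing. Writing the threshold as an explicit `pd.Timedelta` gives the same mask and makes the unit visible. A small sketch using three delivery times copied from the outputs above:

```python
import pandas as pd

# Delivery times copied from the outputs above (IDs 1, 457 and 884)
delivery_time = pd.Series(
    pd.to_timedelta([257, 367, 3583], unit="D"),
    index=pd.Index([1, 457, 884], name="ID"),
    name="delivery_time",
)

threshold = pd.Timedelta(days=365)        # explicit version of "365 days"
over_a_year = delivery_time[delivery_time > threshold]

print(over_a_year.sort_values(ascending=False))
```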


In [85]:
shipping[shipping["delivery_time"] == "3583 days"].sort_values("delivery_time", ascending=False)

ID,order_date,delivery_date,delivery_time,twice as long
884,1990-01-20,1999-11-12,3583 days,2009-09-03


In [86]:
shipping["delivery_time"].max()
shipping["delivery_time"].min()

Timedelta('3583 days 00:00:00')

Timedelta('8 days 00:00:00')
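
Beyond `max()` and `min()`, the usual reductions also work on a timedelta column. The sketch below uses the three delivery times from the first rows of `shipping` (not the full file), so its results differ from the outputs above.

```python
import pandas as pd

# Delivery times of IDs 1, 2 and 4, copied from the shipping.head() output
delivery_time = pd.Series(
    pd.to_timedelta([257, 2144, 563], unit="D"),
    index=pd.Index([1, 2, 4], name="ID"),
    name="delivery_time",
)

longest_id = delivery_time.idxmax()                  # index label of the slowest delivery
average = delivery_time.mean()                       # the mean is itself a Timedelta
spread = delivery_time.max() - delivery_time.min()   # range between slowest and fastest

print(longest_id, average, spread)
```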