# **Introduction to Dates & Time with pandas**

This Jupyter notebook can be found on my GitHub account: https://github.com/mbonnemaison/Learning-Python
### **pandas** is a Python library for analyzing data organized in tables.

### Sources:
- pandas installation instructions, introduction and user guide: https://pandas.pydata.org/pandas-docs/stable/getting_started/index.html
- Python for Data Analysis by Wes McKinney (2nd edition used here) - Chapter 5 (Introduction), Chapter 11 (Time Series)
- Video on Data Analysis (see the comments for links to the part you're interested in): https://www.youtube.com/watch?v=r-uOLxNrNk8&list=RDCMUC8butISFwT-Wl7EV0hUK0BQ&index=3

## Introduction to Time & Dates
Some of the elementary data structures for working with date & time data are:

- **Timestamps**: specific instants in time.
- **Timedeltas**: durations, i.e. the difference between two timestamps (for example, 3 days and 4 hours).

### **Timestamp**
Python provides date and time functionality in the **datetime** module, which contains three commonly used classes:

- **date class**: to work with dates (day, month, year)
- **time class**: to work with times (hours, minutes, seconds, microseconds)
- **datetime class**: to work with both date and time components

***Timestamp*** is the pandas equivalent of Python's datetime.datetime object and is interchangeable with it in most cases. It is the type used for the entries that make up a DatetimeIndex and other time-series-oriented data structures in pandas.
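As a small illustration of that interchangeability, the sketch below builds the same instant as a pandas Timestamp and as a standard-library datetime and compares them. The variable names are only illustrative.

```python
from datetime import datetime

import pandas as pd

ts = pd.Timestamp("2021-04-04 21:12:53")
dt = datetime(2021, 4, 4, 21, 12, 53)

# Timestamp is a subclass of datetime.datetime
is_datetime = isinstance(ts, datetime)

# The two objects compare equal because they describe the same instant
same_instant = ts == dt

# Convert back to a plain datetime when another library requires one
back_to_datetime = ts.to_pydatetime()
```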

### **What time is it now?**

In [1]:
import pandas as pd

In [2]:
now = pd.to_datetime('now')

In [3]:
now

Timestamp('2021-04-04 21:12:53.164645')

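In [4]:
# tz_localize attaches a time zone to the naive timestamp (the clock time is unchanged)
now_eastern = now.tz_localize('US/Eastern')

In [5]:
now_eastern

Timestamp('2021-04-04 21:12:53.164645-0400', tz='US/Eastern')

In [7]:
# tz_convert expresses the same instant in another time zone
now_pacific = now_eastern.tz_convert('US/Pacific')

In [8]:
now_pacific

Timestamp('2021-04-04 18:12:53.164645-0700', tz='US/Pacific')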
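The difference between `tz_localize` and `tz_convert` is easier to see with a fixed timestamp. The sketch below assumes the time zone database is available on the system; the variable names are illustrative.

```python
import pandas as pd

naive = pd.Timestamp("2021-04-04 21:12:53")

# tz_localize: declare which zone the naive clock time belongs to
eastern = naive.tz_localize("US/Eastern")

# tz_convert: same instant, shown on a different clock
pacific = eastern.tz_convert("US/Pacific")
utc = eastern.tz_convert("UTC")

# All three objects are equal because they are the same instant
same_instant = eastern == pacific == utc
```

Localizing keeps the hour at 21, while converting shifts it (18 in Pacific time, 01 on the next day in UTC).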

### **Convert strings to Datetimes**
Strings can be converted to dates using **pd.to_datetime**.

Note: Information on format codes can be found here: https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior

In [None]:
pd.to_datetime('2021-02-19 22:45:56')

In [None]:
# The format must describe the whole string, including the time part
pd.to_datetime('2021-02-19 22:45:56', format = '%Y-%m-%d %H:%M:%S')
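The `format` argument matters most when a string is not in ISO order. A minimal sketch, using a made-up day-first string as the second example:

```python
import pandas as pd

# ISO-style string with an explicit, complete format
iso = pd.to_datetime("2021-02-19 22:45:56", format="%Y-%m-%d %H:%M:%S")

# Day-first string (hypothetical input): the format removes any ambiguity
european = pd.to_datetime("19/02/2021 22:45", format="%d/%m/%Y %H:%M")
```

Recent pandas versions are strict about the format covering the entire string, so a date-only format applied to a string that also contains a time is likely to raise a `ValueError`.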

### **Convert a list of dates from string to datetime or Timestamp**

In [None]:
date_list_str = ['2021-03-14', '2020-12-25', '2025-02-19']

In [None]:
pd.to_datetime(date_list_str)
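Converting a list returns a DatetimeIndex rather than a single Timestamp, which gives vectorized access to date components. A short sketch reusing the same three dates:

```python
import pandas as pd

date_list_str = ["2021-03-14", "2020-12-25", "2025-02-19"]
dates = pd.to_datetime(date_list_str)

# Component access works on the whole index at once
years = dates.year
day_names = dates.day_name()

# Min/max and sorting behave chronologically, not alphabetically
earliest = dates.min()
ordered = dates.sort_values()
```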

### **Dealing with missing values**

In [None]:
date_list_str2 = ['2021-03-14', '2020-12-25', '2025-02-19', '2021-04-14', None]

In [None]:
pd.to_datetime(date_list_str2)

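**NaT** means "Not a Time". It is the pandas marker for a missing datetime value, and it replaces `None` in the result above.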
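Strings that cannot be parsed can also be turned into NaT instead of raising an error. The sketch below uses `errors="coerce"` for that; the invalid string is a made-up example.

```python
import pandas as pd

raw = ["2021-03-14", "not a date", None]

# errors="coerce" turns unparseable entries into NaT instead of raising
parsed = pd.to_datetime(raw, errors="coerce")

# isna() flags NaT entries, so they can be counted or filtered out
missing_mask = parsed.isna()
n_missing = int(missing_mask.sum())
valid_only = parsed[~missing_mask]
```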

### **Reading data from a csv file using pandas**
More information on data here: https://github.com/mbonnemaison/adelego

In [None]:
data = pd.read_csv("data3months.csv", sep = '\t')

In [None]:
data

In [None]:
data.info()

In [None]:
data['Date']

In [None]:
pd.Series([1,2,3,4,5])

### **Convert "Date" from string to timestamp**

In [None]:
# Convert the "Date" column of the DataFrame loaded above
data["Date"] = pd.to_datetime(data["Date"])

In [None]:
data["Date"][0]
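Since the CSV file is not included here, the same conversion can be tried on a small inline table. The column names `Type` and `Value` and all values below are hypothetical stand-ins for the real file.

```python
import io

import pandas as pd

# Hypothetical tab-separated data standing in for the real file
raw = (
    "Date\tType\tValue\n"
    "2021-02-01 00:05:00\tHUMIDITY\t41.2\n"
    "2021-02-01 00:10:00\tTEMPERATURE\t20.5\n"
)
df = pd.read_csv(io.StringIO(raw), sep="\t")

# Right after reading, "Date" holds plain strings
dtype_before = df["Date"].dtype

# After conversion, each entry is a Timestamp
df["Date"] = pd.to_datetime(df["Date"])
dtype_after = df["Date"].dtype
first = df["Date"].iloc[0]
```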

***Missing values in DataFrame...***

In [None]:
data4 = pd.read_csv("data3months-Copy1.csv", sep = '\t')

In [None]:
data4.head(10)

In [None]:
data4.info()

In [None]:
data4["Date"] = pd.to_datetime(data4["Date"])

In [None]:
data4.info()

In [None]:
data4["Date"][33]
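A missing date in a DataFrame column becomes NaT after conversion, just as it does in a list. The sketch below uses a hypothetical inline table with one empty Date field and shows one way to count and drop such rows.

```python
import io

import pandas as pd

# Hypothetical data: the second row has an empty Date field
raw = (
    "Date\tType\n"
    "2021-02-01 00:05:00\tHUMIDITY\n"
    "\tHUMIDITY\n"
    "2021-02-01 00:15:00\tHUMIDITY\n"
)
df = pd.read_csv(io.StringIO(raw), sep="\t")
df["Date"] = pd.to_datetime(df["Date"])

# Count the rows whose date is missing (NaT)
n_missing = int(df["Date"].isna().sum())

# Keep only the rows that have a valid date
clean = df.dropna(subset=["Date"])
```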

### **Data manipulations with Timestamps in pandas**
**Select rows**

In [None]:
data.iloc[1:3]

In [None]:
data.loc[(data['Date'] > '2021-02-01') & (data['Date'] < '2021-02-02') & (data['Type'] == 'HUMIDITY')]
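The same kind of selection can be written with a named boolean mask, which is often easier to read. The table below is made up for illustration; only the column names `Date` and `Type` come from the example above.

```python
import pandas as pd

# Hypothetical measurements
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2021-01-31 23:50",
        "2021-02-01 08:00",
        "2021-02-01 12:00",
        "2021-02-02 00:10",
    ]),
    "Type": ["HUMIDITY", "HUMIDITY", "TEMPERATURE", "HUMIDITY"],
})

start = pd.Timestamp("2021-02-01")
end = pd.Timestamp("2021-02-02")

# Each condition needs its own parentheses because & binds tighter than comparisons
mask = (df["Date"] >= start) & (df["Date"] < end) & (df["Type"] == "HUMIDITY")
selected = df.loc[mask]
```

This only works as a date comparison once the column has been converted with `pd.to_datetime`; on a column of strings the comparison would be alphabetical.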

**Sort values**

In [None]:
data.sort_values(by = ["Date"], ascending=True)

### **Generate Timestamps at fixed frequency**
A *fixed-frequency* time series has data points at regular intervals, for example every 5 minutes.

In [None]:
tsff = pd.date_range(start = '1/1/2021', periods = 50, freq = '4h')

In [None]:
tsff
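A quick way to check what `pd.date_range` generated is to look at its length, step and end point. A minimal sketch with the same parameters:

```python
import pandas as pd

idx = pd.date_range(start="2021-01-01", periods=50, freq="4h")

# The step between consecutive entries is a Timedelta
step = idx[1] - idx[0]

# 50 points means 49 steps of 4 hours after the start
first, last = idx[0], idx[-1]
```

With 49 steps of 4 hours (196 hours in total), the last timestamp falls on 2021-01-09 at 04:00.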

## **Timedeltas**
A Timedelta represents a duration: the difference between two dates or times.

In [None]:
pd.Timedelta(weeks = 1, days = 4, hours = 5)
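Internally the duration is normalized to days plus a remainder, which the attributes below expose. A short sketch using the same Timedelta:

```python
import pandas as pd

td = pd.Timedelta(weeks=1, days=4, hours=5)

# Weeks are folded into days: 7 + 4 = 11
n_days = td.days

# .seconds is only the remainder below one day (5 hours here), not the total
remainder_seconds = td.seconds

# total_seconds() gives the full duration
total = td.total_seconds()

# .components breaks the duration down field by field
hours_part = td.components.hours
```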

### **Timedelta operations**
**Add time to (or subtract time from) Timestamps**: a negative Timedelta moves the Timestamp back in time

In [None]:
ts = pd.to_datetime('2021/3/23 23:20:00') + pd.Timedelta(days=-3)

In [None]:
ts

**Difference between Timestamps generates a Timedelta**

In [None]:
delta = pd.to_datetime('2021/3/23 23:20:00') - pd.to_datetime('2021/3/20 2:34:14')

In [None]:
delta
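The resulting Timedelta can be checked or converted to a single unit. The sketch below reuses the two timestamps from the example; the variable names are illustrative.

```python
import pandas as pd

start = pd.Timestamp("2021-03-20 02:34:14")
end = pd.Timestamp("2021-03-23 23:20:00")

# Timestamp - Timestamp gives a Timedelta
delta = end - start

# Dividing by a unit Timedelta expresses the duration as a plain number of hours
hours_elapsed = delta / pd.Timedelta(hours=1)

# Adding the difference back to the start recovers the end
round_trip = start + delta

# Subtracting a Timedelta moves a Timestamp back in time
three_days_earlier = end - pd.Timedelta(days=3)
```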

**Adding Timedeltas**

In [None]:
td1 = pd.Timedelta(weeks = 3, days = 3, hours = 3)
td2 = pd.Timedelta(weeks = 1, days = 1, hours = 1)

In [None]:
td1+td2
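Timedeltas support more arithmetic than addition. A brief sketch with the same two values:

```python
import pandas as pd

td1 = pd.Timedelta(weeks=3, days=3, hours=3)  # 24 days 3 hours
td2 = pd.Timedelta(weeks=1, days=1, hours=1)  # 8 days 1 hour

total = td1 + td2
difference = td1 - td2

# Multiplying by a number scales the duration
tripled = td2 * 3

# Dividing two Timedeltas gives a plain float ratio
ratio = td1 / td2
```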

### **Convert strings to Timedelta**

In [None]:
pd.to_timedelta('233:23:23')
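`pd.to_timedelta` accepts several input styles besides `HH:MM:SS`. The sketch below shows a few of them; the inputs other than `'233:23:23'` are made-up examples.

```python
import pandas as pd

# Hours above 24 roll over into days
long_hours = pd.to_timedelta("233:23:23")

# A string with an explicit day count
with_days = pd.to_timedelta("1 days 06:05:01")

# A number together with a unit
from_number = pd.to_timedelta(90, unit="min")

# A list of strings returns a TimedeltaIndex
several = pd.to_timedelta(["1h", "30min", "45s"])
```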