<a href="https://colab.research.google.com/github/EthanDuog/Datetime_Pandas/blob/main/Datetime_Pandas.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Datetime_Pandas
How Pandas handles dates and times through the DateTime data type.

**CLEAR EXPLANATION THROUGH BLOGGING: [Working with date and time in Pandas](https://medium.com/@ethan.duong1120/working-with-date-and-time-in-pandas-data-science-journey-fc6d599ea90a)**

By: Ethan Duong

Date: 04/02/2023


Dates and times are an essential part of many data sets and can play a critical role in data analysis. In this blog, we'll explore how to work with dates and times in Pandas, one of the most widely used data analysis libraries in Python.

First, let's cover the basics. In Pandas, a single date and time is represented by a Timestamp, and a sequence of them by a DatetimeIndex, a type of Index designed specifically to store dates and times. To create a Timestamp, we can use pandas.Timestamp, which accepts a variety of inputs, including strings and integers. To parse a wide range of date and time formats, we can use the pandas.to_datetime function: it returns a Timestamp for a single value, a DatetimeIndex for a list, and a datetime Series for a Series such as a DataFrame column.
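
As a small sketch of these basics, the snippet below builds a Timestamp and shows what to_datetime returns for a list and for a Series. The dates and the variable names (`ts`, `idx`, `s`) are made up for illustration.

```python
import pandas as pd

# A single point in time: pd.Timestamp accepts a date string.
ts = pd.Timestamp("2023-04-02 09:30:00")
print(ts.year, ts.month, ts.day)

# A list of date strings becomes a DatetimeIndex.
idx = pd.to_datetime(["2023-04-02", "2023-04-03"])
print(type(idx).__name__)

# A Series of date strings stays a Series, now with a datetime dtype.
s = pd.to_datetime(pd.Series(["2023-04-02", "2023-04-03"]))
print(s.dtype)
```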

# Outline:

1. Convert string to datetime and handle missing values.
2. Assemble datetime from multiple columns.
3. Extract year, month, and day from a date column.
4. Select data between two dates.
5. Calculate the duration between two dates.
6. Select data with a specific year and perform aggregations.


**1. Convert string to datetime and handle missing values**

In [None]:
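import pandas as pd

df = pd.DataFrame({'date': ['2016-6-10 20:30:0',
                            '2016-7-1 19:45:30',
                            '2013-10-12 4:5:1'],
                   'value': [2, 3, 4]})
df

# Let pandas work out the format; dayfirst=False reads ambiguous dates month-first
df['date'] = pd.to_datetime(df['date'], dayfirst=False)

# If the strings were written year-day-month instead, give the format explicitly
# (use this in place of the call above, on the original strings)
df['date'] = pd.to_datetime(df['date'], format="%Y-%d-%m %H:%M:%S")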

Until it is converted, the date column still holds strings. The to_datetime function converts it to the datetime data type.
If the strings do not follow the usual year-month-day order, you can pass a custom date format.
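
A minimal sketch of a custom format, assuming day-first strings such as `10/06/2016`. The sample values and the names `raw` and `parsed` are illustrative only.

```python
import pandas as pd

# Made-up strings written as day/month/year.
raw = pd.Series(["10/06/2016 20:30:00", "01/07/2016 19:45:30"])

# Spell out the order of the parts so 10/06 is read as 10 June, not 6 October.
parsed = pd.to_datetime(raw, format="%d/%m/%Y %H:%M:%S")

print(parsed.dt.month.tolist())  # [6, 7]
print(parsed.dt.day.tolist())    # [10, 1]
```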

In [None]:
# Handle null or invalid values
# Coerce them: anything that cannot be parsed becomes NaT (the datetime null).
# errors='ignore' would return the strings unchanged instead, and it is
# deprecated in recent pandas versions.
df['date'] = pd.to_datetime(df['date'], errors='coerce')

# Eliminate them (assuming there are null values in the date column):
df = df.dropna(subset=['date'])
# To drop null values in multiple columns, add more column names
# to subset (besides date)
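
To see the effect on data that really contains bad entries, here is a small sketch. The `messy` DataFrame and its values are invented for this example.

```python
import pandas as pd

# Made-up data: one valid date, one unparseable string, one missing value.
messy = pd.DataFrame({"date": ["2016-06-10", "not a date", None],
                      "value": [2, 3, 4]})

# Unparseable or missing entries become NaT instead of raising an error.
messy["date"] = pd.to_datetime(messy["date"], errors="coerce")
print(messy["date"].isna().sum())  # 2

# NaT counts as null, so dropna removes those rows.
clean = messy.dropna(subset=["date"])
print(len(clean))  # 1
```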

**2. Assemble datetime from multiple columns.**

In [None]:
df = pd.DataFrame({'id': ['1', '2', '3', '4'],
                   'name': ['Ethan', 'Alison', 'Jolie', 'nick'],
                   'date': ['2022-01-01', '2022-01-02', '2022-01-03', '2022-01-04'],
                   'time': ['12:00:00', '13:00:00', '14:00:00', '15:00:00']})

# Join the date and time strings with a space, then parse the result
df['datetime'] = pd.to_datetime(df['date'] + ' ' + df['time'])
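
When the parts are stored as numbers rather than strings, to_datetime can also assemble them directly from columns named year, month, day (and optionally hour, minute, second). A minimal sketch, with a made-up `parts` DataFrame:

```python
import pandas as pd

# Hypothetical data with one column per datetime component.
parts = pd.DataFrame({"year": [2022, 2022],
                      "month": [1, 1],
                      "day": [1, 2],
                      "hour": [12, 13]})

# The column names tell pandas which component each column holds.
parts["datetime"] = pd.to_datetime(parts[["year", "month", "day", "hour"]])

print(parts["datetime"].iloc[0])  # 2022-01-01 12:00:00
```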


**3. Extract year, month, and day from a date column.**

In [None]:
df['year'] = df['datetime'].dt.year
df['month'] = df['datetime'].dt.month
df['day'] = df['datetime'].dt.day
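
The same `.dt` accessor exposes other components as well. A short sketch using two made-up timestamps (the name `stamps` is illustrative):

```python
import pandas as pd

stamps = pd.Series(pd.to_datetime(["2022-01-01 12:00:00",
                                   "2022-01-02 13:00:00"]))

print(stamps.dt.hour.tolist())        # [12, 13]
print(stamps.dt.day_name().tolist())  # ['Saturday', 'Sunday']
print(stamps.dt.dayofweek.tolist())   # [5, 6], where Monday is 0
```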

**4. Select data between two dates.**

In [None]:
start_date = '2022-01-02'
end_date = '2022-01-04'

mask = (df['datetime'] >= start_date) & (df['datetime'] <= end_date)
result = df.loc[mask]

print(result)
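
One detail worth checking: a date-only bound such as '2022-01-04' is treated as midnight, so a row stamped later that day does not satisfy `<= end_date`. The sketch below, using a made-up `events` DataFrame, compares the plain mask with one that keeps the whole end day.

```python
import pandas as pd

events = pd.DataFrame({"datetime": pd.to_datetime(["2022-01-01 12:00:00",
                                                   "2022-01-02 13:00:00",
                                                   "2022-01-03 14:00:00",
                                                   "2022-01-04 15:00:00"])})

start = pd.Timestamp("2022-01-02")
end = pd.Timestamp("2022-01-04")  # midnight at the start of 4 January

# 15:00 on 4 January is later than midnight, so it is left out here.
strict = events[(events["datetime"] >= start) & (events["datetime"] <= end)]

# Compare against the start of the following day to keep the whole end date.
whole_day = events[(events["datetime"] >= start)
                   & (events["datetime"] < end + pd.Timedelta(days=1))]

print(len(strict), len(whole_day))  # 2 3
```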

**5. Calculate the duration between two dates.**

In [None]:
df['second_datetime'] = pd.to_datetime('2022-01-06 12:00:00')
df['duration'] = df['second_datetime'] - df['datetime']
df['duration_days'] = df['duration'].dt.days
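
Note that `.dt.days` keeps only the whole days of each duration. If the leftover hours matter, `.dt.total_seconds()` is one way to keep them. A minimal sketch with a made-up `trips` DataFrame:

```python
import pandas as pd

trips = pd.DataFrame({"start": pd.to_datetime(["2022-01-01 12:00:00",
                                               "2022-01-02 13:00:00"])})
trips["end"] = pd.to_datetime("2022-01-06 12:00:00")
trips["duration"] = trips["end"] - trips["start"]

# Whole days only: 3 days 23 hours is reported as 3.
trips["days"] = trips["duration"].dt.days

# Total length in hours, including the part beyond whole days.
trips["hours"] = trips["duration"].dt.total_seconds() / 3600

print(trips["days"].tolist())   # [5, 3]
print(trips["hours"].tolist())  # [120.0, 95.0]
```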


**6. Select data with a specific year and perform aggregations.**

In [None]:
# Convert candy to numeric and birth_day to datetime
# (this assumes df has 'candy' and 'birth_day' columns)
df['candy'] = pd.to_numeric(df['candy'])
df['birth_day'] = pd.to_datetime(df['birth_day'])


In [None]:
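# Get the year from birth_day
year_born = df['birth_day'].dt.year

# Keep people born in 2022, group by birth year, then sum the numeric columns
# (numeric_only=True skips columns such as birth_day that cannot be summed)
df_day = df[year_born == 2022].groupby(year_born).sum(numeric_only=True)
df_day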

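This aggregation chains three steps, so it helps to read it piece by piece: the boolean mask `year_born == 2022` keeps only the matching rows, `groupby(year_born)` groups those rows by birth year, and `sum()` totals the values within each group.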
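
The same steps can be sketched on a small self-contained table. The `kids` DataFrame, its names, and its values are invented for this example, since the data used above is not shown here.

```python
import pandas as pd

# Hypothetical data: candy counts stored as strings, birthdays as strings.
kids = pd.DataFrame({"name": ["A", "B", "C"],
                     "candy": ["3", "5", "2"],
                     "birth_day": ["2022-03-01", "2022-11-15", "2021-06-30"]})

kids["candy"] = pd.to_numeric(kids["candy"])
kids["birth_day"] = pd.to_datetime(kids["birth_day"])

# Step 1: keep only the rows with a 2022 birthday.
born_2022 = kids[kids["birth_day"].dt.year == 2022]

# Step 2: group by birth year. Step 3: total the candy column per group.
totals = born_2022.groupby(born_2022["birth_day"].dt.year)["candy"].sum()

print(totals.loc[2022])  # 8
```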

Thank you for being here. I hope this helps you in some way.

You can contact me through email: ethan.duong1120@gmail.com

I will see you soon!