
# Day 18 - Handling Date and Time Data



## Why is Handling Date and Time Data Important?
Dates and times are fundamental components of many datasets. Whether you're tracking website traffic, sales, or other metrics over time, understanding how to work with date and time data is crucial for accurate analysis. Pandas provides powerful tools to handle, manipulate, and analyze time-series data, allowing you to gain insights from temporal patterns.



## Tutorial: Working with Dates and Times in DataFrames
Pandas makes it easy to work with dates and times, providing a variety of functions for parsing, formatting, and extracting components from datetime objects. Let's go through some practical examples to illustrate these techniques.


In [None]:
!pip install pandas

In [None]:

import pandas as pd

# Example DataFrame with date strings
data = {
    'Event': ['Event A', 'Event B', 'Event C', 'Event D'],
    'Date': ['2023-01-01', '2023-03-15', '2023-07-04', '2023-10-20']
}
df = pd.DataFrame(data)

# Converting the 'Date' column to datetime
df['Date'] = pd.to_datetime(df['Date'])

# Display the DataFrame with parsed dates
print("DataFrame with parsed dates:")
print(df)
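
Real-world date strings are often not in ISO format and may contain invalid entries. The sketch below is a minimal illustration, assuming a hypothetical `raw` Series of day/month/year strings; it passes an explicit `format` to `pd.to_datetime` and uses `errors='coerce'` so unparseable values become `NaT` rather than raising.

```python
import pandas as pd

# Hypothetical raw strings in day/month/year order, including one invalid entry
raw = pd.Series(['01/02/2023', '15/03/2023', 'not a date'])

# An explicit format removes day/month ambiguity; errors='coerce' turns
# unparseable values into NaT instead of raising an exception
parsed = pd.to_datetime(raw, format='%d/%m/%Y', errors='coerce')
print(parsed)

# Format the parsed dates back into strings for display
formatted = parsed.dt.strftime('%Y/%m/%d')
print(formatted)
```

Checking `parsed.isna().sum()` after a coerced parse is a quick way to see how many rows failed to convert.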



### Extracting Date Components
You can easily extract components like the year, month, day, or even the day of the week from datetime objects. This is useful for analyzing trends or grouping data by specific time periods.


In [None]:

# Extracting components from the date
df['Year'] = df['Date'].dt.year
df['Month'] = df['Date'].dt.month
df['Day of Week'] = df['Date'].dt.day_name()

print("DataFrame with extracted date components:")
print(df)
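
Once components are extracted, they can serve directly as grouping keys. The following is a small sketch using a hypothetical `events` DataFrame with an illustrative `Attendees` column (neither is part of the example above); it contrasts grouping by month number with grouping by a monthly period.

```python
import pandas as pd

# Hypothetical event log with an illustrative 'Attendees' column
events = pd.DataFrame({
    'Date': pd.to_datetime(['2023-01-01', '2023-01-20', '2023-03-15', '2023-03-30']),
    'Attendees': [120, 80, 200, 100],
})

# Group by the calendar month number (1-12) extracted with the .dt accessor
by_month = events.groupby(events['Date'].dt.month)['Attendees'].sum()
print(by_month)

# Grouping by a monthly period keeps the year, so the same month in
# different years would stay in separate groups
by_period = events.groupby(events['Date'].dt.to_period('M'))['Attendees'].sum()
print(by_period)
```

Grouping by month number alone merges the same month across years, which is useful for seasonality but not for a chronological trend.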



## Use Case: Analyzing Time-Series Data with Google Trends
For this use case, we'll use Google Trends data to analyze the popularity of the search term "Data Science" over time. We'll fetch the data using `pytrends`, an unofficial Python client for Google Trends, and then perform time-series analysis.



### Step 1: Fetching Google Trends Data
We'll use the `pytrends` library to download time-series data related to specific search terms. First, we'll need to install the `pytrends` library.


In [None]:
!pip install pytrends

In [None]:

from pytrends.request import TrendReq
import pandas as pd

# Initialize pytrends request
pytrends = TrendReq(hl='en-US', tz=360)

# Define the search term and fetch data
search_term = "Data Science"
pytrends.build_payload([search_term], cat=0, timeframe='2010-01-01 2024-08-19', geo='', gprop='')
trends_df = pytrends.interest_over_time()

# Display the first few rows of the dataset
print("First few rows of Google Trends data:")
print(trends_df.head())



### Step 2: Parsing and Handling Dates
The dates come back as the DataFrame's index rather than as a regular column. We'll move them into a `date` column and then extract the components we need for analysis.


In [None]:

# Move the date index into a regular 'date' column
trends_df.reset_index(inplace=True)

# Extracting components from the date
trends_df['Year'] = trends_df['date'].dt.year
trends_df['Month'] = trends_df['date'].dt.month
trends_df['Day of Week'] = trends_df['date'].dt.day_name()

print("Google Trends data with extracted date components:")
print(trends_df.head())
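
Because the `.dt` accessor only works on datetime columns, it can help to verify the index type before resetting it. This sketch uses a synthetic stand-in rather than a live request; the shape of `sample` (a `DatetimeIndex` named `date` plus one column per search term) is an assumption about what the fetch step returns, and the values are made up.

```python
import pandas as pd

# Synthetic stand-in for the fetched data (assumed shape: a DatetimeIndex
# named 'date' plus one column per search term); values are made up
sample = pd.DataFrame(
    {'Data Science': [10, 12, 15]},
    index=pd.date_range('2010-01-01', periods=3, freq='MS', name='date'),
)

# Check that the index is already datetime before relying on .dt
is_datetime_index = isinstance(sample.index, pd.DatetimeIndex)
print("Index is datetime:", is_datetime_index)

# reset_index() moves the index into a regular 'date' column
flat = sample.reset_index()
print(flat.dtypes)
```

If the check fails on real data, `pd.to_datetime(sample.index)` is one way to convert the index first.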



### Step 3: Analyzing Trends Over Time
We’ll group the data by year and month to see how interest in the search term changes over time.


In [None]:

# Make sure the grouping keys are plain integers
trends_df['Year'] = trends_df['Year'].astype(int)
trends_df['Month'] = trends_df['Month'].astype(int)

# Grouping by year and month, then calculating the mean for the 'Data Science' column
monthly_trends = trends_df.groupby(['Year', 'Month'])['Data Science'].mean().reset_index()

print("Monthly trend summary for Data Science:")
print(monthly_trends.head())
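
An alternative to grouping on separate `Year` and `Month` columns is `resample`, which works directly on a datetime index. The sketch below assumes a synthetic weekly series (`weekly`, with made-up values) in place of the real trends data.

```python
import pandas as pd

# Synthetic weekly series standing in for the trends data (values are made up)
idx = pd.date_range('2023-01-01', periods=8, freq='W')
weekly = pd.DataFrame({'Data Science': [10, 20, 30, 40, 50, 60, 70, 80]}, index=idx)

# resample('MS') buckets rows by calendar month and labels each bucket
# with the first day of that month
monthly = weekly['Data Science'].resample('MS').mean()
print(monthly)
```

The result keeps a datetime index, so it can be plotted without rebuilding a date column from year and month.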


In [None]:
!pip install matplotlib

In [None]:
import matplotlib.pyplot as plt

monthly_trends['Date'] = pd.to_datetime(monthly_trends[['Year', 'Month']].assign(Day=1))

# Plotting the line graph
plt.figure(figsize=(10, 6))
plt.plot(monthly_trends['Date'], monthly_trends['Data Science'], marker='o', linestyle='-')
plt.title('Monthly Average Trend for Data Science Since 2010')
plt.xlabel('Date')
plt.ylabel('Average Trend')
plt.grid(True)
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

In [None]:
!pip install scikit-learn

In [None]:
import numpy as np
from sklearn.linear_model import LinearRegression

# Rebuild the 'Date' column so this cell also works if the plotting cell was skipped
monthly_trends['Date'] = pd.to_datetime(monthly_trends[['Year', 'Month']].assign(Day=1))

# Convert datetime to ordinal for linear regression
monthly_trends['Date_ordinal'] = monthly_trends['Date'].map(pd.Timestamp.toordinal)

# Reshape for sklearn
X = monthly_trends['Date_ordinal'].values.reshape(-1, 1)
y = monthly_trends['Data Science'].values

# Fit the linear regression model
model = LinearRegression()
model.fit(X, y)

# Predict the trendline
trendline = model.predict(X)
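
Since the regression input is a date ordinal, which counts days, the fitted slope is a change per day. The sketch below shows one way to convert that into a change per year. It runs on synthetic data and swaps in `np.polyfit` for `LinearRegression` purely to keep the example dependency-light; with the model above, the equivalent per-day slope would be `model.coef_[0]`.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series rising by 1 point per month (made-up values)
dates = pd.date_range('2020-01-01', periods=24, freq='MS')
values = np.arange(24, dtype=float)

# Same ordinal conversion as in the cell above: proleptic Gregorian day numbers
ordinals = np.array([d.toordinal() for d in dates], dtype=float)

# Centering x keeps the fit well-conditioned and does not change the slope
slope_per_day, intercept = np.polyfit(ordinals - ordinals.mean(), values, 1)

# Ordinals count days, so the slope is in points per day
slope_per_year = slope_per_day * 365.25
print(f"Estimated change: {slope_per_year:.2f} points per year")
```

Reporting the slope per year usually makes the trendline easier to interpret than a per-day figure.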

In [None]:
# Plotting the monthly data together with the fitted trendline
plt.figure(figsize=(10, 6))
plt.plot(monthly_trends['Date'], monthly_trends['Data Science'], marker='o', linestyle='-', label='Data Science Trend')
plt.plot(monthly_trends['Date'], trendline, color='red', linestyle='--', label='Trendline')
plt.title('Monthly Average Trend for Data Science Since 2010')
plt.xlabel('Date')
plt.ylabel('Average Trend')
plt.grid(True)
plt.xticks(rotation=45)
plt.legend()
plt.tight_layout()
plt.show()


## Conclusion
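In today’s post, we explored how to handle date and time data in Pandas: parsing date strings, extracting components such as year, month, and day of the week, and grouping by time period. We then applied these techniques to Google Trends data fetched with the `pytrends` library, plotting monthly averages for the search term "Data Science" and fitting a linear trendline to summarize how its popularity has changed over time.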
