# 1. Extracting Components: Year, month, day, day of week, hour, minute

Extracting components from a Date/Time feature involves breaking down a single date or timestamp into its individual parts, such as the year, month, day, day of the week, hour, and minute. Each of these components can then be used as a separate numerical or categorical feature in a machine learning model.

Why Extract Date/Time Components?

1. Capture Temporal Patterns: Many phenomena follow patterns tied to specific times. For example, sales of certain apparel items might peak in particular months (e.g., before festivals), on certain days of the week (e.g., weekends), or at particular hours of the day.
2. Enable Non-Linear Relationships: While the original timestamp is a continuous variable, its relationship with the target variable might be non-linear. By extracting components, you allow the model to learn different effects for different months, days, or hours. For instance, the impact of a discount might be different on a weekday versus a weekend.
3. Simplify Complex Time Dependencies: Instead of the model having to learn complex patterns from a single timestamp, it can learn simpler relationships with individual time components.
4. Feature Engineering for Specific Models: Some models, especially those that don't inherently understand cyclical data (like linear models or tree-based models without special handling), can benefit from these extracted components.
5. Create Categorical Features: Components like month, day of the week, or hour can be treated as categorical features, allowing for different coefficients or splits for each category.
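
To make the last point concrete, here is a minimal sketch of treating month and day of week as categorical features through one-hot encoding. The three timestamps and the variable names (`ts`, `features`, `encoded`) are illustrative assumptions, and `pd.get_dummies` is only one of several possible encoders.

```python
import pandas as pd

# Hypothetical timestamps, used only for illustration
ts = pd.Series(pd.to_datetime([
    '2025-03-10 10:30:00',  # a Monday
    '2025-03-15 18:00:00',  # a Saturday
    '2025-04-01 12:00:00',  # a Tuesday
]))

features = pd.DataFrame({
    'Month': ts.dt.month,
    'DayOfWeek': ts.dt.dayofweek,  # Monday=0, Sunday=6
})

# One indicator column per category that actually occurs in the data
encoded = pd.get_dummies(features, columns=['Month', 'DayOfWeek'], dtype=int)
print(encoded)
```

With indicator columns like these, a linear model can fit a separate coefficient for each month or weekday instead of assuming that the effect grows steadily from 1 to 12 or from 0 to 6.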

How to Extract Date/Time Components:

Most programming libraries that handle dates and times (like Python's datetime module or Pandas) provide easy ways to access these components from a datetime object.
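
As a rough illustration without Pandas, the standard-library `datetime` module exposes the same components as attributes on a single `datetime` object. The timestamp string and the `components` dictionary are assumptions made for this sketch.

```python
from datetime import datetime

# Parse one timestamp string into a datetime object
ts = datetime.strptime('2025-03-10 14:45:30', '%Y-%m-%d %H:%M:%S')

components = {
    'year': ts.year,
    'month': ts.month,
    'day': ts.day,
    'day_of_week': ts.weekday(),  # Monday=0, Sunday=6
    'hour': ts.hour,
    'minute': ts.minute,
    'second': ts.second,
}
print(components)
```

If a 1-7 numbering is preferred for the day of the week, `ts.isoweekday()` returns Monday=1 through Sunday=7.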

Common Components and Their Potential Use:

1. Year: Can capture long-term trends and year-to-year level shifts. For example, sales might generally increase year over year.
2. Month: Captures seasonal patterns within a year (e.g., increased sales before Durga Puja in Kolkata). Can be treated as numerical (1-12) or categorical (January, February, etc.).
3. Day of the Month: Can capture patterns related to specific days (e.g., payday effects on spending). Numerical (1-31).
4. Day of the Week: Captures weekly patterns (e.g., higher website traffic or sales on weekends). Can be numerical (0-6 or 1-7) or categorical (Monday, Tuesday, etc.).
5. Hour: Captures daily patterns (e.g., peak browsing times for sports apparel). Numerical (0-23). Can also be binned into time periods (morning, afternoon, evening, night).
6. Minute: Can be relevant for very granular data or real-time systems, but often less important for daily or weekly trends. Numerical (0-59).
7. Second: Similar to minute, usually relevant for high-frequency data. Numerical (0-59).
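
The hour binning mentioned in item 5 could look something like the sketch below. The bin edges (night 0-5, morning 6-11, afternoon 12-17, evening 18-23) and the names `hours` and `period` are assumptions chosen for illustration; the right boundaries depend on the problem.

```python
import pandas as pd

# Hours of the day (0-23), used only for illustration
hours = pd.Series([10, 14, 9, 18, 12, 20])

# right=False makes each bin include its left edge: [0, 6), [6, 12), [12, 18), [18, 24)
period = pd.cut(
    hours,
    bins=[0, 6, 12, 18, 24],
    labels=['night', 'morning', 'afternoon', 'evening'],
    right=False,
)
print(period.tolist())
```

The result is a categorical Series, so it can be passed on to whichever categorical encoding the model needs.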

# Import necessary dependencies

In [1]:
import pandas as pd

# Create sample dataset

In [6]:
# Sample DataFrame with a Date/Time column (representing order timestamps)
data = pd.DataFrame({
    'OrderTimestamp': pd.to_datetime([
        '2025-03-10 10:30:00',
        '2025-03-10 14:45:30',
        '2025-03-11 09:15:10',
        '2025-03-15 18:00:00',
        '2025-04-01 12:00:00',
        '2025-04-05 20:30:00'
    ])
})

print("Original Data:")
data

Original Data:


index,OrderTimestamp
0,2025-03-10 10:30:00
1,2025-03-10 14:45:30
2,2025-03-11 09:15:10
3,2025-03-15 18:00:00
4,2025-04-01 12:00:00
5,2025-04-05 20:30:00


# Extracting temporal components

In [7]:
# Extracting components

data['Year'] = data['OrderTimestamp'].dt.year
data['Month'] = data['OrderTimestamp'].dt.month
data['Day'] = data['OrderTimestamp'].dt.day
data['DayOfWeek'] = data['OrderTimestamp'].dt.dayofweek  # Monday=0, Sunday=6
data['Hour'] = data['OrderTimestamp'].dt.hour
data['Minute'] = data['OrderTimestamp'].dt.minute

print("\nData with Extracted Date/Time Components:")
data


Data with Extracted Date/Time Components:


index,OrderTimestamp,Year,Month,Day,DayOfWeek,Hour,Minute
0,2025-03-10 10:30:00,2025,3,10,0,10,30
1,2025-03-10 14:45:30,2025,3,10,0,14,45
2,2025-03-11 09:15:10,2025,3,11,1,9,15
3,2025-03-15 18:00:00,2025,3,15,5,18,0
4,2025-04-01 12:00:00,2025,4,1,1,12,0
5,2025-04-05 20:30:00,2025,4,5,5,20,30


In [8]:
# Further transform these components if needed.
# Example: convert the numeric DayOfWeek (Monday=0, Sunday=6) to day names.
# Note: data['OrderTimestamp'].dt.day_name() gives the same names directly.

days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
data['DayOfWeekName'] = data['DayOfWeek'].apply(lambda x: days[x])
print("\nData with Day of Week Name:")
data



Data with Day of Week Name:


index,OrderTimestamp,Year,Month,Day,DayOfWeek,Hour,Minute,DayOfWeekName
0,2025-03-10 10:30:00,2025,3,10,0,10,30,Monday
1,2025-03-10 14:45:30,2025,3,10,0,14,45,Monday
2,2025-03-11 09:15:10,2025,3,11,1,9,15,Tuesday
3,2025-03-15 18:00:00,2025,3,15,5,18,0,Saturday
4,2025-04-01 12:00:00,2025,4,1,1,12,0,Tuesday
5,2025-04-05 20:30:00,2025,4,5,5,20,30,Saturday


If you had data on when users viewed or purchased items, extracting these components could help you understand:

1. Monthly Trends: Are certain types of sports apparel more popular in specific months (e.g., lighter clothing in summer)?
2. Weekly Patterns: Do more purchases happen on weekends when people have more leisure time?
3. Daily Peaks: Are there specific times of day when users are more likely to browse or buy sports apparel online?

By creating these individual features, your machine learning model can learn and leverage these temporal patterns to make better predictions (e.g., predicting sales volume, likelihood of purchase).
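
As a first look at weekly patterns, the extracted day of week can be summarized directly. The sketch below reuses the six sample timestamps from this section; the `IsWeekend` flag, which treats Saturday and Sunday as the weekend, and the names `orders`, `orders_per_day`, and `weekend_share` are assumptions for illustration.

```python
import pandas as pd

# Same sample timestamps as in the example above
orders = pd.DataFrame({
    'OrderTimestamp': pd.to_datetime([
        '2025-03-10 10:30:00',
        '2025-03-10 14:45:30',
        '2025-03-11 09:15:10',
        '2025-03-15 18:00:00',
        '2025-04-01 12:00:00',
        '2025-04-05 20:30:00',
    ])
})

orders['DayOfWeek'] = orders['OrderTimestamp'].dt.dayofweek  # Monday=0, Sunday=6
orders['IsWeekend'] = orders['DayOfWeek'] >= 5               # assumes a Saturday/Sunday weekend

# Number of orders on each day of the week that appears in the data
orders_per_day = orders.groupby('DayOfWeek').size()

# Fraction of orders placed on a weekend
weekend_share = orders['IsWeekend'].mean()

print(orders_per_day)
print(weekend_share)
```

Six rows are far too few to support any conclusion; the same pattern applied to a real order history would show whether weekends carry a disproportionate share of purchases.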