# Pandas datetime object - Timeseries Data Analysis

In Pandas, the term "datetime object" primarily refers to two related, yet distinct, entities:

* pd.Timestamp: This is the scalar (single point in time) representation. It's Pandas' equivalent of Python's built-in datetime.datetime object, but it's optimized for efficiency and interoperability within Pandas' data structures. A Timestamp represents a specific moment in time with nanosecond precision.
* pd.DatetimeIndex: This is the collection (sequence of times) representation. It's a specialized type of Pandas Index (like a list of labels for rows) where each label is a pd.Timestamp object. It's the cornerstone for time-series analysis in Pandas, allowing for powerful time-based operations on DataFrames and Series.
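The distinction between the two can be checked directly. The following is a minimal sketch; the variable names (`ts`, `idx`, `sales`, `jan`) and the sample values are made up for illustration and are not part of the notebook's data.

```python
import pandas as pd

# A single point in time (scalar)
ts = pd.Timestamp("2023-01-01 09:30")

# A sequence of points in time (index)
idx = pd.DatetimeIndex(["2023-01-01", "2023-01-02", "2023-02-01"])

print(type(ts).__name__)      # Timestamp
print(type(idx).__name__)     # DatetimeIndex
print(type(idx[0]).__name__)  # Timestamp: each label is a Timestamp

# One time-based operation a DatetimeIndex enables:
# selecting a whole month with a partial date string
sales = pd.Series([100, 105, 98], index=idx)
jan = sales.loc["2023-01"]
print(jan)
```

Here `jan` holds only the two January rows, because the string "2023-01" is matched against the index as a month rather than as an exact label.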



# 1. Using pd.to_datetime() to convert existing data

This is the most common way to create a DatetimeIndex when your date/time information is already present in a DataFrame column or a list of strings. Pandas is quite flexible in parsing various string formats.
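Flexible parsing is convenient, but when the layout of the strings is known it can be stated explicitly, and unparseable values can be turned into NaT instead of raising an error. The snippet below is a small sketch of that idea; the sample Series `raw` is hypothetical, and the `format` and `errors` arguments are optional parameters of pd.to_datetime().

```python
import pandas as pd

# Hypothetical input: two valid ISO dates and one bad value
raw = pd.Series(["2023-01-01", "2023-01-02", "not a date"])

# format= states the expected layout; errors="coerce" replaces
# values that do not match with NaT instead of raising
parsed = pd.to_datetime(raw, format="%Y-%m-%d", errors="coerce")

print(parsed)
print("Unparseable values:", parsed.isna().sum())
```

Whether to coerce or to let the error surface depends on the data: coercing silently hides bad rows, so counting the NaT values afterwards, as above, is a reasonable safeguard.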

### A. Scenario 1: Converting a column of strings to datetime and setting as index

In [52]:
import pandas as pd

# Scenario 1: Converting a column of strings to datetime and setting as index
print("--- Scenario 1: Converting a string column to DatetimeIndex ---")
data = {
    'Date_String': ['2023-01-01', '2023-01-02', '2023-01-03', '2023-01-04'],
    'Sales': [100, 105, 98, 110]
}
df1 = pd.DataFrame(data)
print("Original DataFrame (string dates):\n", df1)

# Convert 'Date_String' column to datetime objects and set as index
df1['Date_String'] = pd.to_datetime(df1['Date_String'])
df1 = df1.set_index('Date_String')
print("\nDataFrame with DatetimeIndex (Scenario 1):\n", df1)
print("Index type:", type(df1.index))
print("Index dtype:", df1.index.dtype)

--- Scenario 1: Converting a string column to DatetimeIndex ---
Original DataFrame (string dates):
   Date_String  Sales
0  2023-01-01    100
1  2023-01-02    105
2  2023-01-03     98
3  2023-01-04    110

DataFrame with DatetimeIndex (Scenario 1):
              Sales
Date_String       
2023-01-01     100
2023-01-02     105
2023-01-03      98
2023-01-04     110
Index type: <class 'pandas.core.indexes.datetimes.DatetimeIndex'>
Index dtype: datetime64[ns]


### B. Scenario 2: Converting a column of mixed date formats directly

In [53]:
# Scenario 2: Converting a column of mixed date formats directly

print("\n--- Scenario 2: Converting mixed date formats in a Series ---")
mixed_dates = pd.Series(['2024-03-15', '16/03/2024', 'March 17, 2024 10:30', '20240318'])
print("Original Series (mixed date strings):\n", mixed_dates)

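# Convert the Series to datetime objects.
# format='mixed' (pandas 2.0+) infers the format of each element individually;
# dayfirst=True makes '16/03/2024' read as 16 March rather than failing as month 16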
datetime_series = pd.to_datetime(mixed_dates, format='mixed', dayfirst=True)
print("\nConverted Series (datetime objects):\n", datetime_series)
print("Series dtype:", datetime_series.dtype)


--- Scenario 2: Converting mixed date formats in a Series ---
Original Series (mixed date strings):
 0              2024-03-15
1              16/03/2024
2    March 17, 2024 10:30
3                20240318
dtype: object

Converted Series (datetime objects):
 0   2024-03-15 00:00:00
1   2024-03-16 00:00:00
2   2024-03-17 10:30:00
3   2024-03-18 00:00:00
dtype: datetime64[ns]
Series dtype: datetime64[ns]
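The dayfirst argument matters most when a string is genuinely ambiguous. As a short illustration (the string "03/04/2024" is an invented example, not part of the Series above), the same input yields two different dates depending on the setting:

```python
import pandas as pd

ambiguous = "03/04/2024"

# Default: the first number is read as the month
month_first = pd.to_datetime(ambiguous)

# dayfirst=True: the first number is read as the day
day_first = pd.to_datetime(ambiguous, dayfirst=True)

print(month_first)  # 2024-03-04 00:00:00
print(day_first)    # 2024-04-03 00:00:00
```

Because neither reading raises an error, it is worth confirming which convention the source data uses before relying on inferred parsing.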


### C. Scenario 3: Converting a column that also contains time information

In [54]:
# Scenario 3: Converting a column that also contains time information

print("\n--- Scenario 3: Converting strings with time information ---")
data_time = {
    'Timestamp_String': ['2023-05-10 10:00:00', '2023-05-10 11:30:00', '2023-05-11 09:15:00'],
    'Sensor_Reading': [25.5, 26.1, 24.9]
}
df_time = pd.DataFrame(data_time)
df_time['Timestamp'] = pd.to_datetime(df_time['Timestamp_String'])
df_time = df_time.set_index('Timestamp')
print("\nDataFrame with DatetimeIndex (including time):\n", df_time)
print("Index type:", type(df_time.index))
print("Index dtype:", df_time.index.dtype)


--- Scenario 3: Converting strings with time information ---

DataFrame with DatetimeIndex (including time):
                         Timestamp_String  Sensor_Reading
Timestamp                                               
2023-05-10 10:00:00  2023-05-10 10:00:00            25.5
2023-05-10 11:30:00  2023-05-10 11:30:00            26.1
2023-05-11 09:15:00  2023-05-11 09:15:00            24.9
Index type: <class 'pandas.core.indexes.datetimes.DatetimeIndex'>
Index dtype: datetime64[ns]


# 2. Using pd.date_range() to generate a DatetimeIndex

This is ideal when you need to create a new time series from scratch with a defined start, end, and frequency, or a specific number of periods.
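The two ways of bounding a range, start plus end or start plus a number of periods, can describe the same index. A minimal sketch, with illustrative variable names:

```python
import pandas as pd

# Seven daily timestamps, defined by a start date and a count
by_periods = pd.date_range(start="2023-06-01", periods=7, freq="D")

# The same seven timestamps, defined by a start date and an end date
by_end = pd.date_range(start="2023-06-01", end="2023-06-07", freq="D")

print(by_periods.equals(by_end))  # True
```

Using periods tends to be handier when the length of the series is fixed in advance, while end is more natural when the calendar boundaries are what is known.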

### A. Scenario 4: Generating a daily DatetimeIndex

In [55]:
# Create a daily date range for 7 days starting from a specific date

daily_dates = pd.date_range(start='2023-06-01', periods=7, freq='D')
s1 = pd.Series(range(7), index=daily_dates)
print("Series with daily DatetimeIndex:\n", s1)
print("Index type:", type(s1.index))
print("Index dtype:", s1.index.dtype)

Series with daily DatetimeIndex:
 2023-06-01    0
2023-06-02    1
2023-06-03    2
2023-06-04    3
2023-06-05    4
2023-06-06    5
2023-06-07    6
Freq: D, dtype: int64
Index type: <class 'pandas.core.indexes.datetimes.DatetimeIndex'>
Index dtype: datetime64[ns]


### B. Scenario 5: Generating a monthly-end DatetimeIndex

In [56]:
# Create a month-end date range for a full year
# ('ME' is the month-end alias in pandas 2.2+; the older 'M' alias is deprecated there)

monthly_dates = pd.date_range(start='2024-01-01', end='2024-12-31', freq='ME')
df_monthly = pd.DataFrame({'Revenue': [i * 1000 + 500 for i in range(12)]}, index=monthly_dates)
print("DataFrame with monthly-end DatetimeIndex:\n", df_monthly)
print("Index type:", type(df_monthly.index))
print("Index dtype:", df_monthly.index.dtype)

DataFrame with monthly-end DatetimeIndex:
             Revenue
2024-01-31      500
2024-02-29     1500
2024-03-31     2500
2024-04-30     3500
2024-05-31     4500
2024-06-30     5500
2024-07-31     6500
2024-08-31     7500
2024-09-30     8500
2024-10-31     9500
2024-11-30    10500
2024-12-31    11500
Index type: <class 'pandas.core.indexes.datetimes.DatetimeIndex'>
Index dtype: datetime64[ns]




### C. Scenario 6: Generating an hourly DatetimeIndex

In [57]:
# Create an hourly date range for a specific day
# (lowercase 'h' is the hourly alias in pandas 2.2+; uppercase 'H' is deprecated there)

hourly_dates = pd.date_range(start='2023-11-15 09:00', end='2023-11-15 12:00', freq='h')
s_hourly = pd.Series([10, 12, 11, 13], index=hourly_dates)
print("Series with hourly DatetimeIndex:\n", s_hourly)
print("Index type:", type(s_hourly.index))
print("Index dtype:", s_hourly.index.dtype)

Series with hourly DatetimeIndex:
 2023-11-15 09:00:00    10
2023-11-15 10:00:00    12
2023-11-15 11:00:00    11
2023-11-15 12:00:00    13
Freq: h, dtype: int64
Index type: <class 'pandas.core.indexes.datetimes.DatetimeIndex'>
Index dtype: datetime64[ns]




# 3. Creating Individual Timestamp Objects with pd.Timestamp()

While pd.to_datetime() is well suited to Series and DataFrame columns, pd.Timestamp() is the direct way to create a single, precise datetime object from a string or from individual time components. You can then easily access its various attributes.
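Building a Timestamp from components rather than from a string avoids any parsing ambiguity. The sketch below constructs the same moment both ways; the variable names are illustrative.

```python
import pandas as pd

# From an ISO-formatted string
from_string = pd.Timestamp("1988-09-24 11:30:45.001452")

# From individual components passed as keyword arguments
from_parts = pd.Timestamp(
    year=1988, month=9, day=24,
    hour=11, minute=30, second=45, microsecond=1452,
)

print(from_string == from_parts)  # True
```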

### A. Creating a single Timestamp object and accessing attributes

In [63]:
import pandas as pd

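# Convert a string to a Timestamp object
# Pandas parses many common date/time formats. Here '24' cannot be a month,
# so the string is read day-first; an ambiguous string such as '03-04-1988'
# would be read month-first by default

timestamp_var = pd.Timestamp('24-09-1988 11:30:45.001452')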
print("Created Timestamp:", timestamp_var)
print("Timestamp type:", type(timestamp_var))



# Get and print various date and time attributes from the Timestamp object

print('\nExtracted Attributes:')
print('Year:', timestamp_var.year)
print('Month:', timestamp_var.month)
print('Day:', timestamp_var.day)
print('Hour:', timestamp_var.hour)
print('Minute:', timestamp_var.minute)
print('Second:', timestamp_var.second)
print('Microsecond:', timestamp_var.microsecond)
print('Nanosecond:', timestamp_var.nanosecond) # Nanoseconds are typically 0 unless specified

# You can also get other useful attributes

print('Day of Week (Monday=0, Sunday=6):', timestamp_var.dayofweek)
print('Day Name:', timestamp_var.day_name())
print('Quarter:', timestamp_var.quarter)
print('Is Leap Year:', timestamp_var.is_leap_year)
print('Date Component:', timestamp_var.date())
print('Time Component:', timestamp_var.time())

Created Timestamp: 1988-09-24 11:30:45.001452
Timestamp type: <class 'pandas._libs.tslibs.timestamps.Timestamp'>

Extracted Attributes:
Year: 1988
Month: 9
Day: 24
Hour: 11
Minute: 30
Second: 45
Microsecond: 1452
Nanosecond: 0
Day of Week (Monday=0, Sunday=6): 5
Day Name: Saturday
Quarter: 3
Is Leap Year: True
Date Component: 1988-09-24
Time Component: 11:30:45.001452
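The same attributes are available for a whole column at once through the .dt accessor of a datetime Series. This is a brief sketch under the assumption that the values have already been converted with pd.to_datetime(); the Series `stamps` and its two sample values are invented for illustration.

```python
import pandas as pd

# Hypothetical datetime Series with two timestamps
stamps = pd.Series(pd.to_datetime(["1988-09-24 11:30:45", "2024-02-29 08:00:00"]))

# .dt exposes Timestamp-style attributes element-wise
print(stamps.dt.year.tolist())          # [1988, 2024]
print(stamps.dt.day_name().tolist())    # ['Saturday', 'Thursday']
print(stamps.dt.is_leap_year.tolist())  # [True, True]
```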

