<h3>1. pd.to_datetime()</h3>
pd.to_datetime() converts a scalar, list, Series, or DataFrame column to pandas datetime values. It infers most common date formats, and an explicit format= string can be supplied when the layout is known.



In [7]:
import pandas as pd
from datetime import datetime, timedelta

# Sample data
data = {
    'date_column': [
        '2021-01-01', '2022-03-15', '2020-06-30', 
        '2019-11-25', '2021-09-01', '2023-02-18',
        '2020-08-12', '2021-12-24'
    ],
    'time_column': [
        '12:30:45', '15:45:00', '08:10:20',
        '22:55:12', '06:12:45', '14:34:00',
        '03:21:50', '19:58:10'
    ]
}



In [9]:
# Create DataFrame
df = pd.DataFrame(data)



In [11]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8 entries, 0 to 7
Data columns (total 2 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   date_column  8 non-null      object
 1   time_column  8 non-null      object
dtypes: object(2)
memory usage: 260.0+ bytes


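Two commonly used parameters of pd.to_datetime():

errors='coerce': values that cannot be parsed become NaT instead of raising an error.

utc=True: returns timezone-aware values in UTC; inputs that carry an offset are converted to UTC, and naive inputs are treated as UTC.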
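A minimal sketch of both parameters in action. The input values below are made up for illustration and are not part of the sample data used in this notebook.

```python
import pandas as pd

# Hypothetical input containing one value that is not a date
raw = pd.Series(['2021-01-01', 'not a date', '2022-03-15'])

# errors='coerce' turns the unparseable entry into NaT instead of raising
parsed = pd.to_datetime(raw, errors='coerce')
print(parsed)
print('Unparseable values:', parsed.isna().sum())

# utc=True converts strings with different offsets to one UTC-based column
stamps = pd.Series(['2021-01-01 12:00:00+02:00', '2021-06-01 08:30:00-05:00'])
aware = pd.to_datetime(stamps, utc=True)
print(aware)
```

Counting NaT values after a coerced parse is a quick way to see how many rows failed to convert.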

In [13]:
# Convert the 'date_column' to datetime format
df['date_column'] = pd.to_datetime(df['date_column'])

In [15]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8 entries, 0 to 7
Data columns (total 2 columns):
 #   Column       Non-Null Count  Dtype         
---  ------       --------------  -----         
 0   date_column  8 non-null      datetime64[ns]
 1   time_column  8 non-null      object        
dtypes: datetime64[ns](1), object(1)
memory usage: 260.0+ bytes


In [17]:
# Combine date and time columns into a single datetime column
df['datetime_column'] = pd.to_datetime(df['date_column'].dt.date.astype(str) + ' ' + df['time_column'])

print(df)

  date_column time_column     datetime_column
0  2021-01-01    12:30:45 2021-01-01 12:30:45
1  2022-03-15    15:45:00 2022-03-15 15:45:00
2  2020-06-30    08:10:20 2020-06-30 08:10:20
3  2019-11-25    22:55:12 2019-11-25 22:55:12
4  2021-09-01    06:12:45 2021-09-01 06:12:45
5  2023-02-18    14:34:00 2023-02-18 14:34:00
6  2020-08-12    03:21:50 2020-08-12 03:21:50
7  2021-12-24    19:58:10 2021-12-24 19:58:10
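An alternative to string concatenation, shown here as a sketch on a two-row stand-in for the data above: a time string such as '12:30:45' can be parsed as a duration since midnight and added to the date.

```python
import pandas as pd

# Small stand-in for date_column and time_column
dates = pd.to_datetime(pd.Series(['2021-01-01', '2022-03-15']))
times = pd.Series(['12:30:45', '15:45:00'])

# 'HH:MM:SS' strings parse as timedeltas, which can be added to datetimes
combined = dates + pd.to_timedelta(times)
print(combined)
```

This avoids converting the dates back to strings, at the cost of assuming every time value is a valid 'HH:MM:SS' string.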


In [19]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8 entries, 0 to 7
Data columns (total 3 columns):
 #   Column           Non-Null Count  Dtype         
---  ------           --------------  -----         
 0   date_column      8 non-null      datetime64[ns]
 1   time_column      8 non-null      object        
 2   datetime_column  8 non-null      datetime64[ns]
dtypes: datetime64[ns](2), object(1)
memory usage: 324.0+ bytes


In [21]:
# 1. Extracting year, month, and day
df['year'] = df['date_column'].dt.year


In [23]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8 entries, 0 to 7
Data columns (total 4 columns):
 #   Column           Non-Null Count  Dtype         
---  ------           --------------  -----         
 0   date_column      8 non-null      datetime64[ns]
 1   time_column      8 non-null      object        
 2   datetime_column  8 non-null      datetime64[ns]
 3   year             8 non-null      int32         
dtypes: datetime64[ns](2), int32(1), object(1)
memory usage: 356.0+ bytes


In [25]:
df['month'] = df['date_column'].dt.month


In [27]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8 entries, 0 to 7
Data columns (total 5 columns):
 #   Column           Non-Null Count  Dtype         
---  ------           --------------  -----         
 0   date_column      8 non-null      datetime64[ns]
 1   time_column      8 non-null      object        
 2   datetime_column  8 non-null      datetime64[ns]
 3   year             8 non-null      int32         
 4   month            8 non-null      int32         
dtypes: datetime64[ns](2), int32(2), object(1)
memory usage: 388.0+ bytes


<h3>dt accessor</h3>

The dt accessor allows you to extract specific components from a datetime column, such as year, month, day, etc.
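Beyond year, month, and day, the accessor exposes other components. The sketch below uses a small illustrative Series rather than the notebook's DataFrame; the column names in the result are arbitrary.

```python
import pandas as pd

# Two timestamps taken from the sample data
s = pd.Series(pd.to_datetime(['2021-01-01 12:30:45', '2022-03-15 15:45:00']))

components = pd.DataFrame({
    'quarter': s.dt.quarter,      # 1 to 4
    'day_name': s.dt.day_name(),  # weekday name as text
    'hour': s.dt.hour,            # 0 to 23
})
print(components)
```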

In [29]:
df['day'] = df['date_column'].dt.day

In [31]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8 entries, 0 to 7
Data columns (total 6 columns):
 #   Column           Non-Null Count  Dtype         
---  ------           --------------  -----         
 0   date_column      8 non-null      datetime64[ns]
 1   time_column      8 non-null      object        
 2   datetime_column  8 non-null      datetime64[ns]
 3   year             8 non-null      int32         
 4   month            8 non-null      int32         
 5   day              8 non-null      int32         
dtypes: datetime64[ns](2), int32(3), object(1)
memory usage: 420.0+ bytes


In [33]:
print(df)

  date_column time_column     datetime_column  year  month  day
0  2021-01-01    12:30:45 2021-01-01 12:30:45  2021      1    1
1  2022-03-15    15:45:00 2022-03-15 15:45:00  2022      3   15
2  2020-06-30    08:10:20 2020-06-30 08:10:20  2020      6   30
3  2019-11-25    22:55:12 2019-11-25 22:55:12  2019     11   25
4  2021-09-01    06:12:45 2021-09-01 06:12:45  2021      9    1
5  2023-02-18    14:34:00 2023-02-18 14:34:00  2023      2   18
6  2020-08-12    03:21:50 2020-08-12 03:21:50  2020      8   12
7  2021-12-24    19:58:10 2021-12-24 19:58:10  2021     12   24


In [35]:
df['weekday'] = df['date_column'].dt.weekday  # Monday=0, Sunday=6

In [37]:
print(df)

  date_column time_column     datetime_column  year  month  day  weekday
0  2021-01-01    12:30:45 2021-01-01 12:30:45  2021      1    1        4
1  2022-03-15    15:45:00 2022-03-15 15:45:00  2022      3   15        1
2  2020-06-30    08:10:20 2020-06-30 08:10:20  2020      6   30        1
3  2019-11-25    22:55:12 2019-11-25 22:55:12  2019     11   25        0
4  2021-09-01    06:12:45 2021-09-01 06:12:45  2021      9    1        2
5  2023-02-18    14:34:00 2023-02-18 14:34:00  2023      2   18        5
6  2020-08-12    03:21:50 2020-08-12 03:21:50  2020      8   12        2
7  2021-12-24    19:58:10 2021-12-24 19:58:10  2021     12   24        4


In [39]:
# 2. Checking if date falls within a range
df['is_between'] = df['date_column'].between('2020-01-01', '2022-12-31')

In [49]:
df[['is_between','date_column']]


,is_between,date_column
0,True,2021-01-01
1,True,2022-03-15
2,True,2020-06-30
3,False,2019-11-25
4,True,2021-09-01
5,False,2023-02-18
6,True,2020-08-12
7,True,2021-12-24
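By default, between() includes both endpoints. The sketch below uses illustrative boundary dates to show this, and assumes pandas 1.3 or later, where the inclusive argument accepts the strings 'both', 'neither', 'left', and 'right'.

```python
import pandas as pd

# Dates chosen to sit exactly on and just outside the boundaries
dates = pd.Series(pd.to_datetime(
    ['2019-12-31', '2020-01-01', '2022-12-31', '2023-01-01']
))
start, end = pd.Timestamp('2020-01-01'), pd.Timestamp('2022-12-31')

inclusive_both = dates.between(start, end)               # endpoints count
exclusive = dates.between(start, end, inclusive='neither')  # endpoints excluded
manual = (dates >= start) & (dates <= end)               # equivalent to the default

print(inclusive_both.tolist())
print(exclusive.tolist())
```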



<h3>pd.to_timedelta()</h3>

pd.to_timedelta() converts a scalar, list, or Series of time differences (strings such as '1 days', or numbers with a unit) into timedelta values for date arithmetic.

A single fixed duration can be created with pd.Timedelta. Both can be added to or subtracted from datetime values.

# Adding time to a date
df['date_column'] + pd.Timedelta(days=5)  # Adds 5 days

# Subtracting time from a date
df['date_column'] - pd.Timedelta(weeks=1)  # Subtracts 1 week
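The snippets above use a single pd.Timedelta. As a sketch of pd.to_timedelta() itself, the example below converts a column of numbers and a column of strings; the values are illustrative.

```python
import pandas as pd

# Numbers need a unit: here each value is a count of days
offsets = pd.to_timedelta(pd.Series([1, 7, 30]), unit='D')

# A different offset is applied to each row
dates = pd.Series(pd.to_datetime(['2021-01-01'] * 3))
shifted = dates + offsets
print(shifted)

# Strings are parsed directly
durations = pd.to_timedelta(pd.Series(['0 days 12:30:00', '2 days 00:00:00']))
print(durations.dt.total_seconds())
```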


In [51]:
# 3. Adding days using Timedelta
df['date_plus_5_days'] = df['date_column'] + pd.Timedelta(days=5)

In [53]:
df[['date_plus_5_days','date_column']]

,date_plus_5_days,date_column
0,2021-01-06,2021-01-01
1,2022-03-20,2022-03-15
2,2020-07-05,2020-06-30
3,2019-11-30,2019-11-25
4,2021-09-06,2021-09-01
5,2023-02-23,2023-02-18
6,2020-08-17,2020-08-12
7,2021-12-29,2021-12-24


<h2>strftime() and strptime()</h2>
strftime(): Formats a datetime object as a string.

strptime(): Converts a string to a datetime object using a specified format. In pandas, the equivalent is pd.to_datetime() with the format= argument.
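A short round trip with the standard library and with pandas, as a sketch. The day-first input layout is an assumption chosen to show why an explicit format helps.

```python
from datetime import datetime
import pandas as pd

# strptime: string -> datetime, using an explicit day-first layout
parsed = datetime.strptime('25/11/2019 22:55', '%d/%m/%Y %H:%M')

# strftime: datetime -> string, in a different layout
formatted = parsed.strftime('%Y-%m-%d %H:%M')
print(parsed, '|', formatted)

# pandas equivalents for a whole column
col = pd.Series(['25/11/2019', '01/09/2021'])
as_dates = pd.to_datetime(col, format='%d/%m/%Y')
as_text = as_dates.dt.strftime('%Y/%m/%d')
print(as_text.tolist())
```

Without the format string, '01/09/2021' is ambiguous between 1 September and 9 January, which is the main reason to spell the layout out.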

In [55]:
# 4. Formatting datetime to string
df['formatted_date'] = df['date_column'].dt.strftime('%Y-%m-%d')

In [91]:
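# Convert to datetime with an explicit format
# (date_column is already datetime64 at this point, so the call leaves it unchanged;
# format= matters when the column still holds strings)
df['date_column'] = pd.to_datetime(df['date_column'], format='%Y-%m-%d')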

In [93]:
df

,date_column,time_column,datetime_column,year,month,day,weekday,is_between,date_plus_5_days,formatted_date,is_leap_year
0,2021-01-01,12:30:45,2021-01-01 12:30:45,2021,1,1,4,True,2021-01-06,2021-01-01,False
1,2022-03-15,15:45:00,2022-03-15 15:45:00,2022,3,15,1,True,2022-03-20,2022-03-15,False
2,2020-06-30,08:10:20,2020-06-30 08:10:20,2020,6,30,1,True,2020-07-05,2020-06-30,True
3,2019-11-25,22:55:12,2019-11-25 22:55:12,2019,11,25,0,False,2019-11-30,2019-11-25,False
4,2021-09-01,06:12:45,2021-09-01 06:12:45,2021,9,1,2,True,2021-09-06,2021-09-01,False
5,2023-02-18,14:34:00,2023-02-18 14:34:00,2023,2,18,5,False,2023-02-23,2023-02-18,False
6,2020-08-12,03:21:50,2020-08-12 03:21:50,2020,8,12,2,True,2020-08-17,2020-08-12,True
7,2021-12-24,19:58:10,2021-12-24 19:58:10,2021,12,24,4,True,2021-12-29,2021-12-24,False


In [95]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8 entries, 0 to 7
Data columns (total 11 columns):
 #   Column            Non-Null Count  Dtype         
---  ------            --------------  -----         
 0   date_column       8 non-null      datetime64[ns]
 1   time_column       8 non-null      object        
 2   datetime_column   8 non-null      datetime64[ns]
 3   year              8 non-null      int32         
 4   month             8 non-null      int32         
 5   day               8 non-null      int32         
 6   weekday           8 non-null      int32         
 7   is_between        8 non-null      bool          
 8   date_plus_5_days  8 non-null      datetime64[ns]
 9   formatted_date    8 non-null      object        
 10  is_leap_year      8 non-null      bool          
dtypes: bool(2), datetime64[ns](3), int32(4), object(2)
memory usage: 596.0+ bytes


In [59]:
# 5. Checking if the year is a leap year
df['is_leap_year'] = df['date_column'].dt.is_leap_year

In [61]:
df

,date_column,time_column,datetime_column,year,month,day,weekday,is_between,date_plus_5_days,formatted_date,is_leap_year
0,2021-01-01,12:30:45,2021-01-01 12:30:45,2021,1,1,4,True,2021-01-06,2021-01-01,False
1,2022-03-15,15:45:00,2022-03-15 15:45:00,2022,3,15,1,True,2022-03-20,2022-03-15,False
2,2020-06-30,08:10:20,2020-06-30 08:10:20,2020,6,30,1,True,2020-07-05,2020-06-30,True
3,2019-11-25,22:55:12,2019-11-25 22:55:12,2019,11,25,0,False,2019-11-30,2019-11-25,False
4,2021-09-01,06:12:45,2021-09-01 06:12:45,2021,9,1,2,True,2021-09-06,2021-09-01,False
5,2023-02-18,14:34:00,2023-02-18 14:34:00,2023,2,18,5,False,2023-02-23,2023-02-18,False
6,2020-08-12,03:21:50,2020-08-12 03:21:50,2020,8,12,2,True,2020-08-17,2020-08-12,True
7,2021-12-24,19:58:10,2021-12-24 19:58:10,2021,12,24,4,True,2021-12-29,2021-12-24,False


In [63]:
# 6. Resampling data (e.g., grouping by year and getting the count)
# 'YE' (year-end) replaces the deprecated 'Y' alias in pandas 2.2 and later
df_resampled = df.resample('YE', on='date_column').count()


In [65]:
df_resampled

date_column,time_column,datetime_column,year,month,day,weekday,is_between,date_plus_5_days,formatted_date,is_leap_year
2019-12-31,1,1,1,1,1,1,1,1,1,1
2020-12-31,2,2,2,2,2,2,2,2,2,2
2021-12-31,3,3,3,3,3,3,3,3,3,3
2022-12-31,1,1,1,1,1,1,1,1,1,1
2023-12-31,1,1,1,1,1,1,1,1,1,1
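When only a per-year count is needed, grouping by the extracted year is a version-independent alternative to resample(). This is a sketch on a standalone Series holding the same eight dates, not the notebook's method.

```python
import pandas as pd

# The eight dates from the sample data
dates = pd.to_datetime(pd.Series([
    '2021-01-01', '2022-03-15', '2020-06-30', '2019-11-25',
    '2021-09-01', '2023-02-18', '2020-08-12', '2021-12-24',
]))

# Group rows by calendar year and count them
per_year = dates.groupby(dates.dt.year).size()
print(per_year)
```

One difference worth keeping in mind: resample() produces a row for every period in the range, including empty ones, while groupby() only returns years that actually occur in the data.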


In [71]:
# pd.date_range()
# Generates a sequence of dates at a fixed frequency.

In [73]:
# Generate a date range
date_range = pd.date_range(start='2021-01-01', end='2021-12-31', freq='D')  # Daily frequency


In [75]:
date_range

DatetimeIndex(['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04',
               '2021-01-05', '2021-01-06', '2021-01-07', '2021-01-08',
               '2021-01-09', '2021-01-10',
               ...
               '2021-12-22', '2021-12-23', '2021-12-24', '2021-12-25',
               '2021-12-26', '2021-12-27', '2021-12-28', '2021-12-29',
               '2021-12-30', '2021-12-31'],
              dtype='datetime64[ns]', length=365, freq='D')

In [81]:
date_range1 = pd.date_range(start='2021-01-01', end='2021-12-31', freq='ME')  # Month-end frequency


In [83]:
date_range1

DatetimeIndex(['2021-01-31', '2021-02-28', '2021-03-31', '2021-04-30',
               '2021-05-31', '2021-06-30', '2021-07-31', '2021-08-31',
               '2021-09-30', '2021-10-31', '2021-11-30', '2021-12-31'],
              dtype='datetime64[ns]', freq='ME')
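Instead of an end date, date_range() can take a number of periods. The sketch below also shows the business-day frequency 'B', which skips weekends; the start date and counts are illustrative.

```python
import pandas as pd

# Five business days starting on Friday 2021-01-01 (the weekend is skipped)
business = pd.date_range(start='2021-01-01', periods=5, freq='B')
print(business)

# Three dates spaced seven days apart
weekly = pd.date_range(start='2021-01-01', periods=3, freq='7D')
print(weekly)
```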

<h1>DateOffset</h1> 
DateOffset shifts dates by calendar units (e.g., months or years) and respects varying month lengths, unlike a fixed-length Timedelta.


In [103]:
from pandas.tseries.offsets import DateOffset

# Adding 1 month to a date
df['new_date'] = df['date_column'] + DateOffset(months=1)

# Adding 1 year to a date
df['new_date_year'] = df['date_column'] + DateOffset(years=1)


In [105]:
df

,date_column,time_column,datetime_column,year,month,day,weekday,is_between,date_plus_5_days,formatted_date,is_leap_year,new_date,new_date_year
0,2021-01-01,12:30:45,2021-01-01 12:30:45,2021,1,1,4,True,2021-01-06,2021-01-01,False,2021-02-01,2022-01-01
1,2022-03-15,15:45:00,2022-03-15 15:45:00,2022,3,15,1,True,2022-03-20,2022-03-15,False,2022-04-15,2023-03-15
2,2020-06-30,08:10:20,2020-06-30 08:10:20,2020,6,30,1,True,2020-07-05,2020-06-30,True,2020-07-30,2021-06-30
3,2019-11-25,22:55:12,2019-11-25 22:55:12,2019,11,25,0,False,2019-11-30,2019-11-25,False,2019-12-25,2020-11-25
4,2021-09-01,06:12:45,2021-09-01 06:12:45,2021,9,1,2,True,2021-09-06,2021-09-01,False,2021-10-01,2022-09-01
5,2023-02-18,14:34:00,2023-02-18 14:34:00,2023,2,18,5,False,2023-02-23,2023-02-18,False,2023-03-18,2024-02-18
6,2020-08-12,03:21:50,2020-08-12 03:21:50,2020,8,12,2,True,2020-08-17,2020-08-12,True,2020-09-12,2021-08-12
7,2021-12-24,19:58:10,2021-12-24 19:58:10,2021,12,24,4,True,2021-12-29,2021-12-24,False,2022-01-24,2022-12-24
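The difference between a calendar offset and a fixed duration shows up at month ends. A small sketch with illustrative dates:

```python
import pandas as pd

jan_31 = pd.Timestamp('2021-01-31')

# Calendar-aware: lands on the last valid day of February
plus_month = jan_31 + pd.DateOffset(months=1)

# Fixed length: exactly 30 days later, which is already in March
plus_30_days = jan_31 + pd.Timedelta(days=30)

# A leap day plus one year falls back to 28 February
leap_plus_year = pd.Timestamp('2020-02-29') + pd.DateOffset(years=1)

print(plus_month.date(), plus_30_days.date(), leap_plus_year.date())
```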


<h1>now vs. utcnow</h1> 
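datetime.now() returns the current local time as a naive datetime (no timezone attached).
datetime.utcnow() returns the current UTC time, also as a naive datetime. It is deprecated since Python 3.12 in favor of datetime.now(timezone.utc), which returns a timezone-aware value.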
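A sketch of the timezone-aware way to get the current UTC time, with the pandas counterpart alongside. The variable names are illustrative.

```python
from datetime import datetime, timezone, timedelta
import pandas as pd

local_naive = datetime.now()            # local wall-clock time, no tzinfo
utc_aware = datetime.now(timezone.utc)  # current time with UTC attached
pandas_utc = pd.Timestamp.now(tz='UTC')  # pandas equivalent

print(local_naive.tzinfo)     # None
print(utc_aware.utcoffset())  # 0:00:00
print(pandas_utc.tz)
```

Aware values can be compared and converted between timezones safely, whereas mixing naive and aware datetimes in a comparison raises a TypeError.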