# Working with date and time using Pandas

Time series data comes up often in data work, and pandas is well suited to handling it.

Pandas provides a set of tools that cover the common tasks on date-time data. The examples below walk through them.

#### Create a range of dates

In [1]:
import pandas as pd
 
# Create a DatetimeIndex of 10 hourly timestamps
# (pandas 2.2 and later deprecate the 'H' alias in favor of lowercase 'h')
data = pd.date_range('1/1/2011', periods = 10, freq ='H')
 
data

DatetimeIndex(['2011-01-01 00:00:00', '2011-01-01 01:00:00',
               '2011-01-01 02:00:00', '2011-01-01 03:00:00',
               '2011-01-01 04:00:00', '2011-01-01 05:00:00',
               '2011-01-01 06:00:00', '2011-01-01 07:00:00',
               '2011-01-01 08:00:00', '2011-01-01 09:00:00'],
              dtype='datetime64[ns]', freq='H')
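
As a further sketch, `pd.date_range` can also take an `end` date instead of `periods`. The dates and variable names below are illustrative and do not come from the notebook above.

```python
import pandas as pd

# Daily range between two illustrative dates (both endpoints are included)
days = pd.date_range(start="2011-01-01", end="2011-01-07", freq="D")

# The same range, specified by a number of periods instead of an end date
days_by_count = pd.date_range(start="2011-01-01", periods=7, freq="D")

print(len(days))
print(days.equals(days_by_count))
```

Specifying `end` is convenient when the boundaries are known; `periods` is convenient when the length is known.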

#### Create range of dates and show basic features 

In [2]:
# Create a range of hourly timestamps
data = pd.date_range('1/1/2011', periods = 10, freq ='H')
 
# Get the current timestamp
# (pd.datetime was deprecated and later removed; pd.Timestamp.now() replaces it)
x = pd.Timestamp.now()

month = x.month
year = x.year

month, year

(8, 2022)

Datetime features fall into two categories: moments within a period (such as the hour of the day or the month of the year) and the time elapsed since a particular point. Both can help reveal patterns in the data.
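
The two categories can be sketched on a small Series. The reference date used for the elapsed-time feature is a hypothetical choice made only for illustration.

```python
import pandas as pd

dates = pd.Series(pd.date_range("2011-01-01", periods=3, freq="D"))

# Category 1: a moment within a period (day of the week, Monday = 0)
day_of_week = dates.dt.dayofweek

# Category 2: time elapsed since a reference point (hypothetical reference date)
reference = pd.Timestamp("2010-12-25")
days_since = (dates - reference).dt.days

print(day_of_week.tolist())
print(days_since.tolist())
```

Subtracting a `Timestamp` from a datetime Series yields timedeltas, whose `.dt.days` accessor gives the whole number of days elapsed.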

## Divide a given date into features

- pandas.Series.dt.year returns the year of the datetime.
- pandas.Series.dt.month returns the month of the datetime.
- pandas.Series.dt.day returns the day of the datetime.
- pandas.Series.dt.hour returns the hour of the datetime.
- pandas.Series.dt.minute returns the minute of the datetime.
- For the full list of datetime properties, see the pandas API reference: https://pandas.pydata.org/pandas-docs/stable/reference/index.html
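
A minimal sketch of these accessors on a two-element Series follows. The sample timestamps are made up, and `dt.quarter` is included as one example of the further properties available through the same `.dt` accessor.

```python
import pandas as pd

# Two illustrative timestamps
s = pd.Series(pd.to_datetime(["2011-01-31 08:30", "2011-04-15 17:45"]))

# Collect several .dt properties into one DataFrame
features = pd.DataFrame({
    "year": s.dt.year,
    "month": s.dt.month,
    "day": s.dt.day,
    "hour": s.dt.hour,
    "minute": s.dt.minute,
    "quarter": s.dt.quarter,
})

print(features)
```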

#### Break date and time into separate features  

In [4]:
# Create a DataFrame with a column of 72 hourly timestamps
rng = pd.DataFrame()
rng['date'] = pd.date_range('1/1/2011', periods = 72, freq ='H')
 
# Show the first five rows
rng[:5]
 


|   | date |
|---|------|
| 0 | 2011-01-01 00:00:00 |
| 1 | 2011-01-01 01:00:00 |
| 2 | 2011-01-01 02:00:00 |
| 3 | 2011-01-01 03:00:00 |
| 4 | 2011-01-01 04:00:00 |


In [5]:
# Create features for year, month, day, hour, and minute
rng['year'] = rng['date'].dt.year
rng['month'] = rng['date'].dt.month
rng['day'] = rng['date'].dt.day
rng['hour'] = rng['date'].dt.hour
rng['minute'] = rng['date'].dt.minute
 
# Print the dates divided into features
rng.head(3)

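|   | date | year | month | day | hour | minute |
|---|------|------|-------|-----|------|--------|
| 0 | 2011-01-01 00:00:00 | 2011 | 1 | 1 | 0 | 0 |
| 1 | 2011-01-01 01:00:00 | 2011 | 1 | 1 | 1 | 0 |
| 2 | 2011-01-01 02:00:00 | 2011 | 1 | 1 | 2 | 0 |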


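To get the present time, use Timestamp.now(). The resulting Timestamp can be converted to a standard-library datetime with to_pydatetime(), although the year, month, day, and similar attributes are also available on the Timestamp directly.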

In [10]:
t = pd.Timestamp.now()
type(t)

pandas._libs.tslibs.timestamps.Timestamp

In [9]:
# Convert the Timestamp to a standard-library datetime.datetime
# (Timestamp has no to_datetime() method; to_pydatetime() is the supported call)
t.to_pydatetime()

In [None]:
# Directly access and print the features
t.year
t.month
t.day
t.hour
t.minute
t.second

Let’s apply these tools to a real dataset of UFO reports (uforeports).

In [12]:
import pandas as pd
 
url = 'ufo.csv'
 
# read csv file
df = pd.read_csv(url)          
df.head()

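|   | City | Colors Reported | Shape Reported | State | Time |
|---|------|-----------------|----------------|-------|------|
| 0 | Ithaca |  | TRIANGLE | NY | 6/1/1930 22:00 |
| 1 | Willingboro |  | OTHER | NJ | 6/30/1930 20:00 |
| 2 | Holyoke |  | OVAL | CO | 2/15/1931 14:00 |
| 3 | Abilene |  | DISK | KS | 6/1/1931 13:00 |
| 4 | New York Worlds Fair |  | LIGHT | NY | 4/18/1933 19:00 |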


In [13]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 18241 entries, 0 to 18240
Data columns (total 5 columns):
 #   Column           Non-Null Count  Dtype 
---  ------           --------------  ----- 
 0   City             18216 non-null  object
 1   Colors Reported  2882 non-null   object
 2   Shape Reported   15597 non-null  object
 3   State            18241 non-null  object
 4   Time             18241 non-null  object
dtypes: object(5)
memory usage: 712.7+ KB


In [14]:
# Convert the Time column to datetime format
df['Time'] = pd.to_datetime(df.Time)
 
df.head()

|   | City | Colors Reported | Shape Reported | State | Time |
|---|------|-----------------|----------------|-------|------|
| 0 | Ithaca |  | TRIANGLE | NY | 1930-06-01 22:00:00 |
| 1 | Willingboro |  | OTHER | NJ | 1930-06-30 20:00:00 |
| 2 | Holyoke |  | OVAL | CO | 1931-02-15 14:00:00 |
| 3 | Abilene |  | DISK | KS | 1931-06-01 13:00:00 |
| 4 | New York Worlds Fair |  | LIGHT | NY | 1933-04-18 19:00:00 |


In [15]:
# Show the data type of each column
df.dtypes

City                       object
Colors Reported            object
Shape Reported             object
State                      object
Time               datetime64[ns]
dtype: object
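
When the layout of the strings is known, `pd.to_datetime` can be given an explicit `format`, and `errors="coerce"` turns unparseable values into `NaT` instead of raising. The sketch below uses a small hand-made Series, not the UFO file, and the invalid entry is added only to show the coercion.

```python
import pandas as pd

# Two strings in the same month/day/year layout as the Time column, plus one bad value
raw = pd.Series(["6/1/1930 22:00", "6/30/1930 20:00", "not a date"])

# Parse with an explicit format; values that do not match become NaT
parsed = pd.to_datetime(raw, format="%m/%d/%Y %H:%M", errors="coerce")

print(parsed)
```

An explicit format removes ambiguity between month-first and day-first strings; whether it is needed for a given file depends on the data.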

In [16]:
# Extract the hour from each timestamp
df.Time.dt.hour.head()

0    22
1    20
2    14
3    13
4    19
Name: Time, dtype: int64

In [27]:
# Get the weekday name of each date
df.Time.dt.day_name().head()

0     Sunday
1     Monday
2     Sunday
3     Monday
4    Tuesday
Name: Time, dtype: object

In [18]:
# Get ordinal day of the year
df.Time.dt.dayofyear.head()

0    152
1    181
2     46
3    152
4    108
Name: Time, dtype: int64
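
Extracted features can also be used for counting or grouping. The following is a minimal sketch on four timestamps typed in by hand (matching the first rows shown earlier), not on the full dataset.

```python
import pandas as pd

# Four hand-entered timestamps for illustration
times = pd.Series(pd.to_datetime([
    "1930-06-01 22:00",
    "1930-06-30 20:00",
    "1931-02-15 14:00",
    "1931-06-01 13:00",
]))

# Count how many timestamps fall in each year, ordered by year
per_year = times.dt.year.value_counts().sort_index()

print(per_year)
```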