Dates and times are useful as standalone types, but their real value comes from using them as the index of a Series or DataFrame. You do this by creating a DatetimeIndex.

A time series is simply a Series or DataFrame that has a DatetimeIndex as its index (row labels).

You can create a DatetimeIndex from an array or list of time information:

**pd.DatetimeIndex([timestamp1, timestamp2, timestamp3, ...])**

If you have a long period to cover, you may prefer to set up a range of dates using the **date_range()** function. **date_range()** takes a starting and ending point and, optionally, a frequency (the default is daily, so the range includes every day from the starting date to the ending date, inclusive). The dates can be given as Timestamps or as strings containing the date information.

In this example, we generate a date index (dates from 2016-01-01 to 2019-10-07) for 1376 random numbers (simulated data), one per day in the range.

To generate a time series with a date range, you need to know how many values the range contains. You can use the **len()** function to find out: **len(pd.date_range('2016-01-01', '2019-10-07'))**

In [4]:
# create a time Series


import pandas as pd
import numpy as np

# option 1: using time stamps
t1 = pd.Timestamp('2016-01-01')
t2 = pd.Timestamp('2016-01-02')
t3 = pd.Timestamp('2016-01-03')

dr = pd.DatetimeIndex([t1,t2,t3])

ts = pd.Series(np.random.randn(3), index = dr)

print('Time series using a list of times to create the index')
print(ts)
print()

Time series using a list of times to create the index
2016-01-01   -0.647374
2016-01-02    0.698425
2016-01-03    1.613422
dtype: float64
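
The index does not have to be built from Timestamp objects one at a time. As a minimal sketch, assuming the dates arrive as strings, **pd.to_datetime()** can parse a whole list into a DatetimeIndex. The names `dateStrings`, `drFromStrings` and `tsFromStrings`, and the fixed data values, are illustrative and not part of the original example.

```python
import pandas as pd

# illustrative date strings (assumed input, not from the original example)
dateStrings = ['2016-01-01', '2016-01-02', '2016-01-03']

# pd.to_datetime parses a list of strings into a DatetimeIndex
drFromStrings = pd.to_datetime(dateStrings)

# fixed values instead of random numbers, so the result is easy to check
tsFromStrings = pd.Series([10.0, 20.0, 30.0], index=drFromStrings)

print(type(drFromStrings).__name__)
print(tsFromStrings)
```

The resulting series behaves the same way as one built from explicit Timestamp objects.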



In [6]:
# option 2: using date range
# create a date range: every day between Jan 1 2016 and Oct 7 2019
dateRange = pd.date_range('2016-01-01','2019-10-07')

# create simulated data for these dates
# use len() to work out how many data values you need (one per date in the dateRange)
numberGen = len(dateRange) # 1376 numbers

# create a time series, using the dateRange as index
timeSeries = pd.Series(np.random.randn(numberGen), index = dateRange)

print('Time series using date_range to create the index')
print(timeSeries)

Time series using date_range to create the index
2016-01-01    1.229503
2016-01-02    1.446639
2016-01-03    2.176954
2016-01-04    1.047084
2016-01-05   -0.227227
                ...   
2019-10-03   -0.253456
2019-10-04    0.598865
2019-10-05   -0.577108
2019-10-06    0.169349
2019-10-07   -0.068497
Freq: D, Length: 1376, dtype: float64
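
The optional frequency of **date_range()** can be illustrated with a short sketch. It assumes the `freq` and `periods` parameters of `pd.date_range`; the variable names `daily`, `weekly` and `firstTen` and the January 2016 range are chosen only for illustration.

```python
import pandas as pd

# one entry per day (the default frequency)
daily = pd.date_range('2016-01-01', '2016-01-31')

# one entry per week; freq='W' is anchored on Sundays
weekly = pd.date_range('2016-01-01', '2016-01-31', freq='W')

# a start point plus a number of periods instead of an end point
firstTen = pd.date_range('2016-01-01', periods=10)

print(len(daily), len(weekly), len(firstTen))
print(weekly)
```

Giving `periods` instead of an end date is convenient when you already know how many data values you have.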


The point of a time series is to be able to select the data based on the date. The date indexes can be used just like any other index and you can compare them to other dates to select specific data.

In [9]:
# select a specific index
print('Select a specific index')
print("timeSeries['2019-10-4']: ",timeSeries['2019-10-4'])
print()

# select the entries whose index meets a criterion
print('Select just the values before Sept 6 2018 and after Sept 2 2018')
print(timeSeries[(timeSeries.index < pd.Timestamp('September 6 2018'))
                 &(timeSeries.index >pd.Timestamp('September 2 2018'))])
print()

# select using a time delta (counted back from the last date in the series, 2019-10-07, not from today)
print('Today is', pd.Timestamp('now'))
print('Select the data from the last 13 days')
print(timeSeries[timeSeries.index > (pd.Timestamp('2019-10-07')-pd.Timedelta('13 days'))])

Select a specific index
timeSeries['2019-10-4']:  0.5988650857599176

Select just the values before Sept 6 2018 and after Sept 2 2018
2018-09-03    0.482283
2018-09-04    0.347461
2018-09-05    0.129873
Freq: D, dtype: float64

Today is 2021-05-27 21:13:39.631102
Select the data from the last 13 days
2019-09-25    1.756918
2019-09-26    0.783440
2019-09-27   -0.384135
2019-09-28    2.105463
2019-09-29    0.736659
2019-09-30   -2.239383
2019-10-01    0.409692
2019-10-02    1.279699
2019-10-03   -0.253456
2019-10-04    0.598865
2019-10-05   -0.577108
2019-10-06    0.169349
2019-10-07   -0.068497
Freq: D, dtype: float64
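
Date-based selection can also be written with date strings passed to `.loc`. The following is a sketch rather than the method used above: it relies on pandas' partial string indexing, and it uses an illustrative series `counts` (a running count over the same date range) so that the results are reproducible.

```python
import numpy as np
import pandas as pd

# illustrative series: a running count over the same date range
idx = pd.date_range('2016-01-01', '2019-10-07')
counts = pd.Series(np.arange(len(idx)), index=idx)

# all entries from one month, selected with a partial date string
sept2018 = counts.loc['2018-09']

# a label slice; unlike positional slices, both end points are included
threeDays = counts.loc['2018-09-03':'2018-09-05']

print(len(sept2018))
print(threeDays)
```

The label slice selects the same three dates as the two-sided comparison on the index shown earlier.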


The Timestamp objects that make up the DatetimeIndex expose useful attributes for each index label, which you can use in comparisons:

year: the year of the datetime\
month: the month as a number (January=1, December=12)\
day: the day of the month of the datetime\
hour: the hour of the datetime\
minute: the minute of the datetime\
second: the second of the datetime\
microsecond: the microsecond of the datetime\
nanosecond: the nanosecond of the datetime

Note that these attributes return numbers (for example, 12 for December), not text labels!

You can see a full list of attributes in the pandas API reference documentation:

https://pandas.pydata.org/pandas-docs/version/0.25/reference/api/pandas.DatetimeIndex.html
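
As a short sketch of the difference between numbers and labels: the attributes listed above return integers, while the DatetimeIndex methods `month_name()` and `day_name()` return text. The index `stamps` and its two timestamps are illustrative values, not data from this example.

```python
import pandas as pd

# illustrative timestamps with a time-of-day component
stamps = pd.DatetimeIndex(['2018-12-24 18:30:00', '2019-01-01 00:15:45'])

print(stamps.year)    # numeric year of each label
print(stamps.hour)    # numeric hour of each label

# month_name() and day_name() return text labels instead of numbers
print(stamps.month_name())
print(stamps.day_name())
```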

In [10]:
print(timeSeries.index.month)
print(timeSeries.index.day)

# build a boolean array marking which rows have a date in December
print(timeSeries.index.month == 12)

# count the number of December entries (each True counts as 1)
decemberCount = (timeSeries.index.month == 12).sum()

print('There are', decemberCount, 'data entries for December')

Int64Index([ 1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
            ...
             9,  9,  9, 10, 10, 10, 10, 10, 10, 10],
           dtype='int64', length=1376)
Int64Index([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10,
            ...
            28, 29, 30,  1,  2,  3,  4,  5,  6,  7],
           dtype='int64', length=1376)
[False False False ... False False False]
There are 93 data entries for December


To get a clearer view of the data, we can convert the Series to a DataFrame with a named column.

In [11]:
timeFrame = pd.DataFrame(timeSeries, columns = ['data'])
timeFrame

                data
2016-01-01  1.229503
2016-01-02  1.446639
2016-01-03  2.176954
2016-01-04  1.047084
2016-01-05 -0.227227
...              ...
2019-10-03 -0.253456
2019-10-04  0.598865
2019-10-05 -0.577108
2019-10-06  0.169349
