Time series data is an important form of structured data in many different fields, such as finance, economics, ecology, neuroscience, and physics. Anything that is observed or measured at many points in time forms a time series. Many time series are fixed frequency, which is to say that data points occur at regular intervals according to some rule, such as every 15 seconds, every 5 minutes, or once per month. Time series can also be irregular, without a fixed unit of time or offset between units. How you mark and refer to time series data depends on the application, and you may have one of the following:
• Timestamps, specific instants in time
• Fixed periods, such as the month January 2007 or the full year 2010
• Intervals of time, indicated by a start and end timestamp. Periods can be thought
of as special cases of intervals
• Experiment or elapsed time; each timestamp is a measure of time relative to a
particular start time. For example, the diameter of a cookie baking each second
since being placed in the oven
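
As a rough illustration of the four notions above, pandas has an object that corresponds loosely to each one. This is only a sketch that assumes pandas is installed; the variable names and the specific dates are made up for the example.

```python
import pandas as pd

# Timestamp: a specific instant in time
stamp = pd.Timestamp('2011-01-03 09:30')

# Fixed period: the whole month of January 2007
period = pd.Period('2007-01', freq='M')

# Interval: an explicit start and end timestamp (start included, end excluded)
interval = pd.Interval(pd.Timestamp('2007-01-01'), pd.Timestamp('2007-02-01'), closed='left')

# Elapsed time: a duration measured relative to some start time
elapsed = pd.Timedelta(seconds=90)

print(period.start_time)                        # first instant covered by the period
print(pd.Timestamp('2007-01-15') in interval)   # membership test on the interval
print(stamp + elapsed)                          # instant reached after the elapsed time
```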

Date and Time Data Types and Tools:
The datetime, time, and calendar modules are the main places to start. The datetime.datetime type, or simply datetime, is widely used:

In [3]:
from datetime import datetime

In [4]:
now = datetime.now()
now

datetime.datetime(2017, 9, 10, 9, 14, 18, 579195)

In [5]:
now.date(), now.hour, now.day, now.month, now.year, now.isoweekday(), now.time()

(datetime.date(2017, 9, 10),
 9,
 10,
 9,
 2017,
 7,
 datetime.time(9, 14, 18, 579195))

In [6]:
delta = datetime(2011, 1, 7) - datetime(2008, 6, 24, 8, 15)

In [7]:
delta

datetime.timedelta(926, 56700)

In [8]:
delta.days

926
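
The difference between two datetime objects is a timedelta, which stores whole days and leftover seconds separately. The short sketch below recomputes the same difference and also shows, as an illustrative extra, that a timedelta can be added to or subtracted from a datetime; the name start is hypothetical.

```python
from datetime import datetime, timedelta

delta = datetime(2011, 1, 7) - datetime(2008, 6, 24, 8, 15)

# A timedelta keeps whole days and the remaining seconds as separate fields
print(delta.days, delta.seconds)
print(delta.total_seconds())   # the whole span expressed in seconds

# Adding or subtracting a timedelta shifts a datetime
start = datetime(2011, 1, 7)
print(start + timedelta(12))       # 12 days later
print(start - 2 * timedelta(12))   # 24 days earlier
```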

Converting between string and datetime:

In [9]:
stamp = datetime(2011,1,3)

In [10]:
str(stamp)  # str() gives the default string representation of a datetime

'2011-01-03 00:00:00'

In [11]:
stamp.strftime('%y-%m-%d')  # strftime formats a datetime as a string

'11-01-03'

In [12]:
value = '11-01-03'
datetime.strptime(value, '%y-%m-%d')  # strptime parses a string into a datetime

datetime.datetime(2011, 1, 3, 0, 0)

In [13]:
listDate = ['11-01-03','13-05-13']
[datetime.strptime(x, '%y-%m-%d') for x in listDate]

[datetime.datetime(2011, 1, 3, 0, 0), datetime.datetime(2013, 5, 13, 0, 0)]
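
The format spec is built from standard codes. As a small sketch with a made-up input string, the example below uses a four-digit year and a time of day, checks that strftime and strptime round-trip with the same spec, and shows that a string not matching the spec raises ValueError.

```python
from datetime import datetime

# %Y = 4-digit year, %y = 2-digit year, %m = month, %d = day,
# %H = hour (24-hour clock), %M = minute, %S = second
text = '2011-01-03 14:30:00'
parsed = datetime.strptime(text, '%Y-%m-%d %H:%M:%S')

# Formatting with the same spec gives back the original string
round_trip = parsed.strftime('%Y-%m-%d %H:%M:%S')
print(parsed, round_trip == text)

# A string that does not match the spec raises ValueError
try:
    datetime.strptime('03/01/2011', '%Y-%m-%d')
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)
```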

datetime.strptime is the best way to parse a date with a known format. However, it
can be tedious to write a format spec each time, especially for common
date formats. In this case, you can use the parser.parse function in the third-party
dateutil package:

In [21]:
from dateutil.parser import parse

In [22]:
parse('13-05-13')

datetime.datetime(2013, 5, 13, 0, 0)

In [23]:
parse('Jan 31, 1997 10:45 PM')

datetime.datetime(1997, 1, 31, 22, 45)
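
Because parse guesses the format, ambiguous strings deserve some care. As a hedged sketch (the date string is made up), the dayfirst argument tells dateutil to read the first number as the day, which is common in many locales:

```python
from dateutil.parser import parse

# By default the first number is read as the month
month_first = parse('6/12/2011')
print(month_first)

# dayfirst=True reads the first number as the day instead
day_first = parse('6/12/2011', dayfirst=True)
print(day_first)
```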

Time Series Basics:
The most basic kind of time series object in pandas is a Series indexed by timestamps,
which are often represented outside of pandas as Python strings or datetime objects:

In [24]:
dates = [datetime(2011, 1, 2), datetime(2011, 1, 5), datetime(2011, 1, 7),
         datetime(2011, 1, 8), datetime(2011, 1, 10), datetime(2011, 1, 12)]

In [27]:
from pandas import Series, DataFrame as pd
import numpy as np

ts = Series(np.random.rand(6),index=dates)

In [28]:
ts

2011-01-02    0.942420
2011-01-05    0.197447
2011-01-07    0.534556
2011-01-08    0.068821
2011-01-10    0.001562
2011-01-12    0.972660
dtype: float64

In [29]:
type(ts)

pandas.core.series.Series

In [30]:
ts.index

DatetimeIndex(['2011-01-02', '2011-01-05', '2011-01-07', '2011-01-08',
               '2011-01-10', '2011-01-12'],
              dtype='datetime64[ns]', freq=None)

In [31]:
ts

2011-01-02    0.942420
2011-01-05    0.197447
2011-01-07    0.534556
2011-01-08    0.068821
2011-01-10    0.001562
2011-01-12    0.972660
dtype: float64

In [32]:
ts + ts[::2]

2011-01-02    1.884839
2011-01-05         NaN
2011-01-07    1.069111
2011-01-08         NaN
2011-01-10    0.003124
2011-01-12         NaN
dtype: float64
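
The NaN values above come from index alignment: arithmetic between two time series matches rows by timestamp, and timestamps present in only one operand produce NaN. The sketch below repeats the idea with fixed values so the result is reproducible; small_ts and every_other are illustrative names.

```python
import pandas as pd

idx = pd.to_datetime(['2011-01-02', '2011-01-05', '2011-01-07', '2011-01-08'])
small_ts = pd.Series([1.0, 2.0, 3.0, 4.0], index=idx)

# small_ts[::2] keeps only the 1st and 3rd timestamps
every_other = small_ts[::2]

# Addition matches rows by timestamp; unmatched timestamps become NaN
total = small_ts + every_other
print(total)
```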

In [33]:
ts.dtype

dtype('float64')

In [34]:
ts.index.dtype

dtype('<M8[ns]')

Indexing, Selection, Subsetting:
A Series indexed by timestamps behaves like any other Series with regard to
indexing and selecting data based on label.

In [35]:
stamp = ts.index[2]

In [36]:
stamp

Timestamp('2011-01-07 00:00:00')

In [37]:
ts[stamp]

0.53455560649584044

In [38]:
ts["2011-01-07"]

0.53455560649584044

In [41]:
# For longer time series, a year or a year and month can be passed to select slices of data.
# Bind pd to the pandas module itself: date_range is a module-level function, not a DataFrame attribute.
import pandas as pd

longer_ts = Series(np.random.randn(1000), index=pd.date_range('1/1/2000', periods=1000))

In [43]:
index = pd.date_range('4/1/2012', '6/1/2012')
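
Selecting by year or by year and month can be sketched as follows. This is a self-contained example that imports pandas as pd directly and uses deterministic values; long_ts, year_2001, and may_2001 are illustrative names, and .loc is used here as one way to make the label-based selection explicit.

```python
import numpy as np
import pandas as pd

# 1000 consecutive days starting 2000-01-01
long_ts = pd.Series(np.arange(1000), index=pd.date_range('2000-01-01', periods=1000))

year_2001 = long_ts.loc['2001']      # every row dated in 2001
may_2001 = long_ts.loc['2001-05']    # every row dated in May 2001
print(len(year_2001), len(may_2001))
```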

In [49]:
from pandas.tseries.offsets import Hour, Minute

In [52]:
hour = Hour()
hour

<Hour>

In [53]:
four_hours = Hour(4)

In [54]:
four_hours

<4 * Hours>
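
Offset objects such as Hour(4) are mainly useful in combination with timestamps and date ranges. A minimal sketch, with illustrative variable names, of two common uses:

```python
import pandas as pd
from pandas.tseries.offsets import Hour

# Adding an offset shifts a timestamp by that amount
shifted = pd.Timestamp('2011-01-01') + Hour(4)
print(shifted)

# An offset can also serve as the frequency of a date range
rng = pd.date_range('2000-01-01', periods=3, freq=Hour(4))
print(rng)
```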

In [57]:
import pytz as tz

In [60]:
tz.timezone('US/Eastern')

<DstTzInfo 'US/Eastern' LMT-1 day, 19:04:00 STD>

In [61]:
ts_utc = ts.tz_localize('UTC')

In [62]:
ts_utc

2011-01-02 00:00:00+00:00    0.942420
2011-01-05 00:00:00+00:00    0.197447
2011-01-07 00:00:00+00:00    0.534556
2011-01-08 00:00:00+00:00    0.068821
2011-01-10 00:00:00+00:00    0.001562
2011-01-12 00:00:00+00:00    0.972660
dtype: float64
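
Once a series has been localized, it can be viewed in another time zone with tz_convert. The sketch below is self-contained and uses made-up values; it assumes the 'America/New_York' zone name rather than 'US/Eastern', since the latter is a legacy alias that may not be available in every time zone database.

```python
import pandas as pd

naive = pd.Series([1.0, 2.0], index=pd.to_datetime(['2011-01-02', '2011-01-05']))

# Attach UTC to the naive timestamps, then view them in New York time
utc = naive.tz_localize('UTC')
eastern = utc.tz_convert('America/New_York')
print(eastern)
```

The values are unchanged and each timestamp still refers to the same instant; only the wall-clock representation differs.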