## Time Series
- A time series is a series of data points indexed (or listed or graphed) in time order. The data is therefore organized by relatively deterministic timestamps and may be compared to random sample data.

- Time series data is an important form of structured data in many different fields, such as finance, economics, ecology, neuroscience, and physics. Anything that is observed or measured at many points in time forms a time series.

- Time series analysis comprises methods for analyzing time series data in order to extract meaningful statistics and other characteristics of the data. Time series forecasting is the use of a model to predict future values based on previously observed values.

- Regression analysis, by contrast, is often employed to test theories that the current values of one or more independent time series affect the current value of another time series.

- Time series play a significant role in tasks such as predicting stock prices or forecasting tomorrow's weather conditions.

- How you mark and refer to time series data depends on the application. Typical forecasting problems include:

- Forecasting the birth rate at all hospitals in a city each year.
- Forecasting product sales in units sold each day for a store.
- Forecasting the number of passengers through a train station each day.
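
As a minimal sketch of the definition above, a time series can be held in pandas as values indexed by timestamps. The sales figures and the name `daily_sales` are made up for illustration and do not come from a real data set.

```python
import pandas as pd

# Hypothetical units sold per day for one store (illustrative values only).
daily_sales = pd.Series(
    [120, 135, 128, 150, 142],
    index=pd.date_range("2020-02-10", periods=5, freq="D"),
    name="units_sold",
)

# The index is ordered in time, which is what makes this a time series.
print(daily_sales.index.is_monotonic_increasing)
print(daily_sales)
```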

In [1]:
from datetime import datetime

#### Get the current date and time

In [2]:
a = datetime.now()
a

datetime.datetime(2020, 2, 13, 11, 31, 17, 809214)

In [9]:
a.year


2020

In [10]:
a.date()

datetime.date(2020, 2, 13)

In [11]:
print('sum of {} and {} = {}'.format(2,5,2+5))


sum of 2 and 5 = 7


In [6]:
now = datetime.now()
print(now)
print('Date now :{}-{}-{}'.format(now.day, now.month, now.year))
print('Time now :{}:{}:{}'.format(now.hour,now.minute, now.second))

2020-02-13 11:32:28.645225
Date now :13-2-2020
Time now :11:32:28
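
The output above is not zero-padded (`13-2-2020`). One possible alternative, sketched here with a fixed timestamp so the result is reproducible, is to put strftime-style directives directly in an f-string format spec. The variable names are illustrative.

```python
from datetime import datetime

# A fixed timestamp is used instead of datetime.now() so the result is reproducible.
moment = datetime(2020, 2, 13, 11, 32, 28)

date_text = f"{moment:%d-%m-%Y}"  # zero-padded day and month
time_text = f"{moment:%H:%M:%S}"
print("Date now :" + date_text)
print("Time now :" + time_text)
```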


### datetime and timedelta

- datetime stores both the date and the time, down to the microsecond.
- timedelta represents the temporal difference between two datetime objects:

In [19]:
delta = datetime(2020, 2, 11) - datetime(2020, 2, 13, 14, 30)
delta

datetime.timedelta(days=-3, seconds=34200)

In [20]:
datetime.now() - datetime(1998,5,28,17,45)

datetime.timedelta(days=7930, seconds=64464, microseconds=276508)

In [21]:
delta.seconds

34200

In [23]:
td = datetime(2020, 2, 11) - datetime(2020, 2, 13, 14, 30)
td

datetime.timedelta(days=-3, seconds=34200)

In [24]:
(td.microseconds + (td.seconds + td.days * 24 * 3600) * 10**6) / 10**6

-225000.0
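
The manual computation above can be cross-checked against the built-in `timedelta.total_seconds()` method. The sketch below also shows how a negative difference is normalized, which is why `days` is -3 while `seconds` is positive.

```python
from datetime import datetime

td = datetime(2020, 2, 11) - datetime(2020, 2, 13, 14, 30)

# A negative difference is normalized: days goes negative, seconds stays in [0, 86400).
print(td.days, td.seconds)

# total_seconds() combines days, seconds and microseconds into a single float.
manual = (td.microseconds + (td.seconds + td.days * 24 * 3600) * 10**6) / 10**6
print(td.total_seconds() == manual)
```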

In [25]:
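# Using timedelta: import the needed names explicitly instead of a wildcard import
# (date is used in a later cell).
from datetime import date, datetime, timedelta
datetime(2011, 12, 26) + timedelta(365)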

datetime.datetime(2012, 12, 25, 0, 0)

### Converting Between String and Datetime

- The strptime() method creates a datetime object from the given string.

- Note: You cannot create a datetime object from every string. The string needs to match the format you pass to strptime().



In [26]:
value = '2019 december 12' 
datetime.strptime(value, '%Y %B %d')

datetime.datetime(2019, 12, 12, 0, 0)
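
To illustrate the note above: when the string does not match the format, `strptime()` raises `ValueError`. The helper `parse_or_none` below is a hypothetical name used only for this sketch.

```python
from datetime import datetime

def parse_or_none(text, fmt):
    """Return a datetime, or None when text does not match fmt (hypothetical helper)."""
    try:
        return datetime.strptime(text, fmt)
    except ValueError:
        return None

ok = parse_or_none("2019 december 12", "%Y %B %d")
bad = parse_or_none("12/12/2019", "%Y %B %d")  # wrong layout for this format
print(ok)
print(bad)
```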

## Datetime format specification (ISO C89 compatible)
- %Y - 4-digit year
- %y - 2-digit year
- %m - 2-digit month [01, 12]
- %d - 2-digit day [01, 31]
- %H - Hour (24-hour clock) [00, 23]
- %I - Hour (12-hour clock) [01, 12]
- %M - 2-digit minute [00, 59]
- %S - Second [00, 61] (seconds 60, 61 account for leap seconds)
- %w - Weekday as integer [0 (Sunday), 6]
- %U - Week number of the year [00, 53]. Sunday is considered the first day of the week, and days before the first Sunday of the year are “week 0”.
- %W - Week number of the year [00, 53]. Monday is considered the first day of the week, and days before the first Monday of the year are “week 0”.
- %z - UTC time zone offset as +HHMM or -HHMM, empty if time zone naive
- %F - Shortcut for %Y-%m-%d, for example 2012-04-18
- %D - Shortcut for %m/%d/%y, for example 04/18/12
## Locale-specific date formatting
- %a - Abbreviated weekday name
- %A - Full weekday name
- %b - Abbreviated month name
- %B - Full month name
- %c - Full date and time, for example ‘Tue 01 May 2012 04:20:57 PM’
- %p - Locale equivalent of AM or PM
- %x - Locale-appropriate formatted date; e.g. in the US, May 1, 2012 yields ‘05/01/2012’
- %X - Locale-appropriate time, e.g. ‘04:24:12 PM’
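
A short sketch of a few of the directives listed above, applied with `strftime()` to a fixed timestamp. Only numeric directives are stored in variables here, because the output of the locale-specific ones depends on the active locale.

```python
from datetime import datetime

moment = datetime(2012, 5, 1, 16, 20, 57)  # a Tuesday

iso_like = moment.strftime("%Y-%m-%d %H:%M:%S")
twelve_hour = moment.strftime("%I:%M")
short_year = moment.strftime("%y")
weekday_number = moment.strftime("%w")  # 0 is Sunday
print(iso_like, twelve_hour, short_year, weekday_number)

# Locale-dependent directives: the exact text depends on the active locale.
print(moment.strftime("%A %d %B %Y"))
```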

In [27]:
datestrs = ['11-December-2019', '26-December-1993']
[datetime.strptime(x,"%d-%B-%Y") for x in datestrs]

[datetime.datetime(2019, 12, 11, 0, 0), datetime.datetime(1993, 12, 26, 0, 0)]

In [28]:
a = date.today()
a

datetime.date(2020, 2, 13)

In [29]:
a.strftime('%d  %m  %Y')

'13  02  2020'

In [30]:
dt_string = "12/11/2018 09:15:32"
# Considering date is in dd/mm/yyyy format
dt_object1 = datetime.strptime(dt_string, "%d/%m/%Y %H:%M:%S")
print("dt_object1 =", dt_object1)
# Considering date is in mm/dd/yyyy format
dt_object2 = datetime.strptime(dt_string, "%m/%d/%Y %H:%M:%S")
print("dt_object2 =", dt_object2)

dt_object1 = 2018-11-12 09:15:32
dt_object2 = 2018-12-11 09:15:32
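
The same day-first versus month-first ambiguity shows up when parsing with pandas. As a sketch (`pd.to_datetime` and its `dayfirst` parameter are not used in the original cells), the ambiguity can be resolved explicitly:

```python
import pandas as pd

dt_string = "12/11/2018 09:15:32"

month_first = pd.to_datetime(dt_string)               # default reading: mm/dd/yyyy
day_first = pd.to_datetime(dt_string, dayfirst=True)  # dd/mm/yyyy
print(month_first)
print(day_first)
```

Passing an explicit `format=` string, as in the `strptime()` calls above, is another way to avoid guessing.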


### Generating date ranges


- [offset aliases](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html)

In [32]:
import pandas as pd
# 'M' is the month-end alias; pandas 2.2 and later deprecate it in favor of 'ME'.
pd.date_range('2000-01-01', '2001-01-01', freq='6M')

DatetimeIndex(['2000-01-31', '2000-07-31'], dtype='datetime64[ns]', freq='6M')

In [34]:
pd.date_range('2000-01-01', '2001-01-01', freq='BM')

DatetimeIndex(['2000-01-31', '2000-02-29', '2000-03-31', '2000-04-28',
               '2000-05-31', '2000-06-30', '2000-07-31', '2000-08-31',
               '2000-09-29', '2000-10-31', '2000-11-30', '2000-12-29'],
              dtype='datetime64[ns]', freq='BM')

In [35]:
rng = pd.date_range('2019-01-01', '2020-01-01', freq='WOM-4FRI')
rng

DatetimeIndex(['2019-01-25', '2019-02-22', '2019-03-22', '2019-04-26',
               '2019-05-24', '2019-06-28', '2019-07-26', '2019-08-23',
               '2019-09-27', '2019-10-25', '2019-11-22', '2019-12-27'],
              dtype='datetime64[ns]', freq='WOM-4FRI')
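
The `WOM-4FRI` alias means "week of month, fourth Friday". A quick sanity check, sketched below, is to confirm that every generated date is a Friday and falls on day 22 through 28 of its month, which is where a fourth weekday occurrence always lands.

```python
import pandas as pd

rng = pd.date_range("2019-01-01", "2020-01-01", freq="WOM-4FRI")

# dayofweek uses Monday=0, so Friday is 4.
all_fridays = bool((rng.dayofweek == 4).all())
# The fourth occurrence of a weekday always falls on day 22..28 of the month.
all_fourth = bool(((rng.day >= 22) & (rng.day <= 28)).all())
print(all_fridays, all_fourth, len(rng))
```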

In [36]:
pd.date_range('2019-01-01', '2020-01-01', freq='D')

DatetimeIndex(['2019-01-01', '2019-01-02', '2019-01-03', '2019-01-04',
               '2019-01-05', '2019-01-06', '2019-01-07', '2019-01-08',
               '2019-01-09', '2019-01-10',
               ...
               '2019-12-23', '2019-12-24', '2019-12-25', '2019-12-26',
               '2019-12-27', '2019-12-28', '2019-12-29', '2019-12-30',
               '2019-12-31', '2020-01-01'],
              dtype='datetime64[ns]', length=366, freq='D')

### Create a Series with a date range as index


In [37]:

import pandas as pd
import numpy as np
longer_ts = pd.Series(np.random.randn(1000),
                      index=pd.date_range('1/1/2000', periods=1000))
longer_ts

2000-01-01   -1.035165
2000-01-02   -0.105818
2000-01-03   -0.678120
2000-01-04    0.354018
2000-01-05    0.922401
                ...   
2002-09-22    0.521529
2002-09-23   -0.171666
2002-09-24    1.566843
2002-09-25    1.501293
2002-09-26    0.281345
Freq: D, Length: 1000, dtype: float64

In [38]:
longer_ts['2001']

2001-01-01    0.934788
2001-01-02   -1.746761
2001-01-03   -1.325442
2001-01-04    0.563798
2001-01-05   -0.409987
                ...   
2001-12-27   -1.461299
2001-12-28   -0.152775
2001-12-29    0.579588
2001-12-30    1.091576
2001-12-31   -1.345360
Freq: D, Length: 365, dtype: float64

In [39]:
longer_ts['2001-05']

2001-05-01   -0.037713
2001-05-02   -0.716731
2001-05-03   -1.783294
2001-05-04    0.161208
2001-05-05   -1.486790
2001-05-06    0.115736
2001-05-07   -0.422307
2001-05-08    0.829415
2001-05-09    1.508537
2001-05-10    0.557393
2001-05-11    0.076024
2001-05-12   -0.308470
2001-05-13    0.436586
2001-05-14   -0.243743
2001-05-15   -0.005466
2001-05-16    1.250341
2001-05-17    0.627963
2001-05-18    0.095354
2001-05-19    1.126371
2001-05-20    1.503271
2001-05-21    1.230497
2001-05-22   -0.958886
2001-05-23    1.922828
2001-05-24   -1.911466
2001-05-25    1.937470
2001-05-26   -1.689481
2001-05-27   -1.290522
2001-05-28   -0.902891
2001-05-29    1.018093
2001-05-30    0.289841
2001-05-31    0.441249
Freq: D, dtype: float64
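
The selections above rely on partial string indexing: a year or year-month string selects every row in that period. The sketch below repeats the idea with `.loc` and a seeded generator (both are choices made for this example, not part of the original cells) and adds a date-range slice.

```python
import numpy as np
import pandas as pd

values = np.random.default_rng(0).standard_normal(1000)  # seeded for reproducibility
ts = pd.Series(values, index=pd.date_range("2000-01-01", periods=1000))

year_2001 = ts.loc["2001"]                      # every day in 2001
may_2001 = ts.loc["2001-05"]                    # every day in May 2001
first_week = ts.loc["2001-01-01":"2001-01-07"]  # both endpoints are included
print(len(year_2001), len(may_2001), len(first_week))
```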

In [40]:
dates = pd.date_range('1/1/2019', periods=100, freq='W-WED')
dates

DatetimeIndex(['2019-01-02', '2019-01-09', '2019-01-16', '2019-01-23',
               '2019-01-30', '2019-02-06', '2019-02-13', '2019-02-20',
               '2019-02-27', '2019-03-06', '2019-03-13', '2019-03-20',
               '2019-03-27', '2019-04-03', '2019-04-10', '2019-04-17',
               '2019-04-24', '2019-05-01', '2019-05-08', '2019-05-15',
               '2019-05-22', '2019-05-29', '2019-06-05', '2019-06-12',
               '2019-06-19', '2019-06-26', '2019-07-03', '2019-07-10',
               '2019-07-17', '2019-07-24', '2019-07-31', '2019-08-07',
               '2019-08-14', '2019-08-21', '2019-08-28', '2019-09-04',
               '2019-09-11', '2019-09-18', '2019-09-25', '2019-10-02',
               '2019-10-09', '2019-10-16', '2019-10-23', '2019-10-30',
               '2019-11-06', '2019-11-13', '2019-11-20', '2019-11-27',
               '2019-12-04', '2019-12-11', '2019-12-18', '2019-12-25',
               '2020-01-01', '2020-01-08', '2020-01-15', '2020-01-22',
      

In [41]:

long_df = pd.DataFrame(np.random.randn(100, 4), 
                        index=dates, 
                       columns=['Colorado','Texas','New York', 'Ohio'])


In [43]:
long_df.head()

            Colorado     Texas  New York      Ohio
2019-01-02  1.700526  1.230644  1.354581  0.169942
2019-01-09 -0.044499 -1.385786  1.307881  3.111295
2019-01-16  1.367792  0.616747 -1.474092  0.347270
2019-01-23  0.976271  0.658303 -0.321321  0.729367
2019-01-30  1.255295  2.474693  1.932839  1.062833


In [44]:
long_df.loc['5-2019']

            Colorado     Texas  New York      Ohio
2019-05-01 -0.797864 -0.288480 -0.369869  1.759240
2019-05-08  0.557340 -0.626431  0.461640 -0.591556
2019-05-15 -2.037685  0.155189 -0.291220 -0.101812
2019-05-22 -0.070325 -0.827663 -2.080966 -1.696244
2019-05-29  0.298730  0.932624 -0.270876  1.348643
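
Partial string indexing on a DataFrame can be combined with a column label in the same `.loc` call. This is a sketch with a seeded generator and the illustrative name `df`; the values will not match the output shown above.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2019-01-01", periods=100, freq="W-WED")
df = pd.DataFrame(
    np.random.default_rng(0).standard_normal((100, 4)),
    index=dates,
    columns=["Colorado", "Texas", "New York", "Ohio"],
)

may_rows = df.loc["2019-05"]            # all columns, rows in May 2019
may_texas = df.loc["2019-05", "Texas"]  # a single column for the same rows
print(may_rows.shape, may_texas.shape)
```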


In [45]:
pd.date_range(start='2012-04-01', end='2012-06-01', freq='W-Thu')

DatetimeIndex(['2012-04-05', '2012-04-12', '2012-04-19', '2012-04-26',
               '2012-05-03', '2012-05-10', '2012-05-17', '2012-05-24',
               '2012-05-31'],
              dtype='datetime64[ns]', freq='W-THU')

In [46]:
#freq can also be specified as an Offset object.
pd.date_range(start='1/1/2020', periods=5, freq=pd.offsets.MonthEnd(3))

DatetimeIndex(['2020-01-31', '2020-04-30', '2020-07-31', '2020-10-31',
               '2021-01-31'],
              dtype='datetime64[ns]', freq='3M')

In [47]:
# The same frequency can be written as a string alias instead of an Offset object.
pd.date_range(start='1/1/2019', periods=5, freq='3M')

DatetimeIndex(['2019-01-31', '2019-04-30', '2019-07-31', '2019-10-31',
               '2020-01-31'],
              dtype='datetime64[ns]', freq='3M')
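
To confirm that the Offset object and the string alias describe the same frequency, the two indexes can be compared directly. This sketch assumes that pandas 2.2 and later spell the month-end alias `ME`, while older versions only accept `M`, so it tries the newer spelling first and falls back.

```python
import pandas as pd

by_offset = pd.date_range(start="2019-01-01", periods=5, freq=pd.offsets.MonthEnd(3))

try:
    # Newer pandas spelling of the month-end alias.
    by_alias = pd.date_range(start="2019-01-01", periods=5, freq="3ME")
except ValueError:
    # Older pandas versions only know the 'M' spelling.
    by_alias = pd.date_range(start="2019-01-01", periods=5, freq="3M")

print(by_offset.equals(by_alias))
```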