**Time series may be regular or irregular.**

* Many time series are fixed frequency, which is to say that data points occur at regular intervals according to some rule, such as every 15 seconds, every 5 minutes, or once per month.

* Time series can also be irregular, with no fixed unit of time or offset between data points. How you mark and refer to time series data depends on the application.
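
As a rough illustration of the distinction above, the sketch below builds one regular and one irregular index and asks pandas to guess a frequency with `pd.infer_freq`. The variable names and sample dates are made up for this example.

```python
import pandas as pd

# Regular: one point every 15 seconds.
regular = pd.date_range("2023-01-01", periods=5, freq="15s")

# Irregular: gaps of 1, 3 and 4 days, so no single rule fits.
irregular = pd.DatetimeIndex(
    ["2023-01-01", "2023-01-02", "2023-01-05", "2023-01-09"]
)

# infer_freq returns a frequency string when the spacing is regular
# and None when it cannot find one.
regular_freq = pd.infer_freq(regular)
irregular_freq = pd.infer_freq(irregular)

print(regular_freq)
print(irregular_freq)
```

The exact spelling of the inferred frequency string can differ between pandas versions, so it is safer to test for `None` than to compare against a literal.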

In [None]:
import calendar
from datetime import datetime


The datetime type stores both the date and the time, down to the microsecond.


In [None]:
datetime.now()

datetime.datetime(2023, 5, 9, 6, 24, 17, 451992)

In [None]:
print(f'year {datetime.now().year}')
print(f'month {datetime.now().month}')
print(f'day {datetime.now().day}')

year 2023
month 5
day 9


datetime.timedelta, or simply timedelta, represents the temporal difference between two datetime objects.

In [None]:
delta = datetime(2023,1,8) - datetime(1992,1,8)
delta

datetime.timedelta(days=11323)

In [None]:
delta.days

11323

#### **Types in the datetime module**

* **date** Store calendar date (year, month, day) using the Gregorian calendar.
* **time** Store time of day as hours, minutes, seconds, and microseconds.
* **datetime** Store both date and time.
* **timedelta** The difference between two datetime values (as days, seconds, and microseconds).
* **tzinfo** Base type for storing time zone information.
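
A minimal sketch of how the first four types in the list fit together, using only the standard library. The specific date and time values are arbitrary.

```python
from datetime import date, time, datetime, timedelta

d = date(2023, 1, 5)   # calendar date only
t = time(14, 30)       # time of day only

# Combine a date and a time into a full datetime.
combined = datetime.combine(d, t)

# Adding a timedelta to a datetime yields a new, shifted datetime.
later = combined + timedelta(days=2, hours=12)

print(combined)  # 2023-01-05 14:30:00
print(later)     # 2023-01-08 02:30:00
```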

In [None]:
stamp=datetime(2023,1,5)

In [None]:
str(stamp)

'2023-01-05 00:00:00'

In [None]:
stamp.strftime('%d-%m-%Y')

'05-01-2023'

#### **datetime format specification**

**%Y** Four-digit year.  
**%y** Two-digit year.  
**%m** Two-digit month [01, 12].  
**%d** Two-digit day [01, 31].  
**%H** Hour (24-hour clock) [00, 23].  
**%I** Hour (12-hour clock) [01, 12].  
**%M** Two-digit minute [00, 59].  
**%S** Second [00, 61] (seconds 60, 61 account for leap seconds).  
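
The codes above can be combined freely. The following sketch formats a timestamp two ways and then parses one of the strings back; the sample timestamp is arbitrary.

```python
from datetime import datetime

stamp = datetime(2023, 1, 5, 17, 4, 9)

# 24-hour clock with a four-digit year.
full_text = stamp.strftime("%Y-%m-%d %H:%M:%S")   # '2023-01-05 17:04:09'

# 12-hour clock with a two-digit year.
short_text = stamp.strftime("%d/%m/%y %I:%M")     # '05/01/23 05:04'

# strptime reverses strftime when given the same format string.
parsed = datetime.strptime(full_text, "%Y-%m-%d %H:%M:%S")

print(full_text)
print(short_text)
print(parsed == stamp)  # True
```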


In [None]:
# convert strings to dates using datetime.strptime
date_val='2023-05-06'
datetime.strptime(date_val,'%Y-%m-%d')

datetime.datetime(2023, 5, 6, 0, 0)

In [None]:
datestrs = ["7/6/2023", "8/6/2023"]

for i in datestrs:
  print(datetime.strptime(i,'%d/%m/%Y'))

date_val_list=[datetime.strptime(dt_value,'%d/%m/%Y') for dt_value in datestrs]
print(f'Value using list comprehension {date_val_list}')
  

2023-06-07 00:00:00
2023-06-08 00:00:00
Value using list comprehension [datetime.datetime(2023, 6, 7, 0, 0), datetime.datetime(2023, 6, 8, 0, 0)]


pandas has a built-in function, to_datetime(), that converts dates and times in string format to pandas datetime objects. Passing a list of strings returns a DatetimeIndex:

In [None]:
import pandas as pd
pd.to_datetime(datestrs)

DatetimeIndex(['2023-07-06', '2023-08-06'], dtype='datetime64[ns]', freq=None)

In [None]:
pd.to_datetime(datestrs+[None])

DatetimeIndex(['2023-07-06', '2023-08-06', 'NaT'], dtype='datetime64[ns]', freq=None)
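
Note that the outputs above read "7/6/2023" as July 6, whereas the earlier strptime call with '%d/%m/%Y' read the same string as June 7. By default, to_datetime treats ambiguous strings as month-first. The sketch below shows two ways to ask for day-first parsing; it reuses the `datestrs` values from the cells above.

```python
import pandas as pd

datestrs = ["7/6/2023", "8/6/2023"]

# Default: ambiguous strings are read month-first (July 6, August 6).
month_first = pd.to_datetime(datestrs)

# dayfirst=True reads them day-first (June 7, June 8).
day_first = pd.to_datetime(datestrs, dayfirst=True)

# An explicit format removes the ambiguity entirely.
explicit = pd.to_datetime(datestrs, format="%d/%m/%Y")

print(month_first)
print(day_first)
print(explicit)
```

Passing an explicit format is usually the safer choice when the layout of the input is known in advance.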

In [None]:
dates = [datetime(2011, 1, 2), datetime(2011, 1, 5),
         datetime(2011, 1, 7), datetime(2011, 1, 8),
         datetime(2011, 1, 10), datetime(2011, 1, 12)]


In [None]:
import numpy as np
df=pd.Series(np.random.standard_normal(6),index=dates)

In [None]:
df

2011-01-02   -2.134755
2011-01-05    0.803260
2011-01-07    0.309928
2011-01-08    0.008767
2011-01-10   -1.637279
2011-01-12   -2.129953
dtype: float64

In [None]:
df.index

DatetimeIndex(['2011-01-02', '2011-01-05', '2011-01-07', '2011-01-08',
               '2011-01-10', '2011-01-12'],
              dtype='datetime64[ns]', freq=None)

pandas stores timestamps using NumPy’s datetime64 data type at the nanosecond resolution:

In [None]:
df.index.dtype

dtype('<M8[ns]')

In [None]:
df.index[0]

Timestamp('2011-01-02 00:00:00')

In [None]:
#accessing value using index
ts=df.index[0]
df[ts]

-2.1347547036383188

**pandas.date_range** is responsible for generating a DatetimeIndex with an indicated length according to a particular frequency:

In [None]:
df_ts=pd.Series(np.random.standard_normal(10),index=pd.date_range('2023-01-01',periods=10))
df_ts

2023-01-01   -0.106901
2023-01-02    0.648031
2023-01-03   -0.329163
2023-01-04    0.145729
2023-01-05   -0.550669
2023-01-06    2.164763
2023-01-07   -1.044961
2023-01-08    0.776637
2023-01-09   -0.940898
2023-01-10   -0.471904
Freq: D, dtype: float64

By default, **pandas.date_range** generates daily timestamps. If you pass only a start or an end date, you must also pass the number of periods to generate:

In [None]:
#The start and end dates define strict boundaries for the generated date index.
pd.date_range('2023-01-01',periods=10) #pd.date_range(start='2023-01-01',periods=10)

DatetimeIndex(['2023-01-01', '2023-01-02', '2023-01-03', '2023-01-04',
               '2023-01-05', '2023-01-06', '2023-01-07', '2023-01-08',
               '2023-01-09', '2023-01-10'],
              dtype='datetime64[ns]', freq='D')

In [None]:
pd.date_range(end='2023-05-07',periods=10)

DatetimeIndex(['2023-04-28', '2023-04-29', '2023-04-30', '2023-05-01',
               '2023-05-02', '2023-05-03', '2023-05-04', '2023-05-05',
               '2023-05-06', '2023-05-07'],
              dtype='datetime64[ns]', freq='D')
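
For comparison, here is a short sketch of two other ways to call date_range: with both a start and an end date, and with `normalize=True` to drop the time of day from a start timestamp. The dates are arbitrary.

```python
import pandas as pd

# Both boundaries given: every day from start to end, inclusive.
bounded = pd.date_range("2023-01-01", "2023-01-10")

# A start timestamp with a time of day is kept as-is by default ...
with_time = pd.date_range("2023-05-02 12:56:31", periods=3)

# ... while normalize=True snaps each timestamp to midnight.
normalized = pd.date_range("2023-05-02 12:56:31", periods=3, normalize=True)

print(len(bounded))   # 10
print(with_time[0])   # 2023-05-02 12:56:31
print(normalized[0])  # 2023-05-02 00:00:00
```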

In [None]:
# Frequency strings can combine units: "1h20min" is parsed as a single 80-minute offset
pd.date_range("2000-01-01", periods=10, freq="1h20min")


DatetimeIndex(['2000-01-01 00:00:00', '2000-01-01 01:20:00',
               '2000-01-01 02:40:00', '2000-01-01 04:00:00',
               '2000-01-01 05:20:00', '2000-01-01 06:40:00',
               '2000-01-01 08:00:00', '2000-01-01 09:20:00',
               '2000-01-01 10:40:00', '2000-01-01 12:00:00'],
              dtype='datetime64[ns]', freq='80T')

In [None]:
df_ts['2023']

2023-01-01   -0.106901
2023-01-02    0.648031
2023-01-03   -0.329163
2023-01-04    0.145729
2023-01-05   -0.550669
2023-01-06    2.164763
2023-01-07   -1.044961
2023-01-08    0.776637
2023-01-09   -0.940898
2023-01-10   -0.471904
Freq: D, dtype: float64

In [None]:
df_ts['2023-01']

2023-01-01   -0.106901
2023-01-02    0.648031
2023-01-03   -0.329163
2023-01-04    0.145729
2023-01-05   -0.550669
2023-01-06    2.164763
2023-01-07   -1.044961
2023-01-08    0.776637
2023-01-09   -0.940898
2023-01-10   -0.471904
Freq: D, dtype: float64

In [None]:
df_ts['2023-01-07']

-1.04496052635202

In [None]:
df_ts[datetime(2023,1,7):]

2023-01-07   -1.044961
2023-01-08    0.776637
2023-01-09   -0.940898
2023-01-10   -0.471904
Freq: D, dtype: float64
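
The selections above also work with a closed date range. The sketch below uses a small series with known values (an assumption made so the results are easy to check) to show that string slices include both endpoints, and that `truncate` cuts a series at a date.

```python
import pandas as pd

# Values 0.0 .. 9.0 on ten consecutive days.
index = pd.date_range("2023-01-01", periods=10)
series = pd.Series([float(i) for i in range(10)], index=index)

# Slicing by date strings includes both the start and the end label.
window = series.loc["2023-01-03":"2023-01-05"]

# truncate(after=...) drops everything later than the given date.
head = series.truncate(after="2023-01-04")

print(window)
print(head)
```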

In [None]:
dates=pd.DatetimeIndex(['2023-01-02','2023-01-01','2023-01-02','2023-01-04','2023-01-06'])
ser_date=pd.Series(np.random.standard_normal(5),index=dates)
ser_date

2023-01-02    0.243984
2023-01-01    0.237766
2023-01-02   -0.253486
2023-01-04   -1.645632
2023-01-06    0.169770
dtype: float64

In [None]:
ser_date.index.is_unique

False

In [None]:
ser_date.groupby(level=0).mean()

2023-01-01    0.237766
2023-01-02   -0.004751
2023-01-04   -1.645632
2023-01-06    0.169770
dtype: float64

In [None]:
ser_date['2023-01-02'] # duplicated timestamp: returns a Series with every match

2023-01-02    0.243984
2023-01-02   -0.253486
dtype: float64

In [None]:
ser_date['2023-01-04'] # unique timestamp: returns a scalar

-1.645632496769876
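
Besides averaging duplicates with groupby, you may simply want to count them or keep one entry per timestamp. A minimal sketch, using the same dates as above but fixed values (chosen here for readability, not taken from the original output):

```python
import pandas as pd

dates = pd.DatetimeIndex(
    ["2023-01-02", "2023-01-01", "2023-01-02", "2023-01-04", "2023-01-06"]
)
ser = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], index=dates)

# How many observations share each timestamp?
counts = ser.groupby(level=0).count()

# Keep only the first observation for each timestamp.
first_only = ser[~ser.index.duplicated(keep="first")]

print(counts)
print(first_only)
print(first_only.index.is_unique)  # True
```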

In [None]:
resampler=ser_date.resample('D')
val=[i for i in resampler]
val

[(Timestamp('2023-01-01 00:00:00', freq='D'),
  2023-01-01    0.237766
  dtype: float64),
 (Timestamp('2023-01-02 00:00:00', freq='D'),
  2023-01-02    0.243984
  2023-01-02   -0.253486
  dtype: float64),
 (Timestamp('2023-01-03 00:00:00', freq='D'), Series([], dtype: float64)),
 (Timestamp('2023-01-04 00:00:00', freq='D'),
  2023-01-04   -1.645632
  dtype: float64),
 (Timestamp('2023-01-05 00:00:00', freq='D'), Series([], dtype: float64)),
 (Timestamp('2023-01-06 00:00:00', freq='D'),
  2023-01-06    0.16977
  dtype: float64)]

One useful frequency class is “week of month,” starting with WOM. This enables you to get dates like the third Friday of each month:

In [None]:
pd.date_range("2023-01-01", "2023-05-01", freq="WOM-3FRI")


DatetimeIndex(['2023-01-20', '2023-02-17', '2023-03-17', '2023-04-21'], dtype='datetime64[ns]', freq='WOM-3FRI')
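
As a quick sanity check on the result above, the sketch below confirms that every generated date is a Friday that falls between the 15th and the 21st of its month, which is where a third Friday must land.

```python
import pandas as pd

third_fridays = pd.date_range("2023-01-01", "2023-05-01", freq="WOM-3FRI")

# dayofweek uses Monday=0, so Friday is 4.
all_fridays = bool((third_fridays.dayofweek == 4).all())

# The third occurrence of any weekday falls on day 15..21.
in_third_week = bool(((third_fridays.day >= 15) & (third_fridays.day <= 21)).all())

print(all_fridays, in_third_week)  # True True
```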

Shifting refers to moving data backward and forward through time. Both Series and DataFrame have a shift method for doing naive shifts forward or backward, leaving the index unmodified. Passing a freq instead shifts the timestamps and keeps the values in place:

In [None]:
date_ser=pd.Series(np.random.standard_normal(4),index=pd.date_range('2023-01-01',periods=4,freq='M'))
date_ser

2023-01-31    2.395762
2023-02-28   -1.219732
2023-03-31    0.990155
2023-04-30    0.844565
Freq: M, dtype: float64

In [None]:
# shift the values 2 positions forward
date_ser.shift(2)

2023-01-31         NaN
2023-02-28         NaN
2023-03-31    2.395762
2023-04-30   -1.219732
Freq: M, dtype: float64

In [None]:
# shift the values 2 positions backward
date_ser.shift(-2)

2023-01-31    0.990155
2023-02-28    0.844565
2023-03-31         NaN
2023-04-30         NaN
Freq: M, dtype: float64

In [None]:
date_ser.shift(2,freq='M')

2023-03-31    2.395762
2023-04-30   -1.219732
2023-05-31    0.990155
2023-06-30    0.844565
Freq: M, dtype: float64
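
A common use of shift is computing period-over-period changes. The sketch below uses a small daily series with made-up prices; a daily frequency is used here only to keep the example independent of pandas version differences in month-end aliases.

```python
import pandas as pd

prices = pd.Series(
    [100.0, 110.0, 121.0],
    index=pd.date_range("2023-01-01", periods=3, freq="D"),
)

# Naive shift: values move down one row, the index stays put,
# so dividing gives the change relative to the previous day.
returns = prices / prices.shift(1) - 1

# Shift with a freq: the timestamps move, the values do not.
moved = prices.shift(1, freq="D")

print(returns)
print(moved)
```

The first element of `returns` is NaN because there is no earlier value to compare against.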

In [None]:
#Time Zone
import pytz

In [None]:
pytz.common_timezones[5:10]

['Africa/Bamako',
 'Africa/Bangui',
 'Africa/Banjul',
 'Africa/Bissau',
 'Africa/Blantyre']

In [None]:
date_val=pd.date_range('2023-01-01',periods=10,freq='M',tz='UTC')
date_val

DatetimeIndex(['2023-01-31 00:00:00+00:00', '2023-02-28 00:00:00+00:00',
               '2023-03-31 00:00:00+00:00', '2023-04-30 00:00:00+00:00',
               '2023-05-31 00:00:00+00:00', '2023-06-30 00:00:00+00:00',
               '2023-07-31 00:00:00+00:00', '2023-08-31 00:00:00+00:00',
               '2023-09-30 00:00:00+00:00', '2023-10-31 00:00:00+00:00'],
              dtype='datetime64[ns, UTC]', freq='M')

In [None]:
date_val=pd.date_range('2023-01-01',periods=10,tz='UTC')
date_val

DatetimeIndex(['2023-01-01 00:00:00+00:00', '2023-01-02 00:00:00+00:00',
               '2023-01-03 00:00:00+00:00', '2023-01-04 00:00:00+00:00',
               '2023-01-05 00:00:00+00:00', '2023-01-06 00:00:00+00:00',
               '2023-01-07 00:00:00+00:00', '2023-01-08 00:00:00+00:00',
               '2023-01-09 00:00:00+00:00', '2023-01-10 00:00:00+00:00'],
              dtype='datetime64[ns, UTC]', freq='D')

Once a time series has been localized to a particular time zone, it can be converted to another time zone with tz_convert:


In [None]:
date_val.tz_convert('Africa/Banjul')

DatetimeIndex(['2023-01-01 00:00:00+00:00', '2023-01-02 00:00:00+00:00',
               '2023-01-03 00:00:00+00:00', '2023-01-04 00:00:00+00:00',
               '2023-01-05 00:00:00+00:00', '2023-01-06 00:00:00+00:00',
               '2023-01-07 00:00:00+00:00', '2023-01-08 00:00:00+00:00',
               '2023-01-09 00:00:00+00:00', '2023-01-10 00:00:00+00:00'],
              dtype='datetime64[ns, Africa/Banjul]', freq='D')
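
Africa/Banjul has a zero offset from UTC, so the conversion above leaves the clock times unchanged. The sketch below starts from a time zone-naive index, localizes it with tz_localize, and converts it to a zone with a non-zero offset; America/New_York is chosen here purely as an illustration.

```python
import pandas as pd

# A naive index carries no time zone information.
naive = pd.date_range("2023-01-01 09:30", periods=3, freq="D")

# tz_localize attaches a time zone without changing the clock time.
utc = naive.tz_localize("UTC")

# tz_convert changes the clock time to match the target zone.
new_york = utc.tz_convert("America/New_York")

print(naive.tz)     # None
print(utc[0])       # 2023-01-01 09:30:00+00:00
print(new_york[0])  # 2023-01-01 04:30:00-05:00
```

Both indexes refer to the same instants; only their representation differs.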

#### **pandas.Period()**
Periods represent time spans, like days, months, quarters, or years. 

In [None]:
per_data=pd.Period('2023',freq='A-DEC')
per_data


Period('2023', 'A-DEC')

In [None]:
per_data+6

Period('2029', 'A-DEC')

Regular ranges of periods can be constructed with the period_range function:

In [None]:
periods = pd.period_range("2000-01-01", "2000-06-30", freq="M")
periods

PeriodIndex(['2000-01', '2000-02', '2000-03', '2000-04', '2000-05', '2000-06'], dtype='period[M]')
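
Because a period is a span rather than a point, it has a start and an end, and it can be converted to a finer frequency. A minimal sketch with an arbitrary monthly period:

```python
import pandas as pd

march = pd.Period("2023-03", freq="M")

# Convert the month to a daily period at either end of the span.
first_day = march.asfreq("D", how="start")
last_day = march.asfreq("D", how="end")

# Integer arithmetic moves by whole periods.
two_months_later = march + 2

# A PeriodIndex can be turned into timestamps (period starts by default).
months = pd.period_range("2023-01", "2023-06", freq="M")
as_timestamps = months.to_timestamp()

print(first_day, last_day)  # 2023-03-01 2023-03-31
print(two_months_later)     # 2023-05
print(as_timestamps[0])     # 2023-01-01 00:00:00
```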

#### **Resampling**

* Resampling refers to the process of converting a time series from one frequency to another. Aggregating higher frequency data to lower frequency is called downsampling, while converting lower frequency to higher frequency is called upsampling.

* Not all resampling falls into either of these categories; for example, converting W-WED (weekly on Wednesday) to W-FRI is neither upsampling nor downsampling.

In [None]:
dates=pd.date_range('2023-01-01',periods=40)
date_val_ser=pd.Series(np.random.standard_normal(len(dates)),index=dates)
date_val_ser

2023-01-01   -0.284172
2023-01-02   -0.107050
2023-01-03    0.740696
2023-01-04    0.128928
2023-01-05   -0.040858
2023-01-06   -0.391712
2023-01-07    1.436388
2023-01-08    1.049136
2023-01-09   -1.019158
2023-01-10   -0.143170
2023-01-11   -0.808677
2023-01-12   -0.267000
2023-01-13   -0.009802
2023-01-14    0.607807
2023-01-15    1.157858
2023-01-16    0.021471
2023-01-17    1.353183
2023-01-18   -1.557405
2023-01-19    1.201424
2023-01-20    0.781977
2023-01-21    1.963387
2023-01-22    1.483552
2023-01-23   -1.074749
2023-01-24   -0.087319
2023-01-25   -0.324553
2023-01-26   -0.651084
2023-01-27   -1.110617
2023-01-28   -0.516620
2023-01-29    0.016698
2023-01-30    0.040237
2023-01-31   -0.858987
2023-02-01   -0.381666
2023-02-02   -0.207969
2023-02-03   -0.692182
2023-02-04   -0.131929
2023-02-05   -0.316647
2023-02-06    1.073194
2023-02-07   -1.728932
2023-02-08   -2.559011
2023-02-09   -0.867897
Freq: D, dtype: float64

In [None]:
date_val_ser.resample('M').mean()

2023-01-31    0.088058
2023-02-28   -0.645893
Freq: M, dtype: float64

In [None]:
date_val_ser.resample('M',kind='period').mean()

2023-01    0.088058
2023-02   -0.645893
Freq: M, dtype: float64

In [None]:
dates=pd.date_range('2023-01-01',freq='T',periods=100)
data=pd.Series(np.random.standard_normal(len(dates)),index=dates)

In [None]:
data

2023-01-01 00:00:00   -0.978643
2023-01-01 00:01:00    0.867706
2023-01-01 00:02:00    0.616878
2023-01-01 00:03:00    0.735095
2023-01-01 00:04:00    2.324441
                         ...   
2023-01-01 01:35:00    0.835695
2023-01-01 01:36:00    1.205642
2023-01-01 01:37:00   -1.413152
2023-01-01 01:38:00    0.525094
2023-01-01 01:39:00    1.674724
Freq: T, Length: 100, dtype: float64

In [None]:
data.resample('5min').sum()

2023-01-01 00:00:00    3.565477
2023-01-01 00:05:00   -0.103472
2023-01-01 00:10:00   -3.138283
2023-01-01 00:15:00   -0.985799
2023-01-01 00:20:00    0.369708
2023-01-01 00:25:00    0.413892
2023-01-01 00:30:00   -1.215468
2023-01-01 00:35:00   -2.066850
2023-01-01 00:40:00    0.229575
2023-01-01 00:45:00   -0.449204
2023-01-01 00:50:00    2.283471
2023-01-01 00:55:00   -0.338266
2023-01-01 01:00:00    1.177911
2023-01-01 01:05:00   -0.855588
2023-01-01 01:10:00    0.958880
2023-01-01 01:15:00   -1.591106
2023-01-01 01:20:00   -2.643666
2023-01-01 01:25:00   -0.974349
2023-01-01 01:30:00    2.615890
2023-01-01 01:35:00    2.828003
Freq: 5T, dtype: float64

In [None]:
data.resample('5min',closed='right').sum()

2022-12-31 23:55:00   -0.978643
2023-01-01 00:00:00    6.104691
2023-01-01 00:05:00   -2.536242
2023-01-01 00:10:00   -2.420770
2023-01-01 00:15:00   -0.486533
2023-01-01 00:20:00    0.164826
2023-01-01 00:25:00    0.759102
2023-01-01 00:30:00   -2.786284
2023-01-01 00:35:00   -0.281872
2023-01-01 00:40:00   -0.835426
2023-01-01 00:45:00    0.498766
2023-01-01 00:50:00    3.073949
2023-01-01 00:55:00   -1.566386
2023-01-01 01:00:00    0.934335
2023-01-01 01:05:00    0.177200
2023-01-01 01:10:00   -0.807894
2023-01-01 01:15:00   -2.056571
2023-01-01 01:20:00   -2.593852
2023-01-01 01:25:00    0.214570
2023-01-01 01:30:00    3.511480
2023-01-01 01:35:00    1.992308
Freq: 5T, dtype: float64

In [None]:
data.resample('5min',closed='left').sum()

2023-01-01 00:00:00    3.565477
2023-01-01 00:05:00   -0.103472
2023-01-01 00:10:00   -3.138283
2023-01-01 00:15:00   -0.985799
2023-01-01 00:20:00    0.369708
2023-01-01 00:25:00    0.413892
2023-01-01 00:30:00   -1.215468
2023-01-01 00:35:00   -2.066850
2023-01-01 00:40:00    0.229575
2023-01-01 00:45:00   -0.449204
2023-01-01 00:50:00    2.283471
2023-01-01 00:55:00   -0.338266
2023-01-01 01:00:00    1.177911
2023-01-01 01:05:00   -0.855588
2023-01-01 01:10:00    0.958880
2023-01-01 01:15:00   -1.591106
2023-01-01 01:20:00   -2.643666
2023-01-01 01:25:00   -0.974349
2023-01-01 01:30:00    2.615890
2023-01-01 01:35:00    2.828003
Freq: 5T, dtype: float64
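
The effect of `closed` is easier to see on a series with known values. In the sketch below the values 0 to 11 sit on consecutive minutes, so each bin sum can be checked by hand. The `label` argument, which is not used in the cells above, controls which bin edge names each bin.

```python
import pandas as pd

minutes = pd.date_range("2000-01-01", periods=12, freq="min")
ts = pd.Series(range(12), index=minutes)

# closed="left" (the default here): bins are [00:00, 00:05), [00:05, 00:10), ...
left_closed = ts.resample("5min", closed="left").sum()

# closed="right": bins are (23:55, 00:00], (00:00, 00:05], ...
# so the 00:00 observation falls into a bin that starts the day before.
right_closed = ts.resample("5min", closed="right").sum()

# label="right" names each bin by its right edge instead of its left edge.
right_labeled = ts.resample("5min", closed="right", label="right").sum()

print(left_closed.tolist())    # [10, 35, 21]
print(right_closed.tolist())   # [0, 15, 40, 11]
print(right_labeled.index[0])  # 2000-01-01 00:00:00
```

This also explains why the `closed='right'` output above begins at 2022-12-31 23:55:00.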

#### **Upsampling**
Upsampling is converting from a lower frequency to a higher frequency, where no aggregation is needed.

In [4]:
frame = pd.DataFrame(np.random.standard_normal((2, 4)),
                     index=pd.date_range("2000-01-01", periods=2,freq="W-WED"),
                     columns=["Colorado", "Texas", "New York", "Ohio"])


In [5]:
frame

            Colorado      Texas   New York      Ohio
2000-01-05   0.22611  -1.452498   0.338678  0.663561
2000-01-12  0.757081  -0.482177  -1.361902  1.033345


In [6]:
df_daily = frame.resample("D").asfreq()
df_daily

            Colorado      Texas   New York      Ohio
2000-01-05   0.22611  -1.452498   0.338678  0.663561
2000-01-06       NaN        NaN        NaN       NaN
2000-01-07       NaN        NaN        NaN       NaN
2000-01-08       NaN        NaN        NaN       NaN
2000-01-09       NaN        NaN        NaN       NaN
2000-01-10       NaN        NaN        NaN       NaN
2000-01-11       NaN        NaN        NaN       NaN
2000-01-12  0.757081  -0.482177  -1.361902  1.033345


Suppose you wanted to fill forward each weekly value on the non-Wednesdays. The same filling or interpolation methods available in the fillna and reindex methods are available for resampling:

In [7]:
frame.resample('D').ffill()

            Colorado      Texas   New York      Ohio
2000-01-05   0.22611  -1.452498   0.338678  0.663561
2000-01-06   0.22611  -1.452498   0.338678  0.663561
2000-01-07   0.22611  -1.452498   0.338678  0.663561
2000-01-08   0.22611  -1.452498   0.338678  0.663561
2000-01-09   0.22611  -1.452498   0.338678  0.663561
2000-01-10   0.22611  -1.452498   0.338678  0.663561
2000-01-11   0.22611  -1.452498   0.338678  0.663561
2000-01-12  0.757081  -0.482177  -1.361902  1.033345


In [8]:
frame.resample("D").ffill(limit=2)

            Colorado      Texas   New York      Ohio
2000-01-05   0.22611  -1.452498   0.338678  0.663561
2000-01-06   0.22611  -1.452498   0.338678  0.663561
2000-01-07   0.22611  -1.452498   0.338678  0.663561
2000-01-08       NaN        NaN        NaN       NaN
2000-01-09       NaN        NaN        NaN       NaN
2000-01-10       NaN        NaN        NaN       NaN
2000-01-11       NaN        NaN        NaN       NaN
2000-01-12  0.757081  -0.482177  -1.361902  1.033345
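
Forward filling repeats the last known value. If a gradual transition between the weekly observations is more appropriate, the resampler also offers `interpolate`. The sketch below uses a two-point weekly series with made-up values (0 and 7) so that the linear result is easy to verify.

```python
import pandas as pd

weekly = pd.Series(
    [0.0, 7.0],
    index=pd.date_range("2000-01-01", periods=2, freq="W-WED"),
)

# Upsample to daily and fill the gaps by linear interpolation.
daily = weekly.resample("D").interpolate()

print(daily)
```

With seven days between the two Wednesdays, each intermediate day advances by one unit, giving values from 0.0 to 7.0.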
