# Time Series Basics with Pandas

Hi everyone, welcome to this pandas tutorial 😀

In this notebook, I'm going to cover the basics of working with time series in pandas.

Happy learning 🐱‍🏍

## What Is a Time Series?

In [2]:
import pandas as pd
import numpy as np
from datetime import datetime


In [3]:
date=[datetime(2020,1,5),
      datetime(2020,1,10),
      datetime(2020,1,15),
      datetime(2020,1,20),
      datetime(2020,1,25)] 

In [4]:
ts=pd.Series(np.random.randn(5),index=date)
ts

2020-01-05   -0.040954
2020-01-10   -0.180656
2020-01-15   -1.764451
2020-01-20    0.463560
2020-01-25   -1.049511
dtype: float64

In [5]:
ts.index 

DatetimeIndex(['2020-01-05', '2020-01-10', '2020-01-15', '2020-01-20',
               '2020-01-25'],
              dtype='datetime64[ns]', freq=None)
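
The cells above index a Series with plain `datetime` objects, and pandas stores them as a `DatetimeIndex`. The following is a minimal sketch of that conversion. The seeded generator and the names `demo_dates` and `ts_demo` are assumptions added for repeatability; the original cells use unseeded `np.random.randn`.

```python
from datetime import datetime

import numpy as np
import pandas as pd

# Same five dates as above; the seed only makes the run repeatable.
demo_dates = [datetime(2020, 1, day) for day in (5, 10, 15, 20, 25)]
rng = np.random.default_rng(0)
ts_demo = pd.Series(rng.standard_normal(5), index=demo_dates)

# A list of datetime objects is converted to a DatetimeIndex automatically.
print(type(ts_demo.index).__name__)     # DatetimeIndex
# Each individual label is a pandas Timestamp.
print(type(ts_demo.index[0]).__name__)  # Timestamp
```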

## Time Series Data Structures

In [6]:
pd.to_datetime("01/01/2020") 

Timestamp('2020-01-01 00:00:00')

In [7]:
dates=pd.to_datetime(
    [datetime(2020,7,5),
     "6th of July, 2020",
     "2020-Jul-7",
     "20200708"])
dates

DatetimeIndex(['2020-07-05', '2020-07-06', '2020-07-07', '2020-07-08'], dtype='datetime64[ns]', freq=None)

In [8]:
dates-dates[0]

TimedeltaIndex(['0 days', '1 days', '2 days', '3 days'], dtype='timedelta64[ns]', freq=None)
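
`pd.to_datetime` guesses the layout of each string, which is convenient but can be ambiguous for inputs such as `"01/02/2020"`. The sketch below shows two options that are not used in the cells above: an explicit `format` and `errors="coerce"`. It also repeats the subtraction from the previous cell. All variable names here are illustrative.

```python
import pandas as pd

# An explicit format removes any guessing about day/month order.
us_style = pd.to_datetime("01/02/2020", format="%m/%d/%Y")  # 2 January 2020
eu_style = pd.to_datetime("01/02/2020", format="%d/%m/%Y")  # 1 February 2020

# errors="coerce" turns unparseable entries into NaT instead of raising.
parsed = pd.to_datetime(["2020-07-05", "not a date"], errors="coerce")

# Subtracting a Timestamp from a DatetimeIndex gives a TimedeltaIndex,
# whose .days attribute holds the whole-day differences.
dates_demo = pd.to_datetime(
    ["2020-07-05", "2020-07-06", "2020-07-07", "2020-07-08"])
day_counts = (dates_demo - dates_demo[0]).days.tolist()
print(day_counts)  # [0, 1, 2, 3]
```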

## Creating a Time Series

In [9]:
pd.date_range("2020-08-15","2020-09-01") 

DatetimeIndex(['2020-08-15', '2020-08-16', '2020-08-17', '2020-08-18',
               '2020-08-19', '2020-08-20', '2020-08-21', '2020-08-22',
               '2020-08-23', '2020-08-24', '2020-08-25', '2020-08-26',
               '2020-08-27', '2020-08-28', '2020-08-29', '2020-08-30',
               '2020-08-31', '2020-09-01'],
              dtype='datetime64[ns]', freq='D')

In [10]:
pd.date_range('2020-07-15', periods=10)

DatetimeIndex(['2020-07-15', '2020-07-16', '2020-07-17', '2020-07-18',
               '2020-07-19', '2020-07-20', '2020-07-21', '2020-07-22',
               '2020-07-23', '2020-07-24'],
              dtype='datetime64[ns]', freq='D')

In [11]:
pd.date_range("2020-07-15",
              periods=10,
              freq="h")  # lowercase "h"; the uppercase "H" alias is deprecated as of pandas 2.2

DatetimeIndex(['2020-07-15 00:00:00', '2020-07-15 01:00:00',
               '2020-07-15 02:00:00', '2020-07-15 03:00:00',
               '2020-07-15 04:00:00', '2020-07-15 05:00:00',
               '2020-07-15 06:00:00', '2020-07-15 07:00:00',
               '2020-07-15 08:00:00', '2020-07-15 09:00:00'],
              dtype='datetime64[ns]', freq='h')

In [12]:
pd.period_range("2020-10", 
                periods=10,
                freq="M")

PeriodIndex(['2020-10', '2020-11', '2020-12', '2021-01', '2021-02', '2021-03',
             '2021-04', '2021-05', '2021-06', '2021-07'],
            dtype='period[M]')

In [13]:
pd.timedelta_range(0,periods=8,freq="h")  # lowercase "h"; the uppercase "H" alias is deprecated as of pandas 2.2

TimedeltaIndex(['0 days 00:00:00', '0 days 01:00:00', '0 days 02:00:00',
                '0 days 03:00:00', '0 days 04:00:00', '0 days 05:00:00',
                '0 days 06:00:00', '0 days 07:00:00'],
               dtype='timedelta64[ns]', freq='h')
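
The spelling of string frequency aliases has changed between pandas versions (for example, the hourly alias). One way to sidestep that, sketched here as an alternative to the cells above, is to pass an offset object such as `pd.offsets.Hour()` as `freq`. The sketch also checks that an inclusive `start`/`end` pair and a `start`/`periods` pair are two ways of fixing the length of a range.

```python
import pandas as pd

# Offset objects avoid depending on how a string alias is spelled.
hourly = pd.date_range("2020-07-15", periods=10, freq=pd.offsets.Hour())
deltas = pd.timedelta_range(start="0 days", periods=8, freq=pd.offsets.Hour())

# start + end (both inclusive) versus start + periods.
by_end = pd.date_range("2020-08-15", "2020-09-01")
by_count = pd.date_range("2020-07-15", periods=10)

print(len(by_end), len(by_count))  # 18 10
print(hourly[-1])                  # 2020-07-15 09:00:00
print(deltas[-1])                  # 0 days 07:00:00
```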

In [14]:
stamp=ts.index[1]
stamp

Timestamp('2020-01-10 00:00:00')

In [15]:
ts[stamp]

-0.18065648231055648

In [16]:
ts["25.1.2020"]

-1.0495105026510452

In [17]:
ts["20200125"]

-1.0495105026510452

In [18]:
long_ts=pd.Series(
    np.random.randn(1000),
    index=pd.date_range("1/1/2020",
                        periods=1000))
long_ts.head()

2020-01-01    0.064516
2020-01-02    0.001563
2020-01-03   -0.047271
2020-01-04    0.765910
2020-01-05   -0.005111
Freq: D, dtype: float64

In [19]:
long_ts["2020"].head()

2020-01-01    0.064516
2020-01-02    0.001563
2020-01-03   -0.047271
2020-01-04    0.765910
2020-01-05   -0.005111
Freq: D, dtype: float64

In [20]:
long_ts["2020-10"].head(15)

2020-10-01    0.681629
2020-10-02    0.290502
2020-10-03   -0.150777
2020-10-04    1.693867
2020-10-05    0.987963
2020-10-06   -1.397776
2020-10-07   -0.151433
2020-10-08   -1.436613
2020-10-09   -0.254317
2020-10-10   -0.781460
2020-10-11   -0.619868
2020-10-12    0.572536
2020-10-13   -1.847257
2020-10-14    2.251630
2020-10-15   -2.349847
Freq: D, dtype: float64

In [21]:
long_ts[datetime(2022,9,20):] 

2022-09-20   -1.794052
2022-09-21    0.727955
2022-09-22    0.111019
2022-09-23   -0.361229
2022-09-24    0.415793
2022-09-25   -1.158911
2022-09-26   -0.169570
Freq: D, dtype: float64
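
The selections above rely on partial string indexing: a year or a year-month string selects every row that falls inside that period. Below is a minimal sketch of the same idea using `.loc`, which makes it explicit that the lookup is by label. The seeded data and the name `long_demo` are assumptions for repeatability.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
idx = pd.date_range("2020-01-01", periods=1000)
long_demo = pd.Series(rng.standard_normal(1000), index=idx)

year_2020 = long_demo.loc["2020"]      # every day of 2020 (a leap year)
october = long_demo.loc["2020-10"]     # every day of October 2020
tail = long_demo.loc["2022-09-20":]    # from 20 Sep 2022 to the last row
# Label-based slices include both endpoints.
window = long_demo.loc["2020-01-03":"2020-01-05"]

print(len(year_2020), len(october), len(tail), len(window))  # 366 31 7 3
```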

## The Important Methods Used in Time Series

In [22]:
ts

2020-01-05   -0.040954
2020-01-10   -0.180656
2020-01-15   -1.764451
2020-01-20    0.463560
2020-01-25   -1.049511
dtype: float64

In [23]:
ts.truncate(after="1/15/2020")

2020-01-05   -0.040954
2020-01-10   -0.180656
2020-01-15   -1.764451
dtype: float64
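
`truncate` also accepts a `before` argument, so it can trim both ends at once. The sketch below uses made-up values (`1.0` to `5.0`) on the same five dates and compares the result with an equivalent `.loc` slice. Note that `truncate` expects a sorted index.

```python
import pandas as pd

idx = pd.to_datetime(
    ["2020-01-05", "2020-01-10", "2020-01-15", "2020-01-20", "2020-01-25"])
ts_small = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], index=idx)

# Keep only the rows between the two dates; both bounds are inclusive.
middle = ts_small.truncate(before="2020-01-10", after="2020-01-20")

# The same selection written as a label slice.
same = ts_small.loc["2020-01-10":"2020-01-20"]

print(middle.tolist())  # [2.0, 3.0, 4.0]
```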

In [27]:
date=pd.date_range("1/1/2020",
                   periods=100,
                   freq="W-SUN")
date

DatetimeIndex(['2020-01-05', '2020-01-12', '2020-01-19', '2020-01-26',
               '2020-02-02', '2020-02-09', '2020-02-16', '2020-02-23',
               '2020-03-01', '2020-03-08', '2020-03-15', '2020-03-22',
               '2020-03-29', '2020-04-05', '2020-04-12', '2020-04-19',
               '2020-04-26', '2020-05-03', '2020-05-10', '2020-05-17',
               '2020-05-24', '2020-05-31', '2020-06-07', '2020-06-14',
               '2020-06-21', '2020-06-28', '2020-07-05', '2020-07-12',
               '2020-07-19', '2020-07-26', '2020-08-02', '2020-08-09',
               '2020-08-16', '2020-08-23', '2020-08-30', '2020-09-06',
               '2020-09-13', '2020-09-20', '2020-09-27', '2020-10-04',
               '2020-10-11', '2020-10-18', '2020-10-25', '2020-11-01',
               '2020-11-08', '2020-11-15', '2020-11-22', '2020-11-29',
               '2020-12-06', '2020-12-13', '2020-12-20', '2020-12-27',
               '2021-01-03', '2021-01-10', '2021-01-17', '2021-01-24',
      

In [25]:
long_df=pd.DataFrame(np.random.randn(100,4),
                    index=date,
                    columns=list("ABCD"))
long_df.head()

                   A         B         C         D
2020-01-05  0.778055  0.511312 -0.022390 -0.716922
2020-01-12  0.769041 -1.530008 -0.867157  0.863497
2020-01-19 -1.527236 -0.575214  0.915176 -0.165441
2020-01-26  0.991931 -0.153319  0.838005  0.656290
2020-02-02  2.319523  2.067676 -1.599116  0.732590
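
With `freq="W-SUN"`, the range starts on the first Sunday on or after the given start date, which is why the index above begins at 2020-01-05 rather than 2020-01-01. The sketch below rebuilds a similar frame with seeded data (the names `weekly` and `df_demo` are assumptions) and selects one month of rows by partial string.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
weekly = pd.date_range("2020-01-01", periods=100, freq="W-SUN")
df_demo = pd.DataFrame(rng.standard_normal((100, 4)),
                       index=weekly,
                       columns=list("ABCD"))

# The start date (a Wednesday) is rolled forward to the next Sunday.
print(weekly[0])  # 2020-01-05 00:00:00

# Partial string indexing works on DataFrame rows through .loc.
may = df_demo.loc["2020-05"]
print(may.shape)  # (5, 4)
```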


In [None]:
date=pd.DatetimeIndex(
    ["1/1/2020","1/2/2020","1/2/2020",
     "1/2/2020","1/3/2020"])
ts1=pd.Series(np.arange(5),index=date)
ts1

2020-01-01    0
2020-01-02    1
2020-01-02    2
2020-01-02    3
2020-01-03    4
dtype: int32

In [None]:
ts1.index.is_unique 

False

In [None]:
group=ts1.groupby(level=0) 

In [None]:
group.count()

2020-01-01    1
2020-01-02    3
2020-01-03    1
dtype: int64

In [None]:
group.mean()

2020-01-01    0
2020-01-02    2
2020-01-03    4
dtype: int32
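
When an index contains repeated timestamps, looking up a repeated label returns a Series while a unique label returns a scalar. Besides aggregating with `groupby(level=0)` as above, another option is to keep only one row per timestamp with `Index.duplicated`. The following is a sketch of both, with illustrative names. Depending on the pandas version, `mean()` may return floats rather than the integers shown above.

```python
import numpy as np
import pandas as pd

dup_idx = pd.DatetimeIndex(
    ["1/1/2020", "1/2/2020", "1/2/2020", "1/2/2020", "1/3/2020"])
ts_dup = pd.Series(np.arange(5), index=dup_idx)

# A repeated label returns a Series; a unique label returns a scalar.
repeated = ts_dup.loc[pd.Timestamp("2020-01-02")]
single = ts_dup.loc[pd.Timestamp("2020-01-01")]

# Keep only the first row for each timestamp.
first_only = ts_dup[~ts_dup.index.duplicated(keep="first")]

# Aggregate the repeated rows instead of dropping them.
daily_mean = ts_dup.groupby(level=0).mean()

print(first_only.tolist())  # [0, 1, 4]
```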