# Time Series Basics with Pandas

Hi guys, welcome to [Tirendaz Academy](https://youtube.com/c/tirendazacademy) 😀

In this notebook, I'll cover the basics of working with time series in Pandas.

Happy learning 🐱‍🏍

## What is a time series?

In [2]:
import pandas as pd
import numpy as np
from datetime import datetime

In [3]:
date=[datetime(2020,1,5),
      datetime(2020,1,10),
      datetime(2020,1,15),
      datetime(2020,1,20),
      datetime(2020,1,25)] 

In [4]:
ts=pd.Series(np.random.randn(5),index=date) 
ts

2020-01-05    0.440549
2020-01-10   -1.058806
2020-01-15    0.453689
2020-01-20   -2.011909
2020-01-25    0.641951
dtype: float64

In [5]:
ts.index  # dates are shown as year-month-day

DatetimeIndex(['2020-01-05', '2020-01-10', '2020-01-15', '2020-01-20',
               '2020-01-25'],
              dtype='datetime64[ns]', freq=None)

## Time Series Data Structures

In [6]:
pd.to_datetime("01/01/2020")  

Timestamp('2020-01-01 00:00:00')

In [49]:
dates=pd.to_datetime(
    [datetime(2020,7,5),
     "6th of July, 2020",
     "2020-Jul-7",
     "20200708"])  # several different date formats
dates


DatetimeIndex(['2020-07-05', '2020-07-06', '2020-07-07', '2020-07-08'], dtype='datetime64[ns]', freq=None)

In [50]:
dates.to_period("D")  # convert the timestamps to daily periods

PeriodIndex(['2020-07-05', '2020-07-06', '2020-07-07', '2020-07-08'], dtype='period[D]')

In [51]:
dates-dates[0]  # difference of each date from the first element

TimedeltaIndex(['0 days', '1 days', '2 days', '3 days'], dtype='timedelta64[ns]', freq=None)
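The three outputs above are a `DatetimeIndex`, a `PeriodIndex`, and a `TimedeltaIndex`. Each has a scalar counterpart, and the short sketch below shows how they relate. The variable names are illustrative and not part of the notebook.

```python
import pandas as pd

stamp = pd.Timestamp("2020-07-05")       # a single point in time
period = pd.Period("2020-07", freq="M")  # a span of time (here, a whole month)
delta = pd.Timedelta(days=3)             # a duration

# Adding a Timedelta to a Timestamp gives another Timestamp.
shifted = stamp + delta

# A Period knows where its span starts and ends.
month_start = period.start_time
month_end = period.end_time

print(shifted, month_start, month_end)
```

Roughly speaking, a `Timestamp` answers "when?", a `Period` answers "during which interval?", and a `Timedelta` answers "how long?".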

## Creating a Time Series

In [52]:
pd.date_range("2020-08-15","2020-09-01")  # daily range between two dates

DatetimeIndex(['2020-08-15', '2020-08-16', '2020-08-17', '2020-08-18',
               '2020-08-19', '2020-08-20', '2020-08-21', '2020-08-22',
               '2020-08-23', '2020-08-24', '2020-08-25', '2020-08-26',
               '2020-08-27', '2020-08-28', '2020-08-29', '2020-08-30',
               '2020-08-31', '2020-09-01'],
              dtype='datetime64[ns]', freq='D')

In [53]:
pd.date_range('2020-07-15', periods=10)  # 10 consecutive days from the start date

DatetimeIndex(['2020-07-15', '2020-07-16', '2020-07-17', '2020-07-18',
               '2020-07-19', '2020-07-20', '2020-07-21', '2020-07-22',
               '2020-07-23', '2020-07-24'],
              dtype='datetime64[ns]', freq='D')

In [54]:
pd.date_range("2020-07-15",
              periods=10,
              freq="h")  # 10 consecutive hours; lowercase "h" avoids the deprecated "H" alias

DatetimeIndex(['2020-07-15 00:00:00', '2020-07-15 01:00:00',
               '2020-07-15 02:00:00', '2020-07-15 03:00:00',
               '2020-07-15 04:00:00', '2020-07-15 05:00:00',
               '2020-07-15 06:00:00', '2020-07-15 07:00:00',
               '2020-07-15 08:00:00', '2020-07-15 09:00:00'],
              dtype='datetime64[ns]', freq='h')
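The `freq` argument accepts many other aliases besides daily and hourly. As a small sketch, here are three that come up often: business days, month starts, and a multiple of a base unit. The variable names are made up for this example.

```python
import pandas as pd

# Business days only: weekends are skipped (2020-07-15 is a Wednesday).
business_days = pd.date_range("2020-07-15", periods=5, freq="B")

# "MS" means month start: the first calendar day of each month.
month_starts = pd.date_range("2020-01-01", periods=3, freq="MS")

# A number in front of the alias multiplies the step.
every_90_min = pd.date_range("2020-07-15", periods=3, freq="90min")

print(business_days)
print(month_starts)
print(every_90_min)
```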

In [55]:
pd.period_range("2020-10", 
                periods=10,
                freq="M")  # 10 consecutive monthly periods

PeriodIndex(['2020-10', '2020-11', '2020-12', '2021-01', '2021-02', '2021-03',
             '2021-04', '2021-05', '2021-06', '2021-07'],
            dtype='period[M]')

In [56]:
pd.timedelta_range(0,periods=8,freq="h")  # 8 hourly steps starting from zero

TimedeltaIndex(['0 days 00:00:00', '0 days 01:00:00', '0 days 02:00:00',
                '0 days 03:00:00', '0 days 04:00:00', '0 days 05:00:00',
                '0 days 06:00:00', '0 days 07:00:00'],
               dtype='timedelta64[ns]', freq='h')

In [57]:
stamp=ts.index[1]  # the second date in the index
stamp

Timestamp('2020-01-10 00:00:00')

In [58]:
ts[stamp]  # value stored at that date

np.float64(-1.0588057276120013)

In [59]:
ts["25.1.2020"]  # look up a value with a day-first date string

np.float64(0.6419505791222829)

In [60]:
ts["20200125"]  # same date written in compact form

np.float64(0.6419505791222829)
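String lookups like the ones above depend on pandas parsing the date for you. A string such as "25.1.2020" is unambiguous because 25 cannot be a month, but "05/01/2020" could be read either way. The sketch below shows one way to make the intent explicit; the variable names are only for illustration.

```python
import pandas as pd

# Ambiguous string: by default pandas reads it month-first (May 1).
month_first = pd.to_datetime("05/01/2020")

# dayfirst=True asks pandas to read it day-first (January 5).
day_first = pd.to_datetime("05/01/2020", dayfirst=True)

# An explicit format removes the guesswork entirely.
explicit = pd.to_datetime("05/01/2020", format="%d/%m/%Y")

print(month_first, day_first, explicit)
```

When the data source is known, passing `format` is usually the safer choice because nothing is inferred.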

In [61]:
long_ts=pd.Series(
    np.random.randn(1000),
    index=pd.date_range("1/1/2020",
                        periods=1000))  # 1000 daily observations
long_ts.head()  # first 5 elements

2020-01-01   -0.390833
2020-01-02    0.157331
2020-01-03    1.253051
2020-01-04   -1.153837
2020-01-05   -0.687906
Freq: D, dtype: float64

In [62]:
long_ts["2020"].head()  # first 5 elements of the year 2020

2020-01-01   -0.390833
2020-01-02    0.157331
2020-01-03    1.253051
2020-01-04   -1.153837
2020-01-05   -0.687906
Freq: D, dtype: float64

In [63]:
long_ts["2020-10"].head(15)  # first 15 elements of October 2020

2020-10-01   -1.175032
2020-10-02   -1.685319
2020-10-03    1.023381
2020-10-04    0.948972
2020-10-05    1.644554
2020-10-06    1.750927
2020-10-07   -1.111934
2020-10-08   -0.218166
2020-10-09    1.870816
2020-10-10    1.680443
2020-10-11   -1.417113
2020-10-12   -0.119896
2020-10-13    0.430560
2020-10-14    0.583632
2020-10-15   -1.601900
Freq: D, dtype: float64

In [64]:
long_ts[datetime(2022,9,20):]  # from this date to the end of the series

2022-09-20    0.193839
2022-09-21   -0.652283
2022-09-22    1.402658
2022-09-23   -0.024711
2022-09-24    0.240230
2022-09-25   -1.167211
2022-09-26   -0.158999
Freq: D, dtype: float64
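Slices can also be bounded on both sides with date strings. One detail worth knowing is that label-based slices on a `DatetimeIndex` include both endpoints. The sketch below uses a separate, seeded series (`demo_ts`, a name invented here) so that it runs on its own.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded so the example is repeatable
demo_ts = pd.Series(rng.standard_normal(1000),
                    index=pd.date_range("2020-01-01", periods=1000))

# Both endpoints are included, so this returns 7 rows.
first_week = demo_ts.loc["2020-01-01":"2020-01-07"]

# Partial strings work as slice bounds too: October through December 2020.
q4 = demo_ts.loc["2020-10":"2020-12"]

print(len(first_week), len(q4))
```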

## The Important Methods Used in Time Series

In [65]:
ts

2020-01-05    0.440549
2020-01-10   -1.058806
2020-01-15    0.453689
2020-01-20   -2.011909
2020-01-25    0.641951
dtype: float64

In [66]:
ts.truncate(after="1/15/2020")  # keep entries up to and including this date

2020-01-05    0.440549
2020-01-10   -1.058806
2020-01-15    0.453689
dtype: float64
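`truncate` also takes a `before` argument, and the two can be combined to keep a window in the middle of the series. Here is a minimal sketch on a small made-up series (`demo` is not part of the notebook):

```python
import pandas as pd

# Five dates spaced 5 days apart: Jan 5, 10, 15, 20, 25.
demo = pd.Series(range(5),
                 index=pd.date_range("2020-01-05", periods=5, freq="5D"))

# `before` drops everything earlier than its date,
# `after` drops everything later than its date.
middle = demo.truncate(before="2020-01-10", after="2020-01-20")

print(middle)
```

Note that `truncate` expects the index to be sorted.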

In [67]:
date=pd.date_range("1/1/2020",
                   periods=100,
                   freq="W-SUN")  # 100 weekly dates, each falling on a Sunday

In [68]:
long_df=pd.DataFrame(np.random.randn(100,4),
                    index=date,
                    columns=list("ABCD"))  # DataFrame with 100 rows and 4 columns
long_df.head()

|  | A | B | C | D |
|---|---|---|---|---|
| 2020-01-05 | 0.121994 | -0.484656 | 1.067993 | -0.88262 |
| 2020-01-12 | 0.445543 | -1.025266 | -0.581452 | -0.556969 |
| 2020-01-19 | -0.67573 | -0.367998 | -1.34409 | -0.894688 |
| 2020-01-26 | 0.35915 | 0.64832 | 0.99024 | 0.742174 |
| 2020-02-02 | -0.487624 | -0.693791 | 1.345275 | -0.003744 |


In [74]:
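# long_df["2020-01-05"] raises KeyError because [] on a DataFrame looks up column labels.
# Select rows by date with .loc instead:
long_df.loc["2020-01"]  # rows from January 2020

|  | A | B | C | D |
|---|---|---|---|---|
| 2020-01-05 | 0.121994 | -0.484656 | 1.067993 | -0.88262 |
| 2020-01-12 | 0.445543 | -1.025266 | -0.581452 | -0.556969 |
| 2020-01-19 | -0.67573 | -0.367998 | -1.34409 | -0.894688 |
| 2020-01-26 | 0.35915 | 0.64832 | 0.99024 | 0.742174 |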
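On a DataFrame, plain `[]` with a string looks up column labels, so date-based row selection goes through `.loc`. The sketch below builds its own small frame (`demo_df`, a hypothetical name) with the same weekly Sunday index and shows a few common patterns.

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2020-01-01", periods=100, freq="W-SUN")
demo_df = pd.DataFrame(np.arange(400).reshape(100, 4),
                       index=idx, columns=list("ABCD"))

# All rows whose date falls in January 2020.
january = demo_df.loc["2020-01"]

# An exact date returns a single row as a Series.
one_day = demo_df.loc["2020-01-05"]

# Rows by date range combined with a column label.
col_slice = demo_df.loc["2020-02":"2020-03", "A"]

print(january)
print(one_day)
print(col_slice)
```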

In [27]:
date=pd.DatetimeIndex(
    ["1/1/2020","1/2/2020","1/2/2020",
     "1/2/2020","1/3/2020"])
ts1=pd.Series(np.arange(5),index=date)
ts1

2020-01-01    0
2020-01-02    1
2020-01-02    2
2020-01-02    3
2020-01-03    4
dtype: int32

In [28]:
ts1.index.is_unique 

False

In [29]:
group=ts1.groupby(level=0) 

In [30]:
group.count()

2020-01-01    1
2020-01-02    3
2020-01-03    1
dtype: int64

In [31]:
group.mean()

2020-01-01    0
2020-01-02    2
2020-01-03    4
dtype: int32
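Grouping by the index aggregates the repeated timestamps. Another option, when only one observation per timestamp should survive, is to drop the repeats. The sketch below rebuilds the same kind of series under a new name (`dup_ts`) so it can run on its own.

```python
import numpy as np
import pandas as pd

idx = pd.DatetimeIndex(["1/1/2020", "1/2/2020", "1/2/2020",
                        "1/2/2020", "1/3/2020"])
dup_ts = pd.Series(np.arange(5), index=idx)

# Boolean mask that is True for every repeat of an already-seen timestamp.
mask = dup_ts.index.duplicated(keep="first")

# Keep only the first observation for each timestamp.
deduped = dup_ts[~mask]

print(deduped)
print(deduped.index.is_unique)
```

Passing `keep="last"` instead would retain the final observation for each timestamp.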

Don't forget to follow us on [YouTube](http://youtube.com/tirendazacademy) | [Medium](http://tirendazacademy.medium.com) | [Twitter](http://twitter.com/tirendazacademy) | [GitHub](http://github.com/tirendazacademy) | [LinkedIn](https://www.linkedin.com/in/tirendaz-academy) | [Kaggle](https://www.kaggle.com/tirendazacademy) 😎