Generate random data:

In [2]:
import pandas as pd
import numpy as np

In [3]:
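# Daily dates from 1 to 15 January 2023 (both endpoints are included)
fechas = pd.date_range(start='2023-01-01', end='2023-01-15', freq='D')
# One random integer per date, from 1 to 99 (the upper bound 100 is excluded)
datos = np.random.randint(1, 100, len(fechas))
serie_temporal = pd.Series(data=datos, index=fechas, name='Valores')
print(serie_temporal)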


2023-01-01    71
2023-01-02    15
2023-01-03     2
2023-01-04    70
2023-01-05    71
2023-01-06    69
2023-01-07    67
2023-01-08    96
2023-01-09    43
2023-01-10    75
2023-01-11    42
2023-01-12    77
2023-01-13     4
2023-01-14    87
2023-01-15    94
Freq: D, Name: Valores, dtype: int32
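Because `np.random.randint` draws new values on every run, the numbers printed above will differ between executions. The following is a minimal sketch of one way to make the series reproducible, assuming NumPy's seeded `default_rng` generator. The seed value 42 and the name `serie_reproducible` are illustrative and not part of the original notebook.

```python
import numpy as np
import pandas as pd

fechas = pd.date_range(start='2023-01-01', end='2023-01-15', freq='D')

# A seeded generator returns the same draws on every run.
rng = np.random.default_rng(42)  # 42 is an arbitrary, illustrative seed
datos = rng.integers(1, 100, size=len(fechas))  # upper bound 100 is excluded

# Hypothetical name, chosen so it does not overwrite serie_temporal.
serie_reproducible = pd.Series(data=datos, index=fechas, name='Valores')
print(serie_reproducible)
```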


To access the data:

Directly by date:

In [4]:
print(serie_temporal['2023-01-05'])

71


By positional index:

In [5]:
print(serie_temporal.iloc[4])


71


For an explicit date range (both endpoints are included):

In [6]:
print(serie_temporal['2023-01-05':'2023-01-10'])  

2023-01-05    71
2023-01-06    69
2023-01-07    67
2023-01-08    96
2023-01-09    43
2023-01-10    75
Freq: D, Name: Valores, dtype: int32


For selection by position (the end position is excluded):

In [7]:
print(serie_temporal[2:8])

2023-01-03     2
2023-01-04    70
2023-01-05    71
2023-01-06    69
2023-01-07    67
2023-01-08    96
Freq: D, Name: Valores, dtype: int32


Selection by condition:

In [8]:
print(serie_temporal[serie_temporal>50])  

2023-01-01    71
2023-01-04    70
2023-01-05    71
2023-01-06    69
2023-01-07    67
2023-01-08    96
2023-01-10    75
2023-01-12    77
2023-01-14    87
2023-01-15    94
Name: Valores, dtype: int32
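Several conditions can be combined in the same kind of boolean mask. The sketch below rebuilds the series from the values printed earlier in this notebook so that it runs on its own; the names `entre_50_y_80` and `igual` are illustrative.

```python
import pandas as pd

# Values copied from the output printed earlier in this notebook.
fechas = pd.date_range(start='2023-01-01', end='2023-01-15', freq='D')
valores = [71, 15, 2, 70, 71, 69, 67, 96, 43, 75, 42, 77, 4, 87, 94]
serie_temporal = pd.Series(valores, index=fechas, name='Valores')

# Each condition needs its own parentheses; combine with & (and), | (or), ~ (not).
entre_50_y_80 = serie_temporal[(serie_temporal > 50) & (serie_temporal < 80)]
print(entre_50_y_80)

# The same filter written with Series.between, excluding both bounds.
igual = serie_temporal[serie_temporal.between(50, 80, inclusive='neither')]
print(entre_50_y_80.equals(igual))
```

Python's `and` and `or` keywords do not work element by element on a Series, which is why the bitwise operators are used here.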


The loc and iloc indexers:
iloc selects by integer position; loc selects by index label.

In [9]:
print(serie_temporal.loc['2023-01-05'])
print(serie_temporal.loc['2023-01-05':'2023-01-10']) 
print(serie_temporal.iloc[4])
print(serie_temporal.iloc[2:8])

71
2023-01-05    71
2023-01-06    69
2023-01-07    67
2023-01-08    96
2023-01-09    43
2023-01-10    75
Freq: D, Name: Valores, dtype: int32
71
2023-01-03     2
2023-01-04    70
2023-01-05    71
2023-01-06    69
2023-01-07    67
2023-01-08    96
Freq: D, Name: Valores, dtype: int32
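One difference that is easy to miss in the output above: a label slice with `loc` includes both endpoints, while a positional slice with `iloc` excludes the end position. A short sketch, again rebuilding the series from the values printed in this notebook (the names `por_etiqueta` and `por_posicion` are illustrative):

```python
import pandas as pd

# Values copied from the output printed earlier in this notebook.
fechas = pd.date_range(start='2023-01-01', end='2023-01-15', freq='D')
valores = [71, 15, 2, 70, 71, 69, 67, 96, 43, 75, 42, 77, 4, 87, 94]
serie_temporal = pd.Series(valores, index=fechas, name='Valores')

# Label slice: 5 January through 10 January, both included (6 rows).
por_etiqueta = serie_temporal.loc['2023-01-05':'2023-01-10']

# Positional slice: positions 4 to 9; position 10 is excluded (also 6 rows).
por_posicion = serie_temporal.iloc[4:10]

print(len(por_etiqueta), len(por_posicion))
print(por_etiqueta.equals(por_posicion))

# With a DatetimeIndex, loc also accepts a partial date string such as a month.
print(len(serie_temporal.loc['2023-01']))
```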


To reindex onto a new set of dates (dates missing from the original series are filled with 0):

In [10]:
nuevas_fechas = pd.date_range(start='2023-01-01', end='2023-01-20', freq='D')
serie_reindexada = serie_temporal.reindex(nuevas_fechas, fill_value=0)
print(serie_reindexada)

2023-01-01    71
2023-01-02    15
2023-01-03     2
2023-01-04    70
2023-01-05    71
2023-01-06    69
2023-01-07    67
2023-01-08    96
2023-01-09    43
2023-01-10    75
2023-01-11    42
2023-01-12    77
2023-01-13     4
2023-01-14    87
2023-01-15    94
2023-01-16     0
2023-01-17     0
2023-01-18     0
2023-01-19     0
2023-01-20     0
Freq: D, Name: Valores, dtype: int32
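Filling with 0 is only one option, and it can be misleading if 0 is also a plausible measurement. The sketch below shows two alternatives that `reindex` supports: leaving the new dates as NaN, or carrying the last known value forward with `method='ffill'`. The names `sin_relleno` and `arrastrada` are illustrative.

```python
import pandas as pd

# Values copied from the output printed earlier in this notebook.
fechas = pd.date_range(start='2023-01-01', end='2023-01-15', freq='D')
valores = [71, 15, 2, 70, 71, 69, 67, 96, 43, 75, 42, 77, 4, 87, 94]
serie_temporal = pd.Series(valores, index=fechas, name='Valores')

nuevas_fechas = pd.date_range(start='2023-01-01', end='2023-01-20', freq='D')

# Without fill_value, the new dates become NaN and the dtype changes to float64.
sin_relleno = serie_temporal.reindex(nuevas_fechas)
print(sin_relleno.tail(6))

# With method='ffill', the last known value (94) is carried into the new dates.
arrastrada = serie_temporal.reindex(nuevas_fechas, method='ffill')
print(arrastrada.tail(6))
```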


We can change the frequency: here the daily values are resampled to weekly, using the mean of the days in each week.

In [11]:
serie_semanal = serie_temporal.resample('W').mean()
print(serie_semanal)

2023-01-01    71.000000
2023-01-08    55.714286
2023-01-15    60.285714
Freq: W-SUN, Name: Valores, dtype: float64
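The first weekly value above is exactly 71 because `'W'` is an alias for `'W-SUN'`: each bin ends on a Sunday and is labeled with that date, and 2023-01-01 is itself a Sunday, so the first bin holds a single day. A small sketch that counts the days in each bin to make this visible (the name `conteo` is illustrative):

```python
import pandas as pd

# Values copied from the output printed earlier in this notebook.
fechas = pd.date_range(start='2023-01-01', end='2023-01-15', freq='D')
valores = [71, 15, 2, 70, 71, 69, 67, 96, 43, 75, 42, 77, 4, 87, 94]
serie_temporal = pd.Series(valores, index=fechas, name='Valores')

# Number of daily observations that fall into each weekly bin.
conteo = serie_temporal.resample('W').count()
print(conteo)

# Every bin label is the Sunday that closes the week.
print(list(conteo.index.day_name()))
```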


Basic operations:

In [12]:
print(serie_temporal + 10)

2023-01-01     81
2023-01-02     25
2023-01-03     12
2023-01-04     80
2023-01-05     81
2023-01-06     79
2023-01-07     77
2023-01-08    106
2023-01-09     53
2023-01-10     85
2023-01-11     52
2023-01-12     87
2023-01-13     14
2023-01-14     97
2023-01-15    104
Freq: D, Name: Valores, dtype: int32


In [13]:
print(serie_temporal * 2)

2023-01-01    142
2023-01-02     30
2023-01-03      4
2023-01-04    140
2023-01-05    142
2023-01-06    138
2023-01-07    134
2023-01-08    192
2023-01-09     86
2023-01-10    150
2023-01-11     84
2023-01-12    154
2023-01-13      8
2023-01-14    174
2023-01-15    188
Freq: D, Name: Valores, dtype: int32


In [14]:
print("Mean:", serie_temporal.mean())
print("Median:", serie_temporal.median())
print("Max:", serie_temporal.max())
print("Min:", serie_temporal.min())
print("Standard deviation:", serie_temporal.std())


Mean: 58.86666666666667
Median: 70.0
Max: 96
Min: 2
Standard deviation: 30.805534444007822
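Note that the standard deviation printed above is the sample standard deviation: pandas divides by n - 1 by default (`ddof=1`), whereas NumPy's `np.std` divides by n (`ddof=0`). A brief sketch comparing the two on the values printed in this notebook:

```python
import numpy as np
import pandas as pd

# Values copied from the output printed earlier in this notebook.
fechas = pd.date_range(start='2023-01-01', end='2023-01-15', freq='D')
valores = [71, 15, 2, 70, 71, 69, 67, 96, 43, 75, 42, 77, 4, 87, 94]
serie_temporal = pd.Series(valores, index=fechas, name='Valores')

# pandas default: sample standard deviation (divides by n - 1).
print(serie_temporal.std())

# Population standard deviation (divides by n), requested explicitly.
print(serie_temporal.std(ddof=0))

# NumPy's default matches the population version.
print(np.std(valores))
```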


Shifting the data:

In [15]:
print(serie_temporal.shift(1))

2023-01-01     NaN
2023-01-02    71.0
2023-01-03    15.0
2023-01-04     2.0
2023-01-05    70.0
2023-01-06    71.0
2023-01-07    69.0
2023-01-08    67.0
2023-01-09    96.0
2023-01-10    43.0
2023-01-11    75.0
2023-01-12    42.0
2023-01-13    77.0
2023-01-14     4.0
2023-01-15    87.0
Freq: D, Name: Valores, dtype: float64


In [16]:
print(serie_temporal.shift(-1))

2023-01-01    15.0
2023-01-02     2.0
2023-01-03    70.0
2023-01-04    71.0
2023-01-05    69.0
2023-01-06    67.0
2023-01-07    96.0
2023-01-08    43.0
2023-01-09    75.0
2023-01-10    42.0
2023-01-11    77.0
2023-01-12     4.0
2023-01-13    87.0
2023-01-14    94.0
2023-01-15     NaN
Freq: D, Name: Valores, dtype: float64


In [17]:
print(serie_temporal.diff())


2023-01-01     NaN
2023-01-02   -56.0
2023-01-03   -13.0
2023-01-04    68.0
2023-01-05     1.0
2023-01-06    -2.0
2023-01-07    -2.0
2023-01-08    29.0
2023-01-09   -53.0
2023-01-10    32.0
2023-01-11   -33.0
2023-01-12    35.0
2023-01-13   -73.0
2023-01-14    83.0
2023-01-15     7.0
Freq: D, Name: Valores, dtype: float64


To go from a daily sample to a weekly one, using the mean:

In [19]:
serie_semanal = serie_temporal.resample('W').mean()
print(serie_semanal)

2023-01-01    71.000000
2023-01-08    55.714286
2023-01-15    60.285714
Freq: W-SUN, Name: Valores, dtype: float64


In [21]:
print(serie_temporal.resample('W').sum())

2023-01-01     71
2023-01-08    390
2023-01-15    422
Freq: W-SUN, Name: Valores, dtype: int32



In [22]:
print(serie_temporal.resample('W').max())

2023-01-01    71
2023-01-08    96
2023-01-15    94
Freq: W-SUN, Name: Valores, dtype: int32
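The mean, sum and maximum computed in the separate cells above can also be obtained in a single call by passing a list of function names to `agg`. A minimal sketch, with the illustrative name `resumen` for the resulting table:

```python
import pandas as pd

# Values copied from the output printed earlier in this notebook.
fechas = pd.date_range(start='2023-01-01', end='2023-01-15', freq='D')
valores = [71, 15, 2, 70, 71, 69, 67, 96, 43, 75, 42, 77, 4, 87, 94]
serie_temporal = pd.Series(valores, index=fechas, name='Valores')

# One column per aggregation, one row per weekly bin.
resumen = serie_temporal.resample('W').agg(['mean', 'sum', 'max'])
print(resumen)
```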



Filling missing values by carrying the last valid value forward (forward fill):

In [28]:
fechas_incompletas = pd.date_range('2023-01-01', '2023-01-10', freq='2D')
serie_incompleta = pd.Series([10, 20, np.nan, 40, np.nan], index=fechas_incompletas)
print(serie_incompleta.ffill())

2023-01-01    10.0
2023-01-03    20.0
2023-01-05    20.0
2023-01-07    40.0
2023-01-09    40.0
Freq: 2D, dtype: float64


Linear interpolation:

In [29]:
print(serie_incompleta.interpolate())

2023-01-01    10.0
2023-01-03    20.0
2023-01-05    30.0
2023-01-07    40.0
2023-01-09    40.0
Freq: 2D, dtype: float64
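In the output above, the value for 2023-01-09 is 40.0 rather than an interpolated number: there is no later observation to interpolate toward, so by default pandas repeats the last valid value. If only gaps enclosed by valid values should be filled, `interpolate` accepts `limit_area='inside'`. A sketch of that variant (the name `solo_interior` is illustrative):

```python
import numpy as np
import pandas as pd

fechas_incompletas = pd.date_range('2023-01-01', '2023-01-10', freq='2D')
serie_incompleta = pd.Series([10, 20, np.nan, 40, np.nan], index=fechas_incompletas)

# Fill only NaNs that sit between two valid values; the trailing NaN is kept.
solo_interior = serie_incompleta.interpolate(limit_area='inside')
print(solo_interior)
```

Leaving the trailing value as NaN keeps it clear that no measurement or estimate exists for that date.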
