In [33]:
import pandas as pd
import numpy as np

In [34]:
# Skip the first 3 lines of the file; the header is read from the 4th line
wind_speed = pd.read_csv("./datasets/windGuajira2019.csv", skiprows=3)
wind_speed.head()

,time,local_time,electricity,wind_speed
0,2019-01-01 00:00,2018-12-31 19:00,0.952,13.742
1,2019-01-01 01:00,2018-12-31 20:00,0.953,13.783
2,2019-01-01 02:00,2018-12-31 21:00,0.946,13.515
3,2019-01-01 03:00,2018-12-31 22:00,0.94,13.327
4,2019-01-01 04:00,2018-12-31 23:00,0.941,13.341


From the first rows of the `wind_speed` dataset, we can see that it contains two columns of hourly timestamps, `time` and `local_time`; in these rows, `local_time` lags `time` by five hours. Either column could replace the dataframe's default numeric index, turning it into a time series.  
Before doing any transformation, let's check the structure of the dataframe, the data type of each column, and whether there are missing values:

In [35]:
wind_speed.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8760 entries, 0 to 8759
Data columns (total 4 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   time         8760 non-null   object 
 1   local_time   8760 non-null   object 
 2   electricity  8760 non-null   float64
 3   wind_speed   8760 non-null   float64
dtypes: float64(2), object(2)
memory usage: 273.9+ KB


In [36]:
wind_speed.isna().sum()

time           0
local_time     0
electricity    0
wind_speed     0
dtype: int64

None of the columns has missing values. Note, however, that the `time` and `local_time` columns have the *object* dtype, meaning pandas read them as plain strings rather than timestamps. They need to be converted to the `datetime64[ns]` dtype with `pd.to_datetime`:

In [37]:
wind_speed['time'] = pd.to_datetime(wind_speed['time'])
wind_speed['local_time'] = pd.to_datetime(wind_speed['local_time'])

wind_speed.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8760 entries, 0 to 8759
Data columns (total 4 columns):
 #   Column       Non-Null Count  Dtype         
---  ------       --------------  -----         
 0   time         8760 non-null   datetime64[ns]
 1   local_time   8760 non-null   datetime64[ns]
 2   electricity  8760 non-null   float64       
 3   wind_speed   8760 non-null   float64       
dtypes: datetime64[ns](2), float64(2)
memory usage: 273.9 KB
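
As an alternative, the conversion could also be done while reading the file, through the `parse_dates` parameter of `pd.read_csv`. The sketch below is only illustrative: it reads an in-memory copy of the first rows shown above (the names `sample_csv` and `sample` are made up for this example) rather than the real file, for which `skiprows=3` would still be needed.

```python
import io
import pandas as pd

# In-memory stand-in for the CSV file, built from the first rows shown above
sample_csv = io.StringIO(
    "time,local_time,electricity,wind_speed\n"
    "2019-01-01 00:00,2018-12-31 19:00,0.952,13.742\n"
    "2019-01-01 01:00,2018-12-31 20:00,0.953,13.783\n"
    "2019-01-01 02:00,2018-12-31 21:00,0.946,13.515\n"
)

# parse_dates asks read_csv to convert the listed columns to datetimes
sample = pd.read_csv(sample_csv, parse_dates=["time", "local_time"])

print(sample.dtypes)
```

Parsing at read time saves a separate conversion step, while the explicit `pd.to_datetime` calls used in this notebook keep the conversion visible and easy to inspect.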


In [38]:
wind_speed.head()

,time,local_time,electricity,wind_speed
0,2019-01-01 00:00:00,2018-12-31 19:00:00,0.952,13.742
1,2019-01-01 01:00:00,2018-12-31 20:00:00,0.953,13.783
2,2019-01-01 02:00:00,2018-12-31 21:00:00,0.946,13.515
3,2019-01-01 03:00:00,2018-12-31 22:00:00,0.94,13.327
4,2019-01-01 04:00:00,2018-12-31 23:00:00,0.941,13.341
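
With both columns converted, the earlier idea of using a timestamp column as the index could be sketched as follows. This is a minimal illustration on a small stand-in dataframe built from the rows above. Choosing `time` rather than `local_time`, and the names `sample`, `ts`, `steps`, `is_hourly` and `offset`, are assumptions made for this example, not steps taken in the notebook.

```python
import pandas as pd

# Small stand-in for the converted dataframe, using the first rows shown above
sample = pd.DataFrame({
    "time": pd.to_datetime(
        ["2019-01-01 00:00", "2019-01-01 01:00", "2019-01-01 02:00"]
    ),
    "local_time": pd.to_datetime(
        ["2018-12-31 19:00", "2018-12-31 20:00", "2018-12-31 21:00"]
    ),
    "electricity": [0.952, 0.953, 0.946],
    "wind_speed": [13.742, 13.783, 13.515],
})

# Difference between the two timestamp columns in the first row
offset = (sample["time"] - sample["local_time"]).iloc[0]

# Use the `time` column as the index, which yields a DatetimeIndex
ts = sample.set_index("time")

# Check that consecutive timestamps are exactly one hour apart
steps = ts.index.to_series().diff().dropna()
is_hourly = (steps == pd.Timedelta(hours=1)).all()

print(type(ts.index).__name__, is_hourly, offset)
```

A `DatetimeIndex` allows label-based selection by date and time, for example `ts.loc["2019-01-01 01:00"]`, which a plain numeric index does not offer.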
