A time series is any data set whose values are measured at different points in time, e.g. hourly, daily, weekly, or monthly.

https://www.dataquest.io/blog/tutorial-time-series-analysis-with-pandas/

Time series can also be **irregularly spaced**, e.g. a computer event log.
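
The difference between regular and irregular spacing can be checked in pandas. The following is a minimal sketch with made-up timestamps (the names `regular` and `events` and their values are illustrative, not part of the data set used later): `pd.infer_freq` returns a frequency string when the values are evenly spaced and `None` when they are not.

```python
import pandas as pd

# Regularly spaced: one observation per day (illustrative values)
regular = pd.date_range('2023-03-01', periods=4, freq='D')

# Irregularly spaced: hypothetical event-log timestamps
events = pd.to_datetime([
    '2023-03-01 08:15:02',
    '2023-03-01 08:15:40',
    '2023-03-01 11:02:13',
    '2023-03-03 23:59:59',
])

print(pd.infer_freq(regular))      # a daily frequency string
print(pd.infer_freq(events))       # None, since there is no fixed spacing
print(events.to_series().diff())   # gaps between consecutive events
```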

In [1]:
import pandas as pd


The **to_datetime()** function creates Timestamps from strings in a wide variety of date/time formats.

In [2]:
pd.to_datetime('2023-03-12 8:45pm')

Timestamp('2023-03-12 20:45:00')

In [3]:
pd.to_datetime('7/8/1952')    # Ambiguous format: month-first by default, so this is 8 July 1952

Timestamp('1952-07-08 00:00:00')

In [4]:
# dayfirst parameter: read the first number as the day, giving 7 August 1952
pd.to_datetime('7/8/1952', dayfirst=True)

Timestamp('1952-08-07 00:00:00')

**DatetimeIndex**
Passing a list of strings to to_datetime() returns a DatetimeIndex. This data structure lets pandas store large sequences of date/time values compactly and perform vectorized operations on them efficiently, using NumPy datetime64 arrays.

In [5]:
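# Mixed formats in one list. On pandas 2.0+, add format='mixed';
# otherwise a single format is inferred from the first element.
pd.to_datetime(['2018-01-05', '7/8/1952', 'Oct 10, 1995'])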

DatetimeIndex(['2018-01-05', '1952-07-08', '1995-10-10'], dtype='datetime64[ns]', freq=None)
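
To illustrate what "vectorized" means here, the following minimal sketch rebuilds the same three dates (written in ISO format so it parses on any recent pandas version) and applies attribute access and arithmetic to the whole index at once. The variable names are illustrative.

```python
import pandas as pd

idx = pd.to_datetime(['2018-01-05', '1952-07-08', '1995-10-10'])

# Attribute access is vectorized: one call covers every element
years = idx.year
names = idx.day_name()

# Arithmetic is vectorized too: shift every timestamp by one week
shifted = idx + pd.Timedelta(days=7)

print(list(years))
print(list(names))
print(shifted)
```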

**Formatted dates**

In [6]:
pd.to_datetime(['2/25/10', '8/6/17', '12/15/12'], format='%m/%d/%y')

DatetimeIndex(['2010-02-25', '2017-08-06', '2012-12-15'], dtype='datetime64[ns]', freq=None)
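
An explicit `format` removes ambiguity, but real data often contains entries that do not match it. As a minimal sketch (the invalid entry below is made up for illustration), `errors='coerce'` replaces unparseable values with `NaT` instead of raising an exception:

```python
import pandas as pd

raw = ['2/25/10', '8/6/17', 'not a date']  # last entry is deliberately invalid

# errors='coerce' turns unparseable entries into NaT instead of raising
parsed = pd.to_datetime(raw, format='%m/%d/%y', errors='coerce')

print(parsed)
print(parsed.isna())   # flags the entries that failed to parse
```

Counting the `NaT` values afterwards is a quick way to see how many rows failed to parse.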

# Exploring Data

Data set: https://raw.githubusercontent.com/jenfly/opsd/master/opsd_germany_daily.csv

Electricity production and consumption are reported as daily totals in gigawatt-hours (GWh). The columns of the data file are:

* **Date** — The date (yyyy-mm-dd format)
* **Consumption** — Electricity consumption in GWh
* **Wind** — Wind power production in GWh
* **Solar** — Solar power production in GWh
* **Wind+Solar** — Sum of wind and solar power production in GWh

**Questions to Ask**
* When is electricity consumption typically highest and lowest?
* How do wind and solar power production vary with seasons of the year?
* What are the long-term trends in electricity consumption, solar power, and wind power?
* How do wind and solar power production compare with electricity consumption, and how has this ratio changed over time?


In [8]:
opsd_daily = pd.read_csv('https://raw.githubusercontent.com/jenfly/opsd/master/opsd_germany_daily.csv')
opsd_daily.shape

(4383, 5)

In [9]:
opsd_daily.head(3)

| | Date | Consumption | Wind | Solar | Wind+Solar |
|---|---|---|---|---|---|
| 0 | 2006-01-01 | 1069.184 | NaN | NaN | NaN |
| 1 | 2006-01-02 | 1380.521 | NaN | NaN | NaN |
| 2 | 2006-01-03 | 1442.533 | NaN | NaN | NaN |


In [10]:
opsd_daily.tail(3)

| | Date | Consumption | Wind | Solar | Wind+Solar |
|---|---|---|---|---|---|
| 4380 | 2017-12-29 | 1295.08753 | 584.277 | 29.854 | 614.131 |
| 4381 | 2017-12-30 | 1215.44897 | 721.247 | 7.467 | 728.714 |
| 4382 | 2017-12-31 | 1107.11488 | 721.176 | 19.98 | 741.156 |


In [11]:
# Data types of each column
opsd_daily.dtypes

Date            object
Consumption    float64
Wind           float64
Solar          float64
Wind+Solar     float64
dtype: object
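
The `object` dtype shows that the Date column was read as plain strings rather than dates. The following is a minimal sketch of converting it after loading, using an inline copy of the first three rows so that it runs offline (the names `csv_text` and `df` are illustrative):

```python
import io
import pandas as pd

# Tiny inline sample: the first three rows of the data set
csv_text = """Date,Consumption,Wind,Solar,Wind+Solar
2006-01-01,1069.184,,,
2006-01-02,1380.521,,,
2006-01-03,1442.533,,,
"""
df = pd.read_csv(io.StringIO(csv_text))
print(df['Date'].dtype)     # strings, not datetimes

# Convert the column, then promote it to the index
df['Date'] = pd.to_datetime(df['Date'])
df = df.set_index('Date')
print(df.index.dtype)       # a datetime64 dtype
```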

## Load data, parse the Date field as datetime, and set it as the index

In [17]:
opsd_daily = pd.read_csv('https://raw.githubusercontent.com/jenfly/opsd/master/opsd_germany_daily.csv', parse_dates=[0], index_col=0)

In [16]:
opsd_daily.dtypes

Date           datetime64[ns]
Consumption           float64
Wind                  float64
Solar                 float64
Wind+Solar            float64
dtype: object

Note that, going by the cell numbers, this check ran before In [17], so Date still appears here as a datetime64[ns] column. Once `index_col=0` is applied, Date becomes the index and `dtypes` lists only the four float64 columns.

In [18]:
opsd_daily.head(3)

| Date | Consumption | Wind | Solar | Wind+Solar |
|---|---|---|---|---|
| 2006-01-01 | 1069.184 | NaN | NaN | NaN |
| 2006-01-02 | 1380.521 | NaN | NaN | NaN |
| 2006-01-03 | 1442.533 | NaN | NaN | NaN |
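
One benefit of a DatetimeIndex is that rows can be selected with date strings through `.loc`. The following is a minimal sketch on a hypothetical stand-in frame (`demo` and its values are made up so the example runs offline); the same calls should apply to `opsd_daily`.

```python
import numpy as np
import pandas as pd

# Hypothetical daily frame standing in for opsd_daily (values are made up)
dates = pd.date_range('2017-01-01', '2017-12-31', freq='D')
demo = pd.DataFrame({'Consumption': np.arange(len(dates), dtype=float)},
                    index=dates)
demo.index.name = 'Date'

one_day = demo.loc['2017-08-10']                # single day, returned as a Series
window = demo.loc['2017-01-20':'2017-01-22']    # slice, both endpoints included
february = demo.loc['2017-02']                  # partial string: the whole month

print(one_day)
print(window)
print(len(february))
```

Unlike positional slicing, a label slice on dates includes both endpoints.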


## Extracting year, month, and weekday name from the index

In [20]:
opsd_daily['Year'] = opsd_daily.index.year
opsd_daily['Month'] = opsd_daily.index.month
opsd_daily['Weekday Name'] = opsd_daily.index.day_name()
# Display a random sampling of 5 rows
opsd_daily.sample(5, random_state=0)

| Date | Consumption | Wind | Solar | Wind+Solar | Year | Month | Weekday Name |
|---|---|---|---|---|---|---|---|
| 2008-08-23 | 1152.011 | NaN | NaN | NaN | 2008 | 8 | Saturday |
| 2013-08-08 | 1291.984 | 79.666 | 93.371 | 173.037 | 2013 | 8 | Thursday |
| 2009-08-27 | 1281.057 | NaN | NaN | NaN | 2009 | 8 | Thursday |
| 2015-10-02 | 1391.05 | 81.229 | 160.641 | 241.87 | 2015 | 10 | Friday |
| 2009-06-02 | 1201.522 | NaN | NaN | NaN | 2009 | 6 | Tuesday |
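
Columns like these make it easy to group by calendar features, which is one way to approach the question of when consumption is highest and lowest. The following is a minimal sketch on a hypothetical two-week frame (`demo` and its values are made up, with lower values on weekends); it is not a result from the real data.

```python
import pandas as pd

# Hypothetical stand-in for opsd_daily: two weeks of made-up values,
# lower on weekends. 2017-01-02 is a Monday.
dates = pd.date_range('2017-01-02', periods=14, freq='D')
values = ([1500.0] * 5 + [1200.0, 1100.0]
          + [1520.0] * 5 + [1220.0, 1120.0])
demo = pd.DataFrame({'Consumption': values}, index=dates)
demo['Weekday Name'] = demo.index.day_name()

# Mean consumption per weekday, lowest first
by_day = demo.groupby('Weekday Name')['Consumption'].mean()
print(by_day.sort_values())
```

Grouping `opsd_daily` by its `Month` or `Year` column in the same way would give seasonal and long-term summaries.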
