# Correlation between external parameters and electricity prices

This notebook provides graphs and analysis of the correlation between several external parameters and electricity prices. The data is provided by Nordpool, Ilmateenistus and Yahoo Finance.

In [1]:
# imports
import pandas as pd
import os
import matplotlib.pyplot as plt
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import yfinance as yf
# render matplotlib plots inline in the notebook
%matplotlib inline 

In [2]:
# changing to correct directory
# DO NOT RUN A SECOND TIME (Restart & Run All is allowed)
os.chdir("..")

In [3]:
# reading in files
nordpool = pd.read_csv(os.path.join("data", "processed", "nordpool_estonia.csv"))
ilmateenistus = pd.read_csv(os.path.join("data", "processed", "ilmateenistus.csv"))
gas = yf.Ticker("TTF=F").history(period="max")[["Open", "Close"]]
gas["gas_price"] = round((gas.pop("Open") + gas.pop("Close")) / 2, 3)

# combining the separate Date and Time columns into a single DateTime column
nordpool['DateTime'] = pd.to_datetime(nordpool.pop('Date')) + pd.to_timedelta(nordpool.pop('Time'))
ilmateenistus['DateTime'] = pd.to_datetime(ilmateenistus.pop('Date')) + pd.to_timedelta(ilmateenistus.pop('Time'))

In [23]:
# Combining the ilmateenistus and nordpool dataframes into 1
combined_df = nordpool.merge(ilmateenistus, how="left", on=["DateTime"])
combined_df.dropna(inplace=True)
combined_df.drop_duplicates(inplace=True)

# Aggregating nordpool to daily means and joining it onto the gas price dataframe
nordpool_daily = nordpool.groupby(pd.Grouper(key="DateTime", freq="1D")).mean().reset_index()
gas_combined_nord = gas.merge(nordpool_daily, how="left", left_index=True, right_on="DateTime")
gas_combined_nord.drop(columns="consumption", inplace=True)
gas_combined_nord.dropna(inplace=True)
gas_combined_nord.drop_duplicates(inplace=True)
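
The merge above only finds matches when the gas index and the `DateTime` column hold the same kind of timestamps. A minimal sketch of one way to align them, assuming the yfinance index is timezone-aware while the Nordpool timestamps are not (this depends on the installed yfinance version, so treat it as an assumption). The `gas_demo` and `nordpool_daily_demo` frames and their values are made up for illustration.

```python
import pandas as pd

# Made-up stand-in for the yfinance frame: a timezone-aware daily index (assumption).
gas_demo = pd.DataFrame(
    {"gas_price": [20.0, 21.5, 19.8]},
    index=pd.date_range("2021-01-04", periods=3, freq="D", tz="Europe/Amsterdam"),
)

# Made-up stand-in for nordpool_daily with timezone-naive timestamps.
nordpool_daily_demo = pd.DataFrame(
    {
        "DateTime": pd.date_range("2021-01-04", periods=3, freq="D"),
        "elspot_price": [45.0, 50.2, 48.1],
    }
)

# Drop the timezone (keeping the local wall time) and snap to midnight,
# so that both sides of the merge use identical daily keys.
gas_demo.index = gas_demo.index.tz_localize(None).normalize()

merged_demo = gas_demo.merge(
    nordpool_daily_demo, how="left", left_index=True, right_on="DateTime"
)
print(merged_demo)
```

If the real gas index is already timezone-naive, this step is unnecessary and can be skipped.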

In [5]:
# Making a compressed version of the dataframe for plotting
# (the correlation coefficients are still calculated from the original dataframe).
# This compresses the data down to weekly data by calculating the mean of the numeric columns.
compressed_df = combined_df.groupby(pd.Grouper(key="DateTime", freq="1W")).mean(numeric_only=True).reset_index()

In [19]:
# Helper function for plotting
def plot_correlation(parameter: str, df=compressed_df):
    """Plot elspot_price (left axis) and `parameter` (right axis) against DateTime."""
    fig = make_subplots(specs=[[{"secondary_y": True}]])
    fig.add_trace(
        go.Scatter(x=df.DateTime, y=df["elspot_price"], name="Electricity price"),
        secondary_y=False,
    )
    fig.add_trace(
        go.Scatter(x=df.DateTime, y=df[parameter], name=parameter),
        secondary_y=True,
    )

    # Add figure title
    fig.update_layout(
        title_text=parameter + " vs Elspot price"
    )
    # Set x-axis title
    fig.update_xaxes(title_text="DateTime")
    # Set y-axes titles
    fig.update_yaxes(title_text="Elspot price", secondary_y=False)
    fig.update_yaxes(title_text=parameter, secondary_y=True)

    fig.show()


## Analysis towards correlations between consumption and electricity prices in Estonia

In [7]:
correlation_consumption = nordpool["consumption"].corr(nordpool["elspot_price"], method="pearson")
print("The correlation between consumption and electricity prices is " + str(correlation_consumption))

The correlation between consumption and electricity prices is 0.2896650305171631


In [None]:
nordpool_weekly = nordpool_daily.groupby(pd.Grouper(key="DateTime", freq="1W")).mean().reset_index()
nordpool_weekly.dropna(inplace=True)
nordpool_weekly.drop_duplicates(inplace=True)
plot_correlation("consumption", df=nordpool_weekly)

### Conclusion

## Analysis towards correlations between gas price (Dutch Natural Gas) and electricity prices in Estonia

In [9]:
correlation_gas = gas_combined_nord["gas_price"].corr(gas_combined_nord["elspot_price"], method="pearson")
print("The correlation between gas price and electricity prices is " + str(correlation_gas))

The correlation between gas price and electricity prices is 0.6691329708661296


In [25]:
plot_correlation("gas_price", df=gas_combined_nord)

### Conclusion

## Analysis towards correlations between temperature and electricity prices in Estonia

In [None]:
correlation_temperature = combined_df["Temperature"].corr(combined_df["elspot_price"], method="pearson")
print("The correlation between temperature and electricity prices is " + str(correlation_temperature))

In [None]:
plot_correlation("Temperature")

### Conclusion

As expected, the overall trend is that electricity prices rise as the temperature approaches either extreme, very high or very low. This is mostly because people start regulating the temperature inside their homes at those extremes, and electricity usage spikes. Once electricity usage goes through the roof, the brunt of the work is done by coal and gas power plants. Their CO2 emissions are taxed in Estonia, and since it costs the provider more to generate electricity, the price for the end user is also significantly higher. This logic is briefly described here: https://www.err.ee/1608240762/elektri-hind-on-hakanud-huppeliselt-kasvama (in Estonian).

The logic behind the correlation remains the same throughout the 5 years - the optimal temperature for the lowest electricity prices is between 0 and 15 degrees Celsius. Several factors contribute to this. One is that during this part of the year (spring to fall) there is much more daylight and much less time is spent indoors. In addition to the points stated in the previous paragraph, this means that electricity is consumed for fewer hours of the day, and since demand is lower, coal and gas power plants don't have to be utilised as much and electricity prices remain low. Almost as soon as the temperature drops below 0 degrees Celsius or rises above 15 degrees Celsius, there is also a small spike in electricity costs.
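
One way to check a band like this numerically is to group prices by temperature band and compare the means. The sketch below uses made-up data rather than the Nordpool and Ilmateenistus data; the column names follow `combined_df`, while `band_demo` and the price formula are hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up data for illustration only (NOT the Nordpool/Ilmateenistus data):
# prices are generated to be cheapest around the middle of the 0-15 degree band.
rng = np.random.default_rng(1)
temperature = rng.uniform(-20, 35, 3000)
price = 30 + 0.05 * (temperature - 7.5) ** 2 + rng.normal(0, 3, temperature.size)
band_demo = pd.DataFrame({"Temperature": temperature, "elspot_price": price})

# Split temperatures into three bands: below 0, 0 to 15, above 15 degrees Celsius.
band_demo["band"] = pd.cut(
    band_demo["Temperature"],
    bins=[-np.inf, 0, 15, np.inf],
    labels=["below 0", "0 to 15", "above 15"],
)

# Mean price per band; observed=True keeps only bands that actually occur.
band_means = band_demo.groupby("band", observed=True)["elspot_price"].mean()
print(band_means)
```

Applied to the real data, the same grouping on `combined_df` would show whether the middle band really has the lowest mean price.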

Computing a linear correlation coefficient proved to be the wrong approach here: sudden drops in temperature affect electricity prices in the same way as sudden rises, so the two effects cancel each other out and the Pearson coefficient is very small (0.0924).
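
This cancelling effect can be reproduced on made-up data. In the sketch below, prices rise towards both temperature extremes, so the Pearson coefficient against raw temperature is close to zero, while the coefficient against the distance from a comfortable midpoint is high. The midpoint of 7.5 degrees (the centre of the 0-15 band) and the price formula are assumptions made for the example, not values estimated from the notebook's data.

```python
import numpy as np
import pandas as pd

# Made-up U-shaped relationship, for illustration only.
rng = np.random.default_rng(42)
temperature = rng.uniform(-20, 35, 2000)
price = 30 + 0.05 * (temperature - 7.5) ** 2 + rng.normal(0, 3, temperature.size)
u_demo = pd.DataFrame({"Temperature": temperature, "elspot_price": price})

# Pearson on raw temperature: cold and hot extremes pull in opposite directions.
linear_r = u_demo["Temperature"].corr(u_demo["elspot_price"], method="pearson")

# Pearson on the distance from the assumed midpoint: both extremes now count as "far".
distance = (u_demo["Temperature"] - 7.5).abs()
distance_r = distance.corr(u_demo["elspot_price"], method="pearson")

print("Raw temperature:       ", round(linear_r, 3))
print("Distance from midpoint:", round(distance_r, 3))
```

Whether the real data behaves this way would have to be checked by applying the same transformation to `combined_df["Temperature"]`.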

Temperature-wise, all 5 years are virtually identical. However, prices were most stable during 2016-2017, varied more and more during 2018-2019, and remained somewhat stable until the end of 2020, when there were 2 sudden spikes in electricity prices. This anomaly cannot really be explained by temperature changes, so I will address it further in the next pieces of analysis.






## Analysis towards correlations between wind speed and electricity prices in Estonia

In [None]:
correlation = combined_df["WindSpeed"].corr(combined_df["elspot_price"], method="pearson")
print("The correlation between wind speed and electricity prices is " + str(correlation))

In [None]:
plot_correlation("WindSpeed")

### Conclusion

As one can see, wind speeds don't follow any clear pattern; they go up and down rapidly from week to week. Finding a definite correlation between electricity prices and wind speed is therefore very optimistic, not to say impossible.

However, if we look at the overall trend, we can see that when wind speeds are higher, electricity prices tend to be lower. Why is that? It might have to do with the reason I brought up in the previous section (concerning temperature), where I mentioned that once there is a shortage of electricity, coal and gas are used to a much larger extent to cover the demand. Wind turbines account for over 50 percent of Estonia's renewable energy production and we have over 144 turbines in total (https://tuuleenergia.ee/). If wind levels are higher, the turbines are more effective and there is less need for production via coal and gas, which means that producers don't have to pay as much CO2 tax and the overall cost for the consumer is lower.

Looking at the evidence, I've calculated the Pearson correlation between wind speed and electricity prices, and it supports, although extremely weakly, our theory that electricity prices drop as wind speeds get higher. The correlation coefficient is about -0.0103, which means there is close to no correlation between the 2 factors. The negative sign points towards our theory, but since the absolute value is under 0.3, strong conclusions can't be made.

Higher wind speeds might also call for higher energy usage for heating, especially when temperatures are low, but the pros far outweigh the cons and the overall trend is still visible in this case. Long live the wind!

## Analysis towards correlations between air pressure and electricity prices in Estonia

In [None]:
correlation = combined_df["AirPressure"].corr(combined_df["elspot_price"], method= "pearson")
print("The correlation between air pressure and electricity prices is " + str(correlation))

In [None]:
plot_correlation("AirPressure")

### Conclusion

Based on the data provided by Nordpool and Ilmateenistus, during the years 2016-2021 there seems to be no correlation between air pressure and electricity prices. Any similarities one might notice in the graph are most likely coincidental, which is consistent with the extremely low Pearson coefficient (0.0711).

## Analysis towards correlations between precipitation and electricity prices in Estonia

In [None]:
correlation = combined_df["Precipitation"].corr(combined_df["elspot_price"], method= "pearson")
print("The correlation between Precipitation and electricity prices is " + str(correlation))

In [None]:
plot_correlation("Precipitation")

### Conclusion

Based on the data provided by Nordpool and Ilmateenistus, during the years 2016-2021 there seems to be no correlation between precipitation and electricity prices, which is supported by the practically nonexistent correlation coefficient (0.002).

Logically speaking, this conclusion makes sense. Higher precipitation doesn't call for any additional energy usage, which means that there is no increase in demand and no reason for electricity prices to rise.

## Analysis towards correlations between air humidity and electricity prices in Estonia

In [None]:
correlation = combined_df["AirHumidity"].corr(combined_df["elspot_price"], method= "pearson")
print("The correlation between air humidity and electricity prices is " + str(correlation))

In [None]:
plot_correlation("AirHumidity")

### Conclusion

As was the case with air pressure and precipitation, air humidity also appears to have no significant effect on electricity prices in Estonia. The correlation coefficient is just -0.1345, far below the threshold at which we could draw conclusions.

The reason is most probably the same as with the previous 2 factors - air humidity doesn't call for any additional energy usage and thus there is no increase in demand - there is no reason for the electricity prices to rise.


## Overall analysis concerning correlation between weather parameters as a whole and electricity prices in Estonia

All in all, we've analysed electricity prices and their correlations to 5 different weather parameters. We did this by calculating correlation coefficients and studying graphs based on weekly changes in the weather parameters and electricity prices. Our goal was to find any existing correlations and to find out which conditions affect prices positively or negatively. The conclusions differed between parameters. Most parameters didn't seem to have much effect on prices at all, which isn't very surprising. After all, for prices to be affected, there needs to be some sudden change in human behaviour and energy usage, and these just aren't affected by parameters such as air humidity or air pressure.
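
The per-parameter coefficients can also be collected in one step, which makes them easier to compare side by side. A minimal sketch on made-up, independent random data (so every coefficient here is expected to be near zero); the column names mirror `combined_df`, while `summary_demo` and its values are hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up, independent random columns for illustration only.
rng = np.random.default_rng(0)
n = 1000
summary_demo = pd.DataFrame(
    {
        "elspot_price": rng.normal(40, 10, n),
        "Temperature": rng.normal(6, 9, n),
        "WindSpeed": rng.gamma(2.0, 2.0, n),
        "AirPressure": rng.normal(1013, 8, n),
        "Precipitation": rng.exponential(0.2, n),
        "AirHumidity": rng.uniform(40, 100, n),
    }
)

# Pearson coefficient of every weather column against the price,
# sorted by absolute value so the strongest relationships come first.
correlations = (
    summary_demo.corr(method="pearson")["elspot_price"]
    .drop("elspot_price")
    .sort_values(key=lambda s: s.abs(), ascending=False)
)
print(correlations)
```

On the real data, the same expression applied to the weather columns and `elspot_price` of `combined_df` should reproduce the coefficients quoted in the sections above.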

However, one parameter seemed to have a considerable effect on prices - temperature. As said beforehand, this was to be expected, because changes in temperature towards the extremes (outside the 0-15 degree "sweet spot") cause people to use electricity to change the indoor climate, which leads to higher energy demand and higher electricity prices.

Why does the rise of demand dictate the price? This has to do with taxes set on CO2. When there is a higher strain on the electricity market, coal and gas power plants are used more intensely, which leads to higher CO2 production and thus greater taxes to be paid.

When it comes to expectations, we expected wind to have a far greater effect, mainly because wind farms become more effective with stronger winds and more renewable energy is produced (less CO2 tax!). However, as our analysis shows, this isn't really the case. The effect is there, but it's minuscule and can't really be taken into consideration in the big picture.

The analysis fulfilled its purpose and we have a relatively definitive answer - based on our data, temperature is pretty much the only weather parameter that has an effect on electricity prices.