# Energy and Environment Homework - Mathilde, Hardi and Edem

## General
- Power demand follows French consumption from 2023 and will be rescaled to 1 TWh over the year
- solar production follows hourly profile from France 2024
- Dispatchable sources (gas, nuclear, hydro, ...) are aggregated into a single source, which can be adjusted from 0 to 100% of its installed capacity
- Among the dispatchable sources, we consider an IDEAL storage system (the energy stored can be fully recovered)
- the data in the CSV is in MW

## 1 Numerical analysis

### Question 1
The load factor (Edem: Should it be capacity factor instead?!) is the ratio between the average and installed
power. Explain why this quantity can be approximated as the ratio of the average and maximal power throughout the year. Estimate the load factor of the solar production.

### Answer:
The capacity of a PV system is given as its nameplate power, which is measured under standard test conditions (1000 W/m2 solar irradiance, AM1.5 spectrum, and a cell temperature of 25°C). These conditions are approximately equivalent to a sunny day in Western Europe. Since the maximum solar power output in France will have occurred on a sunny day, the conditions at that time are similar to the standard conditions used to determine the capacity (in some regions the irradiance might have been a bit higher than 1000 W/m2, in others a bit lower). Thus, we can estimate the installed capacity by the maximum solar generation throughout the year.

As calculated below, the capacity factor (load factor) of solar energy in France in 2023 was approx. 0.189.


In [None]:
import pandas as pd
import numpy as np
from matplotlib.dates import DateFormatter
import matplotlib.pyplot as plt
from sklearn import linear_model

In [None]:
df = pd.read_csv(
    "data-Rte-2023-CSV.csv",
    sep=';',          # correct delimiter
    encoding='utf-8', # handles accents
    on_bad_lines='skip'  # skip problematic lines
)

df["date_time"] = pd.to_datetime(df["Date"] + " " + df["Heures"], dayfirst=True, errors="coerce")
df = df.set_index("date_time")

average_power_solar = df['Solaire'].mean()

capacity_factor_solar = average_power_solar/df['Solaire'].max()
print('Estimated Capacity Factor for Solar Generation in France 2023: ', capacity_factor_solar)

In [None]:
df.head()

### Question 2
We will first consider an integration strategy relying entirely on the dispatchable source.
- The installed solar capacity is such that x TWh are produced over the year.
- We don’t consider storage in this question.
- When the solar power exceeds demand, production is curtailed. When solar power is not sufficient
 to meet power demand, the difference is supplied by the dispatchable source.
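
One possible way to set up this strategy, sketched on synthetic hourly profiles rather than the RTE file: rescale demand to 1 TWh and solar to x TWh per year, then take the largest remaining shortfall as the required dispatchable capacity. The profile shapes and the helper names `rescale_to_energy` and `required_dispatchable_capacity` are illustrative assumptions, not part of the assignment.

```python
import numpy as np

HOURS = 8760  # hourly steps in a non-leap year (assumption)

def rescale_to_energy(profile_mw, target_twh, dt_h=1.0):
    """Scale a power profile (MW) so that its yearly energy equals target_twh."""
    energy_twh = profile_mw.sum() * dt_h / 1e6  # MWh -> TWh
    return profile_mw * target_twh / energy_twh

def required_dispatchable_capacity(demand_mw, solar_mw):
    """Largest shortfall (MW) left after solar; any solar surplus is curtailed."""
    residual = np.clip(demand_mw - solar_mw, 0.0, None)
    return residual.max()

# Synthetic stand-ins for the real profiles (assumption, for illustration only)
t = np.arange(HOURS)
demand_shape = (1.0
                + 0.2 * np.cos(2 * np.pi * t / HOURS)             # winter peak
                + 0.1 * np.cos(2 * np.pi * (t % 24 - 19) / 24))   # evening peak
solar_shape = np.clip(np.sin(np.pi * (t % 24 - 6) / 12), 0.0, None)  # daylight only

demand = rescale_to_energy(demand_shape, 1.0)   # 1 TWh over the year
x_values = np.linspace(0.0, 2.0, 21)            # solar production in TWh
capacity_gw = np.array([
    required_dispatchable_capacity(demand, rescale_to_energy(solar_shape, x)) / 1000
    for x in x_values
])
```

Plotting `capacity_gw` against `x_values` gives the requested curve. With these synthetic shapes the demand peak falls after sunset, so the curve stays flat; the real data may behave differently.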

#### 2.1 Plot the required installed capacity of the dispatchable source (in GW) as a function of the solar production

In [None]:
solar_residual_load = (df['Consommation']-df['Solaire']).dropna()
#required_dispatchable_capacity = solar_residual_load.resample('D').mean().groupby(df['Solaire'].dropna().resample('D').mean())


plt.figure(figsize=[8,5])
plt.plot(solar_residual_load.resample('D').mean(), linewidth=0.9)
plt.plot(df['Solaire'].dropna().resample('D').mean(),linewidth=0.9)
plt.plot(df['Consommation'].dropna().resample('D').mean(),linewidth=0.9)
plt.ylim(0, 76000)
plt.legend(['Residual_load','Solar', 'Consommation'])
plt.tight_layout()
plt.grid(True,alpha=0.4)

plt.figure(figsize=[8,5])
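scatter_df = df[['Solaire', 'Consommation']].dropna()  # keep only rows where both values exist
plt.scatter(scatter_df['Solaire']/1000, (scatter_df['Consommation'] - scatter_df['Solaire'])/1000, marker='x', linewidths=0.5)
plt.xlabel('Solar generation (GW)')
plt.ylabel('Residual load (GW)')
plt.ylim(20, 90)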




'''
# Ordinary Least Squares linear regression (APPROACH WRONG: required capacity is always the maximum) - on average required capacity
# Define a linear regressor
reg = linear_model.LinearRegression(fit_intercept=True)

# Prepare input and output for fit
X_train = df['Solaire'].dropna().values[:, None]
y_train = solar_residual_load.values

# Fit
reg.fit(X_train, y_train)

# Print
print('Estimated coefficients:')
print('Intercept:\t{:.2e} (MW)'.format(reg.intercept_))
print('Slope:\t\t{:.2e} (MW/MW)'.format(reg.coef_[0]))

# Compute the train R2 with Scikit-Learn and print them
r2_train = reg.score(X_train, y_train)
print('Train R2:\t{:.2f}'.format(r2_train))

# Define an array of 100 solar generation values ranging from 0 to 14
x_pred = np.linspace(0, 14, 100)

# Prepare these values for the prediction
X_pred = x_pred[:, None]

# Predict
y_pred = reg.predict(X_pred)

plt.plot(x_pred, y_pred, color='red')
plt.legend(['daily average residual load','linear model of temperature dependency of residual load'])
'''

# Maximum capacity

# 1. Find maximum solar generation
solar_max = df['Solaire'].max()

# 2. Create 30 evenly spaced bin edges (29 bins) from 0 to solar_max
bins = np.linspace(0, solar_max, 30)

# 3. Bin the solar generation values
binned = pd.cut(df['Solaire'], bins=bins, include_lowest=True)

# 4. For each bin, compute the maximum residual load
residual_max_per_bin = solar_residual_load.groupby(binned, observed=False).max()

# 5. Compute representative (midpoint) solar generation value for each bin
bin_centers = (bins[:-1] + bins[1:]) / 2

# 6. Convert to DataFrame for clarity
result = pd.DataFrame({
    "Solar_Gen_bin_center_MW": bin_centers,
    "Max_Residual_Load_MW": residual_max_per_bin.values
})

# 7. Plot the relationship
plt.plot(result["Solar_Gen_bin_center_MW"]/1000,  # optional: convert to GW
         result["Max_Residual_Load_MW"]/1000, 'x-', lw=1, color='red')
plt.xlabel("Solar Generation (GW)")
plt.ylabel("Required capacity of dispatchable source (GW)")
plt.title("Maximum Residual Load vs Solar Generation (France 2023)")
plt.grid(True,alpha=0.3)
plt.tight_layout()
plt.show()

In [None]:
solar_residual_load.head()

####  2.3 Plot the amount of electricity produced by the dispatchable source and the total amount of electricity produced by the whole system over the year (in TWh) as a function of the solar production
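
A minimal sketch of how these yearly energies could be computed, again on synthetic hourly profiles. The function name `yearly_energies` and the profile shapes are assumptions made for illustration. Whether "produced by the whole system" should count solar before or after curtailment is also an assumption here (before).

```python
import numpy as np

def yearly_energies(demand_mw, solar_mw, dt_h=1.0):
    """Return (dispatchable, curtailed, total produced) in TWh over the series."""
    dispatchable = np.clip(demand_mw - solar_mw, 0.0, None).sum() * dt_h / 1e6
    curtailed = np.clip(solar_mw - demand_mw, 0.0, None).sum() * dt_h / 1e6
    solar_available = solar_mw.sum() * dt_h / 1e6
    # Assumption: total production counts solar before curtailment
    total = dispatchable + solar_available
    return dispatchable, curtailed, total

# Synthetic hourly shapes (illustration only), scaled so that demand is 1 TWh
t = np.arange(8760)
demand_shape = 1.0 + 0.2 * np.cos(2 * np.pi * t / 8760)
solar_shape = np.clip(np.sin(np.pi * (t % 24 - 6) / 12), 0.0, None)
demand = demand_shape / demand_shape.sum() * 1e6   # MW, sums to 1 TWh

x_values = np.linspace(0.0, 2.0, 21)               # solar production in TWh
results = np.array([
    yearly_energies(demand, solar_shape / solar_shape.sum() * x * 1e6)
    for x in x_values
])
dispatchable_twh, curtailed_twh, total_twh = results.T
```

By construction, dispatchable energy plus solar minus curtailment equals the 1 TWh demand, which is a quick consistency check before plotting.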

In [None]:
# Yearly consumption in TWh: mean power (MW) x 8760 h gives MWh, divided by 1e6 gives TWh
total_electricity_production = df['Consommation'].mean() * 8760 / 1e6

plt.figure(figsize=[12,8])
#plt.plot(total_electricity_production, bins)


#### 2.4 Comment on 2.3 results

### Question 3
We will now consider a different strategy, where the integration is entirely based on a storage system. This question is thus independent from the previous one.

* The installed solar capacity is such that x TWh are produced over the year.
* The dispatchable sources are used at a constant power, so that the total energy produced by the
system (solar + dispatchable) is 1 TWh over the year.
* The power balance is ensured entirely by a storage system, which charges whenever production
exceeds demand, and supplies power whenever demand exceeds production.
1. Plot the required installed capacity (in GW) and the load factor of the dispatchable source as a
function of the solar production.
2. Plot the total amount of energy provided by and to the storage system throughout the year as a
function of the solar production.
3. Plot the required capacity of the storage system (ie the maximal amount of energy that needs to be
stored at any given time, in TWh) as a function of the solar production.
4. Comment these results. Incl
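
The storage-only strategy could be sketched as follows, assuming an ideal (lossless) storage system, hourly data and x below 1 TWh. The function name `storage_requirements` and the synthetic profiles are illustrative assumptions. The storage level is tracked as the running sum of the net power, and the required capacity is taken as the spread between its highest and lowest value.

```python
import numpy as np

def storage_requirements(demand_mw, solar_mw, dt_h=1.0):
    """Constant dispatchable power (MW), energy charged and discharged (TWh),
    and required storage capacity (TWh) for an ideal storage system."""
    # Constant dispatchable power that closes the yearly energy balance
    p_disp = (demand_mw.sum() - solar_mw.sum()) / len(demand_mw)
    net = solar_mw + p_disp - demand_mw          # >0 charges, <0 discharges
    charged = np.clip(net, 0.0, None).sum() * dt_h / 1e6
    discharged = np.clip(-net, 0.0, None).sum() * dt_h / 1e6
    # Storage level relative to its (unknown) initial value
    level = np.concatenate(([0.0], np.cumsum(net))) * dt_h / 1e6
    capacity = level.max() - level.min()
    return p_disp, charged, discharged, capacity

# Synthetic hourly shapes (illustration only)
t = np.arange(8760)
demand_shape = 1.0 + 0.2 * np.cos(2 * np.pi * t / 8760)
solar_shape = np.clip(np.sin(np.pi * (t % 24 - 6) / 12), 0.0, None)
demand = demand_shape / demand_shape.sum() * 1e6          # 1 TWh per year
solar = solar_shape / solar_shape.sum() * 0.4 * 1e6       # x = 0.4 TWh per year

p_disp, charged, discharged, capacity = storage_requirements(demand, solar)
```

Under this reading the dispatchable source runs at its installed capacity all year, so its load factor is 1 by construction. Looping over several values of x gives the curves asked for in items 1 to 3.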

### Question 4

We now consider an intermediate strategy, in which both the dispatchable source and the storage
system are used to integrate the solar production. This question is thus independent from the previous
ones.


* The installed solar capacity is such that 0.4 TWh are produced over the year.
* The dispatchable source has an installed capacity of y GW.
* The storage system supplies energy to meet demand when {solar + dispatchable} sources are not
sufficient.
* When {solar + dispatchable} exceed demand, the excess power can be used to charge the battery.
The charging power is limited to the minimal value such that the battery receives as much energy
as it provides throughout the year.
1. What is the minimum dispatchable capacity ymin (in GW) below which the system cannot be balanced? What is the maximal dispatchable capacity ymax above which the storage system is not required?
2 PHY 51055 - Homework - 2025
2. Plot the amount of energy produced by the dispatchable source, the required capacity of the storage
system and the total amount of energy provided by the storage system over the year as a function
of y.
3. Comment your results. I
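
One way to estimate the two bounds, assuming ideal storage and a free initial state of charge: below ymin the dispatchable source cannot supply the missing yearly energy even when running at full power all year, and above ymax it covers the largest shortfall on its own. The function name `dispatchable_bounds` and the synthetic profiles are illustrative assumptions.

```python
import numpy as np

def dispatchable_bounds(demand_mw, solar_mw):
    """Return (y_min, y_max) in GW for hourly demand and solar profiles in MW."""
    # Above y_max the dispatchable source alone covers every shortfall
    y_max = np.clip(demand_mw - solar_mw, 0.0, None).max()
    # Below y_min the yearly energy balance cannot close, even at full power
    y_min = max((demand_mw.sum() - solar_mw.sum()) / len(demand_mw), 0.0)
    return y_min / 1000, y_max / 1000

# Synthetic hourly shapes (illustration only)
t = np.arange(8760)
demand_shape = 1.0 + 0.2 * np.cos(2 * np.pi * t / 8760)
solar_shape = np.clip(np.sin(np.pi * (t % 24 - 6) / 12), 0.0, None)
demand = demand_shape / demand_shape.sum() * 1e6      # 1 TWh per year
solar = solar_shape / solar_shape.sum() * 0.4 * 1e6   # 0.4 TWh per year

y_min_gw, y_max_gw = dispatchable_bounds(demand, solar)
```

With 1 TWh of demand and 0.4 TWh of solar, `y_min_gw` is simply 0.6 TWh spread over 8760 hours, independent of the profile shapes; `y_max_gw` depends on the actual profiles.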

## 2 Analytical investigation

The analysis is adapted from the work of Arthur Clerjon and Fabien Perdu at CEA. It aims at showcasing that
different storage technologies may be relevant for different applications.
Consider that we have a constant power demand P throughout period T (typically, 1 year).
During this period, the power source produces a varying power, with a period ∆T. We note β the amplitude
of the variation and α the oversizing of production as compared to demand. These quantities are illustrated
over a single cycle below.

![cycle](cycle.png "Single cycle")

We aim at supplying demand at all times through a storage system.
1. Estimate the amount of energy supplied to the storage system over the duration ∆T.
2. Estimate the amount of energy provided by the storage system over the duration ∆T.
3. The efficiency of the storage system is defined as the fraction η which can be recovered from the energy
supplied to it. Express the required oversizing of the production α as a function of the efficiency of the
storage system.
4. For simplicity, we will consider that the variation of the power source production brings it down to zero.
How does this simplify the previous expression? How much energy is produced over the duration ∆T?
How much is stored? How much is provided by the storage system?
5. The LCOE for electricity production is γe (in €/kWh produced). The cost of the battery γ is related to the
amount of energy the battery is able to release in one full discharge. Estimate total cost over the period
T.
6. Consider two possible storage technologies with efficiency ηi,j and cost γi,j. Discuss the tradeoff and
illustrate it with concrete examples. _(In the context of this exercise, consider that batteries are cheap and
inefficient, while thermal storage is expensive and efficient)_ in white text lol