<a href="https://colab.research.google.com/github/anschelalmog/SDRML_project/blob/main/project_notebook.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<div style="text-align: center">
    <img src='https://github.com/CLAIR-LAB-TECHNION/CLAI/blob/main/tutorials/assets/logo.png?raw=true' width=800/>  
</div>

# **Sequential Decision Making and Reinforcement Learning (SDRML) Final Project**

## 👥 **Team Members**
| **Name** | **ID** |
|:---------------:|:----------:|
| [Almog Anschel](mailto:anschelalmog@campus.technion.ac.il) | 316353531 |
| [Eden](mailto:eden@campus.technion.ac.il) | 123456778 |


## 📌 [GitHub Repo & Paper](https://github.com/anschelalmog/SDRML_project.git)




# <img src="https://img.icons8.com/?size=100&id=ZGqV6cHUtDmj&format=png&color=000000" style="height:50px;display:inline"> Imports and Utility Functions




## Imports

In [2]:
from IPython.display import clear_output
# torch.utils.tensorboard relies on the `tensorboard` package
!pip install torch gymnasium tensorboard matplotlib pyyaml
clear_output()

import os
import random
import numpy as np
import matplotlib.pyplot as plt
import yaml
from datetime import datetime
##################
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.optim import Adam
from torch.utils.tensorboard import SummaryWriter
##################
import gymnasium as gym
from gymnasium.envs.registration import register
from gymnasium import spaces, utils



## Utility Functions

In [6]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

seed = 42
random.seed(seed)
np.random.seed(seed)
torch.manual_seed(seed)
if torch.cuda.is_available():
    torch.cuda.manual_seed_all(seed)
print("Random seed set.")


register(id='ElectricityMarket-v0',
         entry_point='__main__:ElectricityMarketEnv',
         kwargs={'args': None},
         max_episode_steps=200)

# Setup TensorBoard
%load_ext tensorboard

# Create the log directory and the SummaryWriter used by the Trainer
log_dir = "logs/"
os.makedirs(log_dir, exist_ok=True)
writer = SummaryWriter(log_dir)

Using device: cuda
Random seed set.
The tensorboard extension is already loaded. To reload it, use:
  %reload_ext tensorboard


  logger.warn(f"Overriding environment {new_spec.id} already in registry.")


# <img src="https://img.icons8.com/?size=100&id=ZKacxH3j7_2b&format=png&color=000000" style="height:50px;display:inline"> Environment

The **ElectricityMarketEnv** is an environment for simulating an electricity market with a battery storage system.

The agent interacts with the environment by taking continuous actions, charging (positive) or discharging (negative) a battery.

The environment models household electricity demand and market price as functions of normalized episode time. Each is a sum of two Gaussian peaks with additive Gaussian noise, clipped at zero.
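
As a rough illustration of those two curves, the sketch below evaluates their noise-free parts at a few normalized times. The coefficients are copied from the `_demand_function` and `_price_function` methods of the environment class; the names `base_demand` and `base_price` are illustrative only, and the noise terms (standard deviation 5.0 for demand, 2.0 for price) are left out.

```python
import math


def base_demand(t):
    """Noise-free household demand at normalized time t in [0, 1]."""
    return (100.0 * math.exp(-((t - 0.4) ** 2) / (2 * 0.05 ** 2))
            + 120.0 * math.exp(-((t - 0.7) ** 2) / (2 * 0.1 ** 2)))


def base_price(t):
    """Noise-free market price at normalized time t in [0, 1]."""
    return (50.0 * math.exp(-((t - 0.3) ** 2) / (2 * 0.07 ** 2))
            + 80.0 * math.exp(-((t - 0.8) ** 2) / (2 * 0.08 ** 2)))


# Print both curves at the peak locations and at the episode boundaries.
for t in (0.0, 0.3, 0.4, 0.7, 0.8, 1.0):
    print(f"t={t:.1f}  demand={base_demand(t):7.2f}  price={base_price(t):6.2f}")
```

The price peaks (near `t = 0.3` and `t = 0.8`) do not coincide with the demand peaks (near `t = 0.4` and `t = 0.7`), which is what gives the agent room to shift energy in time.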

---

## **States**
- **$SoC$**: State of charge, the current energy level in the battery.
- **$D_t$**: Household electricity demand at the current timestep.
- **$P_t$**: Market price of electricity at the current timestep.


## **Action**
- A continuous value in **$[-\text{battery\_cap}, \text{battery\_cap}]$**, where $\text{battery\_cap} = 100$ in the code.
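
A requested action is not always fully executed: charging is limited by the free capacity and discharging by the stored energy. The sketch below mirrors that bookkeeping from the environment's `step` method as a standalone function; `apply_action` is a hypothetical helper name, not part of the notebook.

```python
def apply_action(soc, action, capacity=100.0):
    """Return (new_soc, charged, discharged) for one battery action."""
    # Clip the request to the action bounds first.
    action = max(-capacity, min(capacity, action))
    if action >= 0:
        # Charging cannot push the state of charge above the capacity.
        charged = min(action, capacity - soc)
        return soc + charged, charged, 0.0
    # Discharging cannot draw more energy than the battery holds.
    discharged = min(-action, soc)
    return soc - discharged, 0.0, discharged


print(apply_action(50.0, 80.0))   # (100.0, 50.0, 0.0): request exceeds free capacity
print(apply_action(50.0, -70.0))  # (0.0, 0.0, 50.0): request exceeds stored energy
print(apply_action(50.0, -20.0))  # (30.0, 0.0, 20.0): request fits
```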


## **Reward (Options):**

Charging steps always receive a reward of 0; the options below apply to discharging steps, where $\text{sold\_energy} = \max(0, \text{discharge\_amount} - D_t)$.

1. **profit** (default): $R = P_t \times \text{sold\_energy}$
   - Rewards the agent solely based on the revenue generated from selling surplus energy after meeting demand.
2. **penalty_unmet**: $R = P_t \times \text{discharge\_amount} - \lambda \times \max(0, D_t - \text{discharge\_amount})$, with $\lambda = 10$ in the code.
   - Rewards all discharged energy but applies a **penalty** for any portion of household demand that is not met.
   - This encourages the agent to **prioritize internal demand** before selling energy.
3. **degradation**: $R = P_t \times \text{sold\_energy} - c \times \text{discharge\_amount}$, with $c = 0.5$ in the code.
   - Rewards surplus energy sold while **subtracting a cost** proportional to the total discharged energy to account for **battery degradation**.
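
To make the difference between the three options concrete, here is a small worked example that evaluates all of them on the same discharging step. It is a standalone sketch: the function name `rewards` is hypothetical, and the defaults `penalty=10.0` and `degradation_cost=0.5` are the coefficients hard-coded in the environment's `_compute_reward` method.

```python
def rewards(price, demand, discharged, penalty=10.0, degradation_cost=0.5):
    """Evaluate the three reward options for one discharging step."""
    sold = max(0.0, discharged - demand)    # surplus sold to the grid
    unmet = max(0.0, demand - discharged)   # household demand left uncovered
    return {
        "profit": price * sold,
        "penalty_unmet": price * discharged - penalty * unmet,
        "degradation": price * sold - degradation_cost * discharged,
    }


# Demand fully covered: 20 units of surplus are sold.
print(rewards(price=40.0, demand=30.0, discharged=50.0))
# {'profit': 800.0, 'penalty_unmet': 2000.0, 'degradation': 775.0}

# Demand only partly covered: nothing is sold and 10 units are unmet.
print(rewards(price=40.0, demand=30.0, discharged=20.0))
# {'profit': 0.0, 'penalty_unmet': 700.0, 'degradation': -10.0}
```

Note that under `degradation` a discharge that does not exceed demand yields a negative reward, so that option discourages small discharges.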


In [5]:
class ElectricityMarketEnv(gym.Env):
    def __init__(self, args=None):
        super(ElectricityMarketEnv, self).__init__()

        # --------------------------
        # Battery parameters
        # --------------------------
        self.battery_capacity = 100.0  # Maximum energy that can be stored
        self.initial_soc = 50.0  # Initial State of Charge
        self.soc = self.initial_soc  # Current SoC

        # --------------------------
        # Environment dynamics parameters
        # --------------------------
        self.max_steps = 200  # Total timesteps in one episode
        self.current_step = 0  # Timestep counter

        # --------------------------
        # Define the action space.
        # The agent's action is a continuous value in [-battery_capacity, battery_capacity].
        # A positive value represents charging and a negative value represents discharging.
        # --------------------------
        self.action_space = spaces.Box(low=-self.battery_capacity,
                                       high=self.battery_capacity,
                                       shape=(1,),
                                       dtype=np.float32)

        # --------------------------
        # Define the observation space.
        # The state consists of [SoC, Demand, Price].
        # SoC is bounded between 0 and battery_capacity.
        # Demand and Price are non-negative; we use a very high upper bound.
        # --------------------------
        obs_low = np.array([0.0, 0.0, 0.0], dtype=np.float32)
        obs_high = np.array([self.battery_capacity, np.finfo(np.float32).max, np.finfo(np.float32).max],
                            dtype=np.float32)
        self.observation_space = spaces.Box(low=obs_low, high=obs_high, dtype=np.float32)

        # --------------------------
        # Reward function selection.
        # Default is "profit", but you can choose "penalty_unmet" or "degradation" via the args parameter.
        # --------------------------
        self.reward_type = "profit"
        if args is not None and hasattr(args, 'reward_type'):
            self.reward_type = args.reward_type
            #logger.info(
             #   f"ElectricityMarketEnv initialized with battery_capacity={self.battery_capacity}, initial_soc={self.initial_soc}, max_steps={self.max_steps}, reward_type={self.reward_type}")

        # --------------------------
        # Set the random seed for reproducibility.
        # --------------------------
        self.seed()

    def seed(self, seed=None):
        """
        Set the seed for random number generation.
        """
        self.np_random, seed = utils.seeding.np_random(seed)
        return [seed]

    def reset(self, seed=None, options=None):
        """
        Reset the environment to the initial state at the start of an episode.
        Returns a tuple (observation, info) as required by Gymnasium.
        """
        # Reseed self.np_random only when a seed is passed explicitly.
        super().reset(seed=seed)
        self.soc = self.initial_soc
        self.current_step = 0
        # logger.info(f"Environment reset: initial SoC={self.soc}")
        obs = self._get_obs()
        return obs, {}

    def _get_obs(self):
        """
        Construct the current observation.
        The observation includes the current SoC, the demand, and the price.
        Demand and price are functions of normalized time.
        """
        t_norm = self.current_step / self.max_steps  # Normalized time [0, 1]
        demand = self._demand_function(t_norm)
        price = self._price_function(t_norm)
        obs = np.array([self.soc, demand, price], dtype=np.float32)
        # logger.debug(f"Observation: SoC={self.soc}, Demand={demand:.2f}, Price={price:.2f}")
        return obs

    def _demand_function(self, t):
        r"""
        Compute the household electricity demand at time t.

        The demand function is modeled as a combination of two Gaussian functions:

        \[
        f_D(t) = 100 \cdot \exp\left(-\frac{(t-0.4)^2}{2 \cdot (0.05)^2}\right)
                + 120 \cdot \exp\left(-\frac{(t-0.7)^2}{2 \cdot (0.1)^2}\right)
        \]

        A small Gaussian noise (mean 0, std 5.0) is added to simulate randomness.
        """
        base_demand = (100.0 * np.exp(-((t - 0.4) ** 2) / (2 * (0.05 ** 2))) +
                       120.0 * np.exp(-((t - 0.7) ** 2) / (2 * (0.1 ** 2))))
        noise = self.np_random.normal(0, 5.0)  # Noise term
        demand = max(base_demand + noise, 0.0)  # Ensure demand is non-negative
        # logger.debug(f"Computed demand={demand:.2f} at t={t:.2f}")

        return demand

    def _price_function(self, t):
        r"""
        Compute the market price of electricity at time t.

        The price function is modeled as a combination of two Gaussian functions:

        \[
        f_P(t) = 50 \cdot \exp\left(-\frac{(t-0.3)^2}{2 \cdot (0.07)^2}\right)
                + 80 \cdot \exp\left(-\frac{(t-0.8)^2}{2 \cdot (0.08)^2}\right)
        \]

        A small Gaussian noise (mean 0, std 2.0) is added to simulate randomness.
        """
        base_price = (50.0 * np.exp(-((t - 0.3) ** 2) / (2 * (0.07 ** 2))) +
                      80.0 * np.exp(-((t - 0.8) ** 2) / (2 * (0.08 ** 2))))
        noise = self.np_random.normal(0, 2.0)  # Noise term
        price = max(base_price + noise, 0.0)  # Ensure price is non-negative
        # logger.debug(f"Computed price={price:.2f} at t={t:.2f}")
        return price

    def step(self, action):
        """
        Execute one timestep in the environment.

        Parameters:
            action (array): A 1D numpy array with one element representing the
                            amount of energy to charge (positive) or discharge (negative).

        Returns:
            obs (array): Next observation ([SoC, demand, price]).
            reward (float): Reward earned this timestep.
            terminated (bool): Always False; the environment has no terminal state.
            truncated (bool): True once max_steps timesteps have elapsed.
            info (dict): Additional info (empty in this implementation).
        """
        # Clip the action within allowed bounds.
        action = np.clip(action, self.action_space.low, self.action_space.high)
        action_value = action[0]  # Extract scalar from array
        info = {}

        # Compute current demand and price based on normalized time.
        t_norm = self.current_step / self.max_steps
        demand = self._demand_function(t_norm)
        price = self._price_function(t_norm)

        if action_value >= 0:
            # --------------------------
            # Charging the battery.
            # The battery cannot be charged beyond its capacity.
            # --------------------------
            charge_amount = min(action_value, self.battery_capacity - self.soc)
            self.soc += charge_amount
            reward = 0.0  # No immediate reward for charging
            # logger.info(
                #f"Step {self.current_step}: Charging battery by {charge_amount:.2f} kWh, New SoC={self.soc:.2f}")

        else:
            # --------------------------
            # Discharging the battery.
            # First, limit the discharge to the available energy in the battery.
            # --------------------------
            discharge_requested = -action_value  # Convert to a positive value
            discharge_possible = min(discharge_requested, self.soc)
            self.soc -= discharge_possible

            # --------------------------
            # Use discharged energy to meet the household demand.
            # Any energy beyond meeting the demand is sold to the grid.
            # --------------------------
            if discharge_possible >= demand:
                sold_energy = discharge_possible - demand
            else:
                sold_energy = 0.0  # Not enough to cover demand, so nothing is sold

            # Compute reward based on the chosen reward function.
            reward = self._compute_reward(demand, sold_energy, price, discharge_possible)
            # logger.info(
               # f"Step {self.current_step}: Discharged {discharge_possible:.2f} kWh, Demand={demand:.2f}, Sold Energy={sold_energy:.2f}, Reward={reward:.2f}")

        self.current_step += 1
        done = self.current_step >= self.max_steps

        terminated = False
        truncated = done  # Consider ending due to time limit as truncation

        # Construct the next observation from the advanced timestep, so the returned
        # demand and price describe the step the agent acts on next (noise is resampled).
        obs = self._get_obs()

        return obs, reward, terminated, truncated, info

    def _compute_reward(self, demand, sold_energy, price, discharge_amount):
        r"""
        Compute the reward based on the selected reward function.

        Three reward functions are available:

        1. "profit" (default):
            Reward = \( \text{price} \times \text{sold\_energy} \)

            This function rewards the agent solely based on the revenue generated
            from selling surplus energy to the grid after meeting demand.

        2. "penalty_unmet":
            Reward = \( \text{price} \times \text{discharge\_amount} - \lambda \times \max(0, \text{demand} - \text{discharge\_amount}) \)

            Here, the agent earns revenue for all discharged energy but is penalized
            for any portion of household demand that is not met. This encourages the agent
            to ensure that the internal demand is satisfied before selling energy.

        3. "degradation":
            Reward = \( \text{price} \times \text{sold\_energy} - c \times \text{discharge\_amount} \)

            This reward takes into account battery degradation. While the agent earns revenue
            for selling surplus energy, it incurs a cost proportional to the discharged energy,
            which simulates battery wear-and-tear and encourages cautious discharging.
        """
        if self.reward_type == "profit":
            # --------------------------
            # Reward function 1: Profit Reward (default)
            # --------------------------
            return price * sold_energy

        elif self.reward_type == "penalty_unmet":
            # --------------------------
            # Reward function 2: Penalize Unmet Demand.
            # --------------------------
            penalty = 10.0  # Penalty coefficient (tunable)
            unmet_demand = max(0.0, demand - discharge_amount)
            return price * discharge_amount - penalty * unmet_demand

        elif self.reward_type == "degradation":
            # --------------------------
            # Reward function 3: Battery Degradation Cost Aware.
            # --------------------------
            degradation_cost = 0.5  # Cost per unit of discharged energy (tunable)
            return price * sold_energy - degradation_cost * discharge_amount

        else:
            # Fallback to default profit reward if an unknown reward type is provided.
            return price * sold_energy

    def render(self, mode='human'):
        """
        Render the current state of the environment.
        """
        print(f"Step: {self.current_step}, SoC: {self.soc:.2f}")


# <img src="https://img.icons8.com/?size=100&id=6MP1kA74ozKg&format=png&color=000000" style="height:50px;display:inline"> Agent

In [20]:
# --------------------------
# Replay Buffer
# --------------------------
class ReplayBuffer:
    """
    A simple Replay Buffer for storing transitions observed from the environment.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = []
        self.position = 0

    def push(self, state, action, reward, next_state, done):
        """Saves a transition."""
        if len(self.buffer) < self.capacity:
            self.buffer.append(None)
        self.buffer[self.position] = (state, action, reward, next_state, done)
        self.position = (self.position + 1) % self.capacity

    def sample(self, batch_size):
        """
        Samples a batch of transitions.
        Returns:
            state, action, reward, next_state, done: each as a NumPy array.
        """
        batch = random.sample(self.buffer, batch_size)
        state, action, reward, next_state, done = map(np.stack, zip(*batch))
        return state, action, reward, next_state, done

    def __len__(self):
        return len(self.buffer)


# --------------------------
# Actor Network (Policy)
# --------------------------
class Actor(nn.Module):
    """
    Actor network for SAC.
    Given a state, it outputs the mean and log standard deviation of a Gaussian distribution.
    The action is sampled using the reparameterization trick and squashed using tanh.
    """

    def __init__(self, state_dim, action_dim, max_action, hidden_dim=256):
        super(Actor, self).__init__()
        self.max_action = max_action

        self.l1 = nn.Linear(state_dim, hidden_dim)
        self.l2 = nn.Linear(hidden_dim, hidden_dim)

        self.mean = nn.Linear(hidden_dim, action_dim)
        self.log_std = nn.Linear(hidden_dim, action_dim)

        # Limits for numerical stability
        self.LOG_STD_MIN = -20
        self.LOG_STD_MAX = 2

    def forward(self, state):
        x = F.relu(self.l1(state))
        x = F.relu(self.l2(x))
        mean = self.mean(x)
        log_std = self.log_std(x)
        # Clamp log_std for numerical stability
        log_std = torch.clamp(log_std, self.LOG_STD_MIN, self.LOG_STD_MAX)
        return mean, log_std

    def sample(self, state, evaluate=False):
        """
        Sample an action given the state.
        If evaluate is True, returns the deterministic action (mean) after squashing.
        Otherwise, returns a sampled action along with its log probability.
        """
        mean, log_std = self.forward(state)
        std = log_std.exp()

        if evaluate:
            # For evaluation, use the mean (deterministic policy)
            action = torch.tanh(mean) * self.max_action
            log_prob = None
        else:
            # Reparameterization trick
            normal = torch.distributions.Normal(mean, std)
            x_t = normal.rsample()  # sample and add noise
            action = torch.tanh(x_t) * self.max_action

            # Compute log probability and adjust for tanh squashing
            log_prob = normal.log_prob(x_t)
            log_prob = log_prob.sum(dim=-1, keepdim=True)
            # Adjustment term for tanh squashing (to correct probability density)
            log_prob -= torch.log(1 - torch.tanh(x_t).pow(2) + 1e-6).sum(dim=-1, keepdim=True)
        return action, log_prob


# --------------------------
# Critic Network (Q-function)
# --------------------------
class Critic(nn.Module):
    """
    Critic network for SAC.
    Evaluates the Q-value for a given state and action pair.
    """

    def __init__(self, state_dim, action_dim, hidden_dim=256):
        super(Critic, self).__init__()
        self.l1 = nn.Linear(state_dim + action_dim, hidden_dim)
        self.l2 = nn.Linear(hidden_dim, hidden_dim)
        self.l3 = nn.Linear(hidden_dim, 1)

    def forward(self, state, action):
        # Concatenate state and action along the last dimension
        x = torch.cat([state, action], dim=-1)
        x = F.relu(self.l1(x))
        x = F.relu(self.l2(x))
        x = self.l3(x)
        return x


# --------------------------
# SAC Agent
# --------------------------
class Agent:
    """
    Soft Actor-Critic Agent.
    This agent encapsulates the actor, two critics, their target networks, optimizers,
    and a replay buffer. It supports automatic entropy tuning for the exploration/exploitation tradeoff.
    """

    def __init__(self, env, args=None):
        self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        print(f"Using device: {self.device}")

        # Hyperparameters (with reasonable default values)

        self.gamma = 0.99 if args is None else getattr(args, 'gamma', 0.99)  # Discount factor
        self.tau = 0.005 if args is None else getattr(args, 'tau', 0.005)  # Soft update coefficient for target networks
        self.batch_size = 256 if args is None else getattr(args, 'batch_size', 256)  # Batch size for updates

        # Learning rates
        self.actor_lr = 3e-4 if args is None else getattr(args, 'actor_lr', 3e-4)
        self.critic_lr = 3e-4 if args is None else getattr(args, 'critic_lr', 3e-4)

        # Replay Buffer capacity
        self.buffer_capacity = 1_000_000 if args is None else getattr(args, 'buffer_capacity', 1_000_000)

        # Automatic entropy tuning flag and target entropy
        self.automatic_entropy_tuning = True if args is None else getattr(args, 'automatic_entropy_tuning', True)
        self.target_entropy = -env.action_space.shape[0]

        # Environment dimensions
        self.state_dim = env.observation_space.shape[0]
        self.action_dim = env.action_space.shape[0]
        self.max_action = float(env.action_space.high[0])

        # Initialize actor and critics (two Q–networks)
        self.actor = Actor(self.state_dim, self.action_dim, self.max_action).to(self.device)
        self.critic1 = Critic(self.state_dim, self.action_dim).to(self.device)
        self.critic2 = Critic(self.state_dim, self.action_dim).to(self.device)

        # Create target networks and initialize with critic parameters
        self.critic1_target = Critic(self.state_dim, self.action_dim).to(self.device)
        self.critic2_target = Critic(self.state_dim, self.action_dim).to(self.device)
        self.critic1_target.load_state_dict(self.critic1.state_dict())
        self.critic2_target.load_state_dict(self.critic2.state_dict())

        # Optimizers
        self.actor_optimizer = Adam(self.actor.parameters(), lr=self.actor_lr)
        self.critic_optimizer = Adam(list(self.critic1.parameters()) + list(self.critic2.parameters()),
                                     lr=self.critic_lr)

        # Entropy coefficient (alpha) with optional automatic tuning
        if self.automatic_entropy_tuning:
            self.log_alpha = torch.zeros(1, requires_grad=True, device=self.device)
            self.alpha_optimizer = Adam([self.log_alpha], lr=self.actor_lr)
            self.alpha = self.log_alpha.exp().detach()
        else:
            self.alpha = getattr(args, 'alpha', 0.2)

        # Replay Buffer for experience storage
        self.replay_buffer = ReplayBuffer(self.buffer_capacity)

    def select_action(self, state, evaluate=False):
        """
        Given a state, select an action according to the current policy.
        If evaluate is True, return a deterministic action.
        """
        state = torch.FloatTensor(state).to(self.device).unsqueeze(0)
        action, _ = self.actor.sample(state, evaluate=evaluate)
        return action.detach().cpu().numpy()[0]

    def store_transition(self, state, action, reward, next_state, done):
        """
        Store a transition in the replay buffer.
        """
        self.replay_buffer.push(state, action, reward, next_state, done)

    def update(self):
        """
        Update the networks (actor and critics) using a mini-batch sampled from the replay buffer.
        """
        if len(self.replay_buffer) < self.batch_size:
            return None  # Not enough samples to update

        # Sample a batch of transitions
        state, action, reward, next_state, done = self.replay_buffer.sample(self.batch_size)
        state = torch.FloatTensor(state).to(self.device)
        action = torch.FloatTensor(action).to(self.device)
        reward = torch.FloatTensor(reward).to(self.device).unsqueeze(1)
        next_state = torch.FloatTensor(next_state).to(self.device)
        done = torch.FloatTensor(done).to(self.device).unsqueeze(1)

        # --------------------------
        # Critic update
        # --------------------------
        with torch.no_grad():
            # Sample next actions and compute their log probabilities
            next_action, next_log_prob = self.actor.sample(next_state)
            # Compute target Q-values using target networks and take the minimum to mitigate overestimation
            target_Q1 = self.critic1_target(next_state, next_action)
            target_Q2 = self.critic2_target(next_state, next_action)
            target_Q = torch.min(target_Q1, target_Q2) - self.alpha * next_log_prob
            # Bellman backup for Q functions
            target_value = reward + (1 - done) * self.gamma * target_Q

        # Compute current Q estimates
        current_Q1 = self.critic1(state, action)
        current_Q2 = self.critic2(state, action)
        # Mean Squared Error loss for both critics
        critic_loss = F.mse_loss(current_Q1, target_value) + F.mse_loss(current_Q2, target_value)

        self.critic_optimizer.zero_grad()
        critic_loss.backward()
        self.critic_optimizer.step()

        # --------------------------
        # Actor update
        # --------------------------
        new_action, log_prob = self.actor.sample(state)
        # Evaluate new actions using the current critics
        Q1_new = self.critic1(state, new_action)
        Q2_new = self.critic2(state, new_action)
        Q_new = torch.min(Q1_new, Q2_new)
        # The actor loss maximizes the expected return and entropy
        actor_loss = (self.alpha * log_prob - Q_new).mean()

        self.actor_optimizer.zero_grad()
        actor_loss.backward()
        self.actor_optimizer.step()

        # --------------------------
        # Entropy coefficient update (if automatic tuning is enabled)
        # --------------------------
        if self.automatic_entropy_tuning:
            alpha_loss = -(self.log_alpha * (log_prob + self.target_entropy).detach()).mean()
            self.alpha_optimizer.zero_grad()
            alpha_loss.backward()
            self.alpha_optimizer.step()
            self.alpha = self.log_alpha.exp().detach()

        # --------------------------
        # Soft update of target networks
        # --------------------------
        for param, target_param in zip(self.critic1.parameters(), self.critic1_target.parameters()):
            target_param.data.copy_(self.tau * param.data + (1 - self.tau) * target_param.data)
        for param, target_param in zip(self.critic2.parameters(), self.critic2_target.parameters()):
            target_param.data.copy_(self.tau * param.data + (1 - self.tau) * target_param.data)

        # For monitoring purposes, you might return losses
        return actor_loss.item(), critic_loss.item()
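
The least obvious step in `Actor.sample` is the log-probability correction for the tanh squashing. The sketch below is a framework-free check of that change-of-variables formula in one dimension: if $u = \tanh(x)$ with $x \sim \mathcal{N}(\mu, \sigma^2)$, then $\log p(u) = \log \mathcal{N}(x) - \log(1 - \tanh^2(x))$, and the resulting density should integrate to about 1 over $(-1, 1)$. The function names and the values `mean=0.3`, `std=0.5` are illustrative assumptions; `eps` plays the same role as the `1e-6` in the notebook code.

```python
import math


def normal_log_prob(x, mean, std):
    """Log-density of N(mean, std^2) at x."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std) - 0.5 * math.log(2.0 * math.pi)


def squashed_log_prob(x, mean, std, eps=1e-6):
    """Log-density of u = tanh(x) with x ~ N(mean, std^2), via change of variables."""
    return normal_log_prob(x, mean, std) - math.log(1.0 - math.tanh(x) ** 2 + eps)


def total_mass(mean, std, n=20000):
    """Midpoint-rule integral of the squashed density over u in (-1, 1)."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        u = -1.0 + (i + 0.5) * h
        total += math.exp(squashed_log_prob(math.atanh(u), mean, std)) * h
    return total


# Should print a value close to 1, confirming the corrected density is normalized.
print(f"mass with correction: {total_mass(0.3, 0.5):.4f}")
```

The notebook additionally scales the squashed action by `max_action`. The exact density of the scaled action would carry an extra constant $-\log(\text{max\_action})$ per dimension, which the code above and the notebook both leave out. That constant does not affect the policy gradient, although it does shift the entropy estimate that is compared against `target_entropy`.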


# <img src="https://img.icons8.com/?size=100&id=122695&format=png&color=000000" style="height:50px;display:inline"> Trainer

In [8]:
class Trainer:
    def __init__(self, env, agent, episodes=100):
        self.env = env
        self.agent = agent
        self.episodes = episodes
        self.episode_rewards = []

    def train(self):
        for episode in range(self.episodes):
            state, _ = self.env.reset()
            done = False
            episode_reward = 0.0

            while not done:
                action = self.agent.select_action(state)
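                next_state, reward, terminated, truncated, _ = self.env.step(action)
                # The episode ends on either flag, but only true termination
                # should disable bootstrapping in the critic target.
                done = terminated or truncated
                self.agent.store_transition(state, action, reward, next_state, float(terminated))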
                self.agent.update()
                state = next_state
                episode_reward += reward

            self.episode_rewards.append(episode_reward)
            writer.add_scalar('Reward/Episode', episode_reward, episode)
            print(f"Episode {episode+1}/{self.episodes} - Reward: {episode_reward:.2f}")
        self.plot_metrics()

    def plot_metrics(self):
        plt.plot(range(len(self.episode_rewards)), self.episode_rewards)
        plt.xlabel("Episode")
        plt.ylabel("Reward")
        plt.title("Training Progress")
        plt.show()
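
Gymnasium's `step` returns separate `terminated` and `truncated` flags, and this environment only ever sets `truncated` (through its time limit). The sketch below shows the loop pattern on a tiny stand-in environment; `CountdownEnv` and `rollout` are hypothetical names used only for this illustration.

```python
class CountdownEnv:
    """Hypothetical stand-in env: never terminates, truncates after `limit` steps."""

    def __init__(self, limit=5):
        self.limit = limit
        self.steps = 0

    def reset(self):
        self.steps = 0
        return 0.0, {}

    def step(self, action):
        self.steps += 1
        terminated = False                    # no terminal state
        truncated = self.steps >= self.limit  # time limit reached
        return float(self.steps), 1.0, terminated, truncated, {}


def rollout(env):
    """Run one episode and collect (state, reward, next_state, terminated) tuples."""
    transitions = []
    state, _ = env.reset()
    done = False
    while not done:
        next_state, reward, terminated, truncated, _ = env.step(0.0)
        # Store `terminated`, not `done`: a time-limit cut-off is not a terminal
        # state, so a critic target may still bootstrap from next_state.
        transitions.append((state, reward, next_state, terminated))
        done = terminated or truncated
        state = next_state
    return transitions


episode = rollout(CountdownEnv(limit=5))
print(len(episode), episode[-1])  # 5 (4.0, 1.0, 5.0, False)
```

A loop that only checks `terminated` would never exit on such an environment, which is why both flags are combined into `done`.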

# <img src="https://img.icons8.com/?size=100&id=114903&format=png&color=000000" style="height:50px;display:inline"> Run

In [9]:
# 'ElectricityMarket-v0' was already registered in the Utility Functions cell.
env = gym.make('ElectricityMarket-v0')
agent = Agent(args=None, env=env)
trainer = Trainer(env, agent, episodes=50)
trainer.train()
print("Training Complete.")

# <img src="https://img.icons8.com/?size=100&id=122512&format=png&color=000000" style="height:50px;display:inline"> Results and Plots

# <img src="https://img.icons8.com/?size=100&id=121704&format=png&color=000000" style="height:50px;display:inline"> Expansion

# <img src="https://img.icons8.com/?size=100&id=122512&format=png&color=000000" style="height:50px;display:inline"> Results and Plots

# <img src="https://img.icons8.com/?size=100&id=4Nd6LXTNhoed&format=png&color=000000" style="height:50px;display:inline"> Conclusions and Future Work