# Step 1: Import Required Libraries

In [1]:
# Import the NumPy library for numerical operations, including array manipulations and mathematical functions
import numpy as np
# Import the Pandas library for handling and analyzing structured data (e.g., CSV, DataFrames)
import pandas as pd
# Import the random module to generate random numbers, useful for stochastic processes in simulations
import random
# Import Matplotlib for data visualization, particularly for creating graphs and plots
import matplotlib.pyplot as plt
# Import defaultdict from the collections module to create dictionary-like objects 
# with a default value for missing keys (useful in reinforcement learning and Q-learning)
from collections import defaultdict

# Step 2: Generate Simulated Data

In [2]:
# Generate synthetic hydrogen production data
np.random.seed(42)
# Setting a seed ensures that the random numbers generated are the same every time the code is executed.
# This helps in reproducibility.

# Generate 500 samples
data_size = 500
# This variable represents the number of synthetic data points to be generated (500 samples).

# Random values for electrolysis parameters
# The np.random.uniform(a, b, size) function generates data_size number of values between a and b.
voltage = np.random.uniform(1.5, 2.5, data_size)  # Voltage (V)
temperature = np.random.uniform(50, 80, data_size)  # Temperature (°C)
pressure = np.random.uniform(1, 5, data_size)  # Pressure (bar)

# Energy consumption (lower is better)
energy_consumption = 50 - (voltage * 5 + temperature * 0.5 + pressure * 2) + np.random.normal(0, 2, data_size)

# Hydrogen yield (higher is better)
hydrogen_yield = (voltage * 3 + temperature * 0.2 + pressure * 1.5) + np.random.normal(0, 1, data_size)

# Creating DataFrame
# Stores all electrolysis parameters, energy consumption, and hydrogen yield in a structured format.
# Easier to analyze, visualize, and process the data.
df = pd.DataFrame({
    'Voltage (V)': voltage,
    'Temperature (°C)': temperature,
    'Pressure (bar)': pressure,
    'Energy Consumption (kWh/kg H₂)': energy_consumption,
    'Hydrogen Yield (kg)': hydrogen_yield
})

# Display first few rows
df.head()

,Voltage (V),Temperature (°C),Pressure (bar),Energy Consumption (kWh/kg H₂),Hydrogen Yield (kg)
0,1.87454,70.944851,1.740532,-3.325001,23.740503
1,2.450714,66.082891,3.167604,2.951661,25.202058
2,2.231994,59.285828,4.491783,-2.565596,23.168968
3,2.098658,74.413851,3.9289,-8.848814,26.464273
4,1.656019,70.541935,4.226245,0.041591,26.712804


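🔹 Explanation:

We simulate 500 observations of electrolysis conditions.

Energy consumption is inversely related to efficiency (lower is better). In this synthetic model it falls linearly as voltage, temperature, and pressure rise, plus Gaussian noise.

Hydrogen yield (higher is better) rises linearly with voltage, temperature, and pressure, plus Gaussian noise.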
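
As a quick sanity check on the simulated relationship, the sketch below regenerates yield samples with the same formula and fits an ordinary least-squares model to recover the coefficients. The helper name `fit_yield_coefficients` and the use of `np.linalg.lstsq` are illustrative choices, not part of the original notebook, and it draws from a local `np.random.default_rng` generator, so its samples will not match the DataFrame above.

```python
import numpy as np

def fit_yield_coefficients(n_samples=500, seed=42):
    """Regenerate synthetic yield data and fit a linear model to it (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    voltage = rng.uniform(1.5, 2.5, n_samples)
    temperature = rng.uniform(50, 80, n_samples)
    pressure = rng.uniform(1, 5, n_samples)
    noise = rng.normal(0, 1, n_samples)
    hydrogen_yield = voltage * 3 + temperature * 0.2 + pressure * 1.5 + noise

    # Design matrix: one column per parameter plus an intercept column
    X = np.column_stack([voltage, temperature, pressure, np.ones(n_samples)])
    coeffs, *_ = np.linalg.lstsq(X, hydrogen_yield, rcond=None)
    return coeffs  # order: voltage, temperature, pressure, intercept

coeffs = fit_yield_coefficients()
for name, value in zip(["voltage", "temperature", "pressure", "intercept"], coeffs):
    print(f"{name}: {value:.2f}")
```

The fitted weights should land close to 3, 0.2, and 1.5, with the voltage estimate the noisiest because voltage varies over the narrowest range.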

# Step 3: Define the Reinforcement Learning Environment

In [3]:
# The Python class HydrogenOptimizationEnv defines a Reinforcement Learning (RL) environment
# where an agent (AI model) can adjust voltage, temperature, and pressure to optimize hydrogen production.
# The AI agent will interact with this environment by adjusting voltage, temperature, and pressure.
class HydrogenOptimizationEnv:
    def __init__(self):
        # This constructor method (__init__) initializes the environment when an object of the class is created.
        # It defines the state space, action space, and the initial state.
        # Define state space (Voltage, Temperature, Pressure)
        self.state_space = {
            'voltage': np.linspace(1.5, 2.5, 10),  # 10 voltage levels
            'temperature': np.linspace(50, 80, 10),  # 10 temperature levels
            'pressure': np.linspace(1, 5, 5)  # 5 pressure levels
        }
        
        # Define actions (increase or decrease each parameter)
        self.action_space = ['increase_voltage', 'decrease_voltage',
                             'increase_temp', 'decrease_temp',
                             'increase_pressure', 'decrease_pressure']
        
        # Initial state
        self.state = [2.0, 65, 3]  # Starting with an average setting
        self.steps = 0

    def step(self, action):
        """ Take an action and return new state, reward, and done flag """
        voltage, temperature, pressure = self.state
        
        # Apply actions
        if action == 'increase_voltage': voltage = min(voltage + 0.1, 2.5)
        elif action == 'decrease_voltage': voltage = max(voltage - 0.1, 1.5)
        elif action == 'increase_temp': temperature = min(temperature + 2, 80)
        elif action == 'decrease_temp': temperature = max(temperature - 2, 50)
        elif action == 'increase_pressure': pressure = min(pressure + 0.5, 5)
        elif action == 'decrease_pressure': pressure = max(pressure - 0.5, 1)
        
        # Calculate the reward terms.
        # Note: energy_efficiency uses the same formula as energy_consumption in Step 2,
        # so it acts as an energy-cost term and is subtracted from the yield below.
        hydrogen_yield = (voltage * 3 + temperature * 0.2 + pressure * 1.5)
        energy_efficiency = 50 - (voltage * 5 + temperature * 0.5 + pressure * 2)
        
        reward = hydrogen_yield - (energy_efficiency / 10)  # Yield minus a tenth of the energy term
        
        # Update state
        self.state = [voltage, temperature, pressure]
        
        # End after 50 steps
        self.steps += 1
        done = self.steps >= 50
        
        return self.state, reward, done

    def reset(self):
        """ Reset the environment """
        self.state = [2.0, 65, 3]  # Reset to initial condition
        self.steps = 0
        return self.state

The state space lists the discrete values each parameter is meant to take.

We use NumPy's linspace function to generate evenly spaced values.

Voltage: Ranges from 1.5V to 2.5V, with 10 discrete levels.

Temperature: Ranges from 50°C to 80°C, with 10 discrete levels.

Pressure: Ranges from 1 bar to 5 bar, with 5 discrete levels.

Note that step() never reads state_space: it moves each parameter by a fixed increment, so the values the agent visits are not restricted to these grids.

📌 Purpose: This allows our RL agent to pick an optimal combination of these parameters for hydrogen production.
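
To see what these grids look like, the short sketch below rebuilds them with the same `np.linspace` calls and prints the spacing between neighbouring levels, which can be compared with the increments used in step() (0.1 V, 2°C, 0.5 bar). The variable names are chosen for this illustration only.

```python
import numpy as np

# Grids as declared in state_space
voltage_grid = np.linspace(1.5, 2.5, 10)
temperature_grid = np.linspace(50, 80, 10)
pressure_grid = np.linspace(1, 5, 5)

# Spacing between neighbouring levels on each grid
voltage_spacing = np.diff(voltage_grid)[0]
temperature_spacing = np.diff(temperature_grid)[0]
pressure_spacing = np.diff(pressure_grid)[0]

print(f"voltage spacing: {voltage_spacing:.3f} V")
print(f"temperature spacing: {temperature_spacing:.2f} °C")
print(f"pressure spacing: {pressure_spacing:.1f} bar")

# Number of (voltage, temperature, pressure) combinations on the declared grid
n_grid_states = len(voltage_grid) * len(temperature_grid) * len(pressure_grid)
print(f"grid combinations: {n_grid_states}")
```

With 10 levels over a 1 V range, neighbouring voltage levels sit 1/9 V apart rather than 0.1 V, which is why the grid and the step increments do not line up.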

Action space defines what the AI agent can do in the environment.

The AI can increase or decrease voltage, temperature, or pressure.

These six actions are like buttons that the AI can press to optimize hydrogen yield and energy efficiency.

📌 Purpose: The AI will try different actions and learn from experience which actions result in the best hydrogen production.

Initial state starts with:

Voltage = 2.0V (mid-range)

Temperature = 65¬∞C (mid-range)

Pressure = 3 bar (mid-range)

self.steps = 0 keeps track of how many actions the agent has taken.

📌 Purpose: The AI starts from an average condition and learns to find better settings through trial and error.

This method allows the AI agent to take an action and update the state (voltage, temperature, pressure) accordingly.

The function returns:
    
New state: The updated values of voltage, temperature, and pressure.
    
Reward: A score that tells the AI how good or bad the new state is.
    
Done flag: Tells if the episode is over.


voltage, temperature, pressure = self.state

Reads the current values of voltage, temperature, and pressure.

These values will be modified based on the action chosen.

🔹 Explanation:

This simulates an electrolysis environment where the RL agent adjusts voltage, temperature, and pressure.

Rewards are higher for maximizing hydrogen yield and minimizing energy waste.


if action == 'increase_voltage': voltage = min(voltage + 0.1, 2.5)
elif action == 'decrease_voltage': voltage = max(voltage - 0.1, 1.5)
elif action == 'increase_temp': temperature = min(temperature + 2, 80)
elif action == 'decrease_temp': temperature = max(temperature - 2, 50)
elif action == 'increase_pressure': pressure = min(pressure + 0.5, 5)
elif action == 'decrease_pressure': pressure = max(pressure - 0.5, 1)

The AI agent picks an action, and this section updates the voltage, temperature, or pressure.
Boundary Conditions:
    
min() ensures values do not exceed the maximum.

max() ensures values do not go below the minimum.

📌 Purpose: Prevents unrealistic settings like voltage exceeding 2.5V or temperature going below 50°C.
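
The min()/max() pattern above can be wrapped in a single helper. The function below is a minimal sketch; the name `clamp` is hypothetical and does not appear in the original class.

```python
def clamp(value, low, high):
    """Keep value inside [low, high] (hypothetical helper, not in the original class)."""
    return max(low, min(value, high))

print(clamp(2.45 + 0.1, 1.5, 2.5))  # above the range, capped at the upper bound
print(clamp(51 - 2, 50, 80))        # below the range, raised to the lower bound
print(clamp(3 + 0.5, 1, 5))         # already in range, returned unchanged
```

Using one helper for both directions means an action handler no longer has to remember whether it needs min() or max().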

Summary of What This Code Does

| Component | Explanation |
|---|---|
| State Space | Possible values for voltage, temperature, and pressure |
| Action Space | The AI can increase/decrease voltage, temperature, and pressure |
| Initial State | Starts from a mid-range setting |
| Step Function | The AI chooses an action, and the state updates accordingly |
| Boundary Checks | Prevents values from going out of range |

🔹 Example Walkthrough

Imagine the AI starts at:

Voltage = 2.0V
Temperature = 65°C
Pressure = 3 bar

Scenario 1: AI chooses increase_voltage
New voltage: 2.0V + 0.1V = 2.1V
Temperature & Pressure remain the same.

Scenario 2: AI chooses increase_temp
New temperature: 65°C + 2°C = 67°C
Voltage & Pressure remain the same.

Scenario 3: AI chooses decrease_pressure
New pressure: 3 bar - 0.5 bar = 2.5 bar
Voltage & Temperature remain the same.

📌 The AI repeats this process many times to find the settings that earn the highest reward. 🚀
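
The three scenarios can be reproduced with a small standalone transition function. This is a sketch that mirrors the increments and bounds of step(); `ACTION_DELTAS`, `BOUNDS`, and `apply_action` are hypothetical names introduced for this illustration.

```python
# (index into the state list, step size) for each action name
ACTION_DELTAS = {
    'increase_voltage': (0, 0.1), 'decrease_voltage': (0, -0.1),
    'increase_temp': (1, 2), 'decrease_temp': (1, -2),
    'increase_pressure': (2, 0.5), 'decrease_pressure': (2, -0.5),
}
# (minimum, maximum) for voltage, temperature, pressure
BOUNDS = [(1.5, 2.5), (50, 80), (1, 5)]

def apply_action(state, action):
    """Return a new [voltage, temperature, pressure] list after one action (hypothetical helper)."""
    index, delta = ACTION_DELTAS[action]
    low, high = BOUNDS[index]
    new_state = list(state)  # copy so the caller's state is not modified
    new_state[index] = max(low, min(new_state[index] + delta, high))
    return new_state

start = [2.0, 65, 3]
for action in ['increase_voltage', 'increase_temp', 'decrease_pressure']:
    print(action, '->', apply_action(start, action))
```

Each call starts from the same initial state, so only one parameter changes per scenario, as in the walkthrough.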

hydrogen_yield = (voltage * 3 + temperature * 0.2 + pressure * 1.5)

What this does:

In this model, higher voltage, temperature, and pressure all result in more hydrogen production.

The formula assigns a different weight to each factor:

Voltage has the largest per-unit weight (multiplied by 3), but its range is only 1 V.

Temperature has the smallest per-unit weight (multiplied by 0.2), but it spans 30°C.

Pressure has a weight of 1.5 and spans 4 bar.
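
A per-unit weight only tells part of the story, because the three parameters vary over very different ranges. The sketch below multiplies each weight by the width of its allowed range to estimate how far each parameter can move the yield; the dictionary names are chosen for this illustration.

```python
# Per-unit weights from the hydrogen_yield formula and the allowed range of each parameter
weights = {'voltage': 3, 'temperature': 0.2, 'pressure': 1.5}
ranges = {'voltage': (1.5, 2.5), 'temperature': (50, 80), 'pressure': (1, 5)}

# Largest change in yield each parameter can cause across its full range
max_effect = {name: weights[name] * (high - low) for name, (low, high) in ranges.items()}

for name, effect in max_effect.items():
    print(f"{name}: up to {effect:.1f}")
```

Measured this way, temperature and pressure can each shift the yield by about 6, and voltage by about 3.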

energy_efficiency = 50 - (voltage * 5 + temperature * 0.5 + pressure * 2)

What this does:
    
The goal is to minimize energy consumption while maximizing hydrogen production.

Despite its name, this expression is the same one Step 2 uses for energy_consumption, so it acts as an energy-cost term.

Its value falls as voltage, temperature, or pressure rises: it is 1.5 at the initial state [2.0, 65, 3] and -12.5 at the upper bounds [2.5, 80, 5].

🔹 Conclusion: A lower value of this term means a lower modeled energy cost, and the reward subtracts a tenth of it.

reward = hydrogen_yield - (energy_efficiency / 10)  # Balancing both factors

What this does:

The reward is the final score for the AI's action.

It combines the two terms: hydrogen yield counts in full, and a tenth of the energy term is subtracted.

Dividing by 10 keeps the energy term from dominating the hydrogen yield.

Because both terms favor higher settings in this synthetic model, the reward is largest when voltage, temperature, and pressure are all at their upper bounds.

🔹 Conclusion: The higher the reward, the better the AI's action was.
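
Collecting terms in the reward formula gives a single linear expression, 3.5 * voltage + 0.25 * temperature + 1.7 * pressure - 5. The sketch below checks that simplification against the formula from step() at two states. The simplified form is derived here rather than taken from the original notebook, and both function names are hypothetical.

```python
def reward_original(voltage, temperature, pressure):
    """Reward computed the same way as in step()."""
    hydrogen_yield = voltage * 3 + temperature * 0.2 + pressure * 1.5
    energy_efficiency = 50 - (voltage * 5 + temperature * 0.5 + pressure * 2)
    return hydrogen_yield - energy_efficiency / 10

def reward_simplified(voltage, temperature, pressure):
    """Same reward after collecting terms (derived for this sketch)."""
    return 3.5 * voltage + 0.25 * temperature + 1.7 * pressure - 5

# Initial state and the upper bounds of all three parameters
for state in ([2.0, 65, 3], [2.5, 80, 5]):
    print(state, round(reward_original(*state), 2), round(reward_simplified(*state), 2))
```

All three coefficients are positive, so within the allowed ranges the reward can only increase as a parameter is raised.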

self.state = [voltage, temperature, pressure]

Stores the new values of voltage, temperature, and pressure so that in the next step, the AI starts from these values.

This allows the AI to learn over time which settings are best.


self.steps += 1

done = self.steps >= 50

Each time the AI takes an action, it counts as 1 step.

If the AI reaches 50 steps, the experiment ends (done = True).

🔹 Conclusion: This ensures the AI does not run forever and must find the best settings within 50 moves.

def reset(self):
    """ Reset the environment """
    self.state = [2.0, 65, 3]  # Reset to initial condition
    self.steps = 0
    return self.state

Resets the environment back to its starting conditions so the AI can try again.

Ensures fair training, as the AI always starts from the same point.

# Step 4: Implement Q-Learning Algorithm

In [4]:
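# Create the environment once so the Q-table can reuse its action space
env = HydrogenOptimizationEnv()
n_actions = len(env.action_space)

# Initialize Q-table: unseen states default to a zero vector, one entry per action
Q_table = defaultdict(lambda: np.zeros(n_actions))

# Hyperparameters
alpha = 0.1  # Learning rate
gamma = 0.9  # Discount factor
epsilon = 0.1  # Exploration rate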

# Training
for episode in range(1000):  # Train for 1000 episodes
    state = tuple(env.reset())
    done = False

    while not done:
        if np.random.rand() < epsilon:
            action = np.random.choice(env.action_space)  # Explore
        else:
            action = env.action_space[np.argmax(Q_table[state])]  # Exploit
        
        new_state, reward, done = env.step(action)
        new_state = tuple(new_state)
        
        # Q-learning update: move Q(s, a) toward reward + gamma * max over a' of Q(s', a')
        action_idx = env.action_space.index(action)
        td_target = reward + gamma * np.max(Q_table[new_state])
        Q_table[state][action_idx] += alpha * (td_target - Q_table[state][action_idx])
        
        state = new_state

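🔹 Explanation:

The RL agent explores different electrolysis settings, taking a random action with probability epsilon = 0.1 and the best-known action otherwise.

After each step, Q-learning updates the value of the chosen action using the reward received and the best value available in the next state.

After training, choosing the highest-valued action in each state gives the agent's learned policy for reaching good electrolysis conditions.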
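
To make the update rule concrete, here is a single Q-learning update worked through in isolation. It is a minimal sketch: `q_update` is a hypothetical helper, alpha and gamma are the values from the training cell, and the reward of 23.7 is borrowed from the sample output purely as an example input.

```python
import numpy as np

def q_update(q_values, action_idx, reward, next_q_values, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value (hypothetical helper)."""
    td_target = reward + gamma * np.max(next_q_values)
    q_values[action_idx] += alpha * (td_target - q_values[action_idx])
    return q_values[action_idx]

q_current = np.zeros(6)  # Q-values for the current state, nothing learned yet
q_next = np.zeros(6)     # Q-values for the next state, nothing learned yet

new_value = q_update(q_current, action_idx=0, reward=23.7, next_q_values=q_next)
print(new_value)  # 0 + 0.1 * (23.7 + 0.9 * 0 - 0), approximately 2.37
```

Only the entry for the action that was taken changes; the other five stay at zero until the agent tries them.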

# 🌟 Final Step: Testing the RL Model

In [5]:
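state = tuple(env.reset())
done = False
while not done:
    # Follow the greedy policy: always take the highest-valued action
    action = env.action_space[np.argmax(Q_table[state])]
    new_state, reward, done = env.step(action)
    print(f"Optimal Electrolysis Setting: {new_state} | Reward: {reward}")
    state = tuple(new_state)  # Q_table keys must be hashable tuples, not lists

Optimal Electrolysis Setting: [2.1, 65, 3] | Reward: 23.7

🔹 The loop prints one line per greedy step (the first is shown above), so the last line printed is the setting the trained agent ends the episode on. The state returned by env.step() is a list and must be converted to a tuple before it is used to index Q_table; otherwise the second iteration raises TypeError: unhashable type: 'list'.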

🚀 Conclusion

Reinforcement Learning enables AI to optimize hydrogen production dynamically.

Q-learning helps AI adjust electrolysis parameters for best efficiency.

This approach is crucial for achieving India's Green Hydrogen Goals! 🌍⚡