<a href="https://colab.research.google.com/github/niyobern/Google-Colab-notebooks/blob/main/Njordfrey_AI.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Part 1. Demo model for Njordfrey aquaponics

This notebook contains demo models that showcase how the actual models will work. The actual models will be trained on real data collected from sensors deployed in aquaponics settings.

**1. Data Generation**

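Because we don't have real data at this time, we generate synthetic time-series data that mimics sensor readings (e.g., pH, temperature, humidity) using NumPy and pandas. The data covers a duration of three months (90 days) at a 1-minute interval. Each variable is drawn independently from a uniform distribution, so the values carry no trend, seasonality, or correlation between variables.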

In [None]:
import numpy as np
import pandas as pd
from datetime import datetime, timedelta

# Parameters for data generation
n_samples = 129600  # 90 days x 24 h x 60 min, one sample per minute
start_date = datetime.now()
time_interval = timedelta(minutes=1)  # Time intervals of 1 minute

# Helper function to generate time-series data
def generate_time_series_data(n_samples, start_date, time_interval):
    timestamps = [start_date - i * time_interval for i in range(n_samples)]

    data = {
        'timestamp': timestamps,
        'pH': np.random.uniform(6.5, 7.5, n_samples).round(2),            # pH (no unit)
        'DO': np.random.uniform(4.0, 8.0, n_samples).round(2),            # DO (mg/L)
        'temperature': np.random.uniform(20.0, 30.0, n_samples).round(1), # Temperature (°C)
        'EC': np.random.uniform(0.5, 2.0, n_samples).round(2),            # EC (mS/cm)
        'ammonia': np.random.uniform(0.0, 0.5, n_samples).round(2),       # Ammonia (mg/L)
        'nitrate': np.random.uniform(10.0, 50.0, n_samples).round(1),     # Nitrate (mg/L)
        'nitrite': np.random.uniform(0.0, 0.2, n_samples).round(2),       # Nitrite (mg/L)
        'light': np.random.uniform(10000, 30000, n_samples).round(0),     # Light (lux)
        'humidity': np.random.uniform(40.0, 80.0, n_samples).round(1),    # Humidity (%)
        'CO2': np.random.uniform(350, 800, n_samples).round(0),           # CO2 (ppm)
        'water_flow': np.random.uniform(0.5, 3.0, n_samples).round(2),    # Water Flow (L/min)
        'nutrient_dosing': np.random.uniform(0, 100, n_samples).round(1), # Nutrient Dosing (%)
        'pressure': np.random.uniform(0.5, 2.5, n_samples).round(2)       # Pressure (Bar)
    }


    return pd.DataFrame(data)

# Generate the synthetic data
demo_data = generate_time_series_data(n_samples, start_date, time_interval)

# Save to CSV for later use
demo_data.to_csv("aquaponics_demo_data.csv", index=False)
print("Demo data generated and saved to aquaponics_demo_data.csv")


Demo data generated and saved to aquaponics_demo_data.csv
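
The sample count used above follows directly from the duration and the sampling interval. As a small check (a sketch using only the standard library; the variable name `duration` is introduced here for illustration):

```python
from datetime import timedelta

duration = timedelta(days=90)          # three months, taken as 90 days
time_interval = timedelta(minutes=1)   # one reading per minute

# Dividing two timedeltas with // gives an integer number of intervals
n_samples = duration // time_interval
print(n_samples)  # 129600
```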


**2. Training Models**

We train three supervised models: one classifier for water quality and two regressors that each predict a numeric value (nutrient dosing and plant growth rate).

**2.1 Preparing Data for multiple models**

In [None]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import numpy as np

# Load the synthetic data
demo_data = pd.read_csv("aquaponics_demo_data.csv")

# 1. Create synthetic water quality labels based on thresholds for pH, DO, and ammonia.
#    np.select assigns the label of the first condition that matches, so the order matters.
conditions = [
    (demo_data['pH'] < 6.8) | (demo_data['DO'] < 5.0) | (demo_data['ammonia'] > 0.3),
    (demo_data['pH'].between(6.8, 7.2)) & (demo_data['DO'].between(5.0, 7.0)) & (demo_data['ammonia'] < 0.3),
    (demo_data['pH'] > 7.2) | (demo_data['DO'] > 7.0) | (demo_data['ammonia'] < 0.1)
]
labels = [0, 1, 2]  # 0: Poor, 1: Moderate, 2: Good
demo_data['water_quality_status'] = np.select(conditions, labels, default=1)

# 2. Generate synthetic plant growth rate (for demo)
demo_data['plant_growth_rate'] = np.random.uniform(1, 10, demo_data.shape[0])

# 3. Define features and targets for different tasks
features = ['pH', 'DO', 'temperature', 'EC', 'ammonia', 'nitrate', 'nitrite', 'light', 'humidity', 'CO2']
X = demo_data[features]

# Split data for each task (regression and classification)
X_train, X_test, y_train_nutrient, y_test_nutrient = train_test_split(X, demo_data['nutrient_dosing'], test_size=0.2, random_state=42)
X_train_class, X_test_class, y_train_quality, y_test_quality = train_test_split(X, demo_data['water_quality_status'], test_size=0.2, random_state=42)
X_train_growth, X_test_growth, y_train_growth, y_test_growth = train_test_split(X, demo_data['plant_growth_rate'], test_size=0.2, random_state=42)

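# Standardize features for the regression tasks (optional here: tree-based models
# such as random forests are not sensitive to feature scale).
# All three splits use the same X and random_state, so X_train and X_train_growth
# contain the same rows and the scaler fitted on X_train can be reused.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
X_train_growth_scaled = scaler.transform(X_train_growth)
X_test_growth_scaled = scaler.transform(X_test_growth)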

print("Data preparation completed for multiple tasks.")


Data preparation completed for multiple tasks.
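
The three labelling conditions above overlap, and `np.select` resolves overlaps by taking the first condition that is true. The following sketch applies the same thresholds to three hypothetical readings (the values in `sample` are made up for illustration) to show how each branch is reached:

```python
import numpy as np
import pandas as pd

# Three hypothetical readings, one per label
sample = pd.DataFrame({
    "pH":      [6.5, 7.0, 7.4],
    "DO":      [6.0, 6.0, 6.0],
    "ammonia": [0.05, 0.2, 0.2],
})

conditions = [
    (sample["pH"] < 6.8) | (sample["DO"] < 5.0) | (sample["ammonia"] > 0.3),
    sample["pH"].between(6.8, 7.2) & sample["DO"].between(5.0, 7.0) & (sample["ammonia"] < 0.3),
    (sample["pH"] > 7.2) | (sample["DO"] > 7.0) | (sample["ammonia"] < 0.1),
]

# Row 0 also satisfies the third condition (ammonia < 0.1),
# but the first condition (pH < 6.8) matches earlier and wins.
labels = np.select(conditions, [0, 1, 2], default=1)
print(labels.tolist())  # [0, 1, 2]
```

Because the labels are a deterministic function of three of the input features, a classifier trained on those features can be expected to reproduce them almost perfectly.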


**2.2. Nutrient Dosing Prediction (Regression)**

First, I trained a model that predicts the right amount of nutrients required for optimal productivity while conserving resources.

In [None]:
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
import pickle

# Train Random Forest for nutrient dosing
rf_nutrient = RandomForestRegressor(n_estimators=100, random_state=42)
rf_nutrient.fit(X_train_scaled, y_train_nutrient)

# Predictions and evaluation
y_pred_nutrient = rf_nutrient.predict(X_test_scaled)
mse_nutrient = mean_squared_error(y_test_nutrient, y_pred_nutrient)
r2_nutrient = r2_score(y_test_nutrient, y_pred_nutrient)

# Save the model to a file (the context manager closes the file handle)
filename = 'nutrient_dosing_model.pkl'
with open(filename, 'wb') as f:
    pickle.dump(rf_nutrient, f)

print(f"Nutrient Dosing Prediction - MSE: {mse_nutrient}, R-Squared: {r2_nutrient}")


Nutrient Dosing Prediction - MSE: 845.6263720892746, R-Squared: -0.014185833533131742
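
An R-squared close to zero is what one would expect here, because the synthetic `nutrient_dosing` target is drawn at random and is independent of the features. As a rough sanity check (a sketch, not part of the original pipeline), the MSE can be compared with that of a baseline that always predicts the mean. The seed and variable names below are assumptions made for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)
y = rng.uniform(0, 100, 129600)  # same distribution as the synthetic 'nutrient_dosing'

# Baseline: always predict the mean of the target
baseline_pred = np.full_like(y, y.mean())
baseline_mse = np.mean((y - baseline_pred) ** 2)

# Variance of a uniform distribution on [a, b] is (b - a)^2 / 12
theoretical_var = (100 - 0) ** 2 / 12

print(f"Baseline MSE: {baseline_mse:.1f}, theoretical variance: {theoretical_var:.1f}")
```

The theoretical variance is about 833.3, so the reported MSE of roughly 845.6 indicates that the random forest does no better than predicting the mean. With real sensor data, where dosing depends on the measurements, the same comparison would show whether the model has learned anything useful.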


**2.3. Water Quality Status Prediction (Classification)**

This model predicts whether the water quality is good, moderate, or poor based on different parameters measured by sensors in an aquaponics environment.

In [None]:
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report

# Train Random Forest for water quality classification
rf_quality = RandomForestClassifier(n_estimators=100, random_state=42)
rf_quality.fit(X_train_class, y_train_quality)

# Predictions and evaluation
y_pred_quality = rf_quality.predict(X_test_class)
accuracy_quality = accuracy_score(y_test_quality, y_pred_quality)

# Save the model to a file (the context manager closes the file handle)
filename = 'water_quality_model.pkl'
with open(filename, 'wb') as f:
    pickle.dump(rf_quality, f)

print(f"Water Quality Classification Accuracy: {accuracy_quality}")
print(classification_report(y_test_quality, y_pred_quality))

Water Quality Classification Accuracy: 1.0
              precision    recall  f1-score   support

           0       1.00      1.00      1.00     17572
           1       1.00      1.00      1.00      3258
           2       1.00      1.00      1.00      5090

    accuracy                           1.00     25920
   macro avg       1.00      1.00      1.00     25920
weighted avg       1.00      1.00      1.00     25920



**2.4. Plant Growth Rate Prediction (Regression)**

This model predicts the growth rate of plants, which can tell us whether settings need to be adjusted. Other targets could be predicted in the same way, such as early warnings of something going wrong in the ecosystem or fish production.

In [None]:
# Train Random Forest for plant growth rate prediction
rf_growth = RandomForestRegressor(n_estimators=100, random_state=42)
rf_growth.fit(X_train_growth_scaled, y_train_growth)

# Predictions and evaluation
y_pred_growth = rf_growth.predict(X_test_growth_scaled)
mse_growth = mean_squared_error(y_test_growth, y_pred_growth)
r2_growth = r2_score(y_test_growth, y_pred_growth)

# Save the model to a file (the context manager closes the file handle)
filename = 'plant_growth_model.pkl'
with open(filename, 'wb') as f:
    pickle.dump(rf_growth, f)

print(f"Plant Growth Rate Prediction - MSE: {mse_growth}, R-Squared: {r2_growth}")

Plant Growth Rate Prediction - MSE: 6.8407561477012, R-Squared: -0.017574223218319807
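
Because the synthetic `plant_growth_rate` is drawn from a uniform distribution independently of the features, the best any model can do is predict the mean, and the expected MSE of that predictor equals the variance of the target. As a short derivation under that assumption:

$$
Y \sim \mathcal{U}(a, b) \;\Rightarrow\; \mathrm{Var}(Y) = \frac{(b-a)^2}{12}, \qquad \mathrm{Var}(Y)\big|_{a=1,\,b=10} = \frac{(10-1)^2}{12} = 6.75
$$

The reported MSE of about 6.84 sits just above this value, which is consistent with an R-squared near zero: on random targets the model learns nothing beyond the mean.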


**3. Reinforcement Learning model**

We train an intelligent agent that takes data from sensors and system settings, together with predictions from the supervised models (expected to eventually number more than the three above), and returns recommendations. An agent trained this way returns numerical adjustment values that the actuators in the aquaponics system can apply without requiring human interaction.

In [None]:
!pip install stable-baselines3


Collecting stable-baselines3
  Downloading stable_baselines3-2.3.2-py3-none-any.whl.metadata (5.1 kB)
Collecting gymnasium<0.30,>=0.28.1 (from stable-baselines3)
  Downloading gymnasium-0.29.1-py3-none-any.whl.metadata (10 kB)
Collecting farama-notifications>=0.0.1 (from gymnasium<0.30,>=0.28.1->stable-baselines3)
  Downloading Farama_Notifications-0.0.4-py3-none-any.whl.metadata (558 bytes)
Downloading stable_baselines3-2.3.2-py3-none-any.whl (182 kB)
Downloading gymnasium-0.29.1-py3-none-any.whl (953 kB)
Downloading Farama_Notifications-0.0.4-py3-none-any.whl (2.5 kB)
Installing collected packages: farama-notifications, gymnasium, stable-baselines3
Successfully installed farama-notifications-0.0.4 gymnasium-0.29.1 stable-baselines3-2.3

**3.1. Defining the model**

In [None]:
import gymnasium as gym
from gymnasium import spaces
import numpy as np
import joblib

class AquaponicsEnv(gym.Env):
    def __init__(self):
        super().__init__()

        # Load supervised models
        self.water_quality_model = joblib.load("water_quality_model.pkl")  # Classification model
        self.nutrient_dosing_model = joblib.load("nutrient_dosing_model.pkl")  # Regression model
        self.plant_growth_model = joblib.load("plant_growth_model.pkl")  # Regression model

        # Define the action space (adjustments for pH, DO, nutrient dosing, and light)
        self.action_space = spaces.Box(low=-1, high=1, shape=(4,), dtype=np.float32)

        # Define the observation space (9 raw sensor data + 3 predictions = 12 elements)
        self.observation_space = spaces.Box(
            low=np.array([6.0, 4.0, 20.0, 0.5, 0.0, 10000, 40, 350, 50, 0, 0, 0]),
            high=np.array([8.0, 10.0, 30.0, 2.0, 0.5, 40000, 80, 800, 100, 2, 100, 10]),
            dtype=np.float32
        )

        # Initialize the random number generator
        self.seed()

    def seed(self, seed=None):
        # Seed the random number generator for reproducibility
        self.np_random, seed = gym.utils.seeding.np_random(seed)
        return [seed]

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.seed(seed)

        # Initialize the raw sensor data (randomly within valid ranges)
        self.state_raw = np.array([
            self.np_random.uniform(6.5, 7.5),  # pH
            self.np_random.uniform(5.0, 8.0),  # DO
            self.np_random.uniform(20.0, 30.0),  # Temperature
            self.np_random.uniform(0.5, 2.0),  # EC
            self.np_random.uniform(0.0, 0.5),  # Ammonia
            self.np_random.uniform(10000, 30000),  # Light
            self.np_random.uniform(40, 80),  # Humidity
            self.np_random.uniform(350, 800),  # CO2
            self.np_random.uniform(50, 100)  # Nutrient Dosing
        ]).astype(np.float32)

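        # Append a 10th value so the input length matches the 10 features the
        # supervised models were trained on.
        # NOTE: the training features were [pH, DO, temperature, EC, ammonia, nitrate,
        # nitrite, light, humidity, CO2]. This vector has the same length but a different
        # layout (no nitrate/nitrite; nutrient dosing and pressure instead) and is not
        # scaled, so the predictions below are placeholders for the demo.
        pressure = self.np_random.uniform(900, 1200)
        self.state_with_pressure = np.append(self.state_raw, pressure)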

        # Get predictions from the supervised models
        water_quality_status = self.water_quality_model.predict(self.state_with_pressure.reshape(1, -1))[0]
        predicted_nutrient_dosing = self.nutrient_dosing_model.predict(self.state_with_pressure.reshape(1, -1))[0]
        predicted_growth_rate = self.plant_growth_model.predict(self.state_with_pressure.reshape(1, -1))[0]

        # Combine raw sensor data and supervised model predictions for RL model
        self.state = np.concatenate((
            self.state_raw,  # 9 features of raw sensor data
            np.array([water_quality_status, predicted_nutrient_dosing, predicted_growth_rate])  # 3 Supervised model predictions
        ))

        return self.state, {}

    def step(self, action):
        # Apply action (adjust pH, DO, nutrient, light)
        pH_adjustment, DO_adjustment, nutrient_adjustment, light_adjustment = action

        # Simulate the system response to actions (using real sensor data in practice)
        self.state_raw[0] = np.clip(self.state_raw[0] + pH_adjustment * 0.1, 6.0, 8.0)  # Adjust pH
        self.state_raw[1] = np.clip(self.state_raw[1] + DO_adjustment * 0.5, 4.0, 10.0)  # Adjust DO
        self.state_raw[8] = np.clip(self.state_raw[8] + nutrient_adjustment * 10, 0, 100)  # Adjust nutrient dosing
        self.state_raw[5] = np.clip(self.state_raw[5] + light_adjustment * 1000, 0, 40000)  # Adjust light

        # Update the extra feature (pressure)
        pressure = self.np_random.uniform(900, 1200)
        self.state_with_pressure = np.append(self.state_raw, pressure)

        # Get updated predictions from the supervised models
        water_quality_status = self.water_quality_model.predict(self.state_with_pressure.reshape(1, -1))[0]
        predicted_nutrient_dosing = self.nutrient_dosing_model.predict(self.state_with_pressure.reshape(1, -1))[0]
        predicted_growth_rate = self.plant_growth_model.predict(self.state_with_pressure.reshape(1, -1))[0]

        # Combine raw sensor data and supervised model predictions for RL model
        self.state = np.concatenate((
            self.state_raw,  # 9 features of raw sensor data
            np.array([water_quality_status, predicted_nutrient_dosing, predicted_growth_rate])  # 3 Supervised model predictions
        ))

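        # Calculate reward as the negative absolute distance from the ideal values.
        # NOTE: the terms are not scaled, so the light term (errors of up to 20000 lux)
        # dominates the pH and DO terms (errors of a few units at most).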
        reward = -abs(self.state_raw[0] - 7.0)  # pH target is 7.0
        reward -= abs(self.state_raw[1] - 6.5)  # DO target is 6.5
        reward -= abs(self.state_raw[8] - 80)   # Nutrient dosing target is 80
        reward -= abs(self.state_raw[5] - 20000)  # Light target is 20000 lux

        # Termination logic (e.g., if the pH goes too far out of range)
        terminated = False  # You can add logic to set this to True

        # Truncation logic (e.g., episode timeout)
        truncated = False  # Set this to True if there's a time limit or episode truncation

        info = {}  # Additional info can go here (optional)

        # Return observation, reward, terminated, truncated, and info
        return self.state, reward, terminated, truncated, info

    def render(self):
        print(f"State: {self.state_raw} | Predictions: {self.state[9:]}")
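
The reward in `step` sums unscaled absolute errors, so a light error of thousands of lux outweighs a pH error of a fraction of a unit. One possible adjustment, sketched below, divides each error by the width of the clipping range used in `step` so that every term lies between -1 and 0. The helper `normalized_reward` and the constant names are hypothetical and not part of the environment above:

```python
import numpy as np

# Targets and clipping ranges copied from AquaponicsEnv.step; the helper itself is hypothetical
TARGETS = np.array([7.0, 6.5, 80.0, 20000.0])   # pH, DO, nutrient dosing, light
RANGES = np.array([8.0 - 6.0, 10.0 - 4.0, 100.0 - 0.0, 40000.0 - 0.0])

def normalized_reward(pH, DO, nutrient, light):
    """Negative sum of absolute errors, each divided by its variable's range."""
    values = np.array([pH, DO, nutrient, light], dtype=np.float64)
    return float(-np.sum(np.abs(values - TARGETS) / RANGES))

on_target = normalized_reward(7.0, 6.5, 80.0, 20000.0)   # all errors are zero
ph_off = normalized_reward(6.0, 6.5, 80.0, 20000.0)      # pH off by 1 of a 2-unit range
light_off = normalized_reward(7.0, 6.5, 80.0, 30000.0)   # light off by 10000 of a 40000 range
print(on_target, ph_off, light_off)
```

With this scaling, a pH error of one unit costs more (-0.5) than a light error of 10000 lux (-0.25). Whether that weighting is appropriate is a design decision that would need to be informed by the real system.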


**3.2. Training the model**

In [None]:
from stable_baselines3 import PPO

# Initialize the environment
env = AquaponicsEnv()

# Initialize the PPO model
model = PPO("MlpPolicy", env, verbose=1)

# Train the model
model.learn(total_timesteps=20000)

# Save the trained model
model.save("ppo_aquaponics_with_supervised_model_inputs")


Using cpu device
Wrapping the env with a `Monitor` wrapper
Wrapping the env in a DummyVecEnv.




-----------------------------
| time/              |      |
|    fps             | 39   |
|    iterations      | 1    |
|    time_elapsed    | 51   |
|    total_timesteps | 2048 |
-----------------------------




------------------------------------------
| time/                   |              |
|    fps                  | 39           |
|    iterations           | 2            |
|    time_elapsed         | 102          |
|    total_timesteps      | 4096         |
| train/                  |              |
|    approx_kl            | 9.351148e-06 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -5.68        |
|    explained_variance   | -2.03e-06    |
|    learning_rate        | 0.0003       |
|    loss                 | 1.91e+10     |
|    n_updates            | 10           |
|    policy_gradient_loss | -0.000106    |
|    std                  | 1            |
|    value_loss           | 3.76e+10     |
------------------------------------------




-----------------------------------------
| time/                   |             |
|    fps                  | 39          |
|    iterations           | 3           |
|    time_elapsed         | 153         |
|    total_timesteps      | 6144        |
| train/                  |             |
|    approx_kl            | 2.15873e-05 |
|    clip_fraction        | 0           |
|    clip_range           | 0.2         |
|    entropy_loss         | -5.68       |
|    explained_variance   | 5.96e-08    |
|    learning_rate        | 0.0003      |
|    loss                 | 1.98e+10    |
|    n_updates            | 20          |
|    policy_gradient_loss | -0.000234   |
|    std                  | 1           |
|    value_loss           | 4.26e+10    |
-----------------------------------------




------------------------------------------
| time/                   |              |
|    fps                  | 40           |
|    iterations           | 4            |
|    time_elapsed         | 204          |
|    total_timesteps      | 8192         |
| train/                  |              |
|    approx_kl            | 4.857441e-05 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -5.68        |
|    explained_variance   | -2.38e-07    |
|    learning_rate        | 0.0003       |
|    loss                 | 2.78e+10     |
|    n_updates            | 30           |
|    policy_gradient_loss | -0.000586    |
|    std                  | 1            |
|    value_loss           | 5.14e+10     |
------------------------------------------




------------------------------------------
| time/                   |              |
|    fps                  | 39           |
|    iterations           | 5            |
|    time_elapsed         | 257          |
|    total_timesteps      | 10240        |
| train/                  |              |
|    approx_kl            | 3.707467e-05 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -5.68        |
|    explained_variance   | -1.19e-07    |
|    learning_rate        | 0.0003       |
|    loss                 | 6.02e+09     |
|    n_updates            | 40           |
|    policy_gradient_loss | -0.000254    |
|    std                  | 1            |
|    value_loss           | 1.59e+10     |
------------------------------------------




------------------------------------------
| time/                   |              |
|    fps                  | 39           |
|    iterations           | 6            |
|    time_elapsed         | 310          |
|    total_timesteps      | 12288        |
| train/                  |              |
|    approx_kl            | 7.173716e-06 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -5.68        |
|    explained_variance   | -9.3e-06     |
|    learning_rate        | 0.0003       |
|    loss                 | 2.51e+10     |
|    n_updates            | 50           |
|    policy_gradient_loss | -0.000106    |
|    std                  | 1            |
|    value_loss           | 4.27e+10     |
------------------------------------------




----------------------------------------
| time/                   |            |
|    fps                  | 39         |
|    iterations           | 7          |
|    time_elapsed         | 365        |
|    total_timesteps      | 14336      |
| train/                  |            |
|    approx_kl            | 3.6332e-05 |
|    clip_fraction        | 0          |
|    clip_range           | 0.2        |
|    entropy_loss         | -5.68      |
|    explained_variance   | -3.58e-07  |
|    learning_rate        | 0.0003     |
|    loss                 | 1.67e+10   |
|    n_updates            | 60         |
|    policy_gradient_loss | -0.000374  |
|    std                  | 1          |
|    value_loss           | 3.04e+10   |
----------------------------------------




-------------------------------------------
| time/                   |               |
|    fps                  | 39            |
|    iterations           | 8             |
|    time_elapsed         | 417           |
|    total_timesteps      | 16384         |
| train/                  |               |
|    approx_kl            | 7.2369876e-06 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -5.68         |
|    explained_variance   | 5.96e-08      |
|    learning_rate        | 0.0003        |
|    loss                 | 7.41e+09      |
|    n_updates            | 70            |
|    policy_gradient_loss | -5.47e-05     |
|    std                  | 1             |
|    value_loss           | 1.81e+10      |
-------------------------------------------




------------------------------------------
| time/                   |              |
|    fps                  | 38           |
|    iterations           | 9            |
|    time_elapsed         | 472          |
|    total_timesteps      | 18432        |
| train/                  |              |
|    approx_kl            | 1.628435e-05 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -5.68        |
|    explained_variance   | -1.19e-07    |
|    learning_rate        | 0.0003       |
|    loss                 | 1.26e+10     |
|    n_updates            | 80           |
|    policy_gradient_loss | -0.000165    |
|    std                  | 1            |
|    value_loss           | 2.51e+10     |
------------------------------------------




-------------------------------------------
| time/                   |               |
|    fps                  | 38            |
|    iterations           | 10            |
|    time_elapsed         | 527           |
|    total_timesteps      | 20480         |
| train/                  |               |
|    approx_kl            | 1.6702426e-05 |
|    clip_fraction        | 0             |
|    clip_range           | 0.2           |
|    entropy_loss         | -5.68         |
|    explained_variance   | 5.96e-08      |
|    learning_rate        | 0.0003        |
|    loss                 | 2.45e+10      |
|    n_updates            | 90            |
|    policy_gradient_loss | -0.000198     |
|    std                  | 1             |
|    value_loss           | 4.71e+10      |
-------------------------------------------




**3.3. Using the RL model**

In [None]:
rl_model = PPO.load("ppo_aquaponics_with_supervised_model_inputs")

def predict(raw_obs):

    pressure = np.random.uniform(900, 1200) # Add pressure
    state_with_pressure = np.append(raw_obs, pressure)

    # Get predictions from the supervised models
    water_quality_status = env.water_quality_model.predict(state_with_pressure.reshape(1, -1))[0]
    predicted_nutrient_dosing = env.nutrient_dosing_model.predict(state_with_pressure.reshape(1, -1))[0]
    predicted_growth_rate = env.plant_growth_model.predict(state_with_pressure.reshape(1, -1))[0]

    # Combine raw sensor data and supervised model predictions for RL model
    obs = np.concatenate((
        raw_obs,  # 9 features of raw sensor data
        np.array([water_quality_status, predicted_nutrient_dosing, predicted_growth_rate])  # 3 Supervised model predictions
    ))

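    # Get action recommendations from the RL model. The environment does not need
    # to be reset here because the observation is passed to the policy directly.
    # predict() samples from the policy by default; pass deterministic=True for
    # repeatable recommendations.
    action, _ = rl_model.predict(obs)  # obs has the 12-element shape the policy expects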

    # Return the action as a response (adjustments for pH, DO, nutrient dosing, and light)
    return {
        "pH_adjustment": action[0],
        "DO_adjustment": action[1],
        "nutrient_adjustment": action[2],
        "light_adjustment": action[3]
    }

# Call predict with the sensor data
raw_obs = np.array([
  6.8,
  5.5,
  25.0,
  1.2,
  0.3,
  15000,
  60.0,
  450,
  70.0
  ])
predict(raw_obs)



{'pH_adjustment': 0.2961161,
 'DO_adjustment': -0.69618905,
 'nutrient_adjustment': 0.70978266,
 'light_adjustment': -0.3140868}
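
The supervised models were fitted on the ten columns listed in `features` in section 2.1, while `predict` passes nine values plus a pressure reading. The lengths match, but the layout does not. One way to keep the order consistent is to build the model input from named readings, as in the sketch below. The helper `to_feature_row` and the example reading are hypothetical:

```python
import numpy as np

# Column order used when the supervised models were trained (copied from section 2.1)
FEATURES = ['pH', 'DO', 'temperature', 'EC', 'ammonia',
            'nitrate', 'nitrite', 'light', 'humidity', 'CO2']

def to_feature_row(reading):
    """Return a (1, 10) array ordered like FEATURES; raises KeyError if a field is missing."""
    return np.array([[float(reading[name]) for name in FEATURES]])

# Hypothetical reading; keys that are not training features (e.g. pressure) are ignored
reading = {
    'pH': 6.8, 'DO': 5.5, 'temperature': 25.0, 'EC': 1.2, 'ammonia': 0.3,
    'nitrate': 30.0, 'nitrite': 0.1, 'light': 15000, 'humidity': 60.0,
    'CO2': 450, 'pressure': 1.5,
}
row = to_feature_row(reading)
print(row.shape)  # (1, 10)
```

The two regression models were trained on standardized features, so the fitted `scaler` would also need to be applied to `row` before calling their `predict` methods.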



---



# Part 2. API to serve the models

In this section, we build an API that will do two things:


*   Serve the AI models so that they can be accessed over the network
*   Interact with the database to store new data received from sensors so that it can be used for continuous learning in the AI models

The API validates input data against a required format, which is designed to be compatible with sensor readings from the IoT gateway installed in the aquaponics system. It also responds with sanitized data that is compatible with the actuators in the aquaponics system.
Based on this demo, the input schema is:




In [None]:
from pydantic import BaseModel
class Sensor(BaseModel):
    pH: float
    DO: float
    temperature: float
    EC: float
    ammonia: float
    nitrate: float
    nitrite: float
    light: float
    humidity: float
    CO2: float
    water_flow: float
    nutrient_dosing: float
    pressure: float

The API has been developed outside this notebook.
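
As an illustration of the validation step, the sketch below shows how a schema like `Sensor` accepts a well-formed payload and rejects a malformed one. It assumes pydantic is installed. The payload values are hypothetical, and this is not the code of the actual API:

```python
from pydantic import BaseModel, ValidationError

class Sensor(BaseModel):
    pH: float
    DO: float
    temperature: float
    EC: float
    ammonia: float
    nitrate: float
    nitrite: float
    light: float
    humidity: float
    CO2: float
    water_flow: float
    nutrient_dosing: float
    pressure: float

# Hypothetical payload as it might arrive from the IoT gateway
payload = {
    "pH": 6.8, "DO": 5.5, "temperature": 25.0, "EC": 1.2, "ammonia": 0.3,
    "nitrate": 30.0, "nitrite": 0.1, "light": 15000, "humidity": 60.0,
    "CO2": 450, "water_flow": 1.5, "nutrient_dosing": 70.0, "pressure": 1.2,
}
reading = Sensor(**payload)  # integers such as 15000 are coerced to float
print(reading.pH, reading.light)

# A non-numeric value fails validation instead of reaching the models
bad_payload = dict(payload, pH="not-a-number")
try:
    Sensor(**bad_payload)
    rejected = False
except ValidationError as exc:
    rejected = True
    print("Rejected:", len(exc.errors()), "error(s)")
```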