$$
\text{Irrigation\_Need} = \max\left(0,\ \left(1 - \frac{\text{Soil\_Moisture\_\%}}{40}\right) \cdot \left(\frac{\text{Temperature\_C}}{30}\right) \cdot \left(1 - \frac{\text{Humidity\_\%}}{100}\right) \cdot \left(1 - \frac{\text{Rainfall\_mm}}{50}\right)\right)
$$


---

### Applying the Formula to Generate the Target Variable:

In [2]:
import pandas as pd

df = pd.read_csv(r"C:\Users\hp\Desktop\projects\agro-scan\data.csv")

def compute_irrigation_need(row):
    # Only the soil-moisture factor is clipped at 0 here; the other factors are
    # left unclipped, and the final max() below removes any negative product
    soil_moisture_factor = max(0, 1 - row['Soil_Moisture_%'] / 40)
    temperature_factor = row['Temperature_C'] / 30
    humidity_factor = 1 - row['Humidity_%'] / 100
    rainfall_factor = 1 - row['Rainfall_mm'] / 50

    irrigation_need = soil_moisture_factor * temperature_factor * humidity_factor * rainfall_factor
    return max(0, irrigation_need)

# Compute the target variable
df['Irrigation_Need'] = df.apply(compute_irrigation_need, axis=1)
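
The row-wise `apply` above can also be written with column arithmetic, which is usually faster on large frames. The following is a minimal sketch, not the notebook's own code: the function name `compute_irrigation_need_vectorized` is hypothetical, and the small `sample` frame reuses two rows from the printed output plus one made-up row with soil moisture above 40 to show the clipping.

```python
import pandas as pd

def compute_irrigation_need_vectorized(frame: pd.DataFrame) -> pd.Series:
    """Column-wise version of the same formula (hypothetical helper)."""
    # Soil-moisture factor is clipped at 0, mirroring the row-wise function
    soil = (1 - frame['Soil_Moisture_%'] / 40).clip(lower=0)
    temp = frame['Temperature_C'] / 30
    hum = 1 - frame['Humidity_%'] / 100
    rain = 1 - frame['Rainfall_mm'] / 50
    # Final clip plays the role of the outer max(0, ...)
    return (soil * temp * hum * rain).clip(lower=0)

# Rows 0 and 2 from the printed table, plus one made-up wet-soil row
sample = pd.DataFrame({
    'Temperature_C': [24.48, 25.24, 25.00],
    'Humidity_%': [53.64, 33.07, 50.00],
    'Soil_Moisture_%': [27.844360, 26.836535, 45.00],
    'Rainfall_mm': [3.09, 4.60, 5.00],
})

need = compute_irrigation_need_vectorized(sample)
print(need.round(6).tolist())
```

The first two values should agree with the `Irrigation_Need` column shown in the output (0.107857 and 0.168262), and the third is 0 because the soil-moisture factor is clipped.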

In [3]:
print(df[['Temperature_C', 'Humidity_%', 'Soil_Moisture_%', 'Rainfall_mm', 'Irrigation_Need']].head())

df.to_csv("irrigation_data.csv", index=False)

   Temperature_C  Humidity_%  Soil_Moisture_%  Rainfall_mm  Irrigation_Need
0          24.48       53.64        27.844360         3.09         0.107857
1          21.31       53.20        34.519483        16.07         0.030909
2          25.24       33.07        26.836535         4.60         0.168262
3          29.62       55.05        35.164865        23.10         0.028862
4          20.83       70.99        24.459244        40.76         0.014462


In [4]:
X = df[['Temperature_C', 'Humidity_%', 'Rainfall_mm', 'Soil_Moisture_%']]
y = df['Irrigation_Need']

### Training a RandomForest Regression Model

In [5]:
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = RandomForestRegressor(random_state=42)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)

In [5]:
print("Test MSE:", mse)

Test MSE: 3.7190990050219685e-05
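
An MSE on its own is hard to read because it depends on the scale of the target. The reported value of about 3.72e-05 corresponds to an RMSE of roughly 0.0061. The sketch below shows one way to add RMSE, R², and a mean-predictor baseline for context. It runs on synthetic data generated from the same formula, because the original CSV is not available here; the feature ranges, the sample size, and the names `demo_model`, `X_demo`, and `y_demo` are all assumptions, so its numbers will not match the notebook's.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 500

# Synthetic features; these ranges are assumptions, not taken from data.csv
temperature = rng.uniform(15, 35, n)
humidity = rng.uniform(30, 90, n)
rainfall = rng.uniform(0, 50, n)
soil = rng.uniform(10, 40, n)

X_demo = np.column_stack([temperature, humidity, rainfall, soil])
# Target built from the same formula, clipped at 0
y_demo = np.clip(
    (1 - soil / 40) * (temperature / 30) * (1 - humidity / 100) * (1 - rainfall / 50),
    0, None,
)

X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=42)
demo_model = RandomForestRegressor(n_estimators=50, random_state=42).fit(X_tr, y_tr)
pred = demo_model.predict(X_te)

rmse = np.sqrt(mean_squared_error(y_te, pred))
r2 = r2_score(y_te, pred)
# Baseline: always predict the training mean
baseline_rmse = np.sqrt(mean_squared_error(y_te, np.full_like(y_te, y_tr.mean())))

print("RMSE:", rmse)
print("R^2:", r2)
print("Baseline RMSE:", baseline_rmse)
```

Since the target here is a deterministic function of the four features, a low error mainly shows that the forest has approximated the formula within the range of the training data.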


### Defining a Threshold to Trigger Irrigation:

In [6]:
def classify_irrigation_need(value, threshold=0.05):
    return 1 if value >= threshold else 0

df['Irrigation_Trigger'] = df['Irrigation_Need'].apply(classify_irrigation_need)

- `classify_irrigation_need()` compares a computed irrigation need with a threshold (default = 0.05).
- If the value is greater than or equal to the threshold → returns 1 (irrigation should be triggered).
- If the value is below the threshold → returns 0 (no irrigation is needed).
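
As a quick check of this rule, the function can be applied to the five `Irrigation_Need` values printed earlier. The stricter threshold of 0.15 in the last line is a hypothetical value chosen only to show how the parameter changes the outcome.

```python
def classify_irrigation_need(value, threshold=0.05):
    # 1 = trigger irrigation, 0 = do not irrigate
    return 1 if value >= threshold else 0

# Irrigation_Need values from the first five rows of the printed table
needs = [0.107857, 0.030909, 0.168262, 0.028862, 0.014462]

triggers = [classify_irrigation_need(v) for v in needs]
print(triggers)  # [1, 0, 1, 0, 0]

# A value exactly at the threshold triggers irrigation because of >=
boundary = classify_irrigation_need(0.05)
print(boundary)  # 1

# Hypothetical stricter threshold
stricter = [classify_irrigation_need(v, threshold=0.15) for v in needs]
print(stricter)  # [0, 0, 1, 0, 0]
```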

### Making Predictions:

In [7]:
import pandas as pd

new_data = pd.DataFrame([{
    'Temperature_C': 31.2,
    'Humidity_%': 45.0,
    'Rainfall_mm': 8.0,
    'Soil_Moisture_%': 18.0
}])

predicted_irrigation_need = model.predict(new_data)[0]

In [8]:
print("Predicted Irrigation Need:", round(predicted_irrigation_need, 4))

Predicted Irrigation Need: 0.1433


In [9]:
def irrigation_decision(need, threshold=0.05):
    return "Yes" if need >= threshold else "No"

decision = irrigation_decision(predicted_irrigation_need)
print("Irrigation Required?", decision)

Irrigation Required? Yes
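
Because the target is defined by a closed-form formula, a prediction can be sanity-checked against it. The sketch below evaluates the formula for the same sample; the helper name `formula_irrigation_need` is hypothetical, and `model_value` is simply the prediction printed above.

```python
def formula_irrigation_need(temperature_c, humidity_pct, rainfall_mm, soil_moisture_pct):
    """Closed-form target, following the formula used to build Irrigation_Need."""
    soil = max(0, 1 - soil_moisture_pct / 40)
    temp = temperature_c / 30
    hum = 1 - humidity_pct / 100
    rain = 1 - rainfall_mm / 50
    return max(0, soil * temp * hum * rain)

formula_value = formula_irrigation_need(31.2, 45.0, 8.0, 18.0)
model_value = 0.1433  # prediction printed by the notebook

print("Formula:", round(formula_value, 4))  # Formula: 0.2643
print("Gap:", round(formula_value - model_value, 4))
```

For this sample the formula gives about 0.2643, noticeably higher than the model's 0.1433, although both exceed the 0.05 threshold and lead to the same "Yes" decision. One possible explanation, which cannot be confirmed without the data ranges, is that the sample lies outside the region covered by the training set; tree ensembles predict by averaging training targets and so cannot extrapolate beyond them.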


### Saving the Trained Model:

In [7]:
import joblib
joblib.dump(model, 'irrigation_model.pkl')

['irrigation_model.pkl']
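
A saved model can be restored with `joblib.load`. The sketch below demonstrates the round trip on a small stand-in model trained on random data and written to a temporary directory, so it does not touch `irrigation_model.pkl`; the names `demo_model`, `X_demo`, and `y_demo` are assumptions for illustration.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Small stand-in model trained on random data (not the irrigation data)
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 1, size=(50, 4))
y_demo = X_demo.sum(axis=1)
demo_model = RandomForestRegressor(n_estimators=10, random_state=42).fit(X_demo, y_demo)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo_model.pkl")
    joblib.dump(demo_model, path)   # save
    reloaded = joblib.load(path)    # restore

# The reloaded model should reproduce the original predictions
same_predictions = bool(np.allclose(demo_model.predict(X_demo), reloaded.predict(X_demo)))
print(same_predictions)
```

For the model saved above, the equivalent call would be `joblib.load('irrigation_model.pkl')`. Pickle-based files should only be loaded from trusted sources, and ideally with the same scikit-learn version that was used to save them.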