# Precise Agriculture: Resource allocation and Crop yield optimization

## Brief dataset explanation:
the dataset we're working on in this project offers a set of attributes (soil and environment conditions) --such as soil characteristics, rainfall rate, water usage in the corresponding farm, etc.-- concernig various crops, and each crop type has a dedicated set of records exctracted from real-life farms in California.

## Step 1: Data Preprocessing

In [207]:
import pandas as pd

### Loading the dataset to a pandas dataframe

In [208]:
df_raw = pd.read_csv("Crop_recommendation_with_yield.csv")
print (df_raw.shape)
df_raw.head()

(2200, 24)


Unnamed: 0,N,P,K,temperature,humidity,ph,rainfall,label,soil_moisture,soil_type,...,irrigation_frequency,crop_density,pest_pressure,fertilizer_usage,growth_stage,urban_area_proximity,water_source_type,frost_risk,water_usage_efficiency,crop_yield
0,90,42,43,20.879744,82.002744,6.502985,202.935536,rice,29.446064,2,...,4,11.74391,57.607308,188.194958,1,2.719614,3,95.649985,1.193293,5.044659
1,85,58,41,21.770462,80.319644,7.038096,226.655537,rice,12.851183,3,...,4,16.797101,74.736879,70.963629,1,4.714427,2,77.265694,1.752672,3.504667
2,60,55,44,23.004459,82.320763,7.840207,263.964248,rice,29.363913,2,...,1,12.654395,1.034478,191.976077,1,30.431736,2,18.192168,3.035541,5.535116
3,74,35,40,26.491096,80.158363,6.980401,242.864034,rice,26.207732,3,...,1,10.86436,24.091888,55.761388,3,10.861071,3,82.81872,1.273341,4.952057
4,78,42,42,20.130175,81.604873,7.628473,262.71734,rice,28.236236,2,...,3,13.85291,38.811481,185.259702,2,47.190777,3,25.466499,2.578671,5.342053


### Data cleaning

In [209]:
# checking for null values in the dataset
df_raw.isnull().values.any()

np.False_

In [210]:
# exploring available crops
df_raw["label"].unique()

array(['rice', 'maize', 'chickpea', 'kidneybeans', 'pigeonpeas',
       'mothbeans', 'mungbean', 'blackgram', 'lentil', 'pomegranate',
       'banana', 'mango', 'grapes', 'watermelon', 'muskmelon', 'apple',
       'orange', 'papaya', 'coconut', 'cotton', 'jute', 'coffee'],
      dtype=object)

* NOTE: Water Usage Efficiency (WUE) refers to the amount of water `L/hec`

In [211]:
df_raw["water_usage_efficiency"] = df_raw["water_usage_efficiency"] * df_raw["crop_yield"] * 1000 # WUE (L/hec)

In [212]:
df_raw["water_usage_efficiency"].values

array([ 6019.75742843,  6142.53075706, 16802.0723481 , ...,
       10174.8435152 , 19177.90454742, 14954.15902567], shape=(2200,))

* In order to nomalize the three attributes concerning resources allocation, we need to detemine the range of their values from the dataset

In [213]:
# WUE => Water Usage Efficiency
WUE_mean = df_raw["water_usage_efficiency"].mean()
print (f"Water Usage Efficiency Mean: {WUE_mean:.2f}")
WUE_min = df_raw["water_usage_efficiency"].min()
print (f"Water Usage Efficiency Min: {WUE_min:.2f}")
WUE_max = df_raw["water_usage_efficiency"].max()
print (f"Water Usage Efficiency Max: {WUE_max:.2f}")


Water Usage Efficiency Mean: 12924.15
Water Usage Efficiency Min: 2217.93
Water Usage Efficiency Max: 31174.13


In [214]:
# FU => Fertilizer Usage
FU_mean = df_raw["fertilizer_usage"].mean()
print (f"Fertilizer Usage Mean: {FU_mean:.2f}")
FU_min = df_raw["fertilizer_usage"].min()
print (f"Fertilizer Usage Min: {FU_min:.2f}")
FU_max = df_raw["fertilizer_usage"].max()
print (f"Fertilizer Usage Max: {FU_max:.2f}")

Fertilizer Usage Mean: 125.85
Fertilizer Usage Min: 50.21
Fertilizer Usage Max: 199.98


In [215]:
# IF => Irrigation frequency
IF_mean = df_raw["irrigation_frequency"].mean()
print (f"Irrigation Frequency Mean: {IF_mean}")
IF_min = df_raw["irrigation_frequency"].min()
print (f"Irrigation Frequency Min: {IF_min}")
IF_max = df_raw["irrigation_frequency"].max()
print (f"Irrigation Frequency Max: {IF_max}")

Irrigation Frequency Mean: 3.515
Irrigation Frequency Min: 1
Irrigation Frequency Max: 6


## STEP 2: Prediction Model Building

* In order to allow our heuristic to have an idea about the goal state, we propose to learn a model to predict the crop yield based on a specified soil, environment conditions and resources allocation

* we're assignig each crop type a numeric value (index) to make it a valid input for our model

In [216]:
df_raw["label"] = pd.factorize(df_raw["label"])[0]
df_raw["label"].unique

<bound method Series.unique of 0        0
1        0
2        0
3        0
4        0
        ..
2195    21
2196    21
2197    21
2198    21
2199    21
Name: label, Length: 2200, dtype: int64>

* Selecting the model's inputs, that are the most impactfull attributes on crop yield

In [217]:
y = df_raw["crop_yield"] # model's output ==> y = f(x1,x2,...,xn)

x = df_raw.drop(["crop_yield", "frost_risk","crop_density", "urban_area_proximity", "pest_pressure", "co2_concentration", "wind_speed", "sunlight_exposure"], axis=1)
x.columns # ==> x1,x2,...,xn (these xi represent the model's inputs for crop yield prediction)

Index(['N', 'P', 'K', 'temperature', 'humidity', 'ph', 'rainfall', 'label',
       'soil_moisture', 'soil_type', 'organic_matter', 'irrigation_frequency',
       'fertilizer_usage', 'growth_stage', 'water_source_type',
       'water_usage_efficiency'],
      dtype='object')

### Data Splitting

* We will split our data to 80% for model training and 20% for model testing

In [218]:
from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(x,y, test_size=0.2, random_state=42)

### Model Building (Linear Regression)

* We train an empty linear regression model on our data

In [219]:
from sklearn.linear_model import LinearRegression

lr = LinearRegression()
lr.fit(x_train, y_train)

* Applying the model to make a prediction

In [220]:
y_lr_train_pred = lr.predict(x_train) # => to check how well the model fits on data it was trained with
y_lr_test_pred = lr.predict(x_test) # => to test the model with unseen data
y = lr.predict

### Evaluating Model's Performance

* Comparing mean square error to say how close our model's predictions were to the actual data
* R2 score is a statistical measure for the predictions

In [221]:
from sklearn.metrics import mean_absolute_error, r2_score

lr_train_mse = mean_absolute_error(y_train, y_lr_train_pred)
lr_train_r2 = r2_score(y_train, y_lr_train_pred)

lr_test_mse = mean_absolute_error(y_test, y_lr_test_pred)
lr_test_r2 = r2_score(y_test, y_lr_test_pred)

In [222]:
print(f"LR MSE (Train): {lr_train_mse}")
print(f"LR R2 (Train): {lr_train_r2}")
print(f"LR MSE (Test): {lr_test_mse}")
print(f"LR R2 (Test): {lr_test_r2}")


LR MSE (Train): 0.3142393498637868
LR R2 (Train): 0.7579388239117083
LR MSE (Test): 0.27604337415494834
LR R2 (Test): 0.792427084866312


* It's important that we add a new column that represents the ratio we aim to maximize with our search

    $\text{ratio\_state} = \frac{\text{crop\_yield}}{\text{ normalized\_WUE } + \text{ normalized\_FU } + \text{ normalized\_IF }}$

$\text{normalized\_WUE} = \frac{\text{ WUE\_weight } \times \text{ WUE }}{\text{ WUE\_max } - \text{ WUE\_min }}$

In [223]:
normalized_WUE = df_raw["water_usage_efficiency"] / (WUE_max - WUE_min)
normalized_FU = df_raw["fertilizer_usage"] / (FU_max - FU_min)
normalized_IF = df_raw["irrigation_frequency"] / (IF_max - IF_min)
df_raw["yield_per_resources"] = (df_raw["crop_yield"] / (normalized_WUE + normalized_FU + normalized_IF))
df_raw["yield_per_resources"].max()

6.267844284020397

### Retrieve User input from form

In [224]:
from flask import request, Flask
app = Flask(__name__)

@app.route('/submit', methods=['POST'])
def handle_form():
    # Retrieve user inputs from the form
    N = request.form['N']
    P = request.form['P']
    K = request.form['K']
    temperature = request.form['temperature']
    humidity = request.form['humidity']
    ph = request.form['ph']
    rainfall = request.form['rainfall']
    soil_moisture = request.form['soil_moisture']
    soil_type = request.form['soil_type']
    organic_matter = request.form['organic_matter']
    growth_stage = request.form['growth_stage']
    water_source_type = request.form['water_source_type']


## STEP 3: Problem Formulation

In [225]:
from collections import deque
import random
from math import ceil
import utils

### Problem formulation

In [234]:
class Problem:

    def __init__(self, WUE_weight, FU_weight, IF_weight, soil_env_conditions):
        self.initial_state = self.get_random_state()
        self.conditions = soil_env_conditions
        self.WUE_weight = WUE_weight
        self.FU_weight = FU_weight
        self.IF_weight = IF_weight
        self.action_value = 0


    def get_random_state(self):
        random_WUE = random.uniform(WUE_min, WUE_max)
        random_FU = random.uniform(FU_min, FU_max)
        random_IF = random.choice(range(1, 8))
        return [random_WUE, random_FU, random_IF]

    def actions(self):
        actions_list = ["Increase Water Usage",
                        "Decrease Water Usage",
                        "Increase Fertilizer Usage",
                        "Decrease Fertilizer Usage",
                        "Increase Irrigation Frequency",
                        "Decrease Irrigation Frequency",
                        ]
        return actions_list
    
    def result(self, state, action):
        if action == None:
            print ("None action")
            return
        child = state[:]
        if action == "Increase Water Usage":
            self.action_value = random.uniform(20, 2000)
            while child[0] + self.action_value <= WUE_max:
                child[0] += self.action_value
            return child
        if action == "Decrease Water Usage":
            self.action_value = random.uniform(20, 2000)
            if child[0] - self.action_value >= WUE_min:
                child[0] -= self.action_value
            return child
        if action == "Increase Fertilizer Usage":
            self.action_value = random.uniform(5, 30)
            if child[1] + self.action_value <= FU_max:
                child[1] += self.action_value           
            return child
        if action == "Decrease Fertilizer Usage":
            self.action_value = random.uniform(5, 30)
            if child[1] - self.action_value >= FU_min:
                child[1] -= self.action_value        
            return child
        if action == "Increase Irrigation Frequency":
            self.action_value = random.choice(range(1,7))
            if child[2] + self.action_value <= IF_max:
                child[2] += self.action_value           
            return child
        if action == "Decrease Irrigation Frequency":
            self.action_value = random.choice(range(1, 7))
            if child[2] - self.action_value >= IF_min:
                child[2] -= self.action_value          
            return child
    
    
    def goal_test(self, state):
        return self.heuristic(state) <= 1e-6 # accept when state is at least 1e-6 close to threshold
    
    def heuristic(self, state):
        # preparing the model's input
        conditions = dict(self.conditions)
        conditions["water_usage_efficiency"] = state[0]
        conditions["fertilizer_usage"] = state[1]
        conditions["irrigation_frequency"] = state[2]

        input_values = [conditions]
        input_df = pd.DataFrame(input_values)
        estimated_crop_yield = lr.predict(input_df)

        normalized_WUE = self.WUE_weight * (state[0] - WUE_min) / (WUE_max - WUE_min)
        normalized_FU = self.FU_weight * (state[1] - FU_min) / (FU_max - FU_min)
        normalized_IF = self.IF_weight * (state[2] - IF_min) / (IF_max - IF_min)

        yield_per_resources = estimated_crop_yield / (normalized_WUE + normalized_FU + normalized_IF)

        ratio_threshold = 6 # example threshold value
        if yield_per_resources >= ratio_threshold: # threshold is defined by trying thresholds and incrementing them each time we find a goal
            return 0
        return abs(yield_per_resources - ratio_threshold)


    def evaluation_function(self, c, action, state, strategy):
        action_cost = 0

        if action == "Increase Water Usage":
            action_cost = self.WUE_weight * self.action_value / (WUE_max - WUE_min)
        elif action == "Decrease Water Usage":
            action_cost = self.WUE_weight * (self.action_value) / (WUE_max - WUE_min)
        elif action == "Increase Fertilizer Usage":
            action_cost = self.FU_weight * self.action_value / (FU_max - FU_min)
        elif action == "Decrease Fertilizer Usage":
            action_cost = self.FU_weight * (self.action_value) / (FU_max - FU_min)
        elif action == "Increase Irrigation Frequency":
            action_cost = self.IF_weight * self.action_value / (IF_max - IF_min)
        elif action == "Decrease Irrigation Frequency":
            action_cost = self.IF_weight * (self.action_value) / (IF_max - IF_min)
            
        if strategy == "Greedy best first":
            action_cost = 0

        if state == self.initial_state or action == None: # evaluate initial state (no path cost)
            return self.heuristic(state)

        f = c + action_cost + self.heuristic(state)
        return f

    def value(self, state): # for Genetic Algorithm's fitness function (to evaluate states)
        conditions = self.conditions
        conditions["water_usage_efficiency"] = state[0]
        conditions["fertilizer_usage"] = state[1]
        conditions["irrigation_frequency"] = state[2]
        input_values = [conditions]
        input_df = pd.DataFrame(input_values)
        estimated_crop_yield = lr.predict(input_df)

        normalized_WUE = self.WUE_weight * (state[0]) / (WUE_max - WUE_min)
        normalized_FU = self.FU_weight * (state[1]) / (FU_max - FU_min)
        normalized_IF = self.IF_weight * (state[2])/ (IF_max - IF_min)

        yield_per_resources = estimated_crop_yield / (normalized_WUE + normalized_FU + normalized_IF)

        return yield_per_resources
    

### Defining nodes

In [227]:
class Node:
    def __init__(self, state, cost, parent=None, action=None):
        self.state = state
        self.parent = parent
        self.action = action
        self.cost = cost

    def __repr__(self):
        return f"<Node {self.state}>"
    
    def expand(self, problem):
        return [self.child_node(problem, action)
                for action in problem.actions()]
    
    def child_node(self, problem, action, strategy):
        child_state = problem.result(self.state, action)
        child_cost = problem.evaluation_function(self.cost, action, child_state, strategy)
        child = Node(child_state, child_cost, self, action)
        return child
    

### Informed search (A*, Greedy best-first)

In [228]:
from math import floor
class informedSearch:
    def __init__(self, problem, strategy = "A*"):
        self.frontier = deque()
        self.explored_set = set()
        self.problem = problem
        self.strategy = strategy
    
    def is_in_frontier(self, state):
        for node in self.frontier:
            if node.state == state:
                return True
        return False
    
    def add_to_frontier(self, node): # node has to be inserted in the correct position in the frontier based on its cost
        for index in range(len(self.frontier)):
            current_node = self.frontier[index]
            if current_node.state == node.state: # it found that the given state exists already
                if node.cost < current_node.cost:
                    self.frontier[index] = node # replace the older state with the newer, since it has less cost
                return
            if node.cost < current_node.cost:
                self.frontier.insert(index, node) # insert our node just before the node having greater cost
                return
        self.frontier.append(node)
            

    def search(self): # applying search (A* or Greedy) based on the specified strategy
        init_state = self.problem.initial_state
        init_state_cost = self.problem.evaluation_function(0, None, init_state, self.strategy)
        node = Node(init_state, init_state_cost)
        self.frontier.append(node)
        while True:
            if len(self.frontier) == 0:
                return None
            node = self.frontier.popleft()
            if self.problem.goal_test(node.state):
                print("Goal Found!")
                node.state[2] = floor(node.state[2])
                return node.state # optimized resource allocation (WUE, FU, IF)
            self.explored_set.add(tuple(node.state))
            for action in self.problem.actions():
                child_node = node.child_node(self.problem, action, self.strategy)
                if tuple(child_node.state) not in self.explored_set: # don't check for frontier, add_to_frontier() does that
                    self.add_to_frontier(child_node)

### Genetic algorithm

In [229]:
class geneticAlgorithm:
    def __init__(self, problem, generations=50):
        self.problem = problem

        self.population = set()
        for action in problem.actions():
            state = problem.result(problem.initial_state, action)
            self.population.add(tuple(state))

        self.mutation_rate = 0.75
        self.mutation_probability = 0.6
        self.generations = generations
        
        
    def search(self):
        best_chromosome = self.problem.initial_state
        best_value = -float("inf")
        for generation in range(self.generations):
            new_population = set()
            for state in self.population:# check for optimal states in current population
                state_value = self.fitness_function(list(state))
                if best_value < state_value:
                    best_value = state_value
                    best_chromosome = list(state)
            new_population.add(tuple(best_chromosome))
            while len(new_population) < len(self.population):
                parent_state_1 = self.roulette_selection(self.population)
                parent_state_2 = self.roulette_selection(self.population)
                children = self.reproduce(parent_state_1, parent_state_2)
                if random.random() < self.mutation_probability: # mutation proba is high to increase states' diversity
                    for child in children:
                        # mutation doesn't necessarily apply to each child => we make it apply to 3/4 of population
                        if random.random() < self.mutation_rate:
                            child = self.mutate(child)
                        new_population.add(tuple(child))
            self.population = new_population        
        return best_chromosome



    def fitness_function(self, state):
        return self.problem.value(state)


    def roulette_selection(self, population):
        fitness_values = {state: self.problem.value(list(state)) for state in population}
        total_fitness = sum(fitness_values.values())

        if total_fitness == 0:
            return list(random.choice(list(population)))

        threshold = random.uniform(0, total_fitness)
        cumulative = 0

        for state, fitness in fitness_values.items():
            cumulative += fitness
            if cumulative >= threshold:
                return list(state)

        return list(random.choice(list(population)))

    
    def reproduce(self, parent_1, parent_2):
        parent_1_copy = list(parent_1)
        parent_2_copy = list(parent_2)
        swapped = [0,0,0] # set swapped[0] = 1 if we change WUE of the two states, swapped[1] = 1 for FU, swapped[2] = 1 for IF
        how_many_changements = random.choice([1,2]) # change one or two attributes from (WUE, FU, IF)
        for i in range(how_many_changements):
            changement = random.choice(range(3))
            while swapped[changement] == 1:
                changement = random.choice(range(3))
            parent_1_copy[changement], parent_2_copy[changement] = parent_2_copy[changement], parent_1_copy[changement]     
            swapped[changement] = 1  
        return [parent_1_copy, parent_2_copy]
    


    def mutate(self, state):
        random_action_index = random.choice(range(6)) # there are 6 possible actions for each state
        action = self.problem.actions()[random_action_index]
        mutated_state = self.problem.result(state, action)
        return mutated_state


### Constraint Satisfaction Problem

In [230]:
# class Problem2:
#     def __init__(self):
#         self._variables = []
#         self._domains = {}
#         self._constraints = {}
#         self._neighbors = {}

#     def addVariable(self, variable, domain):
#         if variable in self._variables:
#             raise ValueError("Variable already exists")
#         self._variables.append(variable)
#         self._domains[variable] = list(domain)
#         self._constraints[variable] = []
#         self._neighbors[variable] = set()

#     def addVariables(self, variables, domain):
#         for variable in variables:
#             self.addVariable(variable, domain)

#     def addConstraint(self, constraint, variables=None):
#         if variables is None:
#             variables = constraint.variables
#         else:
#             if not hasattr(constraint, "variables"):
#                 constraint = FunctionConstraint(constraint, variables)
#             elif constraint.variables is not None:
#                 raise ValueError("Constraint already bound to other variables")

#         for variable in variables:
#             if variable not in self._variables:
#                 raise ValueError(f"Unknown variable {variable}")
#             self._constraints[variable].append(constraint)
#         for i in range(len(variables)):
#             for j in range(len(variables)):
#                 if i != j:
#                     self._neighbors[variables[i]].add(variables[j])

#     def getSolutions(self):
#         return self._getSolutions({})

#     def getSolution(self):
#         solutions = self._getSolutions({})
#         return solutions[0] if solutions else None

#     def _getSolutions(self, assignments):
#         if len(assignments) == len(self._variables):
#             return [assignments.copy()]

#         unassigned = [v for v in self._variables if v not in assignments]
#         variable = unassigned[0]

#         result = []
#         for value in self._domains[variable]:
#             assignments[variable] = value
#             if self._isValid(assignments, variable):
#                 result.extend(self._getSolutions(assignments))
#             del assignments[variable]

#         return result

#     def _isValid(self, assignments, variable):
#         for constraint in self._constraints[variable]:
#             if not constraint.evaluate(assignments):
#                 return False
#         return True


# class FunctionConstraint:
#     def __init__(self, function, variables):
#         self.function = function
#         self.variables = list(variables)

#     def evaluate(self, assignments):
#         if not all(v in assignments for v in self.variables):
#             return True
#         values = [assignments[v] for v in self.variables]
#         return self.function(*values)

# # --- Toggle TEST MODE ---
# TEST_MODE = True  # Set to False for interactive input

# # --- Example farm features and resource availability for testing ---
# example_features = {"N":80, "P":36, "K":45, "temperature":21.37,
#                   "humidity": 40.11, "ph":5.9,"rainfall":230.21,
#                   "label":1,"soil_moisture":19.22, "soil_type":2,
#                   "organic_matter":3.19,"irrigation_frequency":0,"fertilizer_usage":0, "growth_stage":2,
#                   "water_source_type":3, "water_usage_efficiency":0
#                   }

# max_water = 30000 # L/hectare
# max_fertilizer = 200.0 # Kilograms/hectare
# max_irrigation = 7    # times/week

# # --- CSP Setup ---
# problem = Problem2()
# p = Problem(0.3, 0.5, 0.4, example_features)
# water_domain = [round(x * 0.1, 2) for x in range(10, int(max_water * 10) + 1, 50)]
# fertilizer_domain = [round(x * 1.0, 2) for x in range(0, int(max_fertilizer) + 1, 5)]
# irrigation_domain = list(range(1, max_irrigation + 1, 1))


# problem.addVariable("water", water_domain)
# problem.addVariable("fertilizer", fertilizer_domain)
# problem.addVariable("irrigation", irrigation_domain)

# def valid_resource_combo(water, fertilizer, irrigation):
#     return (1.0 <= water <= max_water) and \
#            (0 <= fertilizer <= max_fertilizer) and \
#            (1 <= irrigation <= max_irrigation)

# problem.addConstraint(valid_resource_combo, ["water", "fertilizer", "irrigation"])

# # --- Solve CSP ---
# valid_combinations = problem.getSolutions()
# print(f"Found {len(valid_combinations)} feasible combinations.\n")

# best_solution = None
# best_score = float("-inf")
# for combo in valid_combinations:

#     state = [combo["water"], combo["fertilizer"], combo["irrigation"]]
#     score = p.value(state)
#     if score > best_score:
#         best_score = score
#         best_solution = {
#             **combo,
#            # "predicted_yield": round(predicted, 2),
#             "score": score
#         }

# # --- Output ---
# # Extract WUE, FU, IF and return them as a list
# if best_solution:
#     WUE = best_solution["water"]
#     FU = best_solution["fertilizer"]
#     IF = best_solution["irrigation"]
#     print([WUE, FU, IF])



In [231]:
import numpy as np
class CSP:
    def __init__(self):
        self.variables = ["WUE", "FU", "IF"]
        self.domains = {
        'WUE': np.linspace(WUE_min, WUE_max, 100), # generate 100 number between min and max
        'FU': np.linspace(FU_min, FU_max, 100),
        'IF': np.array([1,2,3,4,5,6,7])
        }
        self.constraints = constraints
        self.neighbors = {
        'WUE': ['IF'],
        'FU': ['IF'],
        'IF': ['WUE', 'FU']
        }

        self.current_domains = None
        self.assignment = {
        'WUE': random.uniform(WUE_min, WUE_max),
        'FU': random.uniform(FU_min, FU_max),
        'IF': random.choice(self.domains["IF"])
        }
    
    def conflicted_vars(self, current):
        """Return a list of variables in current assignment that are in conflict"""
        return [var for var in self.variables
                if self.nconflicts(var, current[var], current) > 0]

    def nconflicts(self, var, val, assignment):
        """Return the number of conflicts var=val has with other variables."""

        def conflict(var2):
            return var2 in assignment and not self.constraints(var, val, var2, assignment[var2])

        return sum(conflict(v) for v in self.neighbors[var])
    
    def min_conflicts(self, max_steps=1000):
        current = self.assignment
        for i in range(max_steps):
            conflicted = self.conflicted_vars(current)
            if not conflicted:
                return current.values()
            var = random.choice(conflicted)
            val = self.min_conflicts_value(var, current)
            current[var] = val
        return None
    
    def min_conflicts_value(self, var, current):
        """Return the value that will give var the least number of conflicts.
        If there is a tie, choose at random."""
        return argmin_random_tie(self.domains[var], key=lambda val: self.nconflicts(var, val, current))
            
    def goal_test(self, assignment):
        return self.conflicted_vars(assignment) == 0


def argmin_random_tie(iterable, key=lambda x: x):
    min_value = min(iterable, key=key)
    ties = [x for x in iterable if key(x) == key(min_value)]
    return random.choice(ties)

def constraints(var1, val1, var2, val2):
    if var1 == 'IF':
        IF_val = val1
    elif var2 == 'IF':
        IF_val = val2
    else:
        IF_val = None

    if var1 == 'WUE':
        WUE_val = val1
    elif var2 == 'WUE':
        WUE_val = val2
    else:
        WUE_val = None

    if var1 == 'FU':
        FU_val = val1
    elif var2 == 'FU':
        FU_val = val2
    else:
        FU_val = None

    # Constraint 1: If normalized IF > 3 → WUE ∈ [WUE_min, WUE_mean]
    if IF_val is not None and WUE_val is not None:
        if IF_val > 3 and not (WUE_min <= WUE_val <= WUE_mean):
            return False

    # Constraint 2: If FU > FU_mean → IF ∈ [IF_mean, IF_max]
    if FU_val is not None and IF_val is not None:
        if FU_val > FU_mean and not (IF_mean <= IF_val <= IF_max):
            return False

    return True

In [232]:
N_min = df_raw["N"].min()
N_max = df_raw["N"].max()
P_min = df_raw["P"].min()
P_max = df_raw["P"].max()
K_min = df_raw["K"].min()
K_max = df_raw["K"].max()
temperature_min = df_raw["temperature"].min()
temperature_max = df_raw["temperature"].max()
humidity_min = df_raw["humidity"].min()
humidity_max = df_raw["humidity"].max()
ph_min = df_raw["ph"].min()
ph_max = df_raw["ph"].max()
rainfall_min = df_raw["rainfall"].min()
rainfall_max = df_raw["rainfall"].max()
soil_moisture_min = df_raw["soil_moisture"].min()
soil_moisture_max = df_raw["soil_moisture"].max()
organic_matter_min = df_raw["organic_matter"].min()
organic_matter_max = df_raw["organic_matter"].max()


## STEP 4: Test The Search Algorithms

In [240]:
import csv
def test_crop_yield_optimization():
    print("\n----- Crop Yield Optimization Test -----")

    env_input = {}

    env_input["N"] = random.randint(N_min, N_max)
    env_input["P"] = random.randint(P_min, P_max)
    env_input["K"] = random.randint(K_min, K_max)
    env_input["temperature"] = random.uniform(temperature_min, temperature_max)
    env_input["humidity"] = random.uniform(humidity_min, humidity_max)
    env_input["ph"] = random.uniform(ph_min, ph_max)
    env_input["rainfall"] = random.uniform(rainfall_min, rainfall_max)
    env_input["label"] = random.randint(0, 22)
    env_input["soil_moisture"] = random.uniform(soil_moisture_min, soil_moisture_max)
    env_input["soil_type"] = random.randint(1, 3)
    env_input["organic_matter"] = random.uniform(organic_matter_min, organic_matter_max)
    env_input["irrigation_frequency"] = 0 # to be recommended
    env_input["fertilizer_usage"] = 0 # to be recommended
    env_input["growth_stage"] = random.randint(1, 3)
    env_input["water_source_type"] = random.randint(1, 3)
    env_input["water_usage_efficiency"] = 0 # to be recommended

    '''env_input = {"N":80, "P":36, "K":45, "temperature":21.37,
                  "humidity": 40.11, "ph":5.9,"rainfall":230.21,
                  "label":1,"soil_moisture":19.22, "soil_type":2,
                  "organic_matter":3.19,"irrigation_frequency":0,"fertilizer_usage":0, "growth_stage":2,
                  "water_source_type":3, "water_usage_efficiency":0       
                  }'''
    
    problem = Problem(0.3, 0.4, 0.2, env_input)
    informed_search = informedSearch(problem, strategy="A*")

    print("\n--- A* Search ---")
    astar_result = informed_search.search()
    if astar_result is None:
        print("A* Algorithm Has Failed to Achieve The Specified Threshold. Try Again!")
        return
    env_input["water_usage_efficiency"] = astar_result[0]
    env_input["fertilizer_usage"] = astar_result[1]
    env_input["irrigation_frequency"] = astar_result[2]
    df = pd.DataFrame([env_input])
    print(f"A* Result ==> WUE:{astar_result[0]:.2f} L/ha, FU:{astar_result[1]:.2f} kg/ha, IF:{astar_result[2]} times/week")
    astar_yield = lr.predict(df)
    print(f"Crop yield => {astar_result} tn/hec")
    astar_ratio = problem.value(astar_result)
    astar_display = [astar_result[0], astar_result[1], astar_result[2], astar_yield, astar_ratio]
    print(f"ratio ==> {astar_ratio}")


    problem = Problem(0.3, 0.4, 0.2, env_input)
    informed_search = informedSearch(problem, strategy="Greedy best first")

    print("\n--- Greedy Search ---")
    greedy_result = informed_search.search()
    if greedy_result is None:
        print("Greedy Best First Algorithm Has Failed to Achieve The Specified Threshold. Try Again!")
        return
    env_input["water_usage_efficiency"] = greedy_result[0]
    env_input["fertilizer_usage"] = greedy_result[1]
    env_input["irrigation_frequency"] = greedy_result[2]
    df = pd.DataFrame([env_input])
    print(f"Greedy Result ==> WUE:{greedy_result[0]:.2f} L/ha, FU:{greedy_result[1]:.2f} kg/ha, IF:{greedy_result[2]} times/week")
    greedy_yield = lr.predict(df)
    print(f"Crop yield => {greedy_yield} tn/hec")

    greedy_ratio = problem.value(greedy_result)
    greedy_display = [greedy_result[0], greedy_result[1], greedy_result[2], greedy_yield, greedy_ratio]
    print(f"ratio ==> {greedy_ratio}")

    print("\n--- Genetic Algorithm ---")
    problem = Problem(0.3, 0.4, 0.2, env_input)
    genetic_search = geneticAlgorithm(problem)
    genetic_result = genetic_search.search()
    if genetic_result is None:
        print("Genetic Algorithm Has Failed to Achieve The Specified Threshold. Try Again!")
        return
    env_input["water_usage_efficiency"] = genetic_result[0]
    env_input["fertilizer_usage"] = genetic_result[1]
    env_input["irrigation_frequency"] = genetic_result[2]
    df = pd.DataFrame([env_input])
    print(f"Genetic Result ==> WUE:{genetic_result[0]:.2f} L/ha, FU:{genetic_result[1]:.2f} kg/ha, IF:{genetic_result[2]:.2f} times/week")
    genetic_yield = lr.predict(df)
    print(f"Estimated Crop yield => {genetic_yield} tn/hec")

    genetic_ratio = problem.value(genetic_result)
    genetic_display = [genetic_result[0], genetic_result[1], genetic_result[2], genetic_yield, genetic_ratio]
    print(f"ratio ==> {genetic_ratio}")


    print("\n--- CSP (Min-Conflicts) ---")

    csp = CSP()
    
    csp_result = list(csp.min_conflicts(max_steps=1000))
    if csp_result is not None:
        print(f"CSP Result: {csp_result}")
        env_input["water_usage_efficiency"] = csp_result[0]
        env_input["fertilizer_usage"] = csp_result[1]
        env_input["irrigation_frequency"] = csp_result[2]
        df = pd.DataFrame([env_input])
        csp_yield = lr.predict(df)
        print(f"CSP Crop Yield => {csp_yield} tn/hec")
    else:
        print("CSP failed to find a solution. Try again!")
        return
    
    csp_ratio = problem.value(csp_result)
    csp_display = [csp_result[0], csp_result[1], csp_result[2], csp_yield, csp_ratio]
    print(f"ratio ==> {csp_ratio}")
    
    return [astar_display, greedy_display, genetic_display, csp_display]
           #[WUE, FU, IF, yield, ratio]

if __name__ == "__main__":
    random.seed()
    ratio_results = []
    WUE_results = []
    FU_results = []
    yield_results = []
    IF_results = []  # Added for irrigation frequency if needed
    
    for i in range(20):
        print("Run Number: ", i)
        result = test_crop_yield_optimization()
        if result is None:
            continue  # Skip failed runs
            
        # Each result contains [astar_display, greedy_display, genetic_display, csp_display]
        # Each display contains [WUE, FU, IF, yield, ratio]
        
        # Collect ratios for each algorithm
        ratio_results.append([
            result[0][4],  # A* ratio
            result[1][4],  # Greedy ratio
            result[2][4],  # Genetic ratio
            result[3][4]   # CSP ratio
        ])
        
        # Collect WUE for each algorithm
        WUE_results.append([
            result[0][0],  # A* WUE
            result[1][0],  # Greedy WUE
            result[2][0],  # Genetic WUE
            result[3][0]   # CSP WUE
        ])
        
        # Collect FU for each algorithm
        FU_results.append([
            result[0][1],  # A* FU
            result[1][1],  # Greedy FU
            result[2][1],  # Genetic FU
            result[3][1]   # CSP FU
        ])
        
        # Collect yield for each algorithm
        yield_results.append([
            result[0][3][0],  # A* yield (assuming it's an array)
            result[1][3][0],  # Greedy yield
            result[2][3][0],  # Genetic yield
            result[3][3][0]   # CSP yield
        ])

    # Write to CSV files
    with open('bar_chart.csv', 'w', newline='') as file:
        writer = csv.writer(file)
        writer.writerow(["A*", "Greedy Best First", "Genetic", "CSP"])
        writer.writerows(ratio_results)
        
    with open('WUE_chart.csv', 'w', newline='') as file:
        writer = csv.writer(file)
        writer.writerow(["A*", "Greedy Best First", "Genetic", "CSP"])
        writer.writerows(WUE_results)
        
    with open('FU_chart.csv', 'w', newline='') as file:
        writer = csv.writer(file)
        writer.writerow(["A*", "Greedy Best First", "Genetic", "CSP"])
        writer.writerows(FU_results)
        
    with open('yield_chart.csv', 'w', newline='') as file:
        writer = csv.writer(file)
        writer.writerow(["A*", "Greedy Best First", "Genetic", "CSP"])
        writer.writerows(yield_results)
    

Run Number:  0

----- Crop Yield Optimization Test -----

--- A* Search ---
Goal Found!
A* Result ==> WUE:12065.19 L/ha, FU:136.71 kg/ha, IF:7 times/week
Crop yield => [12065.19099196377, 136.7131534922722, 7] tn/hec
ratio ==> [5.15160314]

--- Greedy Search ---
Goal Found!
Greedy Result ==> WUE:29805.71 L/ha, FU:77.21 kg/ha, IF:3 times/week
Crop yield => [3.96154984] tn/hec
ratio ==> [6.23847929]

--- Genetic Algorithm ---
Genetic Result ==> WUE:2587.29 L/ha, FU:56.50 kg/ha, IF:1.00 times/week
Estimated Crop yield => [2.68022567] tn/hec
ratio ==> [12.31139857]

--- CSP (Min-Conflicts) ---
CSP Result: [18960.624827017364, 99.37834307927099, np.int64(1)]
CSP Crop Yield => [3.51161387] tn/hec
ratio ==> [6.99729363]
Run Number:  1

----- Crop Yield Optimization Test -----

--- A* Search ---
Goal Found!
A* Result ==> WUE:11578.92 L/ha, FU:177.12 kg/ha, IF:7 times/week
Crop yield => [11578.918688578719, 177.11565258072176, 7] tn/hec
ratio ==> [5.71084917]

--- Greedy Search ---
Goal Found!


## Extra Work

### CSP Version 2