A model that predicts which crop to grow from soil and climate parameters.

Import libraries and load the dataset.

In [1]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, accuracy_score

# Load the dataset
data_path = r"C:\Users\Arnav\Desktop\spawn_labs\Crop_recommendation.csv"
data = pd.read_csv(data_path)

Preview of the dataset

In [2]:
print("Dataset Preview:\n", data.head())
print("\nDataset Info:\n")
data.info()

Dataset Preview:
     N   P   K  temperature   humidity        ph    rainfall label
0  90  42  43    20.879744  82.002744  6.502985  202.935536  rice
1  85  58  41    21.770462  80.319644  7.038096  226.655537  rice
2  60  55  44    23.004459  82.320763  7.840207  263.964248  rice
3  74  35  40    26.491096  80.158363  6.980401  242.864034  rice
4  78  42  42    20.130175  81.604873  7.628473  262.717340  rice

Dataset Info:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2200 entries, 0 to 2199
Data columns (total 8 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   N            2200 non-null   int64  
 1   P            2200 non-null   int64  
 2   K            2200 non-null   int64  
 3   temperature  2200 non-null   float64
 4   humidity     2200 non-null   float64
 5   ph           2200 non-null   float64
 6   rainfall     2200 non-null   float64
 7   label        2200 non-null   object 
dtypes: float64(4), int64(3), object(1)
memo

Separate the target variable (the plant to be grown) and split the dataset into training and test sets

In [3]:
X = data.drop(columns=['label'])  # features: N, P, K, temperature, humidity, ph, rainfall
y = data['label']                 # target: crop name

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
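The split above is purely random, which is why the support column in the classification report further down ranges from 11 to 27 rows per crop. A stratified split keeps class proportions equal in both sets. The following is a minimal sketch on synthetic data; the `toy` frame and its values are hypothetical stand-ins, not the real CSV.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the crop data (hypothetical values): 3 crops, 50 rows each.
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "N": rng.integers(0, 140, size=150),
    "rainfall": rng.uniform(20, 300, size=150),
    "label": np.repeat(["rice", "maize", "jute"], 50),
})
X_toy = toy.drop(columns=["label"])
y_toy = toy["label"]

# stratify=y_toy keeps the share of each crop the same in the train and test sets.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.2, random_state=42, stratify=y_toy
)

test_counts = y_te.value_counts()
print(test_counts)  # each crop contributes the same number of test rows
```

In this notebook the equivalent change would be adding `stratify=y` to the existing `train_test_split` call.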

Create the model and train it on the training data

In [4]:

model = RandomForestClassifier(random_state=42)
model.fit(X_train, y_train)
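After fitting, a random forest exposes `feature_importances_`, which gives a rough view of which inputs drive its predictions. A minimal sketch on synthetic data follows; the `toy_*` names and values are hypothetical, and the label is constructed to depend on rainfall only.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 200

# Synthetic data (hypothetical): the label depends only on 'rainfall'; 'ph' is noise.
toy_X = pd.DataFrame({
    "rainfall": rng.uniform(20, 300, size=n),
    "ph": rng.uniform(5, 8, size=n),
})
toy_y = np.where(toy_X["rainfall"] > 150, "rice", "lentil")

toy_model = RandomForestClassifier(random_state=42)
toy_model.fit(toy_X, toy_y)

# Pair each importance with its column name and sort from most to least important.
importances = pd.Series(
    toy_model.feature_importances_, index=toy_X.columns
).sort_values(ascending=False)
print(importances)
```

On the notebook's model the same pattern would be `pd.Series(model.feature_importances_, index=X_train.columns)`.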

Predict on the test set and check the accuracy scores

In [5]:
predictions = model.predict(X_test)
print("\nModel Accuracy:", accuracy_score(y_test, predictions))
print("\nClassification Report:\n", classification_report(y_test, predictions))


Model Accuracy: 0.9931818181818182

Classification Report:
               precision    recall  f1-score   support

       apple       1.00      1.00      1.00        23
      banana       1.00      1.00      1.00        21
   blackgram       1.00      1.00      1.00        20
    chickpea       1.00      1.00      1.00        26
     coconut       1.00      1.00      1.00        27
      coffee       1.00      1.00      1.00        17
      cotton       1.00      1.00      1.00        17
      grapes       1.00      1.00      1.00        14
        jute       0.92      1.00      0.96        23
 kidneybeans       1.00      1.00      1.00        20
      lentil       0.92      1.00      0.96        11
       maize       1.00      1.00      1.00        21
       mango       1.00      1.00      1.00        19
   mothbeans       1.00      0.96      0.98        24
    mungbean       1.00      1.00      1.00        19
   muskmelon       1.00      1.00      1.00        17
      orange       1
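The report shows precision below 1.00 for jute and lentil and recall below 1.00 for mothbeans, but not which crops were confused with which. A confusion matrix answers that. Below is a small sketch with made-up labels (the `y_true_demo` and `y_pred_demo` lists are hypothetical, not the notebook's predictions).

```python
import pandas as pd
from sklearn.metrics import confusion_matrix

# Hypothetical true and predicted labels, for illustration only.
y_true_demo = ["rice", "rice", "jute", "jute", "lentil", "mothbeans"]
y_pred_demo = ["rice", "jute", "jute", "jute", "lentil", "lentil"]
labels = ["jute", "lentil", "mothbeans", "rice"]

# Rows are the true crops, columns are the predicted crops.
cm = pd.DataFrame(
    confusion_matrix(y_true_demo, y_pred_demo, labels=labels),
    index=labels,
    columns=labels,
)
print(cm)
```

Off-diagonal cells are the mistakes: here one rice row was predicted as jute and one mothbeans row as lentil. For the notebook's model the call would be `confusion_matrix(y_test, predictions, labels=model.classes_)`.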

Verifying the model with our own examples

In [6]:
# Sample conditions for pomegranate to grow
# Ideal growth values have been taken from the dataset

example = pd.DataFrame({
    'N': [18],  # Soil nitrogen value
    'P': [27],  # Soil phosphorus value
    'K': [41],  # Soil potassium value
    'temperature': [22.5],  
    'humidity': [91.0],   
    'ph': [6.8],          
    'rainfall': [110.0]   
})
example_prediction = model.predict(example)
print("\nPredicted Crop for Example Parameters:", example_prediction[0])


Predicted Crop for Example Parameters: pomegranate


In [7]:
#Sample conditions for blackgram to grow

example2 = pd.DataFrame({
    'N': [50],            
    'P': [65],            
    'K': [19],            
    'temperature': [30.0],  
    'humidity': [67.0],   
    'ph': [7.1],          
    'rainfall': [70.0]    
})
example_prediction_2 = model.predict(example2)
print("\nPredicted Crop for Example Parameters:", example_prediction_2[0])


Predicted Crop for Example Parameters: blackgram
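`predict` returns only the single most likely crop. When several crops suit similar conditions, `predict_proba` shows how the forest's votes are distributed. This is a hedged sketch on synthetic data; the `clf` model, the `toy_*` data, and the sample values are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic data (hypothetical): three crops separated by rainfall band; 'ph' is noise.
rain = np.concatenate([
    rng.uniform(20, 80, 60),     # lentil
    rng.uniform(100, 160, 60),   # maize
    rng.uniform(200, 280, 60),   # rice
])
toy_X = pd.DataFrame({"rainfall": rain, "ph": rng.uniform(5.5, 7.5, 180)})
toy_y = np.repeat(["lentil", "maize", "rice"], 60)

clf = RandomForestClassifier(random_state=42).fit(toy_X, toy_y)

# One sample in the high-rainfall band; predict_proba returns one row of class probabilities.
sample = pd.DataFrame({"rainfall": [240.0], "ph": [6.5]})
proba = pd.Series(
    clf.predict_proba(sample)[0], index=clf.classes_
).sort_values(ascending=False)
print(proba.head(3))  # crops ranked by predicted probability
```

With the notebook's model, `model.predict_proba(example)` combined with `model.classes_` would rank all crops for the example conditions.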


Vice versa: predict growing conditions from the plant

In [9]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import mean_absolute_error

In [10]:
file_path = "C:\\Users\\Arnav\\Desktop\\spawn_labs\\Crop_recommendation.csv"
data = pd.read_csv(file_path)

In [11]:
label_encoder = LabelEncoder()
data['label_encoded'] = label_encoder.fit_transform(data['label'])
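`LabelEncoder` maps each distinct crop name to an integer in alphabetical order, and `inverse_transform` maps the integers back. The short sketch below uses a hypothetical list of crop names to show the round trip.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical crop names, for illustration only.
crops = ["rice", "maize", "rice", "apple"]

enc = LabelEncoder()
codes = enc.fit_transform(crops)

print(list(enc.classes_))                        # ['apple', 'maize', 'rice']
print(codes.tolist())                            # [2, 1, 2, 0]
print(enc.inverse_transform([0, 2]).tolist())    # ['apple', 'rice']
```

The integers carry no meaning beyond identity: "rice" being 2 and "apple" being 0 says nothing about how similar the crops are, which is worth keeping in mind when the code is used as a numeric model input.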


In [13]:
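# Redefine X and y: reusing the classifier's X and y raised "ValueError: could not convert string to float: 'orange'".
X = data[['label_encoded']]                        # input: encoded crop label
y = data.drop(columns=['label', 'label_encoded'])  # targets: the seven growing conditions
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = RandomForestRegressor(random_state=42)
model.fit(X_train, y_train)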

In [None]:
y_pred = model.predict(X_test)
mae = mean_absolute_error(y_test, y_pred)
print(f"Mean Absolute Error: {mae}")


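def predict_conditions(crop_name):
    """Predict growing conditions for a crop name; returns a one-row DataFrame."""
    # Convert the crop name to the integer code produced by the fitted encoder
    encoded_label = label_encoder.transform([crop_name])[0]
    # The regressor expects 2-D input: one row with one feature (the encoded label)
    predicted_conditions = model.predict([[encoded_label]])
    return pd.DataFrame(predicted_conditions, columns=y.columns)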

In [None]:
crop_name = "pomegranate"
ideal_conditions = predict_conditions(crop_name)
print(f"Ideal conditions for {crop_name}:")
print(ideal_conditions)
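Because the regressor's only input is the crop label, its output for a crop can be expected to sit close to the average of that crop's training rows. A per-crop mean computed with `groupby` is therefore a simple baseline to compare against. The sketch below uses a tiny hypothetical table; `mean_conditions` is an illustrative helper, not part of the notebook.

```python
import pandas as pd

# Hypothetical rows, for illustration only (not the real dataset).
toy = pd.DataFrame({
    "label": ["rice", "rice", "maize", "maize"],
    "temperature": [20.0, 24.0, 18.0, 22.0],
    "rainfall": [200.0, 240.0, 60.0, 100.0],
})

# One row per crop, holding the mean of every condition column.
crop_means = toy.groupby("label").mean()

def mean_conditions(crop_name):
    """Return the mean conditions for one crop as a one-row DataFrame."""
    return crop_means.loc[[crop_name]]

print(mean_conditions("rice"))
```

On the real data the same idea would be `data.drop(columns=['label_encoded']).groupby('label').mean()`, assuming the `label_encoded` column added earlier is present.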