# **OPEN-ARC**
---

### Project 7: Crop Recommendation Model:
**Challenge:** Create an AI model that recommends crops from soil and climate feature values, with the goal of optimizing crop yields.


### Terms and Use:
Learn more about the project's [LICENSE](https://github.com/Infinitode/OPEN-ARC/blob/main/LICENSE) and read our [CODE_OF_CONDUCT](https://github.com/Infinitode/OPEN-ARC/blob/main/CODE_OF_CONDUCT) before contributing to the project. You can contribute to this project from here: [https://github.com/Infinitode/OPEN-ARC/](https://github.com/Infinitode/OPEN-ARC/).

---

Please fill out this performance sheet to help others quickly see your model's performance **(optional)**:

### Performance Sheet:
| Contributor | Architecture Type | Platform | Base Model | Dataset | Accuracy | Link |
|-------------|-------------------|----------|------------|---------|----------|------|
| Infinitode  | XGBClassifier  | Kaggle   | ✔  | Crop Recommendation Dataset | 98.6%    | [Notebook](https://github.com/Infinitode/OPEN-ARC/blob/main/Project-7-CR/project-7-cr.ipynb) |
| Username  | Unknown  | Kaggle   | ✗/✔  | Crop Recommendation Dataset | Score    | [Notebook](https://github.com) |

---

### Model: XGBoost Classifier:
This implementation uses XGBoost's `XGBClassifier`, a highly optimized and efficient gradient-boosted tree classifier that suits this tabular use case. We also use scikit-learn's `LabelEncoder` to encode the target `"Crop"` strings as integers.
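
To make the label-encoding step concrete, here is a minimal sketch of how `LabelEncoder` maps crop names to integers and back. The four crop labels are hypothetical stand-ins for the dataset's `"Crop"` column, not rows taken from it.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical crop labels, standing in for the dataset's "Crop" column
crops = ["Rice", "Maize", "Rice", "Mango"]

le = LabelEncoder()
encoded = le.fit_transform(crops)        # classes are sorted alphabetically
decoded = le.inverse_transform(encoded)  # map the integers back to names

print(le.classes_.tolist())  # ['Maize', 'Mango', 'Rice']
print(encoded.tolist())      # [2, 0, 2, 1]
print(decoded.tolist())      # ['Rice', 'Maize', 'Rice', 'Mango']
```

The same fitted encoder has to be reused at prediction time, because the integer assigned to each crop depends on the set of labels seen during `fit`.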

### Import the necessary libraries

In [1]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from xgboost import XGBClassifier
from sklearn.metrics import accuracy_score
import random

### Load the dataset and preprocess the data

We'll also check for missing values, encode the target, and separate the target variable from the dataset, leaving only the feature columns in `X`.

In [2]:
# Load dataset
dataset_path = "/kaggle/input/crop-recommendation-dataset/Crop_Recommendation.csv"
df = pd.read_csv(dataset_path)

# Check for missing values
print(df.isnull().sum())

# Encode the target variable 'Crop' (crop recommendation)
le = LabelEncoder()
df['Crop'] = le.fit_transform(df['Crop'])

# Separate features (X) and target (y)
X = df.drop('Crop', axis=1)
y = df['Crop']

Nitrogen       0
Phosphorus     0
Potassium      0
Temperature    0
Humidity       0
pH_Value       0
Rainfall       0
Crop           0
dtype: int64
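
The printed counts above are all zero, so no cleaning is needed here. If the same notebook were run on a different copy of the data, a small guard could turn that visual check into a hard failure. This is only a sketch: `assert_no_missing` is a hypothetical helper, and the toy frame borrows two of the dataset's column names with made-up values.

```python
import pandas as pd

def assert_no_missing(df: pd.DataFrame) -> None:
    """Raise ValueError naming every column that contains missing values."""
    missing = df.isnull().sum()
    bad = missing[missing > 0]
    if not bad.empty:
        raise ValueError(f"Missing values found: {bad.to_dict()}")

# Toy frame (hypothetical values) using two of the dataset's column names
toy = pd.DataFrame({"Nitrogen": [74, 21], "Rainfall": [217.4, 165.8]})
assert_no_missing(toy)  # passes silently because nothing is missing
```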


### Split the dataset into train and test sets

Now we'll split the dataset into train and test sets: `80%` for training and `20%` for testing.

In [3]:
# Split the dataset into 80% training and 20% testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
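
As an illustration of what the 80/20 split produces, the sketch below runs `train_test_split` on toy data. It also passes `stratify`, which the notebook cell above does not use; it is shown as an optional variation that keeps class proportions equal in both sets. The arrays `X_toy` and `y_toy` are hypothetical.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (hypothetical): 100 rows, 2 features, 4 balanced classes
X_toy = np.arange(200).reshape(100, 2)
y_toy = np.repeat([0, 1, 2, 3], 25)

# stratify=y_toy is an optional addition, not part of the original cell
X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.2, random_state=42, stratify=y_toy
)

print(X_tr.shape, X_te.shape)      # (80, 2) (20, 2)
print(np.bincount(y_te).tolist())  # [5, 5, 5, 5]
```

Without `stratify`, the class counts in the test set can vary from class to class, which matters more for small or imbalanced datasets.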

### Initialize and train the model

In [4]:
# Train a model using XGBoost for fast and accurate classification
# Note: recent XGBoost releases ignore `use_label_encoder` and may warn about it
model = XGBClassifier(use_label_encoder=False, eval_metric='mlogloss')

# Fit the model
model.fit(X_train, y_train)

### Evaluate the model on the test set

In [5]:
# Predict on the test set
y_pred = model.predict(X_test)

# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Test Set Accuracy: {accuracy * 100:.2f}%")

Test Set Accuracy: 98.64%


A test set accuracy of `98.64%` is strong. It suggests that the model learns the relationship between the features and the crop labels well and generalizes to data it has not seen during training.
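
For reference, accuracy is simply the fraction of test rows whose predicted label matches the true label. The sketch below computes it by hand in plain Python; `accuracy` is a hypothetical helper written for illustration, and the label lists are made up.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Hypothetical labels: 4 of the 5 predictions match the ground truth
print(accuracy([0, 1, 2, 2, 1], [0, 1, 2, 1, 1]))  # 0.8
```

Because accuracy treats every row equally, it says nothing about which crops are confused with each other. A per-class report or confusion matrix would be the natural next step for that.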

### Test the model on random samples

Now let's test the model on random samples from our test set, and see how it performs.

In [6]:
def test_random_samples(model, X_test, y_test, le, n_samples=6):
    # Select n_samples random indices
    random_indices = random.sample(range(X_test.shape[0]), n_samples)
    
    # Extract the random samples
    X_sample = X_test.iloc[random_indices, :]
    y_true_sample = y_test.iloc[random_indices]
    
    # Predict crop recommendations
    y_pred_sample = model.predict(X_sample)
    
    # Decode the predictions and ground truth back to crop names
    crops_pred = le.inverse_transform(y_pred_sample)
    crops_true = le.inverse_transform(y_true_sample)
    
    # Display the results
    for i in range(n_samples):
        print(f"Sample {i+1}:")
        print(f"Features: \n{X_sample.iloc[i]}")
        print(f"Predicted Crop: {crops_pred[i]}")
        print(f"Ground Truth: {crops_true[i]}")
        print("-" * 30)

# Test the function with random samples
test_random_samples(model, X_test, y_test, le)

Sample 1:
Features: 
Nitrogen        74.000000
Phosphorus      54.000000
Potassium       38.000000
Temperature     25.655535
Humidity        83.470211
pH_Value         7.120273
Rainfall       217.378858
Name: 56, dtype: float64
Predicted Crop: Rice
Ground Truth: Rice
------------------------------
Sample 2:
Features: 
Nitrogen        21.000000
Phosphorus      20.000000
Potassium       31.000000
Temperature     25.600337
Humidity        99.724010
pH_Value         5.855458
Rainfall       165.824873
Name: 1891, dtype: float64
Predicted Crop: Coconut
Ground Truth: Coconut
------------------------------
Sample 3:
Features: 
Nitrogen       20.000000
Phosphorus     19.000000
Potassium      35.000000
Temperature    34.177198
Humidity       50.621616
pH_Value        6.113935
Rainfall       98.006880
Name: 1105, dtype: float64
Predicted Crop: Mango
Ground Truth: Mango
------------------------------
Sample 4:
Features: 
Nitrogen       21.000000
Phosphorus     31.000000
Potassium      32.000000
Te

The model's output clearly shows that it has recommended the correct type of crop for all of the random test samples in this set.
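
Note that `random.sample` is called without a seed, so a different set of rows is drawn on every run. If repeatable spot checks are wanted, one option is to draw the indices from a seeded generator. The sketch below shows that idea; `pick_indices` is a hypothetical helper, and the row count of 100 is made up.

```python
import random

def pick_indices(n_rows, n_samples=6, seed=None):
    """Return n_samples distinct row positions; a fixed seed repeats the draw."""
    rng = random.Random(seed)  # local generator, leaves global state untouched
    return rng.sample(range(n_rows), n_samples)

# Hypothetical test-set size of 100 rows
first = pick_indices(100, seed=42)
second = pick_indices(100, seed=42)
print(first == second)  # True
```

The returned positions could then be used with `X_test.iloc[...]` and `y_test.iloc[...]` in the same way as `random_indices` in the cell above.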

### (Optional) Save the model for later use

In [7]:
import pickle

# Save the model to a file
with open('crop_recommendation_model.pkl', 'wb') as model_file:
    pickle.dump(model, model_file)

# Save the label encoder
with open('label_encoder.pkl', 'wb') as le_file:
    pickle.dump(le, le_file)

print("Model and LabelEncoder saved successfully!")

Model and LabelEncoder saved successfully!
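
The saved files can be read back with the matching `pickle.load` calls. The sketch below is one possible loading counterpart, assuming both files sit in the current working directory under the names used above; `load_artifacts` is a hypothetical helper, not part of the original notebook.

```python
import pickle

def load_artifacts(model_path="crop_recommendation_model.pkl",
                   encoder_path="label_encoder.pkl"):
    """Load the pickled model and label encoder from disk."""
    with open(model_path, "rb") as model_file:
        model = pickle.load(model_file)
    with open(encoder_path, "rb") as le_file:
        le = pickle.load(le_file)
    return model, le
```

After `model, le = load_artifacts()`, predictions on new rows can be decoded with `le.inverse_transform(model.predict(new_rows))`, provided `new_rows` has the same feature columns in the same order as `X`. Unpickling needs `xgboost` and `scikit-learn` installed, and pickle files should only be loaded from trusted sources.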


### The End:

This is the end of this project notebook. Feel free to experiment and contribute to help improve the model and implementation. You can browse more of our free, open-source projects in our GitHub repository: https://github.com/Infinitode/OPEN-ARC. If you like this project, please star the repo and contribute your own implementation, or help others in the community.

~ Infinitode