## Crop Recommendation Model Training with Comparison

This notebook trains two machine learning models, K-Nearest Neighbors (KNN) and Gaussian Naive Bayes, to recommend crops from environmental features: NPK values, temperature, humidity, pH, and rainfall. We then compare their test accuracy to decide which model to use.

### Import Necessary Libraries

Import the libraries required for data handling, machine learning operations, and model evaluation.

In [None]:
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score
from joblib import dump

### Load and Inspect the Dataset

Load the crop recommendation data from the provided CSV file and inspect the first few rows to understand its structure.

In [None]:
data = pd.read_csv('Crop_recommendation.csv')
print(data.head())
print('Shape of Dataset:', data.shape)

### Data Preprocessing

Separate the features from the labels, then standardize the features so that each has zero mean and unit variance. Scaling matters most for distance-based models such as KNN, where features with large numeric ranges would otherwise dominate.

In [None]:
# Split the table into the target column and the input features
labels = data['label']
features = data.drop('label', axis=1)
print('Labels:')
print(labels.head())
print('Features:')
print(features.head())

In [None]:
# Standardize the features; note that the scaler is fitted on the full dataset, before the train/test split
scaler = StandardScaler()
scaled_features = scaler.fit_transform(features)
features = pd.DataFrame(scaled_features, columns=features.columns)
print(features.head())

In [None]:
# Hold out 20% of the samples for testing; the fixed seed makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(features, labels, test_size=0.2, random_state=42)
print('Training set size:', X_train.shape)
print('Testing set size:', X_test.shape)

### Model Training: Gaussian Naive Bayes

In [None]:
# Train Gaussian Naive Bayes and measure accuracy on the held-out test set
gnb_model = GaussianNB()
gnb_model.fit(X_train, y_train)
y_pred_gnb = gnb_model.predict(X_test)
accuracy_gnb = accuracy_score(y_test, y_pred_gnb)
print('Gaussian Naive Bayes Accuracy:', accuracy_gnb)

In [None]:
# Train K-Nearest Neighbors (k=3) and measure accuracy on the same test set
knn_model = KNeighborsClassifier(n_neighbors=3)
knn_model.fit(X_train, y_train)
y_pred_knn = knn_model.predict(X_test)
accuracy_knn = accuracy_score(y_test, y_pred_knn)
print('KNN Accuracy:', accuracy_knn)

In [None]:
# Keep the model with the higher test accuracy; a tie falls back to KNN
better_model = gnb_model if accuracy_gnb > accuracy_knn else knn_model
model_type = 'Gaussian Naive Bayes' if accuracy_gnb > accuracy_knn else 'K-Nearest Neighbors'
print(f'The better model based on accuracy is {model_type}.')

In [None]:
dump(better_model, 'crop_model.pkl')
dump(scaler, 'crop_scaler.pkl')
print('Model and scaler saved to disk.')
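
The saved model and scaler are only useful together: a new sample has to pass through the same scaler before it reaches the model. The sketch below shows one way this could look. The helper `recommend_crop`, the column names in `FEATURE_COLUMNS`, and the two-crop synthetic dataset are assumptions made for illustration and do not come from the notebook or the real CSV. The file names mirror the ones used above, but are written to a temporary directory so the example runs on its own.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd
from joblib import dump, load
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Assumed column names; the header of the real CSV may differ.
FEATURE_COLUMNS = ["N", "P", "K", "temperature", "humidity", "ph", "rainfall"]


def recommend_crop(sample, model_path, scaler_path):
    """Hypothetical helper: load the saved scaler and model, then predict one crop."""
    scaler = load(scaler_path)
    model = load(model_path)
    # Build a one-row frame with the same column order used during training.
    row = pd.DataFrame([sample], columns=FEATURE_COLUMNS)
    scaled = pd.DataFrame(scaler.transform(row), columns=FEATURE_COLUMNS)
    return model.predict(scaled)[0]


# Synthetic stand-in data: two made-up crops with clearly separated feature values.
rng = np.random.default_rng(0)
low = rng.normal(loc=20.0, scale=1.0, size=(30, len(FEATURE_COLUMNS)))
high = rng.normal(loc=80.0, scale=1.0, size=(30, len(FEATURE_COLUMNS)))
demo_features = pd.DataFrame(np.vstack([low, high]), columns=FEATURE_COLUMNS)
demo_labels = ["crop_a"] * 30 + ["crop_b"] * 30

# Same steps as the notebook: scale, fit, then dump both objects.
demo_scaler = StandardScaler()
demo_scaled = pd.DataFrame(
    demo_scaler.fit_transform(demo_features), columns=FEATURE_COLUMNS
)
demo_model = KNeighborsClassifier(n_neighbors=3).fit(demo_scaled, demo_labels)

with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "crop_model.pkl"
    scaler_path = Path(tmp) / "crop_scaler.pkl"
    dump(demo_model, model_path)
    dump(demo_scaler, scaler_path)

    # A sample near the "high" cluster should be assigned to crop_b.
    sample = dict.fromkeys(FEATURE_COLUMNS, 80.0)
    prediction = recommend_crop(sample, model_path, scaler_path)

print("Recommended crop:", prediction)
```

With the real artifacts, the same helper would be called with `'crop_model.pkl'` and `'crop_scaler.pkl'` and a sample whose keys match the training columns. Passing a DataFrame rather than a bare array keeps the feature names consistent with those seen when the scaler and model were fitted.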