# Crop Recommendation 

<img src='https://www.theenglishgarden.co.uk/wp-content/uploads/2016/08/apple-orchard-web-696x464.jpg' width=900 height=450 align='center'/>

### Introduction

As climate change climbs the chart of existential threats, soil is getting a lot of attention. Back when it supported forest or grassland, before we cleared it to grow crops, it stored an awful lot of carbon. By farming the land, we released the carbon. Now, there’s a major push to figure out how to put at least some of it back.

Growing perennial fruit crops can be part of the climate change solution. Perennial fruit crops such as apple, banana, grape, and citrus are important components of the human diet, providing fiber, vitamins, antioxidants, and nutrients such as manganese and calcium that can improve human health but are often limited in cereals and other major food crops. 
Perennial horticultural crops are also high-value agricultural commodities that are important for local as well as global economies. For example, coffee, grape, banana, and apple were ranked among the top 20 agricultural commodities by value. 

Perennial crops deliver additional benefits in agroecosystems such as carbon sequestration, erosion protection, biodiversity, and water retention. 

In this project we will build a recommendation system for perennial crops based on soil and climate conditions. The perennial crops in this dataset are: pomegranate, banana, mango, grapes, apple, orange, papaya, coconut and coffee.
Jute and cotton are also perennial crops, but they are mainly grown as annual crops, so we leave them out. Treating them as annual crops and rotating the crop on the field each year helps to minimize disease problems.

### Import Libraries

In [None]:
# Importing pandas and Series + DataFrame:
import pandas as pd
from pandas import Series, DataFrame
from pandas_summary import DataFrameSummary

# Importing numpy, matplotlib and seaborn:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

# Imports for plotly:
import plotly.graph_objs as go
import plotly.express as px
import plotly.figure_factory as ff
from plotly.subplots import make_subplots

# Ignore warnings:
import warnings
warnings.filterwarnings("ignore")

# Importing os:
import os

### Data Loading

In [None]:
# Data files are available in the "../input/..." directory:
import os

folder_dir = '../input/crop-recommendation-dataset/'
print(os.listdir(folder_dir))

In [None]:
# Assign the data to a crop dataset:
crop = pd.read_csv(folder_dir+'Crop_recommendation.csv')

# Unique crop labels:
print(crop.label.unique())

In [None]:
# Create dataset with perennial crops only (.copy() so that new columns can be added safely):
perennials = ['pomegranate', 'banana', 'mango', 'grapes', 'apple', 'orange', 'papaya', 'coconut', 'coffee']
pers = crop[crop['label'].isin(perennials)].copy()

In [None]:
# Create numerical label:
pers['num_label'] = pers.label.map({'pomegranate':0, 'banana':1, 'mango':2, 'grapes':3, 'apple':4, 'orange':5, 'papaya':6, 'coconut':7, 'coffee':8})

In [None]:
# Summary statistics for each variable:
stats = DataFrameSummary(pers).summary()
stats.transpose()

DATA FIELDS DESCRIPTION:
 - N - ratio of nitrogen content in soil
 - P - ratio of phosphorus content in soil
 - K - ratio of potassium content in soil
 - temperature - temperature in degrees Celsius
 - humidity - relative humidity in %
 - ph - pH value of the soil
 - rainfall - rainfall in mm
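
Before plotting, it can help to confirm that the fields listed above hold plausible values. The sketch below uses plain pandas on a few made-up rows that mimic the dataset's columns; the bounds and the helper name `out_of_range` are assumptions for illustration, not part of the dataset documentation.

```python
import pandas as pd

# Hypothetical rows that mimic the dataset's columns; the values are made up.
sample = pd.DataFrame({
    'N': [90, 20, 100],
    'P': [42, 130, 25],
    'K': [43, 200, 30],
    'temperature': [20.9, 22.6, 25.5],
    'humidity': [82.0, 92.3, 58.1],
    'ph': [6.5, 5.9, 6.8],
    'rainfall': [202.9, 112.7, 158.1],
    'label': ['banana', 'apple', 'coffee'],
})

# Plausible physical bounds (assumed for illustration only).
bounds = {'humidity': (0, 100), 'ph': (0, 14), 'rainfall': (0, None)}

def out_of_range(df, bounds):
    """Count the rows that fall outside the given bounds, per column."""
    counts = {}
    for col, (low, high) in bounds.items():
        mask = df[col] < low
        if high is not None:
            mask = mask | (df[col] > high)
        counts[col] = int(mask.sum())
    return counts

missing = int(sample.isna().sum().sum())   # total number of missing cells
violations = out_of_range(sample, bounds)  # rows outside the assumed bounds
print(missing, violations)
```

On the real data, the same two checks could be run on `pers` instead of `sample`.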

### Exploratory Data Analysis

#### Chemical Properties of Soil

Nitrogen (N), potassium (K) and phosphorus (P) content and pH are all chemical soil properties. Below are boxplots for each of the properties, split by perennial crop.

In [None]:
# Boxplot for soil properties:

fig = go.Figure()

# Add Traces:
fig.add_trace(go.Box(x=pers['label'], y=pers['N']))
fig.add_trace(go.Box(x=pers['label'], y=pers['K'], visible=False))  
fig.add_trace(go.Box(x=pers['label'], y=pers['P'], visible=False)) 
fig.add_trace(go.Box(x=pers['label'], y=pers['ph'], visible=False)) 

# Add Buttons:
fig.update_layout(
    updatemenus=[
        dict(
            active=0,
            buttons=list([ 
                
                dict(label='N',
                     method='update',
                     args=[{'visible': [True, False,False,False]},
                           {'title': 'Boxplots for Soil Properties - Nitrogen'}]),
  
                dict(label='K',
                     method='update',
                     args=[{'visible': [False, True, False, False]},
                           {'title': 'Boxplots for Soil Properties - Potassium'}]),
                
                dict(label='P',
                     method='update',
                     args=[{'visible': [False, False, True, False]},
                           {'title': 'Boxplots for Soil Properties - Phosphorus'}]),
                
                dict(label='ph',
                     method='update',
                     args=[{'visible': [False, False, False, True]},
                           {'title': 'Boxplots for Soil Properties - pH'}]),
                
               
            ]),
        )
    ])

# Set title:
fig.update_layout(title_text='Boxplots for Soil Properties')

fig.show()
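
The button definitions above repeat the same pattern: button `i` shows trace `i` and hides the rest. As a sketch, the list could be generated with a small helper. The name `make_buttons` is hypothetical, and the titles here reuse the short trace names rather than the full element names used above.

```python
def make_buttons(names, title_prefix):
    """Build one 'update' button per trace; button i shows only trace i."""
    buttons = []
    for i, name in enumerate(names):
        # Visibility mask: True for the selected trace, False for all others.
        visible = [j == i for j in range(len(names))]
        buttons.append(dict(
            label=name,
            method='update',
            args=[{'visible': visible},
                  {'title': f'{title_prefix} - {name}'}],
        ))
    return buttons

soil_buttons = make_buttons(['N', 'K', 'P', 'ph'], 'Boxplots for Soil Properties')
print(soil_buttons[1]['args'][0]['visible'])
```

The result can be passed to the figure as `updatemenus=[dict(active=0, buttons=soil_buttons)]`, with the traces added in the same order as the names.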

In this dataset, banana and coffee need nitrogen-rich soil (80-120). Grape and apple trees like soil with a high level of potassium (195-205), while oranges require the lowest level. Similarly, grape and apple thrive in phosphorus-rich soil, followed by banana and papaya. Orange and coconut do not require soil with a lot of phosphorus.
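
Readings like these can be backed with numbers by taking the median of each property per crop. The sketch below runs on a tiny made-up table, so the values are illustrative only; in the notebook the same `groupby` could be applied to `pers`.

```python
import pandas as pd

# Made-up values for illustration only; the real notebook would use `pers`.
toy = pd.DataFrame({
    'label': ['banana', 'banana', 'orange', 'orange', 'apple', 'apple'],
    'N': [100, 110, 20, 30, 20, 25],
    'K': [50, 52, 10, 12, 200, 198],
})

# Median of each nutrient per crop, sorted so the most nitrogen-rich crop comes first.
medians = (toy.groupby('label')[['N', 'K']]
              .median()
              .sort_values('N', ascending=False))
print(medians)
```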

#### Climate conditions

The dataset we are working with in this project was collected in India. Due to climate change, temperatures and rainfall amounts in India vary from year to year and influence how much farmers can produce.

In [None]:
# Boxplot with dropdown menu for climate features:

fig = go.Figure()

# Add Traces

fig.add_trace(go.Box(x=pers['label'], y=pers['temperature']))
fig.add_trace(go.Box(x=pers['label'], y=pers['humidity'], visible=False))  
fig.add_trace(go.Box(x=pers['label'], y=pers['rainfall'], visible=False))  

# Add Buttons

fig.update_layout(
    updatemenus=[
        dict(
            active=0,
            buttons=list([ 
                
                dict(label='temperature',
                     method='update',
                     args=[{'visible': [True, False,False]},
                           {'title': 'Boxplot for Climate - temperature'}]),
  
                dict(label='humidity',
                     method='update',
                     args=[{'visible': [False, True, False]},
                           {'title': 'Boxplot for Climate - humidity'}]),
                
                dict(label='rainfall',
                     method='update',
                     args=[{'visible': [False, False, True]},
                           {'title': 'Boxplot for Climate - rainfall'}]),
                
               
            ]),
        )
    ])

# Set title
fig.update_layout(title_text='Boxplot for Climate')

fig.show()

### Crop Recommendation Models

The dataset is small and could use more data, but this exercise is just for fun :) We will use the following models:
 - Decision Tree Classifier
 - SVM Classifier
 - Naive Bayes Classifier
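
With a small dataset, a single train/test split can give a noisy accuracy estimate. One hedged way to compare the three models is k-fold cross-validation. The sketch below uses scikit-learn's bundled wine dataset as a stand-in for the crop data, so the scores it prints say nothing about crops; the hyperparameters mirror the ones used in this notebook.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Stand-in data: scikit-learn's bundled wine dataset, not the crop data.
X_demo, y_demo = load_wine(return_X_y=True)

models = {
    'Decision Tree': DecisionTreeClassifier(max_depth=8, random_state=0),
    'SVM': SVC(kernel='linear', C=1),
    'Naive Bayes': GaussianNB(),
}

# Mean accuracy over 5 folds for each model.
cv_scores = {name: cross_val_score(model, X_demo, y_demo, cv=5).mean()
             for name, model in models.items()}

for name, score in cv_scores.items():
    print(f'{name}: {score:.3f}')
```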

In [None]:
from sklearn.utils import shuffle
# Fix the seed so the shuffle is reproducible:
pers = shuffle(pers, random_state = 1234)

In [None]:
from sklearn.model_selection import train_test_split

X = pers.drop(['label', 'num_label'], axis = 1)
y = pers.label

# Divide X, y into train and test data:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 1234)
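
The split above is random, so the test set may not contain the same number of samples from every crop. As a sketch, `train_test_split` accepts a `stratify` argument that keeps class proportions equal in both parts. The toy labels below are hypothetical and only show the effect.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy labels: 3 classes with 10 samples each (hypothetical, for illustration).
y_toy = np.repeat(['apple', 'banana', 'coffee'], 10)
X_toy = np.arange(len(y_toy)).reshape(-1, 1)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.2, random_state=1234, stratify=y_toy)

# Count how many test samples each class contributes.
labels, counts = np.unique(y_te, return_counts=True)
print(dict(zip(labels, counts)))
```

In the notebook this would amount to adding `stratify = y` to the existing call.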

#### Decision Tree Classifier

In [None]:
# Train a DecisionTreeClassifier:
from sklearn.tree import DecisionTreeClassifier
dtree = DecisionTreeClassifier(max_depth = 8).fit(X_train, y_train)

In [None]:
accuracy = dtree.score(X_test, y_test)
print('Accuracy score for DecisionTreeClassifier: ' + str(accuracy))

In [None]:
dtree_preds = dtree.predict(X_test)

# Import and print classification report:
from sklearn.metrics import classification_report
print(classification_report(y_test, dtree_preds))

In [None]:
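# Create a confusion matrix with a fixed class order:
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, dtree_preds, labels=dtree.classes_)

# Plot heatmap for the confusion matrix (rows: true crop, columns: predicted crop):
plt.figure(figsize = (14,10))
sns.heatmap(cm, cmap='YlGnBu', annot = True, fmt = 'g',
            xticklabels=dtree.classes_, yticklabels=dtree.classes_)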

#### SVM Classifier

In [None]:
# Train an SVM classifier:
from sklearn.svm import SVC
svm = SVC(kernel = 'linear', C = 1).fit(X_train, y_train)
svm_preds = svm.predict(X_test)

# Print classification report:
print(classification_report(y_test, svm_preds))

In [None]:
# Model accuracy for X_test  
accuracy = svm.score(X_test, y_test)
print('Accuracy score for SVM Classifier: ' + str(accuracy))
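
SVMs are sensitive to feature scale, and the features here live on very different ranges (pH versus rainfall in mm, for example). A common remedy, sketched below, is to standardize the features inside a pipeline so the scaler is fitted on the training data only. The wine dataset is again a stand-in for the crop data, and the variable names are illustrative.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in data: scikit-learn's bundled wine dataset, not the crop data.
X_demo, y_demo = load_wine(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=1234, stratify=y_demo)

# The scaler is fitted on the training split only, then applied before the SVM.
scaled_svm = make_pipeline(StandardScaler(), SVC(kernel='linear', C=1))
scaled_svm.fit(X_tr, y_tr)

scaled_accuracy = scaled_svm.score(X_te, y_te)
print(f'Accuracy with scaling: {scaled_accuracy:.3f}')
```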

In [None]:
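# Create a confusion matrix with a fixed class order:
cm = confusion_matrix(y_test, svm_preds, labels=svm.classes_)

# Plot heatmap for the confusion matrix (rows: true crop, columns: predicted crop):
plt.figure(figsize = (14,10))
sns.heatmap(cm, cmap='YlGnBu', annot = True, fmt = 'g',
            xticklabels=svm.classes_, yticklabels=svm.classes_)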

#### Naive Bayes Classifier

In [None]:
# Train a Naive Bayes classifier
from sklearn.naive_bayes import GaussianNB
gnb = GaussianNB().fit(X_train, y_train)
gnb_preds = gnb.predict(X_test)

print(classification_report(y_test, gnb_preds))

In [None]:
# accuracy on X_test
accuracy = gnb.score(X_test, y_test)
print('Accuracy score for Naive Bayes Classifier: ' + str(accuracy))

In [None]:
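# Create a confusion matrix with a fixed class order:
cm = confusion_matrix(y_test, gnb_preds, labels=gnb.classes_)

# Plot heatmap for the confusion matrix (rows: true crop, columns: predicted crop):
plt.figure(figsize = (14,10))
sns.heatmap(cm, cmap='YlGnBu', annot = True, fmt = 'g',
            xticklabels=gnb.classes_, yticklabels=gnb.classes_)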
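
To turn a trained classifier into an actual recommendation, one option is to rank crops by predicted probability for a given set of soil and climate measurements. The sketch below is a minimal illustration on made-up two-feature data; the helper `recommend`, the feature centres, and the sample values are all hypothetical.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)

# Hypothetical (N, K) centres for three crops; values are made up for illustration.
centres = {'banana': (100, 50), 'apple': (20, 200), 'orange': (20, 10)}
X_toy = np.vstack([rng.normal(c, 5, size=(30, 2)) for c in centres.values()])
y_toy = np.repeat(list(centres), 30)

model = GaussianNB().fit(X_toy, y_toy)

def recommend(model, features, top_k=2):
    """Return the top_k (crop, probability) pairs for one feature vector."""
    proba = model.predict_proba([features])[0]
    order = np.argsort(proba)[::-1][:top_k]  # indices of the highest probabilities
    return [(str(model.classes_[i]), float(proba[i])) for i in order]

top = recommend(model, [98, 52])
print(top)
```

The same helper would work with `gnb` or `dtree` from this notebook, given a full feature vector in the column order of `X`. `SVC` only exposes `predict_proba` when created with `probability=True`.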