## Crop and Fertilizer Recommendation System using ML

In [2]:
# Context
    #Agriculture plays a crucial role in food security and economic stability. Selecting the right crop based on soil and climate conditions can significantly enhance yield and sustainability. This dataset is designed to assist in recommending suitable crops and fertilizers based on key soil and environmental parameters.

#Problem Statement
    #Develop a machine learning model to accurately predict the most suitable crop and recommend appropriate fertilizers based on soil composition and climatic conditions.

#Dataset Description
    #The dataset contains various agricultural parameters as predictor variables and a target variable, "label", which represents the recommended crop.

#Predictor Variables

#N (Nitrogen):
    #Nitrogen content in the soil (measured in mg/kg).

#P (Phosphorus):
    #Phosphorus content in the soil (measured in mg/kg).

#K (Potassium):
    #Potassium content in the soil (measured in mg/kg).

#Temperature:
    #Atmospheric temperature (in degrees Celsius).

#Humidity:
    #Relative humidity in percentage (%).

#pH:
    #Soil pH level (indicating soil acidity or alkalinity).

#Rainfall:
    #Annual rainfall (in mm).

#Target Variable

#Label (Crop Type):
    #The recommended crop for cultivation based on soil and climate conditions.
    #Includes 22 crop categories, such as rice, maize, mango, apple, coffee, and cotton.

#Potential Additional Feature

#Fertilizer Recommendation:
    #Based on soil deficiencies, a model could suggest appropriate fertilizers (nitrogen-, phosphorus-, or potassium-based).
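The fertilizer idea described in the notes can be sketched as a simple rule-based check. This is a minimal illustration rather than a trained model: the `TARGET_LEVELS` thresholds and the `suggest_fertilizers` helper are hypothetical, and real target values would depend on the crop and region.

```python
# Hypothetical target nutrient levels (assumed for illustration only)
TARGET_LEVELS = {"N": 80, "P": 40, "K": 40}

# Map each nutrient to the fertilizer family mentioned in the notes
FERTILIZER_TYPE = {
    "N": "Nitrogen-based",
    "P": "Phosphorus-based",
    "K": "Potassium-based",
}


def suggest_fertilizers(soil, targets=TARGET_LEVELS):
    """Return fertilizer types for nutrients that fall below their target level."""
    suggestions = []
    for nutrient, target in targets.items():
        if soil[nutrient] < target:
            suggestions.append(FERTILIZER_TYPE[nutrient])
    return suggestions


# Illustrative soil reading (made-up values, not a dataset row)
sample_soil = {"N": 60, "P": 55, "K": 30}
print(suggest_fertilizers(sample_soil))
```

A learned model could later replace the fixed thresholds, but a rule table like this is easy to inspect and to adjust per crop.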

In [3]:
# Import the necessary libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

In [4]:
# Function to load data from a file
def load_data(file_path):
    """Load dataset from a given file path."""
    try:
        data = pd.read_csv(file_path)
        print("Data loaded successfully!")
        return data
    except FileNotFoundError:
        print("The file was not found. Please check the file path.")
        return None

# Example usage (a raw string keeps backslashes in Windows-style paths from being read as escape sequences)
file_path = r'Crop_recommendation.csv'  # Relative path
crop_data = load_data(file_path)

Data loaded successfully!
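After loading, it can help to confirm that the file has the columns the rest of the notebook relies on. A minimal sketch, assuming the eight column names described for this dataset; the `find_missing_columns` helper and the small `demo` frame are hypothetical and exist only so the example runs without the CSV file.

```python
import pandas as pd

# Column names the notebook expects in Crop_recommendation.csv
EXPECTED_COLUMNS = ["N", "P", "K", "temperature", "humidity", "ph", "rainfall", "label"]


def find_missing_columns(data, expected=EXPECTED_COLUMNS):
    """Return the expected column names that are absent from the DataFrame."""
    return [col for col in expected if col not in data.columns]


# Small in-memory frame with illustrative values and only some of the columns
demo = pd.DataFrame({"N": [90], "P": [42], "K": [43], "label": ["rice"]})
print(find_missing_columns(demo))
```

An empty list means every expected column is present, so the check can be run on `crop_data` right after `load_data` returns.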


In [5]:
# Function to analyze the dataset
def analyze_data(data):
    """Perform basic analysis on the dataset."""
    if data is not None:
        print("\nDataset Preview (Head):")
        print(data.head())

        print("\nDataset Preview (Tail):")
        print(data.tail())

        print("\nDataset Information:")
        data.info()

        print("\nMissing Values in the Dataset:")
        print(data.isnull().sum())

        print("\nNumber of Duplicate Rows in the Dataset:")
        print(data.duplicated().sum())

        print("\nDescriptive Statistics of the Dataset:")
        print(data.describe())

        print("\nColumn Names in the Dataset:")
        print(data.columns)
    else:
        print("No data to analyze.")

# Analyzing the dataset
analyze_data(crop_data)


Dataset Preview (Head):
    N   P   K  temperature   humidity        ph    rainfall label
0  90  42  43    20.879744  82.002744  6.502985  202.935536  rice
1  85  58  41    21.770462  80.319644  7.038096  226.655537  rice
2  60  55  44    23.004459  82.320763  7.840207  263.964248  rice
3  74  35  40    26.491096  80.158363  6.980401  242.864034  rice
4  78  42  42    20.130175  81.604873  7.628473  262.717340  rice

Dataset Preview (Tail):
        N   P   K  temperature   humidity        ph    rainfall   label
2195  107  34  32    26.774637  66.413269  6.780064  177.774507  coffee
2196   99  15  27    27.417112  56.636362  6.086922  127.924610  coffee
2197  118  33  30    24.131797  67.225123  6.362608  173.322839  coffee
2198  117  32  34    26.272418  52.127394  6.758793  127.175293  coffee
2199  104  18  30    23.603016  60.396475  6.779833  140.937041  coffee

Dataset Information:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2200 entries, 0 to 2199
Data columns (total 8 colu
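Beyond the overall statistics, a per-crop summary can show how the features differ between labels. A minimal sketch, assuming the column names shown in the preview; the `per_crop_means` helper is hypothetical, and the small `demo` frame reuses a few values from the head and tail previews so the example runs on its own.

```python
import pandas as pd


def per_crop_means(data, target="label"):
    """Return the mean of every numeric feature for each crop label."""
    return data.groupby(target).mean(numeric_only=True)


# Four rows taken from the head (rice) and tail (coffee) previews, two columns only
demo = pd.DataFrame({
    "N": [90, 85, 107, 99],
    "rainfall": [202.935536, 226.655537, 177.774507, 127.924610],
    "label": ["rice", "rice", "coffee", "coffee"],
})
print(per_crop_means(demo))
```

Calling `per_crop_means(crop_data)` would give one row per crop, which makes it easier to spot which features separate the labels.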

In [6]:
# Function to visualize missing data as a heatmap
def visualize_missing_data(data):
    """Visualize missing data in the dataset using a heatmap."""
    if data.isnull().sum().sum() == 0:
        print("No missing data in the dataset.")
    else:
        plt.figure(figsize=(10, 6))
        sns.heatmap(data.isnull(), cbar=False, cmap="viridis", linewidths=0.5)
        plt.title("Missing Data Heatmap")
        plt.show()

# Check if any missing data exists
print("Missing data count per column:")
print(crop_data.isnull().sum())

# Visualizing missing data
visualize_missing_data(crop_data)

Missing data count per column:
N              0
P              0
K              0
temperature    0
humidity       0
ph             0
rainfall       0
label          0
dtype: int64
No missing data in the dataset.


In [7]:
# Function to analyze value counts of a specific column
def analyze_column(data, column_name):
    """Analyze the value counts of a specific column."""
    if column_name in data.columns:
        print(f"\nValue Counts in the '{column_name}' Column:")
        print(data[column_name].value_counts())
    else:
        print(f"Column '{column_name}' not found.")

# Analyzing value counts of the 'label' column (change the column name if needed)
analyze_column(crop_data, 'label')


Value Counts in the 'label' Column:
label
rice           100
maize          100
jute           100
cotton         100
coconut        100
papaya         100
orange         100
apple          100
muskmelon      100
watermelon     100
grapes         100
mango          100
banana         100
pomegranate    100
lentil         100
blackgram      100
mungbean       100
mothbeans      100
pigeonpeas     100
kidneybeans    100
chickpea       100
coffee         100
Name: count, dtype: int64
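Because every crop has the same number of rows, a stratified split keeps that balance in both the training and test sets. A sketch using only pandas; the `stratified_split` helper, the 20% test fraction, and the seed are assumptions for illustration and do not come from the original notebook.

```python
import pandas as pd


def stratified_split(data, target="label", test_frac=0.2, seed=42):
    """Split a DataFrame so each class contributes the same fraction to the test set."""
    # Sample test_frac of the rows within each class
    test = data.groupby(target).sample(frac=test_frac, random_state=seed)
    # Every row that was not sampled becomes part of the training set
    train = data.drop(test.index)
    return train, test


# Made-up balanced frame: 3 crops x 10 rows each
demo = pd.DataFrame({
    "N": range(30),
    "label": ["rice"] * 10 + ["maize"] * 10 + ["coffee"] * 10,
})
train, test = stratified_split(demo)
print(test["label"].value_counts())
```

The helper relies on a unique index to drop the sampled rows, which holds for the default `RangeIndex` that `pd.read_csv` produces.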
