## DATASET RELEVANCE EXPLANATION

### HOW THIS DATASET SUPPORTS THE MAIZE FARMER AGENT:

The Crop Recommendation Dataset is highly relevant for our maize farmer agent because:

1.  **SOIL NUTRIENT MANAGEMENT**: The `N`, `P`, and `K` columns record nitrogen, phosphorus, and potassium levels for each sample. The agent can compare a field's nutrient levels with those observed in maize samples and decide whether fertilizer is needed.

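2.  **ENVIRONMENTAL DECISION-MAKING**: The `temperature`, `humidity`, and `rainfall` columns let the agent judge whether current conditions resemble those under which maize is recommended, which can inform planting and irrigation decisions. The dataset has no date column, so timing information has to come from other sources.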

3.  **SOIL pH OPTIMIZATION**: pH levels are critical for nutrient availability in maize farming. This data helps the agent assess whether soil amendments are needed.

4.  **CROP SELECTION VALIDATION**: By analyzing conditions where maize performs well versus other crops, the agent can make informed decisions about whether to plant maize or consider alternatives based on current field conditions.

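5.  **PREDICTIVE CAPABILITIES**: The agent can train ML models on this data to predict crop suitability from soil and weather readings, which can help farmers reduce the risk of crop failure. The dataset has no yield column, so such models indicate which crop fits the conditions rather than how much it will yield.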
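
Points 3 and 4 above can be illustrated with a minimal sketch. It assumes that the `label` column contains a `maize` class (not confirmed by the output shown in this notebook) and that the observed minimum and maximum of each feature among maize rows is a usable proxy for suitable conditions. The helper names `maize_ranges` and `out_of_range` are hypothetical, and the toy rows are invented for illustration only.

```python
import pandas as pd

FEATURES = ["N", "P", "K", "temperature", "humidity", "ph", "rainfall"]

def maize_ranges(df, crop="maize"):
    """Return the min and max of each feature over the rows of one crop."""
    rows = df[df["label"] == crop]
    if rows.empty:
        raise ValueError(f"no rows labelled {crop!r}")
    return rows[FEATURES].agg(["min", "max"])

def out_of_range(reading, ranges):
    """List the features of a field reading that fall outside the observed ranges."""
    return [
        f for f in ranges.columns
        if not ranges.loc["min", f] <= reading[f] <= ranges.loc["max", f]
    ]

# Toy rows, invented for illustration (NOT taken from Crop_recommendation.csv)
toy = pd.DataFrame({
    "N": [70, 80, 90], "P": [40, 50, 45], "K": [18, 20, 22],
    "temperature": [20.0, 24.0, 26.0], "humidity": [58.0, 65.0, 70.0],
    "ph": [5.8, 6.3, 6.9], "rainfall": [65.0, 85.0, 105.0],
    "label": ["maize", "maize", "rice"],
})
ranges = maize_ranges(toy)

# Hypothetical field reading; only ph lies outside the toy maize range
field = {"N": 75, "P": 45, "K": 19, "temperature": 22.0,
         "humidity": 60.0, "ph": 7.5, "rainfall": 70.0}
print(out_of_range(field, ranges))  # prints ['ph']
```

With the real data, `maize_ranges(df)` would replace the toy frame once `df` has been loaded. A min/max envelope is deliberately crude; a trained classifier would likely capture interactions between features that this check ignores.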

In [1]:
# Import required libraries

import pandas as pd
import numpy as np

In [3]:
# Load the dataset (the CSV file is expected in the working directory)
df = pd.read_csv('Crop_recommendation.csv')

In [None]:
# 1. DATASET SHAPE
print("\n--- 1. DATASET SHAPE ---")
print(f"Number of rows: {df.shape[0]}")
print(f"Number of columns: {df.shape[1]}")
print(f"Total cells: {df.shape[0] * df.shape[1]}")

In [4]:
# 2. COLUMN NAMES AND DATA TYPES
print("\n COLUMN NAMES AND DATA TYPES ")

# Build a per-column overview: dtype plus non-null and null counts
dtype_summary = pd.DataFrame({
    'Column': df.columns,
    'Data Type': df.dtypes.values,
    'Non-Null Count': df.count().values,
    'Null Count': df.isnull().sum().values
})

print(dtype_summary.to_string(index=False))

print(f"\nTotal Columns: {len(df.columns)}")
print(f"Numerical: {len(df.select_dtypes(include=[np.number]).columns)}")
print(f"Categorical: {len(df.select_dtypes(include=['object']).columns)}")


 COLUMN NAMES AND DATA TYPES 
     Column Data Type  Non-Null Count  Null Count
          N     int64            2200           0
          P     int64            2200           0
          K     int64            2200           0
temperature   float64            2200           0
   humidity   float64            2200           0
         ph   float64            2200           0
   rainfall   float64            2200           0
      label    object            2200           0

Total Columns: 8
Numerical: 7
Categorical: 1
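
Since `label` is the only categorical column, a natural follow-up is to check how many rows each crop has and whether a maize class is present at all. The following is a minimal sketch under those assumptions; the helper name `label_summary` is hypothetical and the toy labels are invented, not taken from the dataset.

```python
import pandas as pd

def label_summary(labels):
    """Count rows per class and report each class's share of the total."""
    counts = labels.value_counts()
    return pd.DataFrame({"rows": counts, "share": counts / counts.sum()})

# Toy labels, invented for illustration (NOT taken from Crop_recommendation.csv)
toy_labels = pd.Series(["rice", "rice", "maize", "coffee"], name="label")
summary = label_summary(toy_labels)

print(summary)
print("maize present:", "maize" in summary.index)
```

On the loaded data the call would be `label_summary(df['label'])`. A strongly uneven `share` column would suggest handling class imbalance before training a suitability model.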


In [5]:
# 3. SAMPLE RECORDS - HEAD
print("\n FIRST 5 RECORDS (HEAD)")
print(df.head())




 FIRST 5 RECORDS (HEAD)
    N   P   K  temperature   humidity        ph    rainfall label
0  90  42  43    20.879744  82.002744  6.502985  202.935536  rice
1  85  58  41    21.770462  80.319644  7.038096  226.655537  rice
2  60  55  44    23.004459  82.320763  7.840207  263.964248  rice
3  74  35  40    26.491096  80.158363  6.980401  242.864034  rice
4  78  42  42    20.130175  81.604873  7.628473  262.717340  rice


In [6]:
# 4. SAMPLE RECORDS - TAIL
print("\n LAST 5 RECORDS (TAIL) ")
print(df.tail())




 LAST 5 RECORDS (TAIL) 
        N   P   K  temperature   humidity        ph    rainfall   label
2195  107  34  32    26.774637  66.413269  6.780064  177.774507  coffee
2196   99  15  27    27.417112  56.636362  6.086922  127.924610  coffee
2197  118  33  30    24.131797  67.225123  6.362608  173.322839  coffee
2198  117  32  34    26.272418  52.127394  6.758793  127.175293  coffee
2199  104  18  30    23.603016  60.396475  6.779833  140.937041  coffee


In [7]:
# 5. SUMMARY STATISTICS
print("\nSUMMARY STATISTICS ")
print(df.describe())


SUMMARY STATISTICS 
                 N            P            K  temperature     humidity  \
count  2200.000000  2200.000000  2200.000000  2200.000000  2200.000000   
mean     50.551818    53.362727    48.149091    25.616244    71.481779   
std      36.917334    32.985883    50.647931     5.063749    22.263812   
min       0.000000     5.000000     5.000000     8.825675    14.258040   
25%      21.000000    28.000000    20.000000    22.769375    60.261953   
50%      37.000000    51.000000    32.000000    25.598693    80.473146   
75%      84.250000    68.000000    49.000000    28.561654    89.948771   
max     140.000000   145.000000   205.000000    43.675493    99.981876   

                ph     rainfall  
count  2200.000000  2200.000000  
mean      6.469480   103.463655  
std       0.773938    54.958389  
min       3.504752    20.211267  
25%       5.971693    64.551686  
50%       6.425045    94.867624  
75%       6.923643   124.267508  
max       9.935091   298.560117  
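
The statistics above pool all crops, so they do not describe maize specifically. One way to see how a single crop differs from the pooled data is to compare its feature means with the overall means. This is a sketch only: it assumes a `maize` label exists, the helper name `compare_to_overall` is hypothetical, and the toy rows are invented for illustration.

```python
import pandas as pd

def compare_to_overall(df, crop="maize"):
    """Compare per-feature means of one crop against the means over all rows."""
    numeric = df.select_dtypes(include="number")
    overall = numeric.mean()
    crop_mean = numeric[df["label"] == crop].mean()
    return pd.DataFrame({
        "overall": overall,
        crop: crop_mean,
        "diff": crop_mean - overall,  # positive: the crop sits above the pooled mean
    })

# Toy rows, invented for illustration (NOT taken from Crop_recommendation.csv)
toy = pd.DataFrame({
    "ph": [6.0, 6.4, 7.0],
    "rainfall": [80.0, 100.0, 240.0],
    "label": ["maize", "maize", "rice"],
})
comparison = compare_to_overall(toy)
print(comparison)
```

Calling `compare_to_overall(df)` on the loaded dataset would give one row per numeric column. Means hide the spread within a crop, so `df[df['label'] == 'maize'].describe()` is a reasonable companion check.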
