# Telangana Crop Yield Prediction - Making Predictions

## Overview
This notebook shows you how to use the trained model to make crop yield predictions.

### What You'll Learn:
1. Loading the saved model
2. Preparing input data for predictions
3. Making single and batch predictions
4. Interpreting results
5. Analyzing prediction scenarios

---

## Step 1: Import Libraries

In [None]:
import pandas as pd
import numpy as np
import pickle
import matplotlib.pyplot as plt
import seaborn as sns

print("✅ Libraries imported successfully!")

## Step 2: Load the Trained Model

We'll load the complete model package that was saved during training.

In [None]:
# Load model package
with open('crop_yield_model.pkl', 'rb') as f:
    model_package = pickle.load(f)

# Extract components
model = model_package['model']
scaler = model_package['scaler']
label_encoders = model_package['label_encoders']
feature_names = model_package['feature_names']
performance = model_package['performance']

print("✅ Model Loaded Successfully!\n")
print(f"📊 Model Information:")
print(f"   Name: {model_package['model_name']}")
print(f"   R² Score: {performance['test_r2']:.4f} ({performance['test_r2']*100:.2f}%)")
print(f"   MAE: {performance['mae']:.2f} kg/ha")
print(f"   RMSE: {performance['rmse']:.2f} kg/ha")
print(f"\n   Training Data:")
print(f"   - Total Samples: {model_package['training_info']['total_samples']:,}")
print(f"   - Districts: {model_package['training_info']['districts']}")
print(f"   - Crops: {model_package['training_info']['crops']}")
print(f"   - Features: {model_package['training_info']['n_features']}")
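
Before relying on the package, it can help to confirm that every key this notebook reads is actually present. The following is a minimal sketch, assuming the package is a plain dictionary; `REQUIRED_KEYS` and `missing_package_keys` are hypothetical names introduced here, and the example dictionary is a stand-in rather than the real `model_package`.

```python
# Keys this notebook reads from the saved package (taken from the cell above).
REQUIRED_KEYS = {
    "model", "scaler", "label_encoders", "feature_names",
    "performance", "model_name", "training_info",
}

def missing_package_keys(package, required=REQUIRED_KEYS):
    """Return the sorted list of required keys that are absent from the package."""
    return sorted(set(required) - set(package))

# Stand-in dictionary for illustration only; in the notebook, pass model_package.
example_package = {"model": None, "scaler": None, "feature_names": []}
missing = missing_package_keys(example_package)
if missing:
    print(f"Package is missing keys: {missing}")
```

In the notebook itself, calling `missing_package_keys(model_package)` right after loading should return an empty list if the file was saved with the expected layout.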

### View Available Options

In [None]:
# Get available districts, seasons, and crops
districts = sorted(label_encoders['District'].classes_)
seasons = sorted(label_encoders['Season'].classes_)
crops = sorted(label_encoders['Crop'].classes_)

print(f"📋 Available Options:\n")
print(f"🏘️ Districts ({len(districts)}):")
for i, d in enumerate(districts, 1):
    print(f"   {i:2d}. {d}")

print(f"\n🌾 Seasons ({len(seasons)}):")
for i, s in enumerate(seasons, 1):
    print(f"   {i}. {s}")

print(f"\n🌱 Crops ({len(crops)}):")
for i, c in enumerate(crops, 1):
    if i % 4 == 1:
        print("   ", end="")
    print(f"{c:20s}", end="")
    if i % 4 == 0:
        print()
if len(crops) % 4 != 0:
    print()  # End the last row when it holds fewer than four crops
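
The lists above are the only names the encoders know, so a misspelled district or crop will fail later during encoding. One way to catch this early is sketched below, using the standard library's `difflib` to suggest close matches. `check_category` is a hypothetical helper, and `example_districts` is a short stand-in for the full `districts` list.

```python
import difflib

def check_category(value, allowed, kind="value"):
    """Return value if it is allowed; otherwise raise ValueError with suggestions."""
    if value in allowed:
        return value
    suggestions = difflib.get_close_matches(value, allowed, n=3, cutoff=0.6)
    hint = f" Did you mean: {', '.join(suggestions)}?" if suggestions else ""
    raise ValueError(f"Unknown {kind} '{value}'.{hint}")

# Stand-in list for illustration; in the notebook, pass `districts`, `seasons`, or `crops`.
example_districts = ["Adilabad", "Jagitial", "Medak"]

print(check_category("Medak", example_districts, kind="district"))
try:
    check_category("Jagtial", example_districts, kind="district")
except ValueError as err:
    print(err)
```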

## Step 3: Create Helper Function for Predictions

This function makes it easy to predict yield for any combination of inputs.

In [None]:
def predict_crop_yield(
    district, season, crop, year, area,
    total_rainfall, avg_temp_min, avg_temp_max,
    avg_humidity_min, avg_humidity_max
):
    """
    Predict crop yield based on agricultural and weather parameters.
    
    Parameters:
    -----------
    district : str
        District name (e.g., 'Jagitial', 'Adilabad')
    season : str
        Season ('Kharif' or 'Rabi')
    crop : str
        Crop name (e.g., 'Rice', 'Cotton', 'Maize')
    year : int
        Year (e.g., 2023, 2024)
    area : float
        Area in hectares
    total_rainfall : float
        Total seasonal rainfall in mm
    avg_temp_min : float
        Average minimum temperature in °C
    avg_temp_max : float
        Average maximum temperature in °C
    avg_humidity_min : float
        Average minimum humidity in %
    avg_humidity_max : float
        Average maximum humidity in %
    
    Returns:
    --------
    dict : Prediction results with yield and production estimates
    """
    
    # Calculate derived features
    years_since_start = year - 2018  # Assuming 2018 as baseline
    season_days = 153 if season == 'Kharif' else 151  # Any season other than 'Kharif' uses 151 days
    rainfall_per_day = total_rainfall / season_days
    temp_avg = (avg_temp_min + avg_temp_max) / 2
    temp_range = avg_temp_max - avg_temp_min
    humidity_avg = (avg_humidity_min + avg_humidity_max) / 2
    humidity_range = avg_humidity_max - avg_humidity_min
    
    # Agricultural indicators
    base_temp = 10  # Base temperature (°C) for growing degree days
    gdd = max(0, temp_avg - base_temp)  # Degrees above base, from the season-average temperature
    heat_stress = 1 if avg_temp_max > 35 else 0  # Flag: average max temperature above 35°C
    water_stress = 1 if total_rainfall < 500 else 0  # Flag: seasonal rainfall below 500 mm
    optimal_conditions = 1 if (heat_stress == 0 and water_stress == 0) else 0  # Neither flag set
    
    # Interaction features
    area_rainfall_interaction = area * total_rainfall
    area_temp_interaction = area * temp_avg
    
    # Encode categorical variables (names not seen during training will raise an error here)
    district_encoded = label_encoders['District'].transform([district])[0]
    season_encoded = label_encoders['Season'].transform([season])[0]
    crop_encoded = label_encoders['Crop'].transform([crop])[0]
    
    # Create input dictionary
    input_data = {
        'Year': year,
        'Years_Since_Start': years_since_start,
        'Area': area,
        'Total_Rainfall': total_rainfall,
        'Rainfall_Per_Day': rainfall_per_day,
        'Avg_Temp_Min': avg_temp_min,
        'Avg_Temp_Max': avg_temp_max,
        'Temp_Avg': temp_avg,
        'Temp_Range': temp_range,
        'Avg_Humidity_Min': avg_humidity_min,
        'Avg_Humidity_Max': avg_humidity_max,
        'Humidity_Avg': humidity_avg,
        'Humidity_Range': humidity_range,
        'GDD': gdd,
        'Heat_Stress': heat_stress,
        'Water_Stress': water_stress,
        'Optimal_Conditions': optimal_conditions,
        'Area_Rainfall_Interaction': area_rainfall_interaction,
        'Area_Temp_Interaction': area_temp_interaction,
        'District_Encoded': district_encoded,
        'Season_Encoded': season_encoded,
        'Crop_Encoded': crop_encoded
    }
    
    # Create DataFrame in correct order
    X = pd.DataFrame([input_data])[feature_names]
    
    # Scale and predict
    X_scaled = scaler.transform(X)
    predicted_yield = model.predict(X_scaled)[0]
    
    # Calculate production
    estimated_production = (predicted_yield * area) / 1000  # Convert to tons
    
    return {
        'predicted_yield_kg_per_ha': predicted_yield,
        'estimated_production_tons': estimated_production,
        'input_params': {
            'district': district,
            'season': season,
            'crop': crop,
            'year': year,
            'area_ha': area,
            'rainfall_mm': total_rainfall,
            'temp_range': f"{avg_temp_min}-{avg_temp_max}°C",
            'humidity_range': f"{avg_humidity_min}-{avg_humidity_max}%"
        },
        'conditions': {
            'gdd': gdd,
            'heat_stress': bool(heat_stress),
            'water_stress': bool(water_stress),
            'optimal': bool(optimal_conditions)
        }
    }

print("✅ Prediction function created!")
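
To make the derived features concrete, here is a small worked example that repeats the same arithmetic outside the model. It is only a sketch: `derive_features` is a hypothetical helper that mirrors a subset of the calculations in `predict_crop_yield`, and it needs neither the model nor the encoders.

```python
def derive_features(season, total_rainfall, avg_temp_min, avg_temp_max):
    """Recompute a subset of the derived features used by predict_crop_yield."""
    season_days = 153 if season == "Kharif" else 151
    temp_avg = (avg_temp_min + avg_temp_max) / 2
    return {
        "rainfall_per_day": total_rainfall / season_days,
        "temp_avg": temp_avg,
        "temp_range": avg_temp_max - avg_temp_min,
        "gdd": max(0, temp_avg - 10),           # base temperature of 10°C
        "heat_stress": int(avg_temp_max > 35),  # 1 when max temperature exceeds 35°C
        "water_stress": int(total_rainfall < 500),  # 1 when rainfall is below 500 mm
    }

# Kharif season, 900 mm rainfall, 22-34°C
features = derive_features("Kharif", 900, 22, 34)
for name, value in features.items():
    print(f"{name}: {value}")
```

For these inputs the average temperature is 28°C, so the GDD value is 18, rainfall per day is 900 / 153 (about 5.88 mm), and neither stress flag is set.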

## Step 4: Make Sample Predictions

### Example 1: Rice in Jagitial (Kharif Season)

In [None]:
# Predict rice yield in Jagitial
result = predict_crop_yield(
    district='Jagitial',
    season='Kharif',
    crop='Rice',
    year=2024,
    area=1000,
    total_rainfall=900,
    avg_temp_min=22,
    avg_temp_max=34,
    avg_humidity_min=60,
    avg_humidity_max=85
)

print("🌾 Prediction Results:\n")
print(f"📊 Input Parameters:")
for key, value in result['input_params'].items():
    print(f"   {key.replace('_', ' ').title()}: {value}")

print(f"\n🌡️ Environmental Conditions:")
print(f"   GDD: {result['conditions']['gdd']:.2f}")
print(f"   Heat Stress: {'Yes ⚠️' if result['conditions']['heat_stress'] else 'No ✅'}")
print(f"   Water Stress: {'Yes ⚠️' if result['conditions']['water_stress'] else 'No ✅'}")
print(f"   Optimal Conditions: {'Yes ✅' if result['conditions']['optimal'] else 'No ⚠️'}")

print(f"\n🔮 Predictions:")
print(f"   Yield: {result['predicted_yield_kg_per_ha']:.2f} kg/ha")
print(f"   Total Production: {result['estimated_production_tons']:.2f} tons")
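
When interpreting a single prediction, it may help to show it alongside the model's typical error. The sketch below builds a rough band of prediction ± MAE. This is an informal guide, not a statistical confidence interval, and `rough_yield_band` is a hypothetical helper. The numbers used are placeholders, not model output.

```python
def rough_yield_band(predicted_yield, mae):
    """Return (low, high) as prediction minus/plus MAE, with low floored at zero."""
    low = max(0.0, predicted_yield - mae)
    high = predicted_yield + mae
    return low, high

# Placeholder numbers for illustration only; they are not results from the model.
low, high = rough_yield_band(predicted_yield=3000.0, mae=250.0)
print(f"Rough band: {low:.0f} to {high:.0f} kg/ha")
```

In the notebook, the two arguments would be `result['predicted_yield_kg_per_ha']` and `performance['mae']`.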

### Example 2: Cotton in Adilabad (Kharif Season)

In [None]:
# Predict cotton yield
result2 = predict_crop_yield(
    district='Adilabad',
    season='Kharif',
    crop='Cotton',
    year=2024,
    area=500,
    total_rainfall=800,
    avg_temp_min=24,
    avg_temp_max=36,
    avg_humidity_min=55,
    avg_humidity_max=80
)

print("🌾 Cotton Prediction:\n")
print(f"   District: {result2['input_params']['district']}")
print(f"   Crop: {result2['input_params']['crop']}")
print(f"   Season: {result2['input_params']['season']}")
print(f"\n   Predicted Yield: {result2['predicted_yield_kg_per_ha']:.2f} kg/ha")
print(f"   Total Production: {result2['estimated_production_tons']:.2f} tons")

## Step 5: Scenario Analysis

### Compare Kharif vs Rabi Seasons

In [None]:
# Same conditions, different seasons
kharif_result = predict_crop_yield(
    district='Medak', season='Kharif', crop='Rice', year=2024,
    area=1000, total_rainfall=900,
    avg_temp_min=22, avg_temp_max=34,
    avg_humidity_min=60, avg_humidity_max=85
)

rabi_result = predict_crop_yield(
    district='Medak', season='Rabi', crop='Rice', year=2024,
    area=1000, total_rainfall=900,
    avg_temp_min=22, avg_temp_max=34,
    avg_humidity_min=60, avg_humidity_max=85
)

print("🌾 Season Comparison (Rice in Medak):\n")
print(f"☀️ Kharif (Monsoon):")
print(f"   Yield: {kharif_result['predicted_yield_kg_per_ha']:.2f} kg/ha")
print(f"   Production: {kharif_result['estimated_production_tons']:.2f} tons")

print(f"\n❄️ Rabi (Winter):")
print(f"   Yield: {rabi_result['predicted_yield_kg_per_ha']:.2f} kg/ha")
print(f"   Production: {rabi_result['estimated_production_tons']:.2f} tons")

difference = rabi_result['predicted_yield_kg_per_ha'] - kharif_result['predicted_yield_kg_per_ha']
print(f"\n📊 Difference: {difference:+.2f} kg/ha")
print(f"   Rabi {'higher' if difference > 0 else 'lower'} by {abs(difference):.2f} kg/ha")
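
An absolute difference in kg/ha is easier to judge next to a relative one. A minimal sketch of that calculation follows; `percent_change` is a hypothetical helper and the yields passed to it are placeholders, not model output.

```python
def percent_change(baseline, other):
    """Return the change from baseline to other as a percentage of baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (other - baseline) / baseline * 100

# Placeholder yields in kg/ha for illustration only.
change = percent_change(baseline=2500.0, other=2750.0)
print(f"Change relative to baseline: {change:+.1f}%")
```

Here the Kharif yield would act as the baseline and the Rabi yield as the comparison value.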

### Impact of Rainfall Variation

In [None]:
# Test different rainfall levels
rainfall_levels = [400, 600, 800, 1000, 1200, 1500]
yields = []

print("🌧️ Rainfall Impact Analysis (Rice, Kharif, Jagitial):\n")
print(f"{'Rainfall (mm)':>15} | {'Yield (kg/ha)':>15} | {'Status':>15}")
print("-" * 50)

for rainfall in rainfall_levels:
    result = predict_crop_yield(
        district='Jagitial', season='Kharif', crop='Rice', year=2024,
        area=1000, total_rainfall=rainfall,
        avg_temp_min=22, avg_temp_max=34,
        avg_humidity_min=60, avg_humidity_max=85
    )
    
    yields.append(result['predicted_yield_kg_per_ha'])
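    # Display label only; these cut-offs are illustrative and separate from the 500 mm water-stress flag
    status = 'Low' if rainfall < 600 else ('Optimal' if rainfall < 1200 else 'High')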
    print(f"{rainfall:15d} | {result['predicted_yield_kg_per_ha']:15.2f} | {status:>15}")

# Visualize
plt.figure(figsize=(10, 6))
plt.plot(rainfall_levels, yields, marker='o', linewidth=2, markersize=8)
plt.xlabel('Total Rainfall (mm)', fontsize=12)
plt.ylabel('Predicted Yield (kg/ha)', fontsize=12)
plt.title('Impact of Rainfall on Rice Yield', fontsize=14, fontweight='bold')
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
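
The same pattern (vary one input, hold the rest fixed) works for temperature, humidity, or area. One possible way to generalize it is sketched below. `sweep` is a hypothetical helper, and `toy_predict` is a made-up stand-in so the example runs on its own; it is not the trained model.

```python
def sweep(predict_fn, base_kwargs, param, values):
    """Call predict_fn once per value of `param`, keeping all other inputs fixed."""
    rows = []
    for value in values:
        kwargs = {**base_kwargs, param: value}  # copy, then override one input
        rows.append({param: value, "prediction": predict_fn(**kwargs)})
    return rows

# Toy stand-in with an arbitrary formula; NOT the trained model.
def toy_predict(total_rainfall, avg_temp_max):
    return 2.0 * total_rainfall - 10.0 * avg_temp_max

base = {"total_rainfall": 900, "avg_temp_max": 34}
rows = sweep(toy_predict, base, "total_rainfall", [400, 800, 1200])
for row in rows:
    print(row)
```

With the real model, `predict_fn` could be a small wrapper that calls `predict_crop_yield(**kwargs)` and returns its `'predicted_yield_kg_per_ha'` entry.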

### District Comparison

In [None]:
# Compare yields across different districts
test_districts = ['Jagitial', 'Adilabad', 'Medak', 'Karimnagar', 'Nizamabad']
district_yields = []

print("🏘️ District Comparison (Rice, Kharif, Same Conditions):\n")
print(f"{'District':>15} | {'Yield (kg/ha)':>15}")
print("-" * 35)

for dist in test_districts:
    result = predict_crop_yield(
        district=dist, season='Kharif', crop='Rice', year=2024,
        area=1000, total_rainfall=900,
        avg_temp_min=22, avg_temp_max=34,
        avg_humidity_min=60, avg_humidity_max=85
    )
    district_yields.append(result['predicted_yield_kg_per_ha'])
    print(f"{dist:>15} | {result['predicted_yield_kg_per_ha']:15.2f}")

# Visualize
plt.figure(figsize=(10, 6))
plt.barh(test_districts, district_yields, color='steelblue', edgecolor='black')
plt.xlabel('Predicted Yield (kg/ha)', fontsize=12)
plt.ylabel('District', fontsize=12)
plt.title('Rice Yield Comparison Across Districts', fontsize=14, fontweight='bold')
plt.grid(axis='x', alpha=0.3)
plt.tight_layout()
plt.show()
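
A horizontal bar chart is usually easier to read when the bars are ordered by value. The sketch below sorts the names and values together before plotting; `sort_by_yield` is a hypothetical helper and the yields are placeholders, not model output. Because `plt.barh` draws the first entry at the bottom, ascending order places the largest bar at the top.

```python
def sort_by_yield(names, values, descending=False):
    """Sort names and values together by value, returning two parallel lists."""
    pairs = sorted(zip(names, values), key=lambda pair: pair[1], reverse=descending)
    sorted_names = [name for name, _ in pairs]
    sorted_values = [value for _, value in pairs]
    return sorted_names, sorted_values

# Placeholder yields in kg/ha for illustration only.
names = ["Jagitial", "Adilabad", "Medak"]
values = [3100.0, 2800.0, 2950.0]
sorted_names, sorted_values = sort_by_yield(names, values)
print(sorted_names)
print(sorted_values)
```

In the notebook, the two returned lists would replace `test_districts` and `district_yields` in the `plt.barh` call.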

## Step 6: Batch Predictions

Make predictions for multiple scenarios at once.

In [None]:
# Create batch of scenarios
scenarios = [
    {'name': 'Scenario 1: Good Rainfall', 'district': 'Jagitial', 'season': 'Kharif', 'crop': 'Rice', 
     'year': 2024, 'area': 1000, 'rainfall': 1200, 'temp_min': 22, 'temp_max': 34, 'hum_min': 65, 'hum_max': 85},
    
    {'name': 'Scenario 2: Low Rainfall', 'district': 'Jagitial', 'season': 'Kharif', 'crop': 'Rice',
     'year': 2024, 'area': 1000, 'rainfall': 500, 'temp_min': 22, 'temp_max': 34, 'hum_min': 50, 'hum_max': 70},
    
    {'name': 'Scenario 3: Heat Stress', 'district': 'Jagitial', 'season': 'Kharif', 'crop': 'Rice',
     'year': 2024, 'area': 1000, 'rainfall': 900, 'temp_min': 26, 'temp_max': 38, 'hum_min': 55, 'hum_max': 75},
    
    {'name': 'Scenario 4: Optimal', 'district': 'Medak', 'season': 'Rabi', 'crop': 'Rice',
     'year': 2024, 'area': 1000, 'rainfall': 1000, 'temp_min': 20, 'temp_max': 32, 'hum_min': 65, 'hum_max': 85}
]

# Run batch predictions
batch_results = []
for scenario in scenarios:
    result = predict_crop_yield(
        district=scenario['district'],
        season=scenario['season'],
        crop=scenario['crop'],
        year=scenario['year'],
        area=scenario['area'],
        total_rainfall=scenario['rainfall'],
        avg_temp_min=scenario['temp_min'],
        avg_temp_max=scenario['temp_max'],
        avg_humidity_min=scenario['hum_min'],
        avg_humidity_max=scenario['hum_max']
    )
    
    batch_results.append({
        'Scenario': scenario['name'],
        'District': scenario['district'],
        'Season': scenario['season'],
        'Rainfall': scenario['rainfall'],
        'Temp_Max': scenario['temp_max'],
        'Yield': result['predicted_yield_kg_per_ha'],
        'Production': result['estimated_production_tons'],
        'Heat_Stress': result['conditions']['heat_stress'],
        'Water_Stress': result['conditions']['water_stress']
    })

# Create DataFrame
batch_df = pd.DataFrame(batch_results)

print("📊 Batch Prediction Results:\n")
print(batch_df.to_string(index=False))

# Visualize
plt.figure(figsize=(12, 6))
plt.bar(batch_df['Scenario'], batch_df['Yield'], color='lightcoral', edgecolor='black')
plt.xlabel('Scenario', fontsize=12)
plt.ylabel('Predicted Yield (kg/ha)', fontsize=12)
plt.title('Yield Predictions Across Different Scenarios', fontsize=14, fontweight='bold')
plt.xticks(rotation=15, ha='right')
plt.grid(axis='y', alpha=0.3)
plt.tight_layout()
plt.show()
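
The scenario dictionaries use short keys such as `rainfall` and `temp_min`, while `predict_crop_yield` expects `total_rainfall` and `avg_temp_min`. A small mapping can remove the need to spell out every argument in the loop. This is a sketch of one possible approach; `KEY_MAP` and `scenario_to_kwargs` are hypothetical names.

```python
# Short scenario keys mapped to the argument names of predict_crop_yield.
KEY_MAP = {
    "rainfall": "total_rainfall",
    "temp_min": "avg_temp_min",
    "temp_max": "avg_temp_max",
    "hum_min": "avg_humidity_min",
    "hum_max": "avg_humidity_max",
}

def scenario_to_kwargs(scenario):
    """Rename short keys to argument names and drop the display-only 'name' entry."""
    return {
        KEY_MAP.get(key, key): value
        for key, value in scenario.items()
        if key != "name"
    }

scenario = {
    "name": "Scenario 1: Good Rainfall", "district": "Jagitial", "season": "Kharif",
    "crop": "Rice", "year": 2024, "area": 1000, "rainfall": 1200,
    "temp_min": 22, "temp_max": 34, "hum_min": 65, "hum_max": 85,
}
kwargs = scenario_to_kwargs(scenario)
print(kwargs)
```

With this in place, the call inside the batch loop could be written as `predict_crop_yield(**scenario_to_kwargs(scenario))`.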

## Summary

### What We Learned:
✅ How to load a saved ML model

✅ Creating a prediction function

✅ Making single predictions

✅ Comparing scenarios (seasons, rainfall, districts)

✅ Batch predictions for multiple scenarios

✅ Visualizing and interpreting results

### Key Insights:
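- Season affects the prediction: the Kharif and Rabi runs differ only in the season input and the season length used for `Rainfall_Per_Day`
- Rainfall shifts the predicted yield; the sweep from 400 mm to 1500 mm shows how the model responds
- District influences the predicted yield even when all weather inputs are identical
- Heat stress is flagged when the average maximum temperature exceeds 35°C, and water stress when seasonal rainfall falls below 500 mm; compare Scenarios 2 and 3 with Scenario 1 to see the effect on the prediction
- Scenario 2 (500 mm) sits exactly on the water-stress threshold, so its flag stays off because the check is `total_rainfall < 500`
- Conditions count as optimal only when neither stress flag is set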

### Next Steps:
- Use this model for your own predictions
- Experiment with different scenarios
- Integrate into web applications
- Present results to stakeholders

---

**Your model is ready for real-world agricultural predictions!** 🌾📊🎓
