<a href="https://colab.research.google.com/github/shrutisj12/AgriSense-AI-Intelligent-Crop-Disease-Assistant/blob/main/Phase2_Plant_Disease.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#**🌱 PHASE-2 Plant Disease Detection MVP**

##Goal:

####Detect plant diseases from leaf images using a CNN (or transfer learning) and show results in a Streamlit web app.

We’ll use a small dataset (2–3 plant types + healthy vs diseased) for speed.

#**Install libraries**

In [1]:
!pip install tensorflow keras pillow matplotlib seaborn streamlit pyngrok


Collecting streamlit
  Downloading streamlit-1.50.0-py3-none-any.whl.metadata (9.5 kB)
Collecting pyngrok
  Downloading pyngrok-7.4.0-py3-none-any.whl.metadata (8.1 kB)
Collecting pydeck<1,>=0.8.0b4 (from streamlit)
  Downloading pydeck-0.9.1-py2.py3-none-any.whl.metadata (4.1 kB)
Downloading streamlit-1.50.0-py3-none-any.whl (10.1 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.1/10.1 MB 24.0 MB/s eta 0:00:00
Downloading pyngrok-7.4.0-py3-none-any.whl (25 kB)
Downloading pydeck-0.9.1-py2.py3-none-any.whl (6.9 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.9/6.9 MB 68.3 MB/s eta 0:00:00
Installing collected packages: pyngrok, pydeck, streamlit
Successfully installed pydeck-0.9.1 pyngrok-7.4.0 streamlit-1.50.0


#Get the Dataset from Kaggle

Use a subset of the PlantVillage dataset from Kaggle:
https://www.kaggle.com/datasets/emmarex/plantdisease

To speed up training, download only 2–3 plant types (healthy + diseased).


In [2]:
# from google.colab import files
# import zipfile
# import os

# uploaded = files.upload()

# for fn in uploaded.keys():
#   print('User uploaded file "{name}" with length {length} bytes'.format(
#       name=fn, length=len(uploaded[fn])))

#   # To extract the uploaded zip file
#   if fn.endswith(".zip"):
#     with zipfile.ZipFile(fn, 'r') as zip_ref:
#         zip_ref.extractall()
#     os.remove(fn)
#     print(f"Extracted {fn} and removed the zip file.")
#   else:
#     print(f"Uploaded file {fn} is not a zip file. Skipping extraction.")

# or, to upload the file without extracting it here:

# from google.colab import files
# uploaded = files.upload()


##**Extract and Verify Dataset Folder in Colab**

Explanation:
This step unzips your downloaded dataset (archive.zip) inside your Colab environment and lists all extracted folders/files so you can confirm the dataset structure before training.

In [8]:
import zipfile
import os

with zipfile.ZipFile("archive.zip", "r") as zip_ref:
    zip_ref.extractall("dataset")

os.listdir("dataset")


['PlantVillage', 'plantvillage']

##**Verify Dataset Extraction Path**

Explanation:   
After unzipping the dataset, it’s important to confirm that the files were extracted to the correct location.
This step lists the contents of your main Colab directory (/content) and then specifically checks what’s inside the dataset folder to ensure that the PlantVillage dataset folder exists and is correctly structured for model training.

In [14]:
import os

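# list everything inside /content, then inside the dataset folder
# (a notebook displays only the last expression of a cell, so the output below
# is the dataset listing; wrap the first call in print() to see /content too)
os.listdir("/content")
os.listdir("/content/dataset")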



['PlantVillage', 'plantvillage']
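
Listing folder names confirms the layout but not how many images each class holds. As a minimal sketch, the helper below counts image files per class folder. The function name `count_images_per_class` and the set of accepted extensions are assumptions made for illustration, not part of the original notebook.

```python
from pathlib import Path

# Assumed set of image suffixes; adjust if the dataset uses other formats
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}

def count_images_per_class(root):
    """Return {class_folder_name: image_count} for each subfolder of root."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1 for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
            )
    return counts

# Example usage in Colab (path taken from the cells above):
# for name, n in count_images_per_class("/content/dataset/PlantVillage").items():
#     print(f"{n:6d}  {name}")
```

A strongly unbalanced count would be worth knowing before training, since plain accuracy can hide poor results on small classes.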

# **Set Dataset Directory Path**

Explanation:
After verifying that the dataset was extracted correctly, we set the path where it is stored.
The variable data_directory holds the folder that contains all the image subfolders, each representing one class (a healthy or diseased crop).

The os.listdir(data_directory) command lists all the subfolders (categories) inside that directory, for example:
['Potato___Late_blight', 'Tomato_healthy', 'Pepper__bell___Bacterial_spot', ...]

This confirms that:

- The path /content/dataset/PlantVillage exists
- The dataset is organized into labeled folders (each folder = one class)

In [17]:
data_directory = '/content/dataset/PlantVillage'
os.listdir(data_directory)


['Potato___Late_blight',
 'Tomato__Tomato_YellowLeaf__Curl_Virus',
 'Tomato__Tomato_mosaic_virus',
 'Tomato_Septoria_leaf_spot',
 'Tomato_Leaf_Mold',
 'Tomato_Late_blight',
 'Tomato_healthy',
 'Pepper__bell___Bacterial_spot',
 'Tomato__Target_Spot',
 'Pepper__bell___healthy',
 'Tomato_Early_blight',
 'Potato___healthy',
 'Tomato_Spider_mites_Two_spotted_spider_mite',
 'Potato___Early_blight',
 'Tomato_Bacterial_spot']
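
If training on all 15 folders is slower than you want, one option is to copy only some plant types into a smaller directory and point data_directory at that instead. The sketch below is one way to do it. The helper name `make_subset` and the destination path in the usage comment are hypothetical; the notebook itself trains on the full folder.

```python
import shutil
from pathlib import Path

def make_subset(src_root, dst_root, prefixes):
    """Copy class folders whose names start with any of `prefixes` into dst_root.

    Returns the list of copied folder names.
    """
    dst_root = Path(dst_root)
    dst_root.mkdir(parents=True, exist_ok=True)
    copied = []
    for class_dir in sorted(Path(src_root).iterdir()):
        if class_dir.is_dir() and class_dir.name.startswith(tuple(prefixes)):
            # dirs_exist_ok lets the cell be re-run without errors (Python 3.8+)
            shutil.copytree(class_dir, dst_root / class_dir.name, dirs_exist_ok=True)
            copied.append(class_dir.name)
    return copied

# Example usage (destination path is an assumption):
# make_subset("/content/dataset/PlantVillage", "/content/dataset_small", ["Potato", "Pepper"])
```

With the prefixes above, the subset would contain the three Potato folders and the two Pepper folders from the listing.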

#Preprocess Images

In [19]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import os

train_datagen = ImageDataGenerator(rescale=1./255, validation_split=0.2)

# archive.zip was uploaded to /content/ and extracted to /content/dataset/,
# and the class folders sit inside the 'PlantVillage' folder
data_directory = '/content/dataset/PlantVillage'
# Update this path if your structure is different

if not os.path.exists(data_directory):
    print(f"Error: Dataset directory not found at {data_directory}.")
    print("Please make sure you have extracted the zip file and updated the 'data_directory' variable with the correct path to the 'PlantVillage' folder.")
else:
    train_generator = train_datagen.flow_from_directory(
        data_directory,
        target_size=(128,128),
        batch_size=16,
        class_mode='categorical',
        subset='training'
    )

    val_generator = train_datagen.flow_from_directory(
        data_directory,
        target_size=(128,128),
        batch_size=16,
        class_mode='categorical',
        subset='validation'
    )
    print("Data generators created successfully.")



Found 16516 images belonging to 15 classes.
Found 4122 images belonging to 15 classes.
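
The two counts above determine how many batches Keras runs per epoch. The short calculation below is a sanity check using only the numbers printed by the generators and the batch size passed to flow_from_directory.

```python
import math

train_images = 16516   # from the "Found ..." output above
val_images = 4122
batch_size = 16        # as passed to flow_from_directory

# The last batch may be smaller than batch_size, hence the ceiling
train_steps = math.ceil(train_images / batch_size)
val_steps = math.ceil(val_images / batch_size)
val_fraction = val_images / (train_images + val_images)

print(train_steps)             # 1033
print(val_steps)               # 258
print(round(val_fraction, 3))  # 0.2
```

The 1033 training steps match the step count shown in the training log further down, and the validation share is close to the requested validation_split of 0.2.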


#Build CNN Model (Fast MVP)

In [20]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

model = Sequential([
    Conv2D(32, (3,3), activation='relu', input_shape=(128,128,3)),
    MaxPooling2D(2,2),
    Conv2D(64, (3,3), activation='relu'),
    MaxPooling2D(2,2),
    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.5),
    Dense(train_generator.num_classes, activation='softmax')
])

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()
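
The model.summary() table is not reproduced in this export, so the sketch below traces the layer shapes and parameter counts by hand. It assumes the Keras defaults used in the cell above (3x3 kernels with 'valid' padding and stride 1, 2x2 pooling) and the 15 classes found by the generator; the helper names are illustrative.

```python
def conv_out(size, kernel=3):
    """Output width/height of a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output width/height of max pooling with stride equal to the pool size."""
    return size // pool

size, channels = 128, 3
params = {}

# Conv2D(32, (3,3)): kernel_h * kernel_w * in_channels weights plus 1 bias, per filter
params["conv1"] = (3 * 3 * channels + 1) * 32
size, channels = conv_out(size), 32      # 126 x 126 x 32
size = pool_out(size)                    # 63 x 63 x 32

params["conv2"] = (3 * 3 * channels + 1) * 64
size, channels = conv_out(size), 64      # 61 x 61 x 64
size = pool_out(size)                    # 30 x 30 x 64

flat = size * size * channels            # length of the Flatten output
params["dense1"] = (flat + 1) * 128      # Dense(128)
params["dense2"] = (128 + 1) * 15        # Dense(15), one unit per class

total = sum(params.values())
print(flat, total)  # 57600 7394255
```

Almost all of the roughly 7.4 million parameters sit in the first Dense layer, because Flatten hands it 57,600 inputs. Adding another Conv2D/MaxPooling2D pair before Flatten would shrink that layer considerably.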




#Train Model Quickly




✅ Tip: Fewer epochs now — you can retrain later for accuracy.


In [21]:
history = model.fit(
    train_generator,
    validation_data=val_generator,
    epochs=5  # short for MVP
)



Epoch 1/5
1033/1033 ━━━━━━━━━━━━━━━━━━━━ 528s 510ms/step - accuracy: 0.4072 - loss: 1.9639 - val_accuracy: 0.7297 - val_loss: 0.8035
Epoch 2/5
1033/1033 ━━━━━━━━━━━━━━━━━━━━ 525s 508ms/step - accuracy: 0.6983 - loss: 0.8972 - val_accuracy: 0.8091 - val_loss: 0.5717
Epoch 3/5
1033/1033 ━━━━━━━━━━━━━━━━━━━━ 553s 499ms/step - accuracy: 0.7844 - loss: 0.6513 - val_accuracy: 0.8292 - val_loss: 0.4956
Epoch 4/5
1033/1033 ━━━━━━━━━━━━━━━━━━━━ 509s 493ms/step - accuracy: 0.8159 - loss: 0.5275 - val_accuracy: 0.8617 - val_loss: 0.4121
Epoch 5/5
1033/1033 ━━━━━━━━━━━━━━━━━━━━ 509s 493ms/step - accuracy: 0.8497 - loss: 0.4447 - val_accuracy: 0.8823 - val_loss: 0.3524
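
To decide whether retraining for longer is worthwhile, it helps to look at the trend rather than only the last line. The sketch below copies the metrics from the log above into a dictionary shaped like Keras's history.history and summarizes them; in the notebook you could pass history.history directly instead.

```python
# Metrics copied from the training log above, in the shape of history.history
history_dict = {
    "accuracy":     [0.4072, 0.6983, 0.7844, 0.8159, 0.8497],
    "val_accuracy": [0.7297, 0.8091, 0.8292, 0.8617, 0.8823],
    "val_loss":     [0.8035, 0.5717, 0.4956, 0.4121, 0.3524],
}

val_acc = history_dict["val_accuracy"]
val_loss = history_dict["val_loss"]

# Epoch (1-based) with the highest validation accuracy
best_epoch = max(range(len(val_acc)), key=lambda i: val_acc[i]) + 1

# Change in validation accuracy from one epoch to the next
val_gains = [b - a for a, b in zip(val_acc, val_acc[1:])]

# Still improving if the last epoch gained accuracy and has the lowest loss so far
still_improving = val_gains[-1] > 0 and val_loss[-1] == min(val_loss)

print(f"best epoch: {best_epoch}, val_accuracy: {val_acc[best_epoch - 1]:.4f}")
print(f"still improving at the last epoch: {still_improving}")
```

Validation accuracy rises in every epoch and validation loss is lowest in the last one, which suggests the model had not plateaued after 5 epochs. Training accuracy trailing validation accuracy is common with Dropout, since dropout is active only during training and the training figure is averaged over the whole epoch.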



#Save Model



✅ Keep the .h5 file; the Streamlit app loads it to make predictions.



In [22]:
model.save('plant_disease_model.h5')
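
The saved model outputs class indices, not names, so the label order has to travel with it. A minimal sketch, assuming a hypothetical file name class_names.json: the mapping is written next to the model so an app can read it without rescanning the dataset. The folder names are the ones listed earlier; flow_from_directory assigns indices in alphanumeric order of the folder names, which sorted() reproduces here. In the notebook, train_generator.class_indices already holds this mapping.

```python
import json

# Class folders as listed by os.listdir(data_directory) earlier in the notebook
class_folders = [
    'Potato___Late_blight', 'Tomato__Tomato_YellowLeaf__Curl_Virus',
    'Tomato__Tomato_mosaic_virus', 'Tomato_Septoria_leaf_spot',
    'Tomato_Leaf_Mold', 'Tomato_Late_blight', 'Tomato_healthy',
    'Pepper__bell___Bacterial_spot', 'Tomato__Target_Spot',
    'Pepper__bell___healthy', 'Tomato_Early_blight', 'Potato___healthy',
    'Tomato_Spider_mites_Two_spotted_spider_mite', 'Potato___Early_blight',
    'Tomato_Bacterial_spot',
]

# Same shape as train_generator.class_indices: {label: index}
class_indices = {name: i for i, name in enumerate(sorted(class_folders))}
index_to_label = {i: name for name, i in class_indices.items()}

# "class_names.json" is an assumed file name
with open("class_names.json", "w") as f:
    json.dump(index_to_label, f, indent=2)

# JSON turns integer keys into strings, so convert them back when loading
with open("class_names.json") as f:
    loaded = {int(k): v for k, v in json.load(f).items()}

print(loaded[0])  # Pepper__bell___Bacterial_spot
```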




#Streamlit App



In [35]:
%%writefile app2.py
import streamlit as st
from tensorflow.keras.models import load_model
from PIL import Image
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator


data_directory = '/content/dataset/PlantVillage'
train_datagen = ImageDataGenerator(rescale=1./255, validation_split=0.2)
train_generator = train_datagen.flow_from_directory(
        data_directory,
        target_size=(128,128),
        batch_size=16,
        class_mode='categorical',
        subset='training'
    )

# Load model
model = load_model('plant_disease_model.h5')

st.title("🌿 AgriSense AI – Plant Disease Detection")

uploaded_file = st.file_uploader("Upload Leaf Image", type=['jpg','jpeg','png'])
if uploaded_file is not None:
    # Force 3 channels: a PNG may carry an alpha channel, which would not
    # match the model's (128,128,3) input
    image = Image.open(uploaded_file).convert('RGB').resize((128,128))
    st.image(image, caption='Uploaded Leaf', use_column_width=True)

    # Same preprocessing as training: scale to [0,1], then add a batch dimension
    img_array = np.array(image)/255.0
    img_array = np.expand_dims(img_array, axis=0)

    pred = model.predict(img_array)
    class_idx = np.argmax(pred)
    # Invert {label: index} into {index: label}
    class_label = train_generator.class_indices
    class_label = dict((v,k) for k,v in class_label.items())

    st.success(f"Predicted Disease: {class_label[class_idx]}")


Overwriting app2.py
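
The app reports only the single most likely class. Since validation accuracy was about 88%, showing a few runner-up classes with their probabilities may help users judge how far to trust a prediction. The helper below is a sketch of that idea; `top_k` is a hypothetical name and the probabilities in the example are made up for illustration.

```python
def top_k(probabilities, labels, k=3):
    """Return the k most probable (label, probability) pairs, best first.

    `labels` may be a list or a dict keyed by class index.
    """
    probs = [float(p) for p in probabilities]
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(labels[i], probs[i]) for i in order[:k]]

# Illustrative softmax output for a 4-class model (not real predictions)
example_probs = [0.05, 0.70, 0.20, 0.05]
example_labels = ["A", "B", "C", "D"]

for label, p in top_k(example_probs, example_labels, k=2):
    print(f"{label}: {p:.0%}")
```

In the app this could be called as top_k(pred[0], class_label), with each pair written out via st.write.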


#Run Streamlit App in Colab

Click the generated URL → your Plant Disease Detection MVP is live!



In [36]:
# Option 1: localtunnel
# !streamlit run app2.py & npx localtunnel --port 8501

# Option 2: ngrok
!pip install streamlit pyngrok

import threading, time
from pyngrok import ngrok

# 🔑 Replace the placeholder below with YOUR ngrok auth token.
# Never commit a real token to a public notebook.
ngrok.set_auth_token("YOUR_NGROK_AUTH_TOKEN")

# Close any tunnels left over from a previous run
ngrok.kill()

def run_app():
    !streamlit run app2.py --server.port 8501 &

thread = threading.Thread(target=run_app)
thread.start()

# Give Streamlit a few seconds to start before opening the tunnel
time.sleep(10)

public_url = ngrok.connect(8501)
print(f"🌐 Your Streamlit app is live here: {public_url.public_url}")



Collecting usage statistics. To deactivate, set browser.gatherUsageStats to false.
2025-10-23 10:40:54.118 Port 8501 is already in use
🌐 Your Streamlit app is live here: https://paltriest-jumblingly-bettye.ngrok-free.dev
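
The "Port 8501 is already in use" message above means a Streamlit process from an earlier run was still listening, so the tunnel may be serving the older app rather than the file just written. A small check like the one below can be run before launching; `port_in_use` is a hypothetical helper built on the standard socket module.

```python
import socket

def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception
        return s.connect_ex((host, port)) == 0

# Example usage before starting Streamlit:
# if port_in_use(8501):
#     print("Port 8501 is busy: stop the old Streamlit process or pick another port")
```

If the port is busy, either stop the old process or start the app on a different port and pass that same port to ngrok.connect.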


# Phase 2 Deliverables (MVP)

- Trained CNN model (plant_disease_model.h5)
- Streamlit web app for leaf disease detection
- GitHub repo updated
- Standalone demo-ready MVP



#💡 Tip for Speed:

- Use a small dataset
- Train for 5–10 epochs
- Focus on the workflow, not maximum accuracy