# DS510 Team Project
DS510 Artificial Intelligence for Data Science \
Term: Summer 2025 \
Team: Team XX \
Authors: Hiromi Cota, David Hiltzman, Joseph Tran \
Emails: cotahiromi@cityuniversity.edu, hiltzmandavid@cityuniversity.edu, trantung@cityuniversity.edu

## Task: 
First, find an application area where an AI algorithm can be applied (e.g., weather prediction). Once the project's goal is set, develop the models and test them on different datasets. Many datasets are publicly available; finding one whose data suits your project is a crucial step in completing the project properly. You are encouraged to browse Kaggle for available datasets to get ideas for the team project topic. Please have one team member send the instructor the proposed topic for confirmation before starting on the project and the project proposal.

In [None]:
# Step 1: Imports
import kagglehub
import tensorflow as tf
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import os
from PIL import Image
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.image import ImageDataGenerator

In [None]:
# Step 2: Download the latest version of the dataset
PATH = kagglehub.dataset_download("abdallahalidev/plantvillage-dataset")

print("Path to dataset files:", PATH)
IMG_SIZE = (224, 224)
BATCH_SIZE = 32
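
`flow_from_directory` expects a folder whose immediate subfolders are the classes, and the download may nest the images a level or two below `PATH`. The following is a minimal sketch for locating that folder. `find_class_root` and `_has_images` are hypothetical helpers (not part of kagglehub or Keras), and the list of image extensions is an assumption.

```python
import os

IMAGE_EXTS = (".jpg", ".jpeg", ".png")  # assumption: extensions used by the dataset


def _has_images(folder):
    """True if the folder directly contains at least one image file."""
    return any(name.lower().endswith(IMAGE_EXTS) for name in os.listdir(folder))


def find_class_root(root):
    """Return the first directory under root whose subfolders all hold images.

    Hypothetical helper: walks the tree top-down and returns None when no
    directory matches the one-folder-per-class layout.
    """
    for dirpath, dirnames, _ in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        if dirnames and all(_has_images(os.path.join(dirpath, d)) for d in dirnames):
            return dirpath
    return None
```

If the helper returns a path, that path is a candidate for the directory argument of `flow_from_directory`; a return value of `None` suggests the layout differs from what is assumed here.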

In [None]:
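# Step 3: Prepare training & validation datasets
# Assumption: 80/20 split with MobileNetV2 preprocessing; the original generator settings are not shown.
datagen = ImageDataGenerator(
    preprocessing_function=tf.keras.applications.mobilenet_v2.preprocess_input,
    validation_split=0.2
)
# Assumption: PATH holds one subfolder per class; point this at a nested folder if the download is nested.
extract_dir = PATH
train_gen = datagen.flow_from_directory(
    extract_dir,
    target_size=IMG_SIZE,
    batch_size=BATCH_SIZE,
    class_mode='categorical',
    subset='training'
)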

In [None]:
val_gen = datagen.flow_from_directory(
    extract_dir,
    target_size=IMG_SIZE,
    batch_size=BATCH_SIZE,
    class_mode='categorical',
    subset='validation'
)

In [None]:
# Step 4: Save class names to a text file (one label per line)
labels = list(train_gen.class_indices.keys())
with open("labels.txt", "w") as f:
    for label in labels:
        f.write(label + "\n")


In [None]:
# Step 5: Build MobileNetV2 model on a frozen ImageNet-pretrained base
# MobileNetV2 is small and fast, so the trained model can run on a phone
base_model = tf.keras.applications.MobileNetV2(
    input_shape=IMG_SIZE + (3,),
    include_top=False,
    weights='imagenet'
)
base_model.trainable = False  # Freeze base layers

In [None]:
model = tf.keras.Sequential([
    base_model,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(len(labels), activation='softmax')
])
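
With the base frozen, only the final `Dense` layer is trained, which keeps training light. A rough sketch of the arithmetic, assuming the default MobileNetV2 (width multiplier 1.0), whose pooled feature vector has 1280 values. The class count of 38 is an assumption about this dataset, and `dense_head_params` is a hypothetical helper.

```python
def dense_head_params(n_features, n_classes):
    """Parameters in a Dense layer: one weight per (feature, class) pair plus one bias per class."""
    return n_features * n_classes + n_classes


MOBILENETV2_FEATURES = 1280  # assumption: default MobileNetV2 with alpha=1.0
N_CLASSES = 38               # assumption: replace with len(labels) from the notebook

print(dense_head_params(MOBILENETV2_FEATURES, N_CLASSES))  # 48678
```

The figure can be compared with the "Trainable params" line reported by `model.summary()`.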

In [None]:
model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

In [None]:
# Step 6: Train the model
history = model.fit(
    train_gen,
    validation_data=val_gen,
    epochs=10
)
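
`model.fit` returns a `History` object whose `history` attribute maps each metric name to a list with one value per epoch. A small sketch for picking the epoch with the best validation accuracy follows. `best_epoch` is a hypothetical helper, and the numbers in the example are made up for illustration; they are not results from this project.

```python
def best_epoch(history_dict, metric="val_accuracy"):
    """Return (1-based epoch number, value) for the highest value of the metric.

    On ties, the earliest epoch is returned.
    """
    values = history_dict[metric]
    best_index = max(range(len(values)), key=values.__getitem__)
    return best_index + 1, values[best_index]


# Invented values, shaped like history.history after three epochs
example = {"accuracy": [0.70, 0.85, 0.90], "val_accuracy": [0.68, 0.88, 0.86]}
epoch, value = best_epoch(example)
print(epoch, value)  # 2 0.88
```

After training, the same call would be `best_epoch(history.history)`. A gap between training and validation accuracy that widens after the best epoch is a common sign of overfitting.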

In [None]:
# Step 7: Save model
model.save("plant_disease_model.h5")
print("Model and class names saved!")
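
At inference time, the saved `labels.txt` is needed to turn the model's softmax output back into class names. A minimal sketch, assuming one label per line in the same order as `train_gen.class_indices`. `load_labels` and `top_k` are hypothetical helpers, and the labels and probabilities in the example are invented.

```python
def load_labels(path="labels.txt"):
    """Read one class name per line, skipping blank lines."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


def top_k(probabilities, labels, k=3):
    """Return the k (label, probability) pairs with the highest probability."""
    ranked = sorted(zip(labels, probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Invented example: three classes and one softmax output row
demo_labels = ["class_a", "class_b", "class_c"]
demo_probs = [0.1, 0.7, 0.2]
print(top_k(demo_probs, demo_labels, k=2))  # [('class_b', 0.7), ('class_c', 0.2)]
```

With a real model, `probabilities` would be one row of `model.predict(...)` for an image preprocessed the same way as the training data.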