### 🐄 Animal Image Classifier - Buffalo vs. Cattle

This notebook loads a fine-tuned MobileNetV2 model (saved after training) to classify animal images as either **Buffalo** or **Cattle**.  
It checks each image for **blurriness** and **cropping** first, so that only images passing both checks are classified.  
Predictions are generated for a folder of images and saved as a CSV with confidence scores and quality flags.


### Imports and Setup

This cell imports the libraries used for image processing, model loading, and data handling: `os`, `cv2`, `torch`, `torchvision`, `PIL`, `numpy`, `pandas`, and `tqdm`.

In [1]:
import os
import cv2
import torch
import torchvision.transforms as transforms
import torch.nn as nn
from torchvision import models
from PIL import Image
import numpy as np
import pandas as pd
from tqdm import tqdm

### Extract Dataset

This cell extracts the animal dataset from the zip file at `/content/sample_data/animal_dataset.zip` into a directory named `dataset`. The variable `data` is set to the path of the extracted dataset.

In [2]:
import zipfile
import os
# Unpack the archive into ./dataset (the directory is created if missing)
with zipfile.ZipFile("/content/sample_data/animal_dataset.zip", 'r') as zip_ref:
    zip_ref.extractall("dataset")
data = './dataset'
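To confirm that the archive unpacked as expected, the image files in each class folder can be counted. This is a minimal sketch, assuming the extracted tree has one sub-folder per class (as the path `animal_dataset/Cattle` used later suggests). `count_images_per_class` is a hypothetical helper and not part of the original notebook.

```python
import os

IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.png')

def count_images_per_class(root):
    """Return {folder_name: image_count} for each immediate sub-folder of root."""
    counts = {}
    for entry in sorted(os.listdir(root)):
        folder = os.path.join(root, entry)
        if os.path.isdir(folder):
            # Count only files whose extension looks like an image
            counts[entry] = sum(
                1 for name in os.listdir(folder)
                if name.lower().endswith(IMAGE_EXTENSIONS)
            )
    return counts
```

For example, `count_images_per_class('./dataset/animal_dataset')` would return one entry per class folder, assuming that is where the images were extracted.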

### Load Model and Define Transforms

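This cell defines the `load_model` function, which builds a MobileNetV2 architecture without pre-trained weights, replaces its final classifier layer with a 2-output linear layer, loads the fine-tuned weights from `model_path` onto the CPU, and switches the model to evaluation mode. It also defines the image `transform`, which resizes each image to 224×224, converts it to a tensor, and normalizes it with the ImageNet channel means and standard deviations. The `class_names` list holds the two classes: 'Buffalo' and 'Cattle'.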

In [3]:

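# 2. Load Model
def load_model(model_path):
    # Build the architecture only; no ImageNet weights are downloaded
    model = models.mobilenet_v2(weights=None)
    # Replace the final layer with a 2-class head (Buffalo, Cattle)
    model.classifier[1] = nn.Linear(model.last_channel, 2)
    # Load the fine-tuned weights onto the CPU
    model.load_state_dict(torch.load(model_path, map_location=torch.device('cpu')))
    model.eval()  # inference mode: disables dropout, uses BatchNorm running stats
    return model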

# 3. Define Transforms
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406],
                         [0.229, 0.224, 0.225])
])

class_names = ['Buffalo', 'Cattle']
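The effect of `ToTensor` followed by `Normalize` on a single pixel can be worked out by hand. The sketch below reproduces that arithmetic in plain Python for illustration only: `normalize_pixel` is a hypothetical helper, and the notebook itself relies on the `torchvision` transforms.

```python
# Channel means and standard deviations copied from the transform above.
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)

def normalize_pixel(rgb):
    """Map an 8-bit RGB pixel to the values the model receives.

    Mirrors ToTensor (scale to [0, 1]) followed by Normalize ((x - mean) / std).
    """
    return tuple((c / 255.0 - m) / s for c, m, s in zip(rgb, MEAN, STD))

white = normalize_pixel((255, 255, 255))
black = normalize_pixel((0, 0, 0))
print([round(v, 3) for v in white])  # [2.249, 2.429, 2.64]
print([round(v, 3) for v in black])  # [-2.118, -2.036, -1.804]
```

Inputs therefore span roughly -2.1 to 2.6 per channel rather than 0 to 255.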


### Utility Functions

This cell defines two utility functions:
- `is_blurry`: Flags an image as blurry when the variance of its Laplacian falls below a threshold (100 by default).
- `is_cropped`: Flags an image as potentially cropped when its largest non-black region (the largest external contour after thresholding) covers less than half of the frame.

It also defines two prediction functions:
- `predict_image`: Predicts the class and confidence for a single image.
- `predict_folder`: Assigns a status to every image in a given folder and returns the results as a pandas DataFrame. The status is `Blurry`, `Cropped`, `Not a valid animal image` (when the confidence is below `confidence_threshold`), or the predicted class with its confidence.
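The Laplacian variance idea behind `is_blurry` can be illustrated without OpenCV. The sketch below is dependency-free and assumes the 4-neighbour Laplacian kernel. It skips border pixels, so its values will differ slightly from `cv2.Laplacian`, which pads the border. `laplacian_variance` is a hypothetical name used only here.

```python
from statistics import pvariance

def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over the interior pixels.

    `gray` is a list of rows of grayscale intensities.
    """
    height, width = len(gray), len(gray[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Sum of the four neighbours minus four times the centre pixel
            responses.append(
                gray[y - 1][x] + gray[y + 1][x]
                + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    return pvariance(responses)

# A sharp checkerboard versus a flat (featureless) patch of the same size.
sharp = [[255 * ((x + y) % 2) for x in range(8)] for y in range(8)]
flat = [[128] * 8 for _ in range(8)]

print(laplacian_variance(sharp) > 100)  # True: strong edges, high variance
print(laplacian_variance(flat) < 100)   # True: no edges, zero variance
```

Blurring removes sharp intensity changes, which pulls the Laplacian response toward zero everywhere and lowers its variance.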

In [4]:
# 4. Utility Functions
def is_blurry(image, threshold=100):
    # Low Laplacian variance means few sharp edges, i.e. a blurry image
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    laplacian_var = cv2.Laplacian(gray, cv2.CV_64F).var()
    return laplacian_var < threshold

def is_cropped(image, min_area_ratio=0.5):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Pixels brighter than 10 count as content; near-black pixels as background
    _, thresh = cv2.threshold(gray, 10, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return True  # entirely near-black image: no content found
    # Flag the image when the largest content region covers too little of the frame
    max_contour = max(contours, key=cv2.contourArea)
    area_ratio = cv2.contourArea(max_contour) / (image.shape[0] * image.shape[1])
    return area_ratio < min_area_ratio

# 5. Predict a Single Image
def predict_image(model, image_path):
    image = Image.open(image_path).convert('RGB')
    input_tensor = transform(image).unsqueeze(0)
    with torch.no_grad():
        output = model(input_tensor)
        prob = torch.nn.functional.softmax(output[0], dim=0)
        predicted_class = class_names[torch.argmax(prob).item()]
        confidence = prob.max().item()
    return predicted_class, confidence

# 6. Predict Images in Folder
def predict_folder(model, image_folder, confidence_threshold=0.7):
    results = []
    for img_name in tqdm(os.listdir(image_folder)):
        if img_name.lower().endswith(('.jpg', '.jpeg', '.png')):
            img_path = os.path.join(image_folder, img_name)
            image_cv = cv2.imread(img_path)
            if image_cv is None:
                # cv2.imread returns None for files it cannot decode; skip them
                continue

            if is_blurry(image_cv):
                status = "Blurry"
            elif is_cropped(image_cv):
                status = "Cropped"
            else:
                label, confidence = predict_image(model, img_path)
                if confidence < confidence_threshold:
                    # Low confidence is reported as neither class
                    status = "Not a valid animal image"
                else:
                    status = f"{label} ({confidence*100:.2f}%)"

            results.append((img_name, status))

    print() # Add a new line after the tqdm loop
    return pd.DataFrame(results, columns=["Image", "Prediction"])
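The step from raw model outputs to a status string can be traced with made-up numbers. The sketch below reimplements softmax in plain Python and applies the same 0.7 threshold. The logits are invented for illustration, and `softmax` and `describe` are hypothetical helpers rather than the notebook's functions.

```python
import math

CLASS_NAMES = ['Buffalo', 'Cattle']

def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def describe(logits, confidence_threshold=0.7):
    """Return the status string that the thresholding rule assigns."""
    probs = softmax(logits)
    confidence = max(probs)
    if confidence < confidence_threshold:
        return "Not a valid animal image"
    label = CLASS_NAMES[probs.index(confidence)]
    return f"{label} ({confidence * 100:.2f}%)"

print(describe([0.2, 3.8]))  # Cattle (97.34%)
print(describe([1.0, 1.3]))  # Not a valid animal image
```

With only two classes the top probability is never below 0.5, so the 0.7 threshold filters out images on which the model is close to undecided.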

### Run Example

This cell demonstrates how to use the functions defined above. It loads the fine-tuned model from `/content/sample_data/best_model.pth`, runs `predict_folder` on the "Cattle" images in the extracted dataset, saves the predictions to a CSV file named `predictionsreport.csv`, and prints the first five rows of the resulting DataFrame.

In [5]:
# 7. Run Example
model = load_model("/content/sample_data/best_model.pth")
df = predict_folder(model, "/content/dataset/animal_dataset/Cattle")
df.to_csv("predictionsreport.csv", index=False)
print(df.head())


100%|██████████| 203/203 [00:10<00:00, 18.65it/s]


              Image                Prediction
0  page19_img1.jpeg           Cattle (97.27%)
1  page22_img2.jpeg                    Blurry
2  page22_img1.jpeg          Buffalo (70.38%)
3  page40_img2.jpeg           Cattle (92.66%)
4  page40_img3.jpeg  Not a valid animal image
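Because the `Prediction` column mixes class labels, confidence values, and quality flags in one string, downstream analysis may need to split them apart. The sketch below does this with a regular expression. It assumes the `"Label (xx.xx%)"` format shown in the output above, and `split_prediction` is a hypothetical helper.

```python
import re

# Matches strings such as "Cattle (97.27%)" as written by predict_folder.
PREDICTION_PATTERN = re.compile(
    r"^(?P<label>\w+) \((?P<confidence>\d+(?:\.\d+)?)%\)$"
)

def split_prediction(status):
    """Return (label, confidence in [0, 1]), or (status, None) for quality flags."""
    match = PREDICTION_PATTERN.match(status)
    if match is None:
        return status, None
    return match.group("label"), float(match.group("confidence")) / 100

rows = ["Cattle (97.27%)", "Blurry", "Buffalo (70.38%)", "Not a valid animal image"]
for row in rows:
    print(split_prediction(row))
```

Applied to the DataFrame, `df["Prediction"].map(split_prediction)` would yield one `(label, confidence)` tuple per row.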



