## COMPUTER VISION 
### FINAL PROJECT: HOW’S THE WEATHER TODAY
#### THE DATASETS: ACDC

**Student name**: 

**ID Number**: 

Import the required libraries: OpenCV (`cv2`) for image processing, NumPy for numerical operations, pandas for data handling, scikit-learn for the Random Forest classifier and the evaluation metrics, and Mahotas for additional image features (imported here, but not used in the cells below).

In [1]:
# Import necessary libraries
import os
import cv2
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
import mahotas.features as mfs
from sklearn.metrics import accuracy_score, classification_report

This function iterates over every file in the specified directory (`data_dir`). For each file, it loads the image, resizes it to `img_size`, extracts features with `extract_features` (defined later), and reads the label from the filename with `get_label_from_filename` (also defined later). The features, the mapped class name, and the image name are appended to their respective lists. Files that OpenCV cannot read, and files whose label is not in `class_mapping`, are skipped. Finally, the function returns NumPy arrays of features and labels, along with a list of image names.

In [2]:
# Function to load and preprocess the data
def load_and_preprocess_data(data_dir, img_size=(150, 150)):
    # Initialize empty lists to store data
    data = []                        # Features of images
    labels = []                      # Corresponding labels (Clear, Fog, Night, Rain, Snow)
    image_names = []                 # Names of processed images


    # Mapping of filename labels to predefined weather classes
    class_mapping = {'clear': 'Clear', 'fog': 'Fog', 'night': 'Night', 'rain': 'Rain', 'snow': 'Snow'}

    # Loop through each file in the specified directory
    for img_file in os.listdir(data_dir):
        # Build the full path to the image file
        img_path = os.path.join(data_dir, img_file)
        # Read the image using OpenCV
        img = cv2.imread(img_path)

        # Check if the image was successfully loaded
        if img is not None:
            # Resize the image to the specified size (default is 150x150 pixels)
            img = cv2.resize(img, img_size)
            # Extract features from the image
            features = extract_features(img)
            # Get the label from the filename
            label = get_label_from_filename(img_file)

            # Check if the label is in the predefined mapping
            if label in class_mapping:
                # Append the extracted data to the respective lists
                data.append(features)
                labels.append(class_mapping[label])
                image_names.append(img_file)
                
    # Convert lists to NumPy arrays and return them
    return np.array(data), np.array(labels), image_names
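
Because the loader skips unreadable files and unknown labels without reporting them, it can be worth checking how many samples of each class actually came back. The following is a minimal sketch of such a check; `example_labels` is a hypothetical array standing in for the labels returned by `load_and_preprocess_data`.

```python
import numpy as np

# Hypothetical labels standing in for the array returned by load_and_preprocess_data
example_labels = np.array(["Clear", "Fog", "Fog", "Night", "Rain", "Snow", "Snow", "Snow"])

# Count how many samples belong to each class
classes, counts = np.unique(example_labels, return_counts=True)
class_counts = {str(c): int(n) for c, n in zip(classes, counts)}

for name, n in class_counts.items():
    print(f"{name}: {n}")
```

A class with far fewer samples than expected would suggest misnamed or unreadable files in the directory.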


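This function builds the feature vector for one image. It computes a joint 3D color histogram over the three channels (8 bins per channel, 512 values), the mean and standard deviation of each channel, and the zeroth-order moment (`m00`) of each channel, which is simply the sum of that channel's pixel intensities. It also includes a placeholder for LBP (local binary pattern) features, which is currently empty and can be filled in later if needed. The final feature vector is the concatenation of these arrays: 521 values with the current settings.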

In [3]:
# Function to extract features from an image
def extract_features(img):
    # Joint 3D color histogram over the three channels (8 bins each, 512 values in total)
    hist_features = cv2.calcHist([img], [0, 1, 2], None, [8, 8, 8], [0, 256, 0, 256, 0, 256]).flatten()
    # Mean intensity of each channel (OpenCV loads images in B, G, R order)
    avg_color_features = [np.mean(img[:,:,0]), np.mean(img[:,:,1]), np.mean(img[:,:,2])]
    # Standard deviation of the intensity of each channel
    std_color_features = [np.std(img[:,:,0]), np.std(img[:,:,1]), np.std(img[:,:,2])]
    # Zeroth-order moment (m00) of each channel, i.e. the sum of its pixel intensities
    moments_features = [cv2.moments(img[:,:,0])['m00'], cv2.moments(img[:,:,1])['m00'], cv2.moments(img[:,:,2])['m00']]
    # LBP features (placeholder, currently empty)
    lbp_features = []

    # Concatenate all the extracted features into a single array
    return np.concatenate((hist_features, avg_color_features, std_color_features, moments_features, lbp_features))
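
The `lbp_features` placeholder could be filled in several ways. The sketch below shows one possibility: a basic 3x3 local binary pattern histogram written in plain NumPy. It is an illustration only, not the method used in this notebook; the helper name `simple_lbp_histogram` is hypothetical, and it assumes a 2D grayscale image of at least 3x3 pixels (for example, the result of `cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)`). Mahotas also provides `mahotas.features.lbp`, which may be a better fit in practice.

```python
import numpy as np

def simple_lbp_histogram(gray):
    """Return a normalized 256-bin histogram of basic 3x3 LBP codes (hypothetical helper)."""
    gray = np.asarray(gray, dtype=np.int32)
    h, w = gray.shape
    center = gray[1:-1, 1:-1]
    # Offsets of the 8 neighbours, clockwise starting at the top-left pixel
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = gray[1 + dy : h - 1 + dy, 1 + dx : w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the centre pixel
        codes = codes | ((neighbour >= center).astype(np.int32) << bit)
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()

# Synthetic grayscale image, used only to exercise the function
rng = np.random.default_rng(0)
gray = rng.integers(0, 256, size=(150, 150), dtype=np.uint8)
lbp_hist = simple_lbp_histogram(gray)
print(lbp_hist.shape)
```

If such a histogram replaced the empty list, the feature vector would grow by 256 values, and the training and test features would both need to be regenerated.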

This function takes a filename, e.g., "clear_001.png", splits it on '_', and keeps the first part, which represents the label (here "clear"). The label is converted to lowercase so that it matches the keys of the class mapping regardless of how the file was capitalized.

In [4]:
# Function to get label from filename
def get_label_from_filename(filename):
    # Split the filename using '_' as a delimiter and take the first part
    # Convert the result to lowercase to ensure consistency
    return filename.split('_')[0].lower()
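
To make the parsing rule concrete, the sketch below applies the same split to a few hypothetical filenames. A name without an underscore comes back whole, extension included, so it matches no key in `class_mapping` and the loader skips it.

```python
def get_label_from_filename(filename):
    # Same rule as the cell above: text before the first '_', in lowercase
    return filename.split('_')[0].lower()

# Hypothetical filenames, chosen only to illustrate the behaviour
examples = ["clear_001.png", "Fog_12.png", "night_3_copy.png", "unlabeled.png"]
parsed = {name: get_label_from_filename(name) for name in examples}

for name, label in parsed.items():
    print(f"{name} -> {label}")
```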

This function takes the numerical features, the corresponding labels, and the image names, organizes them into a DataFrame, and saves the DataFrame to an Excel file. The column names for the features are generated dynamically as "Feature_0", "Feature_1", and so on. The Excel file is saved without row indices.

In [5]:
# Function to save data to Excel
def save_to_excel(data, labels, image_names, file_name):
    # Create a DataFrame using the provided data, with column names like "Feature_0", "Feature_1", ...
    df = pd.DataFrame(data, columns=[f"Feature_{i}" for i in range(data.shape[1])])
    # Add columns for labels and image names to the DataFrame
    df['Label'] = labels
    df['Image_Name'] = image_names
    # Save the DataFrame to an Excel file with the specified name, without including row indices
    df.to_excel(file_name, index=False)
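
The resulting table layout can be previewed with a small example. The sketch below builds the same kind of DataFrame from hypothetical data and writes it to an in-memory CSV rather than an Excel file, purely to keep the example free of extra dependencies (writing `.xlsx` through `to_excel` relies on an Excel engine such as openpyxl being installed).

```python
import io
import numpy as np
import pandas as pd

# Hypothetical data: 2 images with 4 features each
features = np.array([[0.1, 0.2, 0.3, 0.4],
                     [0.5, 0.6, 0.7, 0.8]])
labels = ["Clear", "Fog"]
names = ["clear_001.png", "fog_001.png"]

# Same layout as save_to_excel: one column per feature, then label and image name
df = pd.DataFrame(features, columns=[f"Feature_{i}" for i in range(features.shape[1])])
df["Label"] = labels
df["Image_Name"] = names

# Written to an in-memory CSV here instead of an Excel file
buffer = io.StringIO()
df.to_csv(buffer, index=False)
header = buffer.getvalue().splitlines()[0]
print(header)
```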

This code first loads and preprocesses the training data from the specified directory with `load_and_preprocess_data`. It then saves the extracted features, the corresponding labels, and the image names to an Excel file named 'train_data_features.xlsx' with `save_to_excel`.

In [6]:
# Load and preprocess the training data
# Build the directory path in a platform-independent way
train_data_dir = os.path.join("ACDC", "train")
# Load the images and get the numerical features, labels, and image names
train_data, train_labels, train_image_names = load_and_preprocess_data(train_data_dir)
# Save the training data (features, labels, and image names) to an Excel file
save_to_excel(train_data, train_labels, train_image_names, 'train_data_features.xlsx')


This code initializes a Random Forest classifier with 100 trees and a fixed random seed, then trains it on the training data (features and corresponding labels). The classifier learns patterns in the features that it can use to predict the class of new, unseen images.

In [7]:
# Train the Random Forest classifier
# Create a Random Forest classifier with 100 trees and a fixed random seed for reproducibility
clf = RandomForestClassifier(n_estimators=100, random_state=42)
# Fit (train) the classifier using the training data (features and labels)
clf.fit(train_data, train_labels)
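
Once a forest is fitted, its `feature_importances_` attribute gives a rough indication of which features the trees relied on. The sketch below illustrates this on synthetic data, not on the ACDC features; `demo_clf`, `X`, and `y` are hypothetical names, and the data is constructed so that only the first feature determines the class.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
# Synthetic stand-in: 200 samples, 5 features; only feature 0 determines the class
X = rng.normal(size=(200, 5))
y = np.where(X[:, 0] > 0, "Clear", "Fog")

# Same settings as the classifier above
demo_clf = RandomForestClassifier(n_estimators=100, random_state=42)
demo_clf.fit(X, y)

# Importances are non-negative and sum to 1 across features
importances = demo_clf.feature_importances_
top_feature = int(np.argmax(importances))
print("Importances:", np.round(importances, 3))
print("Most informative feature index:", top_feature)
```

Applied to `clf`, the same attribute would have one entry per column of `train_data`.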


This code loads and preprocesses the test data in the same way as the training data: it extracts the features, reads the labels from the filenames, and saves everything to an Excel file for later analysis or evaluation.

In [8]:
# Load and preprocess the test data
# Build the path to the test directory in a platform-independent way
test_data_dir = os.path.join("ACDC", "test")
# Load the images and get the numerical features, labels, and image names
test_data, test_labels, test_image_names = load_and_preprocess_data(test_data_dir)
# Save the test data (features, labels, and image names) to an Excel file
save_to_excel(test_data, test_labels, test_image_names, 'test_data_features.xlsx')


The trained Random Forest classifier is applied to the test features (`test_data`) to predict a class for each image. The predicted labels are stored in the variable `predicted_labels`.

In [9]:
# Predict classes for the test set
# Uses a trained classifier 'clf' to predict labels for the test data
predicted_labels = clf.predict(test_data)


This code calculates the accuracy of the machine learning model by comparing its predicted labels (predicted_labels) with the actual labels in the test set (test_labels). Additionally, it generates a classification report that contains detailed performance metrics. Finally, the accuracy and classification report are printed for evaluation.

In [10]:
# Evaluate the classifier
# Compares the predicted labels with the actual labels in the test set
accuracy = accuracy_score(test_labels, predicted_labels)
# Generate a classification report
# Provides detailed performance metrics like precision, recall, and F1-score
classification_report_str = classification_report(test_labels, predicted_labels)
# Print the accuracy and classification report
print(f"Accuracy: {accuracy}")
print("Classification Report:\n", classification_report_str)


Accuracy: 0.928
Classification Report:
               precision    recall  f1-score   support

       Clear       1.00      1.00      1.00       100
         Fog       0.89      0.89      0.89       100
       Night       0.99      0.99      0.99       100
        Rain       0.91      0.93      0.92       100
        Snow       0.85      0.83      0.84       100

    accuracy                           0.93       500
   macro avg       0.93      0.93      0.93       500
weighted avg       0.93      0.93      0.93       500
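
Because every class has the same support (100 images), the overall accuracy equals the mean of the per-class recalls. The quick check below uses the recalls from the report above, which are rounded to two decimals, and reproduces the printed accuracy.

```python
# Per-class recall and support copied from the report above (recalls are rounded)
recall = {"Clear": 1.00, "Fog": 0.89, "Night": 0.99, "Rain": 0.93, "Snow": 0.83}
support = {name: 100 for name in recall}

# Correctly classified images per class, summed over all classes
correct = sum(recall[name] * support[name] for name in recall)
total = sum(support.values())
accuracy = correct / total
print(f"Correct: {correct:.0f} of {total}, accuracy: {accuracy:.3f}")
```

The recalls also show where the errors concentrate: Snow and Fog account for 28 of the 36 misclassified images.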




This code loads a single photo (here `clear_61.png` from the test set), resizes it to the same 150x150 size used during training, extracts its features, and uses the trained classifier (`clf`) to predict the weather condition. Finally, it prints the predicted weather condition.

In [11]:
# Predict the weather condition for a single photo
# Build the file path in a platform-independent way
random_photo_path = os.path.join("ACDC", "test", "clear_61.png")
# Read the photo using OpenCV (cv2.imread returns None if the file cannot be read)
random_photo = cv2.imread(random_photo_path)
if random_photo is None:
    raise FileNotFoundError(f"Could not read image: {random_photo_path}")
# Resize the photo to the size used during training (150x150 pixels)
random_photo = cv2.resize(random_photo, (150, 150))
# Extract features from the resized photo
random_photo_features = extract_features(random_photo)
# Use the trained classifier (clf) to predict the weather condition based on the features
predicted_condition = clf.predict([random_photo_features])[0]
# Print the predicted weather condition
print(f"Predicted Weather Condition: {predicted_condition}")

Predicted Weather Condition: Clear
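
A single predicted label hides how confident the forest is. A Random Forest also exposes `predict_proba`, which returns the averaged class probabilities across its trees. The sketch below shows the call on synthetic two-class data, not on the ACDC features; `demo_clf`, `X`, `y`, and `sample` are hypothetical names.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
# Synthetic stand-in for image features: two well-separated clusters
X = np.vstack([rng.normal(0.0, 1.0, size=(50, 4)),
               rng.normal(5.0, 1.0, size=(50, 4))])
y = np.array(["Clear"] * 50 + ["Fog"] * 50)

demo_clf = RandomForestClassifier(n_estimators=100, random_state=42).fit(X, y)

# A point at the centre of the "Fog" cluster
sample = np.full((1, 4), 5.0)
probabilities = demo_clf.predict_proba(sample)[0]

# Probabilities are listed in the order given by demo_clf.classes_
for name, p in zip(demo_clf.classes_, probabilities):
    print(f"{name}: {p:.2f}")
```

With the notebook's classifier, the equivalent call would be `clf.predict_proba([random_photo_features])`, with one probability per weather class.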
