<a href="https://colab.research.google.com/github/hassaanhameed786/Machine-learning-/blob/main/mammals_image_classification.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **1. Importing Essential Packages**
The article starts with importing necessary Python packages:

- pandas, numpy: For data handling and numerical operations.
- seaborn, matplotlib.pyplot, plotly: For visualizing data and results.
- os: For file and directory operations.
- cv2 (OpenCV): For reading and manipulating images.
- tensorflow, torch: For building and training deep learning models.

Understanding and using these libraries will be crucial as each serves a specific purpose in the process of data handling, model building, and visualization.

# **2. Loading Images**
Using cv2 to load images from a directory structure is a common approach. This involves iterating through each subdirectory and file, loading each image, and possibly storing them in a list or an array for further processing. This step is crucial for building your dataset.
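
A minimal sketch of this step, assuming one subdirectory per class under a root folder. The names `list_image_paths`, `load_images`, `IMAGE_EXTENSIONS`, and the 128x128 target size are illustrative choices, not taken from the article.

```python
import os

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".bmp")  # assumed set of file types


def list_image_paths(root_dir):
    """Return sorted (class_name, file_path) pairs, one class per subdirectory."""
    pairs = []
    for class_name in sorted(os.listdir(root_dir)):
        class_dir = os.path.join(root_dir, class_name)
        if not os.path.isdir(class_dir):
            continue  # skip stray files at the top level
        for file_name in sorted(os.listdir(class_dir)):
            if file_name.lower().endswith(IMAGE_EXTENSIONS):
                pairs.append((class_name, os.path.join(class_dir, file_name)))
    return pairs


def load_images(root_dir, size=(128, 128)):
    """Read every image with OpenCV, convert BGR to RGB, and resize it."""
    import cv2  # imported here so the path helper above works without OpenCV

    images, labels = [], []
    for class_name, path in list_image_paths(root_dir):
        img = cv2.imread(path)  # returns None when the file cannot be decoded
        if img is None:
            continue
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        images.append(cv2.resize(img, size))  # size is (width, height)
        labels.append(class_name)
    return images, labels
```

Note that `cv2.imread` returns channels in BGR order, so the conversion to RGB matters if the images are later displayed with matplotlib or fed to a model pre-trained on RGB input.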

# **3. Counting Subdirectories and Images**
Understanding the structure of your dataset is vital. Counting directories and images helps you know how many categories (classes) you have and how many images are in each category, which is essential for understanding the balance of your dataset.
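
One way to get these counts, sketched with the standard library only. `count_images_per_class` is a hypothetical helper, and it assumes every file inside a class folder is an image.

```python
import os
from collections import Counter


def count_images_per_class(root_dir):
    """Map each class subdirectory of root_dir to the number of files in it."""
    counts = Counter()
    for entry in os.scandir(root_dir):
        if entry.is_dir():
            counts[entry.name] = sum(
                1 for item in os.scandir(entry.path) if item.is_file()
            )
    return counts
```

`len(counts)` then gives the number of classes and `sum(counts.values())` the total number of images.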

# **4. Analyzing Image Properties**
Investigating image properties such as size, resolution, and color distribution can provide insights into the variability within your dataset and any preprocessing steps you might need to standardize the images.
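
As a rough illustration, the sketch below collects width, height, and color mode per image with Pillow and summarizes them. The function names and the choice of summary statistics are assumptions made for this example.

```python
from collections import Counter
from statistics import mean


def read_image_properties(paths):
    """Return a (width, height, mode) tuple for each image path."""
    from PIL import Image  # imported here so the summary below runs without Pillow

    props = []
    for path in paths:
        with Image.open(path) as img:
            props.append((img.width, img.height, img.mode))
    return props


def summarize_properties(props):
    """Summarize size ranges, mean aspect ratio, and color modes."""
    widths = [w for w, _, _ in props]
    heights = [h for _, h, _ in props]
    return {
        "count": len(props),
        "width_range": (min(widths), max(widths)),
        "height_range": (min(heights), max(heights)),
        "mean_aspect_ratio": mean(w / h for w, h, _ in props),
        "modes": dict(Counter(m for _, _, m in props)),
    }
```

A wide spread in `width_range` or `height_range`, or more than one entry in `modes` (for example both `RGB` and `L`), suggests the images need resizing or color conversion before training.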

# **5. Visualizing Data Distribution**
Creating visualizations like bar charts and scatter plots helps in understanding the distribution of your data across different classes and the variations in image properties. This step is crucial for identifying imbalances or outliers that might affect model training.
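
A possible sketch of the class-distribution bar chart, assuming `counts` is a dictionary that maps class names to image counts. Both helper names are hypothetical, and the imbalance ratio is just one simple way to quantify skew.

```python
def imbalance_ratio(counts):
    """Largest class size divided by the smallest; 1.0 means perfectly balanced.

    Raises ZeroDivisionError if any class is empty.
    """
    return max(counts.values()) / min(counts.values())


def plot_class_distribution(counts, title="Images per class"):
    """Draw a bar chart of class sizes, largest class first."""
    import matplotlib.pyplot as plt  # imported here so the ratio helper has no dependencies

    names = sorted(counts, key=counts.get, reverse=True)
    fig, ax = plt.subplots(figsize=(10, 4))
    ax.bar(names, [counts[name] for name in names])
    ax.set_xlabel("Class")
    ax.set_ylabel("Number of images")
    ax.set_title(title)
    ax.tick_params(axis="x", labelrotation=90)
    fig.tight_layout()
    return fig
```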

# **6. Training a Basic CNN Model**
Here's where you dive into model building. A basic CNN usually consists of convolutional layers, pooling layers, and fully connected layers. The article describes the process of defining the architecture, compiling the model, and then training it on your image data.
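
The architecture described above could look roughly like the following Keras sketch. The layer sizes, the 128x128 input shape, and the name `build_basic_cnn` are assumptions for illustration, not the article's exact model; inputs are assumed to be scaled to [0, 1] and labels to be integer class indices.

```python
def build_basic_cnn(input_shape=(128, 128, 3), num_classes=10):
    """Build and compile a small CNN: two conv/pool stages plus a dense head."""
    from tensorflow.keras import layers, models

    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu"),   # convolutional layer
        layers.MaxPooling2D(),                      # pooling layer
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),       # fully connected layer
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # expects integer labels
        metrics=["accuracy"],
    )
    return model
```

Training is then a call such as `model.fit(x_train, y_train, epochs=10, validation_split=0.2)`, where `x_train`, `y_train`, and the epoch count are placeholders.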

# **7. Training a Transfer Learning Model**
Transfer learning involves using a pre-trained model and adapting it to your specific task. This approach is particularly useful when you have a small dataset or want to save time and computational resources. The article mentions adapting MobileNetV2, but there are many other models like ResNet, VGG, and Inception that you can explore.
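
A hedged sketch of the MobileNetV2 approach: the pre-trained base is frozen and only a new classification head is trained. The function name, input size, and head design are assumptions; the `weights` parameter is exposed so the model can also be built without downloading ImageNet weights.

```python
def build_transfer_model(num_classes, input_shape=(224, 224, 3), weights="imagenet"):
    """Stack a new softmax head on a frozen MobileNetV2 feature extractor."""
    import tensorflow as tf
    from tensorflow.keras.layers import GlobalAveragePooling2D, Dense
    from tensorflow.keras.models import Model

    base = tf.keras.applications.MobileNetV2(
        input_shape=input_shape, include_top=False, weights=weights
    )
    base.trainable = False  # keep the pre-trained features fixed

    x = GlobalAveragePooling2D()(base.output)
    outputs = Dense(num_classes, activation="softmax")(x)

    model = Model(inputs=base.input, outputs=outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # expects integer labels
        metrics=["accuracy"],
    )
    return model
```

MobileNetV2 expects pixel values scaled to [-1, 1], which `tf.keras.applications.mobilenet_v2.preprocess_input` handles.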

# **8. Evaluating the Model**
Once your model is trained, it's crucial to evaluate its performance using metrics like accuracy, precision, recall, and F1-score, and by plotting accuracy and loss curves. This helps in understanding how well the model is performing and whether it's overfitting or underfitting.
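
The metrics named above can be computed with scikit-learn, as in this sketch. Macro averaging is an assumption here (it weights every class equally); `evaluate_predictions` and `plot_history` are hypothetical helper names, and `plot_history` assumes a Keras-style history dictionary that includes validation keys.

```python
def evaluate_predictions(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    from sklearn.metrics import (
        accuracy_score, f1_score, precision_score, recall_score,
    )

    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred, average="macro", zero_division=0),
        "recall": recall_score(y_true, y_pred, average="macro", zero_division=0),
        "f1": f1_score(y_true, y_pred, average="macro", zero_division=0),
    }


def plot_history(history):
    """Plot training and validation accuracy/loss per epoch."""
    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(1, 2, figsize=(10, 4))
    for ax, key in zip(axes, ("accuracy", "loss")):
        ax.plot(history[key], label="train")
        ax.plot(history["val_" + key], label="validation")
        ax.set_xlabel("Epoch")
        ax.set_ylabel(key)
        ax.legend()
    fig.tight_layout()
    return fig
```

With a Keras model, `plot_history(model.fit(...).history)` gives the curves. Training accuracy that keeps rising while validation accuracy stalls or drops points to overfitting; both staying low points to underfitting.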

# **9. Making Predictions**
Finally, the article describes how to use your trained model to make predictions on new images. This involves preprocessing the image to fit the input requirements of your model, then feeding it through the model to get a prediction.
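
The prediction step might be sketched as follows, assuming the model was trained on RGB images resized to 128x128 and scaled to [0, 1]. Both helper names, the target size, and the scaling are assumptions that must match whatever preprocessing was used during training.

```python
def preprocess_image(image, size=(128, 128)):
    """Resize an HxWx3 uint8 RGB array, scale to [0, 1], and add a batch axis."""
    import cv2
    import numpy as np

    resized = cv2.resize(image, size)  # size is (width, height)
    return np.expand_dims(resized.astype("float32") / 255.0, axis=0)


def decode_prediction(probabilities, class_names):
    """Return the most likely class name and its probability."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return class_names[best], float(probabilities[best])
```

Typical usage would be `probs = model.predict(preprocess_image(img))[0]` followed by `label, confidence = decode_prediction(probs, class_names)`, where `class_names` is listed in the same order as the label indices used for training.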

In [27]:
# Import Python packages
import pandas as pd
import numpy as np
import seaborn as sns
import os
import zipfile
import cv2
import matplotlib.pyplot as plt
import plotly.express as px
import plotly.graph_objects as go
import tensorflow as tf
from tensorflow.keras import layers, models
from plotly.subplots import make_subplots
from tensorflow.keras.layers import GlobalAveragePooling2D, Dense
from tensorflow.keras.models import Model
from PIL import Image
import plotly.offline as pyo
from IPython.display import display
import torch
import torchvision
import torchvision.transforms as transforms
import torch.nn as nn
import torch.optim as optim
from sklearn.metrics import precision_score, recall_score, f1_score
from sklearn.metrics import confusion_matrix

In [28]:
# mounting the drive for the dataset
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [29]:
zip_path = '/content/drive/MyDrive/Datasets/archive (2).zip'

## Locate the Zip File

In [30]:
# Define the target directory where you want to extract the content
extract_to = '/content/mammals_dataset'

# Create the directory if it does not exist
os.makedirs(extract_to, exist_ok=True)

# Define the path to the zip file
zip_path = '/content/drive/MyDrive/Datasets/archive (2).zip'

# Extract the zip file
with zipfile.ZipFile(zip_path, 'r') as zip_ref:
    zip_ref.extractall(extract_to)

## Space Efficiency:

Keeping your data zipped until needed can save space in your Google Drive, which has limited free storage.

## Time Efficiency:

Uploading one large zip file can be faster and more reliable than uploading many small files, especially for a large dataset.

## Extraction Location:

Make sure you have enough space in your Colab environment to extract all files, as it offers limited and temporary storage.
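
A quick way to check this before extracting, sketched with the standard library. `uncompressed_size`, `has_room_for`, and the 10% safety margin are illustrative assumptions; the target directory must already exist.

```python
import shutil
import zipfile


def uncompressed_size(zip_path):
    """Total size in bytes of all archive members once extracted."""
    with zipfile.ZipFile(zip_path) as archive:
        return sum(info.file_size for info in archive.infolist())


def has_room_for(zip_path, target_dir, safety_factor=1.1):
    """Check that target_dir's disk has room for the extracted archive."""
    free_bytes = shutil.disk_usage(target_dir).free
    return uncompressed_size(zip_path) * safety_factor <= free_bytes
```

For example, `has_room_for(zip_path, '/content')` could be called before `extractall`.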

In [31]:
os.listdir(extract_to)

['mammals']

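In [32]:
# The archive extracts to a single top-level 'mammals' folder (see the listing above),
# so the per-class folders live one level below extract_to.
data_dir = os.path.join(extract_to, 'mammals')

# Loop through each mammal's folder and load images
for mammal in os.listdir(data_dir):
    mammal_folder = os.path.join(data_dir, mammal)
    if not os.path.isdir(mammal_folder):
        continue  # skip anything that is not a class folder
    for image_name in os.listdir(mammal_folder):
        image_path = os.path.join(mammal_folder, image_name)
        if os.path.isfile(image_path):
            img = Image.open(image_path)  # Or use any other library for image loading
            # Now you can use img in your processing...

In [33]:
# Last class folder visited by the loop above
mammal_folder

'/content/mammals_dataset/mammals/wildebeest'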