<a href="https://colab.research.google.com/github/SoroushJamali/-Python-script-for-image-classification-using-the-Microsoft-COCO-dataset/blob/main/Python_script_for_image_classification_using_the_Microsoft_COCO_dataset.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This script downloads the Microsoft COCO dataset, extracts it, and trains a convolutional neural network to classify images into the specified animal categories. After training the model, you can use it to make predictions on new animal images.

Note that you'll need to replace path/to/image.jpg with the actual path to your image file. Also, you may need to adjust the model architecture and hyperparameters depending on the size and complexity of your dataset.

----------------------------------

**import tensorflow as tf**: TensorFlow is a popular open-source library for building and training machine learning models, particularly deep learning models. It provides various tools and modules for constructing and training neural networks, and can be used for a wide range of applications, such as image and speech recognition, natural language processing, and more.

**import numpy as np**: NumPy is a popular Python library for numerical computing. It provides tools for working with arrays, matrices, and other numerical data structures, and can be used for a wide range of mathematical computations.

**import os**: The os module provides a way to interact with the operating system in Python. It provides various tools for working with files and directories, and can be used to perform operations such as creating, deleting, renaming, or moving files.

**import cv2**: OpenCV is an open-source computer vision library that provides tools for image and video processing. The cv2 module provides an interface to the OpenCV library in Python, and can be used for various tasks such as image filtering, object detection, and more.

**from tensorflow import keras**: Keras is a high-level API for building and training machine learning models. It provides a simplified interface for constructing neural networks. Early standalone releases of Keras could run on several backends, such as TensorFlow, Theano, and CNTK. In TensorFlow 2.0 and later versions, Keras is included as a part of the TensorFlow package.

**from tensorflow.keras import layers**: The layers module provides various types of layers that can be used to build neural networks, such as convolutional layers, pooling layers, dense layers, and more.

**import wget**: wget is a Python module that provides a way to download files from the internet. It provides a simple and convenient interface for downloading files, and can be used for various tasks such as downloading datasets or other resources for machine learning.

**import tarfile**: The tarfile module provides tools for working with tar archives in Python. It can be used to extract files from tar archives, create new tar archives, and more. Note that the COCO download used here is a ZIP archive rather than a tar archive, and tarfile cannot read ZIP files; the standard-library zipfile module is the one that handles them.

In [2]:
!pip install wget

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting wget
  Downloading wget-3.2.zip (10 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: wget
  Building wheel for wget (setup.py) ... done
  Created wheel for wget: filename=wget-3.2-py3-none-any.whl size=9674 sha256=9fabcc991edc9f6734733c38629792762b478370f9ee018604432f85475993fd
  Stored in directory: /root/.cache/pip/wheels/04/5f/3e/46cc37c5d698415694d83f607f833f83f0149e49b3af9d0f38
Successfully built wget
Installing collected packages: wget
Successfully installed wget-3.2


In [3]:
import tensorflow as tf
import numpy as np
import os
import cv2
from tensorflow import keras
from tensorflow.keras import layers
import wget
import tarfile

These lines download and extract the Microsoft COCO training images, which are distributed as a ZIP archive (train2017.zip). The wget module downloads the archive from the specified URL. Because the file is a ZIP archive, it has to be extracted with the standard-library zipfile module; tarfile.open raises an error on ZIP files. The os module is used to remove the archive after it has been extracted.
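
As a minimal sketch of the point about archive formats, the helper below checks what kind of archive a file is before extracting it. The names archive_kind and extract_archive are hypothetical helpers introduced for illustration, and the demo uses a tiny ZIP file created on the fly rather than the COCO download.

```python
import os
import tarfile
import tempfile
import zipfile


def archive_kind(path):
    """Return 'zip', 'tar', or 'unknown' for the file at path."""
    if zipfile.is_zipfile(path):
        return "zip"
    if tarfile.is_tarfile(path):
        return "tar"
    return "unknown"


def extract_archive(path, dest="."):
    """Extract a ZIP or tar archive into dest and return the detected kind."""
    kind = archive_kind(path)
    if kind == "zip":
        with zipfile.ZipFile(path) as archive:
            archive.extractall(dest)
    elif kind == "tar":
        with tarfile.open(path) as archive:
            archive.extractall(dest)
    else:
        raise ValueError(f"{path} is neither a ZIP nor a tar archive")
    return kind


# Demo on a throwaway ZIP file (a stand-in, not the real dataset)
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = os.path.join(tmp, "demo.zip")
    with zipfile.ZipFile(demo_zip, "w") as zf:
        zf.writestr("train_demo/000000000001.txt", "placeholder")
    kind = extract_archive(demo_zip, tmp)
    extracted = os.path.exists(
        os.path.join(tmp, "train_demo", "000000000001.txt")
    )
```

Checking the format first avoids a confusing read error when a ZIP file is handed to tarfile.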

In [6]:
# Download and extract the dataset
import zipfile  # the archive is a ZIP file, which tarfile cannot read

dataset_url = 'http://images.cocodataset.org/zips/train2017.zip'
dataset_file = 'train2017.zip'
wget.download(dataset_url, dataset_file, bar=wget.bar_adaptive)
print('Download complete.')
# Extract into the current directory
with zipfile.ZipFile(dataset_file, 'r') as archive:
    archive.extractall()
os.remove(dataset_file)  # delete the archive to free disk space
print('Extraction complete.')


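These lines define the list of animal classes used for classification. In this example, the classes are 'person_riding_horse', 'elephant', 'bear', 'zebra', and 'giraffe'. You can customize this list, but every name must match a subdirectory of the dataset folder. Note that 'elephant', 'bear', 'zebra', and 'giraffe' are COCO object categories, whereas 'person_riding_horse' is not: COCO annotates 'person' and 'horse' as separate categories.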

In [None]:
# Define the classes of animals
class_names = ['person_riding_horse', 'elephant', 'bear', 'zebra', 'giraffe']
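
COCO stores its labels in a separate annotation JSON file rather than in folder names, so a class list like this has to be matched against the annotations at some point. The sketch below shows one way that lookup could work, assuming the usual COCO layout with "images", "annotations", and "categories" keys. The function name group_files_by_category is hypothetical, and the toy dictionary is hand-written stand-in data with made-up ids, not real COCO content.

```python
from collections import defaultdict


def group_files_by_category(coco, wanted_names):
    """Map each wanted category name to the sorted image files containing it."""
    id_to_name = {
        cat["id"]: cat["name"]
        for cat in coco["categories"]
        if cat["name"] in wanted_names
    }
    id_to_file = {img["id"]: img["file_name"] for img in coco["images"]}
    grouped = defaultdict(set)
    for ann in coco["annotations"]:
        name = id_to_name.get(ann["category_id"])
        if name is not None:  # skip categories we did not ask for
            grouped[name].add(id_to_file[ann["image_id"]])
    return {name: sorted(files) for name, files in grouped.items()}


# Hand-written stand-in for an annotation file (ids are made up)
toy = {
    "images": [
        {"id": 1, "file_name": "a.jpg"},
        {"id": 2, "file_name": "b.jpg"},
        {"id": 3, "file_name": "c.jpg"},
    ],
    "categories": [
        {"id": 10, "name": "zebra"},
        {"id": 11, "name": "giraffe"},
        {"id": 12, "name": "person"},
    ],
    "annotations": [
        {"image_id": 1, "category_id": 10},
        {"image_id": 2, "category_id": 10},
        {"image_id": 2, "category_id": 11},
        {"image_id": 3, "category_id": 12},
    ],
}

grouped = group_files_by_category(toy, {"zebra", "giraffe"})
```

An image that contains several wanted categories (b.jpg here) appears in more than one group, so a single-label classifier needs a rule for which label such an image receives.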

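These lines load the dataset from the directory where it was extracted. The keras.preprocessing.image_dataset_from_directory function is called twice to create two datasets: one for training and one for validation. The batch_size parameter determines the number of images processed at a time, the img_height and img_width parameters set the size to which every image is resized, and the validation_split parameter specifies the fraction of the data held out for validation. Both calls must use the same seed and validation_split so that the two subsets do not overlap. With labels="inferred", the function expects one subdirectory per class inside the dataset folder. The COCO train2017 archive extracts to a single flat folder of images with labels kept in a separate annotations file, so the images must first be sorted into one subdirectory per class name.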

In [None]:
# Load the dataset
batch_size = 32
img_height = 224
img_width = 224
train_ds = keras.preprocessing.image_dataset_from_directory(
    "train2017",
    labels="inferred",
    label_mode="int",
    class_names=class_names,
    color_mode="rgb",
    batch_size=batch_size,
    image_size=(img_height, img_width),
    shuffle=True,
    seed=123,
    validation_split=0.2,
    subset="training"
)

In [None]:
val_ds = keras.preprocessing.image_dataset_from_directory(
    "train2017",
    labels="inferred",
    label_mode="int",
    class_names=class_names,
    color_mode="rgb",
    batch_size=batch_size,
    image_size=(img_height, img_width),
    shuffle=True,
    seed=123,
    validation_split=0.2,
    subset="validation"
)
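
The reason both calls share seed=123 and validation_split=0.2 can be illustrated with a small pure-Python sketch. This is a simplified illustration of a seeded split, not Keras's actual implementation, and split_files is a hypothetical helper.

```python
import random


def split_files(files, validation_split, seed):
    """Shuffle deterministically, then hold out the last fraction for validation."""
    shuffled = sorted(files)
    random.Random(seed).shuffle(shuffled)  # same seed gives the same order
    n_val = int(validation_split * len(shuffled))
    n_train = len(shuffled) - n_val
    return shuffled[:n_train], shuffled[n_train:]


files = [f"img_{i:03d}.jpg" for i in range(10)]

# Two independent calls with the same seed agree on the split
train_a, val_a = split_files(files, 0.2, seed=123)
train_b, val_b = split_files(files, 0.2, seed=123)
```

If the two calls used different seeds, each would shuffle the files differently, and some images could end up in both the training and the validation subset.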

These lines define the architecture of the convolutional neural network (CNN) used for image classification. The keras.Sequential class is used to stack the layers in order: a rescaling layer, three convolutional layers, three max pooling layers, a flatten layer, and two dense layers.

model = keras.Sequential([...]): This creates a Sequential model, which is a linear stack of layers. The layers are defined in the list passed to the Sequential constructor.

layers.experimental.preprocessing.Rescaling(1./255): This rescales the input pixel values from the range 0 to 255 to the range 0 to 1, which makes it easier for the model to learn from the data. In TensorFlow 2.6 and later, the same layer is available as layers.Rescaling.

layers.Conv2D(32, 3, activation='relu'): This adds a convolutional layer with 32 filters and a 3x3 kernel size, with ReLU activation.

layers.MaxPooling2D(): This adds a max pooling layer to downsample the output of the previous layer.

The next four lines add two more convolutional layers, with 32 and 64 filters respectively and a 3x3 kernel each, and each is followed by a max pooling layer.

layers.Flatten(): This flattens the output of the previous layer into a 1D array, which is then passed to a fully connected layer.

layers.Dense(128, activation='relu'): This adds a fully connected layer with 128 units and ReLU activation.

layers.Dense(len(class_names)): This adds the output layer with the number of units equal to the number of classes, which is determined by the length of the class_names list.

model.compile(...): This compiles the model, specifying the optimizer, loss function, and metrics to be used during training. In this case, the Adam optimizer is used with Sparse Categorical Crossentropy loss and accuracy as the evaluation metric. The from_logits=True parameter tells the loss that the model outputs raw, unnormalized scores (logits), because the final Dense layer has no activation. The loss function then applies softmax internally when it computes the cross-entropy, so the model itself does not need a softmax layer during training.

In [None]:
# Define the model architecture
model = keras.Sequential([
    # layers.Rescaling (TensorFlow 2.6+) replaces layers.experimental.preprocessing.Rescaling
    layers.Rescaling(1./255),
    layers.Conv2D(32, 3, activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation='relu'),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(len(class_names))
])
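
To see how the layer stack transforms a 224x224 RGB input, the shapes and parameter counts can be worked out by hand. The sketch below does that arithmetic in plain Python, assuming the Keras defaults for these layers (Conv2D with 'valid' padding and stride 1, MaxPooling2D with a 2x2 window) and five output classes. The helper names are hypothetical.

```python
def conv_out(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Spatial size after non-overlapping max pooling."""
    return size // pool


def conv_params(kernel, in_channels, out_channels):
    """Weights plus one bias per filter."""
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(n_in, n_out):
    """Weights plus one bias per unit."""
    return n_in * n_out + n_out


size, channels, total = 224, 3, 0
for filters in (32, 32, 64):  # the three Conv2D layers
    total += conv_params(3, channels, filters)
    size = pool_out(conv_out(size))  # conv, then max pooling
    channels = filters

flat = size * size * channels  # length of the Flatten output
total += dense_params(flat, 128) + dense_params(128, 5)

print(size, flat, total)
```

Nearly all of the parameters sit in the first Dense layer, because Flatten produces a long vector. That layer is the first place to look if the model is too large or overfits.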

In [None]:
# Compile the model
model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'])
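
What from_logits=True means can be sketched in plain Python: the loss converts the raw scores into probabilities with softmax and then takes the negative log of the probability assigned to the true class. This is a per-example illustration under that definition, not TensorFlow's implementation, which works on batches and uses a more numerically careful formulation.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_categorical_crossentropy(label, logits):
    """Negative log-probability of the integer class label."""
    return -math.log(softmax(logits)[label])


logits = [2.0, 0.5, -1.0, 0.0, 0.0]  # made-up scores for five classes
probs = softmax(logits)
loss_right = sparse_categorical_crossentropy(0, logits)  # true class scored highest
loss_wrong = sparse_categorical_crossentropy(2, logits)  # true class scored lowest
uniform_loss = sparse_categorical_crossentropy(3, [0.0] * 5)
```

With five classes and no information (all logits equal), the loss is ln(5), about 1.61. A training loss near that value early on means the model is still guessing uniformly.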

In [None]:
# Train the model
epochs = 10
model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=epochs
)

In [None]:
# Make predictions on new data
img_path = "path/to/image.jpg"
img = keras.preprocessing.image.load_img(
    img_path, target_size=(img_height, img_width)
)
img_array = keras.preprocessing.image.img_to_array(img)
img_array = tf.expand_dims(img_array, 0) # Create a batch

predictions = model.predict(img_array)
score = tf.nn.softmax(predictions[0])

print("This image most likely belongs to {} with a {:.2f} percent confidence."
      .format(class_names[np.argmax(score)], 100 * np.max(score)))