
In [None]:
import numpy as np
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.utils import to_categorical
import tensorflow as tf

# Load the CIFAR-10 dataset
(train_images, train_labels), (test_images, test_labels) = cifar10.load_data()

# Upsample the native 32x32 CIFAR-10 images to 64x64 (optional; a later cell resizes back to 32x32)
train_images_resized = tf.image.resize(train_images, (64, 64))
test_images_resized = tf.image.resize(test_images, (64, 64))

# Normalize images to the range [0, 1]
train_images_normalized = train_images_resized / 255.0
test_images_normalized = test_images_resized / 255.0

# Convert class vectors to binary class matrices (one-hot encoding)
train_labels_one_hot = to_categorical(train_labels, num_classes=10)
test_labels_one_hot = to_categorical(test_labels, num_classes=10)

# Save preprocessed data locally
np.save('train_images_normalized.npy', train_images_normalized)
np.save('test_images_normalized.npy', test_images_normalized)
np.save('train_labels_one_hot.npy', train_labels_one_hot)
np.save('test_labels_one_hot.npy', test_labels_one_hot)


Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz


In [None]:
%pip install numpy opencv-python pillow




In [None]:
import numpy as np
import os
import cv2  # OpenCV for resizing

# Function to resize images
def resize_images(images, new_size):
    resized_images = []
    for img in images:
        resized_img = cv2.resize(img, new_size, interpolation=cv2.INTER_AREA)
        resized_images.append(resized_img)
    return np.array(resized_images)

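# Function to save images as files (optional)
def save_images(images, labels, directory, prefix):
    os.makedirs(directory, exist_ok=True)
    for i, img in enumerate(images):
        filename = f"{prefix}_{i}.png"
        filepath = os.path.join(directory, filename)
        # The arrays hold RGB floats in [0, 1], while cv2.imwrite expects
        # 8-bit BGR data: rescale to [0, 255] and swap the channel order,
        # otherwise the saved files come out almost black.
        img_uint8 = np.clip(img * 255.0, 0, 255).round().astype(np.uint8)
        cv2.imwrite(filepath, cv2.cvtColor(img_uint8, cv2.COLOR_RGB2BGR))
        if labels is not None:
            label = labels[i]
            label_filepath = os.path.join(directory, f"{prefix}_label_{i}.txt")
            with open(label_filepath, 'w') as f:
                f.write(str(label))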

# Load the .npy files
train_images = np.load('train_images_normalized.npy')
test_images = np.load('test_images_normalized.npy')
train_labels = np.load('train_labels_one_hot.npy')
test_labels = np.load('test_labels_one_hot.npy')

# Define the new size (width, height)
new_size = (32, 32)

# Resize the images
train_images_resized = resize_images(train_images, new_size)
test_images_resized = resize_images(test_images, new_size)

# Save the resized images back to .npy files
np.save('train_images_resized.npy', train_images_resized)
np.save('test_images_resized.npy', test_images_resized)

# Optionally save the resized images to disk as image files
save_images(train_images_resized, train_labels, 'resized_train_images', 'train')
save_images(test_images_resized, test_labels, 'resized_test_images', 'test')

print("Resizing completed and files saved.")


Resizing completed and files saved.


In [None]:
import numpy as np

# Load resized test images and labels
test_images_resized = np.load('test_images_resized.npy')
test_labels = np.load('test_labels_one_hot.npy')


In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten

# Example model architecture (replace with my actual model architecture)
model = Sequential([
    Flatten(input_shape=(32, 32, 3)),  # Assuming resized images are 32x32 pixels with 3 channels
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')   # Assuming 10 classes for classification
])

# Compile the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Train the model (example, replace with my actual training code)
model.fit(train_images_resized, train_labels_one_hot, epochs=10, batch_size=32, validation_data=(test_images_resized, test_labels))


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.src.callbacks.History at 0x7a4dea9a3a60>

In [None]:
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Assuming my model is already trained, make predictions
predicted_labels = model.predict(test_images_resized)

# Convert one-hot encoded labels back to categorical labels if needed
true_labels = np.argmax(test_labels, axis=1)
predicted_labels = np.argmax(predicted_labels, axis=1)

# Compute evaluation metrics
accuracy = accuracy_score(true_labels, predicted_labels)
precision = precision_score(true_labels, predicted_labels, average='weighted')
recall = recall_score(true_labels, predicted_labels, average='weighted')
f1 = f1_score(true_labels, predicted_labels, average='weighted')

# Print or use these metrics as needed
print(f"Accuracy: {accuracy}")
print(f"Precision: {precision}")
print(f"Recall: {recall}")
print(f"F1 Score: {f1}")


Accuracy: 0.4189
Precision: 0.40989567381337483
Recall: 0.4189
F1 Score: 0.39758468165670474


1. Load and Prepare Data

First, I load the resized test images (test_images_resized.npy) and their corresponding labels (test_labels_one_hot.npy), as in the loading cell above.
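Before going further, a quick sanity check on the loaded arrays can catch shape or range mistakes early. This is a minimal sketch: check_dataset is a hypothetical helper (not part of the notebook), and the expected layout of 32x32x3 images in [0, 1] with 10 one-hot classes is an assumption taken from the preprocessing cells. The demo arrays are stand-ins for the arrays returned by np.load.

```python
import numpy as np

def check_dataset(images, labels, image_shape=(32, 32, 3), num_classes=10):
    """Raise ValueError if the arrays do not match the assumed layout."""
    images = np.asarray(images)
    labels = np.asarray(labels)
    if images.shape[1:] != tuple(image_shape):
        raise ValueError(f"expected images of shape {image_shape}, got {images.shape[1:]}")
    if labels.shape != (images.shape[0], num_classes):
        raise ValueError(f"labels have shape {labels.shape}, which does not match the images")
    if images.min() < 0.0 or images.max() > 1.0:
        raise ValueError("pixel values fall outside [0, 1]")
    if not np.allclose(labels.sum(axis=1), 1.0):
        raise ValueError("labels do not look one-hot encoded")
    return images.shape[0]  # number of samples

# Stand-in arrays with the assumed layout (hypothetical demo data)
demo_images = np.random.default_rng(0).random((5, 32, 32, 3))
demo_labels = np.eye(10)[[3, 1, 4, 1, 5]]
print(check_dataset(demo_images, demo_labels))
```

In the notebook, the same call would take test_images_resized and test_labels instead of the demo arrays.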


2. Train the Machine Learning Model

Before making predictions, I need a trained model. The training cell above is a hypothetical example that uses TensorFlow/Keras.

3. Make Predictions and Evaluate

After training the model, I make predictions on the test data (test_images_resized) and compute evaluation metrics, as in the evaluation cell above.

Explanation:
Model Definition and Training: I define a simple neural network model using TensorFlow/Keras, compile it with an optimizer and loss function, and then train it on my resized training data (train_images_resized and train_labels_one_hot).

Prediction: After training, the model.predict method is used to generate predictions on my resized test images (test_images_resized).

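Evaluation Metrics: Using scikit-learn's metrics (accuracy_score, precision_score, recall_score, f1_score), I compute evaluation metrics by comparing the predicted labels (predicted_labels) with the true labels (true_labels). Precision, recall, and F1 use average='weighted', which weights each class's score by its number of true samples.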
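The weighted averaging can be sketched in plain NumPy. This is a minimal sketch, not scikit-learn's implementation: weighted_metrics is a hypothetical helper name, and scoring a class with no predictions as 0 is an assumption. It also shows why the accuracy and weighted recall reported above are identical (0.4189): support-weighted recall reduces to correct predictions divided by total samples.

```python
import numpy as np

def weighted_metrics(true_labels, predicted_labels, num_classes):
    """Support-weighted precision, recall, and F1 from a confusion matrix."""
    true_labels = np.asarray(true_labels)
    predicted_labels = np.asarray(predicted_labels)

    # Confusion matrix: rows are true classes, columns are predicted classes
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(cm, (true_labels, predicted_labels), 1)

    tp = np.diag(cm).astype(float)
    support = cm.sum(axis=1).astype(float)          # true samples per class
    predicted_count = cm.sum(axis=0).astype(float)  # predictions per class

    # Per-class scores; undefined ratios are set to 0 (assumption)
    precision = np.divide(tp, predicted_count, out=np.zeros_like(tp), where=predicted_count > 0)
    recall = np.divide(tp, support, out=np.zeros_like(tp), where=support > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)

    weights = support / support.sum()
    return {
        "accuracy": float(tp.sum() / cm.sum()),
        "precision": float(weights @ precision),
        "recall": float(weights @ recall),
        "f1": float(weights @ f1),
    }

# Tiny hypothetical example with 3 classes
metrics = weighted_metrics([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], num_classes=3)
print(metrics)
```

In this example the weighted recall and the accuracy come out equal, just as in the notebook's output.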

I should replace the placeholder names (train_images_resized, train_labels_one_hot, test_images_resized, test_labels_one_hot) with my actual data variables to match my dataset's structure, and adjust the model architecture and training parameters to suit my task and dataset. This structure keeps model training, prediction, and evaluation together in one workflow.

Explanation of the resizing script:

Import Libraries:
numpy for handling array operations.
os for creating the output directory and building file paths.
cv2 (OpenCV) for resizing images.

Define a Function to Resize Images:
The resize_images function takes an array of images and a new size (width, height) as inputs and returns the resized images.

Define a Function to Save Images (optional):
The save_images function saves the resized images as actual image files in a specified directory, along with their labels.

Load .npy Files:
Load my .npy files containing the image data using np.load.

Define the New Size:
Set the target size for the images (e.g., 32x32 pixels).

Resize the Images:
Use the cv2.resize function to resize each image to the new size.

Save the Resized Images:
Save the resized image arrays back to .npy files using np.save.

Optional: Save Images as Files:
If I want to save the resized images as actual image files, I use the save_images function.

Notes:
Ensure the dimensions of the input images are compatible with the cv2.resize function. If the images are grayscale or have a different number of channels, I might need to adjust the script accordingly.
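One way to handle differing channel counts is to normalize every batch to three channels before resizing. This is a minimal sketch under my own assumptions: ensure_three_channels is a hypothetical helper, grayscale is expanded by repeating the single channel, and a fourth channel is treated as alpha and dropped.

```python
import numpy as np

def ensure_three_channels(images):
    """Return a batch shaped (N, H, W, 3) from grayscale, RGB, or RGBA input."""
    images = np.asarray(images)
    if images.ndim == 3:
        # (N, H, W): grayscale without an explicit channel axis
        images = images[..., np.newaxis]
    if images.ndim != 4:
        raise ValueError(f"expected a 3-D or 4-D batch, got {images.ndim} dimensions")

    channels = images.shape[-1]
    if channels == 3:
        return images
    if channels == 1:
        # Repeat the gray channel so the batch matches input_shape=(32, 32, 3)
        return np.repeat(images, 3, axis=-1)
    if channels == 4:
        # Assumption: the fourth channel is alpha and can be discarded
        return images[..., :3]
    raise ValueError(f"unsupported channel count: {channels}")

# Hypothetical grayscale batch of two 8x8 images
gray_batch = np.zeros((2, 8, 8), dtype=np.float32)
print(ensure_three_channels(gray_batch).shape)
```

Calling this before resize_images keeps the array shape consistent with the Flatten layer's expected input.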

The cv2.INTER_AREA interpolation method is generally good for shrinking images. I can experiment with other interpolation methods provided by OpenCV (e.g., cv2.INTER_LINEAR, cv2.INTER_CUBIC) to see which one works best for my use case.
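To build intuition for why area-based interpolation suits shrinking, here is a NumPy-only sketch of block averaging for an integer downscale factor. It illustrates the idea only and is not OpenCV's implementation; block_average_downscale and its factor parameter are hypothetical names of my own.

```python
import numpy as np

def block_average_downscale(image, factor):
    """Shrink an (H, W, C) image by averaging non-overlapping factor x factor blocks."""
    image = np.asarray(image, dtype=np.float32)
    height, width, channels = image.shape
    if height % factor or width % factor:
        raise ValueError("height and width must be divisible by factor")
    # Split each spatial axis into (blocks, pixels-per-block), then average within blocks
    blocks = image.reshape(height // factor, factor, width // factor, factor, channels)
    return blocks.mean(axis=(1, 3))

# Hypothetical 64x64 RGB image shrunk to 32x32
image = np.random.default_rng(0).random((64, 64, 3), dtype=np.float32)
small = block_average_downscale(image, 2)
print(small.shape)
```

Because every output pixel averages all the input pixels it covers, no input pixel is skipped, which is what reduces aliasing compared with sampling only a few neighbors.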

The optional step of saving images as files is useful if I need to visually inspect the resized images or use them in a different format.

By following these steps, I can resize image datasets stored in .npy files and prepare them for further processing or model training.






