# Machine Vision Fundamentals and Applications

## Introduction

Machine vision, the applied side of computer vision, enables machines to interpret and act on visual information from the world, much as human vision does. It combines cameras, image sensors (e.g., CCD or CMOS), lighting systems, and computational algorithms to process and analyze images or video. Key technologies include:
- **Cameras and Sensors**: Capture high-quality images using CCD or CMOS sensors.
- **Lighting**: Ensures consistent image quality for accurate analysis.
- **Software Libraries**: Tools like OpenCV and TensorFlow enable image processing and machine learning.

Applications span multiple industries:
- **Manufacturing**: Quality control, defect detection, and assembly verification.
- **Healthcare**: Medical imaging analysis, such as detecting anomalies in X-rays.
- **Automotive**: Part inspection, paint quality checks, and autonomous vehicle features like object detection.

This notebook demonstrates these concepts through practical Python implementations, using the CIFAR-10 dataset for image classification.

## Data Acquisition and Preprocessing

We'll use the **CIFAR-10 dataset**, which contains 60,000 32x32 color images (50,000 for training and 10,000 for testing) across 10 classes (e.g., airplanes, automobiles, birds). Its object-recognition task is a small-scale stand-in for the recognition problems found in automotive and manufacturing applications. We'll load it using Keras and apply the following preprocessing techniques:
- **Normalization**: Scale pixel values to [0, 1] for better model convergence.
- **One-hot Encoding**: Convert labels to categorical format for classification.
- **Data Augmentation**: Apply random transformations such as rotation, shifting, and horizontal flipping to increase dataset variety and reduce overfitting.
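
To make the steps in the list above concrete, here is a minimal NumPy sketch on a tiny random batch. The arrays `images` and `labels` are hypothetical stand-ins for CIFAR-10 data, and only a horizontal flip is shown for augmentation; this is an illustration of the arithmetic, not what Keras does internally.

```python
import numpy as np

# Hypothetical stand-in for a CIFAR-10 batch: 4 random 32x32 RGB images (uint8)
rng = np.random.default_rng(seed=0)
images = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)
labels = np.array([3, 0, 9, 3])  # integer class ids in [0, 10)

# Normalization: map uint8 values in [0, 255] to float32 values in [0, 1]
images_norm = images.astype("float32") / 255.0

# One-hot encoding: row i has a single 1 in column labels[i]
num_classes = 10
one_hot = np.eye(num_classes, dtype="float32")[labels]

# Horizontal flip: reverse the width axis (axis 2 in batch-height-width-channel layout)
flipped = images_norm[:, :, ::-1, :]

print(images_norm.dtype, one_hot.shape, flipped.shape)
```

Taking `np.argmax(one_hot, axis=1)` recovers the original integer labels, which is the same step used later when predictions are converted back to class ids.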

In [None]:
# Import necessary libraries
import tensorflow as tf
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import numpy as np
import matplotlib.pyplot as plt

# Load the CIFAR-10 dataset
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

# Normalize pixel values to [0, 1]
X_train = X_train.astype('float32') / 255.0
X_test = X_test.astype('float32') / 255.0

# One-hot encode the labels
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

# Display a sample image
plt.imshow(X_train[0])
plt.title('Sample Image from CIFAR-10')
plt.axis('off')
plt.show()

# Data augmentation
datagen = ImageDataGenerator(
    rotation_range=15,
    width_shift_range=0.1,
    height_shift_range=0.1,
    horizontal_flip=True
)
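# datagen.fit() is only required for featurewise_center, featurewise_std_normalization,
# or zca_whitening; none of these are enabled here, so the call is omitted.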

print('Data loaded and preprocessed successfully.')

## Implementing Key Machine Vision Techniques

### Basic Image Processing with OpenCV

We'll perform basic image processing operations using OpenCV, including:
- **Grayscale Conversion**: Simplify the image to a single channel.
- **Gaussian Blur**: Reduce noise for better edge detection.
- **Canny Edge Detection**: Identify edges, useful for object boundary detection in manufacturing or automotive inspections.
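
As a rough illustration of the first two operations in the list above, the sketch below computes a luma-weighted grayscale value and builds a normalized Gaussian kernel in plain Python. The weights 0.299, 0.587, and 0.114 are the ones OpenCV documents for `COLOR_RGB2GRAY`. The helper names, the choice `sigma=1.0`, and the edge-replication border handling are assumptions made for this example; OpenCV selects its own sigma and border mode by default.

```python
import math

def rgb_to_gray(r, g, b):
    """Luma-weighted grayscale value for one RGB pixel."""
    return 0.299 * r + 0.587 * g + 0.114 * b

def gaussian_kernel_1d(size, sigma):
    """Return a normalized 1D Gaussian kernel of odd length `size`."""
    half = size // 2
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma)) for x in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_1d(signal, kernel):
    """Convolve a 1D signal with the kernel, replicating edge values at the borders."""
    half = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - half, 0), n - 1)  # clamp index (edge replication)
            acc += w * signal[j]
        out.append(acc)
    return out

kernel = gaussian_kernel_1d(size=5, sigma=1.0)  # sigma=1.0 is an illustrative choice
noisy_row = [10, 10, 200, 10, 10, 10, 10]       # a single-pixel spike standing in for noise
smoothed = blur_1d(noisy_row, kernel)

print([round(w, 4) for w in kernel])
print([round(v, 1) for v in smoothed])
```

A 2D Gaussian blur is separable, so applying this 1D pass along rows and then along columns gives the same result as a full 5x5 kernel. The spike is spread over its neighbours instead of producing a sharp, spurious edge.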

In [None]:
# Install OpenCV if not already installed (run once)
# !pip install opencv-python

import cv2

# Take a sample image from the dataset (denormalize for OpenCV)
sample_img = (X_train[0] * 255).astype(np.uint8)

# Convert to grayscale
gray = cv2.cvtColor(sample_img, cv2.COLOR_RGB2GRAY)

# Apply Gaussian blur filter
blurred = cv2.GaussianBlur(gray, (5, 5), 0)

# Edge detection with Canny
edges = cv2.Canny(blurred, 50, 150)

# Display results
fig, axs = plt.subplots(1, 3, figsize=(10, 4))
axs[0].imshow(sample_img)  # Keras loads CIFAR-10 as RGB, so no BGR-to-RGB swap is needed
axs[0].set_title('Original')
axs[0].axis('off')
axs[1].imshow(blurred, cmap='gray')
axs[1].set_title('Blurred')
axs[1].axis('off')
axs[2].imshow(edges, cmap='gray')
axs[2].set_title('Edges')
axs[2].axis('off')
plt.show()
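
The two thresholds passed to `cv2.Canny` above (50 and 150) drive its final hysteresis stage. The sketch below shows only that stage, in plain Python, on a hypothetical grid of gradient magnitudes. The function name and the input values are assumptions for illustration, and OpenCV's exact comparison conventions and earlier stages (gradient computation, non-maximum suppression) are not reproduced.

```python
from collections import deque

def hysteresis_threshold(magnitude, low, high):
    """Keep strong pixels (>= high) and weak pixels (>= low) connected to them.

    `magnitude` is a 2D list of gradient magnitudes; returns a 2D list of 0/1.
    """
    rows, cols = len(magnitude), len(magnitude[0])
    edges = [[0] * cols for _ in range(rows)]
    queue = deque()

    # Seed the search with every strong pixel
    for r in range(rows):
        for c in range(cols):
            if magnitude[r][c] >= high:
                edges[r][c] = 1
                queue.append((r, c))

    # Grow edges into 8-connected neighbours that are at least weak
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                if inside and not edges[nr][nc] and magnitude[nr][nc] >= low:
                    edges[nr][nc] = 1
                    queue.append((nr, nc))
    return edges

# Hypothetical gradient magnitudes
gradient = [
    [0, 60, 160, 60, 0],
    [0, 0, 0, 0, 0],
    [0, 70, 80, 0, 40],
]
edge_map = hysteresis_threshold(gradient, low=50, high=150)
for row in edge_map:
    print(row)
```

In this toy grid the two weak pixels next to the strong one are kept, while the weak pixels in the bottom row are discarded because nothing connects them to a strong pixel. This is why isolated texture or noise tends to drop out of the edge map.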

### Image Classification with Pre-trained CNN

We'll use a pre-trained **VGG16** model as a fixed feature extractor and train a new classification head on CIFAR-10. This demonstrates transfer learning, a key machine vision technique for efficient model development. The convolutional base is frozen to reduce training time, and dense layers are added on top for classification. Because no VGG16 weights are updated, this is feature extraction; fine-tuning would additionally unfreeze some of the base layers.

In [None]:
from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dense, Dropout
from tensorflow.keras.optimizers import Adam

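# Load pre-trained VGG16 model (without top layers); 32x32 is the smallest input size it accepts
# Note: the ImageNet weights were trained on inputs prepared with
# tensorflow.keras.applications.vgg16.preprocess_input, not on [0, 1]-scaled pixels as used here
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(32, 32, 3))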

# Freeze base model layers to prevent retraining
base_model.trainable = False

# Build the model
model = Sequential([
    base_model,
    Flatten(),
    Dense(256, activation='relu'),
    Dropout(0.5),
    Dense(10, activation='softmax')
])

# Compile the model
model.compile(optimizer=Adam(learning_rate=0.001), loss='categorical_crossentropy', metrics=['accuracy'])

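# Train the model using augmented data
# Note: the test set doubles as validation data here for simplicity; hold out a
# separate validation split if the test score should remain an unbiased estimate
history = model.fit(datagen.flow(X_train, y_train, batch_size=64),
                    epochs=10,
                    validation_data=(X_test, y_test))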

# Plot training history
plt.figure(figsize=(12, 4))
plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'], label='Train Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.title('Model Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='Train Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Model Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()

print('Model training completed.')
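
To see why freezing the base keeps training cheap, the following back-of-the-envelope sketch counts parameters for the architecture in the cell above. It assumes the standard VGG16 layout (five blocks of 3x3 convolutions, each followed by 2x2 max pooling) and the 256-unit head defined in this notebook; the helper functions are written for this example and are not part of Keras.

```python
def conv_params(kernel, in_ch, out_ch):
    """Weights plus biases for one 2D convolution layer."""
    return kernel * kernel * in_ch * out_ch + out_ch

def dense_params(in_units, out_units):
    """Weights plus biases for one fully connected layer."""
    return in_units * out_units + out_units

# VGG16 convolutional blocks: (number of conv layers, output channels)
vgg16_blocks = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]

side, channels, frozen_params = 32, 3, 0
for n_convs, out_ch in vgg16_blocks:
    for _ in range(n_convs):
        frozen_params += conv_params(3, channels, out_ch)  # 3x3 'same' convs keep the size
        channels = out_ch
    side //= 2  # each block ends with 2x2 max pooling

flattened = side * side * channels
head_params = dense_params(flattened, 256) + dense_params(256, 10)

print(f"feature map: {side}x{side}x{channels}")             # feature map: 1x1x512
print(f"frozen base parameters: {frozen_params:,}")         # frozen base parameters: 14,714,688
print(f"trainable head parameters: {head_params:,}")        # trainable head parameters: 133,898
```

Under these assumptions, fewer than 1% of the model's parameters are updated during training, and a 32x32 input is reduced to a single 512-value feature vector before it reaches the dense layers. The totals can be checked against `model.summary()`.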

## Analysis and Results

We'll evaluate the model's performance using metrics like accuracy, precision, recall, and F1-score. We'll also visualize a confusion matrix and sample predictions to understand the model's behavior.
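
For reference, the metrics named above can be derived directly from confusion-matrix counts. The sketch below does this in plain Python for a made-up 3-class matrix; `toy_cm` and the helper names are hypothetical, and rows are taken to be true classes and columns predicted classes, which matches the layout returned by scikit-learn's `confusion_matrix`.

```python
def per_class_metrics(cm):
    """Return (precision, recall, f1) per class from a square confusion matrix.

    cm[i][j] counts samples whose true class is i and predicted class is j.
    """
    n = len(cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, but true class differs
        fn = sum(cm[k]) - tp                       # true class k, but predicted otherwise
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results.append((precision, recall, f1))
    return results

def accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[k][k] for k in range(len(cm))) / total

# Made-up counts for three classes, 10 samples each
toy_cm = [
    [8, 2, 0],
    [1, 7, 2],
    [0, 3, 7],
]

for k, (p, r, f) in enumerate(per_class_metrics(toy_cm)):
    print(f"class {k}: precision={p:.3f} recall={r:.3f} f1={f:.3f}")
print(f"accuracy={accuracy(toy_cm):.3f}")
```

Precision answers "of the samples predicted as class k, how many were right?", while recall answers "of the samples that truly are class k, how many were found?". F1 is their harmonic mean, so it is low whenever either one is low.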

In [None]:
from sklearn.metrics import classification_report, confusion_matrix
import seaborn as sns

# Class names for CIFAR-10
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']

# Predict on test set
y_pred = model.predict(X_test)
y_pred_classes = np.argmax(y_pred, axis=1)
y_true = np.argmax(y_test, axis=1)

# Classification report
print('Classification Report:')
print(classification_report(y_true, y_pred_classes, target_names=class_names))

# Confusion matrix
cm = confusion_matrix(y_true, y_pred_classes)
plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', xticklabels=class_names, yticklabels=class_names)
plt.title('Confusion Matrix')
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.show()

# Visualize sample predictions
fig, axs = plt.subplots(2, 5, figsize=(15, 6))
for i in range(10):
    ax = axs[i // 5, i % 5]
    ax.imshow(X_test[i])
    ax.set_title(f'True: {class_names[y_true[i]]}\nPred: {class_names[y_pred_classes[i]]}')
    ax.axis('off')
plt.tight_layout()
plt.show()

## Conclusion and Future Perspectives

This notebook demonstrated machine vision fundamentals, from basic image processing (grayscale conversion, blurring, edge detection) to image classification with a frozen, pre-trained VGG16 base on the CIFAR-10 dataset. Runs of this setup typically reach around 60-70% test accuracy (exact results depend on the training run), which shows that transfer learning can produce a usable classifier with relatively little training.

**Challenges Faced**:
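- **Small Image Size**: CIFAR-10's 32x32 images are far smaller than VGG16's default 224x224 input. 32x32 is the minimum size the Keras implementation accepts, and the convolutional base reduces each image to a single 1x1x512 feature vector, which likely limits the achievable accuracy.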
- **Computational Limits**: Freezing VGG16 layers reduced training time, making it feasible for this demo.
- **Overfitting Risk**: Data augmentation helped mitigate overfitting, though longer training could improve results.

**Future Directions**:
- Explore advanced techniques like object detection (e.g., YOLO) or segmentation for more complex tasks.
- Integrate with real-time hardware, such as cameras, for applications like robotic vision.
- Use larger datasets, or unfreeze and fine-tune the top VGG16 layers, for higher accuracy.
- Address ethical concerns, such as bias in vision systems, to ensure fair deployment in industries like healthcare or automotive.

Machine vision continues to evolve, with potential to revolutionize automation and decision-making across industries.