Computer Vision has become ubiquitous in our society, with applications in image/video search and understanding, apps, mapping, medicine, drones, and self-driving cars. Core to many of these applications are visual recognition tasks such as image classification, segmentation, localization and detection. Recent developments in neural network (a.k.a. deep learning) approaches have greatly advanced the performance of these state-of-the-art visual recognition systems.
Download the dataset you want to apply PCA to, and use the following commands:
cd "PCA&KNN"/
python3 ./pca.py
Remember that your dataset should be located in "./PCA&KNN/train" as well as "./PCA&KNN/test".
- Perform PCA on the training set. Plot the mean face and the first three eigenfaces.
- Reconstruct images using the first n = 3, 50, 100, 239 eigenfaces, and plot them together with the corresponding MSE values.
- Apply the k-nearest neighbors classifier to recognize test set images, using the hyperparameters k = {1, 3, 5} and n = {3, 50, 159}. Show the 3-fold cross-validation results.
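The eigenface steps above can be sketched as follows. This is a minimal sketch using NumPy's SVD; the helper names `pca_eigenfaces` and `reconstruct` are assumptions for illustration, not functions from the provided pca.py:

```python
import numpy as np

def pca_eigenfaces(X):
    """PCA on face vectors via SVD.

    X: (num_images, num_pixels) matrix, one flattened face per row.
    Returns the mean face and the eigenfaces (rows of Vt), sorted by
    explained variance, so eigenfaces[0] is the first eigenface."""
    mean_face = X.mean(axis=0)
    centered = X - mean_face
    # Rows of Vt are the principal axes of the centered data
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return mean_face, Vt

def reconstruct(x, mean_face, eigenfaces, n):
    """Project one face onto the first n eigenfaces and reconstruct it."""
    coeffs = (x - mean_face) @ eigenfaces[:n].T
    return mean_face + coeffs @ eigenfaces[:n]
```

The per-image MSE for a given n is then just `np.mean((reconstruct(x, mean_face, eigenfaces, n) - x) ** 2)`, which should shrink as n grows.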
- filterBank.mat: The given .mat file contains a set of 38 filters (also known as a filter bank). This filter bank is stored as a 49 x 49 x 38 matrix (i.e., each filter is of size 49 x 49 pixels).
- Images: zebra.jpg and mountain.jpg
cd Segmentation/
python3 ./color_seg.py
python3 ./text_seg.py
- Original images
- Color segmentation: Convert both RGB images into the Lab color space and plot the segmentation results for both images based on the clustering results.
- Texture segmentation: Convert the color images to grayscale, extract image textural features via the provided filter bank, and plot the texture segmentation results for both images.
- Combine both color and texture features (3 + 38 = 41-dimensional features) for image segmentation
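The clustering step behind all three segmentation variants can be sketched as a plain k-means over per-pixel feature vectors. This is a hypothetical helper, not the provided color_seg.py/text_seg.py; the deterministic farthest-point initialization is an assumption made here for reproducibility:

```python
import numpy as np

def kmeans_segment(features, k, iters=20):
    """Minimal k-means: assign each pixel's feature vector to one of k clusters.

    features: (H*W, d) array of per-pixel features, e.g. the 3 Lab channels
    concatenated with the 38 filter-bank responses (d = 41).
    Returns the (H*W,) cluster label for every pixel."""
    features = np.asarray(features, dtype=float)
    # Farthest-point initialization: start at the first pixel, then
    # repeatedly add the pixel farthest from all chosen centers
    centers = [features[0]]
    for _ in range(1, k):
        dist = np.min([np.linalg.norm(features - c, axis=1) for c in centers],
                      axis=0)
        centers.append(features[dist.argmax()])
    centers = np.array(centers)
    for _ in range(iters):
        # Assign every pixel to its nearest center, then update the centers
        d = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = features[labels == c].mean(axis=0)
    return labels
```

Reshaping the returned labels back to (H, W) gives the segmentation mask to plot.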
The implementation of a basic image-based bag-of-words (BoW) model for a scene image dataset with 5 categories.
cd Recognition/
python3 ./interst_point_detection.py
python3 ./kmeans.py
- Detect interest points, calculate their descriptors for this image using SURF, and plot the interest point detection results
- Use the k-means algorithm to divide these interest points into C clusters.
Extract the detected interest points from all 50 images in Train-10 and stack them into an N × d matrix, where N denotes the total number of interest points and d is the dimension of each descriptor. Run k-means with C = 50 and a maximum of 5000 iterations; the centroid of each cluster then indicates a visual word. Randomly select 6 clusters from the above results and plot the visual words and the associated interest points in this PCA subspace.
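Once the C = 50 centroids (visual words) are found, each image can be encoded by quantizing its descriptors against that vocabulary. A minimal sketch of this step; the helper name and the histogram normalization are assumptions, not part of the provided kmeans.py:

```python
import numpy as np

def bow_histogram(descriptors, centroids):
    """Bag-of-words encoding of one image.

    descriptors: (n_points, d) descriptors detected in the image.
    centroids:   (C, d) visual words from k-means.
    Returns the L1-normalized histogram of visual-word assignments."""
    # Nearest visual word for every descriptor
    d = np.linalg.norm(descriptors[:, None, :] - centroids[None, :, :], axis=2)
    words = d.argmin(axis=1)
    hist = np.bincount(words, minlength=len(centroids)).astype(float)
    return hist / hist.sum()
```

These per-image histograms are the BoW features a classifier would then consume for the 5-category scene dataset.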
Implement two semantic segmentation models, which predict a label for each pixel with CNN models. The input is an RGB image and the output is the semantic segmentation prediction. The models I used are the base model (VGG16-FCN32s) and the improved model (FCN8s). Before running the code, the training/testing data and the model file (vgg16_weights_tf_dim_ordering_tf_kernels.h5) are required.
cd Semantic-segmentation
bash ss.sh $1 $2
bash ss_best.sh $1 $2
$1: testing images directory (images are named 'xxxx_sat.jpg')
$2: output images directory
python3 mean_ios_evaluate.py -g ground_truth -p prediction
I also provide my pre-trained models for both. Check model_files.
- Show the predicted segmentation mask from the base model.
- Show the predicted segmentation mask from the improved model.
- Report the mean IoU accuracy.
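The IoU metric reported above can be sketched as a per-class mean intersection-over-union. This is a minimal sketch assuming integer label maps; the provided mean_ios_evaluate.py may differ in details such as which classes are counted:

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Mean intersection-over-union between two label maps.

    pred, gt: integer class-label arrays of the same shape.
    Classes absent from both maps are skipped so they do not
    drag the average down."""
    ious = []
    for c in range(num_classes):
        inter = np.sum((pred == c) & (gt == c))
        union = np.sum((pred == c) | (gt == c))
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))
```

Running this over every prediction/ground-truth pair and averaging mirrors what the evaluation command computes.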