<a href="https://colab.research.google.com/github/Jongpil0911/AIR/blob/main/exe02a_image_recognition_dog_vs_cat_features.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 画像認識特論2025（Advanced Image Recognition 2025）

# \#2a：犬と猫の画像分類１（Dogs vs Cats 1）

---
This exercise tackles the Kaggle image classification problem of telling dogs from cats. Instead of deep learning, we take the classical approach: design features by hand (hand-crafted features) and classify them with a machine learning model.

<br>

In a nutshell, Kaggle is a machine learning competition service.

<br>

The dataset used in this exercise can be downloaded from the Kaggle site below. Because downloading requires account registration, only the training data published on the Kaggle site is provided here. Do not redistribute the data.

- Dataset details:
https://www.kaggle.com/competitions/dogs-vs-cats/overview

<br>

- Runtime type: Python 3
- Hardware accelerator: CPU
---


Write your student number and name.

- <font color= "blue">学生番号（Student number）</font> :
- <font color= "blue">氏名（Name）</font> :


---
### 2a-1. Mount Google Drive

The free version of Google Colab has the usage restrictions listed below. The dataset used in this exercise is about 600 MB, which is within those limits, but uploading it for every session is tedious. Instead, upload the data to Google Drive once and mount Google Drive from Google Colab.

* RAM: 12 GB

* Disk (storage): CPU/TPU: max 107 GB, GPU: max 68 GB

* 90-minute rule: the session is reset after 90 minutes without any interaction (idle cut-off)

* 12-hour rule: the session is reset 12 hours after the instance starts (maximum session length)

In [None]:
from google.colab import drive

drive.mount('/content/drive')

### 2a-2. Data preparation

Download the data from the link below and upload it to your Google Drive. If you upload the zip file as is, run the following cell once only. If you upload the already-unzipped folder instead, you do not need to run it.

* train.zip (543.2MB) <br>
[download](https://drive.google.com/file/d/12gP23qXYenKSPvoWbnuBC_vGeWq4YFAN/view?usp=sharing)

<br>

Change the path to match your own environment.

In [None]:
%cd /content/drive/MyDrive/ColabNotebooks/AIR2025/dogs-vs-cats/

# Unzip the zip file
!unzip train.zip
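
Because the unzip step should run only once, a guard that skips it on later runs can help. The following is a minimal sketch, not part of the original notebook: the helper name `extract_once` and the marker file `.extracted` are assumptions introduced for illustration, and it uses Python's standard `zipfile` module instead of the `unzip` shell command.

```python
import os
import zipfile


def extract_once(zip_path, dest_dir, marker_name=".extracted"):
    """Extract zip_path into dest_dir unless a marker file shows it was done before.

    Returns True if extraction ran, False if it was skipped.
    (Hypothetical helper; the marker file name is an assumption.)
    """
    marker = os.path.join(dest_dir, marker_name)
    if os.path.exists(marker):
        return False  # already extracted on an earlier run

    os.makedirs(dest_dir, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest_dir)

    # Leave an empty marker file so later runs skip the extraction
    with open(marker, "w"):
        pass
    return True
```

In the notebook this could be called with the path to `train.zip` on the mounted drive and the directory where the `train/` folder should appear; both paths depend on your own settings.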

### 2a-3. Install

* imutils module: image processing utilities for OpenCV<br>
A series of convenience functions that make basic image processing operations (translation, rotation, resizing, skeletonization, displaying Matplotlib images, sorting contours, detecting edges, and more) easier with OpenCV on both Python 2.7 and Python 3.

In [None]:
!pip install imutils

### 2a-4. Import libraries

In [None]:
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

from skimage.feature import local_binary_pattern

from imutils import paths
import numpy as np
import argparse
import imutils
import cv2
import os

import matplotlib.pyplot as plt

from google.colab.patches import cv2_imshow

import pandas as pd

### 2a-5. Function definitions

In [None]:
# Resize the image to 32x32 pixels, then flatten it into a list of raw pixel intensities.
# cv2.imread returns a 3-channel BGR image by default, so the vector has 32 * 32 * 3 values.
# Number of features: 3072
def image_to_feature_vector(image, size=(32, 32)):
  return cv2.resize(image, size).flatten()

# Convert the image to the HSV color space and use the normalized 3D HSV histogram as the feature,
# with the supplied number of `bins` per channel.
# Number of features: 512 = 8 * 8 * 8 (for the default bins)
def extract_color_histogram(image, bins=(8, 8, 8)):
	# BGR --> HSV (OpenCV loads images in BGR channel order)
	hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)

	hist = cv2.calcHist([hsv], [0, 1, 2], None, bins, [0, 180, 0, 256, 0, 256])

	if imutils.is_cv2():
		hist = cv2.normalize(hist)
	else:
		cv2.normalize(hist, hist)

	return hist.flatten()

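# Extract an LBP (Local Binary Pattern) histogram and use it as the feature.
# Number of features: 26 = num_points + 2 (labels produced by method="uniform")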
def extract_LBP_histogram(image):
	num_points = 24
	radius = 8

	gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

	# compute the Local Binary Pattern representation
	# of the image, and then use the LBP representation
	# to build the histogram of patterns
	lbp = local_binary_pattern(gray, num_points, radius, method="uniform")
	(hist, _) = np.histogram(lbp.ravel(), bins=np.arange(0, num_points + 3), range=(0, num_points + 2))
	# normalize the histogram
	hist = hist.astype("float")
	hist /= (hist.sum() + 1e-7)

	# return the histogram of Local Binary Patterns
	return hist
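
To make the labels behind `method="uniform"` more concrete, here is a small pure-Python sketch. It is a simplification and not the scikit-image implementation: it uses the 8 direct neighbors of a 3x3 patch instead of 24 interpolated points on a circle of radius 8, and the function names are assumptions introduced for illustration. The labeling rule it shows (count of ones for patterns with at most two 0/1 transitions, a single shared label otherwise) is what yields `num_points + 2` histogram bins.

```python
def lbp_bits_3x3(patch):
    """Threshold the 8 neighbors of a 3x3 patch against its center pixel.

    Neighbors are visited clockwise starting at the top-left corner.
    A neighbor contributes 1 if it is >= the center value, else 0.
    """
    center = patch[1][1]
    coords = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    return [1 if patch[r][c] >= center else 0 for r, c in coords]


def count_transitions(bits):
    """Count 0->1 and 1->0 changes when walking around the circular pattern."""
    n = len(bits)
    return sum(bits[i] != bits[(i + 1) % n] for i in range(n))


def uniform_label(bits):
    """Rotation-invariant uniform label.

    Uniform patterns (at most 2 transitions) are labeled by their number of
    ones (0..P); every non-uniform pattern shares the label P + 1.
    """
    if count_transitions(bits) <= 2:
        return sum(bits)
    return len(bits) + 1


edge_patch = [[5, 5, 5],
              [1, 3, 1],
              [1, 1, 1]]
checker_patch = [[5, 1, 5],
                 [1, 3, 1],
                 [5, 1, 5]]

edge_label = uniform_label(lbp_bits_3x3(edge_patch))        # uniform pattern
checker_label = uniform_label(lbp_bits_3x3(checker_patch))  # non-uniform pattern
```

With P neighbors the possible labels are 0 through P + 1, that is P + 2 values, which is why the histogram in `extract_LBP_histogram` has 26 bins for `num_points = 24`.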

### 2a-6. Check the image data and features

<br>

Change the image data path to match your own environment.

In [None]:
dir_name = "/content/drive/MyDrive/ColabNotebooks/AIR2025/dogs-vs-cats/train/"

image_paths = list(paths.list_images(dir_name))

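# print the number of images found and the first few paths
# (printing the whole list would flood the output)
print(len(image_paths))
print(image_paths[:5])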

image_path = image_paths[0]
print(image_path)

# load a sample image
image = cv2.imread(image_path)

# show a loaded image
cv2_imshow(image)

# extract a color histogram
hist = extract_color_histogram(image)

# extract a LBP histogram
lbp = extract_LBP_histogram(image)

# show the color histogram
data_num = len(hist)
hist_x = np.linspace(1, data_num, data_num)

print(data_num)

fig = plt.figure(figsize=[10, 5])
plt.grid()
plt.bar(hist_x, hist)
plt.xlim(0, data_num)
plt.xticks(np.arange(0, data_num+1, 32))
plt.show()

# show the LBP histogram
data_num = len(lbp)
lbp_x = np.linspace(1, data_num, data_num)

print(data_num)

fig = plt.figure(figsize=[10, 5])
plt.grid(False)
plt.bar(lbp_x, lbp)
plt.xlim(0, data_num+1)
plt.xticks(np.arange(1, data_num+1, 1))
plt.show()
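
The two lengths printed in the cell above can be checked by hand. The sketch below is illustrative only and assumes the default `bins=(8, 8, 8)` and the value ranges passed to `cv2.calcHist` in `extract_color_histogram`; the helper name `hsv_bin_index` is hypothetical. It shows which position of the flattened histogram a single HSV pixel would fall into, assuming row-major flattening of the 3D histogram.

```python
def hsv_bin_index(h, s, v, bins=(8, 8, 8),
                  ranges=((0, 180), (0, 256), (0, 256))):
    """Return the flat histogram index for one HSV pixel.

    Each channel is split into equal-width bins over [low, high), and the
    three per-channel bin numbers are combined in row-major order.
    """
    per_channel = []
    for value, n_bins, (low, high) in zip((h, s, v), bins, ranges):
        per_channel.append(int((value - low) * n_bins / (high - low)))
    h_bin, s_bin, v_bin = per_channel
    return (h_bin * bins[1] + s_bin) * bins[2] + v_bin


# Length of the color histogram feature: one entry per (H, S, V) bin triple
color_feature_length = 8 * 8 * 8

# Length of the LBP feature for num_points = 24 with uniform labels
lbp_feature_length = 24 + 2

first_index = hsv_bin_index(0, 0, 0)
last_index = hsv_bin_index(179, 255, 255)
```

A consequence of the 3D layout is that neighboring bars in the color histogram plot do not necessarily correspond to similar colors: moving one step along the x-axis changes the V bin first, then S, then H.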

### 2a-7. Load image data

<br>

Processing time is approximately 50 minutes.

In [None]:
# grab the list of images that we'll be describing
print("[INFO] describing images...")
dir_name = "/content/drive/MyDrive/ColabNotebooks/AIR2025/dogs-vs-cats/train/"

image_paths = list(paths.list_images(dir_name))

# initialize the raw pixel intensities matrix, the features matrix,
# and labels list
raw_images = []
hist_features = []
lbp_features = []
labels = []

# loop over the input images
for (i, image_path) in enumerate(image_paths):
	#	print(image_path)
	# load the image and extract the class label (assuming that our
	# path has the format: /path/to/dataset/{class}.{image_num}.jpg)
	image = cv2.imread(image_path)
	label = image_path.split(os.path.sep)[-1].split(".")[0]

	# extract raw pixel intensity "features", followed by a color
	# histogram to characterize the color distribution of the pixels
	# in the image
	pixels = image_to_feature_vector(image)
	hist = extract_color_histogram(image)
	lbp = extract_LBP_histogram(image)

	# update the raw images, features, and labels matrices,
	# respectively
	raw_images.append(pixels)
	hist_features.append(hist)
	lbp_features.append(lbp)
	labels.append(label)

	# show an update every 1,000 images
	if i > 0 and i % 1000 == 0:
		print("[INFO] processed {}/{}".format(i, len(image_paths)))

# show some information on the memory consumed by the raw images
# matrix and features matrix
raw_images = np.array(raw_images)
hist_features = np.array(hist_features)
lbp_features = np.array(lbp_features)
labels = np.array(labels)

print("[INFO] pixels matrix: {:.2f}MB".format(raw_images.nbytes / (1024 * 1024.0)))
print("[INFO] hist features matrix: {:.2f}MB".format(hist_features.nbytes / (1024 * 1024.0)))
print("[INFO] lbp features matrix: {:.2f}MB".format(lbp_features.nbytes / (1024 * 1024.0)))

# Split the data
# partition the data into training and testing splits, using 75%
# of the data for training and the remaining 25% for testing
(train_raw_images, test_raw_images, train_raw_labels, test_raw_labels) = train_test_split(raw_images, labels, test_size=0.25, random_state=42)
(train_hist_features, test_hist_features, train_hist_features_labels, test_hist_features_labels) = train_test_split(hist_features, labels, test_size=0.25, random_state=42)
(train_lbp_features, test_lbp_features, train_lbp_features_labels, test_lbp_features_labels) = train_test_split(lbp_features, labels, test_size=0.25, random_state=42)
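
The class label in the loop above comes entirely from the file name. As a small standalone sketch (the helper name `label_from_path` and the example paths are assumptions for illustration, not files from the dataset listing), the same parsing can be tested without loading any images, and a `Counter` gives a quick check of the class balance before training.

```python
import os
from collections import Counter


def label_from_path(image_path):
    """Return the class label encoded in a file name such as 'cat.0.jpg'."""
    file_name = os.path.basename(image_path)
    return file_name.split(".")[0]


# Hypothetical example paths in the {class}.{image_num}.jpg format
example_paths = [
    "/path/to/dataset/cat.0.jpg",
    "/path/to/dataset/cat.1.jpg",
    "/path/to/dataset/dog.0.jpg",
]

# Count how many images carry each label
label_counts = Counter(label_from_path(p) for p in example_paths)
```

Running the same count over `image_paths` in the notebook would show how many cat and dog images were actually found under `dir_name`.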

### 2a-8. Training and recognition

In [None]:
# Dictionary for collecting the results
scores = {}

In [None]:
print("[INFO] evaluating raw pixel accuracy...")
model = KNeighborsClassifier(n_neighbors=1, n_jobs=-1)
model.fit(train_raw_images, train_raw_labels)
acc_train = model.score(train_raw_images, train_raw_labels)
acc_test = model.score(test_raw_images, test_raw_labels)
print("[INFO] raw pixel train accuracy: {:.2f}%".format(acc_train * 100))
print("[INFO] raw pixel test accuracy: {:.2f}%".format(acc_test * 100))

scores[("1. raw pixel", "(a) train_accuracy")] = acc_train
scores[("1. raw pixel", "(b) test_accuracy")] = acc_test

In [None]:
print("[INFO] evaluating histogram accuracy...")
model = KNeighborsClassifier(n_neighbors=1, n_jobs=-1)
model.fit(train_hist_features, train_hist_features_labels)
acc_train = model.score(train_hist_features, train_hist_features_labels)
acc_test = model.score(test_hist_features, test_hist_features_labels)
print("[INFO] histogram train accuracy: {:.2f}%".format(acc_train * 100))
print("[INFO] histogram test accuracy: {:.2f}%".format(acc_test * 100))

scores[("2. HSV histogram", "(a) train_accuracy")] = acc_train
scores[("2. HSV histogram", "(b) test_accuracy")] = acc_test

In [None]:
print("[INFO] evaluating LBP accuracy...")
model = KNeighborsClassifier(n_neighbors=1, n_jobs=-1)
model.fit(train_lbp_features, train_lbp_features_labels)
acc_train = model.score(train_lbp_features, train_lbp_features_labels)
acc_test = model.score(test_lbp_features, test_lbp_features_labels)
print("[INFO] LBP train accuracy: {:.2f}%".format(acc_train * 100))
print("[INFO] LBP test accuracy: {:.2f}%".format(acc_test * 100))

scores[("3. LBP", "(a) train_accuracy")] = acc_train
scores[("3. LBP", "(b) test_accuracy")] = acc_test
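
When reading the train accuracy of a classifier with `n_neighbors=1`, it helps to remember that every training sample is its own nearest neighbor. The following pure-Python sketch is an illustration with made-up 2D points, not the scikit-learn implementation and not data from this exercise; the function names are assumptions. It shows why the train accuracy of 1-NN is trivially perfect unless identical feature vectors carry different labels, so only the test accuracy says something about generalization.

```python
import math


def predict_1nn(train_X, train_y, x):
    """Return the label of the training point closest to x (Euclidean distance)."""
    nearest = min(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    return train_y[nearest]


def accuracy_1nn(train_X, train_y, X, y):
    """Fraction of samples in X whose 1-NN prediction matches y."""
    hits = sum(predict_1nn(train_X, train_y, x) == t for x, t in zip(X, y))
    return hits / len(X)


# Toy data (placeholders for illustration only)
toy_train_X = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
toy_train_y = ["cat", "cat", "dog"]
toy_test_X = [(4.0, 4.0), (2.0, 2.0)]
toy_test_y = ["dog", "dog"]

toy_train_acc = accuracy_1nn(toy_train_X, toy_train_y, toy_train_X, toy_train_y)
toy_test_acc = accuracy_1nn(toy_train_X, toy_train_y, toy_test_X, toy_test_y)
```

In the toy data the second test point lies closer to a "cat" sample, so the test accuracy drops while the train accuracy stays at 1.0.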

### 2a-9. Summary of experimental results

In [None]:
pd.Series(scores).unstack()
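
The `scores` dictionary uses `(feature, metric)` tuples as keys, and `unstack()` turns the second element of each key into a column. As a rough pure-Python picture of that reshaping (a sketch only, not how pandas implements it; the helper name `pivot_scores` is hypothetical and the numbers are placeholders, not experimental results):

```python
def pivot_scores(scores):
    """Group {(row, column): value} into {row: {column: value}}."""
    table = {}
    for (row, column), value in scores.items():
        table.setdefault(row, {})[column] = value
    return table


# Placeholder values for illustration only
example_scores = {
    ("1. raw pixel", "(a) train_accuracy"): 1.0,
    ("1. raw pixel", "(b) test_accuracy"): 0.5,
    ("2. HSV histogram", "(a) train_accuracy"): 1.0,
    ("2. HSV histogram", "(b) test_accuracy"): 0.5,
}

example_table = pivot_scores(example_scores)
```

Each feature becomes one row with a train and a test column, which is the layout of the summary table; the numeric prefixes in the keys ("1.", "(a)") keep rows and columns in a fixed order.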

---
## <font color= "blue">課題（Report）</font>

Three features were used: raw pixel values, the HSV color histogram, and the LBP histogram. Implement code that computes one additional feature of your own choice, and run a comparison experiment using the four features.

<br>

Discuss the experimental results in Japanese or English.

---

## Report #2a

After completing all the exercises, save the notebook (ipynb) and submit it to Moodle as Report #2a.