3.3. CIFAR-10 Image Recognition

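CIFAR-10 is a dataset, provided by the University of Toronto in Canada, for classifying images into 10 classes. It consists of 60,000 color images of 32x32 pixels, of which 50,000 are for training and 10,000 are for testing. The 10 classes are airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck.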
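As a quick check of the figures above, the following sketch works through the dataset arithmetic in plain Python. The names `CLASS_NAMES`, `NUM_TRAIN`, `NUM_TEST`, and `IMAGE_SHAPE` are illustrative and not part of Keras. The class order is the standard CIFAR-10 label order, and the even split across classes relies on the dataset being balanced.

```python
# Class names in CIFAR-10 label order (index 0 to 9), matching the list in the text.
CLASS_NAMES = ["airplane", "automobile", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]

NUM_TRAIN, NUM_TEST = 50_000, 10_000   # split sizes stated in the text
IMAGE_SHAPE = (32, 32, 3)              # height, width, RGB channels

total_images = NUM_TRAIN + NUM_TEST
images_per_class = total_images // len(CLASS_NAMES)   # assumes balanced classes
values_per_image = IMAGE_SHAPE[0] * IMAGE_SHAPE[1] * IMAGE_SHAPE[2]

print(total_images, images_per_class, values_per_image)  # 60000 6000 3072
print(CLASS_NAMES.index("frog"))                         # 6
```

Each image therefore contributes 3,072 input values, which is the size of the tensor the first convolution layer receives.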

In [1]:
import os
import keras

from keras.models import Sequential

from keras.layers import Conv2D, MaxPooling2D, Activation
from keras.layers import Flatten, Dense, Dropout

from keras.datasets import cifar10

from keras.optimizers import RMSprop

from keras.callbacks import TensorBoard, ModelCheckpoint



In [2]:
# LeNet configuration

def network(input_shape, num_classes):
  model = Sequential()

  model.add(Conv2D(32, (3, 3), padding='same', input_shape=input_shape, activation='relu'))
  model.add(MaxPooling2D(pool_size=(2, 2)))
  model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
  model.add(MaxPooling2D(pool_size=(2, 2)))
  model.add(Dropout(0.5))
  model.add(Flatten())
  model.add(Dense(512, activation='relu'))
  model.add(Dropout(0.5))
  model.add(Dense(num_classes, activation='softmax'))
  return model
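To see what this architecture implies, the sketch below traces the tensor shape and the trainable parameter count through the layers by hand. It is plain Python rather than Keras; the helpers `conv2d`, `max_pool`, and `dense` are hypothetical names that assume stride-1 3x3 convolutions and 2x2 pooling, as in the cell above. Dropout and Flatten add no parameters, so they are omitted.

```python
def conv2d(shape, filters, kernel=3, padding="same"):
    """Return (output shape, parameter count) for a stride-1 Conv2D layer."""
    h, w, c = shape
    if padding == "valid":
        h, w = h - kernel + 1, w - kernel + 1
    params = kernel * kernel * c * filters + filters  # weights + biases
    return (h, w, filters), params

def max_pool(shape, pool=2):
    """Return the output shape of a MaxPooling2D layer (no parameters)."""
    h, w, c = shape
    return (h // pool, w // pool, c)

def dense(n_in, n_out):
    """Return the parameter count of a fully connected layer."""
    return n_in * n_out + n_out

shape = (32, 32, 3)
shape, p1 = conv2d(shape, 32)      # (32, 32, 32)
shape = max_pool(shape)            # (16, 16, 32)
shape, p2 = conv2d(shape, 64)      # (16, 16, 64)
shape = max_pool(shape)            # (8, 8, 64)
flat = shape[0] * shape[1] * shape[2]
p3 = dense(flat, 512)
p4 = dense(512, 10)
total_params = p1 + p2 + p3 + p4

print(shape, flat, total_params)   # (8, 8, 64) 4096 2122186
```

Almost all of the roughly 2.1 million parameters sit in the first `Dense` layer, which is why the dropout layers are placed around it.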

In [4]:
class CIFAR10Dataset():
  """CIFAR-10 dataset.

  Describes:
    Encapsulates the CIFAR-10 dataset.

  Attributes:
    self.image_shape: Image shape (height 32, width 32, 3 channels)
    self.num_classes: Number of classes (10)
  """
  def __init__(self):
    """Constructor.

    Initializes the instance variables.
    """
    self.image_shape = (32, 32, 3)
    self.num_classes = 10

  def get_batch(self):
    """Load and preprocess the CIFAR-10 dataset.

    Describes:
      Loads the CIFAR-10 dataset and preprocesses it. Pixel values are
      normalized to the range 0 to 1. Labels are one-hot encoded.

    Returns:
      Training data, training labels, test data, and test labels.
    """
    (x_train, y_train), (x_test, y_test) = cifar10.load_data()

    x_train = x_train.astype('float32')
    x_test = x_test.astype('float32')

    x_train /= 255
    x_test /= 255

    y_train = keras.utils.to_categorical(y_train, self.num_classes)
    y_test = keras.utils.to_categorical(y_test, self.num_classes)

    return x_train, y_train, x_test, y_test
  
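  def preprocess(self, data, label_data=False):
    """Preprocess images (default) or labels (label_data=True)."""
    if label_data:
      # Convert class indices to one-hot vectors.
      data = keras.utils.to_categorical(data, self.num_classes)
    else:
      # Scale pixel values to [0, 1] and enforce the (N, 32, 32, 3) shape.
      data = data.astype('float32')
      data /= 255
      shape = (data.shape[0],) + self.image_shape
      data = data.reshape(shape)
    # Return in both branches (the label branch previously returned None).
    return data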
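To make the preprocessing concrete, here is a small NumPy-only sketch of the two transformations applied in `get_batch`: scaling pixel values to the range 0 to 1 and one-hot encoding the labels. It does not call Keras. Indexing into `np.eye` stands in for `keras.utils.to_categorical`, and the tiny synthetic batch is a hypothetical substitute for the real data, with labels shaped `(N, 1)` as `cifar10.load_data()` returns them.

```python
import numpy as np

num_classes = 10

# Hypothetical stand-in batch: 4 uint8 images of shape (32, 32, 3)
# whose values cover the full 0..255 range.
images = (np.arange(4 * 32 * 32 * 3) % 256).astype(np.uint8).reshape(4, 32, 32, 3)
labels = np.array([[3], [0], [9], [3]])   # class indices, shape (4, 1)

# Normalization: cast to float32 first, then divide by the maximum pixel value.
scaled = images.astype("float32") / 255

# One-hot encoding: row i of the identity matrix is the vector for class i.
one_hot = np.eye(num_classes, dtype="float32")[labels.reshape(-1)]

print(scaled.min(), scaled.max())   # 0.0 1.0
print(one_hot.shape)                # (4, 10)
print(one_hot[0])                   # 1.0 at index 3, zeros elsewhere
```

The one-hot form is what `categorical_crossentropy` expects as its target, one probability per class.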

In [5]:
class Trainer():
  def __init__(self, model, loss, optimizer):
    self._target = model
    self._target.compile(loss=loss, optimizer=optimizer, metrics=['accuracy'])
    self.verbose = 1
    self.log_dir = os.path.join(os.getcwd(), 'logdir')
    
    
    self.model_file_name = 'model_file.hdf5'

  def train(self, x_train, y_train, batch_size, epochs, validation_split):
    if os.path.exists(self.log_dir):
      import shutil
      shutil.rmtree(self.log_dir)
    os.mkdir(self.log_dir)

    self._target.fit(
      x_train, y_train,
      batch_size=batch_size, epochs=epochs,
      validation_split=validation_split,
      callbacks=[
        TensorBoard(log_dir=self.log_dir),
        ModelCheckpoint(os.path.join(self.log_dir, self.model_file_name), save_best_only=True)
      ],
      verbose=self.verbose)

In [6]:
dataset = CIFAR10Dataset()
x_train, y_train, x_test, y_test = dataset.get_batch()

model = network(dataset.image_shape, dataset.num_classes)
trainer = Trainer(model, loss='categorical_crossentropy', optimizer=RMSprop())
trainer.train(x_train, y_train, batch_size=128, epochs=12, validation_split=0.2)

# show result
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])


Epoch 1/12


2024-02-07 20:12:57.601901: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:961] layout failed: INVALID_ARGUMENT: Size of values 0 does not match size of permutation 4 @ fanin shape insequential/dropout/dropout/SelectV2-2-TransposeNHWCToNCHW-LayoutOptimizer
2024-02-07 20:12:58.678995: I external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:454] Loaded cuDNN version 8800
2024-02-07 20:12:59.800388: I external/local_xla/xla/service/service.cc:168] XLA service 0x7f8948015c50 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2024-02-07 20:12:59.800418: I external/local_xla/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA GeForce RTX 4090, Compute Capability 8.9
2024-02-07 20:12:59.804087: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
I0000 00:00:1707304379.838820  278507 device_compiler.h:186] Compiled cluster using XL

Epoch 2/12
 39/313 [==>...........................] - ETA: 1s - loss: 1.4110 - accuracy: 0.4906

Epoch 3/12
Epoch 4/12
Epoch 5/12
Epoch 6/12
Epoch 7/12
Epoch 8/12
Epoch 9/12
Epoch 10/12
Epoch 11/12
Epoch 12/12
Test loss: 0.8171332478523254
Test accuracy: 0.7210999727249146
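The `313` steps per epoch shown in the progress bar follow from the `validation_split` and `batch_size` arguments. The sketch below reproduces that arithmetic. The helper `split_sizes` is a hypothetical name, and the exact rounding Keras applies internally is an assumption here; for these values the split is exact, so rounding does not matter.

```python
import math

def split_sizes(num_samples, validation_split, batch_size):
    """Return (training samples, validation samples, steps per epoch)."""
    num_val = int(num_samples * validation_split)   # held-out fraction
    num_train = num_samples - num_val
    steps = math.ceil(num_train / batch_size)       # last batch may be partial
    return num_train, num_val, steps

# Settings used in the run above.
print(split_sizes(50_000, 0.2, 128))     # (40000, 10000, 313)

# Settings used for the deeper network later in this section.
print(split_sizes(50_000, 0.125, 256))   # (43750, 6250, 171)
```

Only the training portion is used for weight updates. The validation portion drives the `save_best_only=True` behaviour of `ModelCheckpoint`, which by default monitors validation loss.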


I believe this is the same code as in the reference book, but the results are very different!

In [20]:
# check tensorboard
# !tensorboard --logdir=logdir

**Improvement by making the network deeper**

In [7]:
# LeNet configuration

def network(input_shape, num_classes):
  model = Sequential()

  model.add(Conv2D(32, (3, 3), padding='same', input_shape=input_shape, activation='relu'))
  model.add(Conv2D(32, (3, 3), padding='same', activation='relu'))
  model.add(MaxPooling2D(pool_size=(2, 2)))
  model.add(Dropout(0.25))
  model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
  model.add(Conv2D(64, (3, 3), activation='relu'))
  model.add(MaxPooling2D(pool_size=(2, 2)))
  model.add(Dropout(0.5))
  model.add(Flatten())
  model.add(Dense(512, activation='relu'))
  model.add(Dropout(0.5))
  model.add(Dense(num_classes, activation='softmax'))
  return model
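One detail of this deeper network is easy to miss: the fourth convolution omits `padding='same'`, so it uses the Keras default `'valid'` and shrinks the feature map from 16x16 to 14x14. The following plain-Python sketch walks a layer list that mirrors the cell above and tracks the spatial size and parameter count. The tuple-based `layers` specification is a hypothetical notation for illustration, not a Keras API.

```python
# (kind, filters, padding) for conv layers; ("pool",) for 2x2 max pooling.
layers = [
    ("conv", 32, "same"),
    ("conv", 32, "same"),
    ("pool",),
    ("conv", 64, "same"),
    ("conv", 64, "valid"),   # no padding argument in the original cell
    ("pool",),
]

h, w, c = 32, 32, 3
total_params = 0
for layer in layers:
    if layer[0] == "conv":
        _, filters, padding = layer
        total_params += 3 * 3 * c * filters + filters   # 3x3 kernels + biases
        if padding == "valid":
            h, w = h - 2, w - 2                         # 3x3 kernel trims 1 pixel per side
        c = filters
    else:
        h, w = h // 2, w // 2

flat = h * w * c
total_params += flat * 512 + 512    # Dense(512)
total_params += 512 * 10 + 10       # Dense(num_classes)

print((h, w, c), flat, total_params)   # (7, 7, 64) 3136 1676842
```

Despite having two more convolution layers, this model has fewer parameters than the shallower one, because the flattened input to `Dense(512)` drops from 4,096 to 3,136 values.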

In [10]:
dataset = CIFAR10Dataset()
x_train, y_train, x_test, y_test = dataset.get_batch()

model = network(dataset.image_shape, dataset.num_classes)
trainer = Trainer(model, loss='categorical_crossentropy', optimizer=RMSprop())
trainer.train(x_train, y_train, batch_size=256, epochs=64, validation_split=0.125)

# show result
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

Epoch 1/64
Epoch 2/64
Epoch 3/64
Epoch 4/64
Epoch 5/64
Epoch 6/64
Epoch 7/64
Epoch 8/64
Epoch 9/64
Epoch 10/64
Epoch 11/64
Epoch 12/64
Epoch 13/64
Epoch 14/64
Epoch 15/64
Epoch 16/64
Epoch 17/64
Epoch 18/64
Epoch 19/64
Epoch 20/64
Epoch 21/64
Epoch 22/64
Epoch 23/64
Epoch 24/64
Epoch 25/64
Epoch 26/64
Epoch 27/64
Epoch 28/64
Epoch 29/64
Epoch 30/64
Epoch 31/64
Epoch 32/64
Epoch 33/64
Epoch 34/64
Epoch 35/64
Epoch 36/64
Epoch 37/64
Epoch 38/64
Epoch 39/64
Epoch 40/64
Epoch 41/64
Epoch 42/64
Epoch 43/64
Epoch 44/64
Epoch 45/64
Epoch 46/64
Epoch 47/64
Epoch 48/64
Epoch 49/64
Epoch 50/64
Epoch 51/64
Epoch 52/64
Epoch 53/64
Epoch 54/64
Epoch 55/64
Epoch 56/64
Epoch 57/64
Epoch 58/64
Epoch 59/64
Epoch 60/64
Epoch 61/64
Epoch 62/64
Epoch 63/64
Epoch 64/64
Test loss: 0.6143041849136353
Test accuracy: 0.8187999725341797
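The two test results reported in this section can be compared with a few lines of arithmetic. The sketch below uses only the accuracies printed above; the variable names are illustrative. Note that the two runs differ in more than depth (12 versus 64 epochs, batch size 128 versus 256, and a different validation split), so the gain cannot be attributed to the architecture alone.

```python
shallow_acc = 0.7210999727249146   # first network, 12 epochs
deep_acc = 0.8187999725341797      # deeper network, 64 epochs

abs_gain = deep_acc - shallow_acc
# Relative reduction in test error rate (1 - accuracy).
rel_error_reduction = 1 - (1 - deep_acc) / (1 - shallow_acc)

print(f"absolute accuracy gain: {abs_gain:.4f}")               # 0.0977
print(f"relative error reduction: {rel_error_reduction:.1%}")  # 35.0%
```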
