3.3. CIFAR-10 Image Recognition

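CIFAR-10 is a dataset for classifying images into 10 categories, provided by the University of Toronto in Canada. It consists of 60,000 color images of 32x32 pixels, of which 50,000 are for training and 10,000 are for testing. The 10 categories are airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck.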
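
As a quick reference for the description above, the following sketch maps integer labels to class names. The list order follows the standard CIFAR-10 label indices (0 to 9); the names `CIFAR10_CLASSES` and `label_to_name` are hypothetical helpers introduced here and are not part of the notebook.

```python
# CIFAR-10 class names, indexed by integer label (0-9).
CIFAR10_CLASSES = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]

# Dataset sizes and image shape as stated in the text.
NUM_TRAIN, NUM_TEST = 50_000, 10_000
IMAGE_SHAPE = (32, 32, 3)  # height, width, RGB channels


def label_to_name(label):
    """Map an integer CIFAR-10 label to its class name (hypothetical helper)."""
    return CIFAR10_CLASSES[int(label)]


print(label_to_name(3))
print(NUM_TRAIN + NUM_TEST, "images of shape", IMAGE_SHAPE)
```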

In [33]:
import os
import keras

from keras.models import Sequential

from keras.layers import Conv2D, MaxPooling2D, Activation
from keras.layers import Flatten, Dense, Dropout

from keras.datasets import cifar10

from keras.optimizers import RMSprop

from keras.callbacks import TensorBoard, ModelCheckpoint


In [34]:
# LeNet configuration

def network(input_shape, num_classes):
  model = Sequential()

  model.add(Conv2D(32, (3, 3), padding='same', input_shape=input_shape, activation='relu'))
  model.add(MaxPooling2D(pool_size=(2, 2)))
  model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
  model.add(MaxPooling2D(pool_size=(2, 2)))
  model.add(Dropout(0.5))
  model.add(Flatten())
  model.add(Dense(512, activation='relu'))
  model.add(Dropout(0.5))
  model.add(Dense(num_classes, activation='softmax'))
  return model
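
To get a feel for the size of the network defined above, its trainable parameter count can be worked out by hand. This is a rough sketch that assumes the Keras defaults (stride 1, a bias term in every layer) and the 32x32x3 input with 10 classes used in this section; `conv2d_params` and `dense_params` are hypothetical helper names.

```python
def conv2d_params(kernel, in_channels, out_channels):
    """Weights plus one bias per output channel of a square-kernel Conv2D."""
    return (kernel * kernel * in_channels + 1) * out_channels


def dense_params(in_units, out_units):
    """Weights plus one bias per output unit of a Dense layer."""
    return (in_units + 1) * out_units


# 'same' padding keeps the 32x32 size; each 2x2 max-pool halves it.
conv1 = conv2d_params(3, 3, 32)      # 32x32x3  -> 32x32x32, pooled to 16x16
conv2 = conv2d_params(3, 32, 64)     # 16x16x32 -> 16x16x64, pooled to 8x8
flat_units = 8 * 8 * 64              # size of the Flatten output
dense1 = dense_params(flat_units, 512)
dense2 = dense_params(512, 10)

total = conv1 + conv2 + dense1 + dense2
print(conv1, conv2, dense1, dense2, total)
```

Under these assumptions almost all parameters sit in the first `Dense` layer, which is one reason two `Dropout` layers surround it. Calling `model.summary()` on the real model should report the same total.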

In [35]:
class CIFAR10Dataset():
  """
  CIFAR-10 dataset.
  attributes:
    - image_shape: image shape
    - num_classes: number of classes
  """
  def __init__(self):
    self.image_shape = (32, 32, 3)
    self.num_classes = 10

  def get_batch(self):
    (x_train, y_train), (x_test, y_test) = cifar10.load_data()

    x_train = x_train.astype('float32')
    x_test = x_test.astype('float32')

    x_train /= 255
    x_test /= 255

    y_train = keras.utils.to_categorical(y_train, self.num_classes)
    y_test = keras.utils.to_categorical(y_test, self.num_classes)

    return x_train, y_train, x_test, y_test
  
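  def preprocess(self, data, label_data=False):
    if label_data:
      # Convert integer class labels to one-hot vectors.
      data = keras.utils.to_categorical(data, self.num_classes)
    else:
      # Scale pixel values to [0, 1] and enforce the (n,) + image_shape layout.
      data = data.astype('float32')
      data /= 255
      shape = (data.shape[0],) + self.image_shape
      data = data.reshape(shape)
    return data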
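
The two transformations performed by the dataset class (pixel scaling and one-hot encoding of labels) can be illustrated with NumPy alone. This is a minimal sketch, not the Keras implementation; `scale_images` and `one_hot` are hypothetical names, and the tiny arrays stand in for real CIFAR-10 data.

```python
import numpy as np


def scale_images(images):
    """Convert uint8 pixels in [0, 255] to float32 values in [0, 1]."""
    return images.astype("float32") / 255.0


def one_hot(labels, num_classes):
    """Turn integer labels of shape (n,) or (n, 1) into an (n, num_classes) matrix."""
    labels = np.asarray(labels).reshape(-1)
    return np.eye(num_classes, dtype="float32")[labels]


# Stand-in data: one 1x1 RGB "image" and one label stored with shape (n, 1),
# which is the label layout returned by cifar10.load_data().
fake_images = np.array([[[[0, 128, 255]]]], dtype="uint8")
fake_labels = np.array([[3]])

x = scale_images(fake_images)
y = one_hot(fake_labels, num_classes=10)
print(x.min(), x.max(), y.shape, int(y.argmax(axis=1)[0]))
```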

In [36]:
class Trainer():
  def __init__(self, model, loss, optimizer):
    self._target = model
    self._target.compile(loss=loss, optimizer=optimizer, metrics=['accuracy'])
    self.verbose = 1
    self.log_dir = os.path.join(os.getcwd(), 'logdir')
    self.model_file_name = 'model_file.hdf5'

  def train(self, x_train, y_train, batch_size, epochs, validation_split):
    if os.path.exists(self.log_dir):
      import shutil
      shutil.rmtree(self.log_dir)
    os.mkdir(self.log_dir)

    self._target.fit(
      x_train, y_train,
      batch_size=batch_size, epochs=epochs,
      validation_split=validation_split,
      callbacks=[
        TensorBoard(log_dir=self.log_dir),
        ModelCheckpoint(os.path.join(self.log_dir, self.model_file_name), save_best_only=True)
      ],
      verbose=self.verbose)

In [37]:
dataset = CIFAR10Dataset()
x_train, y_train, x_test, y_test = dataset.get_batch()

model = network(dataset.image_shape, dataset.num_classes)
trainer = Trainer(model, loss='categorical_crossentropy', optimizer=RMSprop())
trainer.train(x_train, y_train, batch_size=128, epochs=12, validation_split=0.2)

# show result
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz

Epoch 1/12
Epoch 2/12
 44/313 [===>..........................] - ETA: 0s - loss: 1.4411 - accuracy: 0.4846
Epoch 3/12
Epoch 4/12
Epoch 5/12
Epoch 6/12
Epoch 7/12
Epoch 8/12
Epoch 9/12
Epoch 10/12
Epoch 11/12
Epoch 12/12
Test loss: 0.7863341569900513
Test accuracy: 0.7296000123023987
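
The `313` in the progress line of the training output follows from the arguments passed to `trainer.train`. The sketch below reproduces that arithmetic, assuming (as Keras does for NumPy inputs) that `validation_split` holds out a fraction of the training samples before batching.

```python
import math

num_train = 50_000        # CIFAR-10 training images
validation_split = 0.2    # fraction held out for validation
batch_size = 128

num_val = int(num_train * validation_split)        # samples used for validation
num_fit = num_train - num_val                      # samples actually trained on
steps_per_epoch = math.ceil(num_fit / batch_size)  # the last batch may be partial

print(num_fit, num_val, steps_per_epoch)
```

So each epoch trains on 40,000 images and validates on 10,000, and `save_best_only=True` in `ModelCheckpoint` compares the loss measured on that held-out portion.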


I believe this is the same code as in the reference book, yet the results are very different!

In [38]:
# check tensorboard
# !tensorboard --logdir=logdir

**Improvement by making the network deeper**

In [39]:
# Deeper CNN configuration (two Conv2D layers per block)

def network(input_shape, num_classes):
  model = Sequential()

  model.add(Conv2D(32, (3, 3), padding='same', input_shape=input_shape, activation='relu'))
  model.add(Conv2D(32, (3, 3), padding='same', activation='relu'))
  model.add(MaxPooling2D(pool_size=(2, 2)))
  model.add(Dropout(0.25))
  model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
  model.add(Conv2D(64, (3, 3), activation='relu'))
  model.add(MaxPooling2D(pool_size=(2, 2)))
  model.add(Dropout(0.5))
  model.add(Flatten())
  model.add(Dense(512, activation='relu'))
  model.add(Dropout(0.5))
  model.add(Dense(num_classes, activation='softmax'))
  return model
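
One detail of the deeper network is that its fourth convolution omits `padding='same'`, so it uses the Keras default `'valid'` and shrinks the feature map. The following sketch traces the spatial size through the layers, assuming stride-1 convolutions and non-overlapping 2x2 pooling; `conv_out` and `pool_out` are hypothetical helpers.

```python
def conv_out(size, kernel, padding):
    """Output height/width of a stride-1 convolution."""
    return size if padding == "same" else size - kernel + 1


def pool_out(size, pool):
    """Output height/width of non-overlapping max pooling."""
    return size // pool


size = 32
size = conv_out(size, 3, "same")    # Conv2D(32), padding='same'
size = conv_out(size, 3, "same")    # Conv2D(32), padding='same'
size = pool_out(size, 2)            # MaxPooling2D
size = conv_out(size, 3, "same")    # Conv2D(64), padding='same'
size = conv_out(size, 3, "valid")   # Conv2D(64), default padding
size = pool_out(size, 2)            # MaxPooling2D

flat_units = size * size * 64       # input width of the first Dense layer
print(size, flat_units)
```

Under these assumptions the `Flatten` layer feeds 3,136 values into `Dense(512)`, fewer than in the shallower network, even though more convolutional layers were added.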

In [40]:
dataset = CIFAR10Dataset()
x_train, y_train, x_test, y_test = dataset.get_batch()

model = network(dataset.image_shape, dataset.num_classes)
trainer = Trainer(model, loss='categorical_crossentropy', optimizer=RMSprop())
trainer.train(x_train, y_train, batch_size=128, epochs=12, validation_split=0.2)

# show result
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

Epoch 1/12
Epoch 2/12
Epoch 3/12
Epoch 4/12
Epoch 5/12
Epoch 6/12
Epoch 7/12
Epoch 8/12
Epoch 9/12
Epoch 10/12
Epoch 11/12
Epoch 12/12
Test loss: 0.6960493326187134
Test accuracy: 0.7573999762535095
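
To put the two runs side by side, the sketch below compares the test accuracies printed in this section. The numbers are copied from the outputs above and come from a single run each, so the exact gap may vary between runs.

```python
baseline_acc = 0.7296000123023987   # first network, from the output above
deeper_acc = 0.7573999762535095     # deeper network, from the output above

# Absolute gain in percentage points.
gain_points = (deeper_acc - baseline_acc) * 100

# Relative reduction of the test error rate.
baseline_err = 1.0 - baseline_acc
deeper_err = 1.0 - deeper_acc
relative_err_reduction = (baseline_err - deeper_err) / baseline_err

print(f"gain: {gain_points:.2f} points")
print(f"relative error reduction: {relative_err_reduction:.1%}")
```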
