IMDB movie review data > Building a Transformer-architecture model for sentiment classification

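1. Input: integer token sequences (length 80)
2. Token embedding + position embedding
3. Multi-head attention (3 heads)
4. Add (residual) + normalization
5. FFN (Dense + Dense)
6. Add (residual) + normalization
7. Classifier (Dense)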

# 1. Input: integer token sequences (length 80)

In [1]:
import tensorflow as tf
from tensorflow.keras import Model, layers


In [None]:
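# Token embedding: input_dim must cover every token id, so it matches num_words=10000 used when loading IMDB
inputs = layers.Input(shape=(80,))
input_embedding = layers.Embedding(input_dim=10000, output_dim=32)(inputs)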


# 2. Token embedding + position embedding

In [3]:
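# Position embedding: one 32-dim vector for each position 0..79
positions = tf.range(start=0, limit=80)
pos_embedding = layers.Embedding(input_dim=80, output_dim=32)(positions)
# The (80, 32) position table broadcasts over the batch axis of the (batch, 80, 32) token embeddings
pos_enc_output = pos_embedding + input_embedding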
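
The addition in the cell above relies on broadcasting: the position table has shape (80, 32) and the token embeddings have shape (batch, 80, 32), so the same position vector is added to every review in the batch. The following is a minimal NumPy sketch of that shape arithmetic only. The batch size of 4 and the random arrays are assumptions for illustration, not values taken from the model.

```python
import numpy as np

rng = np.random.default_rng(0)

batch, seq_len, dim = 4, 80, 32                      # batch size is an arbitrary assumption
token_emb = rng.normal(size=(batch, seq_len, dim))   # stand-in for input_embedding
pos_emb = rng.normal(size=(seq_len, dim))            # stand-in for pos_embedding

# (80, 32) is broadcast along the leading batch axis of (4, 80, 32)
combined = pos_emb + token_emb

print(combined.shape)
# Every sample in the batch receives the same positional offset
print(np.allclose(combined[0] - token_emb[0], combined[1] - token_emb[1]))
```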

# 3. Multi-head attention (3 heads)

In [None]:
attention_output = layers.MultiHeadAttention(num_heads=3, key_dim=32)(pos_enc_output, pos_enc_output)  # self-attention: query, value (key defaults to value)
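
To make the attention step more concrete, here is a rough NumPy sketch of scaled dot-product self-attention for a single head and a single sequence. It is not the Keras implementation: the Keras layer runs three such heads with separate projections and then projects the concatenated heads back to 32 dimensions, and this sketch omits biases and that output projection. The weight matrices `w_q`, `w_k`, `w_v` and the helper names are hypothetical, filled with random values.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    # Project the same input into queries, keys, and values
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Similarity of every position with every other position, scaled by sqrt(key_dim)
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)
    # Each output row is a weighted average of the value vectors
    return weights @ v, weights

rng = np.random.default_rng(0)
seq_len, d_model, key_dim = 80, 32, 32
x = rng.normal(size=(seq_len, d_model))          # stand-in for one embedded review
w_q = rng.normal(size=(d_model, key_dim))        # hypothetical projection weights
w_k = rng.normal(size=(d_model, key_dim))
w_v = rng.normal(size=(d_model, key_dim))

out, weights = self_attention(x, w_q, w_k, w_v)
print(out.shape, weights.shape)  # (80, 32) (80, 80)
```

Each row of `weights` sums to 1, so every output position is a convex combination of the value vectors of all 80 positions.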

# 4. Add (residual) + normalization

In [6]:
x = layers.add([pos_enc_output, attention_output])
x = layers.BatchNormalization()(x)
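
The cell above adds the attention output back onto its input and normalizes the sum with `BatchNormalization`, which normalizes each feature using statistics gathered across the batch. The original Transformer design pairs the residual connection with layer normalization instead (available in Keras as `layers.LayerNormalization`), which normalizes each token vector over its own features. As a point of comparison, here is a small NumPy sketch of "add, then layer-normalize". It leaves out the learned scale and offset, and the input arrays are random stand-ins, both of which are simplifying assumptions.

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    # Normalize every token vector over its feature axis (no learned gamma/beta here)
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
sublayer_in = rng.normal(size=(4, 80, 32))    # stand-in for pos_enc_output
sublayer_out = rng.normal(size=(4, 80, 32))   # stand-in for attention_output

# Residual connection followed by normalization
y = layer_norm(sublayer_in + sublayer_out)

print(y.shape)
print(np.allclose(y.mean(axis=-1), 0.0, atol=1e-6))
```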

# 5. FFN (Dense + Dense)
# 6. Add (residual) + normalization

In [8]:
from tensorflow.keras.models import Sequential
ffnn = Sequential(
  [
    layers.Dense(64,activation='relu'),
    layers.Dense(32, activation='relu')
  ]
)(x)
x = layers.add([ffnn, x])
x = layers.BatchNormalization()(x)

# 7. Classifier (Dense)

In [10]:
x = layers.GlobalAveragePooling1D()(x)
x = layers.Dropout(0.1)(x)
x = layers.Dense(64, activation='relu')(x)
x = layers.Dropout(0.1)(x)
outputs = layers.Dense(2, activation='softmax')(x)

# Build the model

In [11]:
model = Model(inputs=inputs, outputs=outputs)
model.summary()

In [12]:
# Specify the loss function and optimizer
model.compile(loss='sparse_categorical_crossentropy'
              , optimizer='adam'
              , metrics=['accuracy'])
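
With integer labels (0 or 1) and a two-unit softmax output, `sparse_categorical_crossentropy` amounts to the negative log of the probability the model assigns to the true class. The NumPy sketch below illustrates that calculation on three made-up predictions. The probabilities, labels, and function body are assumptions for illustration, and the small clipping Keras applies for numerical safety is not reproduced.

```python
import numpy as np

def sparse_categorical_crossentropy(y_true, probs):
    # Pick, for each sample, the probability assigned to its true class
    picked = probs[np.arange(len(y_true)), y_true]
    return -np.log(picked)

# Made-up softmax outputs for three reviews: columns are class 0 and class 1
probs = np.array([[0.9, 0.1],
                  [0.2, 0.8],
                  [0.6, 0.4]])
y_true = np.array([0, 1, 1])   # integer labels, no one-hot encoding needed

losses = sparse_categorical_crossentropy(y_true, probs)
print(losses)          # per-sample losses
print(losses.mean())   # the value reported as "loss" is an average of this kind
```

The third sample is penalized most because the true class only received probability 0.4.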

# Load the IMDB data

In [None]:
from tensorflow.keras.datasets import imdb
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=10000)
print(X_train.shape, y_train.shape, X_test.shape, y_test.shape)  # inspect shapes instead of echoing the raw arrays

In [14]:
# Preprocess the text data: pad or truncate every review to 80 tokens
from tensorflow.keras.preprocessing.sequence import pad_sequences
X_train_pad = pad_sequences(X_train, maxlen=80, padding='post', truncating='post')
X_test_pad = pad_sequences(X_test, maxlen=80, padding='post', truncating='post')
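
The two `'post'` arguments matter because the Keras defaults are `'pre'` for both: with `'post'`, short reviews get zeros appended at the end and long reviews keep their first 80 tokens. The pure-Python sketch below mimics that behavior for a single sequence. `pad_post` is a hypothetical helper written for illustration and is not part of Keras.

```python
def pad_post(seq, maxlen, value=0):
    """Hypothetical helper mirroring padding='post', truncating='post' for one sequence."""
    kept = list(seq[:maxlen])                      # truncating='post' drops tokens from the end
    return kept + [value] * (maxlen - len(kept))   # padding='post' appends the fill value

short_review = [5, 8, 2]
long_review = [1, 2, 3, 4, 5, 6, 7]

print(pad_post(short_review, maxlen=5))   # [5, 8, 2, 0, 0]
print(pad_post(long_review, maxlen=5))    # [1, 2, 3, 4, 5]
```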

In [15]:
model.fit(X_train_pad, y_train, epochs=10, batch_size=200)

Epoch 1/10
125/125 ━━━━━━━━━━━━━━━━━━━━ 11s 7ms/step - accuracy: 0.7106 - loss: 0.5438
Epoch 2/10
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.7818 - loss: 0.4591
Epoch 3/10
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.7886 - loss: 0.4480
Epoch 4/10
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.7930 - loss: 0.4423
Epoch 5/10
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.7965 - loss: 0.4345
Epoch 6/10
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.7978 - loss: 0.4276
Epoch 7/10
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8003 - loss: 0.4186
Epoch 8/10
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8034 - loss: 0.4140
Epoch 9/10
125/125 ━━━━━━━━━━━━━━━━━━

<keras.src.callbacks.history.History at 0x7c9bf01419c0>

In [16]:
model.evaluate(X_test_pad, y_test)

782/782 ━━━━━━━━━━━━━━━━━━━━ 7s 5ms/step - accuracy: 0.7723 - loss: 0.4973


[0.4972839057445526, 0.7723199725151062]

125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8059 - loss: 0.4047

In [18]:
import numpy as np
pred = model.predict(X_test_pad)
pred = np.argmax(pred, axis=1)

782/782 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step


In [19]:
from sklearn.metrics import confusion_matrix
confusion_matrix(y_test, pred)

array([[10335,  2165],
       [ 3527,  8973]])
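
In scikit-learn's layout, rows of the confusion matrix are true labels and columns are predicted labels, so the matrix above reads as [[TN, FP], [FN, TP]] when label 1 is treated as the positive class (an assumption made here for the sketch). The following snippet derives the usual summary metrics from those four counts. Its accuracy, 19308 / 25000 = 0.77232, agrees with the value returned by `model.evaluate`.

```python
import numpy as np

# Confusion matrix copied from the output above: rows = true label, columns = predicted label
cm = np.array([[10335, 2165],
               [3527, 8973]])
tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()
precision = tp / (tp + fp)   # of the reviews predicted as class 1, how many were class 1
recall = tp / (tp + fn)      # of the true class-1 reviews, how many were found
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

The gap between precision and recall reflects that the model misses class-1 reviews (3527 false negatives) more often than it mislabels class-0 reviews (2165 false positives).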