# Importing data with the Kaggle API

To download the dataset automatically, provide your Kaggle username and the API key associated with it.

INFO: https://www.kaggle.com/docs/api#authentication

In [1]:
import os

# Paste the values from your kaggle.json file here.
os.environ['KAGGLE_USERNAME'] = "username"  # username
os.environ['KAGGLE_KEY']      = "api_key"   # key
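
Hard-coding the key in a notebook makes it easy to leak when the notebook is shared. A minimal sketch of reading the same two values from a downloaded `kaggle.json` instead; the helper name `load_kaggle_credentials` and the file location are assumptions for illustration, not part of the Kaggle tooling:

```python
import json
import os
from pathlib import Path


def load_kaggle_credentials(path):
    """Read username/key from a kaggle.json file and export them as env vars.

    Hypothetical helper: the name and the path argument are illustrative.
    """
    creds = json.loads(Path(path).read_text(encoding="utf-8"))
    # kaggle.json stores the account name under "username" and the token under "key".
    os.environ["KAGGLE_USERNAME"] = creds["username"]
    os.environ["KAGGLE_KEY"] = creds["key"]
    return creds["username"]
```

After uploading the file to the working directory, `load_kaggle_credentials("kaggle.json")` would replace the two manual assignments above.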

In [2]:
!pip install kaggle
!kaggle datasets download -d ahemateja19bec1025/traffic-sign-dataset-classification

Dataset URL: https://www.kaggle.com/datasets/ahemateja19bec1025/traffic-sign-dataset-classification
License(s): CC0-1.0
Downloading traffic-sign-dataset-classification.zip to /content
 65% 123M/190M [00:00<00:00, 1.28GB/s]
100% 190M/190M [00:00<00:00, 788MB/s] 


Extracting the data

In [3]:
import zipfile

filename = "traffic-sign-dataset-classification.zip"
with zipfile.ZipFile(filename, "r") as archive:
    archive.extractall("dataset")
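
Re-running the cell above extracts the whole archive again. A small sketch of a guard that skips extraction when the target directory already has content and reports what is at the top level; `extract_once` is a hypothetical helper, not something the notebook defines:

```python
import zipfile
from pathlib import Path


def extract_once(zip_path, target_dir):
    """Extract zip_path into target_dir unless target_dir already has content.

    Hypothetical helper; returns the sorted top-level entries of target_dir.
    """
    target = Path(target_dir)
    if not target.exists() or not any(target.iterdir()):
        with zipfile.ZipFile(zip_path, "r") as archive:
            archive.extractall(target)
    return sorted(p.name for p in target.iterdir())
```

Calling `extract_once(filename, "dataset")` and inspecting the returned names is a quick way to confirm that the paths used in the following cells actually exist.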

# Loading the dataset

In [48]:
import keras

BATCH_SIZE = 32
IMG_SIZE   = (256, 256)

train_set = keras.utils.image_dataset_from_directory(
    directory='dataset/traffic_Data/DATA',
    labels='inferred',
    label_mode='categorical',
    batch_size=BATCH_SIZE,
    image_size=IMG_SIZE
)

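test_set = keras.utils.image_dataset_from_directory(
    directory='dataset/traffic_Data/TEST',
    labels=None,
    shuffle=False,      # Deterministic file order; the default shuffle=True breaks positional pairing with labels.
    batch_size=None,    # Important: leave unbatched here; batching happens after zipping with the labels.
    image_size=IMG_SIZE
)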

Found 4170 files belonging to 58 classes.
Found 1994 files.
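
With `labels='inferred'`, Keras assigns class indices by sorting the subdirectory names alphanumerically, so index `i` is position `i` in `train_set.class_names`, not necessarily the integer in the folder name. Assuming the class folders are named with integer ids (an assumption about this dataset's layout), a sketch of the mapping from id to training index; `class_id_to_index` is a hypothetical helper:

```python
def class_id_to_index(class_names):
    """Map integer folder names to the index Keras assigned to them.

    class_names is expected to look like train_set.class_names, e.g.
    ['0', '1', '10', ...]; assumes every name parses as an int.
    """
    return {int(name): index for index, name in enumerate(class_names)}


# Folder names sort as strings, so '10' comes before '2'.
example_names = sorted(str(i) for i in range(12))
mapping = class_id_to_index(example_names)
print(example_names[:5])  # ['0', '1', '10', '11', '2']
print(mapping[2])         # 4
```

If the folders follow that naming, any integer label taken from a CSV would need to pass through such a mapping before one-hot encoding, otherwise the training and test labels refer to different classes.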


Joining the test data with its labels (the test directory contains only images; they are not split into per-class subfolders)

In [49]:
import pandas as pd
import tensorflow as tf

df = pd.read_csv("dataset/labels.csv") \
       .sort_values("Name") \
       .reset_index(drop=True)

num_classes = train_set.element_spec[1].shape[-1]
labels_int  = df["ClassId"].astype(int).tolist()
labels_oh   = tf.one_hot(labels_int, depth=num_classes)

label_ds = tf.data.Dataset.from_tensor_slices(labels_oh)
test_set = tf.data.Dataset.zip((test_set, label_ds)).batch(BATCH_SIZE)
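
`tf.data.Dataset.zip` stops at the shortest input, so a label table with fewer rows than there are images silently shrinks the test set, and a different ordering silently mislabels it. A minimal sanity check worth running before zipping, written in plain Python; `check_alignment` is a hypothetical helper, and it assumes the `Name` column holds image file names, which the notebook does not confirm:

```python
import os


def check_alignment(file_paths, names):
    """Verify that names[i] matches the base name of file_paths[i] for all i.

    Raises ValueError on a length or ordering mismatch; returns the number
    of checked pairs otherwise.
    """
    if len(file_paths) != len(names):
        raise ValueError(
            f"{len(file_paths)} images but {len(names)} label rows"
        )
    for position, (path, name) in enumerate(zip(file_paths, names)):
        if os.path.basename(path) != name:
            raise ValueError(f"row {position}: {path!r} does not match {name!r}")
    return len(names)
```

In the notebook this could be called as `check_alignment(test_set.file_paths, df["Name"].tolist())` before `test_set` is reassigned; `file_paths` is the attribute Keras attaches to datasets built by `image_dataset_from_directory`. If `labels.csv` turns out to describe classes rather than individual images, the check fails and the per-image labels have to come from another source.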

In [50]:
num_classes = train_set.element_spec[1].shape[-1]

model = keras.Sequential([
    keras.Input(shape=IMG_SIZE + (3,)),  # 256x256 RGB; `input_shape=` on InputLayer is deprecated in Keras 3
    keras.layers.Rescaling(1./255),
    keras.layers.Conv2D(32, 3, activation='relu'),
    keras.layers.MaxPooling2D(),
    keras.layers.Conv2D(64, 3, activation='relu'),
    keras.layers.MaxPooling2D(),
    keras.layers.Flatten(),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(num_classes, activation='softmax'),
])



In [51]:
model.summary()
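
The output of `model.summary()` is not reproduced here. As a rough cross-check, the shapes and parameter counts can be worked out by hand, assuming the Keras defaults used above (`padding='valid'`, stride 1, 2x2 pooling) and 58 classes as reported by the loader:

```python
def conv_out(size, kernel=3):
    """Spatial size after a stride-1 'valid' convolution."""
    return size - kernel + 1


def conv_params(in_channels, filters, kernel=3):
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(inputs, units):
    """Weight matrix plus one bias per unit."""
    return inputs * units + units


side = 256
side = conv_out(side) // 2        # Conv2D(32) -> 254, MaxPooling2D -> 127
side = conv_out(side) // 2        # Conv2D(64) -> 125, MaxPooling2D -> 62
flat = side * side * 64           # Flatten -> 246016 features

total = (
    conv_params(3, 32)            # 896
    + conv_params(32, 64)         # 18496
    + dense_params(flat, 128)     # 31490176
    + dense_params(128, 58)       # 7482
)
print(flat, total)  # 246016 31517050
```

Nearly all parameters sit in the first `Dense` layer, because `Flatten` is applied to a 62x62x64 feature map.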

In [52]:
model.compile(
    optimizer = 'adam',
    loss      = 'categorical_crossentropy',
    metrics   = ['accuracy']
)

In [54]:
EPOCHS = 2

history = model.fit(
    train_set,
    epochs          = EPOCHS,
    validation_data = test_set
)

Epoch 1/2
131/131 ━━━━━━━━━━━━━━━━━━━━ 7s 56ms/step - accuracy: 1.0000 - loss: 6.3293e-04 - val_accuracy: 0.0000e+00 - val_loss: 23.3164
Epoch 2/2
131/131 ━━━━━━━━━━━━━━━━━━━━ 8s 60ms/step - accuracy: 1.0000 - loss: 2.9970e-04 - val_accuracy: 0.0172 - val_loss: 21.9078
