**Giới thiệu**

* Mô hình fask mask detection qua camera theo dõi hành vi đeo khẩu trang nơi công cộng.

* Ngày này, có rất nhiều virus truyền nhiễm qua đường hô hấp khi giao tiếp. Đeo khẩu trang nhằm tạo ra rào cản khi giao tiếp  giúpngăn ngừa và bảo vệ người đeo khỏi bị lây nhiễm các loại vi khuẩn, vi-rút, bụi bặm thông qua đường hô hấp.

*  Bộ dữ liệu được thu thập từ Kaggle: https://www.kaggle.com/datasets/andrewmvd/face-mask-detection?fbclid=IwY2xjawNJ-2JleHRuA2FlbQIxMQABHliTM7BPKg6MLRw_TwKuHo-w_NFd4Ic1H4LN3NBmPiwstLYPt4DuKijCELqk_aem_YfIg5F96RnGGx4aNknP2aA

* Bộ dữ liệu gồm gần 1000 ảnh màu đươc chia thành 3 lớp: with_mask, without_mask và mask_weared_incorrect.




**Thư viện và tải dữ liệu**

* google.colab.drive kết nối với google drive để lấy dữ liệu và ghi dữ liệu.

* os cho phép tương tác với hệ điều hành, dùng để liệt kê các tệp ảnh, chú thích, nối các đường dẫn với nhau, tách tên têp ra khỏi đuôi, tạo thư mục mới.

* cv2 chuyên sử lý thị giác máy tính , dùng để đọc tệp ảnh từ ổ đĩa, phát hiện các ảnh bị hỏng.

* xml.etree.ElementTree được sử dụng cho việc đọc, phân tích các tệp định dạng xml, tìm các thẻ, lấy dữ liệu bên trong thẻ.

* collections.Counter đếm số lần suất hiện các lớp.

* shutil sao chép tệp từ thư mục gốc vào các thư mục chia.

* sklearn.model_selecton -> train_test_split trong scikit-learn để chia bộ dữ liệu thành các mục train/val/test.

* random chọn ngẫu nhiên tiệp từ danh sách gốc

In [5]:
from google.colab import drive
drive.mount('/content/drive')


Mounted at /content/drive


In [None]:
!ls /content/drive/MyDrive/Face_mask_project


data  face_mask.yaml  notebooks  results  runs	yolo11n.pt  yolov8s.pt


In [11]:
import os
import cv2
import xml.etree.ElementTree as ET
from collections import Counter
import shutil
from sklearn.model_selection import train_test_split
import random

**Phân tích, khám phá dữ liệu**

* 853 ảnh màu, mỗi ảnh đều có annotation tương ứng.

* Tất cả các ảnh đều đọc được.

* 4072 object chia không đều vào các lớp, đây là một imbalance data.

* Không có annotations nào bị lỗi. Một số bị over bouding box nhưng tỉ lệ rất nhỏ vẫn sử dụng đươc mà không bi ảnh hưởng.

* Kích cỡ ảnh trung bình là 370.59 x 309.29 và kích cỡ bbox trung bình là 31.15 x 35.00.

In [7]:
BASE_DIR = '/content/drive/MyDrive/Face_mask_project'
DATA_DIR = f'{BASE_DIR}/data'
IMG_DIR  = f'{DATA_DIR}/images'
ANN_DIR  = f'{DATA_DIR}/annotations'

print("BASE_DIR:", BASE_DIR)
print("IMG_DIR:", IMG_DIR)
print("ANN_DIR:", ANN_DIR)


BASE_DIR: /content/drive/MyDrive/Face_mask_project
IMG_DIR: /content/drive/MyDrive/Face_mask_project/data/images
ANN_DIR: /content/drive/MyDrive/Face_mask_project/data/annotations


In [9]:
image_files = sorted([f for f in os.listdir(IMG_DIR) if f.lower().endswith(('.jpg', '.jpeg', '.png'))])
anno_files  = sorted([f for f in os.listdir(ANN_DIR) if f.lower().endswith('.xml')])

print(f"Number of images: {len(image_files)}")
print(f"Number of annotations: {len(anno_files)}")

image_names = {os.path.splitext(f)[0] for f in image_files}
anno_names  = {os.path.splitext(f)[0] for f in anno_files}

missing_annos = image_names - anno_names
missing_images = anno_names - image_names

print(f"Images without annotations: {len(missing_annos)}")
print(f"Annotations without images: {len(missing_images)}")


Number of images: 853
Number of annotations: 853
Images without annotations: 0
Annotations without images: 0


In [None]:
broken_images = []

for f in image_files:
    path = os.path.join(IMG_DIR, f)
    img = cv2.imread(path)
    if img is None:
        broken_images.append(f)

print(f"Number of broken images: {len(broken_images)}")


Number of broken images: 0


In [10]:
VALID_CLASSES = ['with_mask', 'without_mask', 'mask_weared_incorrect']
class_counter = Counter()
broken_annos, invalid_bbox_files = [], []
total_objects = total_img_count = total_bbox_count = 0
total_img_width = total_img_height = total_bbox_width = total_bbox_height = 0

for f in anno_files:
    path = os.path.join(ANN_DIR, f)
    try:
        tree = ET.parse(path)
        root = tree.getroot()
        #parse XML
        size = root.find('size')
        if size is None:
            broken_annos.append(f"Missing <size> tag: {f}")
            continue

        w, h = int(size.find('width').text), int(size.find('height').text)
        total_img_width += w
        total_img_height += h
        total_img_count += 1

        objects = root.findall('object')
        if not objects:
            broken_annos.append(f"No object tag: {f}")
            continue

        max_x_over = max_y_over = 0
        has_invalid_bbox = False
        invalid_classes = []

        #classes
        for obj in objects:
            name = obj.find('name').text
            class_counter[name] += 1
            total_objects += 1
            if name not in VALID_CLASSES:
                invalid_classes.append(name)

            #bounding-box
            b = obj.find('bndbox')
            xmin, ymin, xmax, ymax = map(int, [b.find(tag).text for tag in ('xmin', 'ymin', 'xmax', 'ymax')])
            bbox_w, bbox_h = xmax - xmin, ymax - ymin
            total_bbox_width += bbox_w
            total_bbox_height += bbox_h
            total_bbox_count += 1

            #over bouding-box
            x_over = max(0, -xmin, xmax - w) / w * 100
            y_over = max(0, -ymin, ymax - h) / h * 100
            if x_over > 0 or y_over > 0:
                has_invalid_bbox = True
                max_x_over, max_y_over = max(max_x_over, x_over), max(max_y_over, y_over)

        if invalid_classes:
            broken_annos.append(f"Invalid class {invalid_classes} in file: {f}")
        if has_invalid_bbox:
            invalid_bbox_files.append((f, max_x_over, max_y_over))

    except Exception as e:
        broken_annos.append(f"Error parsing {f}: {e}")

avg_bbox_w = total_bbox_width / total_bbox_count if total_bbox_count else 0
avg_bbox_h = total_bbox_height / total_bbox_count if total_bbox_count else 0
avg_img_w  = total_img_width / total_img_count if total_img_count else 0
avg_img_h  = total_img_height / total_img_count if total_img_count else 0

print(f"Total objects: {total_objects}")
for cls, nb in class_counter.items():
    status = "(VALID)" if cls in VALID_CLASSES else "(INVALID)"
    print(f"{cls}: {nb} {status}")

print(f"\n Broken annotations: {len(broken_annos)}")

print(f"\n Invalid bounding boxes: {len(invalid_bbox_files)}")
for fname, x, y in invalid_bbox_files:
    print(f" - {fname}: X exceed={x:.2f}%, Y exceed={y:.2f}%")

print(f"\n Average image size: {avg_img_w:.2f} x {avg_img_h:.2f}")
print(f" Average bounding box size: {avg_bbox_w:.2f} x {avg_bbox_h:.2f}")


Total objects: 4072
without_mask: 717 (VALID)
with_mask: 3232 (VALID)
mask_weared_incorrect: 123 (VALID)

 Broken annotations: 0

 Invalid bounding boxes: 11
 - maksssksksss110.xml: X exceed=0.25%, Y exceed=0.00%
 - maksssksksss231.xml: X exceed=0.25%, Y exceed=0.00%
 - maksssksksss251.xml: X exceed=0.25%, Y exceed=0.00%
 - maksssksksss457.xml: X exceed=0.25%, Y exceed=0.00%
 - maksssksksss5.xml: X exceed=0.25%, Y exceed=0.00%
 - maksssksksss501.xml: X exceed=0.25%, Y exceed=0.00%
 - maksssksksss603.xml: X exceed=0.25%, Y exceed=0.00%
 - maksssksksss616.xml: X exceed=0.25%, Y exceed=0.00%
 - maksssksksss706.xml: X exceed=0.25%, Y exceed=0.00%
 - maksssksksss787.xml: X exceed=0.25%, Y exceed=0.00%
 - maksssksksss93.xml: X exceed=0.25%, Y exceed=0.00%

 Average image size: 370.59 x 309.29
 Average bounding box size: 31.15 x 35.00


**Preprocessing**

* Chia dữ liệu thành 3 phần: 70% train, 20% val và 10% test.

* Kiểm tra lại các dữ liệu đã chia.

* Chuyển đổi dãn nhãn chứ thành dạng số.

* Chuyển đổi dữ liệu ảnh sang txt để sủ dụng YOLO.

* Oversample dữ liệu train giúp cân bằng dữ liệu, mô hình học tốt hơn.

In [8]:
TRAIN_RATIO = 0.7
VAL_RATIO   = 0.2
TEST_RATIO  = 0.1
SEED = 42

pairs, labels = [], []
for xml_file in anno_files:
    base = os.path.splitext(xml_file)[0]
    for ext in ['.jpg', '.jpeg', '.png']:
        img_file = base + ext
        if img_file in image_files:
            pairs.append((img_file, xml_file))
            tree = ET.parse(os.path.join(ANN_DIR, xml_file))
            objs = tree.getroot().findall('object')
            labels.append(objs[0].find('name').text if objs else 'no_object')
            break

#Split train+val/test
trainval, test, y_trainval, y_test = train_test_split(
    pairs, labels, test_size=TEST_RATIO, random_state=SEED, stratify=labels
)
#Split train/val
val_ratio_adj = VAL_RATIO / (1 - TEST_RATIO)
train, val, _, _ = train_test_split(
    trainval, y_trainval, test_size=val_ratio_adj, random_state=SEED, stratify=y_trainval
)

#Copy files
for split_name, split_pairs in zip(['train','val','test'], [train,val,test]):
    img_out = os.path.join(DATA_DIR, split_name, 'images')
    ann_out = os.path.join(DATA_DIR, split_name, 'annotations')
    os.makedirs(img_out, exist_ok=True)
    os.makedirs(ann_out, exist_ok=True)
    for img_file, xml_file in split_pairs:
        shutil.copy(os.path.join(IMG_DIR, img_file), os.path.join(img_out, img_file))
        shutil.copy(os.path.join(ANN_DIR, xml_file), os.path.join(ann_out, xml_file))

for split_name, split_pairs in zip(['train','val','test'], [train,val,test]):
    print(f"{split_name}: {len(split_pairs)} pairs")

train: 596 pairs
val: 171 pairs
test: 86 pairs


In [11]:
splits = ['train', 'val', 'test']

for split in splits:
    ann_dir = os.path.join(DATA_DIR, split, 'annotations')
    class_counter = Counter()
    for xml_file in os.listdir(ann_dir):
        if not xml_file.lower().endswith('.xml'):
            continue
        tree = ET.parse(os.path.join(ann_dir, xml_file))
        root = tree.getroot()
        for obj in root.findall('object'):
            class_name = obj.find('name').text
            class_counter[class_name] += 1
    total_objects = sum(class_counter.values())
    print(f"\n{split.upper()}")
    print(f"Total objects: {total_objects}")
    for cls, count in class_counter.items():
        print(f"{cls}: {count}")



TRAIN
Total objects: 3770
without_mask: 663
with_mask: 3000
mask_weared_incorrect: 107

VAL
Total objects: 1481
without_mask: 241
with_mask: 1197
mask_weared_incorrect: 43

TEST
Total objects: 650
with_mask: 513
without_mask: 113
mask_weared_incorrect: 24


In [10]:
def check_dataset(img_dir, ann_dir):
    image_files = sorted([f for f in os.listdir(img_dir) if f.lower().endswith(('.jpg','.jpeg','.png'))])
    anno_files  = sorted([f for f in os.listdir(ann_dir) if f.lower().endswith('.xml')])

    image_names = {os.path.splitext(f)[0] for f in image_files}
    anno_names  = {os.path.splitext(f)[0] for f in anno_files}

    missing_annos = image_names - anno_names
    missing_images = anno_names - image_names

    print(f"Folder: {img_dir}")
    print(f"Images without annotations: {len(missing_annos)}")
    print(f"Annotations without images: {len(missing_images)}")

check_dataset(os.path.join(DATA_DIR,'train/images'), os.path.join(DATA_DIR,'train/annotations'))
check_dataset(os.path.join(DATA_DIR,'val/images'),   os.path.join(DATA_DIR,'val/annotations'))
check_dataset(os.path.join(DATA_DIR,'test/images'),  os.path.join(DATA_DIR,'test/annotations'))


Folder: /content/drive/MyDrive/Face_mask_project/data/train/images
Images without annotations: 0
Annotations without images: 0
Folder: /content/drive/MyDrive/Face_mask_project/data/val/images
Images without annotations: 0
Annotations without images: 0
Folder: /content/drive/MyDrive/Face_mask_project/data/test/images
Images without annotations: 0
Annotations without images: 0


In [24]:
FOLDERS = ["train", "val", "test"]

CLASS_MAP = {
    'with_mask': 0,
    'without_mask': 1,
    'mask_weared_incorrect': 2
}

# Convert XML to YOLO
def convert_xml_to_yolo(xml_file, classes):
    tree = ET.parse(xml_file)
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)

    yolo_lines = []

    for obj in root.findall('object'):
        cls_name = obj.find('name').text
        if cls_name not in classes:
            continue
        cls_id = classes[cls_name]

        bbox = obj.find('bndbox')
        xmin = float(bbox.find('xmin').text)
        ymin = float(bbox.find('ymin').text)
        xmax = float(bbox.find('xmax').text)
        ymax = float(bbox.find('ymax').text)

        x_center = (xmin + xmax) / 2.0 / w
        y_center = (ymin + ymax) / 2.0 / h
        width = (xmax - xmin) / w
        height = (ymax - ymin) / h

        yolo_lines.append(f"{cls_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}")

    return yolo_lines

for folder in FOLDERS:
    img_dir = Path(DATA_DIR) / folder
    label_dir = img_dir / "labels"
    label_dir.mkdir(parents=True, exist_ok=True)

    xml_dir = Path(DATA_DIR)/ folder / "annotations"
    xml_files = list(xml_dir.glob("*.xml"))
    print(f"Processing {len(xml_files)} XMLs in {folder} ...")


    for xml_file in xml_files:
        yolo_lines = convert_xml_to_yolo(xml_file, CLASS_MAP)
        if not yolo_lines:
            continue

        txt_file = label_dir / f"{xml_file.stem}.txt"

        with open(txt_file, "w") as f:
            f.write("\n".join(yolo_lines))

    print(f"Finished {folder}: labels saved in {label_dir}")

print("Finish transfer train/val/test!")




Processing 781 XMLs in train ...
Finished train: labels saved in /content/drive/MyDrive/Face_mask_project/data/train/labels
Processing 310 XMLs in val ...
Finished val: labels saved in /content/drive/MyDrive/Face_mask_project/data/val/labels
Processing 161 XMLs in test ...
Finished test: labels saved in /content/drive/MyDrive/Face_mask_project/data/test/labels
Finish transfer train/val/test!


In [12]:
TRAIN_DATA_DIR = "/content/drive/MyDrive/Face_mask_project/data/train"
TRAIN_IMGS_DIR = os.path.join(TRAIN_DATA_DIR, "images")
TRAIN_LABELS_DIR = os.path.join(TRAIN_DATA_DIR, "labels")

OVERSAMPLED_DIR = "/content/drive/MyDrive/Face_mask_project/data/train_oversampled"
OV_IMG = os.path.join(OVERSAMPLED_DIR, "images")
OV_LBL = os.path.join(OVERSAMPLED_DIR, "labels")

os.makedirs(OV_IMG, exist_ok=True)
os.makedirs(OV_LBL, exist_ok=True)

class_images = {0: [], 1: [], 2: []}

for lbl_filename in os.listdir(TRAIN_LABELS_DIR):
    if not lbl_filename.lower().endswith('.txt'):
        continue

    lbl_file_path = os.path.join(TRAIN_LABELS_DIR, lbl_filename)

    with open(lbl_file_path) as f:
        lines = f.readlines()
        if not lines:
            continue

        classes_in_file = set(int(line.split()[0]) for line in lines)

        file_stem = os.path.splitext(lbl_filename)[0]
        img_filename = f"{file_stem}.png"
        img_file_path = os.path.join(TRAIN_IMGS_DIR, img_filename)

        for cls in classes_in_file:
            class_images[cls].append((img_file_path, lbl_file_path))

# 3/2/1
class_sizes = {cls: len(files) for cls, files in class_images.items()}
max_class = max(class_sizes, key=class_sizes.get)
base_count = class_sizes[max_class]

target_counts = {
    max_class: base_count,
}
other_classes = [c for c in class_images.keys() if c != max_class]
target_counts[other_classes[0]] = int(base_count * 2/3)
target_counts[other_classes[1]] = int(base_count * 1/3)

#Oversampling and copy
for cls, files in class_images.items():
    n_needed = target_counts[cls]
    if len(files) == 0:
        continue

    n_repeat = n_needed // len(files)
    remainder = n_needed % len(files)
    all_files = files * n_repeat + random.sample(files, remainder)

    for img_file, lbl_file in all_files:
        img_filename = os.path.basename(img_file)
        dst_img = os.path.join(OV_IMG, img_filename)

        lbl_filename = os.path.basename(lbl_file)
        dst_lbl = os.path.join(OV_LBL, lbl_filename)

        if not os.path.exists(dst_img):
            shutil.copy(img_file, dst_img)
        if not os.path.exists(dst_lbl):
            shutil.copy(lbl_file, dst_lbl)

print("Dataset oversampled:", OVERSAMPLED_DIR)


Dataset oversampled: /content/drive/MyDrive/Face_mask_project/data/train_oversampled
