# FreeWheel Logo Detection Model Training Guide

## 1. Prepare the LogoDet-3K data in S3

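Run the script below to install the dependencies needed for this experiment and to prepare the raw dataset. Read the comments in the script and adjust it to your own environment.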

In [3]:
!/bin/bash ./logodet-prep.sh 1>/dev/null

Cloning into 's5cmd'...
remote: Enumerating objects: 11174, done.
remote: Counting objects: 100% (7/7), done.
remote: Compressing objects: 100% (6/6), done.
remote: Total 11174 (delta 0), reused 4 (delta 0), pack-reused 11167
Receiving objects: 100% (11174/11174), 22.52 MiB | 26.59 MiB/s, done.
Resolving deltas: 100% (5601/5601), done.
100%|██████████████████████████████████████| 2.87G/2.87G [00:36<00:00, 85.2MB/s]


Process the LogoDet-3K data so that it matches the YOLOv8 data format:

    LogoDet-3K/
    ├── cfg
    ├── datasets
    │   ├── images
    │   │   ├── train
    │   │   └── val
    │   └── labels
    │       ├── train
    │       └── val
    └── weights

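- The cfg folder stores the training configuration file, which specifies the paths to the training data and labels as well as the mapping between class IDs and class names.
- The datasets folder stores the training data; organize it according to the directory structure shown above.
- (Optional) The weights folder stores weights .pt files. An already-trained .pt file can be loaded into the model and used as the starting point for incremental training.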
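
The conversion and split cells that follow write into this layout, so the folders need to exist beforehand. It is not shown whether `logodet-prep.sh` creates them; as a minimal sketch under that uncertainty, the helper below (the names `LAYOUT` and `make_layout` are illustrative, not part of the original notebook) creates the skeleton and is safe to run repeatedly.

```python
import os

# Sub-directories taken from the layout described above.
LAYOUT = [
    'cfg',
    'datasets/images/train',
    'datasets/images/val',
    'datasets/labels/train',
    'datasets/labels/val',
    'weights',
]


def make_layout(root):
    """Create the dataset skeleton under `root` and return the created paths."""
    created = []
    for sub_dir in LAYOUT:
        path = os.path.join(root, *sub_dir.split('/'))
        os.makedirs(path, exist_ok=True)  # no error if the folder already exists
        created.append(path)
    return created
```

For this notebook the call would be `make_layout('train_data/LogoDet-3K')`, matching `LD3K_DATA_DIR` in the next cell.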

In [4]:
'''
Prepare the LogoDet-3K data: convert the dataset's annotations to YOLO format. For details on the label format, see:
https://docs.ultralytics.com/yolov5/tutorials/train_custom_data/#12-create-labels
'''

import os
import shutil
import xmltodict
from tqdm import tqdm

# This dict stores the mapping from class name to class ID
class_names_map = {'logo': 0}

DET_DATA_DIR = 'train_data' # training data directory
LD3K_DATA_DIR = DET_DATA_DIR + '/' + 'LogoDet-3K' # LogoDet-3K data directory
LD3K_DS_DIR = LD3K_DATA_DIR + '/' + 'datasets' # LogoDet-3K training dataset directory
RAW_LD3K_DS_DIR = 'raw_data/LogoDet-3K' # directory holding the raw LogoDet-3K data

sub_dirs = os.listdir(RAW_LD3K_DS_DIR)
for sub_dir in tqdm(sub_dirs):
    if 'DS_Store' not in sub_dir:
        sub_sub_dirs = os.listdir(os.path.join(RAW_LD3K_DS_DIR, sub_dir))
        for sub_sub_dir in sub_sub_dirs:
            if 'DS_Store' not in sub_sub_dir:
                filenames = os.listdir(os.path.join(RAW_LD3K_DS_DIR, sub_dir, sub_sub_dir))
                for filename in filenames:
                    old_filename = os.path.join(RAW_LD3K_DS_DIR, sub_dir, sub_sub_dir, filename)
                    if filename.endswith('xml'):
                        new_filename = os.path.join(LD3K_DS_DIR, 'labels/train', sub_dir+'_'+sub_sub_dir+'_'+filename.replace('xml', 'txt'))
                        with open(old_filename, encoding='utf-8') as file_object:
                            all_the_xmlStr = file_object.read()
                        convertedDict = xmltodict.parse(all_the_xmlStr)
                #         print(len(convertedDict['annotation']['object']))
                        if 'object' in convertedDict['annotation']:
                            fix_width = int(convertedDict['annotation']['size']['width'])
                            fix_height = int(convertedDict['annotation']['size']['height'])
                            
                            objs = convertedDict['annotation']['object']
                            if not isinstance(objs,list):
                                objs = [objs]
                #                 print('objs:', objs)
                            with open(new_filename, 'w') as fout:
                                for annotation in objs:
                                    # class_id = 0
                                    if annotation['name'] not in class_names_map:
                                        class_names_map[annotation['name']] = len(class_names_map)
                                    class_id = class_names_map[annotation['name']]

                                    xmin = int(annotation['bndbox']['xmin'])
                                    ymin = int(annotation['bndbox']['ymin'])
                                    xmax = int(annotation['bndbox']['xmax'])
                                    ymax = int(annotation['bndbox']['ymax'])

                                    w = xmax-xmin
                                    h = ymax-ymin

                                    if w>0 and h>0:
                                        center_x = (xmin+xmax)/2
                                        center_y = (ymin+ymax)/2
                                        fout.write(str(class_id)+' '+str(center_x/fix_width)+' '+str(center_y/fix_height)+' '+str(w/fix_width)+' '+str(h/fix_height)+'\n')
                                        # Also write the same box under class ID 0, the generic 'logo' class
                                        fout.write(str(0)+' '+str(center_x/fix_width)+' '+str(center_y/fix_height)+' '+str(w/fix_width)+' '+str(h/fix_height)+'\n')
                        else:
                            print('Delete', old_filename)
                            os.remove(old_filename)
                    elif filename.endswith('jpg'):
                        new_filename = os.path.join(LD3K_DS_DIR, 'images/train', sub_dir+'_'+sub_sub_dir+'_'+filename)
                        shutil.copy(old_filename, new_filename)
                    else:
                        print('Warning:', old_filename)

100%|██████████| 9/9 [00:51<00:00,  5.71s/it]
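
The core of the cell above is the conversion from a Pascal VOC pixel box (`xmin`, `ymin`, `xmax`, `ymax`) to the normalized YOLO form (`center_x`, `center_y`, `width`, `height`). The sketch below isolates that arithmetic in a standalone function; the name `voc_to_yolo` is illustrative and does not appear in the notebook.

```python
def voc_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a VOC pixel box to YOLO (cx, cy, w, h), normalized by image size."""
    box_w = xmax - xmin
    box_h = ymax - ymin
    if box_w <= 0 or box_h <= 0:
        return None  # degenerate box, skipped just like in the cell above
    center_x = (xmin + xmax) / 2
    center_y = (ymin + ymax) / 2
    return (center_x / img_w, center_y / img_h, box_w / img_w, box_h / img_h)


# A 200x100 px box with its top-left corner at (100, 50) in a 400x200 px image.
print(voc_to_yolo(100, 50, 300, 150, 400, 200))  # (0.5, 0.5, 0.5, 0.5)
```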


In [5]:
'''
Split the data into a training set and a validation set
'''

import os
from sklearn.model_selection import train_test_split

filenames = os.listdir(os.path.join(LD3K_DS_DIR, 'images/train'))
train_filenames, test_filenames = train_test_split(filenames, test_size=0.2)
print(len(filenames), len(train_filenames), len(test_filenames))
for filename in tqdm(test_filenames):
    old_filename = os.path.join(LD3K_DS_DIR, 'images/train', filename)
    new_filename = os.path.join(LD3K_DS_DIR, 'images/val', filename)
    shutil.move(old_filename, new_filename)
    
    old_filename = os.path.join(LD3K_DS_DIR, 'labels/train', filename.replace('jpg', 'txt'))
    new_filename = os.path.join(LD3K_DS_DIR, 'labels/val', filename.replace('jpg', 'txt'))
    if os.path.exists(old_filename):
        shutil.move(old_filename, new_filename)
    else:
        print('Not exist:', old_filename)

158654 126923 31731


100%|██████████| 31731/31731 [00:04<00:00, 6545.53it/s] 
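
The split cell prints `Not exist:` when a moved image has no label file. As an optional follow-up check, the sketch below lists every `.jpg` in an image folder that has no same-named `.txt` in the matching label folder. The function name `find_unpaired` is an assumption made for illustration.

```python
import os


def find_unpaired(images_dir, labels_dir):
    """Return sorted image filenames that have no same-stem .txt label file."""
    label_stems = {
        os.path.splitext(name)[0]
        for name in os.listdir(labels_dir)
        if name.endswith('.txt')
    }
    return sorted(
        name
        for name in os.listdir(images_dir)
        if name.endswith('.jpg') and os.path.splitext(name)[0] not in label_stems
    )
```

In this notebook it could be called as `find_unpaired(os.path.join(LD3K_DS_DIR, 'images/val'), os.path.join(LD3K_DS_DIR, 'labels/val'))`, and likewise for the `train` folders.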


In [12]:
print('class_names_map:', len(class_names_map))

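# Create the config file needed for LogoDet-3K training. The training script loads this file, so 'path' must point to the training data location inside the SageMaker training container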
cfg_path = os.path.join(LD3K_DATA_DIR, 'cfg', 'LogoDet-3K.yaml')
with open(cfg_path, 'w') as fout:
    fout.write('path: ' + '/opt/ml/input/data/' + '  # dataset root dir\n')
    fout.write('train: images/train  # train images (relative to \'path\')\n')
    fout.write('val: images/val  # val images (relative to \'path\')\n')
    fout.write('test:  # test images (optional)\n')
    fout.write('names:\n')
    for k,v in class_names_map.items():
        fout.write('  '+str(v)+': '+str(k)+'\n')

class_names_map: 2994
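
The cell above writes class names into the YAML file unquoted. If a class name contains characters such as `:` or `#`, the plain form may not parse as intended. The sketch below is one possible variant, not the notebook's method: it builds the same file content as a string and quotes each name with `json.dumps`, relying on JSON strings being accepted as double-quoted YAML scalars. The function name `render_dataset_yaml` and the sample class name are hypothetical.

```python
import json


def render_dataset_yaml(names_map, root='/opt/ml/input/data/'):
    """Build the dataset YAML text, with class names quoted and sorted by ID."""
    lines = [
        'path: ' + root,
        'train: images/train',
        'val: images/val',
        'names:',
    ]
    for name, class_id in sorted(names_map.items(), key=lambda item: item[1]):
        # json.dumps adds double quotes and escapes special characters
        lines.append('  {}: {}'.format(class_id, json.dumps(name)))
    return '\n'.join(lines) + '\n'


# 'Brand: A' is a made-up class name containing a colon.
print(render_dataset_yaml({'logo': 0, 'Brand: A': 1}))
```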


Upload the LogoDet-3K data to S3 with s5cmd

In [13]:
# Specify the target S3 bucket and the prefix (folder) within the bucket to upload to
%env TRN_BUCKET=sagemaker-us-west-2-935206693453 
%env PRE=fw-logo-detection

#!aws s3api put-object --bucket $TRN_BUCKET --key $PRE
#!docker run --rm -v $(pwd):/aws -v ~/.aws:/root/.aws s5cmd rm s3://sagemaker-us-west-2-935206693453/fw-logo-detection/* 1>/dev/null
!docker run --rm -v $(pwd):/aws -v ~/.aws:/root/.aws s5cmd sync /aws/train_data s3://$TRN_BUCKET/$PRE/ 1>/dev/null

env: TRN_BUCKET=sagemaker-us-west-2-935206693453
env: PRE=fw-logo-detection


## 2. Train on the LogoDet-3K data

Initialization

In [9]:
import os
import sagemaker
print(sagemaker.__version__)

from sagemaker.pytorch import PyTorch
from sagemaker.pytorch.model import PyTorchModel

sagemaker_session = sagemaker.Session()

role = sagemaker.get_execution_role()

2.146.0


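Prepare the input parameters for the SageMaker training job. The S3 data listed in the inputs is copied into the /opt/ml/input/data/ directory of the training container, and each channel is stored in a subdirectory named after the channel.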

In [54]:
TRN_BUCKET='sagemaker-us-west-2-935206693453'
PRE='fw-logo-detection'
LD3K_PRE=PRE + '/' + LD3K_DATA_DIR
data_location = 's3://{}/{}'.format(TRN_BUCKET, LD3K_PRE)

logo3k_inputs = {
    'cfg': data_location+'/cfg', 
    #'weights': data_location+'/weights', 
    'images': data_location+'/datasets/images', 
    'labels': data_location+'/datasets/labels'}
print(logo3k_inputs)

{'cfg': 's3://sagemaker-us-west-2-935206693453/fw-logo-detection/train_data/LogoDet-3K/cfg', 'images': 's3://sagemaker-us-west-2-935206693453/fw-logo-detection/train_data/LogoDet-3K/datasets/images', 'labels': 's3://sagemaker-us-west-2-935206693453/fw-logo-detection/train_data/LogoDet-3K/datasets/labels'}
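
The channel-to-directory rule described above explains why the YAML config uses `path: /opt/ml/input/data/` together with `train: images/train`: the `images` channel lands in `/opt/ml/input/data/images`, which then contains `train` and `val`. The sketch below spells out that mapping; the helper `channel_paths` and the example bucket name are illustrative only.

```python
import posixpath

# Directory inside the training container where input channels are placed.
SM_INPUT_ROOT = '/opt/ml/input/data'


def channel_paths(inputs):
    """Map each channel name in `inputs` to its directory inside the container."""
    return {channel: posixpath.join(SM_INPUT_ROOT, channel) for channel in inputs}


# Placeholder S3 locations; only the channel names matter for the mapping.
example_inputs = {
    'cfg': 's3://example-bucket/prefix/cfg',
    'images': 's3://example-bucket/prefix/datasets/images',
    'labels': 's3://example-bucket/prefix/datasets/labels',
}
for channel, path in channel_paths(example_inputs).items():
    print(channel, '->', path)
```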


In [57]:
hyperparameters = {'data': '/opt/ml/input/data/cfg/LogoDet-3K.yaml', 
                   'weight': 'yolov8s.pt', # use the YOLO pretrained weights
                   'project': '/opt/ml/model/',
                   'name': 'fw-logo-detection', 'imgsz': 640, 'batch': 4, 'epochs': 1, 'workers':1}  # Single CPU or GPU
                   # 'name': 'fw-logo-detection', 'imgsz': 640, 'batch': 12, 'epochs': 1, 'device': '0,1,2,3', 'workers':1}  # Multi-GPU: DP Mode

instance_type = 'ml.p3.2xlarge'  # 'ml.p3.2xlarge' or 'ml.p3.8xlarge' or ...


metric_definitions = [{'Name': 'mAP50',
                       'Regex': r'^all\s+(?:[\d.]+\s+){4}([\d.]+)'}]

logo3k_estimator = PyTorch(entry_point='train.py',
                            source_dir='./code/',
                            role=role,
                            hyperparameters=hyperparameters,
                            framework_version='1.13.1', # 2.0.1
                            py_version='py39', # py310
                            # framework_version='2.0.1',
                            # py_version='py310',
                            script_mode=True,
                            instance_count=1,  # 1 or 2 or ...
                            metric_definitions=metric_definitions,
                            instance_type=instance_type,
                            # distribution={
                            #     "torch_distributed": {
                            #         "enabled": True
                            #     }
                            # }
            )

logo3k_estimator.fit(logo3k_inputs)

INFO:botocore.credentials:Found credentials from IAM Role: BaseNotebookInstanceEc2InstanceRole
INFO:sagemaker.image_uris:image_uri is not presented, retrieving image_uri based on instance_type, framework etc.
INFO:sagemaker:Creating training-job with name: pytorch-training-2023-06-20-06-13-39-318


2023-06-20 06:13:39 Starting - Starting the training job.

KeyboardInterrupt: 
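
The `mAP50` metric regex can be checked locally before launching a job. The sketch below applies the same pattern to a made-up summary line with the columns class, images, instances, precision, recall, mAP50, mAP50-95. The sample line and the helper `extract_map50` are assumptions; the real log format depends on what `train.py` prints. The second call shows that the `^all` anchor does not match when the line starts with whitespace, which is worth verifying against actual training logs.

```python
import re

# Same pattern as in metric_definitions, written as a raw string.
MAP50_REGEX = r'^all\s+(?:[\d.]+\s+){4}([\d.]+)'


def extract_map50(log_line):
    """Return the fifth numeric column after 'all' as a float, or None if no match."""
    match = re.search(MAP50_REGEX, log_line)
    return float(match.group(1)) if match else None


# Hypothetical summary line; the values are invented for illustration.
sample = 'all        128        929      0.640      0.537      0.605      0.446'
print(extract_map50(sample))           # 0.605
print(extract_map50('    ' + sample))  # None
```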

## 3. Prepare the FreeWheel-5K data in S3

In [29]:
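'''
Organize the data by video name; the video name is obtained by splitting each filename on '#'.
The result is a dict[str, list]: the key is the video name and the value is the list of filenames that belong to it.
'''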
import os
import shutil
from sklearn.model_selection import train_test_split

base_dir = 'train_data/FreeWheel-5K-by-video-name/datasets/images/train'
filenames = os.listdir(base_dir)
video_names = {}
for filename in filenames:
    video_name = filename.split('#')[0]
    if video_name not in video_names:
        video_names[video_name] = []
    video_names[video_name].append(filename)
print('video_names:', len(video_names))

video_names: 1182


In [30]:
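'''
Split the data into training and validation sets at a ratio of 8:2.
The split is based on the video name dict from the previous step; the validation files are moved from images/train to images/val, and their labels from labels/train to labels/val.
'''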
train_video_names, val_video_names = train_test_split(list(video_names.keys()), test_size=0.2)
for video_name in val_video_names:
    val_filenames = video_names[video_name]
    for filename in val_filenames:
        if filename.endswith('jpg'):
            filename = os.path.join(base_dir, filename)
            shutil.move(filename, filename.replace('images/train', 'images/val'))
            label_filename = filename.replace('jpg', 'txt').replace('images', 'labels')
            if os.path.exists(label_filename):
                shutil.move(label_filename, label_filename.replace('labels/train', 'labels/val'))
print(len(train_video_names), len(val_video_names))

945 237
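
Splitting by video name rather than by individual file keeps all frames of one video on the same side of the split, so frames from a single video never appear in both training and validation. The sketch below shows the same idea on filenames only, using the standard-library `random` module in place of `train_test_split` so that it is reproducible with a seed. The function names and the sample filenames are illustrative.

```python
import random
from collections import defaultdict


def group_by_video(filenames):
    """Group filenames by the part before the first '#'."""
    groups = defaultdict(list)
    for filename in filenames:
        groups[filename.split('#')[0]].append(filename)
    return dict(groups)


def split_by_video(filenames, val_fraction=0.2, seed=0):
    """Return (train_files, val_files) with every video on exactly one side."""
    groups = group_by_video(filenames)
    video_names = sorted(groups)
    random.Random(seed).shuffle(video_names)  # seeded for reproducibility
    n_val = max(1, round(len(video_names) * val_fraction))
    val_videos = set(video_names[:n_val])
    train_files = [f for v in video_names if v not in val_videos for f in groups[v]]
    val_files = [f for v in video_names if v in val_videos for f in groups[v]]
    return train_files, val_files


# Made-up filenames following the '<video name>#<frame>.jpg' pattern.
files = ['a#1.jpg', 'a#2.jpg', 'b#1.jpg', 'c#1.jpg', 'd#1.jpg', 'e#1.jpg']
train_files, val_files = split_by_video(files)
print(len(train_files), len(val_files))
```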


In [37]:
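# Attach to the LogoDet-3K training job to get its model artifact (which contains best.pt), used for incremental training in the next step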
logo3k_estimator = PyTorch.attach('pytorch-training-2023-06-19-14-24-52-624')
logo3k_estimator.model_data

INFO:botocore.credentials:Found credentials from IAM Role: BaseNotebookInstanceEc2InstanceRole



2023-06-19 16:55:03 Starting - Preparing the instances for training
2023-06-19 16:55:03 Downloading - Downloading input data
2023-06-19 16:55:03 Training - Training image download completed. Training in progress.
2023-06-19 16:55:03 Uploading - Uploading generated training model
2023-06-19 16:55:03 Completed - Training job completed


's3://sagemaker-us-west-2-935206693453/pytorch-training-2023-06-19-14-24-52-624/output/model.tar.gz'

In [44]:
# Replace the S3 path with the model data path of the SageMaker training job obtained in the previous step
!docker run --rm -v $(pwd):/aws -v ~/.aws:/root/.aws s5cmd cp 's3://sagemaker-us-west-2-935206693453/pytorch-training-2023-06-19-14-24-52-624/output/model.tar.gz' /aws/ 1>/dev/null
!tar -zxv -C train_data/FreeWheel-5K-by-video-name/weights/ -f model.tar.gz --strip-components=2 fw-logo-detection/weights/best.pt
!rm -rf model.tar.gz

fw-logo-detection/weights/best.pt
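
The `tar` command above pulls a single file out of `model.tar.gz` and drops its two leading directories. Where a pure-Python equivalent is preferred, the standard-library `tarfile` module can do the same; the sketch below is one way to write it, with `extract_member` being an illustrative name.

```python
import os
import shutil
import tarfile


def extract_member(archive_path, member_name, dest_dir):
    """Extract one file from a .tar.gz into dest_dir, dropping its leading directories."""
    os.makedirs(dest_dir, exist_ok=True)
    dest_path = os.path.join(dest_dir, os.path.basename(member_name))
    with tarfile.open(archive_path, 'r:gz') as archive:
        source = archive.extractfile(member_name)  # KeyError if the member is missing
        if source is None:
            raise ValueError(member_name + ' is not a regular file')
        with source, open(dest_path, 'wb') as target:
            shutil.copyfileobj(source, target)
    return dest_path
```

For this notebook the call would be `extract_member('model.tar.gz', 'fw-logo-detection/weights/best.pt', 'train_data/FreeWheel-5K-by-video-name/weights')`.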


In [46]:
'''
Create the YOLO config file for FreeWheel-5K
'''
class_names_map = {}

with open('raw_data/image-data-4.28/labels.txt', 'r') as fin:
    lines = fin.readlines()
    for i, line in enumerate(lines):
        class_names_map[line.strip()] = i

with open('train_data/FreeWheel-5K-by-video-name/cfg/FreeWheel-5K-by-video-name.yaml', 'w') as fout:
    fout.write('path: /opt/ml/input/data/  # dataset root dir\n')
    fout.write('train: images/train  # train images (relative to \'path\')\n')
    fout.write('val: images/val  # val images (relative to \'path\')\n')
    fout.write('test:  # test images (optional)\n')
    fout.write('names:\n')
    for k,v in class_names_map.items():
        fout.write('  '+str(v)+': '+str(k)+'\n')

In [47]:
# Sync the data to S3
!docker run --rm -v $(pwd):/aws -v ~/.aws:/root/.aws s5cmd sync /aws/train_data s3://$TRN_BUCKET/$PRE/ 1>/dev/null

## 4. Train on the FW-5K data

In [58]:
DET_DATA_DIR = 'train_data'
FW5K_DATA_DIR = DET_DATA_DIR + '/' + 'FreeWheel-5K-by-video-name'
FW5K_DS_DIR = FW5K_DATA_DIR + '/' + 'datasets'

TRN_BUCKET='sagemaker-us-west-2-935206693453'
PRE='fw-logo-detection'
FW5K_PRE=PRE + '/' + FW5K_DATA_DIR
data_location = 's3://{}/{}'.format(TRN_BUCKET, FW5K_PRE)
fw5k_inputs = {'cfg': data_location+'/cfg', 'weights': data_location+'/weights', 'images': data_location+'/datasets/images', 'labels': data_location+'/datasets/labels'}
print(fw5k_inputs)

{'cfg': 's3://sagemaker-us-west-2-935206693453/fw-logo-detection/train_data/FreeWheel-5K-by-video-name/cfg', 'weights': 's3://sagemaker-us-west-2-935206693453/fw-logo-detection/train_data/FreeWheel-5K-by-video-name/weights', 'images': 's3://sagemaker-us-west-2-935206693453/fw-logo-detection/train_data/FreeWheel-5K-by-video-name/datasets/images', 'labels': 's3://sagemaker-us-west-2-935206693453/fw-logo-detection/train_data/FreeWheel-5K-by-video-name/datasets/labels'}


In [60]:
hyperparameters = {'data': '/opt/ml/input/data/cfg/FreeWheel-5K-by-video-name.yaml', 
                   'weight': '/opt/ml/input/data/weights/best.pt',
                   'project': '/opt/ml/model/',
                   'name': 'fw-logo-detection', 'imgsz': 640, 'batch': 4, 'epochs': 20}  # Single CPU or GPU
#                    'name': 'fw-logo-detection', 'imgsz': 640, 'batch': 16, 'epochs': 5, 'device': '0,1,2,3'}  # Multi-GPU: DP Mode

instance_type = 'ml.p3.2xlarge'  # 'ml.p3.2xlarge' or 'ml.p3.8xlarge' or ...


metric_definitions = [{'Name': 'mAP50',
                       'Regex': r'^all\s+(?:[\d.]+\s+){4}([\d.]+)'}]

fw5k_estimator = PyTorch(entry_point='train.py',
                            source_dir='./code/',
                            role=role,
                            hyperparameters=hyperparameters,
                            framework_version='1.13.1',
                            py_version='py39',
                            script_mode=True,
                            instance_count=1,  # 1 or 2 or ...
                            metric_definitions=metric_definitions,
                            instance_type=instance_type)

fw5k_estimator.fit(fw5k_inputs)

INFO:botocore.credentials:Found credentials from IAM Role: BaseNotebookInstanceEc2InstanceRole
INFO:sagemaker.image_uris:image_uri is not presented, retrieving image_uri based on instance_type, framework etc.
INFO:sagemaker:Creating training-job with name: pytorch-training-2023-06-20-06-15-09-227


2023-06-20 06:15:09 Starting - Starting the training job.

KeyboardInterrupt: 
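
Both training cells pass a `hyperparameters` dict to the estimator. In script mode, SageMaker hands these to the entry point as command-line arguments of the form `--key value`. The contents of `train.py` are not shown in this notebook, so the following is only a hypothetical sketch of how such a script might read them with `argparse`; the function name and the defaults are assumptions.

```python
import argparse


def parse_hyperparameters(argv=None):
    """Parse the hyperparameters used in this notebook from command-line arguments."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--data', type=str, required=True)   # path to the dataset YAML
    parser.add_argument('--weight', type=str, default='yolov8s.pt')
    parser.add_argument('--project', type=str, default='/opt/ml/model/')
    parser.add_argument('--name', type=str, default='fw-logo-detection')
    parser.add_argument('--imgsz', type=int, default=640)
    parser.add_argument('--batch', type=int, default=4)
    parser.add_argument('--epochs', type=int, default=1)
    parser.add_argument('--workers', type=int, default=1)
    parser.add_argument('--device', type=str, default=None)  # e.g. '0,1,2,3'
    # Ignore any extra arguments the platform may add.
    args, _unknown = parser.parse_known_args(argv)
    return args
```

For example, `parse_hyperparameters(['--data', '/opt/ml/input/data/cfg/FreeWheel-5K-by-video-name.yaml', '--epochs', '20'])` yields `epochs == 20` and keeps the other defaults.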