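# 1. Introduction
* Automated welding commonly uses an active light source (a line laser) to project a laser line onto the surface of the workpiece

* The offset of the laser line on the camera's imaging sensor encodes the surface shape of the workpiece, and this information can be used to track the weld seam

* Seam tracking usually requires extracting two pieces of information:

    1. The laser line itself

    2. The region of interest (the weld seam region) located on the extracted laser line

* Under real operating conditions, arc light, spatter, and changing illumination during welding make both tasks considerably harder

* This competition therefore seeks a fast and accurate algorithm for extracting the laser line and locating the inflection points of the weld seam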

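# 2. References
* [International Autonomous Robot Competition official site](http://www.running-robot.net/)

* [2022 International Autonomous Intelligent Robot Competition: Welding Robot Weld Image Recognition track](https://aistudio.baidu.com/aistudio/competition/detail/238)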


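# 3. Task Description
* Analyze the photos or videos returned by the laser sensor during welding and output the two key points that mark the groove position

* The model must be robust to interference such as spatter, occlusion by welding slag, and irregular weld seams

* A sample result is shown below (the blue rectangle marks the groove position, and the green curve is formed by the N key points of the laser line):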

    ![](https://ai-studio-static-online.cdn.bcebos.com/bc2f19e2ea7e4adab7525e7c883687d396f765ff714748ce9e44cb391705b415)



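# 4. Dataset
* The competition provides 2875 images together with their annotation files

* All images are 1920 x 1080 JPEG files, and each image has a corresponding txt annotation file

* Each annotation file contains two pieces of information:

    1. The laser line position, on the first line: a one-dimensional array of 1920 integers, each in the range 0-1079 (judging by the array length and value range, one y coordinate per image column)

    2. The groove position, on the second line: integers stored in the order x1, y1, x2, y2

* In the annotation file, lines are separated by a newline character and numbers are separated by a space

* A data sample is shown below:

    |Image|Visualization|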
    |:-:|:-:|
    |![](https://ai-studio-static-online.cdn.bcebos.com/4df08243a42545d69ca9f9516b4690790d548c3912e048cb8dfc0f9516f5c7c5)|![](https://ai-studio-static-online.cdn.bcebos.com/bc2f19e2ea7e4adab7525e7c883687d396f765ff714748ce9e44cb391705b415)|

    |Laser line position|Groove position|
    |:-:|:-:|
    |`165 165 165 165 ... 206 206 206 206`|`422 180 1588 220`|
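
The two-line layout described above can be read with a few lines of Python. This is a minimal sketch: the function name `parse_annotation` is hypothetical, and the sample string is a shortened synthetic example (a real file carries 1920 values on its first line).

```python
from typing import List, Tuple


def parse_annotation(text: str) -> Tuple[List[int], Tuple[int, int, int, int]]:
    """Parse one annotation file: laser-line row first, groove endpoints second."""
    # Ignore empty lines so a trailing newline does not break parsing
    rows = [row.split() for row in text.splitlines() if row.strip()]
    if len(rows) < 2:
        raise ValueError("expected two non-empty lines")
    laser_line = [int(v) for v in rows[0]]
    x1, y1, x2, y2 = (int(v) for v in rows[1])
    return laser_line, (x1, y1, x2, y2)


# Shortened synthetic sample, not a real annotation file
sample = "165 165 166 167\n422 180 1588 220\n"
laser_line, groove = parse_annotation(sample)
print(len(laser_line), groove)
```

For a real file, a caller could additionally check that `len(laser_line) == 1920` and that every value lies in 0-1079.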

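# 5. Algorithm Objective
* The offsets of the two predicted groove points from the labels are scored quantitatively, producing three scores (total score, x-axis score, and y-axis score)

    1. x axis: 1 point is deducted for every 2 pixels of average offset from the label; the maximum score is 100 and the minimum is 0

    2. y axis: 1 point is deducted for every 4 pixels of average offset from the label; the maximum score is 100 and the minimum is 0

    3. The total score is the mean of the x-axis and y-axis scores, and the leaderboard is ranked by total score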
    
    ![](http://bj.bcebos.com/v1/ai-studio-match/file/f412443875f54c3a8598f09ccf84d5ff2c4f7d46aa584d6cb9dc051410e536dc?authorization=bce-auth-v1%2F0ef6765c1e494918bc0d4c3ca3e5c6d1%2F2022-06-03T07%3A35%3A08Z%2F-1%2F%2Ffae5b80909454619fb760767cdcbbd940833a01f8e86db27608ec2a6680ebfce)
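
For local validation, the rules above can be approximated as follows. This is a sketch under assumptions rather than the official scorer: it takes "average offset" to mean the mean absolute pixel error over both endpoints and all images, and the function names are hypothetical.

```python
def axis_score(mean_abs_offset, pixels_per_point):
    """100 minus one point per `pixels_per_point` pixels of offset, floored at 0."""
    return max(0.0, 100.0 - mean_abs_offset / pixels_per_point)


def competition_score(preds, labels):
    """preds and labels are equal-length lists of (x1, y1, x2, y2) tuples."""
    n = 2 * len(labels)  # two endpoints per image
    dx = sum(abs(p[0] - l[0]) + abs(p[2] - l[2]) for p, l in zip(preds, labels)) / n
    dy = sum(abs(p[1] - l[1]) + abs(p[3] - l[3]) for p, l in zip(preds, labels)) / n
    score_x = axis_score(dx, 2.0)  # x axis: 1 point per 2 px
    score_y = axis_score(dy, 4.0)  # y axis: 1 point per 4 px
    return (score_x + score_y) / 2, score_x, score_y


# Mean x offset is 3 px and mean y offset is 6 px for this single image
total, score_x, score_y = competition_score(
    [(424, 184, 1592, 212)], [(422, 180, 1588, 220)]
)
print(total, score_x, score_y)
```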

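# 6. Competition Baseline
## 6.1 Baseline Overview
* This project is built on PaddleX; locating the groove can be handled with a simple object detection algorithm

* PaddleX ships with many common deep learning computer vision algorithms, such as object detection, image classification, and semantic segmentation

* With PaddleX, the model training and prediction needed for the competition task can be done simply and conveniently

## 6.2 Installing Dependencies
* First, install PaddleX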

In [1]:
!pip install paddlex

Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting paddlex
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ca/03/b401c6a34685aa698e7c2fbcfad029892cbfa4b562eaaa7722037fef86ed/paddlex-2.1.0-py3-none-any.whl (1.6 MB)
Collecting paddleslim==2.2.1
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/0b/dc/f46c4669d4cb35de23581a2380d55bf9d38bb6855aab1978fdb956d85da6/paddleslim-2.2.1-py3-none-any.whl (310 kB)
Collecting motmetrics
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/45/41/b019fe934eb811b9aba9b335f852305b804b9c66f098d7e35c2bdb09d1c8/motmetrics-1.2.5-py3-none-any.whl (161 kB)

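## 6.3 Extracting the Dataset
* The competition dataset has been uploaded to the AIStudio platform and is mounted in this project

* It can be used directly once extracted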

In [2]:
!mkdir -p ./dataset/img
!tar -xf data/data150501/hegd.tar.gz -C ./dataset/img
!unzip -O CP936 data/data150501/测试集.zip  -d ./dataset/test

Archive:  data/data150501/测试集.zip
  inflating: ./dataset/test/测试集/00001.jpg  
  inflating: ./dataset/test/测试集/00002.jpg  
  inflating: ./dataset/test/测试集/00003.jpg  
  inflating: ./dataset/test/测试集/00004.jpg  
  inflating: ./dataset/test/测试集/00005.jpg  
  inflating: ./dataset/test/测试集/00006.jpg  
  inflating: ./dataset/test/测试集/00007.jpg  
  inflating: ./dataset/test/测试集/00008.jpg  
  inflating: ./dataset/test/测试集/00009.jpg  
  inflating: ./dataset/test/测试集/00010.jpg  
  inflating: ./dataset/test/测试集/00011.jpg  
  inflating: ./dataset/test/测试集/00012.jpg  
  inflating: ./dataset/test/测试集/00013.jpg  
  inflating: ./dataset/test/测试集/00014.jpg  
  inflating: ./dataset/test/测试集/00015.jpg  
  inflating: ./dataset/test/测试集/00016.jpg  
  inflating: ./dataset/test/测试集/00017.jpg  
  inflating: ./dataset/test/测试集/00018.jpg  
  inflating: ./dataset/test/测试集/00019.jpg  
  inflating: ./dataset/test/测试集/00020.jpg  
  inflating: ./dataset/test/测试集/00021.jpg  
  inflating: ./dataset/test/测试集/00022.jpg 
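
The `-O CP936` option above tells `unzip` that the archive stores file names in CP936 (GBK). When the same archive is read with Python's `zipfile` instead, names that lack the UTF-8 flag are decoded as CP437 and come out garbled. The sketch below shows the usual repair; `fix_zip_name` is a hypothetical helper, and Python's `gbk` codec is used as the equivalent of CP936.

```python
def fix_zip_name(name: str) -> str:
    """Recover a GBK file name that was decoded as CP437."""
    try:
        # Undo the CP437 decoding, then decode the raw bytes as GBK
        return name.encode("cp437").decode("gbk")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The name was already decoded correctly (for example, stored as UTF-8)
        return name


# Simulate what a reader sees when GBK bytes are decoded as CP437
garbled = "测试集/00001.jpg".encode("gbk").decode("cp437")
print(fix_zip_name(garbled))
```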

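## 6.4 Data Processing
* Convert the annotated groove endpoints to VOC-format object detection data. The cell below writes one 20 x 20 px box centered on each endpoint, labeled `Left` for (x1, y1) and `Right` for (x2, y2). A sample annotation and the XML generated from it are shown below:

    ```
    842 456 1150 448 
    ```

    ```xml
    <annotation>
        <filename>2019-12-03_10-38-26_440_4400.jpg</filename>
        <size>
            <height>1080</height>
            <width>1920</width>
            <depth>3</depth>
        </size>
        <object>
            <name>Left</name>
            <bndbox>
                <xmin>832</xmin>
                <ymin>446</ymin>
                <xmax>852</xmax>
                <ymax>466</ymax>
            </bndbox>
        </object>
        <object>
            <name>Right</name>
            <bndbox>
                <xmin>1140</xmin>
                <ymin>438</ymin>
                <xmax>1160</xmax>
                <ymax>458</ymax>
            </bndbox>
        </object>
    </annotation>
    ```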

In [2]:
import os
import cv2
import random
import numpy as np

from tqdm import tqdm


def get_datas(data_dir, endswith):
    items = [os.path.join(data_dir, item) for item in os.listdir(data_dir)]
    sub_dirs = [item for item in items if os.path.isdir(item)]
    files = [item for item in items if item.endswith(endswith)]
    for sub_dir in sub_dirs:
        _files = get_datas(sub_dir, endswith)
        files += _files
    files.sort()
    return files


def makedirs(path):
    # Create the directory (and any missing parents) if it does not exist yet
    os.makedirs(path, exist_ok=True)


def vis_anno_label(jpg_files, txt_files, data_dir, xml_dir, list_dir, split_num):
    # Write one VOC XML file per image, then the train/val/trainval file lists
    data_list = []
    for jpg_file, txt_file in tqdm(zip(jpg_files, txt_files)):
        item_dir, item_file = os.path.split(jpg_file)

        # np.fromfile + cv2.imdecode also works for paths with non-ASCII characters
        img = cv2.imdecode(np.fromfile(jpg_file, dtype=np.uint8), -1)
        with open(txt_file, 'r', encoding='UTF-8') as f:
            # Line 1 is the laser line (unused here); line 2 is "x1 y1 x2 y2".
            # splitlines() tolerates a trailing newline at the end of the file.
            _, position = [item.split() for item in f.read().splitlines()[:2]]

        x1, y1, x2, y2 = [int(item) for item in position]

        # Build a 20 x 20 px box centered on each endpoint
        x1_l, x1_r = x1 - 10, x1 + 10
        x2_l, x2_r = x2 - 10, x2 + 10
        y1_l, y1_r = y1 - 10, y1 + 10
        y2_l, y2_r = y2 - 10, y2 + 10

        anno = f'''<annotation>
    <filename>{item_file}</filename>
    <size>
        <height>{img.shape[0]}</height>
        <width>{img.shape[1]}</width>
        <depth>3</depth>
    </size>
    <object>
        <name>Left</name>
        <bndbox>
            <xmin>{x1_l}</xmin>
            <ymin>{y1_l}</ymin>
            <xmax>{x1_r}</xmax>
            <ymax>{y1_r}</ymax>
        </bndbox>
    </object>
    <object>
        <name>Right</name>
        <bndbox>
            <xmin>{x2_l}</xmin>
            <ymin>{y2_l}</ymin>
            <xmax>{x2_r}</xmax>
            <ymax>{y2_r}</ymax>
        </bndbox>
    </object>
</annotation>'''
        makedirs(item_dir.replace(data_dir, xml_dir))
        xml_file = jpg_file.replace(data_dir, xml_dir)[:-4]+'.xml'
        with open(xml_file, 'w', encoding='UTF-8') as f:
            f.write(anno)
        
        data_list.append(f'{jpg_file} {xml_file}\n')
    
    random.shuffle(data_list)
    with open(os.path.join(list_dir, 'train.txt'), 'w', encoding='UTF-8') as f:
        for item in data_list[:split_num]:
            f.write(item)
    with open(os.path.join(list_dir, 'val.txt'), 'w', encoding='UTF-8') as f:
        for item in data_list[split_num:]:
            f.write(item)

    with open(os.path.join(list_dir, 'trainval.txt'), 'w', encoding='UTF-8') as f:
        for item in data_list:
            f.write(item)
    
    with open(os.path.join(list_dir, 'label_list.txt'), 'w', encoding='UTF-8') as f:
        f.write('Left\n')
        f.write('Right\n')

vis_anno_label(get_datas('./dataset/img', '.jpg'),
               get_datas('./dataset/img', '.txt'), 
               './dataset/img', 
               './dataset/xml', 
               './dataset',
               2500)

2875it [00:53, 54.24it/s]
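
A quick way to sanity-check the generated files is to parse an XML back and confirm that each box center reproduces the original endpoint. The sketch below uses only the standard library; `box_centers` is a hypothetical helper, and the inline XML is a trimmed example in the same layout as the files written above.

```python
import xml.etree.ElementTree as ET


def box_centers(xml_text):
    """Return {label: (cx, cy)} for every <object> in a VOC annotation."""
    root = ET.fromstring(xml_text)
    centers = {}
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        centers[obj.findtext("name")] = ((xmin + xmax) // 2, (ymin + ymax) // 2)
    return centers


# Trimmed example for the annotation "842 456 1150 448"
example_xml = """<annotation>
    <object><name>Left</name><bndbox>
        <xmin>832</xmin><ymin>446</ymin><xmax>852</xmax><ymax>466</ymax>
    </bndbox></object>
    <object><name>Right</name><bndbox>
        <xmin>1140</xmin><ymin>438</ymin><xmax>1160</xmax><ymax>458</ymax>
    </bndbox></object>
</annotation>"""
centers = box_centers(example_xml)
print(centers)
```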


## 6.5 Model Training

In [14]:
import paddlex as pdx
from paddlex import transforms as T


train_transforms = T.Compose([
    # T.RandomCrop(),
    # T.RandomHorizontalFlip(),
    T.Resize(
        target_size=[288, 512],
        interp='LINEAR'
    ),
    T.RandomDistort(),
    T.RandomBlur(prob=0.2),
    T.MixupImage(alpha=1.5, beta=1.5, mixup_epoch=10),
    T.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    ),
])

# Validation-set transforms
eval_transforms = T.Compose([
    T.Resize(
        target_size=[288, 512], interp='LINEAR'),
    T.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    )
])

# Training set
train_dataset = pdx.datasets.VOCDetection(
    data_dir='./',
    file_list='./dataset/train.txt',
    label_list='./dataset/label_list.txt',
    transforms=train_transforms,
    num_workers=4,
    shuffle=True
)

# Validation set
eval_dataset = pdx.datasets.VOCDetection(
    data_dir='./',
    file_list='./dataset/val.txt',
    label_list='./dataset/label_list.txt',
    transforms=eval_transforms,
    num_workers=4,
    shuffle=False
)

# Detection model
num_classes = len(train_dataset.labels)
# print(num_classes)
# model = pdx.det.FasterRCNN(num_classes=2, backbone='ResNet101_vd', aspect_ratios=[1.0], anchor_sizes=[[30], [60]], keep_top_k=20,test_pre_nms_top_n=20, test_post_nms_top_n=20)
# model = pdx.det.YOLOv3(num_classes=num_classes)
model = pdx.det.PPYOLOv2(num_classes=num_classes, backbone='ResNet101_vd_dcn')

# Model training
model.train(
    num_epochs=20,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    train_batch_size=16,
    pretrain_weights='COCO',
    learning_rate=0.0001,
    warmup_steps=200,
    warmup_start_lr=0.0,
    save_interval_epochs=10,
    # Both decay epochs lie beyond num_epochs=20, so the learning rate
    # is never decayed during this run
    lr_decay_epochs=[25, 28],
    save_dir='./ckpt',
    use_vdl=False,
    log_interval_steps=10
)

2022-06-17 15:38:26 [INFO]	Starting to read file list from dataset...
2022-06-17 15:38:32 [INFO]	2500 samples in file ./dataset/train.txt, including 2500 positive samples and 0 negative samples.
creating index...
index created!
2022-06-17 15:38:32 [INFO]	Starting to read file list from dataset...
2022-06-17 15:38:32 [INFO]	375 samples in file ./dataset/val.txt, including 375 positive samples and 0 negative samples.
creating index...
index created!
2022-06-17 15:38:32 [INFO]	Loading pretrained model from ./ckpt/pretrain/yolov3_mobilenet_v1_270e_coco.pdparams
2022-06-17 15:38:33 [INFO]	There are 235/241 variables loaded into YOLOv3.
2022-06-17 15:39:37 [INFO]	[TRAIN] Epoch=1/20, Step=10/78, loss_xy=3.043134, loss_wh=4.050359, loss_obj=64.077850, loss_cls=0.466878, loss=71.638222, lr=0.000045, time_each_step=6.32s, eta=2:44:33
2022-06-17 15:40:41 [INFO]	[TRAIN] Epoch=1/20, Step=20/78, loss_xy=3.064740, loss_wh=3.220538, loss_obj=42.496613, loss_cls=0.441506, loss=49.223396, lr=0.000095, t

KeyboardInterrupt: 
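
To see how the schedule arguments in the training cell interact, the sketch below models a linear warmup followed by piecewise decay at epoch boundaries. It is an approximation for illustration, not PaddleX's own scheduler code: the decay factor `gamma=0.1` and the value of `steps_per_epoch` (2500 images at batch size 16, rounded up) are assumptions, and `lr_at` is a hypothetical name.

```python
import math


def lr_at(step, steps_per_epoch, base_lr=0.0001, warmup_steps=200,
          warmup_start_lr=0.0, decay_epochs=(25, 28), gamma=0.1):
    """Learning rate at a global step: linear warmup, then piecewise decay."""
    if step < warmup_steps:
        # Ramp linearly from warmup_start_lr to base_lr
        return warmup_start_lr + (base_lr - warmup_start_lr) * step / warmup_steps
    epoch = step // steps_per_epoch
    # Multiply by gamma once for every decay epoch already reached
    n_decays = sum(epoch >= e for e in decay_epochs)
    return base_lr * gamma ** n_decays


steps_per_epoch = math.ceil(2500 / 16)  # assumed: 2500 images, batch size 16
total_steps = 20 * steps_per_epoch      # num_epochs=20
for step in (0, 100, 200, total_steps - 1):
    print(step, lr_at(step, steps_per_epoch))
```

Under these assumptions the rate stays at `base_lr` from the end of warmup to the last step, because training stops at epoch 20 while the first decay is scheduled for epoch 25. Moving the decay epochs inside the training range, or training longer, would be needed for the decay to take effect.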

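## 6.6 Model Prediction
* The test-set results must be saved as a txt file named test-1.txt and then compressed directly; the final submission must be test-1.zip

* The result file must meet the following requirements:

    1. Column 0 holds the test image file name; there are 99 images in total

    2. Columns 1-4 hold the groove position as integers, stored in the order x1, y1, x2, y2


* A sample of the submission file format is shown below: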

    ```
    00001.jpg 1083 329 1557 349
    00002.jpg 1087 413 1370 428
    00003.jpg 1129 409 1417 428
    00004.jpg 1070 415 1371 419
    ...
    00096.jpg 1060 408 1390 430
    00097.jpg 1112 413 1436 426
    00098.jpg 1097 413 1451 427
    00099.jpg 1131 414 1421 426
    ```
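
Before compressing, the format rules above can be checked with a small validator. This is a minimal sketch: `validate_submission` is a hypothetical helper and covers only the rules stated in this section (99 rows, a file name followed by four integers).

```python
def validate_submission(text, expected_images=99):
    """Return a list of format problems; an empty list means the file looks valid."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    errors = []
    if len(lines) != expected_images:
        errors.append(f"expected {expected_images} rows, got {len(lines)}")
    for i, ln in enumerate(lines, 1):
        parts = ln.split()
        if len(parts) != 5:
            errors.append(f"line {i}: expected 5 fields, got {len(parts)}")
            continue
        try:
            # Columns 1-4 must all parse as integers
            [int(v) for v in parts[1:]]
        except ValueError:
            errors.append(f"line {i}: coordinates must be integers")
    return errors


# Synthetic file with 99 well-formed rows
good = "".join(f"{i:05d}.jpg 1083 329 1557 349\n" for i in range(1, 100))
print(validate_submission(good))
```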

In [5]:
import os
import paddlex as pdx

model_path = './ckpt/best_model'
test_dir = './dataset/test/测试集'
submit_file = './test-1.txt'

model = pdx.load_model(model_path)

test_imgs = sorted(f for f in os.listdir(test_dir) if f.endswith('.jpg'))
test_files = [os.path.join(test_dir, img_file) for img_file in test_imgs]

# Predict one image at a time instead of passing the whole list as one batch
results = [model.predict(test_file) for test_file in test_files]

texts = []
for result, img in zip(results, test_imgs):
    # Keep the highest-scoring detection of each class. Coordinates default
    # to 0 so a missed class never reuses values from the previous image.
    left_score = right_score = 0
    x1 = y1 = x2 = y2 = 0
    for r in result:
        # bbox is [xmin, ymin, width, height]; the box center is the keypoint
        cx = int(round(r['bbox'][0] + 0.5 * r['bbox'][2]))
        cy = int(round(r['bbox'][1] + 0.5 * r['bbox'][3]))
        if r['category_id'] == 0 and r['score'] > left_score:
            left_score, x1, y1 = r['score'], cx, cy
        elif r['category_id'] == 1 and r['score'] > right_score:
            right_score, x2, y2 = r['score'], cx, cy
    # Order the two points by x, swapping each (x, y) pair as a unit so that
    # every y stays attached to its own point
    if left_score and right_score and x1 > x2:
        x1, y1, x2, y2 = x2, y2, x1, y1
    texts.append('%s %d %d %d %d\n' % (img, x1, y1, x2, y2))
texts.sort()

with open(submit_file, 'w', encoding='UTF-8') as f:
    for line in texts:
        f.write(line)

SystemError: (Fatal) Operator gaussian_random raises an paddle::memory::allocation::BadAlloc exception.
The exception content is
:ResourceExhaustedError: 

Out of memory error on GPU 0. Cannot allocate 2.250000MB memory on GPU 0, 31.744873GB memory has been allocated and available memory is only 3.750000MB.

Please check whether there is any other process using GPU 0.
1. If yes, please stop them, or start PaddlePaddle on another GPU.
2. If no, please decrease the batch size of your model. 
If the above ways do not solve the out of memory problem, you can try to use CUDA managed memory. The command is `export FLAGS_use_cuda_managed_memory=false`.
 (at /paddle/paddle/fluid/memory/allocation/cuda_allocator.cc:87)
. (at /paddle/paddle/fluid/imperative/tracer.cc:307)


## 6.7 Visualizing Results
* Use the PaddleX visualization API to visualize the detection results on the test set. Samples are shown below:

    ![](https://ai-studio-static-online.cdn.bcebos.com/de6e440e51e44034b9ae0d112de6cfd575079ebdde6f4f38bbf55c028d5e37cc)
    
    ![](https://ai-studio-static-online.cdn.bcebos.com/3a59f390ea724da2bd23c0e0b1fa443bd8e59fe75dd74bf990985fee502fca5d)
    
    

In [None]:
for img, result in zip(test_files, results):
    pdx.det.visualize(img, result, threshold=0.5, save_dir='./vis')

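## 6.8 Submitting Results
* Following the submission requirements, compress the result file with the command in the cell below

* Go to the [Submit Results](https://aistudio.baidu.com/aistudio/competition/detail/238/0/submit-result) tab on the competition page and upload the compressed file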

    ![](https://ai-studio-static-online.cdn.bcebos.com/0896fed473204837b99190d377237edc0aa5589e7311452b9d0bc0d557010a55)

* After submitting, wait for the system to finish scoring automatically; the score details of the submission then appear below

    ![](https://ai-studio-static-online.cdn.bcebos.com/8633e7cc08f546b79d919ff1e31415a29facbad30bfe47f69066af8149cf4e03)

    

In [None]:
!zip test-1.zip test-1.txt
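
Where the `zip` command is not available, the same archive can be produced with Python's standard `zipfile` module. This is a minimal sketch; `zip_submission` is a hypothetical helper, and storing the file at the archive root mirrors what the command above does when run next to `test-1.txt`.

```python
import os
import zipfile


def zip_submission(txt_path="test-1.txt", zip_path="test-1.zip"):
    """Compress the result file into a zip with the txt at the archive root."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # arcname drops any directory prefix from the stored name
        zf.write(txt_path, arcname=os.path.basename(txt_path))
    return zip_path
```

Calling `zip_submission()` from the directory that contains `test-1.txt` writes `test-1.zip` alongside it.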

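# 7. Optimization Suggestions
* Data:

    1. Extra data: use the additional laser line position labels to help locate the groove

    2. Data augmentation: experiment with a variety of augmentation methods

* Model:

    1. Hyperparameter tuning: learning rate, number of training epochs, and so on

    2. Different model: switch to another object detection model

    3. Prediction logic: refine the prediction logic to improve accuracy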

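# 8. Closing Remarks
* More and more tools now make it easy to build a baseline project

* They greatly improve the efficiency of competing on a leaderboard and free up time and energy

* That time can go into developing and trying new algorithms and models, new data processing schemes, and so on

* In short: using tools well gets twice the result with half the effort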
