# Human Pose Estimation (HPE)

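* The first approach is top-down. **(This is the approach covered here.)**

    It runs object detection first so that every person can be found and their keypoints located accurately.
    Keypoints are then searched for inside each cropped person image.
    Because a detector has to run first and the keypoint algorithm is applied once per person, it becomes slow when many people appear.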


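* The second approach is bottom-up.

    There is no detector; keypoints are detected first.
    For example, every point that corresponds to a wrist is detected.
    The keypoints that belong to one person are then clustered together.
    Because there is no detector, speed drops only slightly even when many people appear in the frame. On the other hand, keypoints have to be searched over a wider area than in the top-down approach, so accuracy tends to be lower.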
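
The top-down flow described above can be sketched in plain Python. This is only an illustration under stated assumptions: `detect_people` and `estimate_in_crop` are hypothetical callables standing in for a real detector and a real single-person keypoint model, and the stub values are made up.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]   # (xmin, ymin, xmax, ymax) in image pixels
Point = Tuple[float, float]       # (x, y)


def top_down_pose(
    detect_people: Callable[[], List[Box]],
    estimate_in_crop: Callable[[Box], List[Point]],
) -> List[List[Point]]:
    """Run a (hypothetical) single-person estimator once per detected box."""
    poses = []
    for box in detect_people():
        xmin, ymin, _, _ = box
        # keypoints come back in crop coordinates; shift them to image coordinates
        crop_points = estimate_in_crop(box)
        poses.append([(x + xmin, y + ymin) for x, y in crop_points])
    return poses


# Stub detector and estimator, for illustration only.
boxes = [(100, 50, 200, 250), (300, 60, 380, 240)]
calls = []


def fake_detector():
    return boxes


def fake_estimator(box):
    calls.append(box)            # one estimator call per person
    return [(10.0, 20.0)]


poses = top_down_pose(fake_detector, fake_estimator)
print(poses, len(calls))
```

The estimator is called once per detected person, which is why the cost of the top-down approach grows with the number of people in the image.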

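Pose estimation resembles face landmark detection: both solve a keypoint localization problem. However, joint keypoints such as wrists and elbows vary far more in position and appearance than facial keypoints.

As the image above shows, a hand covering the face, keypoints that fall outside the frame, and factors such as invisible joints, occlusion, clothing, and lighting changes make the problem harder than face landmark detection.

Before deep-learning-based methods were applied, various kinds of prior knowledge were used.
The most basic idea is that "the human body is divided into deformable parts, and the parts are connected to each other."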

#### Preprocessing the data

In [25]:
# Caution: importing ray before tensorflow can cause errors
import io, json, os, math

import tensorflow as tf
from tensorflow.keras.layers import Add, Concatenate, Lambda
from tensorflow.keras.layers import Input, Conv2D, ReLU, MaxPool2D
from tensorflow.keras.layers import UpSampling2D, ZeroPadding2D
from tensorflow.keras.layers import BatchNormalization
import ray

import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
import pickle

PROJECT_PATH = os.getenv('HOME') + '/aiffel/mpii'
IMAGE_PATH = os.path.join(PROJECT_PATH, 'images')
MODEL_PATH = os.path.join(PROJECT_PATH, 'models')
TFRECORD_PATH = os.path.join(PROJECT_PATH, 'tfrecords_mpii')
TRAIN_JSON = os.path.join(PROJECT_PATH, 'mpii_human_pose_v1_u12_2', 'train.json')
VALID_JSON = os.path.join(PROJECT_PATH, 'mpii_human_pose_v1_u12_2', 'validation.json')

BEST_MODEL_PATH = os.path.join(PROJECT_PATH, 'best_models') # modified

print('슝=3')

슝=3


* **Parsing the JSON**  

Do you remember the train.json and validation.json files introduced in the previous step? They hold the pose keypoint information for the people in each image, so they can serve as labels for pose estimation.

To see how the JSON is organized, let's open the file and print a single annotation as a sample. Using json.dumps() with indentation makes the structure easier to read.

In [26]:
with open(TRAIN_JSON) as train_json:
    train_annos = json.load(train_json) # load the JSON file object
    json_formatted_str = json.dumps(train_annos[0], indent=2) # convert the first annotation to a formatted string
    print(json_formatted_str)

{
  "joints_vis": [
    1,
    1,
    1,
    1,
    1,
    1,
    1,
    1,
    1,
    1,
    1,
    1,
    1,
    1,
    1,
    1
  ],
  "joints": [
    [
      620.0,
      394.0
    ],
    [
      616.0,
      269.0
    ],
    [
      573.0,
      185.0
    ],
    [
      647.0,
      188.0
    ],
    [
      661.0,
      221.0
    ],
    [
      656.0,
      231.0
    ],
    [
      610.0,
      187.0
    ],
    [
      647.0,
      176.0
    ],
    [
      637.0201,
      189.8183
    ],
    [
      695.9799,
      108.1817
    ],
    [
      606.0,
      217.0
    ],
    [
      553.0,
      161.0
    ],
    [
      601.0,
      167.0
    ],
    [
      692.0,
      185.0
    ],
    [
      693.0,
      240.0
    ],
    [
      688.0,
      313.0
    ]
  ],
  "image": "015601864.jpg",
  "scale": 3.021046,
  "center": [
    594.0,
    257.0
  ]
}


joints holds the keypoint labels we will train on. Depending on the framing of the image and the person's pose, not every keypoint appears in the image, so joints_vis indicates whether each keypoint is actually usable. MPII only distinguishes 1 (visible) from 0 (not visible), which makes it a little easier to use. COCO uses 2 / 1 / 0, so occlusion is labeled as well.

The joints are stored in the following order.

>0 - right ankle  
1 - right knee  
2 - right hip  
3 - left hip  
4 - left knee  
5 - left ankle  
6 - pelvis  
7 - thorax (chest)  
8 - neck  
9 - head top  
10 - right wrist  
11 - right elbow  
12 - right shoulder  
13 - left shoulder  
14 - left elbow  
15 - left wrist  
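
As a small sketch of how the joint order and `joints_vis` fit together, the snippet below pairs each joint with its name and keeps only the visible ones. The coordinates are copied from the sample annotation printed above; the visibility list is a made-up variant (last joint set to 0) so that the filtering has something to do, and `MPII_JOINT_NAMES` and `visible_joints` are names introduced here for illustration.

```python
# Joint names in the index order listed above (0 = right ankle ... 15 = left wrist).
MPII_JOINT_NAMES = [
    "right ankle", "right knee", "right hip", "left hip", "left knee", "left ankle",
    "pelvis", "thorax", "neck", "head top",
    "right wrist", "right elbow", "right shoulder",
    "left shoulder", "left elbow", "left wrist",
]


def visible_joints(anno):
    """Return {joint name: (x, y)} for joints whose visibility flag is 1."""
    return {
        name: tuple(xy)
        for name, xy, vis in zip(MPII_JOINT_NAMES, anno["joints"], anno["joints_vis"])
        if vis == 1
    }


sample = {
    "joints": [
        [620.0, 394.0], [616.0, 269.0], [573.0, 185.0], [647.0, 188.0],
        [661.0, 221.0], [656.0, 231.0], [610.0, 187.0], [647.0, 176.0],
        [637.0201, 189.8183], [695.9799, 108.1817], [606.0, 217.0], [553.0, 161.0],
        [601.0, 167.0], [692.0, 185.0], [693.0, 240.0], [688.0, 313.0],
    ],
    # hypothetical visibility: the last joint (left wrist) is marked as not visible
    "joints_vis": [1] * 15 + [0],
}

result = visible_joints(sample)
print(len(result), result["right ankle"])
```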

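scale and center give the size and the center point of the person's body. scale has to be multiplied by 200 to get the actual size in pixels; we will do that later during preprocessing.

Now let's write a function that parses a JSON annotation.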
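
To make the factor of 200 concrete, here is a minimal sketch using the `center` and `scale` of the sample annotation above. Building a square box around the center is an assumption made for illustration (the helper `person_box` is not part of the tutorial code); the tutorial itself derives its crop from the keypoints later on.

```python
def person_box(center, scale, reference_px=200.0):
    """Return ((xmin, ymin, xmax, ymax), side) for a square box around center."""
    side = scale * reference_px      # body size in pixels
    half = side / 2.0
    cx, cy = center
    return (cx - half, cy - half, cx + half, cy + half), side


box, side = person_box((594.0, 257.0), 3.021046)
print(side)   # roughly 604.2 pixels
print(box)    # ymin is negative here, so the box sticks out above the image
```

The negative `ymin` shows why any crop derived from these values has to be clamped to the image bounds.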

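> In the MPII dataset, 'scale' expresses the person's size as a ratio (relative to a height of 200 pixels) and is used to describe joint positions and body-part sizes in relative terms.
Joint coordinates change with how large the person appears in the image, so the scale value is provided to compensate for that. With it, body-part positions can be analyzed consistently across images of different sizes.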

In [27]:
def parse_one_annotation(anno, image_dir):
    filename = anno['image']
    joints = anno['joints']
    joints_visibility = anno['joints_vis']
    annotation = {
        'filename': filename,
        'filepath': os.path.join(image_dir, filename),
        'joints_visibility': joints_visibility,
        'joints': joints,
        'center': anno['center'],
        'scale' : anno['scale']
    }
    return annotation

print('슝=3')

슝=3


Let's test the parse_one_annotation() function.

In [28]:
with open(TRAIN_JSON) as train_json:
    train_annos = json.load(train_json) # load the JSON file object
    test = parse_one_annotation(train_annos[0], IMAGE_PATH)
    print(test)

{'filename': '015601864.jpg', 'filepath': '/aiffel/aiffel/mpii/images/015601864.jpg', 'joints_visibility': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], 'joints': [[620.0, 394.0], [616.0, 269.0], [573.0, 185.0], [647.0, 188.0], [661.0, 221.0], [656.0, 231.0], [610.0, 187.0], [647.0, 176.0], [637.0201, 189.8183], [695.9799, 108.1817], [606.0, 217.0], [553.0, 161.0], [601.0, 167.0], [692.0, 185.0], [693.0, 240.0], [688.0, 313.0]], 'center': [594.0, 257.0], 'scale': 3.021046}


### Creating TFRecord files

Until now we have mostly read training data with tf.keras's ImageDataGenerator. In real projects, however, the data is far larger than tutorial datasets.

The more you train, the more you start to care about training speed. The TensorFlow tutorial puts it this way:

unless you are using tf.data and reading data is still the bottleneck to training.

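In general, disk (HDD) I/O is slower than GPU computation during training, so it becomes a bottleneck and large-scale experiments lose efficiency. (It's frustrating..)

So if you want to run more experiments, you have to think about how to read training data quickly.

* Which data-side stages should be considered to speed up training? Look up the speed-up techniques in the link above and answer.

> The data read (or prefetch) stage and the data transformation stage. Prefetch should be applied so that loading runs in parallel with GPU training. In practice this is done with tf.data's map function together with caching.

This is fairly involved, but TensorFlow provides tooling that automates these transformations: storing the dataset as TFRecord. TFRecord is a format for storing a sequence of binary records.

Internally it uses protocol buffers.

You can think of protocol buffers as a cross-platform library for serializing data.

Serializing data means laying out only the meaningful information in a fixed order. For example, to represent "Go Gil-dong, age 32, living in Gangdong-gu, Seoul", we could write "서울강동32고길동" (Seoul, Gangdong, 32, Go Gil-dong) and store nothing but the data. The format has to be fixed, though: the first two characters are the city, the next two are the district, then comes a two-digit number, and the name must be three characters. We cannot store someone over 100 years old or someone whose name is not three characters long. So it is worth remembering that serializing highly structured data is advantageous for data size and processing speed, but restricts how the data can be used. The restrictions also depend on the data and on the serialization method chosen.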
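
The fixed-width analogy above can be written out as a toy example. This is only a sketch of the analogy, not of how protocol buffers actually encode data; `serialize` and `deserialize` are names introduced here for illustration.

```python
def serialize(city: str, district: str, age: int, name: str) -> str:
    """Pack the fields into a fixed layout: 2 + 2 characters, 2 digits, 3 characters."""
    if not (len(city) == 2 and len(district) == 2 and len(name) == 3):
        raise ValueError("city and district must be 2 characters, name 3 characters")
    if not 0 <= age <= 99:
        raise ValueError("age must fit in two digits")
    return f"{city}{district}{age:02d}{name}"


def deserialize(record: str):
    """Recover the fields purely from their positions in the record."""
    return record[0:2], record[2:4], int(record[4:6]), record[6:9]


record = serialize("서울", "강동", 32, "고길동")
print(record)
print(deserialize(record))

# A value that does not fit the fixed layout is rejected.
try:
    serialize("서울", "강동", 105, "고길동")
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```

The record is compact and fast to parse because no field names or separators are stored, and for the same reason anything outside the agreed layout cannot be represented.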

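Let's start implementing, one step at a time. **We will write a function that converts the annotations extracted earlier into TFRecords. A TFRecord is made up of a collection of tf.train.Example records, so we start with a function that turns one annotation into one tf.train.Example.**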

In [29]:
def generate_tfexample(anno):

    # helper that wraps a bytes value in a tf.train.Feature
    def _bytes_feature(value):
        if isinstance(value, type(tf.constant(0))):
            value = value.numpy()
        return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

    filename = anno['filename']
    filepath = anno['filepath']
    with open(filepath, 'rb') as image_file:
        content = image_file.read()

    image = Image.open(filepath)
    if image.format != 'JPEG' or image.mode != 'RGB':
        image_rgb = image.convert('RGB')
        with io.BytesIO() as output:
            image_rgb.save(output, format="JPEG", quality=95)
            content = output.getvalue()

    width, height = image.size
    depth = 3

    c_x = int(anno['center'][0])
    c_y = int(anno['center'][1])
    scale = anno['scale']

    # joint coordinates as integers (they are stored as int64 features)
    x = [int(joint[0]) for joint in anno['joints']]
    y = [int(joint[1]) for joint in anno['joints']]

    v = [0 if joint_v == 0 else 2 for joint_v in anno['joints_visibility']]

    feature = {
        'image/height':
        tf.train.Feature(int64_list=tf.train.Int64List(value=[height])),
        'image/width':
        tf.train.Feature(int64_list=tf.train.Int64List(value=[width])),
        'image/depth':
        tf.train.Feature(int64_list=tf.train.Int64List(value=[depth])),
        'image/object/parts/x':
        tf.train.Feature(int64_list=tf.train.Int64List(value=x)),
        'image/object/parts/y':
        tf.train.Feature(int64_list=tf.train.Int64List(value=y)),
        'image/object/center/x': 
        tf.train.Feature(int64_list=tf.train.Int64List(value=[c_x])),
        'image/object/center/y': 
        tf.train.Feature(int64_list=tf.train.Int64List(value=[c_y])),
        'image/object/scale':
        tf.train.Feature(float_list=tf.train.FloatList(value=[scale])),
        'image/object/parts/v':
        tf.train.Feature(int64_list=tf.train.Int64List(value=v)),
        'image/encoded':
        _bytes_feature(content),
        'image/filename':
        _bytes_feature(filename.encode())
    }

    return tf.train.Example(features=tf.train.Features(feature=feature))

print('슝=3')

슝=3


Once a single annotation can be turned into a tf.train.Example, we need a function that handles many annotations.

However, **instead of writing one TFRecord we will write several. First, let's write a function that decides how the data is split across them.**

In [30]:
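def chunkify(l, n):
    size = len(l) // n
    start = 0
    results = []
    for i in range(n):
        # the last chunk also takes the remainder so that no item is dropped
        end = start + size if i < n - 1 else len(l)
        results.append(l[start:end])
        start = end
    return results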

print('슝=3')

슝=3


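This function decides how the whole dataset is divided into groups: it splits the full list l into n groups. In other words, we will end up with n TFRecord files.

More technically, we say the data is sharded into n pieces. Enterprise-scale data is very large and spread across many machines, so sharding is common. Think of it as splitting one large dataset into several files and distributing them over several computers. This makes the data easier to store and helps with parallel processing.

Because the data was originally one large dataset, the sharding strategy depends on how it will be used after it is stored. That topic is too broad to cover here, so we only touch on it briefly. Considering just the size and number of files for training: splitting into too many small files is bad, and so is splitting into too few large files. Too many small files means overly frequent I/O during training; too few large files means each I/O takes longer. Whenever I/O takes longer than the GPU computation, that time is lost. The right file size and count depend on the case.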
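
When the number of items is not a multiple of the number of shards, one option is to spread the remainder so that shard sizes differ by at most one. The helper below, `split_evenly`, is a hypothetical alternative shown for illustration and is not the function used in this tutorial.

```python
def split_evenly(items, num_shards):
    """Split items into num_shards contiguous shards whose sizes differ by at most 1."""
    base, extra = divmod(len(items), num_shards)
    shards, start = [], 0
    for i in range(num_shards):
        size = base + (1 if i < extra else 0)   # the first `extra` shards get one more item
        shards.append(items[start:start + size])
        start += size
    return shards


shards = split_evenly(list(range(1000)), 64)
sizes = [len(s) for s in shards]
print(len(shards), min(sizes), max(sizes), sum(sizes))
```

With 1000 items and 64 shards, 40 shards hold 16 items and 24 hold 15, so every item is kept and the files stay close to the same size.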

The explanation is harder than the practice. Let's test the chunkify function.

In [31]:
test_chunks = chunkify([0] * 1000, 64)
print(test_chunks)
print(len(test_chunks))
print(len(test_chunks[0]))

[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0,

The list of 1000 zeros has been split into 64 chunks. Now that chunkify has been tested, **let's write a function that turns a single chunk into a TFRecord.**

In [32]:
@ray.remote
def build_single_tfrecord(chunk, path):
    print('start to build tf records for ' + path)

    with tf.io.TFRecordWriter(path) as writer:
        for anno in chunk:
            tf_example = generate_tfexample(anno)
            writer.write(tf_example.SerializeToString())

    print('finished building tf records for ' + path)

print('슝=3')

슝=3


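So a chunk contains several annotations, and each annotation is converted into a tf.train.Example, serialized to a byte string, and written to the TFRecord.

Another thing to note is the @ray.remote decorator above the function definition.

Ray is a library for parallel processing. It is more convenient than Python's built-in multiprocessing package and can be used in a wider range of environments.

Now that everything is ready, **let's write a function that turns the whole dataset into a suitable number of TFRecord files.** Note that calling the function looks slightly different because we are using Ray.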

In [33]:
def build_tf_records(annotations, total_shards, split):
    chunks = chunkify(annotations, total_shards)
    futures = [
        build_single_tfrecord.remote(
            chunk, '{}/{}_{}_of_{}.tfrecords'.format(
                TFRECORD_PATH,
                split,
                str(i + 1).zfill(4),
                str(total_shards).zfill(4),
            )) for i, chunk in enumerate(chunks)
    ]
    ray.get(futures)

print('슝=3')

슝=3


### Ray

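Using the functions written above, we convert the data into TFRecords: 64 files for the train data and 8 files for the val data. This takes quite a while.

Attach the @ray.remote decorator to a function or class and call it in the form some_function.remote(). For a class, remote() is used when calling its methods. The call does not return the result itself: it submits the work as a task and immediately returns a future (an object reference), while the task runs asynchronously in a worker process. ray.get() then blocks until the results are ready. Keep in mind that the function is not called directly; the work is submitted as a task and its result is collected later.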
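
The submit-then-collect pattern can be tried without Ray. The sketch below uses the standard library's `concurrent.futures` as an analogy only: `pool.submit(...)` plays the role of `some_function.remote(...)` and `future.result()` the role of `ray.get(...)`. The function `build_shard` is a stand-in invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def build_shard(shard_id):
    """Stand-in for a slow job such as writing one TFRecord file."""
    return f"shard-{shard_id:04d} done"


with ThreadPoolExecutor(max_workers=4) as pool:
    # submit() returns a Future immediately; the work runs in the background
    futures = [pool.submit(build_shard, i + 1) for i in range(8)]
    # result() blocks until each job has finished
    results = [f.result() for f in futures]

print(results[0], len(results))
```

Collecting the futures in a list first and waiting afterwards is what lets the jobs overlap; waiting on each one right after submitting it would run them one at a time.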

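### Turning the data into labels

We need a function that converts the data stored as TFRecords into the form the model needs for training. This is simple because TensorFlow already provides the functions for it. **One caveat: a TFRecord holds serialized data, so the feature description used when reading must match what was written. The feature keys and the data types have to be the same.**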

In [34]:
def parse_tfexample(example):
    image_feature_description = {
        'image/height': tf.io.FixedLenFeature([], tf.int64),
        'image/width': tf.io.FixedLenFeature([], tf.int64),
        'image/depth': tf.io.FixedLenFeature([], tf.int64),
        'image/object/parts/x': tf.io.VarLenFeature(tf.int64),
        'image/object/parts/y': tf.io.VarLenFeature(tf.int64),
        'image/object/parts/v': tf.io.VarLenFeature(tf.int64),
        'image/object/center/x': tf.io.FixedLenFeature([], tf.int64),
        'image/object/center/y': tf.io.FixedLenFeature([], tf.int64),
        'image/object/scale': tf.io.FixedLenFeature([], tf.float32),
        'image/encoded': tf.io.FixedLenFeature([], tf.string),
        'image/filename': tf.io.FixedLenFeature([], tf.string),
    }
    return tf.io.parse_single_example(example, image_feature_description)

print('슝=3')

슝=3


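We now convert the image and label obtained this way into a form suitable for training. Rather than using the image as is, we crop a roughly square region around the person.

What we know are the joint positions, the center coordinates, and the body height. To train uniformly, choosing an appropriate body width also matters. There are several ways to do that, but at this stage the more important point is to handle the edge case where the crop box we chose extends beyond the image.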
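
The clamping idea can be checked with plain numbers before looking at the TensorFlow version. This is a simplified sketch: it uses three keypoints (rounded from the sample annotation), assumes a 1280x720 image, and skips the filtering of invalid keypoints. `clamp_crop` is a name introduced here for illustration.

```python
def clamp_crop(keypoints, body_height, img_w, img_h, margin=0.2):
    """Pad the keypoint bounding box, clamp it to the image, normalize keypoints to [0, 1]."""
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    pad = int(body_height * margin)

    xmin = max(min(xs) - pad, 0)        # do not go past the left edge
    ymin = max(min(ys) - pad, 0)        # do not go past the top edge
    xmax = min(max(xs) + pad, img_w)    # do not go past the right edge
    ymax = min(max(ys) + pad, img_h)    # do not go past the bottom edge

    width, height = xmax - xmin, ymax - ymin
    normalized = [((x - xmin) / width, (y - ymin) / height) for x, y in keypoints]
    return (xmin, ymin, xmax, ymax), normalized


keypoints = [(553, 161), (696, 108), (620, 394)]
crop_box, normalized = clamp_crop(keypoints, body_height=3.021046 * 200, img_w=1280, img_h=720)
print(crop_box)
print(normalized)
```

Here the padded box would start 12 pixels above the image, so `ymin` is clamped to 0, and the keypoints are expressed relative to the clamped box.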

In [35]:
def crop_roi(image, features, margin=0.2):
    img_shape = tf.shape(image)
    img_height = img_shape[0]
    img_width = img_shape[1]
    img_depth = img_shape[2]

    keypoint_x = tf.cast(tf.sparse.to_dense(features['image/object/parts/x']), dtype=tf.int32)
    keypoint_y = tf.cast(tf.sparse.to_dense(features['image/object/parts/y']), dtype=tf.int32)
    center_x = features['image/object/center/x']
    center_y = features['image/object/center/y']
    body_height = features['image/object/scale'] * 200.0

    # use only valid keypoints (coordinates > 0)
    masked_keypoint_x = tf.boolean_mask(keypoint_x, keypoint_x > 0)
    masked_keypoint_y = tf.boolean_mask(keypoint_y, keypoint_y > 0)

    # find the min and max values
    keypoint_xmin = tf.reduce_min(masked_keypoint_x)
    keypoint_xmax = tf.reduce_max(masked_keypoint_x)
    keypoint_ymin = tf.reduce_min(masked_keypoint_y)
    keypoint_ymax = tf.reduce_max(masked_keypoint_y)

    # pad the box by a margin proportional to the body height so that the crop is closer to a square
    xmin = keypoint_xmin - tf.cast(body_height * margin, dtype=tf.int32)
    xmax = keypoint_xmax + tf.cast(body_height * margin, dtype=tf.int32)
    ymin = keypoint_ymin - tf.cast(body_height * margin, dtype=tf.int32)
    ymax = keypoint_ymax + tf.cast(body_height * margin, dtype=tf.int32)

    # clamp coordinates that fall outside the image
    effective_xmin = xmin if xmin > 0 else 0
    effective_ymin = ymin if ymin > 0 else 0
    effective_xmax = xmax if xmax < img_width else img_width
    effective_ymax = ymax if ymax < img_height else img_height
    effective_height = effective_ymax - effective_ymin
    effective_width = effective_xmax - effective_xmin

    image = image[effective_ymin:effective_ymax, effective_xmin:effective_xmax, :]
    new_shape = tf.shape(image)
    new_height = new_shape[0]
    new_width = new_shape[1]

    effective_keypoint_x = (keypoint_x - effective_xmin) / new_width
    effective_keypoint_y = (keypoint_y - effective_ymin) / new_height

    return image, effective_keypoint_x, effective_keypoint_y

print('슝=3')

슝=3


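We convert the (x, y) keypoint coordinates into heatmaps. Recall that training on a probability-like distribution spread over the neighborhood of the coordinate, rather than on a single marked point, gave better results. This distribution-shaped target is called a heatmap.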
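
A dense NumPy version makes the shape of such a heatmap easy to inspect. This is a sketch of the general idea (a 2D Gaussian scaled to a peak of 12 and cut off 3 sigma from the center), not a line-by-line port of the TensorFlow function in this tutorial; `gaussian_heatmap` is a name introduced here.

```python
import numpy as np


def gaussian_heatmap(height, width, y0, x0, sigma=1.0, scale=12.0):
    """Return a (height, width) float32 map with a Gaussian bump centered at (y0, x0)."""
    ys, xs = np.mgrid[0:height, 0:width]
    squared_dist = (xs - x0) ** 2 + (ys - y0) ** 2
    heatmap = scale * np.exp(-squared_dist / (2.0 * sigma ** 2))
    # keep only a window of 3 sigma around the center, zero everywhere else
    outside = (np.abs(xs - x0) > 3 * sigma) | (np.abs(ys - y0) > 3 * sigma)
    heatmap[outside] = 0.0
    return heatmap.astype(np.float32)


hm = gaussian_heatmap(64, 64, y0=20, x0=40)
peak = np.unravel_index(hm.argmax(), hm.shape)
print(hm.shape, peak, hm.max())
```

The peak sits exactly on the keypoint and falls off smoothly, so a prediction that is one pixel off still receives a partially matching target.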

In [36]:
def generate_2d_guassian(height, width, y0, x0, visibility=2, sigma=1, scale=12):
    heatmap = tf.zeros((height, width))

    xmin = x0 - 3 * sigma
    ymin = y0 - 3 * sigma
    xmax = x0 + 3 * sigma
    ymax = y0 + 3 * sigma
    
    if xmin >= width or ymin >= height or xmax < 0 or ymax < 0 or visibility == 0:
        return heatmap

    size = 6 * sigma + 1
    x, y = tf.meshgrid(tf.range(0, 6 * sigma + 1, 1), tf.range(0, 6 * sigma + 1, 1), indexing='xy')

    center_x = size // 2
    center_y = size // 2

    gaussian_patch = tf.cast(tf.math.exp(
        -(tf.math.square(x - center_x) + tf.math.square(y - center_y)) / (tf.math.square(sigma) * 2)) * scale,
                             dtype=tf.float32)

    patch_xmin = tf.math.maximum(0, -xmin)
    patch_ymin = tf.math.maximum(0, -ymin)
    patch_xmax = tf.math.minimum(xmax, width) - xmin
    patch_ymax = tf.math.minimum(ymax, height) - ymin

    heatmap_xmin = tf.math.maximum(0, xmin)
    heatmap_ymin = tf.math.maximum(0, ymin)
    heatmap_xmax = tf.math.minimum(xmax, width)
    heatmap_ymax = tf.math.minimum(ymax, height)

    indices = tf.TensorArray(tf.int32, 1, dynamic_size=True)
    updates = tf.TensorArray(tf.float32, 1, dynamic_size=True)

    count = 0

    for j in tf.range(patch_ymin, patch_ymax):
        for i in tf.range(patch_xmin, patch_xmax):
            indices = indices.write(count, [heatmap_ymin + j, heatmap_xmin + i])
            updates = updates.write(count, gaussian_patch[j][i])
            count += 1

    heatmap = tf.tensor_scatter_nd_update(heatmap, indices.stack(), updates.stack())

    return heatmap

def make_heatmaps(features, keypoint_x, keypoint_y, heatmap_shape):
    v = tf.cast(tf.sparse.to_dense(features['image/object/parts/v']), dtype=tf.float32)
    x = tf.cast(tf.math.round(keypoint_x * heatmap_shape[0]), dtype=tf.int32)
    y = tf.cast(tf.math.round(keypoint_y * heatmap_shape[1]), dtype=tf.int32)

    num_heatmap = heatmap_shape[2]
    heatmap_array = tf.TensorArray(tf.float32, 16)

    for i in range(num_heatmap):
        gaussian = generate_2d_guassian(heatmap_shape[1], heatmap_shape[0], y[i], x[i], v[i])
        heatmap_array = heatmap_array.write(i, gaussian)

    heatmaps = heatmap_array.stack()
    heatmaps = tf.transpose(heatmaps, perm=[1, 2, 0])  # change to (64, 64, 16)

    return heatmaps

print('슝=3')

슝=3


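**The functions so far could stay as standalone functions, but let's combine them into a class.** The declaration looks more complex as a class, but it brings many advantages. Don't forget to add self when turning a function into a method!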

In [37]:
class Preprocessor(object):
    def __init__(self,
                 image_shape=(256, 256, 3),
                 heatmap_shape=(64, 64, 16),
                 is_train=False):
        self.is_train = is_train
        self.image_shape = image_shape
        self.heatmap_shape = heatmap_shape

    def __call__(self, example):
        features = self.parse_tfexample(example)
        image = tf.io.decode_jpeg(features['image/encoded'])

        if self.is_train:
            random_margin = tf.random.uniform([1], 0.1, 0.3)[0]
            image, keypoint_x, keypoint_y = self.crop_roi(image, features, margin=random_margin)
            image = tf.image.resize(image, self.image_shape[0:2])
        else:
            image, keypoint_x, keypoint_y = self.crop_roi(image, features)
            image = tf.image.resize(image, self.image_shape[0:2])

        image = tf.cast(image, tf.float32) / 127.5 - 1
        heatmaps = self.make_heatmaps(features, keypoint_x, keypoint_y, self.heatmap_shape)

        return image, heatmaps

        
    def crop_roi(self, image, features, margin=0.2):
        img_shape = tf.shape(image)
        img_height = img_shape[0]
        img_width = img_shape[1]
        img_depth = img_shape[2]

        keypoint_x = tf.cast(tf.sparse.to_dense(features['image/object/parts/x']), dtype=tf.int32)
        keypoint_y = tf.cast(tf.sparse.to_dense(features['image/object/parts/y']), dtype=tf.int32)
        center_x = features['image/object/center/x']
        center_y = features['image/object/center/y']
        body_height = features['image/object/scale'] * 200.0
        
        masked_keypoint_x = tf.boolean_mask(keypoint_x, keypoint_x > 0)
        masked_keypoint_y = tf.boolean_mask(keypoint_y, keypoint_y > 0)
        
        keypoint_xmin = tf.reduce_min(masked_keypoint_x)
        keypoint_xmax = tf.reduce_max(masked_keypoint_x)
        keypoint_ymin = tf.reduce_min(masked_keypoint_y)
        keypoint_ymax = tf.reduce_max(masked_keypoint_y)
        
        xmin = keypoint_xmin - tf.cast(body_height * margin, dtype=tf.int32)
        xmax = keypoint_xmax + tf.cast(body_height * margin, dtype=tf.int32)
        ymin = keypoint_ymin - tf.cast(body_height * margin, dtype=tf.int32)
        ymax = keypoint_ymax + tf.cast(body_height * margin, dtype=tf.int32)
        
        effective_xmin = xmin if xmin > 0 else 0
        effective_ymin = ymin if ymin > 0 else 0
        effective_xmax = xmax if xmax < img_width else img_width
        effective_ymax = ymax if ymax < img_height else img_height
        effective_height = effective_ymax - effective_ymin
        effective_width = effective_xmax - effective_xmin

        image = image[effective_ymin:effective_ymax, effective_xmin:effective_xmax, :]
        new_shape = tf.shape(image)
        new_height = new_shape[0]
        new_width = new_shape[1]
        
        effective_keypoint_x = (keypoint_x - effective_xmin) / new_width
        effective_keypoint_y = (keypoint_y - effective_ymin) / new_height
        
        return image, effective_keypoint_x, effective_keypoint_y
        
    
    def generate_2d_guassian(self, height, width, y0, x0, visibility=2, sigma=1, scale=12):
        
        heatmap = tf.zeros((height, width))

        xmin = x0 - 3 * sigma
        ymin = y0 - 3 * sigma
        xmax = x0 + 3 * sigma
        ymax = y0 + 3 * sigma

        if xmin >= width or ymin >= height or xmax < 0 or ymax <0 or visibility == 0:
            return heatmap

        size = 6 * sigma + 1
        x, y = tf.meshgrid(tf.range(0, 6*sigma+1, 1), tf.range(0, 6*sigma+1, 1), indexing='xy')

        center_x = size // 2
        center_y = size // 2

        gaussian_patch = tf.cast(tf.math.exp(-(tf.square(x - center_x) + tf.math.square(y - center_y)) / (tf.math.square(sigma) * 2)) * scale, dtype=tf.float32)

        patch_xmin = tf.math.maximum(0, -xmin)
        patch_ymin = tf.math.maximum(0, -ymin)
        patch_xmax = tf.math.minimum(xmax, width) - xmin
        patch_ymax = tf.math.minimum(ymax, height) - ymin

        heatmap_xmin = tf.math.maximum(0, xmin)
        heatmap_ymin = tf.math.maximum(0, ymin)
        heatmap_xmax = tf.math.minimum(xmax, width)
        heatmap_ymax = tf.math.minimum(ymax, height)

        indices = tf.TensorArray(tf.int32, 1, dynamic_size=True)
        updates = tf.TensorArray(tf.float32, 1, dynamic_size=True)

        count = 0

        for j in tf.range(patch_ymin, patch_ymax):
            for i in tf.range(patch_xmin, patch_xmax):
                indices = indices.write(count, [heatmap_ymin+j, heatmap_xmin+i])
                updates = updates.write(count, gaussian_patch[j][i])
                count += 1
                
        heatmap = tf.tensor_scatter_nd_update(heatmap, indices.stack(), updates.stack())

        return heatmap


    def make_heatmaps(self, features, keypoint_x, keypoint_y, heatmap_shape):
        v = tf.cast(tf.sparse.to_dense(features['image/object/parts/v']), dtype=tf.float32)
        x = tf.cast(tf.math.round(keypoint_x * heatmap_shape[0]), dtype=tf.int32)
        y = tf.cast(tf.math.round(keypoint_y * heatmap_shape[1]), dtype=tf.int32)
        
        num_heatmap = heatmap_shape[2]
        heatmap_array = tf.TensorArray(tf.float32, 16)

        for i in range(num_heatmap):
            gaussian = self.generate_2d_guassian(heatmap_shape[1], heatmap_shape[0], y[i], x[i], v[i])
            heatmap_array = heatmap_array.write(i, gaussian)
        
        heatmaps = heatmap_array.stack()
        heatmaps = tf.transpose(heatmaps, perm=[1, 2, 0]) # change to (64, 64, 16)
        
        return heatmaps

    def parse_tfexample(self, example):
        image_feature_description = {
            'image/height': tf.io.FixedLenFeature([], tf.int64),
            'image/width': tf.io.FixedLenFeature([], tf.int64),
            'image/depth': tf.io.FixedLenFeature([], tf.int64),
            'image/object/parts/x': tf.io.VarLenFeature(tf.int64),
            'image/object/parts/y': tf.io.VarLenFeature(tf.int64),
            'image/object/parts/v': tf.io.VarLenFeature(tf.int64),
            'image/object/center/x': tf.io.FixedLenFeature([], tf.int64),
            'image/object/center/y': tf.io.FixedLenFeature([], tf.int64),
            'image/object/scale': tf.io.FixedLenFeature([], tf.float32),
            'image/encoded': tf.io.FixedLenFeature([], tf.string),
            'image/filename': tf.io.FixedLenFeature([], tf.string),
        }
        return tf.io.parse_single_example(example,
                                          image_feature_description)

print('슝=3')

슝=3


### Training the model

![.hrglass](./hrglass.png)

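#### Building the Hourglass model

> **Here BottleneckBlock is used as the basic building block of the model.**  
The BottleneckBlock narrows the channels with a 1x1 convolution before the 3x3 convolution and widens them again afterwards, which keeps the heavy computation cheap. The basic Residual Block was most likely not used in order to keep a deep, complex model efficient.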
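
A quick weight count illustrates the efficiency argument. The comparison below assumes 256 input and output channels, counts convolution weights only (no biases or batch-norm parameters), and takes "basic residual block" to mean two 3x3 convolutions at full width; these are assumptions made for the illustration.

```python
def conv_weights(kernel, c_in, c_out):
    """Number of weights in a kernel x kernel convolution, ignoring biases."""
    return kernel * kernel * c_in * c_out


channels = 256

# basic residual block: two 3x3 convolutions at full width
basic = 2 * conv_weights(3, channels, channels)

# bottleneck block: 1x1 down to half width, 3x3 at half width, 1x1 back up
bottleneck = (
    conv_weights(1, channels, channels // 2)
    + conv_weights(3, channels // 2, channels // 2)
    + conv_weights(1, channels // 2, channels)
)

print(basic, bottleneck, round(basic / bottleneck, 2))
```

Under these assumptions the bottleneck uses roughly a fifth of the convolution weights, which matters when the block is repeated many times inside each hourglass.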

In [38]:
def BottleneckBlock(inputs, filters, strides=1, downsample=False, name=None):
    identity = inputs
    if downsample:
        identity = Conv2D(
            filters=filters,
            kernel_size=1,
            strides=strides,
            padding='same',
            kernel_initializer='he_normal')(inputs)

    x = BatchNormalization(momentum=0.9)(inputs)
    x = ReLU()(x)
    x = Conv2D(
        filters=filters // 2,
        kernel_size=1,
        strides=1,
        padding='same',
        kernel_initializer='he_normal')(x)

    x = BatchNormalization(momentum=0.9)(x)
    x = ReLU()(x)
    x = Conv2D(
        filters=filters // 2,
        kernel_size=3,
        strides=strides,
        padding='same',
        kernel_initializer='he_normal')(x)

    x = BatchNormalization(momentum=0.9)(x)
    x = ReLU()(x)
    x = Conv2D(
        filters=filters,
        kernel_size=1,
        strides=1,
        padding='same',
        kernel_initializer='he_normal')(x)

    x = Add()([identity, x])
    return x

print('슝=3')

슝=3


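Coming back to the hourglass model: if you think about it, it is like an onion. Remove the outermost layer and the same structure appears again. We can use this property to express the model compactly.

The trick is recursion! If we want five onion layers from the outside in, we call HourglassModule recursively with order going 5, 4, ..., 1, and once order reaches 1 we substitute BottleneckBlocks instead of recursing further. The result is very concise.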
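
The recursion can be traced without building any layers. The sketch below only tracks the feature-map size at each level, assuming every level halves the resolution with 2x2 max pooling; `hourglass_resolutions` is a helper invented for this illustration.

```python
def hourglass_resolutions(size, order):
    """Return the feature-map sizes visited on the way down, outermost first."""
    lower = size // 2                     # effect of 2x2 max pooling with stride 2
    if order > 1:
        # the inner hourglass has the same structure, one level smaller
        return [size] + hourglass_resolutions(lower, order - 1)
    return [size, lower]                  # innermost level: no further recursion


print(hourglass_resolutions(64, 4))
```

Starting from a 64x64 map with order 4, the module works at 64, 32, 16, 8 and finally 4 pixels before upsampling back, which is the hourglass shape.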

In [39]:
def HourglassModule(inputs, order, filters, num_residual):
    
    up1 = BottleneckBlock(inputs, filters, downsample=False)
    for i in range(num_residual):
        up1 = BottleneckBlock(up1, filters, downsample=False)

    low1 = MaxPool2D(pool_size=2, strides=2)(inputs)
    for i in range(num_residual):
        low1 = BottleneckBlock(low1, filters, downsample=False)

    low2 = low1
    if order > 1:
        low2 = HourglassModule(low1, order - 1, filters, num_residual)
    else:
        for i in range(num_residual):
            low2 = BottleneckBlock(low2, filters, downsample=False)

    low3 = low2
    for i in range(num_residual):
        low3 = BottleneckBlock(low3, filters, downsample=False)

    up2 = UpSampling2D(size=2)(low3)

    return up2 + up1

print('슝=3')

슝=3


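The more modules are stacked, the deeper the model becomes and the harder it is to train, so the authors applied intermediate supervision.
The blue boxes between modules in the diagram are convolution layers that output the heatmaps computed midway through the model. The difference between these heatmaps and the ground truth is computed as an intermediate loss (auxiliary loss). This lets the stacked hourglass modules produce more refined results.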
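
Intermediate supervision amounts to adding up one loss per stack output. The NumPy sketch below mirrors the weighting used in this tutorial's `compute_loss` (foreground pixels weighted 82 times as much as background) but leaves out the batch-size scaling; the tiny 4x4 arrays are made up for the example.

```python
import numpy as np


def stacked_loss(label, outputs, fg_weight=81.0):
    """Sum a weighted mean squared error over the heatmaps from every stack."""
    weights = (label > 0).astype(np.float32) * fg_weight + 1.0
    return sum(float(np.mean(np.square(label - out) * weights)) for out in outputs)


label = np.zeros((4, 4), dtype=np.float32)
label[1, 2] = 12.0                                    # a single keypoint peak

# first stack predicts nothing, second stack predicts the label exactly
outputs = [np.zeros_like(label), label.copy()]
total = stacked_loss(label, outputs)
print(total)
```

Only the first stack contributes here (144 * 82 / 16 = 738), which shows that every intermediate output is pushed toward the ground truth, not just the last one.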

In [40]:
def LinearLayer(inputs, filters):
    x = Conv2D(
        filters=filters,
        kernel_size=1,
        strides=1,
        padding='same',
        kernel_initializer='he_normal')(inputs)
    x = BatchNormalization(momentum=0.9)(x)
    x = ReLU()(x)
    return x

print('슝=3')

슝=3


So we insert a LinearLayer between the stacked hourglass modules and compute the intermediate loss there.

Stacking the hourglass modules built so far gives the stacked hourglass network.

In [41]:
def StackedHourglassNetwork(
        input_shape=(256, 256, 3), 
        num_stack=4, 
        num_residual=1,
        num_heatmap=16):
    
    inputs = Input(shape=input_shape)

    x = Conv2D(
        filters=64,
        kernel_size=7,
        strides=2,
        padding='same',
        kernel_initializer='he_normal')(inputs)
    x = BatchNormalization(momentum=0.9)(x)
    x = ReLU()(x)
    x = BottleneckBlock(x, 128, downsample=True)
    x = MaxPool2D(pool_size=2, strides=2)(x)
    x = BottleneckBlock(x, 128, downsample=False)
    x = BottleneckBlock(x, 256, downsample=True)

    ys = []
    for i in range(num_stack):
        x = HourglassModule(x, order=4, filters=256, num_residual=num_residual)
        for _ in range(num_residual):
            x = BottleneckBlock(x, 256, downsample=False)

        x = LinearLayer(x, 256)

        y = Conv2D(
            filters=num_heatmap,
            kernel_size=1,
            strides=1,
            padding='same',
            kernel_initializer='he_normal')(x)
        ys.append(y)

        if i < num_stack - 1:
            y_intermediate_1 = Conv2D(filters=256, kernel_size=1, strides=1)(x)
            y_intermediate_2 = Conv2D(filters=256, kernel_size=1, strides=1)(y)
            x = Add()([y_intermediate_1, y_intermediate_2])

    return tf.keras.Model(inputs, ys, name='stacked_hourglass')

print('슝=3')

슝=3


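### Building the training engine

Now it is time to train the model.

But what if several GPUs are available and we want to train on the data in parallel? Using multiple GPUs requires a little extra code. Our training environment has a single GPU, but you may well use several GPUs later in a company setting, so let's take a brief look.

The key keyword is **tf.distribute.MirroredStrategy**. It applies when one machine has several GPUs. Each GPU holds a replica of the model and computes the loss and gradients on its own share of the batch; the results are then aggregated across the replicas and every replica applies the same weight update.

Combining the losses computed on each GPU into one overall loss is the job of the **strategy.reduce** function.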
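
The reason for dividing by the global batch size before a SUM reduction can be seen with plain numbers. This sketch only imitates the arithmetic; it does not call `tf.distribute`, and the per-example losses and the two "replicas" are made up.

```python
def per_replica_loss(per_example_losses, global_batch_size):
    """Sum the losses seen by one replica and scale by the global batch size."""
    return sum(per_example_losses) / global_batch_size


# a global batch of 4 examples split across 2 replicas
replica_a = [2.0, 4.0]
replica_b = [6.0, 8.0]
global_batch_size = len(replica_a) + len(replica_b)

# SUM reduction over the replicas
reduced = (per_replica_loss(replica_a, global_batch_size)
           + per_replica_loss(replica_b, global_batch_size))

# the same batch processed on a single device
single_device_mean = sum(replica_a + replica_b) / global_batch_size
print(reduced, single_device_mean)
```

Both values are 5.0: scaling by the global batch size and then summing across replicas gives the same loss as averaging the whole batch on one device.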

Again, instead of writing each function separately, we combine them into a single class.

In [42]:
class Trainer(object):
    def __init__(self,
                 model,
                 epochs,
                 global_batch_size,
                 strategy,
                 initial_learning_rate):
        self.model = model
        self.epochs = epochs
        self.strategy = strategy
        self.global_batch_size = global_batch_size
        self.loss_object = tf.keras.losses.MeanSquaredError(
            reduction=tf.keras.losses.Reduction.NONE)
        self.optimizer = tf.keras.optimizers.Adam(
            learning_rate=initial_learning_rate)

        self.current_learning_rate = initial_learning_rate
        self.last_val_loss = math.inf
        self.lowest_val_loss = math.inf
        self.patience_count = 0
        self.max_patience = 10
        self.best_model = None
        
        # Initialize history dictionary
        self.history = {'train_loss': [], 'val_loss': []}


    def lr_decay(self):
        if self.patience_count >= self.max_patience:
            self.current_learning_rate /= 10.0
            self.patience_count = 0
        elif self.last_val_loss == self.lowest_val_loss:
            self.patience_count = 0
        self.patience_count += 1

        self.optimizer.learning_rate = self.current_learning_rate

    def lr_decay_step(self, epoch):
        if epoch == 25 or epoch == 50 or epoch == 75:
            self.current_learning_rate /= 10.0
        self.optimizer.learning_rate = self.current_learning_rate

    def compute_loss(self, labels, outputs):
        loss = 0
        for output in outputs:
            weights = tf.cast(labels > 0, dtype=tf.float32) * 81 + 1
            loss += tf.math.reduce_mean(
                tf.math.square(labels - output) * weights) * (
                    1. / self.global_batch_size)
        return loss

    def train_step(self, inputs):
        images, labels = inputs
        with tf.GradientTape() as tape:
            outputs = self.model(images, training=True)
            loss = self.compute_loss(labels, outputs)

        grads = tape.gradient(
            target=loss, sources=self.model.trainable_variables)
        self.optimizer.apply_gradients(
            zip(grads, self.model.trainable_variables))

        return loss

    def val_step(self, inputs):
        images, labels = inputs
        outputs = self.model(images, training=False)
        loss = self.compute_loss(labels, outputs)
        return loss

    def run(self, train_dist_dataset, val_dist_dataset):
        @tf.function
        def distributed_train_epoch(dataset):
            tf.print('Start distributed training...')
            total_loss = 0.0
            num_train_batches = 0.0
            for one_batch in dataset:
                per_replica_loss = self.strategy.run(
                    self.train_step, args=(one_batch, ))
                batch_loss = self.strategy.reduce(
                    tf.distribute.ReduceOp.SUM, per_replica_loss, axis=None)
                total_loss += batch_loss
                num_train_batches += 1
                tf.print('Trained batch', num_train_batches, 'batch loss',
                         batch_loss, 'epoch total loss', total_loss / num_train_batches)
            return total_loss, num_train_batches

        @tf.function
        def distributed_val_epoch(dataset):
            total_loss = 0.0
            num_val_batches = 0.0
            for one_batch in dataset:
                per_replica_loss = self.strategy.run(
                    self.val_step, args=(one_batch, ))
                num_val_batches += 1
                batch_loss = self.strategy.reduce(
                    tf.distribute.ReduceOp.SUM, per_replica_loss, axis=None)
                tf.print('Validated batch', num_val_batches, 'batch loss',
                         batch_loss)
                if not tf.math.is_nan(batch_loss):
                    # TODO: Find out why the last validation batch loss become NaN
                    total_loss += batch_loss
                else:
                    num_val_batches -= 1

            return total_loss, num_val_batches

        for epoch in range(1, self.epochs + 1):
            self.lr_decay()
            print('Start epoch {} with learning rate {}'.format(
                epoch, self.current_learning_rate))

            train_total_loss, num_train_batches = distributed_train_epoch(
                train_dist_dataset)
            train_loss = train_total_loss / num_train_batches
            print('Epoch {} train loss {}'.format(epoch, train_loss))

            val_total_loss, num_val_batches = distributed_val_epoch(
                val_dist_dataset)
            val_loss = val_total_loss / num_val_batches
            print('Epoch {} val loss {}'.format(epoch, val_loss))
            
            # Save the training and validation loss for each epoch
            self.history['train_loss'].append(train_loss.numpy())
            self.history['val_loss'].append(val_loss.numpy())

            # save model when reach a new lowest validation loss
            if val_loss < self.lowest_val_loss:
                self.save_model(epoch, val_loss)
                self.lowest_val_loss = val_loss
            self.last_val_loss = val_loss
        
        # Save the history to a file
        with open('training_history.pkl', 'wb') as f:
            pickle.dump(self.history, f)

        return self.best_model

    def save_model(self, epoch, loss):
        # modified: save checkpoints under BEST_MODEL_PATH
        model_name = os.path.join(
            BEST_MODEL_PATH,
            'model-epoch-{}-loss-{:.4f}.h5'.format(epoch, loss))
        self.model.save_weights(model_name)
        self.best_model = model_name
        print("Model {} saved.".format(model_name))

print('슝=3')

슝=3
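The patience rule in `lr_decay` is easier to follow when traced on its own. The sketch below copies the same bookkeeping into a plain-Python function (`trace_learning_rates` is a hypothetical helper, not part of the notebook) and returns the learning rate used in each epoch. The validation losses are made-up numbers chosen only to trigger one decay.

```python
import math


def trace_learning_rates(val_losses, initial_learning_rate, max_patience=10):
    """Replay Trainer.lr_decay over a list of per-epoch validation losses."""
    learning_rate = initial_learning_rate
    last_val_loss = math.inf
    lowest_val_loss = math.inf
    patience_count = 0
    used = []
    for val_loss in val_losses:
        # Same branch structure as lr_decay(), called at the start of an epoch.
        if patience_count >= max_patience:
            learning_rate /= 10.0
            patience_count = 0
        elif last_val_loss == lowest_val_loss:
            # The previous epoch set a new best, so the counter restarts.
            patience_count = 0
        patience_count += 1
        used.append(learning_rate)

        # Same bookkeeping as the end of each epoch in run().
        if val_loss < lowest_val_loss:
            lowest_val_loss = val_loss
        last_val_loss = val_loss
    return used


# Hypothetical losses: the best value is reached in epoch 2, then no improvement.
rates = trace_learning_rates(
    [1.0, 0.9, 0.95, 0.96, 0.97, 0.98],
    initial_learning_rate=1.0,
    max_patience=3)
print(rates)
```

With the default `max_patience` of 10 and the 3 epochs used later in this notebook, the counter never reaches the threshold, so the learning rate stays at its initial value for the whole run.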


Now we write a function that builds the dataset. Since there are multiple TFRecord files, we load them with tf.data.Dataset.list_files.

In [43]:
IMAGE_SHAPE = (256, 256, 3)
HEATMAP_SIZE = (64, 64)

def create_dataset(tfrecords, batch_size, num_heatmap, is_train):
    preprocess = Preprocessor(
        IMAGE_SHAPE, (HEATMAP_SIZE[0], HEATMAP_SIZE[1], num_heatmap), is_train)

    dataset = tf.data.Dataset.list_files(tfrecords) # list_files is used here
    dataset = tf.data.TFRecordDataset(dataset)
    dataset = dataset.map(
        preprocess, num_parallel_calls=tf.data.experimental.AUTOTUNE)

    if is_train:
        dataset = dataset.shuffle(batch_size)

    dataset = dataset.batch(batch_size)
    dataset = dataset.prefetch(buffer_size=tf.data.experimental.AUTOTUNE)

    return dataset

print('슝=3')

슝=3
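One detail of `create_dataset` worth seeing in isolation is `dataset.shuffle(batch_size)`: the argument is the shuffle buffer size, so only a window of that many examples is mixed at a time. The sketch below is a simplified pure-Python model of buffer-based shuffling, assuming the documented behavior (fill a buffer, emit a random element, replace it with the next input). `buffered_shuffle` is a hypothetical helper and not the TensorFlow implementation.

```python
import random


def buffered_shuffle(items, buffer_size, seed=0):
    """Shuffle `items` using a fixed-size buffer (buffer_size must be >= 1)."""
    rng = random.Random(seed)
    source = iter(items)

    # Fill the buffer with the first `buffer_size` elements.
    buffer = []
    for item in source:
        buffer.append(item)
        if len(buffer) == buffer_size:
            break

    # Emit a random buffered element, then put the next input in its slot.
    output = []
    for item in source:
        index = rng.randrange(len(buffer))
        output.append(buffer[index])
        buffer[index] = item

    # Input is exhausted: emit what is left in random order.
    rng.shuffle(buffer)
    output.extend(buffer)
    return output


shuffled = buffered_shuffle(range(100), buffer_size=16)
print(shuffled[:10])
```

In this model an element can never be emitted more than `buffer_size - 1` positions earlier than where it sat in the input, so a small buffer leaves the data close to its on-disk order. A larger buffer mixes more thoroughly at the cost of memory.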


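Now all that's left is to assemble the dataset, the model, and the trainer object. We'll wrap that in a single function. Note that the with strategy.scope(): block is essential: the model and the optimizer must be created inside it so that their variables are mirrored across the replicas.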

It is also important to hook the datasets up through experimental_distribute_dataset.

In [44]:
def train(epochs, learning_rate, num_heatmap, batch_size, train_tfrecords, val_tfrecords):
    strategy = tf.distribute.MirroredStrategy()
    global_batch_size = strategy.num_replicas_in_sync * batch_size
    train_dataset = create_dataset(
        train_tfrecords, global_batch_size, num_heatmap, is_train=True)
    val_dataset = create_dataset(
        val_tfrecords, global_batch_size, num_heatmap, is_train=False)

    # modified: make sure the output directory for the best models exists
    os.makedirs(BEST_MODEL_PATH, exist_ok=True)

    with strategy.scope():
        train_dist_dataset = strategy.experimental_distribute_dataset(
            train_dataset)
        val_dist_dataset = strategy.experimental_distribute_dataset(
            val_dataset)

        model = StackedHourglassNetwork(IMAGE_SHAPE, 4, 1, num_heatmap)

        trainer = Trainer(
            model,
            epochs,
            global_batch_size,
            strategy,
            initial_learning_rate=learning_rate)

        print('Start training...')
        return trainer.run(train_dist_dataset, val_dist_dataset)

print('슝=3')

슝=3
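The relationship between `batch_size` and `global_batch_size` in `train` can be pictured with a small plain-Python sketch. It assumes a hypothetical machine with 2 replicas and models the distributed dataset as simply cutting each global batch into equal per-replica slices. `split_global_batch` is an illustrative helper, and the real sharding is handled inside TensorFlow (and may treat an uneven final batch differently).

```python
def split_global_batch(global_batch, num_replicas):
    """Cut one global batch into equal, contiguous per-replica slices."""
    per_replica = len(global_batch) // num_replicas
    return [
        global_batch[i * per_replica:(i + 1) * per_replica]
        for i in range(num_replicas)
    ]


batch_size = 16     # per-replica batch size, as in the run cell of this notebook
num_replicas = 2    # hypothetical: the run logged in this notebook uses one GPU
global_batch_size = num_replicas * batch_size

# Example indices standing in for one global batch of images.
global_batch = list(range(global_batch_size))
shards = split_global_batch(global_batch, num_replicas)
print([len(shard) for shard in shards])
```

This is why `create_dataset` is called with `global_batch_size` rather than `batch_size`: each step consumes one global batch, and every replica still sees `batch_size` examples.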


In [45]:
# How to run

train_tfrecords = os.path.join(TFRECORD_PATH, 'train*')  
val_tfrecords = os.path.join(TFRECORD_PATH, 'val*')  
epochs = 3
batch_size = 16  
num_heatmap = 16  
learning_rate = 0.0007
best_model_file = train(epochs, learning_rate, num_heatmap, batch_size, train_tfrecords, val_tfrecords)

INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0',)
Start training...
Start epoch 1 with learning rate 0.0007
Start distributed training...
Trained batch 1 batch loss 2.62562418 epoch total loss 2.62562418
Trained batch 2 batch loss 2.4533639 epoch total loss 2.53949404
Trained batch 3 batch loss 2.64274406 epoch total loss 2.57391071
Trained batch 4 batch loss 2.53032923 epoch total loss 2.56301546
Trained batch 5 batch loss 2.45593643 epoch total loss 2.54159975
Trained batch 6 batch loss 2.40004778 epoch total loss 2.51800752
Trained batch 7 batch loss 2.31933403 epoch total loss 2.48962569
Trained batch 8 batch loss 2.09833241 epoch total loss 2.44071388
Trained batch 9 batch loss 2.18560219 epoch total loss 2.41236806
Trained batch 10 batch loss 2.22844219 epoch total loss 2.3939755
Trained batch 11 batch loss 2.20921421 epoch total loss 2.37717891
Trained batch 12 batch loss 2.12779522 epoch total loss 2.35639691
Trained batch 13

Trained batch 121 batch loss 1.6291995 epoch total loss 1.83917284
Trained batch 122 batch loss 1.71674633 epoch total loss 1.83816946
Trained batch 123 batch loss 1.76331663 epoch total loss 1.83756089
Trained batch 124 batch loss 1.69552755 epoch total loss 1.83641541
Trained batch 125 batch loss 1.82400203 epoch total loss 1.83631611
Trained batch 126 batch loss 1.81040084 epoch total loss 1.83611047
Trained batch 127 batch loss 1.8073808 epoch total loss 1.83588433
Trained batch 128 batch loss 1.78659034 epoch total loss 1.83549917
Trained batch 129 batch loss 1.75727344 epoch total loss 1.83489275
Trained batch 130 batch loss 1.53699493 epoch total loss 1.83260119
Trained batch 131 batch loss 1.45283544 epoch total loss 1.82970226
Trained batch 132 batch loss 1.39119899 epoch total loss 1.82638025
Trained batch 133 batch loss 1.49168491 epoch total loss 1.82386374
Trained batch 134 batch loss 1.65148294 epoch total loss 1.82257736
Trained batch 135 batch loss 1.7653265 epoch total

Trained batch 243 batch loss 1.63217115 epoch total loss 1.74374151
Trained batch 244 batch loss 1.66107106 epoch total loss 1.74340272
Trained batch 245 batch loss 1.65116692 epoch total loss 1.74302614
Trained batch 246 batch loss 1.63123441 epoch total loss 1.74257171
Trained batch 247 batch loss 1.59750795 epoch total loss 1.74198437
Trained batch 248 batch loss 1.51177919 epoch total loss 1.74105608
Trained batch 249 batch loss 1.67358756 epoch total loss 1.74078512
Trained batch 250 batch loss 1.64724791 epoch total loss 1.74041104
Trained batch 251 batch loss 1.67537522 epoch total loss 1.74015188
Trained batch 252 batch loss 1.73083508 epoch total loss 1.74011493
Trained batch 253 batch loss 1.75179577 epoch total loss 1.74016118
Trained batch 254 batch loss 1.62117612 epoch total loss 1.73969281
Trained batch 255 batch loss 1.6252234 epoch total loss 1.73924387
Trained batch 256 batch loss 1.34978986 epoch total loss 1.73772252
Trained batch 257 batch loss 1.65072894 epoch tot

Trained batch 365 batch loss 1.68420696 epoch total loss 1.69413364
Trained batch 366 batch loss 1.73616242 epoch total loss 1.69424844
Trained batch 367 batch loss 1.71913147 epoch total loss 1.69431615
Trained batch 368 batch loss 1.53994548 epoch total loss 1.69389653
Trained batch 369 batch loss 1.53278279 epoch total loss 1.69346
Trained batch 370 batch loss 1.56211364 epoch total loss 1.69310498
Trained batch 371 batch loss 1.58942425 epoch total loss 1.69282556
Trained batch 372 batch loss 1.6277889 epoch total loss 1.69265079
Trained batch 373 batch loss 1.54341102 epoch total loss 1.69225061
Trained batch 374 batch loss 1.62637913 epoch total loss 1.69207454
Trained batch 375 batch loss 1.62569165 epoch total loss 1.69189751
Trained batch 376 batch loss 1.60896277 epoch total loss 1.69167686
Trained batch 377 batch loss 1.6186223 epoch total loss 1.69148314
Trained batch 378 batch loss 1.64882159 epoch total loss 1.69137025
Trained batch 379 batch loss 1.63166404 epoch total l

Trained batch 487 batch loss 1.60112095 epoch total loss 1.66454327
Trained batch 488 batch loss 1.5718019 epoch total loss 1.66435313
Trained batch 489 batch loss 1.57243776 epoch total loss 1.66416526
Trained batch 490 batch loss 1.56512213 epoch total loss 1.66396308
Trained batch 491 batch loss 1.63391292 epoch total loss 1.66390193
Trained batch 492 batch loss 1.59104073 epoch total loss 1.66375387
Trained batch 493 batch loss 1.59226894 epoch total loss 1.66360891
Trained batch 494 batch loss 1.60287094 epoch total loss 1.66348588
Trained batch 495 batch loss 1.60724437 epoch total loss 1.66337228
Trained batch 496 batch loss 1.64867032 epoch total loss 1.6633426
Trained batch 497 batch loss 1.58668613 epoch total loss 1.66318834
Trained batch 498 batch loss 1.57414389 epoch total loss 1.66300952
Trained batch 499 batch loss 1.50181186 epoch total loss 1.66268659
Trained batch 500 batch loss 1.52557147 epoch total loss 1.66241241
Trained batch 501 batch loss 1.53988063 epoch tota

Trained batch 608 batch loss 1.52150548 epoch total loss 1.63842297
Trained batch 609 batch loss 1.47615325 epoch total loss 1.63815641
Trained batch 610 batch loss 1.50167966 epoch total loss 1.63793278
Trained batch 611 batch loss 1.65734541 epoch total loss 1.63796449
Trained batch 612 batch loss 1.45683742 epoch total loss 1.63766861
Trained batch 613 batch loss 1.39244831 epoch total loss 1.63726854
Trained batch 614 batch loss 1.41096139 epoch total loss 1.6369
Trained batch 615 batch loss 1.39249468 epoch total loss 1.63650262
Trained batch 616 batch loss 1.41823208 epoch total loss 1.63614821
Trained batch 617 batch loss 1.52652395 epoch total loss 1.63597059
Trained batch 618 batch loss 1.603899 epoch total loss 1.63591862
Trained batch 619 batch loss 1.59203076 epoch total loss 1.63584781
Trained batch 620 batch loss 1.41383219 epoch total loss 1.6354897
Trained batch 621 batch loss 1.42906713 epoch total loss 1.63515735
Trained batch 622 batch loss 1.53522885 epoch total los

Trained batch 730 batch loss 1.61102724 epoch total loss 1.62184978
Trained batch 731 batch loss 1.6233747 epoch total loss 1.62185192
Trained batch 732 batch loss 1.63830876 epoch total loss 1.62187433
Trained batch 733 batch loss 1.55079651 epoch total loss 1.62177742
Trained batch 734 batch loss 1.54805923 epoch total loss 1.62167704
Trained batch 735 batch loss 1.55910504 epoch total loss 1.62159181
Trained batch 736 batch loss 1.46756625 epoch total loss 1.62138247
Trained batch 737 batch loss 1.55431402 epoch total loss 1.62129152
Trained batch 738 batch loss 1.38905787 epoch total loss 1.62097681
Trained batch 739 batch loss 1.30767322 epoch total loss 1.62055278
Trained batch 740 batch loss 1.46834493 epoch total loss 1.62034714
Trained batch 741 batch loss 1.58726406 epoch total loss 1.62030256
Trained batch 742 batch loss 1.65161514 epoch total loss 1.62034476
Trained batch 743 batch loss 1.5932827 epoch total loss 1.62030828
Trained batch 744 batch loss 1.63655007 epoch tota

Trained batch 851 batch loss 1.44269419 epoch total loss 1.60649323
Trained batch 852 batch loss 1.42817831 epoch total loss 1.60628402
Trained batch 853 batch loss 1.40749621 epoch total loss 1.60605097
Trained batch 854 batch loss 1.48826051 epoch total loss 1.60591304
Trained batch 855 batch loss 1.53032923 epoch total loss 1.60582459
Trained batch 856 batch loss 1.45305443 epoch total loss 1.60564601
Trained batch 857 batch loss 1.50990689 epoch total loss 1.60553432
Trained batch 858 batch loss 1.49692714 epoch total loss 1.60540771
Trained batch 859 batch loss 1.41563392 epoch total loss 1.60518682
Trained batch 860 batch loss 1.48545241 epoch total loss 1.60504758
Trained batch 861 batch loss 1.58281088 epoch total loss 1.60502172
Trained batch 862 batch loss 1.48033476 epoch total loss 1.60487711
Trained batch 863 batch loss 1.57304478 epoch total loss 1.60484016
Trained batch 864 batch loss 1.53057146 epoch total loss 1.60475409
Trained batch 865 batch loss 1.39490581 epoch to

Trained batch 973 batch loss 1.38984764 epoch total loss 1.5927366
Trained batch 974 batch loss 1.39027965 epoch total loss 1.5925287
Trained batch 975 batch loss 1.38294947 epoch total loss 1.59231365
Trained batch 976 batch loss 1.46622539 epoch total loss 1.59218442
Trained batch 977 batch loss 1.52310324 epoch total loss 1.59211373
Trained batch 978 batch loss 1.51779175 epoch total loss 1.5920378
Trained batch 979 batch loss 1.62022364 epoch total loss 1.59206653
Trained batch 980 batch loss 1.45678878 epoch total loss 1.59192848
Trained batch 981 batch loss 1.57220578 epoch total loss 1.59190845
Trained batch 982 batch loss 1.4849391 epoch total loss 1.59179962
Trained batch 983 batch loss 1.54974496 epoch total loss 1.59175694
Trained batch 984 batch loss 1.60906339 epoch total loss 1.59177446
Trained batch 985 batch loss 1.49785662 epoch total loss 1.59167898
Trained batch 986 batch loss 1.49946237 epoch total loss 1.59158552
Trained batch 987 batch loss 1.55017877 epoch total 

Trained batch 1093 batch loss 1.32005537 epoch total loss 1.58045149
Trained batch 1094 batch loss 1.44464672 epoch total loss 1.58032739
Trained batch 1095 batch loss 1.41791701 epoch total loss 1.5801791
Trained batch 1096 batch loss 1.50604463 epoch total loss 1.5801115
Trained batch 1097 batch loss 1.5996232 epoch total loss 1.58012927
Trained batch 1098 batch loss 1.58385623 epoch total loss 1.58013272
Trained batch 1099 batch loss 1.68748415 epoch total loss 1.58023036
Trained batch 1100 batch loss 1.54147303 epoch total loss 1.58019519
Trained batch 1101 batch loss 1.59243798 epoch total loss 1.58020627
Trained batch 1102 batch loss 1.48139012 epoch total loss 1.58011663
Trained batch 1103 batch loss 1.35981882 epoch total loss 1.57991695
Trained batch 1104 batch loss 1.25697327 epoch total loss 1.57962441
Trained batch 1105 batch loss 1.22960126 epoch total loss 1.57930768
Trained batch 1106 batch loss 1.28929877 epoch total loss 1.57904553
Trained batch 1107 batch loss 1.56382

Trained batch 1213 batch loss 1.42624092 epoch total loss 1.566782
Trained batch 1214 batch loss 1.39627147 epoch total loss 1.56664157
Trained batch 1215 batch loss 1.48038185 epoch total loss 1.56657052
Trained batch 1216 batch loss 1.36616421 epoch total loss 1.56640577
Trained batch 1217 batch loss 1.39946711 epoch total loss 1.56626856
Trained batch 1218 batch loss 1.40131474 epoch total loss 1.56613314
Trained batch 1219 batch loss 1.52175248 epoch total loss 1.56609666
Trained batch 1220 batch loss 1.55939972 epoch total loss 1.5660913
Trained batch 1221 batch loss 1.61023 epoch total loss 1.56612742
Trained batch 1222 batch loss 1.5645889 epoch total loss 1.56612611
Trained batch 1223 batch loss 1.62089586 epoch total loss 1.56617093
Trained batch 1224 batch loss 1.48773217 epoch total loss 1.56610692
Trained batch 1225 batch loss 1.56633949 epoch total loss 1.56610703
Trained batch 1226 batch loss 1.55173731 epoch total loss 1.56609535
Trained batch 1227 batch loss 1.52603376 

Trained batch 1333 batch loss 1.45354986 epoch total loss 1.55783701
Trained batch 1334 batch loss 1.4176302 epoch total loss 1.55773199
Trained batch 1335 batch loss 1.39499879 epoch total loss 1.55761
Trained batch 1336 batch loss 1.40203774 epoch total loss 1.55749369
Trained batch 1337 batch loss 1.47533572 epoch total loss 1.55743217
Trained batch 1338 batch loss 1.40730202 epoch total loss 1.55732
Trained batch 1339 batch loss 1.45964754 epoch total loss 1.55724704
Trained batch 1340 batch loss 1.46003139 epoch total loss 1.55717444
Trained batch 1341 batch loss 1.42063737 epoch total loss 1.55707264
Trained batch 1342 batch loss 1.32246566 epoch total loss 1.55689788
Trained batch 1343 batch loss 1.39016867 epoch total loss 1.55677366
Trained batch 1344 batch loss 1.37281823 epoch total loss 1.55663681
Trained batch 1345 batch loss 1.43034577 epoch total loss 1.55654299
Trained batch 1346 batch loss 1.45418191 epoch total loss 1.55646682
Trained batch 1347 batch loss 1.40859473 

Validated batch 107 batch loss 1.35569346
Validated batch 108 batch loss 1.42751431
Validated batch 109 batch loss 1.41085362
Validated batch 110 batch loss 1.46041524
Validated batch 111 batch loss 1.54238558
Validated batch 112 batch loss 1.6565764
Validated batch 113 batch loss 1.56721592
Validated batch 114 batch loss 1.43219376
Validated batch 115 batch loss 1.34128332
Validated batch 116 batch loss 1.4044981
Validated batch 117 batch loss 1.4141444
Validated batch 118 batch loss 1.38128281
Validated batch 119 batch loss 1.37395465
Validated batch 120 batch loss 1.3811003
Validated batch 121 batch loss 1.52036452
Validated batch 122 batch loss 1.35015035
Validated batch 123 batch loss 1.44428301
Validated batch 124 batch loss 1.40942097
Validated batch 125 batch loss 1.43824649
Validated batch 126 batch loss 1.39405608
Validated batch 127 batch loss 1.33756375
Validated batch 128 batch loss 1.45481908
Validated batch 129 batch loss 1.51233721
Validated batch 130 batch loss 1.42572

Trained batch 72 batch loss 1.44066191 epoch total loss 1.41956
Trained batch 73 batch loss 1.43066275 epoch total loss 1.41971207
Trained batch 74 batch loss 1.29450798 epoch total loss 1.41802025
Trained batch 75 batch loss 1.40010297 epoch total loss 1.41778123
Trained batch 76 batch loss 1.47755694 epoch total loss 1.41856778
Trained batch 77 batch loss 1.47567534 epoch total loss 1.4193095
Trained batch 78 batch loss 1.4225204 epoch total loss 1.41935062
Trained batch 79 batch loss 1.38790369 epoch total loss 1.41895258
Trained batch 80 batch loss 1.47087955 epoch total loss 1.41960168
Trained batch 81 batch loss 1.41392601 epoch total loss 1.41953158
Trained batch 82 batch loss 1.42693567 epoch total loss 1.41962183
Trained batch 83 batch loss 1.30104089 epoch total loss 1.4181931
Trained batch 84 batch loss 1.35302067 epoch total loss 1.41741729
Trained batch 85 batch loss 1.48854458 epoch total loss 1.41825414
Trained batch 86 batch loss 1.46489096 epoch total loss 1.41879642
T

Trained batch 194 batch loss 1.3524127 epoch total loss 1.42224085
Trained batch 195 batch loss 1.48370528 epoch total loss 1.42255604
Trained batch 196 batch loss 1.40515864 epoch total loss 1.42246723
Trained batch 197 batch loss 1.48322296 epoch total loss 1.42277563
Trained batch 198 batch loss 1.49990368 epoch total loss 1.4231652
Trained batch 199 batch loss 1.44373214 epoch total loss 1.42326856
Trained batch 200 batch loss 1.47059226 epoch total loss 1.42350507
Trained batch 201 batch loss 1.44132829 epoch total loss 1.42359376
Trained batch 202 batch loss 1.34943581 epoch total loss 1.42322659
Trained batch 203 batch loss 1.4204272 epoch total loss 1.42321277
Trained batch 204 batch loss 1.55536032 epoch total loss 1.42386055
Trained batch 205 batch loss 1.43014872 epoch total loss 1.42389119
Trained batch 206 batch loss 1.34406841 epoch total loss 1.42350364
Trained batch 207 batch loss 1.36780548 epoch total loss 1.42323458
Trained batch 208 batch loss 1.49420691 epoch total

Trained batch 315 batch loss 1.43915129 epoch total loss 1.42644906
Trained batch 316 batch loss 1.47864223 epoch total loss 1.42661417
Trained batch 317 batch loss 1.54478836 epoch total loss 1.42698705
Trained batch 318 batch loss 1.51623273 epoch total loss 1.42726767
Trained batch 319 batch loss 1.59098458 epoch total loss 1.42778087
Trained batch 320 batch loss 1.57548261 epoch total loss 1.42824244
Trained batch 321 batch loss 1.55903888 epoch total loss 1.4286499
Trained batch 322 batch loss 1.52659988 epoch total loss 1.42895412
Trained batch 323 batch loss 1.48269153 epoch total loss 1.42912054
Trained batch 324 batch loss 1.52109659 epoch total loss 1.42940438
Trained batch 325 batch loss 1.45557547 epoch total loss 1.42948484
Trained batch 326 batch loss 1.48374844 epoch total loss 1.42965126
Trained batch 327 batch loss 1.46835196 epoch total loss 1.42976964
Trained batch 328 batch loss 1.4219203 epoch total loss 1.42974567
Trained batch 329 batch loss 1.43829012 epoch tota

Trained batch 436 batch loss 1.19287944 epoch total loss 1.42287552
Trained batch 437 batch loss 1.2714541 epoch total loss 1.4225291
Trained batch 438 batch loss 1.53685737 epoch total loss 1.42279
Trained batch 439 batch loss 1.72672629 epoch total loss 1.42348254
Trained batch 440 batch loss 1.45995855 epoch total loss 1.42356539
Trained batch 441 batch loss 1.35745478 epoch total loss 1.42341554
Trained batch 442 batch loss 1.43822873 epoch total loss 1.42344904
Trained batch 443 batch loss 1.51020074 epoch total loss 1.4236449
Trained batch 444 batch loss 1.51223445 epoch total loss 1.42384434
Trained batch 445 batch loss 1.51077902 epoch total loss 1.42403972
Trained batch 446 batch loss 1.40277052 epoch total loss 1.42399204
Trained batch 447 batch loss 1.47247982 epoch total loss 1.42410052
Trained batch 448 batch loss 1.45391238 epoch total loss 1.42416704
Trained batch 449 batch loss 1.4643569 epoch total loss 1.42425656
Trained batch 450 batch loss 1.28205502 epoch total los

Trained batch 557 batch loss 1.32332611 epoch total loss 1.42004657
Trained batch 558 batch loss 1.36575162 epoch total loss 1.41994917
Trained batch 559 batch loss 1.37340987 epoch total loss 1.41986597
Trained batch 560 batch loss 1.345222 epoch total loss 1.41973269
Trained batch 561 batch loss 1.3350656 epoch total loss 1.41958177
Trained batch 562 batch loss 1.27203238 epoch total loss 1.41931927
Trained batch 563 batch loss 1.35746658 epoch total loss 1.41920936
Trained batch 564 batch loss 1.37797034 epoch total loss 1.41913629
Trained batch 565 batch loss 1.41981411 epoch total loss 1.41913748
Trained batch 566 batch loss 1.44336116 epoch total loss 1.41918027
Trained batch 567 batch loss 1.45429718 epoch total loss 1.41924214
Trained batch 568 batch loss 1.38600481 epoch total loss 1.41918361
Trained batch 569 batch loss 1.36994636 epoch total loss 1.41909707
Trained batch 570 batch loss 1.48122263 epoch total loss 1.41920602
Trained batch 571 batch loss 1.33896744 epoch total

Trained batch 678 batch loss 1.26921535 epoch total loss 1.40762269
Trained batch 679 batch loss 1.33345342 epoch total loss 1.40751338
Trained batch 680 batch loss 1.28691196 epoch total loss 1.40733612
Trained batch 681 batch loss 1.30220282 epoch total loss 1.40718162
Trained batch 682 batch loss 1.2623539 epoch total loss 1.40696931
Trained batch 683 batch loss 1.24901605 epoch total loss 1.40673804
Trained batch 684 batch loss 1.33169568 epoch total loss 1.40662837
Trained batch 685 batch loss 1.28890157 epoch total loss 1.40645647
Trained batch 686 batch loss 1.34634042 epoch total loss 1.40636873
Trained batch 687 batch loss 1.25354803 epoch total loss 1.40614629
Trained batch 688 batch loss 1.36826015 epoch total loss 1.40609133
Trained batch 689 batch loss 1.41327405 epoch total loss 1.4061017
Trained batch 690 batch loss 1.4755702 epoch total loss 1.40620244
Trained batch 691 batch loss 1.47712517 epoch total loss 1.40630507
Trained batch 692 batch loss 1.45458746 epoch total

Trained batch 799 batch loss 1.36642122 epoch total loss 1.40158379
Trained batch 800 batch loss 1.34988964 epoch total loss 1.40151918
Trained batch 801 batch loss 1.47719812 epoch total loss 1.40161359
Trained batch 802 batch loss 1.35014927 epoch total loss 1.40154934
Trained batch 803 batch loss 1.30681467 epoch total loss 1.40143132
Trained batch 804 batch loss 1.54411125 epoch total loss 1.40160871
Trained batch 805 batch loss 1.46279037 epoch total loss 1.40168476
Trained batch 806 batch loss 1.43344688 epoch total loss 1.40172422
Trained batch 807 batch loss 1.54844546 epoch total loss 1.40190601
Trained batch 808 batch loss 1.38843989 epoch total loss 1.40188932
Trained batch 809 batch loss 1.39759374 epoch total loss 1.40188396
Trained batch 810 batch loss 1.37917137 epoch total loss 1.40185595
Trained batch 811 batch loss 1.41252589 epoch total loss 1.40186906
Trained batch 812 batch loss 1.35933256 epoch total loss 1.40181673
Trained batch 813 batch loss 1.37026715 epoch to

Trained batch 920 batch loss 1.10148668 epoch total loss 1.39453375
Trained batch 921 batch loss 1.05901456 epoch total loss 1.39416945
Trained batch 922 batch loss 1.20944285 epoch total loss 1.39396906
Trained batch 923 batch loss 1.48772764 epoch total loss 1.39407063
Trained batch 924 batch loss 1.45422912 epoch total loss 1.39413571
Trained batch 925 batch loss 1.41944659 epoch total loss 1.39416301
Trained batch 926 batch loss 1.34890592 epoch total loss 1.39411414
Trained batch 927 batch loss 1.25492096 epoch total loss 1.39396393
Trained batch 928 batch loss 1.30895 epoch total loss 1.39387238
Trained batch 929 batch loss 1.2463392 epoch total loss 1.39371359
Trained batch 930 batch loss 1.20804739 epoch total loss 1.39351392
Trained batch 931 batch loss 1.40982473 epoch total loss 1.39353132
Trained batch 932 batch loss 1.31428409 epoch total loss 1.39344633
Trained batch 933 batch loss 1.33827615 epoch total loss 1.3933872
Trained batch 934 batch loss 1.37165809 epoch total l

Trained batch 1041 batch loss 1.30172133 epoch total loss 1.39044952
Trained batch 1042 batch loss 1.34138107 epoch total loss 1.39040256
Trained batch 1043 batch loss 1.34346926 epoch total loss 1.39035761
Trained batch 1044 batch loss 1.41951013 epoch total loss 1.39038551
Trained batch 1045 batch loss 1.3677702 epoch total loss 1.39036393
Trained batch 1046 batch loss 1.42885637 epoch total loss 1.39040065
Trained batch 1047 batch loss 1.38524449 epoch total loss 1.39039576
Trained batch 1048 batch loss 1.23689675 epoch total loss 1.39024937
Trained batch 1049 batch loss 1.30853736 epoch total loss 1.39017153
Trained batch 1050 batch loss 1.21622646 epoch total loss 1.39000583
Trained batch 1051 batch loss 1.24399483 epoch total loss 1.38986695
Trained batch 1052 batch loss 1.27706861 epoch total loss 1.38975966
Trained batch 1053 batch loss 1.30008006 epoch total loss 1.38967454
Trained batch 1054 batch loss 1.29644918 epoch total loss 1.38958609
Trained batch 1055 batch loss 1.215

Trained batch 1161 batch loss 1.20726585 epoch total loss 1.38410449
Trained batch 1162 batch loss 1.26672 epoch total loss 1.38400352
Trained batch 1163 batch loss 1.45684826 epoch total loss 1.3840661
Trained batch 1164 batch loss 1.52525568 epoch total loss 1.38418746
Trained batch 1165 batch loss 1.41512752 epoch total loss 1.38421404
Trained batch 1166 batch loss 1.27862608 epoch total loss 1.38412356
Trained batch 1167 batch loss 1.57602787 epoch total loss 1.38428807
Trained batch 1168 batch loss 1.5148437 epoch total loss 1.38439989
Trained batch 1169 batch loss 1.47073781 epoch total loss 1.38447368
Trained batch 1170 batch loss 1.299878 epoch total loss 1.38440144
Trained batch 1171 batch loss 1.33805549 epoch total loss 1.38436174
Trained batch 1172 batch loss 1.30142629 epoch total loss 1.38429093
Trained batch 1173 batch loss 1.40224397 epoch total loss 1.38430631
Trained batch 1174 batch loss 1.49375856 epoch total loss 1.38439953
Trained batch 1175 batch loss 1.42117262 

Trained batch 1281 batch loss 1.39404726 epoch total loss 1.38021183
Trained batch 1282 batch loss 1.31679416 epoch total loss 1.38016236
Trained batch 1283 batch loss 1.2846899 epoch total loss 1.38008797
Trained batch 1284 batch loss 1.42605627 epoch total loss 1.38012373
Trained batch 1285 batch loss 1.37957072 epoch total loss 1.38012326
Trained batch 1286 batch loss 1.28403449 epoch total loss 1.38004851
Trained batch 1287 batch loss 1.41074467 epoch total loss 1.38007247
Trained batch 1288 batch loss 1.21648049 epoch total loss 1.3799454
Trained batch 1289 batch loss 1.24477 epoch total loss 1.37984049
Trained batch 1290 batch loss 1.26046348 epoch total loss 1.37974799
Trained batch 1291 batch loss 1.25589621 epoch total loss 1.37965202
Trained batch 1292 batch loss 1.3164655 epoch total loss 1.37960303
Trained batch 1293 batch loss 1.51770949 epoch total loss 1.37970984
Trained batch 1294 batch loss 1.52899635 epoch total loss 1.37982523
Trained batch 1295 batch loss 1.52532208

Validated batch 20 batch loss 1.38090563
Validated batch 21 batch loss 1.35661793
Validated batch 22 batch loss 1.38749623
Validated batch 23 batch loss 1.31820154
Validated batch 24 batch loss 1.331038
Validated batch 25 batch loss 1.372931
Validated batch 26 batch loss 1.23100567
Validated batch 27 batch loss 1.39439321
Validated batch 28 batch loss 1.36232567
Validated batch 29 batch loss 1.35804176
Validated batch 30 batch loss 1.45214748
Validated batch 31 batch loss 1.39438379
Validated batch 32 batch loss 1.39966047
Validated batch 33 batch loss 1.35993862
Validated batch 34 batch loss 1.42638552
Validated batch 35 batch loss 1.34632051
Validated batch 36 batch loss 1.30395007
Validated batch 37 batch loss 1.42641294
Validated batch 38 batch loss 1.41528106
Validated batch 39 batch loss 1.3277477
Validated batch 40 batch loss 1.52216959
Validated batch 41 batch loss 1.15463972
Validated batch 42 batch loss 1.41742599
Validated batch 43 batch loss 1.16645622
Validated batch 44 ba

Trained batch 19 batch loss 1.36958146 epoch total loss 1.33206713
Trained batch 20 batch loss 1.28870034 epoch total loss 1.32989883
Trained batch 21 batch loss 1.33737087 epoch total loss 1.33025467
Trained batch 22 batch loss 1.31773102 epoch total loss 1.32968545
Trained batch 23 batch loss 1.2166183 epoch total loss 1.3247695
Trained batch 24 batch loss 1.27658892 epoch total loss 1.32276189
Trained batch 25 batch loss 1.21640658 epoch total loss 1.31850767
Trained batch 26 batch loss 1.26509154 epoch total loss 1.31645322
Trained batch 27 batch loss 1.27298391 epoch total loss 1.31484318
Trained batch 28 batch loss 1.23505044 epoch total loss 1.31199348
Trained batch 29 batch loss 1.13507092 epoch total loss 1.30589271
Trained batch 30 batch loss 1.21413672 epoch total loss 1.30283415
Trained batch 31 batch loss 1.23776174 epoch total loss 1.30073512
Trained batch 32 batch loss 1.23057723 epoch total loss 1.29854262
Trained batch 33 batch loss 1.24426913 epoch total loss 1.296898

Trained batch 141 batch loss 1.24430084 epoch total loss 1.29954803
Trained batch 142 batch loss 1.15024638 epoch total loss 1.2984966
Trained batch 143 batch loss 1.26942742 epoch total loss 1.29829335
Trained batch 144 batch loss 1.33885038 epoch total loss 1.29857492
Trained batch 145 batch loss 1.28184831 epoch total loss 1.29845965
Trained batch 146 batch loss 1.35541177 epoch total loss 1.2988497
Trained batch 147 batch loss 1.25859761 epoch total loss 1.29857576
Trained batch 148 batch loss 1.19986653 epoch total loss 1.29790878
Trained batch 149 batch loss 1.15386057 epoch total loss 1.296942
Trained batch 150 batch loss 1.09030604 epoch total loss 1.29556441
Trained batch 151 batch loss 1.22328913 epoch total loss 1.29508567
Trained batch 152 batch loss 1.13171303 epoch total loss 1.29401088
Trained batch 153 batch loss 1.33465552 epoch total loss 1.29427648
Trained batch 154 batch loss 1.32782471 epoch total loss 1.29449439
Trained batch 155 batch loss 1.32336092 epoch total 

Trained batch 263 batch loss 1.19131339 epoch total loss 1.30241179
Trained batch 264 batch loss 1.28108025 epoch total loss 1.30233097
Trained batch 265 batch loss 1.24599266 epoch total loss 1.30211842
Trained batch 266 batch loss 1.28159571 epoch total loss 1.30204117
Trained batch 267 batch loss 1.29941928 epoch total loss 1.30203128
Trained batch 268 batch loss 1.14912415 epoch total loss 1.30146086
Trained batch 269 batch loss 1.18232977 epoch total loss 1.301018
Trained batch 270 batch loss 1.02555037 epoch total loss 1.29999769
Trained batch 271 batch loss 1.07812619 epoch total loss 1.29917908
Trained batch 272 batch loss 1.37474656 epoch total loss 1.29945683
Trained batch 273 batch loss 1.36953199 epoch total loss 1.29971361
Trained batch 274 batch loss 1.32065642 epoch total loss 1.29979
Trained batch 275 batch loss 1.30643201 epoch total loss 1.29981411
Trained batch 276 batch loss 1.29033113 epoch total loss 1.29977977
Trained batch 277 batch loss 1.1387651 epoch total lo

Trained batch 384 batch loss 1.31721342 epoch total loss 1.30221593
Trained batch 385 batch loss 1.40134072 epoch total loss 1.30247343
Trained batch 386 batch loss 1.2818085 epoch total loss 1.3024199
Trained batch 387 batch loss 1.38391304 epoch total loss 1.30263042
Trained batch 388 batch loss 1.31736183 epoch total loss 1.30266833
Trained batch 389 batch loss 1.23571169 epoch total loss 1.30249631
Trained batch 390 batch loss 1.23926282 epoch total loss 1.30233407
Trained batch 391 batch loss 1.27878153 epoch total loss 1.30227387
Trained batch 392 batch loss 1.18965816 epoch total loss 1.30198658
Trained batch 393 batch loss 1.14689517 epoch total loss 1.30159199
Trained batch 394 batch loss 1.13568389 epoch total loss 1.30117083
Trained batch 395 batch loss 1.20574939 epoch total loss 1.30092931
Trained batch 396 batch loss 1.24861765 epoch total loss 1.3007971
Trained batch 397 batch loss 1.25230908 epoch total loss 1.30067503
Trained batch 398 batch loss 1.35877824 epoch total

Trained batch 505 batch loss 1.44091535 epoch total loss 1.29967165
Trained batch 506 batch loss 1.31654668 epoch total loss 1.29970491
Trained batch 507 batch loss 1.27563095 epoch total loss 1.29965746
Trained batch 508 batch loss 1.29330468 epoch total loss 1.29964507
Trained batch 509 batch loss 1.37132251 epoch total loss 1.29978585
Trained batch 510 batch loss 1.29179049 epoch total loss 1.29977024
Trained batch 511 batch loss 1.29530895 epoch total loss 1.29976141
Trained batch 512 batch loss 1.31688142 epoch total loss 1.29979491
Trained batch 513 batch loss 1.25610089 epoch total loss 1.2997098
Trained batch 514 batch loss 1.33202064 epoch total loss 1.29977262
Trained batch 515 batch loss 1.28457665 epoch total loss 1.29974318
Trained batch 516 batch loss 1.38601065 epoch total loss 1.29991031
Trained batch 517 batch loss 1.3694458 epoch total loss 1.30004478
Trained batch 518 batch loss 1.42709184 epoch total loss 1.30029
Trained batch 519 batch loss 1.27228379 epoch total l

Trained batch 626 batch loss 1.32782984 epoch total loss 1.3025254
Trained batch 627 batch loss 1.31463909 epoch total loss 1.30254471
Trained batch 628 batch loss 1.2143867 epoch total loss 1.3024044
Trained batch 629 batch loss 1.49069417 epoch total loss 1.30270386
Trained batch 630 batch loss 1.40320039 epoch total loss 1.30286336
Trained batch 631 batch loss 1.38638985 epoch total loss 1.30299568
Trained batch 632 batch loss 1.27012646 epoch total loss 1.30294371
Trained batch 633 batch loss 1.27658296 epoch total loss 1.3029021
Trained batch 634 batch loss 1.2082696 epoch total loss 1.30275285
Trained batch 635 batch loss 1.36263049 epoch total loss 1.30284715
Trained batch 636 batch loss 1.18352556 epoch total loss 1.30265951
Trained batch 637 batch loss 1.19106185 epoch total loss 1.30248427
Trained batch 638 batch loss 1.16766131 epoch total loss 1.30227304
Trained batch 639 batch loss 1.12237608 epoch total loss 1.30199146
Trained batch 640 batch loss 1.19252455 epoch total l

Trained batch 748 batch loss 1.22196388 epoch total loss 1.29844487
Trained batch 749 batch loss 1.11435235 epoch total loss 1.29819906
Trained batch 750 batch loss 1.23360145 epoch total loss 1.29811299
Trained batch 751 batch loss 1.16174603 epoch total loss 1.29793131
Trained batch 752 batch loss 1.25380707 epoch total loss 1.29787266
Trained batch 753 batch loss 1.3338927 epoch total loss 1.29792047
Trained batch 754 batch loss 1.56665862 epoch total loss 1.2982769
Trained batch 755 batch loss 1.44712412 epoch total loss 1.29847407
Trained batch 756 batch loss 1.30496573 epoch total loss 1.29848266
Trained batch 757 batch loss 1.2617135 epoch total loss 1.29843414
Trained batch 758 batch loss 1.22755861 epoch total loss 1.29834056
Trained batch 759 batch loss 1.40372181 epoch total loss 1.29847944
Trained batch 760 batch loss 1.11581016 epoch total loss 1.29823911
Trained batch 761 batch loss 1.37338936 epoch total loss 1.29833782
Trained batch 762 batch loss 1.43193 epoch total lo

Trained batch 870 batch loss 1.22309649 epoch total loss 1.29686248
Trained batch 871 batch loss 1.29426718 epoch total loss 1.29685962
Trained batch 872 batch loss 1.33716774 epoch total loss 1.29690576
Trained batch 873 batch loss 1.31533217 epoch total loss 1.29692686
Trained batch 874 batch loss 1.36401391 epoch total loss 1.29700363
Trained batch 875 batch loss 1.34477353 epoch total loss 1.29705822
Trained batch 876 batch loss 1.40596485 epoch total loss 1.29718256
Trained batch 877 batch loss 1.41844904 epoch total loss 1.29732084
Trained batch 878 batch loss 1.34042764 epoch total loss 1.29737
Trained batch 879 batch loss 1.23668075 epoch total loss 1.29730093
Trained batch 880 batch loss 1.31316066 epoch total loss 1.29731894
Trained batch 881 batch loss 1.38165307 epoch total loss 1.29741466
Trained batch 882 batch loss 1.27661896 epoch total loss 1.29739106
Trained batch 883 batch loss 1.25753486 epoch total loss 1.297346
Trained batch 884 batch loss 1.18922925 epoch total l

Trained batch 991 batch loss 1.23657858 epoch total loss 1.29570615
Trained batch 992 batch loss 1.14849281 epoch total loss 1.29555774
Trained batch 993 batch loss 1.14401078 epoch total loss 1.29540515
Trained batch 994 batch loss 1.22315323 epoch total loss 1.29533243
Trained batch 995 batch loss 1.22651756 epoch total loss 1.29526329
Trained batch 996 batch loss 1.11639273 epoch total loss 1.29508364
Trained batch 997 batch loss 1.2242558 epoch total loss 1.29501259
Trained batch 998 batch loss 1.2394315 epoch total loss 1.2949568
Trained batch 999 batch loss 1.46963811 epoch total loss 1.29513168
Trained batch 1000 batch loss 1.32005179 epoch total loss 1.2951566
Trained batch 1001 batch loss 1.34655666 epoch total loss 1.29520798
Trained batch 1002 batch loss 1.51502454 epoch total loss 1.29542732
Trained batch 1003 batch loss 1.33881152 epoch total loss 1.2954706
Trained batch 1004 batch loss 1.32411695 epoch total loss 1.29549921
Trained batch 1005 batch loss 1.26688623 epoch t

Trained batch 1111 batch loss 1.13699663 epoch total loss 1.29391766
Trained batch 1112 batch loss 1.2657553 epoch total loss 1.29389226
Trained batch 1113 batch loss 1.39132512 epoch total loss 1.29397988
Trained batch 1114 batch loss 1.33639061 epoch total loss 1.29401791
Trained batch 1115 batch loss 1.33965969 epoch total loss 1.2940588
Trained batch 1116 batch loss 1.24737763 epoch total loss 1.29401708
Trained batch 1117 batch loss 1.26699162 epoch total loss 1.29399288
Trained batch 1118 batch loss 1.11324787 epoch total loss 1.29383123
Trained batch 1119 batch loss 1.08145607 epoch total loss 1.29364145
Trained batch 1120 batch loss 1.2240591 epoch total loss 1.29357922
Trained batch 1121 batch loss 1.43374741 epoch total loss 1.29370427
Trained batch 1122 batch loss 1.34610391 epoch total loss 1.29375088
Trained batch 1123 batch loss 1.24282718 epoch total loss 1.29370546
Trained batch 1124 batch loss 1.22428274 epoch total loss 1.29364371
Trained batch 1125 batch loss 1.18473

Trained batch 1231 batch loss 1.37439358 epoch total loss 1.28953969
Trained batch 1232 batch loss 1.18293047 epoch total loss 1.28945315
Trained batch 1233 batch loss 1.19468689 epoch total loss 1.28937638
Trained batch 1234 batch loss 1.2700659 epoch total loss 1.28936064
Trained batch 1235 batch loss 1.22876668 epoch total loss 1.28931153
Trained batch 1236 batch loss 1.29092717 epoch total loss 1.28931284
Trained batch 1237 batch loss 1.28209758 epoch total loss 1.289307
Trained batch 1238 batch loss 1.35155046 epoch total loss 1.2893573
Trained batch 1239 batch loss 1.37686443 epoch total loss 1.28942788
Trained batch 1240 batch loss 1.3474735 epoch total loss 1.28947473
Trained batch 1241 batch loss 1.40752864 epoch total loss 1.28956985
Trained batch 1242 batch loss 1.33089578 epoch total loss 1.28960311
Trained batch 1243 batch loss 1.27138972 epoch total loss 1.28958845
Trained batch 1244 batch loss 1.32977271 epoch total loss 1.28962076
Trained batch 1245 batch loss 1.2651844

Trained batch 1351 batch loss 1.24347329 epoch total loss 1.28709066
Trained batch 1352 batch loss 1.17835665 epoch total loss 1.28701019
Trained batch 1353 batch loss 1.22820735 epoch total loss 1.28696668
Trained batch 1354 batch loss 1.35998654 epoch total loss 1.28702068
Trained batch 1355 batch loss 1.2648468 epoch total loss 1.28700435
Trained batch 1356 batch loss 1.24756455 epoch total loss 1.28697526
Trained batch 1357 batch loss 1.29260814 epoch total loss 1.28697944
Trained batch 1358 batch loss 1.23489654 epoch total loss 1.28694105
Trained batch 1359 batch loss 1.17838359 epoch total loss 1.28686106
Trained batch 1360 batch loss 1.1891278 epoch total loss 1.28678918
Trained batch 1361 batch loss 1.24834549 epoch total loss 1.28676093
Trained batch 1362 batch loss 1.14636111 epoch total loss 1.28665781
Trained batch 1363 batch loss 1.15628445 epoch total loss 1.2865622
Trained batch 1364 batch loss 1.17660725 epoch total loss 1.28648162
Trained batch 1365 batch loss 1.21535

Validated batch 136 batch loss 1.39796066
Validated batch 137 batch loss 1.2643292
Validated batch 138 batch loss 1.17313933
Validated batch 139 batch loss 1.14950991
Validated batch 140 batch loss 1.29006219
Validated batch 141 batch loss 1.23511934
Validated batch 142 batch loss 1.15193772
Validated batch 143 batch loss 1.27443206
Validated batch 144 batch loss 1.2553972
Validated batch 145 batch loss 1.31942081
Validated batch 146 batch loss 1.33995485
Validated batch 147 batch loss 1.28945577
Validated batch 148 batch loss 1.22002661
Validated batch 149 batch loss 1.29362869
Validated batch 150 batch loss 1.28513622
Validated batch 151 batch loss 1.25363159
Validated batch 152 batch loss 1.30911875
Validated batch 153 batch loss 1.36427903
Validated batch 154 batch loss 1.25447094
Validated batch 155 batch loss 1.36983299
Validated batch 156 batch loss 1.21113729
Validated batch 157 batch loss 1.19432867
Validated batch 158 batch loss 1.2179482
Validated batch 159 batch loss 1.1644

### Dance time

#### Building the prediction engine


Now it is time to check how well the trained model predicts. Let's load a pretrained model.

In [None]:
model = StackedHourglassNetwork(IMAGE_SHAPE, 4, 1)

# WEIGHTS_PATH = os.path.join(PROJECT_PATH, 'models', 'model-v0.0.1-epoch-2-loss-1.3072.h5')
# model.load_weights(WEIGHTS_PATH)

# If you trained a model with the training code block above and want to use it, use the commented-out line below.

# model.load_weights(best_model_file) # when using best_model_file right after it is created
BEST_MODEL_PATH = os.path.join(PROJECT_PATH, 'best_models','model-epoch-3-loss-1.2536.h5')
model.load_weights(BEST_MODEL_PATH)

In [None]:
model.summary()

We have to use the same keypoints that were used for training, so we define the variables we need. Each variable stores the index of the corresponding body part.

In [None]:
R_ANKLE = 0
R_KNEE = 1
R_HIP = 2
L_HIP = 3
L_KNEE = 4
L_ANKLE = 5
PELVIS = 6
THORAX = 7
UPPER_NECK = 8
HEAD_TOP = 9
R_WRIST = 10
R_ELBOW = 11
R_SHOULDER = 12
L_SHOULDER = 13
L_ELBOW = 14
L_WRIST = 15

MPII_BONES = [
    [R_ANKLE, R_KNEE],
    [R_KNEE, R_HIP],
    [R_HIP, PELVIS],
    [L_HIP, PELVIS],
    [L_HIP, L_KNEE],
    [L_KNEE, L_ANKLE],
    [PELVIS, THORAX],
    [THORAX, UPPER_NECK],
    [UPPER_NECK, HEAD_TOP],
    [R_WRIST, R_ELBOW],
    [R_ELBOW, R_SHOULDER],
    [THORAX, R_SHOULDER],
    [THORAX, L_SHOULDER],
    [L_SHOULDER, L_ELBOW],
    [L_ELBOW, L_WRIST]
]

print('슝=3')
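
As a quick sanity check on the definitions above, the sketch below verifies that the 15 bones connect all 16 joints into a single skeleton. It restates the bone list as plain index pairs so that it runs on its own; `BONES`, `NUM_JOINTS`, and `is_connected` are hypothetical names used only for this illustration, not part of the notebook.

```python
from collections import deque

# Index pairs copied from MPII_BONES above (0 = R_ANKLE ... 15 = L_WRIST).
BONES = [
    (0, 1), (1, 2), (2, 6), (3, 6), (3, 4), (4, 5), (6, 7), (7, 8),
    (8, 9), (10, 11), (11, 12), (7, 12), (7, 13), (13, 14), (14, 15),
]
NUM_JOINTS = 16

def is_connected(bones, num_joints):
    """Breadth-first search from joint 0; True if every joint is reached."""
    neighbors = {j: [] for j in range(num_joints)}
    for a, b in bones:
        neighbors[a].append(b)
        neighbors[b].append(a)
    seen = {0}
    queue = deque([0])
    while queue:
        joint = queue.popleft()
        for nxt in neighbors[joint]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen) == num_joints

print(len(BONES), is_connected(BONES, NUM_JOINTS))  # 15 True
```

A connected graph with 16 nodes and 15 edges is a tree, so every joint is reachable and no bone is redundant.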

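Remember that when we trained the model, we converted the label coordinates into heatmaps? Because training used heatmaps, the model's inference output is also a heatmap, so we have to extract coordinates from it. All we need is the location of the maximum value in each heatmap, so let's write a function that finds it.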

In [None]:
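def find_max_coordinates(heatmaps):
    # heatmaps: (64, 64, 16) -> flatten the spatial dimensions to (4096, 16)
    flatten_heatmaps = tf.reshape(heatmaps, (-1, 16))
    # Flat index of the peak in each of the 16 keypoint channels
    indices = tf.math.argmax(flatten_heatmaps, axis=0)
    # Row-major layout: flat index = y * 64 + x
    y = indices // 64
    x = indices - 64 * y
    return tf.stack([x, y], axis=1).numpy()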

print('슝=3')
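
The same peak search can be checked on a synthetic heatmap with a known answer. The sketch below uses NumPy instead of TensorFlow and a hypothetical helper `find_peaks`; it assumes the `(height, width, channels)` layout used above.

```python
import numpy as np

def find_peaks(heatmaps):
    """Return a (C, 2) array of (x, y) peak positions, one row per channel."""
    height, width, channels = heatmaps.shape
    flat = heatmaps.reshape(-1, channels)      # (H*W, C), row-major
    indices = flat.argmax(axis=0)              # flat peak index per channel
    y, x = np.divmod(indices, width)           # flat index = y * width + x
    return np.stack([x, y], axis=1)

# Synthetic 64x64 heatmaps with 16 channels and one known peak per channel.
heatmaps = np.zeros((64, 64, 16), dtype=np.float32)
for c in range(16):
    heatmaps[c + 3, 2 * c, c] = 1.0            # peak at x = 2c, y = c + 3

peaks = find_peaks(heatmaps)
print(peaks[:3])
```

Each row of `peaks` should recover the `(x, y)` position that was planted in the corresponding channel.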

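The function above alone introduces quantization error, because the maximum of a 64x64 heatmap has to stand in for a position in a 256x256 image. In the actual computation we therefore use a 3x3 window around the peak to approximate a more precise position.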

In [None]:
def extract_keypoints_from_heatmap(heatmaps):
    max_keypoints = find_max_coordinates(heatmaps)

    # Zero-pad the spatial dimensions so a 3x3 window also fits around border peaks
    padded_heatmap = np.pad(heatmaps, [[1,1],[1,1],[0,0]], mode='constant')
    adjusted_keypoints = []
    for i, keypoint in enumerate(max_keypoints):
        # Shift by 1 to account for the padding
        max_y = keypoint[1]+1
        max_x = keypoint[0]+1

        # 3x3 window centered on the peak, with the peak itself masked out
        patch = padded_heatmap[max_y-1:max_y+2, max_x-1:max_x+2, i]
        patch[1][1] = 0

        # Strongest neighbor of the peak inside the window
        index = np.argmax(patch)

        next_y = index // 3
        next_x = index - next_y * 3
        # Move a quarter of a heatmap cell toward that neighbor
        delta_y = (next_y - 1) / 4
        delta_x = (next_x - 1) / 4

        adjusted_keypoint_x = keypoint[0] + delta_x
        adjusted_keypoint_y = keypoint[1] + delta_y
        adjusted_keypoints.append((adjusted_keypoint_x, adjusted_keypoint_y))

    # Keep coordinates inside the heatmap, then normalize to [0, 1]
    adjusted_keypoints = np.clip(adjusted_keypoints, 0, 64)
    normalized_keypoints = adjusted_keypoints / 64
    return normalized_keypoints

print('슝=3')
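
Each heatmap cell covers 256 / 64 = 4 input pixels, so a quarter-cell shift corresponds to one pixel in the input image. The sketch below replays the refinement step on a single-channel toy heatmap. `refine_peak` is a hypothetical helper written for this illustration, and it returns heatmap-cell coordinates without the clipping and normalization that the function above applies.

```python
import numpy as np

def refine_peak(heatmap):
    """Peak (x, y) of a 2-D heatmap, nudged 1/4 cell toward its best neighbor."""
    y, x = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    padded = np.pad(heatmap, 1, mode='constant')   # zero border for edge peaks
    patch = padded[y:y + 3, x:x + 3].copy()        # 3x3 window centered on the peak
    patch[1, 1] = 0                                # ignore the peak itself
    ny, nx = np.unravel_index(np.argmax(patch), patch.shape)
    return float(x + (nx - 1) / 4), float(y + (ny - 1) / 4)

heatmap = np.zeros((64, 64), dtype=np.float32)
heatmap[20, 10] = 1.0    # peak at x = 10, y = 20
heatmap[20, 11] = 0.6    # strongest neighbor is one cell to the right
print(refine_peak(heatmap))  # (10.25, 20.0)
```

When every neighbor ties, for example an isolated peak surrounded by zeros, `np.argmax` picks the top-left cell of the window, so the shifted coordinate can fall slightly below zero at the border. That is one case the `np.clip` call in the function above guards against.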

Now let's write a function that takes the model and an image path and returns the image together with its keypoints, so that prediction is convenient to use.

In [None]:
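def predict(model, image_path):
    encoded = tf.io.read_file(image_path)
    # Force 3 channels so grayscale JPEGs also decode to RGB (assumes the model takes RGB input)
    image = tf.io.decode_jpeg(encoded, channels=3)
    inputs = tf.image.resize(image, (256, 256))
    # Scale pixel values from [0, 255] to [-1, 1]
    inputs = tf.cast(inputs, tf.float32) / 127.5 - 1
    inputs = tf.expand_dims(inputs, 0)
    outputs = model(inputs, training=False)
    if not isinstance(outputs, list):
        outputs = [outputs]
    # Use the final output in the list and drop the batch dimension
    heatmap = tf.squeeze(outputs[-1], axis=0).numpy()
    kp = extract_keypoints_from_heatmap(heatmap)
    return image, kp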

print('슝=3')
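
The `/ 127.5 - 1` step maps 8-bit pixel values from [0, 255] to [-1, 1]. The sketch below shows that mapping and its inverse with NumPy; `normalize` and `denormalize` are hypothetical helper names, not functions from the notebook.

```python
import numpy as np

def normalize(pixels):
    """Map 8-bit values in [0, 255] to floats in [-1, 1]."""
    return pixels.astype(np.float32) / 127.5 - 1.0

def denormalize(values):
    """Inverse mapping back to 8-bit values, e.g. for display."""
    return np.clip(np.rint((values + 1.0) * 127.5), 0, 255).astype(np.uint8)

pixels = np.array([0, 51, 255], dtype=np.uint8)
scaled = normalize(pixels)
print(scaled)               # approximately [-1.0, -0.6, 1.0]
print(denormalize(scaled))  # back to the original 8-bit values
```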

We are almost there. All that is left is to draw the result. We will draw two kinds of pictures: the keypoints and the skeleton. The keypoints act as joints, and connecting the keypoints gives the skeleton.

Let's write one function for each.

In [None]:
def draw_keypoints_on_image(image, keypoints, index=None):
    fig, ax = plt.subplots(1)
    ax.imshow(image)
    for i, joint in enumerate(keypoints):
        # When `index` is given, draw only that joint
        if index is not None and index != i:
            continue
        # Keypoints are normalized to [0, 1]; scale them to pixel coordinates
        joint_x = joint[0] * image.shape[1]
        joint_y = joint[1] * image.shape[0]
        ax.scatter(joint_x, joint_y, s=10, c='red', marker='o')
    plt.show()

def draw_skeleton_on_image(image, keypoints, index=None):
    fig, ax = plt.subplots(1)
    ax.imshow(image)
    joints = []
    for joint in keypoints:
        joint_x = joint[0] * image.shape[1]
        joint_y = joint[1] * image.shape[0]
        joints.append((joint_x, joint_y))

    # Draw one line segment per bone between its two joints
    for bone in MPII_BONES:
        joint_1 = joints[bone[0]]
        joint_2 = joints[bone[1]]
        ax.plot([joint_1[0], joint_2[0]], [joint_1[1], joint_2[1]], linewidth=5, alpha=0.7)
    plt.show()

print('슝=3')
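
Both drawing functions scale the normalized keypoints by the image width (`image.shape[1]`) and height (`image.shape[0]`) before plotting. The sketch below isolates that arithmetic without matplotlib, using made-up keypoints and bones; `to_pixels` and `bone_segments` are hypothetical helpers for illustration only.

```python
def to_pixels(keypoints, width, height):
    """Scale normalized (x, y) pairs in [0, 1] to pixel coordinates."""
    return [(x * width, y * height) for x, y in keypoints]

def bone_segments(joints, bones):
    """Return ((x1, y1), (x2, y2)) endpoints for each bone index pair."""
    return [(joints[a], joints[b]) for a, b in bones]

# Three made-up normalized keypoints and two bones joining them.
keypoints = [(0.5, 0.25), (0.5, 0.5), (0.25, 0.75)]
bones = [(0, 1), (1, 2)]

joints = to_pixels(keypoints, width=640, height=480)
print(joints)                        # [(320.0, 120.0), (320.0, 240.0), (160.0, 360.0)]
print(bone_segments(joints, bones))
```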

Now let's check the model's performance on a test image.

In [None]:
test_image = os.path.join(PROJECT_PATH, 'test_image.jpg')

image, keypoints = predict(model, test_image)
draw_keypoints_on_image(image, keypoints)
draw_skeleton_on_image(image, keypoints)