# Pose Detection with AlphaPose

This notebook uses the open-source project [MVIG-SJTU/AlphaPose](https://github.com/MVIG-SJTU/AlphaPose) to detect and track multi-person poses in a given YouTube video.

For other deep-learning Colab notebooks, visit [tugstugi/dl-colab-notebooks](https://github.com/tugstugi/dl-colab-notebooks).


## Install AlphaPose

In [1]:
import os
from os.path import exists, join, basename, splitext

git_repo_url = 'https://github.com/MVIG-SJTU/AlphaPose.git'
project_name = splitext(basename(git_repo_url))[0]
if not exists(project_name):
  # clone and install dependencies
  !git clone -q -b pytorch --depth 1 $git_repo_url
  !cd $project_name && pip install -q -r requirements.txt
  !pip install -q youtube-dl visdom

import sys
sys.path.append(project_name)
import time
import matplotlib
import matplotlib.pylab as plt
plt.rcParams["axes.grid"] = False

from IPython.display import YouTubeVideo

ERROR: Could not find a version that satisfies the requirement torch==0.4.0 (from versions: 2.2.0, 2.2.1, 2.2.2, 2.3.0, 2.3.1, 2.4.0, 2.4.1, 2.5.0, 2.5.1, 2.6.0, 2.7.0, 2.7.1, 2.8.0, 2.9.0)
ERROR: No matching distribution found for torch==0.4.0
  Preparing metadata (setup.py) ... done
  Building wheel for visdom (setup.py) ... done

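The error in the output above shows that the pinned `torch==0.4.0` cannot be installed in this environment. As a sketch only, one way to make such a requirements file installable is to drop the version pins for selected packages before handing it to pip. The helper name `relax_pins` and the sample requirement lines are hypothetical; apart from `torch==0.4.0`, they are not taken from the AlphaPose repository.

```python
import re

def relax_pins(requirements_text, packages=("torch", "torchvision")):
    """Drop '==x.y.z' pins for the named packages; leave every other line unchanged."""
    relaxed = []
    for line in requirements_text.splitlines():
        # Capture the package name in front of an exact '==' pin, e.g. 'torch==0.4.0'
        match = re.match(r"^\s*([A-Za-z0-9_.\-]+)\s*==", line)
        if match and match.group(1).lower() in packages:
            relaxed.append(match.group(1))  # keep only the bare package name
        else:
            relaxed.append(line)
    return "\n".join(relaxed)

# Hypothetical sample, not the real requirements.txt
sample = "torch==0.4.0\ntorchvision==0.2.0\nvisdom"
print(relax_pins(sample))
```

Whether AlphaPose's `pytorch` branch runs correctly on a much newer PyTorch is a separate question; the source patches later in this notebook address some of the incompatibilities.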

## Download pretrained models

In [3]:
!pip -q install gdown

import os
from os.path import join, exists, dirname
import gdown

def gdrive_download(file_id: str, dst_path: str):
    os.makedirs(dirname(dst_path), exist_ok=True)
    url = f"https://drive.google.com/uc?id={file_id}"
    # fuzzy=True: handle folder/share links and similar URL forms flexibly
    gdown.download(url, dst_path, quiet=False, fuzzy=True)

# Usage example
pretrained_model_path = join(project_name, 'models/sppe/duc_se.pth')
if not exists(pretrained_model_path):
    gdrive_download('1OPORTWB2cwd5YTVBX-NE8fsauZJWsrtW', pretrained_model_path)

yolo_pretrained_model_path = join(project_name, 'models/yolo/yolov3-spp.weights')
if not exists(yolo_pretrained_model_path):
    gdrive_download('1D47msNOOiJKvPOXlnpyzdKA3k6E97NTC', yolo_pretrained_model_path)


Downloading...
From (original): https://drive.google.com/uc?id=1OPORTWB2cwd5YTVBX-NE8fsauZJWsrtW
From (redirected): https://drive.google.com/uc?id=1OPORTWB2cwd5YTVBX-NE8fsauZJWsrtW&confirm=t&uuid=80fdcea9-8620-4cb8-833a-46af5d81d15b
To: /content/AlphaPose/models/sppe/duc_se.pth
100%|██████████| 239M/239M [00:10<00:00, 23.9MB/s]
Downloading...
From (original): https://drive.google.com/uc?id=1D47msNOOiJKvPOXlnpyzdKA3k6E97NTC
From (redirected): https://drive.google.com/uc?id=1D47msNOOiJKvPOXlnpyzdKA3k6E97NTC&confirm=t&uuid=2d21aed7-028e-4790-87d7-8b0b26e75724
To: /content/AlphaPose/models/yolo/yolov3-spp.weights
100%|██████████| 252M/252M [00:05<00:00, 47.2MB/s]
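A Google Drive download can occasionally yield an HTML page (for example a quota or confirmation notice) instead of the requested file, and `exists()` alone would not catch that. The following is a minimal sketch of a sanity check that could be run after downloading; the helper names `looks_like_html` and `check_download` and the 1 MB threshold are assumptions, not part of gdown or AlphaPose.

```python
import os

def looks_like_html(path, probe_bytes=512):
    """Return True if the file starts like an HTML page rather than binary weights."""
    with open(path, "rb") as f:
        head = f.read(probe_bytes).lstrip().lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")

def check_download(path, min_bytes=1_000_000):
    """Raise if the file is missing, suspiciously small, or an HTML page; else return its size."""
    if not os.path.exists(path):
        raise FileNotFoundError(path)
    size = os.path.getsize(path)
    if size < min_bytes or looks_like_html(path):
        raise ValueError(f"{path} does not look like a model file ({size} bytes)")
    return size

# Example usage (paths as defined in the cell above):
# check_download(pretrained_model_path)
# check_download(yolo_pretrained_model_path)
```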


In [14]:
def download_from_google_drive(file_id, file_name):
  # download a file from the Google Drive link
  !rm -f ./cookie
  !curl -c ./cookie -s -L "https://drive.google.com/uc?export=download&id=$file_id" > /dev/null
  confirm_text = !awk '/download/ {print $NF}' ./cookie
  confirm_text = confirm_text[0]
  !curl -Lb ./cookie "https://drive.google.com/uc?export=download&confirm=$confirm_text&id=$file_id" -o $file_name



pretrained_model_path = join(project_name, 'models/sppe/duc_se.pth')
if not exists(pretrained_model_path):
  # download the pretrained model
  download_from_google_drive('1OPORTWB2cwd5YTVBX-NE8fsauZJWsrtW', pretrained_model_path)

yolo_pretrained_model_path = join(project_name, 'models/yolo/yolov3-spp.weights')
if not exists(yolo_pretrained_model_path):
  # download the YOLO weights
  download_from_google_drive('1D47msNOOiJKvPOXlnpyzdKA3k6E97NTC', yolo_pretrained_model_path)

## Detect poses on a test video

We are going to detect poses in the following YouTube video:

Download the above YouTube video, cut out the first 5 seconds, and run pose detection on that clip:

In [18]:
import pathlib, re
base = pathlib.Path('/content/AlphaPose')
for path in base.rglob("*.py"):
    s = path.read_text()
    s = s.replace("from collections import Iterable", "from collections.abc import Iterable")
    s = s.replace("from collections import Mapping", "from collections.abc import Mapping")
    path.write_text(s)
print("collections.* -> collections.abc patch complete")


collections.* -> collections.abc patch complete
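The patch above is needed because the abstract base class aliases in `collections` (such as `Iterable` and `Mapping`) were removed in Python 3.10 and now live only in `collections.abc`. As a sketch, the same idea can be expressed as a pure function on source text, which is easier to test than in-place file edits. The function name `patch_collections_imports` and the `ABC_NAMES` set are illustrative choices; the sketch only rewrites lines whose imported names are all abstract base classes and leaves mixed imports untouched.

```python
import re

# Abstract base classes that must be imported from collections.abc on Python 3.10+
ABC_NAMES = {
    "Iterable", "Iterator", "Mapping", "MutableMapping", "Sequence",
    "MutableSequence", "Callable", "Hashable", "Sized", "Container",
    "Set", "MutableSet",
}

_IMPORT_RE = re.compile(r"^([ \t]*)from collections import (.+)$", re.MULTILINE)

def patch_collections_imports(source):
    """Rewrite 'from collections import <ABCs>' to use collections.abc."""
    def fix(match):
        indent, names = match.group(1), match.group(2)
        imported = [name.strip() for name in names.split(",")]
        if all(name in ABC_NAMES for name in imported):
            return f"{indent}from collections.abc import {names}"
        return match.group(0)  # e.g. OrderedDict stays in collections
    return _IMPORT_RE.sub(fix, source)

print(patch_collections_imports("from collections import Iterable\n"))
print(patch_collections_imports("from collections import OrderedDict\n"))
```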


In [24]:
import pathlib, re

p = pathlib.Path('/content/AlphaPose/fn.py')
s = p.read_text()

# 1) If a try-except block around the torch._six import exists, remove it entirely and replace it
s = re.sub(
    r"try:\s*?\n\s*from\s+torch\._six\s+import\s+string_classes,\s*int_classes\s*?\n\s*except\s+.*?:\s*?\n\s*string_classes\s*=\s*\(str,\)\s*?\n\s*int_classes\s*=\s*\(int,\)",
    "string_classes = (str,)\nint_classes = (int,)",
    s,
    flags=re.DOTALL
)

# 2) If a bare import line exists, replace that as well
s = re.sub(
    r"from\s+torch\._six\s+import\s+string_classes,\s*int_classes",
    "string_classes = (str,)\nint_classes = (int,)",
    s
)

p.write_text(s)
print("✅ torch._six dependency removed (fn.py)")

# Remove .pyc caches (guards against possible stale-cache problems)
import subprocess, os
subprocess.run(['bash','-lc','find /content/AlphaPose -name "*.pyc" -delete'])


✅ torch._six dependency removed (fn.py)


CompletedProcess(args=['bash', '-lc', 'find /content/AlphaPose -name "*.pyc" -delete'], returncode=0)
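To illustrate what the second substitution above does, here is a small sketch that applies the same pattern to a sample source string and then executes the result. The sample source and the `is_text` function are made up for the demonstration and are not the contents of `fn.py`.

```python
import re

PATTERN = r"from\s+torch\._six\s+import\s+string_classes,\s*int_classes"
REPLACEMENT = "string_classes = (str,)\nint_classes = (int,)"

# Hypothetical stand-in for a module that imports from torch._six
sample_source = (
    "from torch._six import string_classes, int_classes\n"
    "\n"
    "def is_text(x):\n"
    "    return isinstance(x, string_classes)\n"
)

patched = re.sub(PATTERN, REPLACEMENT, sample_source)
print(patched)

# Execute the patched source to confirm it no longer needs torch._six
namespace = {}
exec(patched, namespace)
print(namespace["is_text"]("abc"), namespace["is_text"](3))  # True False
```

One limitation worth keeping in mind: the replacement consists of two unindented lines, so this substitution only produces valid Python when the import sits at module level. An indented import (for instance inside a `try:` block that the first pattern failed to match) would need the replacement indented to the same depth.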

In [29]:
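# Re-encode youtube.mp4 to video.mp4 (assumes youtube.mp4 already exists).
# Note: this command keeps the full video; adding `-t 5` before the output path would keep only the first 5 seconds.
!ffmpeg -y -loglevel error -i /content/youtube.mp4 /content/video.mp4


# Delete the previous result (file only)
!rm -f /content/AlphaPose_video.avi

# Run AlphaPose
!cd "$project_name" && python3 video_demo.py --sp --video ../video.mp4 --outdir .. --save_video

# Convert the result to MP4
!ffmpeg -y -loglevel error -i /content/AlphaPose_video.avi /content/AlphaPose_video.mp4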


Loading YOLO model..
Loading pose model from ./models/sppe/duc_se.pth
100% 451/451 [00:49<00:00,  9.18it/s]


Finally, visualize the result:

In [30]:
def show_local_mp4_video(file_name, width=640, height=480):
  # Embed a local MP4 file in the notebook as a base64 data URI
  import base64
  from IPython.display import HTML
  with open(file_name, 'rb') as f:  # close the file once it has been read
    video_encoded = base64.b64encode(f.read())
  return HTML(data='''<video width="{0}" height="{1}" alt="test" controls>
                        <source src="data:video/mp4;base64,{2}" type="video/mp4" />
                      </video>'''.format(width, height, video_encoded.decode('ascii')))

show_local_mp4_video('AlphaPose_video.mp4', width=960, height=720)

Next, run AlphaPose on the full video to produce keypoints for use with MoCoDAD:

In [35]:
!cd /content/AlphaPose && python3 video_demo.py \
    --sp \
    --video ../youtube.mp4 \
    --outdir ../ \
    --save_video \
    --vis_fast

Loading YOLO model..
Loading pose model from ./models/sppe/duc_se.pth
100% 451/451 [00:46<00:00,  9.60it/s]
✅ Saved keypoints (per-frame, variable persons) to: ../youtube_keypoints_per_frame.npy
✅ Saved keypoints (T,1,K,2) for MoCoDAD to: ../youtube_keypoints_Tx1xKx2.npy


In [None]:
/content/youtube_keypoints_Tx1xKx2.npy

In [None]:
/content/youtube_keypoints_per_frame.npy
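The two files listed above hold the per-frame keypoints (variable number of persons) and a `(T, 1, K, 2)` array for MoCoDAD. The notebook does not show how the second array was derived from the first, so the following is only a sketch of one plausible conversion, run on synthetic data. It assumes that the first detected person in each frame is kept, that frames without detections are filled with zeros, and that `K = 17` keypoints are used; the function name `to_single_person_track` is hypothetical.

```python
import numpy as np

def to_single_person_track(per_frame, num_keypoints):
    """Build a (T, 1, K, 2) array from a list of per-frame (N_t, K, 2) arrays.

    Assumption: keep the first person in each frame; frames with no
    detections stay all-zero.
    """
    track = np.zeros((len(per_frame), 1, num_keypoints, 2), dtype=np.float32)
    for t, people in enumerate(per_frame):
        people = np.asarray(people, dtype=np.float32)
        if people.size:
            track[t, 0] = people.reshape(-1, num_keypoints, 2)[0]
    return track

# Synthetic stand-in for the per-frame data: 3 frames with 2, 0, and 1 persons
K = 17  # assumed keypoint count
frames = [np.random.rand(2, K, 2), np.zeros((0, K, 2)), np.random.rand(1, K, 2)]
track = to_single_person_track(frames, K)
print(track.shape)  # (3, 1, 17, 2)
```

When loading the real files, `np.load('/content/youtube_keypoints_Tx1xKx2.npy')` should be enough for the fixed-shape array. If the per-frame file was saved as an object array, which is likely when the person count varies between frames, `np.load` would need `allow_pickle=True`.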