# PIFuHD Demo: https://shunsukesaito.github.io/PIFuHD/

![](https://shunsukesaito.github.io/PIFuHD/resources/images/pifuhd.gif)

Made by [![Follow](https://img.shields.io/twitter/follow/psyth91?style=social)](https://twitter.com/psyth91)

To see how the model works, visit the project repository.

[![GitHub stars](https://img.shields.io/github/stars/facebookresearch/pifuhd?style=social)](https://github.com/facebookresearch/pifuhd)

## Note
Make sure that your runtime type is 'Python 3 with GPU acceleration'. To do so, go to Edit > Notebook settings > Hardware Accelerator > Select "GPU".
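
A quick way to confirm that the runtime actually has a GPU is sketched below. It assumes only that the `nvidia-smi` tool is on the PATH of GPU runtimes. The helper name `gpu_runtime_available` is hypothetical and not part of PIFuHD.

```python
import shutil
import subprocess


def gpu_runtime_available() -> bool:
    """Return True if `nvidia-smi` is installed and exits successfully."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    try:
        result = subprocess.run([exe], capture_output=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


print("GPU runtime detected:", gpu_runtime_available())
```

If this prints `False`, switch the hardware accelerator to GPU as described above and restart the runtime.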

## More Info
- Paper: https://arxiv.org/pdf/2004.00452.pdf
- Repo: https://github.com/facebookresearch/pifuhd
- Project Page: https://shunsukesaito.github.io/PIFuHD/
- 1-minute/5-minute Presentation (see below)

In [None]:
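# Install PyTorch built for CUDA 11.7
# Note: torchaudio is not pinned to a +cu117 build here, and the log below shows pip picking a ROCm wheel;
# pinning torchaudio==0.13.1+cu117 may avoid that.
!pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 -f https://download.pytorch.org/whl/torch_stable.html
# Install the pytorch3d wheel that matches CUDA 11.7 and PyTorch 1.13.1
!pip install pytorch3d -f https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/py310_cu117_pyt1131/download.html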

Looking in links: https://download.pytorch.org/whl/torch_stable.html
Collecting torch==1.13.1+cu117
  Downloading https://download.pytorch.org/whl/cu117/torch-1.13.1%2Bcu117-cp310-cp310-linux_x86_64.whl (1801.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.8/1.8 GB 843.2 kB/s eta 0:00:00
Collecting torchvision==0.14.1+cu117
  Downloading https://download.pytorch.org/whl/cu117/torchvision-0.14.1%2Bcu117-cp310-cp310-linux_x86_64.whl (24.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 24.3/24.3 MB 53.3 MB/s eta 0:00:00
Collecting torchaudio==0.13.1
  Downloading https://download.pytorch.org/whl/rocm5.2/torchaudio-0.13.1%2Brocm5.2-cp310-cp310-linux_x86_64.whl (3.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.9/3.9 MB 62.0 MB/s eta 0:00:00
Installing collected packages: torch, torchvision, torchaudio
  Attempting uninstall: torch
    Found exis

In [None]:
import IPython
IPython.display.HTML('<h2>1-Minute Presentation</h2><iframe width="720" height="405" src="https://www.youtube.com/embed/-1XYTmm8HhE" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe><br><h2>5-Minute Presentation</h2><iframe width="720" height="405" src="https://www.youtube.com/embed/uEDqCxvF5yc" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>')

## Requirements
- Python 3
- PyTorch (tested on 1.4.0; this notebook installs 1.13.1)
- json
- PIL
- skimage
- tqdm
- numpy
- cv2

## Help! I'm new to Google Colab

The following YouTube video shows how to upload your own picture and run PIFuHD. **Note that with the latest update, you can upload your own picture more easily using the GUI further down.**


In [None]:
import IPython
IPython.display.HTML('<iframe width="720" height="405" src="https://www.youtube.com/embed/LWDGR5v3-3o" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>')



## Clone PIFuHD repository

In [None]:
!git clone https://github.com/facebookresearch/pifuhd

Cloning into 'pifuhd'...
remote: Enumerating objects: 222, done.
remote: Counting objects: 100% (126/126), done.
remote: Compressing objects: 100% (44/44), done.
remote: Total 222 (delta 92), reused 82 (delta 82), pack-reused 96 (from 1)
Receiving objects: 100% (222/222), 399.39 KiB | 1.66 MiB/s, done.
Resolving deltas: 100% (114/114), done.


## Configure input data

In [None]:
cd /content/pifuhd/sample_images

/content/pifuhd/sample_images


**If you want to upload your own picture, run the next cell.** Otherwise, skip it and continue with the cell after it. Currently, PNG and JPEG files are supported.

In [None]:
from google.colab import files

filename = list(files.upload().keys())[0]

Saving 201608070500730.jpg to 201608070500730.jpg
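
Since only PNG and JPEG files are supported, it can help to check the uploaded name before going further. The sketch below is a minimal check under the assumption that the usual `.png`, `.jpg`, and `.jpeg` suffixes are what "PNG, JPEG" means here. `is_supported_image` is a hypothetical helper, not part of the notebook.

```python
import os

# Assumption: these are the suffixes meant by "PNG, JPEG files".
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def is_supported_image(name: str) -> bool:
    """Return True if the file name ends in a supported image extension."""
    extension = os.path.splitext(name)[1].lower()
    return extension in SUPPORTED_EXTENSIONS


print(is_supported_image("201608070500730.jpg"))
```

In the notebook, this could be called as `is_supported_image(filename)` right after the upload cell.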


In [None]:
import os

try:
    # Build the image path from the uploaded file name.
    image_path = '/content/pifuhd/sample_images/%s' % filename
except NameError:
    # No file was uploaded (filename is undefined), so fall back to the bundled sample image.
    image_path = '/content/pifuhd/sample_images/test.png'  # example image
# Directory that contains the image.
image_dir = os.path.dirname(image_path)
# Image file name without its extension.
file_name = os.path.splitext(os.path.basename(image_path))[0]

# Output paths
# Reconstructed OBJ mesh
obj_path = '/content/pifuhd/results/pifuhd_final/recon/result_%s_256.obj' % file_name
# Rendered PNG image
out_img_path = '/content/pifuhd/results/pifuhd_final/recon/result_%s_256.png' % file_name
# Rendered MP4 video
video_path = '/content/pifuhd/results/pifuhd_final/recon/result_%s_256.mp4' % file_name
# MP4 video re-encoded for display in the notebook
video_display_path = '/content/pifuhd/results/pifuhd_final/result_%s_256_display.mp4' % file_name
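
The four output paths share one naming pattern, which can be sketched as a single helper. This is an illustration only: it assumes the `_256` suffix tracks the resolution passed to the reconstruction step, and `result_paths` is a hypothetical name. `posixpath` is used because the Colab paths are POSIX paths.

```python
import posixpath


def result_paths(image_path, resolution=256,
                 root="/content/pifuhd/results/pifuhd_final"):
    """Derive the result file paths for one input image."""
    name = posixpath.splitext(posixpath.basename(image_path))[0]
    stem = "result_%s_%d" % (name, resolution)
    recon = posixpath.join(root, "recon")
    return {
        "obj": posixpath.join(recon, stem + ".obj"),
        "png": posixpath.join(recon, stem + ".png"),
        "mp4": posixpath.join(recon, stem + ".mp4"),
        # The display video sits one level above the recon folder.
        "display_mp4": posixpath.join(root, stem + "_display.mp4"),
    }


print(result_paths("/content/pifuhd/sample_images/test.png")["obj"])
```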


In [None]:
cd /content

/content


## Preprocess (for cropping image)

In [None]:
!git clone https://github.com/Daniil-Osokin/lightweight-human-pose-estimation.pytorch.git

Cloning into 'lightweight-human-pose-estimation.pytorch'...
remote: Enumerating objects: 124, done.
remote: Counting objects: 100% (34/34), done.
remote: Compressing objects: 100% (16/16), done.
remote: Total 124 (delta 21), reused 19 (delta 18), pack-reused 90 (from 1)
Receiving objects: 100% (124/124), 230.77 KiB | 5.77 MiB/s, done.
Resolving deltas: 100% (53/53), done.


In [None]:
cd /content/lightweight-human-pose-estimation.pytorch/

/content/lightweight-human-pose-estimation.pytorch


In [None]:
!wget https://download.01.org/opencv/openvino_training_extensions/models/human_pose_estimation/checkpoint_iter_370000.pth

--2024-11-03 15:54:09--  https://download.01.org/opencv/openvino_training_extensions/models/human_pose_estimation/checkpoint_iter_370000.pth
Resolving download.01.org (download.01.org)... 92.122.14.20, 2600:1409:9800:1689::a87, 2600:1409:9800:168c::a87
Connecting to download.01.org (download.01.org)|92.122.14.20|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 87959810 (84M) [application/octet-stream]
Saving to: ‘checkpoint_iter_370000.pth’


2024-11-03 15:54:10 (75.0 MB/s) - ‘checkpoint_iter_370000.pth’ saved [87959810/87959810]



In [None]:
import torch
import cv2
import numpy as np
from models.with_mobilenet import PoseEstimationWithMobileNet
from modules.keypoints import extract_keypoints, group_keypoints
from modules.load_state import load_state
from modules.pose import Pose, track_poses
import demo

# Extract a square crop region around each detected person
def get_rect(net, images, height_size):
    net = net.eval()  # put the network in evaluation mode

    stride = 8  # network stride
    upsample_ratio = 4  # upsampling ratio
    num_keypoints = Pose.num_kpts  # number of keypoint types (body joints)
    previous_poses = []  # list for poses from the previous frame
    delay = 33  # delay between frames
    for image in images:
        # Build the '_rect.txt' path by swapping out the image's file extension
        rect_path = image.replace('.%s' % (image.split('.')[-1]), '_rect.txt')
        img = cv2.imread(image, cv2.IMREAD_COLOR)  # read the image in color
        orig_img = img.copy()  # keep a copy of the original image
        heatmaps, pafs, scale, pad = demo.infer_fast(net, img, height_size, stride, upsample_ratio, cpu=False)
        # The network returns heatmaps and PAFs (Part Affinity Fields)

        total_keypoints_num = 0  # running total of keypoints
        all_keypoints_by_type = []  # keypoints grouped by type
        for kpt_idx in range(num_keypoints):  # loop over each keypoint type
            # Extract keypoints of this type and append them to the list
            total_keypoints_num += extract_keypoints(heatmaps[:, :, kpt_idx], all_keypoints_by_type, total_keypoints_num)

        # Group the extracted keypoints into per-person poses
        pose_entries, all_keypoints = group_keypoints(all_keypoints_by_type, pafs)
        for kpt_id in range(all_keypoints.shape[0]):
            # Undo the scaling and padding to get original image coordinates
            all_keypoints[kpt_id, 0] = (all_keypoints[kpt_id, 0] * stride / upsample_ratio - pad[1]) / scale
            all_keypoints[kpt_id, 1] = (all_keypoints[kpt_id, 1] * stride / upsample_ratio - pad[0]) / scale
        current_poses = []  # poses in the current frame

        rects = []  # extracted crop rectangles
        for n in range(len(pose_entries)):
            if len(pose_entries[n]) == 0:
                continue
            # Initialize this pose's keypoint coordinates to -1 (missing)
            pose_keypoints = np.ones((num_keypoints, 2), dtype=int) * -1
            valid_keypoints = []  # coordinates of detected keypoints
            for kpt_id in range(num_keypoints):
                if pose_entries[n][kpt_id] != -1.0:  # keypoint was found
                    pose_keypoints[kpt_id, 0] = int(all_keypoints[int(pose_entries[n][kpt_id]), 0])
                    pose_keypoints[kpt_id, 1] = int(all_keypoints[int(pose_entries[n][kpt_id]), 1])
                    valid_keypoints.append([pose_keypoints[kpt_id, 0], pose_keypoints[kpt_id, 1]])
            valid_keypoints = np.array(valid_keypoints)

            # If a leg keypoint (index 10 or 13) was detected, fit the square to all valid keypoints
            if pose_entries[n][10] != -1.0 or pose_entries[n][13] != -1.0:
                pmin = valid_keypoints.min(0)  # minimum coordinates
                pmax = valid_keypoints.max(0)  # maximum coordinates

                center = (0.5 * (pmax[:2] + pmin[:2])).astype(int)  # center of the square
                radius = int(0.65 * max(pmax[0]-pmin[0], pmax[1]-pmin[1]))  # half the side length of the square
            elif pose_entries[n][10] == -1.0 and pose_entries[n][13] == -1.0 and pose_entries[n][8] != -1.0 and pose_entries[n][11] != -1.0:
                # Legs are not visible: center the square on the pelvis (keypoints 8 and 11)
                center = (0.5 * (pose_keypoints[8] + pose_keypoints[11])).astype(int)
                radius = int(1.45 * np.sqrt(((center[None, :] - valid_keypoints) ** 2).sum(1)).max(0))
                center[1] += int(0.05 * radius)  # shift the center slightly downward
            else:
                # No reference keypoints: use the image center
                center = np.array([img.shape[1] // 2, img.shape[0] // 2])
                radius = max(img.shape[1] // 2, img.shape[0] // 2)

            # Compute the top-left corner and size of the square, then store it
            x1 = center[0] - radius
            y1 = center[1] - radius

            rects.append([x1, y1, 2 * radius, 2 * radius])

        # Save the rectangles to the text file
        np.savetxt(rect_path, np.array(rects), fmt='%d')
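
The crop arithmetic for the case where leg keypoints are visible can be followed with a small worked example. The sketch below mirrors that branch of `get_rect` in plain Python, without NumPy or the pose network. `square_crop` is a hypothetical helper, and the three keypoints are made-up values chosen only to show the calculation.

```python
def square_crop(keypoints, scale=0.65):
    """Return [x1, y1, width, height] of a square around (x, y) keypoints."""
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    # Center of the keypoints' bounding box.
    center_x = int(0.5 * (max(xs) + min(xs)))
    center_y = int(0.5 * (max(ys) + min(ys)))
    # Half the side length: 0.65 times the longer side of the bounding box.
    radius = int(scale * max(max(xs) - min(xs), max(ys) - min(ys)))
    return [center_x - radius, center_y - radius, 2 * radius, 2 * radius]


# Made-up keypoints: a bounding box 40 px wide and 400 px tall.
print(square_crop([(100, 50), (140, 60), (120, 450)]))
```

With these values the center is (120, 250) and the radius is 260, so the square starts at (-140, -10) with a side of 520. The top-left corner can be negative, because the square is sized from the taller dimension of the pose.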


In [None]:
# Create the MobileNet-based pose estimation network.
net = PoseEstimationWithMobileNet()

# Load the saved model checkpoint.
# 'checkpoint_iter_370000.pth' is loaded onto the CPU first.
checkpoint = torch.load('checkpoint_iter_370000.pth', map_location='cpu')

# Copy the checkpoint parameters into the network.
load_state(net, checkpoint)

# Run get_rect to extract the crop rectangle for the input image.
# The network is moved to the GPU; the arguments are the image path list and the input height (512).
get_rect(net.cuda(), [image_path], 512)
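
`get_rect` writes one line per detected person to a `_rect.txt` file next to the image, with four space-separated integers per line (`x1 y1 width height`), as follows from the `np.savetxt(..., fmt='%d')` call. The sketch below reads that format back so the result can be inspected. `parse_rects` is a hypothetical helper.

```python
def parse_rects(text):
    """Parse the contents of a '_rect.txt' file into (x1, y1, w, h) tuples."""
    rects = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 4:  # skip blank or malformed lines
            rects.append(tuple(int(part) for part in parts))
    return rects


print(parse_rects("-140 -10 520 520\n"))
```

In the notebook this could be applied to the file contents, for example `parse_rects(open(rect_file).read())`, where `rect_file` stands for the generated `_rect.txt` path. An empty list presumably means that no person was detected in the image.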


## Download the Pretrained Model

In [None]:
cd /content/pifuhd/

/content/pifuhd


In [None]:
!sh ./scripts/download_trained_model.sh

+ mkdir -p checkpoints
+ cd checkpoints
+ wget https://dl.fbaipublicfiles.com/pifuhd/checkpoints/pifuhd.pt pifuhd.pt
--2024-11-03 15:54:15--  https://dl.fbaipublicfiles.com/pifuhd/checkpoints/pifuhd.pt
Resolving dl.fbaipublicfiles.com (dl.fbaipublicfiles.com)... 3.163.189.96, 3.163.189.14, 3.163.189.51, ...
Connecting to dl.fbaipublicfiles.com (dl.fbaipublicfiles.com)|3.163.189.96|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1548375177 (1.4G) [application/octet-stream]
Saving to: ‘pifuhd.pt’


2024-11-03 15:54:21 (234 MB/s) - ‘pifuhd.pt’ saved [1548375177/1548375177]

--2024-11-03 15:54:21--  http://pifuhd.pt/
Resolving pifuhd.pt (pifuhd.pt)... failed: Name or service not known.
wget: unable to resolve host address ‘pifuhd.pt’
FINISHED --2024-11-03 15:54:21--
Total wall clock time: 6.4s
Downloaded: 1 files, 1.4G in 6.3s (234 MB/s)
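
The log above ends with a failed lookup of `pifuhd.pt`. This appears to be harmless: the script passes `pifuhd.pt` as a second argument, which wget treats as another URL, while the actual download had already finished. A minimal sketch for confirming that the checkpoint is complete is shown below. It compares the file size against the `Length` value reported in the log. `checkpoint_looks_complete` is a hypothetical helper, and the expected size would change if the hosted file is ever updated.

```python
import os

# "Length" reported by wget in the log above.
EXPECTED_BYTES = 1548375177


def checkpoint_looks_complete(path, expected=EXPECTED_BYTES):
    """Return True if the file exists and has exactly the expected size."""
    return os.path.isfile(path) and os.path.getsize(path) == expected


print(checkpoint_looks_complete("/content/pifuhd/checkpoints/pifuhd.pt"))
```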


## Run PIFuHD!


In [None]:
# Warning: all images under -i that have a corresponding rectangle file will be processed.
!python -m apps.simple_test -r 256 --use_rect -i $image_dir

# 256 seems to be the maximum resolution that fits into Google Colab.
# If you want to reconstruct a higher-resolution mesh, try running on your own machine.

Resuming from  ./checkpoints/pifuhd.pt
test data size:  1
initialize network with normal
initialize network with normal
generate mesh (test) ...
  0% 0/1 [00:00<?, ?it/s]./results/pifuhd_final/recon/result_201608070500730_256.obj
100% 1/1 [00:07<00:00,  7.80s/it]
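
Before rendering, a quick sanity check of the generated mesh can be useful. In the Wavefront OBJ format, vertex lines start with `v ` and face lines start with `f `, so counting them is enough to see whether the file is non-trivial. The sketch below uses a tiny inline sample. `count_obj_elements` is a hypothetical helper.

```python
def count_obj_elements(lines):
    """Count vertex ('v ') and face ('f ') lines in OBJ text lines."""
    vertices = 0
    faces = 0
    for line in lines:
        if line.startswith("v "):
            vertices += 1
        elif line.startswith("f "):
            faces += 1
    return vertices, faces


# Tiny inline sample: one triangle with a normal.
sample = ["v 0 0 0", "v 1 0 0", "v 0 1 0", "vn 0 0 1", "f 1 2 3"]
print(count_obj_elements(sample))
```

For the real result, pass an open file instead of the sample, for example `count_obj_elements(open(obj_path))`.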


## Render the result

In [None]:
from lib.colab_util import generate_video_from_obj, set_renderer, video

# Set up the renderer.
renderer = set_renderer()

# Generate a video from the OBJ file.
# The 3D mesh (obj_path) is rendered to an image (out_img_path)
# and a video file (video_path).
generate_video_from_obj(obj_path, out_img_path, video_path, renderer)

# The mp4 written by cv2 cannot be played back directly, so convert it with ffmpeg.
# The video is re-encoded with the h264 codec into a new file (video_display_path).
# '-y' overwrites an existing file, and '-loglevel quiet' suppresses log output.
!ffmpeg -i $video_path -vcodec libx264 $video_display_path -y -loglevel quiet

# Play the converted video.
video(video_display_path)




  0%|          | 0/90 [00:00<?, ?it/s]

## Tips for Inputs: My results are broken!

(Kudos to those who share results on Twitter with the [#pifuhd](https://twitter.com/search?q=%23pifuhd&src=recent_search_click&f=live) tag!)

Due to the limited variation in the training data, your results may sometimes be broken. Here are some tips for getting reasonable results.

*   Use a high-resolution image. The model is trained with 1024x1024 images, so use at least 512x512 with fine details. Low-resolution images and JPEG artifacts may give unsatisfactory results.
*   Use an image with a single person. If the image contains multiple people, reconstruction quality is likely to degrade.
*   A front-facing, standing pose works best (a fashion pose also works).
*   Make sure the entire body is within the image. (Note: missing legs are now partially supported.)
*   Make sure the input image is well lit. Extremely dark or bright images and strong shadows often create artifacts.
*   Keep the camera angle nearly parallel to the ground. A high camera position may result in distorted legs or high heels.
*   If the background is cluttered, use a simpler background or try removing it with https://www.remove.bg/ before processing.
*   The model is trained on humans only. Anime characters may not work well (to my surprise, many people have tried!).
*   Search Twitter for the [#pifuhd](https://twitter.com/search?q=%23pifuhd&src=recent_search_click&f=live) tag to get a better sense of what succeeds and what fails.
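
The resolution tip can be checked before uploading. The sketch below reads the width and height from a PNG header using only the standard library: a PNG file starts with an 8-byte signature followed by the `IHDR` chunk, which stores width and height as big-endian 32-bit integers at bytes 16 to 24. It covers PNG only, and the helper names `png_size` and `meets_min_resolution` are hypothetical.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(header: bytes):
    """Return (width, height) from the first 24 bytes of a PNG, or None."""
    if len(header) < 24:
        return None
    if header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        return None
    return struct.unpack(">II", header[16:24])


def meets_min_resolution(size, minimum=512):
    """Return True if both dimensions are at least `minimum` pixels."""
    return size is not None and min(size) >= minimum


# Synthetic header for a 1024x768 image, built only for illustration.
header = PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR" + struct.pack(">II", 1024, 768)
print(png_size(header), meets_min_resolution(png_size(header)))
```

For a real file, read the header with `open(path, "rb").read(24)` and pass it to `png_size`.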


## Share your result!
Please share your results on Twitter with the [#pifuhd](https://twitter.com/search?q=%23pifuhd&src=recent_search_click&f=live) tag. Sharing both good and bad results helps and encourages the authors to push further towards production-quality human digitization at home.
**The tweet button below doesn't attach the result video automatically, so please download the result video above and add it to the tweet manually.**

In [None]:
import IPython
IPython.display.HTML('<a href="https://twitter.com/intent/tweet?button_hashtag=pifuhd&ref_src=twsrc%5Etfw" class="twitter-hashtag-button" data-size="large" data-text="Google Colab Link: " data-url="https://bit.ly/37sfogZ" data-show-count="false">Tweet #pifuhd</a><script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>  (Don\'t forget to add your result to the tweet!)')

## Cool Applications
Special thanks to those who played with PIFuHD and came up with many creative applications! If you have made a cool application, please tweet your demo with [#pifuhd](https://twitter.com/search?q=%23pifuhd&src=recent_search_click&f=live). I'm constantly checking results there.
If you need a complete texture on the mesh, please try my previous work [PIFu](https://github.com/shunsukesaito/PIFu) as well! It supports 3D reconstruction and texturing from a single image, although the geometry quality may not be as good as PIFuHD.

In [None]:
IPython.display.HTML('<h2>Rigging (Mixamo) + Photoreal Rendering (Blender)</h2><blockquote class="twitter-tweet"><p lang="pt" dir="ltr">vcs ainda tem a PACHORRA de me dizer que eu não sei dançar<a href="https://twitter.com/hashtag/b3d?src=hash&amp;ref_src=twsrc%5Etfw">#b3d</a> <a href="https://twitter.com/hashtag/pifuhd?src=hash&amp;ref_src=twsrc%5Etfw">#pifuhd</a> <a href="https://t.co/kHCnLh6zxH">pic.twitter.com/kHCnLh6zxH</a></p>&mdash; lukas arendero (@lukazvd) <a href="https://twitter.com/lukazvd/status/1274810484798128131?ref_src=twsrc%5Etfw">June 21, 2020</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script><h2>FaceApp + Rigging (Mixamo)</h2><blockquote class="twitter-tweet"><p lang="ja" dir="ltr">カツラかぶってる自分に見える <a href="https://twitter.com/hashtag/pifuhd?src=hash&amp;ref_src=twsrc%5Etfw">#pifuhd</a> <a href="https://t.co/V8o7VduTiG">pic.twitter.com/V8o7VduTiG</a></p>&mdash; Shuhei Tsuchida (@shuhei2306) <a href="https://twitter.com/shuhei2306/status/1274507242910314498?ref_src=twsrc%5Etfw">June 21, 2020</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script><h2>Rigging (Mixamo) + AR (Adobe Aero)</h2><blockquote class="twitter-tweet"><p lang="ja" dir="ltr">写真→PIFuHD→Mixamo→AdobeAeroでサウンド付きARを作成。Zip化してLINEでARコンテンツを共有。<br>写真が1枚あれば簡単にARの3Dアニメーションが作れる時代…凄い。<a href="https://twitter.com/hashtag/PIFuHD?src=hash&amp;ref_src=twsrc%5Etfw">#PIFuHD</a> <a href="https://twitter.com/hashtag/AdobeAero?src=hash&amp;ref_src=twsrc%5Etfw">#AdobeAero</a> <a href="https://twitter.com/hashtag/Mixamo?src=hash&amp;ref_src=twsrc%5Etfw">#Mixamo</a> <a href="https://t.co/CbiMi4gZ0K">pic.twitter.com/CbiMi4gZ0K</a></p>&mdash; モジョン (@mojon1) <a href="https://twitter.com/mojon1/status/1273217947872317441?ref_src=twsrc%5Etfw">June 17, 2020</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script><h2>3D Printing</h2><blockquote class="twitter-tweet"><p lang="ja" dir="ltr"><a href="https://twitter.com/hashtag/pifuhd?src=hash&amp;ref_src=twsrc%5Etfw">#pifuhd</a> 楽しい〜<br>小さい自分プリントした <a href="https://t.co/4qyWuij0Hs">pic.twitter.com/4qyWuij0Hs</a></p>&mdash; isb (@vxzxzxzxv) <a href="https://twitter.com/vxzxzxzxv/status/1273136266406694913?ref_src=twsrc%5Etfw">June 17, 2020</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>')