Custom data #1

Open
weihaosky opened this issue Nov 30, 2023 · 10 comments

Comments

@weihaosky

Congratulations on this excellent work!
I wonder how to run this method on my own data. For example, after capturing a monocular video, how do I run your method, and how should I process the data for training?
Many thanks!

@JiahuiLei
Owner

Thanks for your interest. Our code was developed following InstantAvatar https://github.com/tijiang13/InstantAvatar, and we already have a data loader for their preprocessed data format. Our data loading subroutine for UBC data uses the InstantAvatarWildDataset class, which you can modify slightly and reuse for data produced by the InstantAvatar preprocessing pipeline. (It may take some time to compile OpenPose etc. for their preprocessing.)
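
For reference, an InstantAvatar-style preprocessed sequence (as produced by the conversion script shared later in this thread) contains an images/ folder, a masks/ folder, a poses_optimized.npz, and a cameras.npz. Below is a minimal sketch for inspecting such a sequence; the path is a hypothetical placeholder and this is not the repository's actual loader.

# Minimal sketch: peek at an InstantAvatar-style preprocessed sequence folder.
# Layout assumed from later in this thread:
#   <seq>/images/*.png, <seq>/masks/*.png, <seq>/poses_optimized.npz, <seq>/cameras.npz
import numpy as np
import os.path as osp

seq_dir = "../data/insav_wild/some_sequence"  # hypothetical path

poses = dict(np.load(osp.join(seq_dir, "poses_optimized.npz")))
# expected keys: betas (10,), global_orient (N, 3), body_pose (N, 69), transl (N, 3)
for k, v in poses.items():
    print(k, v.shape, v.dtype)

cams = dict(np.load(osp.join(seq_dir, "cameras.npz"), allow_pickle=True))
# expected keys: intrinsic (3, 3), extrinsic (4, 4), height, width
for k, v in cams.items():
    print(k, getattr(v, "shape", v))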

@weihaosky
Author

weihaosky commented Dec 1, 2023

Thanks for your reply. In the paper you write that you use ReFit to obtain the human pose. May I ask how to process the results from ReFit for training GART?

@JiahuiLei
Owner

We tried both the InstantAvatar preprocessing to estimate video poses (with temporal optimization) and ReFit https://yufu-wang.github.io/refit_humans/ to estimate per-frame poses. It turns out that on the challenging UBC sequences, both ours and InstantAvatar work better with ReFit poses. So we first estimate per-frame poses with ReFit and then manually convert them into the same format as the InstantAvatar preprocessing; the data loader therefore still loads the InstantAvatar preprocessing format.
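
The full conversion script is shared later in this thread; the core step is turning ReFit's per-frame rotation matrices into the axis-angle global_orient/body_pose that the InstantAvatar format stores. A minimal sketch of that step only, assuming a hypothetical ReFit output file with a pred_rotmat array of shape (N, 24, 3, 3):

# Minimal sketch (not the full conversion): SMPL rotation matrices -> axis-angle.
import numpy as np
import torch
from pytorch3d.transforms import matrix_to_axis_angle

pred_rotmat = np.load("refit_output.npz")["pred_rotmat"]          # hypothetical file, (N, 24, 3, 3)
aa = matrix_to_axis_angle(torch.from_numpy(pred_rotmat)).numpy()  # (N, 24, 3)
global_orient = aa[:, 0]                                          # (N, 3)
body_pose = aa[:, 1:].reshape(-1, 69)                             # (N, 69), InstantAvatar-style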

@weihaosky
Author

> We tried both the InstantAvatar preprocessing to estimate video poses (with temporal optimization) and ReFit https://yufu-wang.github.io/refit_humans/ to estimate per-frame poses. It turns out that on the challenging UBC sequences, both ours and InstantAvatar work better with ReFit poses. So we first estimate per-frame poses with ReFit and then manually convert them into the same format as the InstantAvatar preprocessing; the data loader therefore still loads the InstantAvatar preprocessing format.

Thanks. Could you share the script for converting the ReFit poses into the InstantAvatar format?

@weihaosky
Author

weihaosky commented Dec 4, 2023

I have tried to convert the ReFit results to the InstantAvatar format, but the converted data does not work. The conversion code is as follows:

# (runs inside the dataset's __getitem__; self.smpl_poses holds the ReFit results)
# self.K is built with focal = np.sqrt(height**2 + width**2) and the principal point at the image center
from transforms3d.axangles import mat2axangle  # assumed source of mat2axangle (rotation matrix -> axis, angle)

pose = []
for x in self.smpl_poses['pred_rotmat'][idx]:  # (24, 3, 3) per-joint rotation matrices
    d, angle = mat2axangle(x)
    pose.append(d * angle)
pose = np.stack(pose).astype(np.float32)  # (24, 3) axis-angle
# pose[0] = -pose[0]

ret = {
    "rgb": img.astype(np.float32),
    "mask": msk,
    "K": self.K.copy(),
    "smpl_beta": self.smpl_poses['pred_shape'][idx],  # ! use the first beta ???
    "smpl_pose": pose,
    "smpl_trans": self.smpl_poses["trans_full"][idx, 0],
    "idx": idx,
}

May I ask where the problem is? Many thanks!

@JiahuiLei
Owner

# convert our mono-pose estimation to ubc fashion dataset
import numpy as np
import os, os.path as osp
import imageio
from pytorch3d.transforms import matrix_to_axis_angle
import torch
from tqdm import tqdm
from pycocotools import mask as masktool
import cv2


def process(seq):
    img_src = f"../data/ubcfashion/train_frames/{seq}/"
    msk_fn = f"../data/ubcfashion/train_mask/{seq}.npy"
    pose_fn = f"../data/ubcfashion/train_smpl/{seq}.npz"
    dst = f"../data/insav_wild/ourpose_ubc_{seq}/"
    os.makedirs(dst, exist_ok=True)

    pose_data = np.load(pose_fn)
    smpl_shape = pose_data["pred_shape"].mean(0)  # Use the average shape
    smpl_pose_list, smpl_global_trans = (
        pose_data["pred_rotmat"],
        pose_data["pred_trans"],
    )
    smpl_pose_list = matrix_to_axis_angle(torch.from_numpy(smpl_pose_list))
    smpl_pose_list = smpl_pose_list.numpy()

    focal, center = pose_data["img_focal"], pose_data["img_center"]
    K = np.eye(3)
    K[0, 0], K[1, 1] = focal, focal
    K[0, 2], K[1, 2] = center[0], center[1]

    pose_save_dict = {
        "betas": smpl_shape,
        "global_orient": smpl_pose_list[:, 0],
        "body_pose": smpl_pose_list[:, 1:].reshape(-1, 69),
        "transl": smpl_global_trans.squeeze(1),
    }
    np.savez_compressed(osp.join(dst, "poses_optimized.npz"), **pose_save_dict)

    image_save_dst = osp.join(dst, "images")
    mask_save_dst = osp.join(dst, "masks")
    os.makedirs(image_save_dst, exist_ok=True)
    os.makedirs(mask_save_dst, exist_ok=True)

    mask_data = np.load(msk_fn, allow_pickle=True)
    masks = [masktool.decode(m).astype(bool).astype(np.float32) for m in mask_data]  # np.bool is removed in newer NumPy

    for img_fn in tqdm(sorted(os.listdir(img_src))):
        image_id = int(img_fn.split(".")[0])
        mask = masks[image_id]
        img = cv2.imread(osp.join(img_src, img_fn))
        cv2.imwrite(osp.join(image_save_dst, img_fn), img)
        cv2.imwrite(osp.join(mask_save_dst, img_fn), mask * 255)

    cam_save_dict = {
        "intrinsic": K,
        "extrinsic": np.eye(4),
        "height": img.shape[0],
        "width": img.shape[1],
    }
    np.savez_compressed(osp.join(dst, "cameras.npz"), **cam_save_dict)


if __name__ == "__main__":
    # seqs = sorted(os.listdir("../data/ubcfashion/train_frames"))
    seqs = ["91+bCFG1jOS"]

    for seq in seqs:
        process(seq)

# def load_insav_smpl_param(path):
#     smpl_params = dict(np.load(str(path)))
#     if "thetas" in smpl_params:
#         smpl_params["body_pose"] = smpl_params["thetas"][..., 3:]
#         smpl_params["global_orient"] = smpl_params["thetas"][..., :3]
#     return {
#         "betas": smpl_params["betas"].astype(np.float32).reshape(1, 10),
#         "body_pose": smpl_params["body_pose"].astype(np.float32),
#         "global_orient": smpl_params["global_orient"].astype(np.float32),
#         "transl": smpl_params["transl"].astype(np.float32),
#     }


# insav_pose_fn = "../data/insav_wild/91+20mY7UJS/poses_optimized.npz"
# load_insav_smpl_param(insav_pose_fn)
# insav_cam_fn = "../data/insav_wild/91+20mY7UJS/cameras.npz"
# insav_cam = dict(np.load(insav_cam_fn, allow_pickle=True))
# for k, v in insav_cam.items():
#     print(k, v.shape, v.dtype)

# print()

This is the script I used to convert the poses into the InstantAvatar format; I hope it helps.

@uniBruce

Hi, I am trying to use ReFit to get SMPL and camera parameters from monocular videos, and it seems that the focal length estimation relies heavily on some assumptions and cannot be used directly. The result is shown below. Could you please provide some advice on this case, or explain how you estimated the focal length for the UBC dataset?
[image: example]

@yufu-wang

yufu-wang commented Dec 21, 2023

@uniBruce Hi. When the ground-truth focal length is unavailable, we estimate it from the dimensions of the image as $\sqrt{h^2+w^2}$, as done here. Your image seems to be cropped from a larger image, so the focal estimate won't be accurate (this may not affect GART much, though). However, your ReFit result looks as expected in my experience, because ReFit is not trained with crop augmentation, so accuracy drops a bit when the human is cropped like this.
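
A minimal sketch of that fallback intrinsic matrix, with placeholder image dimensions:

# Fallback intrinsics when the true focal is unknown:
# focal = sqrt(h^2 + w^2), principal point at the image center.
import numpy as np

h, w = 1024, 768  # placeholder image size
focal = np.sqrt(h ** 2 + w ** 2)
K = np.array([
    [focal, 0.0, w / 2.0],
    [0.0, focal, h / 2.0],
    [0.0, 0.0, 1.0],
])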

I also added a script to the ReFit repo (here) that runs on a folder of images and saves pose results compatible with GART. Please give it a try.

@muximuxi

muximuxi commented Jan 30, 2024

> We tried both the InstantAvatar preprocessing to estimate video poses (with temporal optimization) and ReFit https://yufu-wang.github.io/refit_humans/ to estimate per-frame poses. It turns out that on the challenging UBC sequences, both ours and InstantAvatar work better with ReFit poses. So we first estimate per-frame poses with ReFit and then manually convert them into the same format as the InstantAvatar preprocessing; the data loader therefore still loads the InstantAvatar preprocessing format.

Hi! When I use data preprocessed by InstantAvatar and run ./scripts/fit.sh, the result is bad, like this. Can you give me some suggestions?
[image: test]

@anhnb206110

anhnb206110 commented Sep 4, 2024

Hi, when I use a real-person video with the da_pose canonical pose, preprocess the data like InstantAvatar, and train GART for more than 25,000 steps, the result is quite good. But if you zoom in on the image below, you can see RGB noise in the pants and shoes (the pants should be solid black, and the sleeves and shirt should be smooth). How should I edit the config or the loss function to make the image sharper, or does the problem lie in my ground-truth images? There is also a problem in the animation stage, shown in the second image.

This is my config in ubc_mlp.yaml

TOTAL_steps: 50000 #15000 #30000
SEED: 12345

VIZ_INTERVAL: 500

CANO_POSE_TYPE: da_pose #da_pose #t_pose #da_pose
VOXEL_DEFORMER_RES: 64 #128 #64 #64 #128 #64 #128 #64

W_CORRECTION_FLAG: True
W_REST_DIM: 32 #0 #16
W_REST_MODE: pose-mlp #delta-list #pose-mlp
W_MEMORY_TYPE: voxel #voxel #point

F_LOCALCODE_DIM: 0

MAX_SCALE: 1.0
MIN_SCALE: 0.0 #0.0003 #0.003 #3
MAX_SPH_ORDER: 4
INCREASE_SPH_STEP: [3000, 5000, 6000, 7000] #[3000, 5000, 6000, 7000] #[1000, 2000, 3000]

INIT_MODE: on_mesh #near_mesh #near_mesh
OPACITY_INIT_VALUE: 0.99

ONMESH_INIT_SUBDIVIDE_NUM: 1
ONMESH_INIT_SCALE_FACTOR: 1.0
ONMESH_INIT_THICKNESS_FACTOR: 0.5

NEARMESH_INIT_NUM: 10000
NEARMESH_INIT_STD: 0.1
SCALE_INIT_VALUE: 0.01 # only used for random init

###########################

LR_P: 0.00016
LR_P_FINAL: 0.0000016
LR_Q: 0.001
LR_S: 0.005
LR_O: 0.05

LR_SPH: 0.0025
LR_SPH_REST: 0.0005

W_START_STEP: 500 #1000 #500 #2000 #300 #2000
LR_W: 0.0002 # 1 # 0.00001
LR_W_FINAL: 0.00002

LR_W_REST: 0.0002
LR_W_REST_FINAL: 0.00002
LR_W_REST_BONES: 0.0003 # for mlp

LR_F_LOCAL: 0.0

# Pose Optimize
POSE_R_BASE_LR: 0.0001
POSE_R_BASE_LR_FINAL: 0.00001
POSE_R_REST_LR: 0.0003
POSE_R_REST_LR_FINAL: 0.00001
POSE_T_LR: 0.0001
POSE_T_LR_FINAL: 0.00001
POSE_OPTIMIZE_START_STEP: 500 #1000

# Reg Terms
LAMBDA_MASK: 0.0 #0.01
MASK_LOSS_PAUSE_AFTER_RESET: 100

# other optim
N_POSES_PER_STEP: 1 #50 #1 #3 # increasing this does not help
RAND_BG_FLAG: True #True #True #True
# DEFAULT_BG: [0.0, 0.0, 0.0]
NOVEL_VIEW_PITCH: 0.0
IMAGE_ZOOM_RATIO: 1.0
VIEW_BALANCE_FLAG: True #True # True #True #False
BOX_CROP_PAD: 50

# GS Control
# densify
MAX_GRAD: 0.0002 #0.0003 #0.0005 #0.0006 # 0.0002
PERCENT_DENSE: 0.005 #0.01
DENSIFY_START: 500
DENSIFY_INTERVAL: 100 #300 #500 #1000 #300
DENSIFY_END: 9000 #10000 #15000
# prune
PRUNE_START: 500
PRUNE_INTERVAL: 300
OPACIT_PRUNE_TH: 0.01
RESET_OPACITY_STEPS: [3000, 5000] #[3000, 5000] #5000 #3000
OPACIT_RESET_VALUE: 0.01
# regaussian
REGAUSSIAN_STD: 0.015 #0.02 #0.02 #0.01
REGAUSSIAN_STEPS: [7000, 14000]

CANONICAL_SPACE_REG_K: 6
LAMBDA_STD_Q: 0.01
LAMBDA_STD_S: 0.01

LAMBDA_STD_O: 0.01
LAMBDA_STD_CD: 0.03
LAMBDA_STD_CH: 0.03
# LAMBDA_STD_W: 0.3
# LAMBDA_STD_W_REST: 0.3
LAMBDA_STD_W: 0.3
LAMBDA_STD_W_REST: 0.1
LAMBDA_KNN_DIST: 0.00

LAMBDA_W_NORM: 0.01
LAMBDA_W_REST_NORM: 0.1

START_END_SKIP: [0, 400, 1]

[image: gart]

The walking animation is shown below. How can I fix the error when animating?

[image: novel_pose]
