# Text2GART

In [1]:
%load_ext autoreload
%autoreload 2
import os, os.path as osp
from solver import TGFitter
import torch
from viz_utils import viz_human_all

In [2]:
from lib_guidance.mvdream.mvdream_guidance import MVDream

# guidance = MVDream(torch.device("cuda"), fp16=True, n_view=4, t_range=[0.02, 0.98])  # fp16=True seems to reduce runtime by about 20%
guidance = MVDream(torch.device("cuda"), fp16=False, n_view=4, t_range=[0.02, 0.98])

Loading MVDream
Loading model from config: sd-v2-base.yaml
No module 'xformers'. Proceeding without it.
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla' with 512 in_channels
Loading model from cache file: /mnt/kostas_home/leijh/.cache/huggingface/hub/models--MVDream--MVDream/snapshots/d14ac9d78c48c266005729f2d5633f6c265da467/sd-v2.1-base-4view.pt


Loading pipeline components...:   0%|          | 0/6 [00:00<?, ?it/s]

In [3]:
texts = [
    "A yellow CyberPunk robot, silver skeleton",
    "A frog character from a game",
    # "A red Gundam Zaku, with a jet backpack",
    # "A doctor in green surgical uniform",
    # "A policeman in blue uniform",
    # "A white humanoid robot from the 23rd century with red eyes",
    # "A silver robot with single red eye like hal9000",
    # "An energetic computer science professor, with a blue t-shirt, black jeans",
    # "A knight with a sword in his hand",
    # "Skywalker",
    # "A pink cute ellipsoid robot",
    # "Spiderman with batman suit",
    # "A walking Banana",
    # "A dinosaur",
    # "A dog",
]
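
The output of cell `In [4]` below contains `cp: target 'skeleton/backup/' is not a directory` and `cp: target 'game/backup/' is not a directory`, where `skeleton` and `game` are the last words of the two prompts. That pattern is consistent with the log directory name, which embeds the prompt verbatim, reaching a shell command unquoted, although the notebook does not show where `cp` is invoked. A minimal sketch of one possible workaround, assuming that diagnosis is right; `prompt_to_dirname` and `max_len` are hypothetical names and not part of the repository:

```python
import re


def prompt_to_dirname(prompt, max_len=64):
    """Turn a free-form prompt into a name that is safe in unquoted shell paths."""
    # Collapse every run of characters other than letters, digits,
    # '.', '_' and '-' into a single underscore, then trim the ends.
    slug = re.sub(r"[^A-Za-z0-9._-]+", "_", prompt).strip("_")
    # Fall back to a fixed name if nothing usable is left.
    return slug[:max_len] or "prompt"


print(prompt_to_dirname("A yellow CyberPunk robot, silver skeleton"))
print(prompt_to_dirname("A frog character from a game"))
```

The result could then replace the raw prompt in the directory name, for example `log_dir = f"./logs/{base_name}/{log_id}_{prompt_to_dirname(txt)}"`, while the unmodified `txt` is still passed to `guidance.set_text_embeds`.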

In [4]:
profile_fn = "./profiles/text2gart/gen.yaml"
base_name = "text2gart"
# Alternatively, try the profile below, which adds camera locations near the hands; it does not seem to help much.
# profile_fn = "./profiles/text2gart/gen_hand.yaml"
# base_name = "text2gart_hand_enhance"

os.makedirs(f"./logs/{base_name}", exist_ok=True)

for txt in texts:
    torch.cuda.empty_cache()
    print(txt)
    guidance.set_text_embeds(
        [txt + "; smooth texture, clean image, single subject image, black background"],
        [
            "ugly, bad anatomy, blurry, pixelated obscure, unnatural colors, poor lighting, dull, and unclear, lowres, low quality, artifacts, duplicate, morbid, mutilated, poorly drawn face, dehydrated, bad proportions"
        ],  # "cropped" and "deformed" were removed from this negative prompt
    )

    log_id = len(os.listdir(f"./logs/{base_name}")) + 1
    log_dir = f"./logs/{base_name}/{log_id}_{txt}"

    solver = TGFitter(
        log_dir=log_dir,
        profile_fn=profile_fn,
        mode="human",
        template_model_path="./data/smpl_model/SMPL_MALE.pkl",
        device=torch.device("cuda:0"),
        NO_TB=True,
    )

    data_provider = solver.prepare_fake_data(
        mode="amass",
        pose_dirs=["./data/amass/BMLrub/"],
        da_pose_prob=0.0,
    )

    _, _ = solver.run(fake_data_provider=data_provider, guidance=guidance)
    viz_human_all(solver, None, training_skip=1)

A yellow CyberPunk robot, silver skeleton


cp: target 'skeleton/backup/' is not a directory


Caching ./data/amass/BMLrub/ ...


100%|██████████| 112/112 [00:13<00:00,  8.12it/s]


Database has poses 3763367 with DA-pose prob 0.0 and range [0.0, 0.7853981633974483]
Using predefined pose: a_pose


| kd-a40-0 | INFO | Nov-27-20:08:35 | Init 20000 Components inside the mesh   [init_helpers.py:152]
| kd-a40-0 | INFO | Nov-27-20:08:35 | Init with 20000 Gaussians   [solver.py:225]
| kd-a40-0 | INFO | Nov-27-20:08:35 | Start training at 1701115715.580202   [solver.py:797]
  6%|▌         | 75/1300 [00:31<18:07,  1.13it/s]
| kd-a40-0 | INFO | Nov-27-20:09:07 | Densify: Clone[+] 5066, Split[+] 0   [model.py:682]
| kd-a40-0 | INFO | Nov-27-20:09:07 | Densify: 20000->25066   [solver.py:914]
 12%|█▏        | 150/1300 [01:00<13:50,  1.38it/s]
| kd-a40-0 | INFO | Nov-27-20:09:36 | Densify: Clone[+] 6838, Split[+] 0   [model.py:682]
| kd-a40-0 | INFO | Nov-27-20:09:36 | Densify: 25066->31904   [solver.py:914]
 15%|█▌        | 200/1300 [01:21<14:46,  1.24it/s]
| kd-a40-0 | INFO | Nov-27-20:09:57 | Prune: 231   [model.py:734]
| kd-a40-0 | INFO | Nov-27-20:09:57 | Prune: 31904->31673   [solver.py:923]
 17%|█▋        | 225/1300 [01:31<12:55,  1.39it/s]
| kd-a40-0 | INFO | Nov-27-20:10:07 | Densify: Cl

Using predefined pose: a_pose


| kd-a40-0 | INFO | Nov-27-20:18:51 | Init 20000 Components inside the mesh   [init_helpers.py:152]
| kd-a40-0 | INFO | Nov-27-20:18:51 | Init with 20000 Gaussians   [solver.py:225]
| kd-a40-0 | INFO | Nov-27-20:18:51 | After loading:   [solver.py:174]
| kd-a40-0 | INFO | Nov-27-20:18:51 | [_xyz:280.1K] [_rotation:373.4K] [_scaling:280.1K] [_opacity:93.4K] [_features_dc:280.1K] [_features_rest:840.2K] [_w_correction_dc:0.0K] [_w_correction_rest:0.0K] [_features_localcode:0.0K] [template.voxel_deformer.voxel_w_correction:6000.0K] [template.voxel_deformer.additional_correction:2000.0K]    [model.py:261]


Viz spinning...
Viz novel cano-pose...
Viz novel t-pose...
Viz novel da-pose...
Viz novel seq...
A frog character from a game


cp: target 'game/backup/' is not a directory


Caching ./data/amass/BMLrub/ ...


100%|██████████| 112/112 [00:09<00:00, 12.26it/s]


Database has poses 3763367 with DA-pose prob 0.0 and range [0.0, 0.7853981633974483]
Using predefined pose: a_pose


| kd-a40-0 | INFO | Nov-27-20:20:11 | Init 20000 Components inside the mesh   [init_helpers.py:152]
| kd-a40-0 | INFO | Nov-27-20:20:11 | Init with 20000 Gaussians   [solver.py:225]
| kd-a40-0 | INFO | Nov-27-20:20:11 | Start training at 1701116411.243235   [solver.py:797]
  6%|▌         | 75/1300 [00:30<15:57,  1.28it/s]
| kd-a40-0 | INFO | Nov-27-20:20:41 | Densify: Clone[+] 2899, Split[+] 0   [model.py:682]
| kd-a40-0 | INFO | Nov-27-20:20:41 | Densify: 20000->22899   [solver.py:914]
 12%|█▏        | 150/1300 [00:59<13:33,  1.41it/s]
| kd-a40-0 | INFO | Nov-27-20:21:11 | Densify: Clone[+] 4482, Split[+] 80   [model.py:682]
| kd-a40-0 | INFO | Nov-27-20:21:11 | Densify: 22899->27421   [solver.py:914]
 15%|█▌        | 200/1300 [01:19<15:31,  1.18it/s]
| kd-a40-0 | INFO | Nov-27-20:21:31 | Prune: 203   [model.py:734]
| kd-a40-0 | INFO | Nov-27-20:21:31 | Prune: 27421->27218   [solver.py:923]
 17%|█▋        | 225/1300 [01:29<12:48,  1.40it/s]
| kd-a40-0 | INFO | Nov-27-20:21:41 | Densify: C

Using predefined pose: a_pose


| kd-a40-0 | INFO | Nov-27-20:30:04 | Init 20000 Components inside the mesh   [init_helpers.py:152]
| kd-a40-0 | INFO | Nov-27-20:30:04 | Init with 20000 Gaussians   [solver.py:225]
| kd-a40-0 | INFO | Nov-27-20:30:04 | After loading:   [solver.py:174]
| kd-a40-0 | INFO | Nov-27-20:30:04 | [_xyz:225.7K] [_rotation:300.9K] [_scaling:225.7K] [_opacity:75.2K] [_features_dc:225.7K] [_features_rest:677.1K] [_w_correction_dc:0.0K] [_w_correction_rest:0.0K] [_features_localcode:0.0K] [template.voxel_deformer.voxel_w_correction:6000.0K] [template.voxel_deformer.additional_correction:2000.0K]    [model.py:261]


Viz spinning...
Viz novel cano-pose...
Viz novel t-pose...
Viz novel da-pose...
Viz novel seq...
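
The `After loading:` summary lines in the logs above report one size per tensor, such as `[_xyz:280.1K]` and `[_opacity:93.4K]`. The sketch below parses such a line and divides each size by the `_opacity` size to get a per-Gaussian width. It assumes that the `K` suffix means thousands of scalar elements and that there is exactly one opacity value per Gaussian; neither is confirmed by the notebook. `parse_param_summary` is a hypothetical helper, not part of the repository.

```python
import re

# Summary line copied from the first prompt's log output (per-Gaussian tensors only).
summary = (
    "[_xyz:280.1K] [_rotation:373.4K] [_scaling:280.1K] "
    "[_opacity:93.4K] [_features_dc:280.1K] [_features_rest:840.2K]"
)


def parse_param_summary(line):
    """Return {tensor_name: size_in_thousands} from a '[name:123.4K]' log line."""
    pairs = re.findall(r"\[([\w.]+):([\d.]+)K\]", line)
    return {name: float(value) for name, value in pairs}


sizes = parse_param_summary(summary)

# Assumption: one opacity scalar per Gaussian, so this is the Gaussian count in thousands.
n_gaussians_k = sizes["_opacity"]

# Scalars stored per Gaussian for each tensor, rounded to the nearest integer.
per_gaussian = {name: round(size / n_gaussians_k) for name, size in sizes.items()}

print(n_gaussians_k)
print(per_gaussian)
```

Under these assumptions the widths come out as 3, 4, 3, 1, 3 and 9, which is consistent with a position, a quaternion, a scale, an opacity, a DC color and nine higher-order color coefficients per Gaussian. The same reading gives roughly 93.4K Gaussians for the first prompt and 75.2K for the second, both grown from the 20000 reported at initialization.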
