Questions about visualizing predicted image. #4

@w4nanch1

Hi,

Thank you for publicly sharing your source code; I appreciate the contribution. However, I ran into an issue while trying to reproduce the future-frame prediction results described in the paper: the predicted images look poor, as shown below.

Image

To visualize the predictions, I added the following code in the rollout function of eval_utils_calvin.py. Some parts of the code are omitted for brevity:

os.makedirs('./saved_img/', exist_ok=True)  # create the output directory once
img_size = (args.calvin_input_image_size, args.calvin_input_image_size)
for step in range(EP_LEN):
    action, img_pred = model.step(obs, lang_annotation, step)
    if step % 50 == 0:  # dump the predictions every 50 steps
        # Index 0 and 1 along dim 1 of img_pred are unpatchified separately.
        img1 = unpatchify(img_pred[:, 0, ...], patch_size=args.patch_size, img_size=img_size)
        img2 = unpatchify(img_pred[:, 1, ...], patch_size=args.patch_size, img_size=img_size)
        for i in range(img1.shape[0]):
            torchvision.utils.save_image(img1[i, ...], f"./saved_img/img_{step}_t{i}_1.png")
            torchvision.utils.save_image(img2[i, ...], f"./saved_img/img_{step}_t{i}_2.png")
    if len(planned_actions) == 0: ...
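
One thing that may be worth ruling out: `torchvision.utils.save_image` expects pixel values in [0, 1] and clamps anything outside that range. If the predicted frames live in a mean/std-normalized space, saving them directly would distort the colors. Whether this model predicts normalized pixels is an assumption here, not something the issue confirms. The sketch below uses NumPy and placeholder ImageNet statistics purely for illustration; `denormalize`, `MEAN`, and `STD` are hypothetical names, and the real statistics would have to come from the repository's preprocessing.

```python
import numpy as np

# Placeholder statistics (ImageNet). The values actually used by the
# repository's preprocessing are an assumption and must be looked up there.
MEAN = np.array([0.485, 0.456, 0.406]).reshape(1, 3, 1, 1)
STD = np.array([0.229, 0.224, 0.225]).reshape(1, 3, 1, 1)


def denormalize(imgs, mean=MEAN, std=STD):
    """Map (N, 3, H, W) mean/std-normalized images back to [0, 1] pixels."""
    return np.clip(imgs * std + mean, 0.0, 1.0)


# Round-trip check on random "images" that already lie in [0, 1].
original = np.random.default_rng(0).random((2, 3, 4, 4))
normalized = (original - MEAN) / STD
restored = denormalize(normalized)
print(np.allclose(restored, original))
```

The same arithmetic applies to a torch tensor before it is passed to `save_image`, with the statistics broadcast over the channel dimension.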

Additionally, here is the unpatchify function I used:

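def unpatchify(x, patch_size, img_size):
    # x: (N, L, patch_size * patch_size * 3) with L = h * w patches
    N, L, _ = x.shape
    H, W = img_size
    h = H // patch_size
    w = W // patch_size

    # Assumes each patch is stored channel-first: (3, patch_size, patch_size).
    x = x.view(N, h, w, 3, patch_size, patch_size)
    x = x.permute(0, 3, 1, 4, 2, 5).contiguous()  # -> (N, 3, h, patch_size, w, patch_size)
    x = x.view(N, 3, H, W)
    return x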
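
The `unpatchify` above reads each flattened patch as channel-first, `(3, patch_size, patch_size)`. The reference MAE implementation flattens patches channel-last, `(patch_size, patch_size, 3)`. Whether this repository's image head follows the MAE convention is an assumption that has not been confirmed here, but if it does, a channel-first unpatchify would scramble the pixels. The NumPy sketch below (function names are hypothetical) shows that the two layouts are not interchangeable.

```python
import numpy as np


def patchify_channel_last(imgs, p):
    """(N, 3, H, W) -> (N, h*w, p*p*3), each patch flattened as (p, p, 3)."""
    n, c, H, W = imgs.shape
    h, w = H // p, W // p
    x = imgs.reshape(n, c, h, p, w, p)
    x = x.transpose(0, 2, 4, 3, 5, 1)  # -> (n, h, w, p, p, c)
    return x.reshape(n, h * w, p * p * c)


def unpatchify_channel_last(x, p, img_size):
    """Inverse of patchify_channel_last."""
    n = x.shape[0]
    H, W = img_size
    h, w = H // p, W // p
    x = x.reshape(n, h, w, p, p, 3)
    x = x.transpose(0, 5, 1, 3, 2, 4)  # -> (n, c, h, p, w, p)
    return x.reshape(n, 3, H, W)


def unpatchify_channel_first(x, p, img_size):
    """NumPy port of the unpatchify in this issue: patch read as (3, p, p)."""
    n = x.shape[0]
    H, W = img_size
    h, w = H // p, W // p
    x = x.reshape(n, h, w, 3, p, p)
    x = x.transpose(0, 3, 1, 4, 2, 5)  # -> (n, c, h, p, w, p)
    return x.reshape(n, 3, H, W)


rng = np.random.default_rng(0)
imgs = rng.random((2, 3, 8, 8))
patches = patchify_channel_last(imgs, p=4)

# The matching inverse recovers the images; the channel-first one does not.
roundtrip_ok = np.array_equal(unpatchify_channel_last(patches, 4, (8, 8)), imgs)
mismatch = not np.array_equal(unpatchify_channel_first(patches, 4, (8, 8)), imgs)
print(roundtrip_ok, mismatch)
```

In torch the channel-last inverse corresponds to `x.view(N, h, w, p, p, 3).permute(0, 5, 1, 3, 2, 4)` followed by a reshape to `(N, 3, H, W)`.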

I used the pre-trained weights located in the folder finetune_bs=640_lr1e-4_atten_goal_state4_atten_only_obs_sv10_abc_reset_act_obs_ep5_abc, specifically 19.pth.

Here is a part of my eval.sh script:

calvin_dataset_path="calvin/dataset/task_ABC_D"
calvin_conf_path="calvin/calvin_models/conf"
vit_checkpoint_path="checkpoints/vit_mae/mae_pretrain_vit_base.pth" # downloaded from https://drive.google.com/file/d/1bSsvRI4mDM3Gg51C6xO0l9CbojYw3OEt/view?usp=sharing
save_checkpoint_path="checkpoints/"
### NEED TO CHANGE the checkpoint path ###
resume_from_checkpoint="checkpoints/calvin/19.pth"
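
A further check that might help with a setup like this is confirming that every model weight is actually covered by the checkpoint. If the evaluation script loads weights non-strictly (an assumption; the issue does not show the loading code), parameters missing from the checkpoint would keep their random initialization. In PyTorch, `load_state_dict(strict=False)` returns the `missing_keys` and `unexpected_keys` for this purpose. The pure-Python sketch below mimics that comparison on key names only; the helper and all key names are hypothetical.

```python
def compare_state_dict_keys(model_keys, ckpt_keys, prefix="module."):
    """Return (missing, unexpected) key lists after stripping an optional prefix.

    The "module." prefix is what DistributedDataParallel typically adds to
    parameter names when a wrapped model is saved.
    """
    stripped = {k[len(prefix):] if k.startswith(prefix) else k for k in ckpt_keys}
    model_keys = set(model_keys)
    missing = sorted(model_keys - stripped)      # in the model, not in the checkpoint
    unexpected = sorted(stripped - model_keys)   # in the checkpoint, not in the model
    return missing, unexpected


# Hypothetical key names, for illustration only.
model_keys = ["encoder.weight", "image_decoder.weight", "action_head.weight"]
ckpt_keys = ["module.encoder.weight", "module.action_head.weight"]

missing, unexpected = compare_state_dict_keys(model_keys, ckpt_keys)
print(missing, unexpected)
```

An empty `missing` list for the image-prediction parameters would rule out this explanation.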

Could you help me understand why the predicted images look so poor? Am I missing something in the evaluation setup?
Thank you again for your time and assistance!
