This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

Question on semi-supervised training #165

Closed
lawy623 opened this issue Nov 3, 2020 · 4 comments

@lawy623

lawy623 commented Nov 3, 2020

I have some questions about semi-supervised training.

(1) When I train from COCO 2D to H36M 3D (using detectron_pt_coco as the 2D input), the 17 joints of the two skeletons appear to be in a different order. But in the code, after projecting the H36M 3D points back onto the image, you directly use
reconstruction_semi = projection_func(predicted_semi + predicted_traj_cat[split_idx:], cam_semi)
loss_reconstruction = mpjpe(reconstruction_semi, target_semi)
to calculate the 2D loss.
It seems that target_semi has not been reordered so that the two joint orders match. Is this a bug, or am I missing something?

(2) I used the trajectory model you provided in #145, and found that its trajectory and 3D pose are good enough to give a close 2D reprojection. But I am not sure how you trained it. Did you pass all 5 subjects to -subject-train, or only some of them, with the rest going to -subject-unlabelled? I also found that the trajectory can be trained with supervision only on -subject-train by setting -warmup equal to -epoch. Did you train the trajectory with supervision only, or did you also use -subject-unlabeled during training? Which is better if the goal is a more accurate trajectory and reprojection? Finally, what traj_valid and 2d_valid losses did you reach?

(3) Does the batch size have a big impact on semi-supervised training? I see you set -b 64 in the instructions, but the default for supervised training is 1024.

(4) For the 2D input in semi-supervised training, I found that you use target = inputs_2d_cat[:split_idx, pad:-pad, :, :2].contiguous() when pad > 0 to crop the input down to the single frame that matches the output. But isn't this a bug, since causal_shift also affects which frame should be matched? Should it be target = inputs_2d_cat[:split_idx, pad+causal_shift:, :, :2].contiguous() for the causal model?

Thanks so much for your help.

@dariopavllo
Contributor

Hi, sorry for the late answer.

  1. For semi-supervised training, the sets of input and output joints must match. In our experiments, we use 2D detections fine-tuned on Human3.6M joints, not COCO detections. The fact that Human3.6M and COCO both have 17 joints is just a coincidence. COCO includes keypoints that are not available in Human3.6M (e.g. eyes) and vice versa, so it's not possible to match them unless you use some sort of learned matching function.

  2. I definitely trained using all supervised subjects. I don't remember whether I included the unlabeled ones (S2, S3, S4), but it's not going to make a huge difference. The trick of increasing the warmup makes sense.

  3. Large batch sizes generally help with batch normalization, but the dataset must be large enough (which is the case when you use all 3.6M frames). In the particular experiment mentioned in the README, we are training on 10% of Subject 1 (a small fraction of the dataset), so it makes sense to lower the batch size to improve generalization.

  4. Can you point me to the line you're referring to?
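
To illustrate point 1, here is a minimal sketch that compares the two joint sets by name. The COCO list is the standard COCO person-keypoint set. The Human3.6M list is an assumption: it is the 17-joint subset commonly used with this dataset, written with illustrative names, and is not taken from this thread. Even where names coincide, the anatomical definitions may differ between the two datasets, so a plain reindexing is not guaranteed to be correct.

```python
# Standard COCO person keypoints (17 in total).
COCO_KEYPOINTS = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]

# ASSUMPTION: commonly used 17-joint Human3.6M subset, with illustrative names.
H36M_JOINTS = [
    "pelvis", "right_hip", "right_knee", "right_ankle",
    "left_hip", "left_knee", "left_ankle", "spine", "thorax",
    "neck", "head", "left_shoulder", "left_elbow", "left_wrist",
    "right_shoulder", "right_elbow", "right_wrist",
]

coco, h36m = set(COCO_KEYPOINTS), set(H36M_JOINTS)
shared = coco & h36m      # joints that exist under the same name in both
only_coco = coco - h36m   # e.g. eyes and ears have no Human3.6M counterpart
only_h36m = h36m - coco   # e.g. pelvis and spine have no COCO counterpart

print("joint counts:", len(COCO_KEYPOINTS), len(H36M_JOINTS))
print("only in COCO:", sorted(only_coco))
print("only in Human3.6M:", sorted(only_h36m))
```

Both lists have 17 entries, yet each contains joints the other lacks, which is why reordering alone cannot make target_semi comparable to the reprojected Human3.6M joints.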

@lawy623
Author

lawy623 commented Nov 13, 2020

target_semi = inputs_2d_semi[:, pad:-pad, :, :2].contiguous()

This is the line I mean; it occurs several times in the semi-supervised loss code.

@dariopavllo
Contributor

The combination of semi-supervision + causal model has never been tested/evaluated, but I briefly checked the code and I think you are right! Your proposed change should work.
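
A minimal sketch of the frame alignment discussed here, using plain Python lists instead of tensors. It assumes that each input chunk is padded with pad + causal_shift frames on the left and pad - causal_shift frames on the right, and that the model emits one output per input frame beyond the receptive field; the helper name target_frame_indices is hypothetical and not part of the repository.

```python
def target_frame_indices(num_input_frames, pad, causal_shift=0):
    """Return the input frame indices that the model outputs line up with.

    ASSUMPTION: the chunk has (pad + causal_shift) context frames on the left
    and (pad - causal_shift) on the right, so the output has
    num_input_frames - 2 * pad frames.
    """
    start = pad + causal_shift
    stop = num_input_frames - pad + causal_shift
    return list(range(start, stop))


pad = 13                              # receptive field of 2 * pad + 1 = 27 frames
frames = list(range(2 * pad + 1))     # frame indices of one input chunk

# Symmetric model: the output corresponds to the centre frame.
symmetric = target_frame_indices(len(frames), pad)
print(symmetric, frames[pad:-pad])

# Causal model with causal_shift == pad: the output corresponds to the last frame,
# so the symmetric slice frames[pad:-pad] would select the wrong target.
causal = target_frame_indices(len(frames), pad, causal_shift=pad)
print(causal, frames[pad + pad:])
```

Under these assumptions the open-ended slice pad+causal_shift: is only exact when causal_shift equals pad; for an intermediate shift, an explicit end bound as in the helper would be needed.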

@lawy623
Author

lawy623 commented Nov 14, 2020

Great. No other questions. Thanks for sharing the great work!

lawy623 closed this as completed on Nov 14, 2020