Could you please provide the training log? #4

Closed
flyyyyer opened this issue Apr 5, 2021 · 13 comments

@flyyyyer

flyyyyer commented Apr 5, 2021

I followed the settings in your paper and trained the model with 27 frames, but I can't reach the performance reported in the paper. This is my training log:
[1] time 20.00 lr 0.000200 3d_train 79.10 3d_valid 64.31
[2] time 19.75 lr 0.000196 3d_train 42.98 3d_valid 59.67
[3] time 19.77 lr 0.000192 3d_train 34.62 3d_valid 58.55
[4] time 19.88 lr 0.000188 3d_train 29.96 3d_valid 57.28
[5] time 19.83 lr 0.000184 3d_train 26.81 3d_valid 56.05
[6] time 19.86 lr 0.000181 3d_train 24.47 3d_valid 55.73
[7] time 19.82 lr 0.000177 3d_train 22.63 3d_valid 54.94
[8] time 19.88 lr 0.000174 3d_train 21.14 3d_valid 54.19
[9] time 19.84 lr 0.000170 3d_train 19.91 3d_valid 54.72
[10] time 19.86 lr 0.000167 3d_train 18.87 3d_valid 53.88
[11] time 19.77 lr 0.000163 3d_train 17.99 3d_valid 54.01
[12] time 19.74 lr 0.000160 3d_train 17.25 3d_valid 54.11
[13] time 19.68 lr 0.000157 3d_train 16.62 3d_valid 54.26
[14] time 19.74 lr 0.000154 3d_train 16.08 3d_valid 54.30
[15] time 19.66 lr 0.000151 3d_train 15.60 3d_valid 54.15
[16] time 19.80 lr 0.000148 3d_train 15.17 3d_valid 54.57
[17] time 19.85 lr 0.000145 3d_train 14.79 3d_valid 54.10
[18] time 19.95 lr 0.000142 3d_train 14.45 3d_valid 53.76
[19] time 19.80 lr 0.000139 3d_train 14.15 3d_valid 53.93
[20] time 19.84 lr 0.000136 3d_train 13.86 3d_valid 54.03
[21] time 19.84 lr 0.000134 3d_train 13.60 3d_valid 54.89
[22] time 19.86 lr 0.000131 3d_train 13.36 3d_valid 54.32
[23] time 19.87 lr 0.000128 3d_train 13.14 3d_valid 53.97
[24] time 19.82 lr 0.000126 3d_train 12.94 3d_valid 54.72
[25] time 19.78 lr 0.000123 3d_train 12.75 3d_valid 54.38
[26] time 19.86 lr 0.000121 3d_train 12.58 3d_valid 54.50
[27] time 19.85 lr 0.000118 3d_train 12.42 3d_valid 54.18
[28] time 19.82 lr 0.000116 3d_train 12.25 3d_valid 54.38
[29] time 19.81 lr 0.000114 3d_train 12.13 3d_valid 54.09
[30] time 19.83 lr 0.000111 3d_train 11.99 3d_valid 55.16
[31] time 19.82 lr 0.000109 3d_train 11.87 3d_valid 54.22
[32] time 19.84 lr 0.000107 3d_train 11.75 3d_valid 54.56
[33] time 19.81 lr 0.000105 3d_train 11.64 3d_valid 54.03
[34] time 19.84 lr 0.000103 3d_train 11.54 3d_valid 54.39
[35] time 19.81 lr 0.000101 3d_train 11.43 3d_valid 55.32
[36] time 19.80 lr 0.000099 3d_train 11.33 3d_valid 54.34
[37] time 19.81 lr 0.000097 3d_train 11.24 3d_valid 55.05
[38] time 19.80 lr 0.000095 3d_train 11.15 3d_valid 54.58
[39] time 19.85 lr 0.000093 3d_train 11.07 3d_valid 54.72
[40] time 19.79 lr 0.000091 3d_train 10.99 3d_valid 55.01
[41] time 19.74 lr 0.000089 3d_train 10.91 3d_valid 54.71
[42] time 19.81 lr 0.000087 3d_train 10.83 3d_valid 54.52
[43] time 19.83 lr 0.000086 3d_train 10.76 3d_valid 54.47
[44] time 19.87 lr 0.000084 3d_train 10.69 3d_valid 54.51
[45] time 19.86 lr 0.000082 3d_train 10.63 3d_valid 54.54
[46] time 19.80 lr 0.000081 3d_train 10.56 3d_valid 55.12
[47] time 19.86 lr 0.000079 3d_train 10.50 3d_valid 55.35
[48] time 19.78 lr 0.000077 3d_train 10.44 3d_valid 55.37
[49] time 19.68 lr 0.000076 3d_train 10.39 3d_valid 54.62
[50] time 19.70 lr 0.000074 3d_train 10.32 3d_valid 55.17
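
The lr column in the log above is consistent with a plain exponential decay: a start value of 0.0002 multiplied by 0.98 after every epoch. A minimal sketch of that schedule follows. The start value and decay factor are inferred from the printed numbers, not read from the training script, and `lr_at_epoch` is a hypothetical helper name.

```python
def lr_at_epoch(initial_lr, decay, epoch):
    """Learning rate in effect during `epoch` (1-based) under per-epoch exponential decay."""
    return initial_lr * decay ** (epoch - 1)


# Values inferred from the log above (an assumption, not taken from the training script).
INITIAL_LR = 2e-4
DECAY = 0.98

for epoch in (1, 2, 3, 50):
    print(f"[{epoch}] lr {lr_at_epoch(INITIAL_LR, DECAY, epoch):.6f}")
```

Comparing the printed values against the lr column is a quick way to confirm which start value and decay factor a given run actually used.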

@zczcwh
Owner

zczcwh commented Apr 7, 2021

@flyyyyer The lr is too large for your setting. You may need to adjust it based on the number of GPUs used and the batch size. Also, the loss here is frame-wise MPJPE. The action-wise MPJPE (reported in the paper) would be smaller than this.
I didn't keep the original log; here is what I can find. You may get the best result after 100 epochs.

[1] time 43.30 lr 0.000040 3d_train 88.134033 3d_eval 63.541016 3d_valid 67.047745
save best checkpoint
[2] time 42.85 lr 0.000040 3d_train 52.574448 3d_eval 51.973287 3d_valid 60.846769
save best checkpoint
[3] time 43.11 lr 0.000039 3d_train 43.446893 3d_eval 46.730750 3d_valid 57.600512
save best checkpoint
[4] time 42.98 lr 0.000039 3d_train 38.523466 3d_eval 41.859904 3d_valid 56.776868
save best checkpoint
[5] time 44.07 lr 0.000038 3d_train 35.122552 3d_eval 39.267369 3d_valid 55.464280
save best checkpoint
[6] time 44.26 lr 0.000038 3d_train 32.561677 3d_eval 37.278198 3d_valid 56.194792
[7] time 43.05 lr 0.000038 3d_train 30.496691 3d_eval 35.207880 3d_valid 55.060880
save best checkpoint
[8] time 42.87 lr 0.000037 3d_train 28.799342 3d_eval 33.555473 3d_valid 53.485547
save best checkpoint
[9] time 42.85 lr 0.000037 3d_train 27.334583 3d_eval 31.429358 3d_valid 53.506819
[10] time 42.88 lr 0.000037 3d_train 26.053745 3d_eval 30.532276 3d_valid 52.772135
save best checkpoint
[11] time 42.92 lr 0.000036 3d_train 24.963426 3d_eval 28.963978 3d_valid 52.699806
save best checkpoint
[12] time 42.89 lr 0.000036 3d_train 24.003360 3d_eval 27.877669 3d_valid 51.829082
save best checkpoint
[13] time 42.86 lr 0.000035 3d_train 23.152311 3d_eval 26.950195 3d_valid 51.293599
save best checkpoint
[14] time 42.88 lr 0.000035 3d_train 22.396428 3d_eval 26.504486 3d_valid 51.307282
[15] time 42.91 lr 0.000035 3d_train 21.712582 3d_eval 25.881128 3d_valid 50.811456
save best checkpoint
[16] time 42.90 lr 0.000034 3d_train 21.098003 3d_eval 25.102216 3d_valid 51.562999
[17] time 42.88 lr 0.000034 3d_train 20.546750 3d_eval 24.539008 3d_valid 50.635478
save best checkpoint
[18] time 42.93 lr 0.000034 3d_train 20.034142 3d_eval 23.775496 3d_valid 50.546515
save best checkpoint
[19] time 42.85 lr 0.000033 3d_train 19.560787 3d_eval 23.434541 3d_valid 50.991170
[20] time 42.89 lr 0.000033 3d_train 19.132629 3d_eval 22.710669 3d_valid 50.842965
[21] time 42.87 lr 0.000033 3d_train 18.732965 3d_eval 22.415315 3d_valid 50.840368
[22] time 42.91 lr 0.000032 3d_train 18.357312 3d_eval 21.983834 3d_valid 50.822052
[23] time 42.91 lr 0.000032 3d_train 18.019280 3d_eval 21.513001 3d_valid 50.384931
save best checkpoint
[24] time 42.87 lr 0.000032 3d_train 17.700399 3d_eval 20.899166 3d_valid 50.247679
save best checkpoint
[25] time 42.87 lr 0.000031 3d_train 17.401278 3d_eval 20.525096 3d_valid 50.441784
[26] time 42.86 lr 0.000031 3d_train 17.125005 3d_eval 20.765650 3d_valid 50.597848
[27] time 42.87 lr 0.000031 3d_train 16.860035 3d_eval 20.186748 3d_valid 50.809448
[28] time 42.85 lr 0.000030 3d_train 16.608819 3d_eval 20.046163 3d_valid 50.436903
[29] time 42.85 lr 0.000030 3d_train 16.382554 3d_eval 19.185565 3d_valid 50.197757
save best checkpoint
[30] time 42.85 lr 0.000030 3d_train 16.154390 3d_eval 18.843101 3d_valid 49.799016
save best checkpoint
[31] time 42.86 lr 0.000030 3d_train 15.952497 3d_eval 19.015874 3d_valid 50.695172
[32] time 42.89 lr 0.000029 3d_train 15.750949 3d_eval 18.361437 3d_valid 49.718099
save best checkpoint
[33] time 42.86 lr 0.000029 3d_train 15.568251 3d_eval 18.328863 3d_valid 50.225716
[34] time 42.87 lr 0.000029 3d_train 15.382617 3d_eval 17.967106 3d_valid 49.896671
[35] time 42.85 lr 0.000028 3d_train 15.220156 3d_eval 17.529078 3d_valid 50.113246
[36] time 42.86 lr 0.000028 3d_train 15.057232 3d_eval 17.416918 3d_valid 49.799598
[37] time 42.86 lr 0.000028 3d_train 14.906076 3d_eval 16.988557 3d_valid 50.152376
[38] time 42.90 lr 0.000028 3d_train 14.753793 3d_eval 17.058883 3d_valid 50.164568
[39] time 42.84 lr 0.000027 3d_train 14.610176 3d_eval 17.043122 3d_valid 49.824760
[40] time 42.84 lr 0.000027 3d_train 14.479838 3d_eval 16.560833 3d_valid 49.471529
Saving checkpoint to checkpoint/epoch_40.bin
save best checkpoint
[41] time 42.88 lr 0.000027 3d_train 14.343974 3d_eval 16.511827 3d_valid 49.789071
[42] time 42.87 lr 0.000026 3d_train 14.220869 3d_eval 16.425912 3d_valid 49.394231
save best checkpoint
[43] time 42.80 lr 0.000026 3d_train 14.105485 3d_eval 16.223125 3d_valid 49.689540
[44] time 42.82 lr 0.000026 3d_train 13.993403 3d_eval 16.001966 3d_valid 49.983096
[45] time 42.88 lr 0.000026 3d_train 13.883140 3d_eval 16.340899 3d_valid 49.880556
[46] time 42.91 lr 0.000025 3d_train 13.775383 3d_eval 16.198288 3d_valid 49.838619
[47] time 42.88 lr 0.000025 3d_train 13.679699 3d_eval 15.895548 3d_valid 49.798829
[48] time 42.88 lr 0.000025 3d_train 13.584142 3d_eval 15.757708 3d_valid 49.363614
save best checkpoint
[49] time 42.89 lr 0.000025 3d_train 13.488471 3d_eval 15.703400 3d_valid 49.332396
save best checkpoint
[50] time 42.89 lr 0.000024 3d_train 13.406798 3d_eval 15.487668 3d_valid 49.592302
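
The lr column in this second log matches a start value of 0.00004 decayed by 0.99 per epoch, which is five times smaller than the 0.0002 start in the first log. For adapting the learning rate to a different batch size or GPU count, one common heuristic is the linear scaling rule: scale the learning rate in proportion to the total batch size. The sketch below assumes that rule. It is not confirmed as the maintainer's procedure, and `scale_lr` and all batch sizes are hypothetical.

```python
def scale_lr(reference_lr, reference_batch, batch_per_gpu, num_gpus):
    """Linear scaling heuristic: lr grows in proportion to the total batch size."""
    total_batch = batch_per_gpu * num_gpus
    return reference_lr * total_batch / reference_batch


# Hypothetical numbers for illustration only; the thread does not state batch sizes.
reference_lr = 4e-5      # start value seen in the log above
reference_batch = 1024   # assumed total batch size of the reference run

# Half the reference batch size gives half the reference learning rate.
print(scale_lr(reference_lr, reference_batch, batch_per_gpu=256, num_gpus=2))
```

Treat the result as a starting point for a small sweep, not as a guaranteed setting.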

zczcwh closed this as completed Apr 9, 2021
@flyyyyer
Author

> The lr is too large for your setting. You may need to adjust it based on the number of GPUs used and the batch size. Also, the loss here is frame-wise MPJPE. The action-wise MPJPE (reported in the paper) would be smaller than this.
> I didn't keep the original log; here is what I can find. You may get the best result after 100 epochs.

Thanks for your reply. Why is the action-wise MPJPE smaller?

@zczcwh
Owner

zczcwh commented Apr 16, 2021

@flyyyyer Because each action has a different number of frames. The action-wise average MPJPE is the mean of the per-action MPJPE over the 15 actions.
We follow the calculation used in previous papers.
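
To make the difference concrete, here is a minimal sketch of the two averages on made-up per-frame errors. The action names and numbers are illustrative only, and both function names are hypothetical. The frame-wise mean weights every frame equally, so actions with more frames dominate. The action-wise mean averages each action first and then averages those means. Whether the action-wise figure comes out smaller depends on which actions hold the most frames.

```python
def frame_wise_mean(errors_by_action):
    """Average over all frames pooled together (long actions weigh more)."""
    total = sum(sum(errs) for errs in errors_by_action.values())
    count = sum(len(errs) for errs in errors_by_action.values())
    return total / count


def action_wise_mean(errors_by_action):
    """Average of per-action means (every action weighs the same)."""
    per_action = [sum(errs) / len(errs) for errs in errors_by_action.values()]
    return sum(per_action) / len(per_action)


# Made-up per-frame MPJPE values in mm: a short easy action and a long hard one.
toy_errors = {
    "Walking": [40.0, 40.0],
    "SittingDown": [70.0] * 6,
}

print(frame_wise_mean(toy_errors))   # 62.5
print(action_wise_mean(toy_errors))  # 55.0
```

In this toy case the harder action has more frames, so the frame-wise number is the larger of the two.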

@leechaonan

@zczcwh Could you please provide the detailed training log for training a model from scratch (ground-truth 2D poses as input) with 81 frames? I could not reproduce the performance reported in your paper.

@leechaonan

@flyyyyer Have you reproduced the results in the paper?

@flyyyyer
Author

flyyyyer commented May 5, 2021

> @flyyyyer Have you reproduced the results in the paper?

No. Have you reproduced the results in the paper?

@leechaonan

@flyyyyer I can't reproduce the results in the paper either. Did you try 81 frames with GT keypoints as input? I got a 3d_valid MPJPE of 45 after 15 epochs, and the training loss seems to have stopped decreasing and is far from the 31.3 reported in the paper.

@hankhuynh1011

hankhuynh1011 commented May 22, 2021

@flyyyyer Could you reproduce the 27f or 81f results from the paper? I am trying 27f and have only trained for ~28 epochs (my GPU is too weak), but the loss is still very high, ~51 mm, and it is very hard to converge.

@flyyyyer
Author

flyyyyer commented Jun 3, 2021

I also can't reproduce the results. @zczcwh could you help and provide the entire training log?

@ClimberY

> @flyyyyer I can't reproduce the results in the paper either. Did you try 81 frames with GT keypoints as input? I got a 3d_valid MPJPE of 45 after 15 epochs, and the training loss seems to have stopped decreasing and is far from the 31.3 reported in the paper.

Have you seen the error when using 2D GT keypoints?
"PoseFormer-main/run_poseformer.py", line 312, in
loss_3d_pos = mpjpe(predicted_3d_pos, inputs_3d)
"PoseFormer-main/common/loss.py", line 16, in mpjpe
assert predicted.shape == target.shape
AssertionError
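
The assertion fires when the network output and the 3D target have different shapes, and a bare `assert` does not report which shapes were involved. Below is a minimal debugging sketch, assuming a single frame given as a list of (x, y, z) joints. `mpjpe_checked` is a hypothetical stand-in, not the repository's `mpjpe`, and it uses plain Python lists instead of tensors.

```python
import math


def mpjpe_checked(predicted, target):
    """Mean per-joint position error for one frame, with a descriptive shape check."""
    pred_shape = (len(predicted), len(predicted[0]))
    tgt_shape = (len(target), len(target[0]))
    if pred_shape != tgt_shape:
        # Report both shapes instead of failing with a bare AssertionError.
        raise ValueError(f"shape mismatch: predicted {pred_shape} vs target {tgt_shape}")
    # Euclidean distance per joint, averaged over joints.
    return sum(math.dist(p, t) for p, t in zip(predicted, target)) / len(target)


pred = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
gt = [[0.0, 3.0, 4.0], [1.0, 0.0, 0.0]]
print(mpjpe_checked(pred, gt))  # 2.5
```

In the actual script, printing `predicted_3d_pos.shape` and `inputs_3d.shape` just before the failing call should give the same information and show which dimension disagrees.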

@alecda573

@flyyyyer Were you ever able to reproduce the results? Was this training log for detections or for the ground truth?
@zczcwh Can you please let us know how many epochs you trained for, and what your hyperparameters were for drop path, projection drop, and attention drop?

@SleepEarlyLiveLong

I got 50.3 mm with f=9 in training and 49.8 mm using its evaluation code (even 0.1 mm lower than what was reported in the paper). However, I get a larger error with f=27 (53.6 mm) and f=81 (even larger).

@TomCatOnline

> I got 50.3 mm with f=9 in training and 49.8 mm using its evaluation code (even 0.1 mm lower than what was reported in the paper). However, I get a larger error with f=27 (53.6 mm) and f=81 (even larger).

I got 57.3 mm MPJPE with f=27 (CPN as input). It looks like no one has reproduced the results in the paper.
