
Reproducing training code #10

Closed
eugenelyj opened this issue May 31, 2022 · 10 comments

Comments

@eugenelyj

Hi,

Thanks for your amazing work!! I am currently trying to reproduce train.py (mainly copied from the RAFT project). Some training details are listed as follows:

  1. learning rate: 1e-4.
  2. epochs: 40.
  3. loss: sequence loss from RAFT (https://github.com/princeton-vl/RAFT/blob/master/train.py#L47) with gamma equals to 0.8.
  4. random crop size [288, 384]
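For reference, the sequence loss mentioned in point 3 looks roughly like this (a sketch following the linked RAFT train.py; the `max_flow` cutoff for filtering extreme displacements is part of the original code):

```python
import torch

def sequence_loss(flow_preds, flow_gt, valid, gamma=0.8, max_flow=400):
    """Exponentially weighted L1 loss over the iterative flow predictions,
    following RAFT's train.py."""
    n_predictions = len(flow_preds)
    total_loss = 0.0

    # Exclude invalid pixels and extremely large displacements.
    mag = torch.sum(flow_gt ** 2, dim=1).sqrt()
    valid = (valid >= 0.5) & (mag < max_flow)

    for i, pred in enumerate(flow_preds):
        # Later refinement iterations get exponentially higher weight.
        weight = gamma ** (n_predictions - i - 1)
        i_loss = (pred - flow_gt).abs()
        total_loss += weight * (valid[:, None] * i_loss).mean()

    return total_loss
```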

However, the performance is not satisfying even on the training set (e.g., 1.207 EPE on cropped images/events and 2.913 EPE on full images/events).

Did this happen to you while training? Looking forward to your reply.

The following are screenshots of my training status. Note that 1px/3px/5px denotes the fraction of valid pixels whose EPE is below the given threshold.

[screenshots: training metrics]

@magehrig
Contributor

magehrig commented Jul 22, 2022

Hi @eugenelyj

Sorry for the late reply. I reimplemented the method for a follow-up project and reached pretty much exactly the same performance but using a simpler One-Cycle LR schedule. Specifically, I used a learning rate of 0.0001, 250k steps, batch size of 3. In addition to the cropping you mentioned, I used 50% prob for horizontal flipping and also 10% prob for vertical flipping (have not tested without). These are the training metrics:

[screenshot: training metrics]

Nowadays, I would go with the One-Cycle LR schedule to keep it simple.
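For reference, a minimal sketch of that setup in PyTorch (peak LR 1e-4 and 250k steps as stated above; the `pct_start`/`anneal_strategy`/`cycle_momentum` values follow RAFT's train.py, and the model here is only a placeholder):

```python
import torch

# Placeholder network standing in for the recurrent flow model.
model = torch.nn.Conv2d(3, 2, kernel_size=3, padding=1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-5)

num_steps = 250_000
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=1e-4, total_steps=num_steps,
    pct_start=0.05, cycle_momentum=False, anneal_strategy='linear')

# In the training loop, call scheduler.step() once per batch,
# after optimizer.step(), so the LR is low in the final stages.
```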

@eugenelyj
Author

@magehrig
Thanks! Also, what were the evaluation metrics at each epoch? This information would help me a lot :)

@magehrig
Contributor

magehrig commented Jul 23, 2022

I don't have an evaluation in the loop because I am training on the full training set here. In general, the checkpoint at the last step is performing the best. In this case, the last checkpoint achieves 0.786 EPE and 2.74 AE on the test set.
It's important that you have a low learning rate in the final stages of your training, otherwise, your test score fluctuates. That's why a One-Cycle LR schedule is quite convenient.

@eugenelyj
Author

Actually, I also use the One-Cycle LR schedule because my training code was migrated from RAFT. I will retrain since some of my settings differ: for example, my batch size is 6 and I skipped the flipping augmentation. Thanks for the information.

@magehrig
Contributor

I still think something else must be wrong with your code, because your training EPE is so high.
Another thing to check: In the original RAFT, if you want Flow from img1 to img2, you would give the context network the features from img1. In the case of event data, we provide the features from event representation 2, not 1.
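A sketch of that feature ordering (the `fnet`/`cnet` names follow RAFT's conventions; the actual E-RAFT code may structure this differently):

```python
import torch

def forward_features(fnet, cnet, ev_repr_1, ev_repr_2):
    """Sketch of the ordering described above: correlation features come
    from both event representations, but the context network receives the
    features from event representation 2, not 1."""
    fmap1 = fnet(ev_repr_1)    # correlation features, reference
    fmap2 = fnet(ev_repr_2)    # correlation features, target
    context = cnet(ev_repr_2)  # context from event representation 2
    return fmap1, fmap2, context
```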

@HTLeoo

HTLeoo commented Aug 17, 2022

@eugenelyj Hi, have you reproduced the results on the benchmark?

@eugenelyj
Author

eugenelyj commented Aug 22, 2022

@HTLeoo
Hi, I tried again, and the following are the latest results I got.
After convergence, it achieves about 1.5 EPE on the full-resolution training data (trained on the cropped data) and 12.713 EPE on average on the testing data (submitted to the DSEC website).

Some details:
lr: 1e-4
lr schedule: one-cycle
batch size: 8
epochs: 150
random crop size: [288, 384]
prob of y-flipping: 0.1
prob of x-flipping: 0.5

[screenshots: training curves]

@eugenelyj
Author

@magehrig
Hi, I fixed some mistakes in my code, and the training EPE is now low (about 0.58).
However, the EPE on the testing data (12.7 on average) and on the full-resolution training data (1.4 on average) is still high.
"In the case of event data, we provide the features from event representation 2, not 1." -> This should not be a problem for me, as the model code is from this repo; all I added was a training script.

@magehrig
Contributor

Are you correctly augmenting the flow labels when flipping? E.g. horizontal flipping -> invert x channel direction, and vertical flipping -> invert y channel? I think wrong data is the most likely explanation for your train-test difference in performance. Such overfitting is basically impossible with this model on this dataset.
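A minimal sketch of correct flip augmentation for flow labels (assuming channel 0 is the x displacement and channel 1 the y displacement, as in RAFT):

```python
import torch

def flip_flow(img, flow, horizontal=True):
    """Flip an image/event tensor (C, H, W) together with its flow label
    (2, H, W). The flow must be spatially flipped AND have the displacement
    along the flipped axis negated -- flipping it like an RGB image is wrong."""
    if horizontal:
        img = torch.flip(img, dims=[-1])
        flow = torch.flip(flow, dims=[-1])
        flow[0] = -flow[0]  # invert x displacement
    else:
        img = torch.flip(img, dims=[-2])
        flow = torch.flip(flow, dims=[-2])
        flow[1] = -flow[1]  # invert y displacement
    return img, flow
```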

@eugenelyj
Author

@magehrig Thank you, I did indeed mishandle the flow during data augmentation (I flipped the flow image the same way as an RGB image but forgot to negate the flow values).
@HTLeoo I have now reproduced the E-RAFT results; the benchmark numbers are shown below, and it even performs slightly better than the original paper. I guess this is because I also use slight scaling for data augmentation (I use augmentor.py from RAFT with the photometric augmentation removed).
[screenshot: benchmark results]
