Trained your network locally, but my eval. result is not as good as yours #12

Closed
jeongc opened this issue Jun 30, 2017 · 7 comments

@jeongc

jeongc commented Jun 30, 2017

Hi,

Thanks for sharing your wonderful work.
I followed your README and everything seemed to go well, until I evaluated with my locally trained weights and got results different from yours (Ours (K) in Table 1 of your paper). During training I used the same parameters as suggested on this page.

These are the metrics I got:
abs_rel, sq_rel, rms, log_rms, d1_all, a1, a2, a3
0.2621, 3.6171, 8.2036, 0.3806, 0.0000, 0.6577, 0.8520, 0.9258
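(For reference, these columns are the standard Eigen-style single-view depth metrics; below is a minimal NumPy sketch of how they are typically computed. The repo's actual evaluation code may differ in details such as depth capping and scale matching, and d1_all is the KITTI stereo disparity-error metric, which as far as I can tell is simply not computed in this monocular evaluation, hence the 0.0000.)

```python
import numpy as np

def compute_depth_errors(gt, pred):
    """Eigen-style depth metrics; gt and pred are 1-D arrays of valid depths in meters."""
    thresh = np.maximum(gt / pred, pred / gt)
    a1 = (thresh < 1.25).mean()
    a2 = (thresh < 1.25 ** 2).mean()
    a3 = (thresh < 1.25 ** 3).mean()

    abs_rel = np.mean(np.abs(gt - pred) / gt)
    sq_rel = np.mean(((gt - pred) ** 2) / gt)
    rms = np.sqrt(np.mean((gt - pred) ** 2))
    log_rms = np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2))

    return abs_rel, sq_rel, rms, log_rms, a1, a2, a3
```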

I checked and reviewed my procedure, but I cannot find any explanation for the discrepancy.
Let me know what you think.

Regards,
CJ

@tinghuiz
Owner

tinghuiz commented Jul 1, 2017

How long did you train it for?

@jeongc
Author

jeongc commented Jul 5, 2017

Hi,

I ran 20 epochs and tried training locally twice, but got almost the same results both times.

-- Training messages
Epoch: [20] [ 8522/10062] time: 1.0036/it loss: 0.767
Epoch: [20] [ 8622/10062] time: 0.9790/it loss: 0.807
Epoch: [20] [ 8722/10062] time: 0.9521/it loss: 0.773
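(Note: at the 10,062 iterations per epoch shown in the log above, 20 epochs works out to roughly 20 × 10,062 ≈ 201K steps, well past the ~150K-step checkpoint discussed later in this thread.)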

When I ran inference with your released weights or your ".npy" file, I got almost the same results as in your paper.

CJ

@tinghuiz
Owner

tinghuiz commented Jul 5, 2017

This is consistent with my observation too. The system could start overfitting on KITTI if trained for too long. You could use the validation set to decide the stopping point.
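(One minimal way to do this is sketched below, under the assumption that you can run the evaluation on each saved checkpoint against a held-out validation split and that checkpoints are named `model-<step>`; `evaluate_on_validation` is a hypothetical placeholder, not a function in this repo.)

```python
# Hypothetical sketch: pick the checkpoint with the lowest validation error.
# `evaluate_on_validation(ckpt)` stands in for running the eval on a held-out
# validation split and returning, e.g., abs_rel for that checkpoint.
import glob

def pick_best_checkpoint(ckpt_dir, evaluate_on_validation):
    best_ckpt, best_err = None, float("inf")
    for index_file in sorted(glob.glob(ckpt_dir + "/model-*.index")):
        ckpt = index_file[:-len(".index")]   # strip TensorFlow's .index suffix
        err = evaluate_on_validation(ckpt)
        if err < best_err:
            best_ckpt, best_err = ckpt, err
    return best_ckpt, best_err
```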

@tinghuiz tinghuiz closed this as completed Jul 5, 2017
@Yuliang-Zou

I have the same problem. According to the paper (Section 4), I chose the checkpoint at 150,930 steps and tested it (Eigen test split), but the performance is really bad, with almost double the errors.

Do you just resize the image to 128x416 first and then feed it to the network, or do you perform some other pre-processing?
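(For what it's worth, the resize-only version of this would look like the sketch below; whether extra cropping or normalization is needed is exactly the open question here, so treat it as an assumption rather than the repo's confirmed pipeline.)

```python
# Resize-only preprocessing to the 128x416 network input size (assumption, see above).
import numpy as np
from PIL import Image

def load_and_resize(path, height=128, width=416):
    img = Image.open(path).convert("RGB")
    img = img.resize((width, height), Image.BILINEAR)   # PIL expects (width, height)
    return np.asarray(img, dtype=np.float32) / 255.0    # scaling to [0, 1] is also an assumption
```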

@tinghuiz
Owner

The best model I had was trained on Cityscapes for ~100K steps and then fine-tuned on KITTI for ~50K steps. This should have been made clearer in the paper.
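(Schematically, this two-stage schedule is just "train, save a checkpoint, restore it, keep training". A self-contained TF 1.x toy of that pattern is below; the model is reduced to a dummy variable and the paths and step counts are illustrative, but the same `Saver` pattern applies to the full graph.)

```python
# TF 1.x sketch of the two-stage schedule: pre-train, save, restore, fine-tune.
import tensorflow as tf

w = tf.get_variable("w", initializer=0.0)            # toy stand-in for the network weights
loss = (w - 3.0) ** 2
train_op = tf.train.AdamOptimizer(0.1).minimize(loss)
saver = tf.train.Saver()

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(100):                             # "pre-training" stage (e.g. Cityscapes)
        sess.run(train_op)
    ckpt = saver.save(sess, "/tmp/pretrain_demo", global_step=100)

with tf.Session() as sess:
    saver.restore(sess, ckpt)                        # resume from the pre-trained weights
    for _ in range(50):                              # "fine-tuning" stage (e.g. KITTI)
        sess.run(train_op)
```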

@Huang-Jin
Contributor

Huang-Jin commented Oct 8, 2018

I faced a similar problem, and there is a more serious issue with the explainability mask.
The model trained without the mask gives:
abs_rel, sq_rel, rms, log_rms, d1_all, a1, a2, a3
0.2333, 3.4107, 7.4939, 0.3107, 0.0000, 0.6883, 0.8785, 0.9449

while the model trained with the mask (weight 0.2, as in the paper) gives:
abs_rel, sq_rel, rms, log_rms, d1_all, a1, a2, a3
0.2474, 4.0113, 7.7726, 0.3197, 0.0000, 0.6722, 0.8746, 0.9430

Both models were selected at around 200K steps (model_191178).
So this is not consistent with the paper: the results get worse after applying the mask.
Could you explain this?
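(For context, my reading of the objective in the paper, Section 3, is roughly the following, where the 0.2 above corresponds to λ_e and L_reg is the cross-entropy term that keeps the mask from collapsing to zero; this is a paraphrase, not the paper's exact notation.)

```latex
% Paraphrase of the training objective (l indexes scales, s indexes source views,
% \hat{E}_s is the predicted explainability mask):
\mathcal{L}_{final} = \sum_{l}\Big(\mathcal{L}^{l}_{vs}
  + \lambda_s \mathcal{L}^{l}_{smooth}
  + \lambda_e \sum_{s}\mathcal{L}_{reg}(\hat{E}^{l}_{s})\Big),
\qquad
\mathcal{L}_{vs} = \sum_{s}\sum_{p}\hat{E}_s(p)\,\big|I_t(p)-\hat{I}_s(p)\big|
```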

@tinghuiz
Owner

tinghuiz commented Oct 8, 2018

I have gotten reports that if you use a TensorFlow version later than 1.0, the results can get somewhat worse (not sure why). Also, I have made some tweaks to the code that improve the results for the non-mask model (see the Notes section in the README), which may have made the explainability weight no longer optimal.
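(If it helps anyone landing here, the quickest way to rule that factor out is to check which version you are actually running:)

```python
import tensorflow as tf
print(tf.__version__)   # per the comment above, versions later than 1.0 may give somewhat worse results
```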
