Cannot reproduce paper results #3

Closed
rig8f opened this issue Jun 1, 2019 · 7 comments

rig8f commented Jun 1, 2019

I've gone through the paper and run the code for MT-PNet with the commands specified in the README and the same configuration as in s3dis.json, but I cannot obtain the results reported in the paper.

After 100 epochs of training, evaluation gives me

"mean_accuracy": 0.4461
"mean_IoU": 0.3441
"mean_precision": 0.2504
"mean_recall": 0.2118

and going on to 200 epochs results in

"mean_accuracy": 0.4617
"mean_IoU": 0.3600
"mean_precision": 0.2795
"mean_recall": 0.2330

Am I doing something wrong? Do I need to change some parts or adjust parameters?
Let me know if you need more information.
Thank you!

pqhieu (Owner) commented Jun 4, 2019

Hi @rig8f,

Sorry for the late reply.
Could you attach the train.log and eval.json files from your log folder?

Thank you.

rig8f (Author) commented Jun 4, 2019

Here are the requested logs for both tests (I appended .log to the eval.json files due to GitHub file-type restrictions). Let me know if you need more files or information, or if you have any suggestions.

Thank you.

train100e.log
eval100e.json.log
train200e.log
eval200e.json.log

pqhieu (Owner) commented Jun 4, 2019

Did you change any parameters in the configuration (batch size, learning rate, etc.)? Your per-class accuracy is quite low compared to the numbers I usually get after 100 epochs.

I just retrained another model from scratch. Attached here is the log folder for reference.

You can try saving additional checkpoints instead of only the best model. In my experience, the best results are often achieved after ~40 epochs.
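
Something along these lines in the training loop would do it (just a rough sketch, not the exact code in this repo; adjust the path format and save interval to your setup):

```python
import os
import torch

def save_checkpoint(model, optimizer, epoch, logdir, every=10):
    """Save a snapshot every `every` epochs in addition to the best model."""
    if (epoch + 1) % every == 0:
        path = os.path.join(logdir, 'checkpoint_{:03d}.pth'.format(epoch + 1))
        torch.save({
            'epoch': epoch + 1,
            'model_state_dict': model.state_dict(),
            'optimizer_state_dict': optimizer.state_dict(),
        }, path)
```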

Hope this helps.

rig8f (Author) commented Jun 5, 2019

Yes, it definitely helped me understand the problem. The numbers I mentioned in the first comment come from a small modification to the final part of eval.py, where I also save the mean of each metric (simply np.mean(accu), for example, the same value that is printed to stdout).
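
Roughly, my addition at the end of eval.py looked like this (a sketch; apart from accu, the argument names here are just placeholders for the per-class arrays the script computes):

```python
import json
import numpy as np

def save_mean_metrics(accu, iou, prec, recall, path='eval.json'):
    # Dump the mean of each per-class metric alongside what eval.py
    # already prints to stdout. Names other than `accu` are illustrative.
    results = {
        'mean_accuracy': float(np.mean(accu)),
        'mean_IoU': float(np.mean(iou)),
        'mean_precision': float(np.mean(prec)),
        'mean_recall': float(np.mean(recall)),
    }
    with open(path, 'w') as f:
        json.dump(results, f, indent=2)
    return results
```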

But if we look at the per-class accuracy values obtained by training from scratch with the same parameters as your s3dis.json (see the eval.json file), they are comparable: some classes are detected much better than others, and the weaker ones pull the mean down, as I should have figured out well before opening this issue.

Anyway, I see now that you have updated the scripts to include the overall accuracy metric and several other interesting things. Thank you!

@ZhengdiYu

I've gone through the paper and run the code for MT-PNet with the commands specified in the README and the same configuration as in s3dis.json, but I cannot obtain the results reported in the paper. […] Am I doing something wrong? Do I need to change some parts or adjust parameters?

@rig8f May I ask where the inconsistency is? Did I misunderstand something?

As far as I can see, the S3DIS result reported in the paper is:

mAP: Ours (MT-PNet) 24.9, with per-class values 71.5, 78.4, 28.3, 24.4, 3.5, 12.1, 36.2, 10, 12.6, 34.5, 12.8.

And the "mean_precision" you get is quite similar.

rig8f (Author) commented Feb 20, 2021

@ZhengdiYu No, it was my fault, as I explained in this previous comment.

@ZhengdiYu

@ZhengdiYu No, it was my fault, as I explained in this previous comment.

Thank you for your quick reply.
By the way, I don't really understand the results; I hope you can help me as well.

  1. So does "mean_precision" stand for mAP in the paper? But in the eval.json that the author gave you, it says:
       "mean_recall": 0.26129484999175207,
       "mean_precision": 0.3318457691464621,
       "mean_IoU": 0.4134808521564693,
       "mean_accuracy": 0.49937982103730627
     The mean precision here is 33%, which is higher than the result reported in the paper (24.9), so I guess maybe it's not mAP?

  2. There are 11 classes reported in the paper, but there are 13 classes in the results log file?

  3. Does the "overall accuracy" in eval.py stand for the "mAcc" for semantic segmentation in the paper?
