Thank you for this implementation #100

Open
chenyzh28 opened this issue Nov 24, 2020 · 6 comments

Comments


chenyzh28 commented Nov 24, 2020

I ran the training code and tested the performance of BiSeNet-v1 and BiSeNet-v2. The single-scale mIoU is 75.30% for v1 and 74.18% for v2. For the FPS, I did not convert the model to TensorRT; I used demo.py directly and timed 1000 iterations. I also removed the auxiliary segmentation heads for computational efficiency. The resulting FPS is 58.32 (768-1536) for v1 and 115 (512-1024) for v2, which is a little lower than the figures reported in the papers.
Thanks @CoinCheung for your work, which is very enlightening.
My implementation environment is as follows:
Python 3.7
torch 1.6.0
torchvision 0.7.0
CUDA 10.1
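
For readers who want to reproduce this kind of FPS measurement, here is a minimal timing sketch. It is not the repository's demo.py; `measure_fps` and its parameters are hypothetical names chosen for illustration. The idea is to exclude a few warm-up runs and to synchronize before reading the clock, since GPU kernels are launched asynchronously.

```python
import time


def measure_fps(step, n_iters=1000, n_warmup=10, sync=None):
    """Return iterations per second for `step`, a zero-argument callable.

    `sync` is an optional zero-argument callable run before each clock read
    so that asynchronous work (e.g. queued GPU kernels) has finished.
    """
    for _ in range(n_warmup):  # warm-up runs are excluded from the timing
        step()
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(n_iters):
        step()
    if sync is not None:
        sync()
    elapsed = time.perf_counter() - start
    return n_iters / elapsed


# Stand-in CPU workload so the sketch runs without a GPU.
calls = []
fps = measure_fps(lambda: calls.append(sum(range(1000))), n_iters=1000)
print(f"{fps:.1f} iterations/s")

# With PyTorch one might pass something like this (`model` and `x` are
# assumed to exist and to live on the GPU):
#   measure_fps(lambda: model(x), sync=torch.cuda.synchronize)
```

The number obtained this way depends heavily on the GPU model and the input resolution, so both should be reported alongside the FPS.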

@CoinCheung
Owner

Thanks for sharing this!

Please note that, for compatibility with TensorRT 7.0, I replaced interpolation with a pixel-shuffle operation, which requires the preceding conv layers to have more filters. This adds parameters and slows the model down a little. If running in Python is enough for you, you can remove these pixel-shuffles and switch back to interpolation, which in theory would make the model more lightweight.
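
To illustrate why pixel-shuffle needs more filters: a pixel-shuffle with upscale factor r rearranges C·r² input channels into C output channels, so the conv in front of it must produce r² times as many channels as a conv followed by interpolation. The rough count below uses illustrative numbers (128 input channels, 19 classes, 8x upsampling, 1x1 conv) that are assumptions, not values read from this repository, and `conv_params` / `head_params` are hypothetical helpers.

```python
def conv_params(c_in, c_out, k, bias=True):
    """Number of weights (and biases) in a k x k convolution."""
    return k * k * c_in * c_out + (c_out if bias else 0)


def head_params(c_in, n_classes, up_factor, k=1, use_pixel_shuffle=True):
    """Parameters of the final conv of an output head.

    With pixel-shuffle the conv must emit n_classes * up_factor**2 channels;
    with interpolation it only needs n_classes channels.
    """
    c_out = n_classes * up_factor ** 2 if use_pixel_shuffle else n_classes
    return conv_params(c_in, c_out, k)


# Illustrative (assumed) sizes, not taken from the repository.
with_shuffle = head_params(128, 19, 8, use_pixel_shuffle=True)
with_interp = head_params(128, 19, 8, use_pixel_shuffle=False)
print(with_shuffle, with_interp, with_shuffle // with_interp)
```

Under these assumptions the final conv is 64 times larger with pixel-shuffle, although it is only one layer of the whole network, so the overall slowdown is much smaller than that ratio.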

@chenyzh28
Author

Thank you! I had noticed that.
Recently, though, I ran into another problem: I can't select which GPUs to use, which is very odd.
For example, when I run CUDA_VISIBLE_DEVICES=4,5 python -m torch.distributed.launch --nproc_per_node=2 tools/train.py --model bisenetv2, the program still runs on GPUs 6 and 7.
I also tried setting os.environ["CUDA_VISIBLE_DEVICES"] = '4,5', but the problem persists.
This has bothered me for two days. I don't really understand how torch.distributed.launch works, so please let me know if you spot anything wrong. Thanks!

@chenyzh28
Author

It was an import problem. I have solved it.
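
The thread does not record what the import problem was. One common way for CUDA_VISIBLE_DEVICES to be ignored is that it is set from Python only after some imported module has already initialized CUDA, because the CUDA runtime reads the variable once, at initialization. The sketch below shows the usual precaution of setting it before importing torch; `set_visible_gpus` is a hypothetical helper, not part of this repository or of PyTorch.

```python
import os
import sys


def set_visible_gpus(gpu_ids):
    """Restrict this process to the given physical GPU ids (hypothetical helper).

    Must run before CUDA is initialized, so call it before importing torch
    or any module that imports torch.
    """
    if "torch" in sys.modules:
        # torch is already imported; if CUDA was initialized too,
        # changing the variable now has no effect.
        print("warning: torch already imported, setting may be ignored",
              file=sys.stderr)
    value = ",".join(str(i) for i in gpu_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


set_visible_gpus([4, 5])

# Import torch only after the variable is set, for example:
#   import torch
# Inside the process the two visible GPUs are then numbered 0 and 1,
# not 4 and 5.
```

When the variable is given on the command line, as in the command quoted above, it is already in the environment of the launched processes, so a script that later overwrites os.environ["CUDA_VISIBLE_DEVICES"] or selects devices by hard-coded index is another thing worth checking.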

@CoinCheung
Owner

Good to know that you solved your problem. I have left this issue open so that other people can see your performance test results.

@huohuotm

May I ask which GPU you used to get "115 (512-1024) for v2"?


Asuna88 commented Jul 30, 2024

Which GPU did you use to get 115 FPS?
A 1080 Ti? Or a 3070 Ti?

When reporting FPS, please state the GPU model; that makes the number more meaningful.
Thanks~
