
question #127

Open
qiulesun opened this issue Sep 27, 2018 · 17 comments
Labels
update package update to the newest version

Comments

@qiulesun

qiulesun commented Sep 27, 2018

I used the released code without problems before updating. I noticed that the code was recently updated, so I couldn't wait to try it, and `python setup.py install` ran successfully. But when I run `import encoding`, I get the following error.

>>> import encoding
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/media/rudycv/SSD500G/pytorch_dir/PyTorch-Encoding-Updated/PyTorch-Encoding/encoding/__init__.py", line 13, in <module>
    from . import nn, functions, dilated, parallel, utils, models, datasets, optimizer
  File "/media/rudycv/SSD500G/pytorch_dir/PyTorch-Encoding-Updated/PyTorch-Encoding/encoding/nn/__init__.py", line 12, in <module>
    from .encoding import *
  File "/media/rudycv/SSD500G/pytorch_dir/PyTorch-Encoding-Updated/PyTorch-Encoding/encoding/nn/encoding.py", line 18, in <module>
    from ..functions import scaled_l2, aggregate, pairwise_cosine
  File "/media/rudycv/SSD500G/pytorch_dir/PyTorch-Encoding-Updated/PyTorch-Encoding/encoding/functions/__init__.py", line 2, in <module>
    from .encoding import *
  File "/media/rudycv/SSD500G/pytorch_dir/PyTorch-Encoding-Updated/PyTorch-Encoding/encoding/functions/encoding.py", line 14, in <module>
    from .. import lib
  File "/media/rudycv/SSD500G/pytorch_dir/PyTorch-Encoding-Updated/PyTorch-Encoding/encoding/lib/__init__.py", line 25, in <module>
    ], build_directory=gpu_path, verbose=False)
  File "/usr/local/anaconda3/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 494, in load
    with_cuda=with_cuda)
  File "/usr/local/anaconda3/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 670, in _jit_compile
    return _import_module_from_library(name, build_directory)
  File "/usr/local/anaconda3/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 753, in _import_module_from_library
    return imp.load_module(module_name, file, path, description)
  File "/usr/local/anaconda3/lib/python3.6/imp.py", line 243, in load_module
    return load_dynamic(name, filename, file)
  File "/usr/local/anaconda3/lib/python3.6/imp.py", line 343, in load_dynamic
    return _load(spec)
ImportError: /media/rudycv/SSD500G/pytorch_dir/PyTorch-Encoding-Updated/PyTorch-Encoding/encoding/lib/gpu/enclib_gpu.so: undefined symbol: _ZN2at4cuda20getCurrentCUDAStreamEv

@qiulesun qiulesun changed the title _ZN2at4cuda20getCurrentCUDAStreamEv error occur when import encoding Sep 27, 2018
@zhanghang1989
Owner

Could you please pull the most recent version of this package or try:

pip install torch-encoding --upgrade

@zhanghang1989 zhanghang1989 added the update package update to the newest version label Sep 27, 2018
@qiulesun
Author

qiulesun commented Oct 8, 2018

In the table, EncNet_ResNet50_ADE achieves 80.1 pixAcc and 41.5 mIoU (https://hangzhang.org/PyTorch-Encoding/experiments/segmentation.html). However, even with more epochs (160) and a much larger input size (base_size 608, crop_size 576), the corresponding log file shows 78.0 pixAcc and 40.2 mIoU, lower than the results reported in both the table and the paper.

@zhanghang1989
Owner

zhanghang1989 commented Oct 8, 2018

  1. The performance improved after the paper was published.
  2. The log file is out of date.
  3. The validation score in the log file uses only a single size with a center crop, which is meant for monitoring training. For proper multi-size evaluation, please use test.py.
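The idea behind multi-size evaluation can be sketched as follows. This is a minimal NumPy illustration under my own assumptions — `resize_nn`, `multi_scale_predict`, and the scale set are illustrative stand-ins, not the repo's test.py:

```python
import numpy as np

def resize_nn(arr, h, w):
    """Nearest-neighbour resize for a (C, H, W) array (stand-in for bilinear)."""
    C, H, W = arr.shape
    rows = np.arange(h) * H // h
    cols = np.arange(w) * W // w
    return arr[:, rows][:, :, cols]

def multi_scale_predict(image, predict, scales=(0.75, 1.0, 1.25)):
    """Run `predict` at several input scales and average the class maps
    after resizing them back to the original resolution."""
    _, H, W = image.shape
    acc = None
    for s in scales:
        h, w = max(1, int(H * s)), max(1, int(W * s))
        prob = predict(resize_nn(image, h, w))   # (num_classes, h, w)
        prob = resize_nn(prob, H, W)             # back to original size
        acc = prob if acc is None else acc + prob
    return acc / len(scales)
```

A single-size center-crop score (as in the log file) skips this averaging, which is why it typically comes out lower than the multi-size numbers.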

@zhanghang1989
Owner

For the command to reproduce the results, please click the cmd button.

@qiulesun
Author

qiulesun commented Dec 4, 2018

the effectiveness of SyncBN
I assume you have systematically evaluated the effectiveness of the proposed SyncBN. Could you share ablation results comparing it with standard BN (or Group Norm, if possible) on ImageNet-2012 or the segmentation datasets you have used?

@zhanghang1989
Owner

SyncBN is different from standard BN or Group Norm, because those methods DO NOT compute statistics across GPUs. I don't think SyncBN is helpful for batch sizes > 16, such as in ImageNet-2012 training.
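To make the distinction concrete, here is a minimal NumPy sketch (my own illustration, not the repo's CUDA implementation): standard BN normalises each GPU's sub-batch with its own statistics, while SyncBN all-reduces the count, sum, and squared sum so every device normalises with the global statistics:

```python
import numpy as np

def per_device_bn(shards, eps=1e-5):
    """Standard BN under data parallelism: each 'GPU' shard is
    normalised with its own mean and variance."""
    return [(x - x.mean()) / np.sqrt(x.var() + eps) for x in shards]

def sync_bn(shards, eps=1e-5):
    """SyncBN: all-reduce count, sum, and sum of squares across devices,
    then normalise every shard with the global mean and variance."""
    n = sum(x.size for x in shards)
    s = sum(x.sum() for x in shards)
    ss = sum((x ** 2).sum() for x in shards)
    mean = s / n
    var = ss / n - mean ** 2
    return [(x - mean) / np.sqrt(var + eps) for x in shards]

# A batch of 4 split across two "GPUs":
full = np.array([1.0, 2.0, 3.0, 4.0])
shards = [full[:2], full[2:]]
# sync_bn(shards) matches BN over the full batch; per_device_bn(shards) does not.
```

This is why SyncBN matters for segmentation, where per-GPU sub-batches are tiny, and matters less once the per-GPU batch is already large.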

@qiulesun
Author

qiulesun commented Dec 5, 2018

ablation study
SyncBN is outstanding work and I understand its underlying mechanism. Unfortunately, however, I can't see from the CVPR18 paper how much SyncBN helps. Given the same batch size (e.g., 16), do you have ablation results illustrating the performance of SyncBN compared with other BN variants on a segmentation dataset?

@zhanghang1989
Owner

zhanghang1989 commented Dec 6, 2018

Hi @qiulesun, thanks for your interest in this work.

I do have a table in the paper supplementary material for benchmarking SyncBN on Pascal Context dataset:

| method | BN | pixAcc | mIoU |
| --- | --- | --- | --- |
| FCN (4 GPUs) | standard BN | 47.7 | 20.8 |
| FCN | fixed BN | 72.5 | 40.5 |
| FCN | sync BN | 73.4 | 41.0 |

Fixed BN means using the ImageNet-pretrained mean and variance. Note that fixed BN won't work for the ADE20K dataset due to the large learning rate.
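As a concrete reading of that distinction (a sketch under my own naming, not the repo's code): training-mode BN draws statistics from the current batch, while fixed BN normalises with the pretrained running statistics:

```python
import numpy as np

def bn_batch_stats(x, eps=1e-5):
    """Training-mode BN: statistics come from the current batch."""
    return (x - x.mean()) / np.sqrt(x.var() + eps)

def bn_fixed(x, running_mean, running_var, eps=1e-5):
    """Fixed BN: statistics are frozen at the ImageNet-pretrained values,
    so the layer behaves like a constant affine transform during training."""
    return (x - running_mean) / np.sqrt(running_var + eps)
```

Because fixed BN never adapts its statistics, a large learning rate can push activations far from the frozen mean/variance, which is consistent with the note that it fails on ADE20K.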

@qiulesun
Author

qiulesun commented Dec 6, 2018

supplementary material
The compared results are meaningful. But the supplementary material cannot be found on your homepage or in the CVPR 2018 open access site (http://openaccess.thecvf.com/CVPR2018_search.py). Would you mind sharing a link to download it?

@zhanghang1989
Owner

The supplementary material, consisting of some basic information and experimental studies, was provided during the double-blind review, but it was not included in the final copy because the writing was not polished. I can send you a copy if you provide an email address.

@qiulesun
Author

qiulesun commented Dec 8, 2018

Thank you! My email address is qiulesun@163.com. This paper (http://bzhou.ie.cuhk.edu.hk/publication/ADE20K_IJCV.pdf) also did an ablation study on various normalizations, i.e., synchronized BN, unsynchronized BN, and frozen BN.

@qiulesun
Author

qiulesun commented Dec 28, 2018

setting of workers
Sorry to bother you again. I'm not sure how to set the value of workers in the option script.

  1. Should workers equal the batch size, the number of CPU cores in my machine, or the number of GPUs? Is there a guideline for choosing workers?
  2. Do you plan to report results on the Cityscapes dataset?

@zhanghang1989
Owner

  1. I usually set the number of workers to 16 (the same as the batch size), but it also depends on your CPU.
  2. I will release training and testing on Cityscapes later.

@qiulesun
Author

qiulesun commented Jan 4, 2019

trivial question

  1. I want to get results on PContext with background, i.e., 60 classes. Where do I modify the code to achieve that? Is changing NUM_CLASS from 59 to 60 enough? (https://github.com/zhanghang1989/PyTorch-Encoding/blob/master/encoding/datasets/pcontext.py#L19)

@zhanghang1989
Owner

The background IoU is considered to be 0, so the mIoU over 60 classes equals mIoU_59 * 59 / 60.
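A sketch of that arithmetic (the helper name is mine, not the repo's):

```python
def miou_with_background(miou_59, num_fg=59, num_total=60):
    """If the background IoU is defined as 0, the 60-class mean is the
    59-class mean rescaled: (sum of 59 IoUs + 0) / 60."""
    return miou_59 * num_fg / num_total

# e.g. the table's 41.5 mIoU over 59 classes becomes ~40.8 over 60 classes.
```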

@qiulesun
Author

qiulesun commented Jan 27, 2019

This is an awesome and actively developed repo, containing many SOTA methods.
Question 1
For the last question, to compute mIoU with background (i.e., 60 classes), one first needs the mIoU without background (mIoU_59), as computed in the repo; then mIoU_60 is directly equal to mIoU_59 * 59 / 60 and will be slightly lower than mIoU_59. Do I understand correctly?
Question 2
For multi-size evaluation, would you consider applying dense crops on the feature map rather than on the input image? It drastically reduces computational overhead and may further boost performance.
Question 3
Do you consider using a gradient-accumulation strategy to update parameters when GPU memory is limited (small batch size)?
Question 4
Your work appeals to me. When will you release your CVPR 2019 paper, Co-occurrent Features in Semantic Segmentation?

Thank you for your consideration and I am looking forward to your reply.

@Yuxiang1995

I met the same error as you. How did you solve it? Just getting the latest torch-encoding doesn't fix it.
