Why can I still train the model after removing JPU? #47

Closed
E18301194 opened this issue Sep 17, 2019 · 17 comments

Comments

@E18301194

Why does the code still run without error after I delete the JPU module (/FastFCN/encoding/nn/customize.py)? I can still train the model.
This is my command (I did load the JPU module): `CUDA_VISIBLE_DEVICES=4,5,6,7 python train.py --dataset pcontext --model encnet --jpu --aux --se-loss --backbone resnet101 --checkname encnet_res101_pcontext`

@wuhuikai
Owner

Have you reinstalled FastFCN with `python setup.py install`?

@E18301194
Author

Thank you very much.
When I reinstalled with `python setup.py install`, FastFCN raised an error.
So what does `python setup.py install` actually do?

@wuhuikai
Owner

It does something like `pip install`: it copies all the source code into the Python library directory.
Then the scripts in the experiment folder can import the corresponding library, such as `encoding`.
Thus, none of your modifications in the `encoding` folder take effect until you reinstall it.
If you don't want to reinstall every time, just run `python setup.py develop`.
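
To see which copy of a package Python actually imports, one option is to inspect its module spec. The sketch below is only an illustration: the helper names `module_origin` and `is_installed_copy` are hypothetical, and treating a path that contains `site-packages` or `dist-packages` as "the installed copy" is an assumed heuristic.

```python
import importlib.util
from pathlib import Path


def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    # Missing modules give no spec; built-in modules have no file location.
    if spec is None or not spec.has_location:
        return None
    return Path(spec.origin).resolve()


def is_installed_copy(name):
    """Heuristic (assumption): installed copies live under site-/dist-packages."""
    origin = module_origin(name)
    return origin is not None and any(
        part in ("site-packages", "dist-packages") for part in origin.parts
    )


# Prints None if the package is not importable in this environment.
print(module_origin("encoding"))
```

If the printed path points into the library directory rather than your checkout, edits made in the checkout are not the code being run.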

@E18301194
Author

Thank you very much for your helpful response. Now I have another question. Following the paper, I set the parameters as follows: `CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --dataset ade20k --model encnet --jpu --aux --se-loss --backbone resnet101 --checkname encnet_res101_ade20k_train`
But I can't reach the accuracy reported in the paper on ADE20K (val and test). In the paper, the val result with ResNet-101 is pixAcc 80.99 and mIoU 44.34.
Can you tell me how to set the hyperparameters to reproduce the accuracy in the paper?

@wuhuikai
Owner

Following the instructions in README.md should lead you to the performance in our paper.

@E18301194
Author

```python
import torch
from torch.nn import BCELoss, CrossEntropyLoss


class SegmentationLosses(CrossEntropyLoss):
    """2D Cross Entropy Loss with Auxiliary Loss"""
    def __init__(self, se_loss=False, se_weight=0.2, nclass=-1,
                 aux=False, aux_weight=0.4, weight=None,
                 size_average=True, ignore_index=-1):
        super(SegmentationLosses, self).__init__(weight, size_average, ignore_index)
        self.se_loss = se_loss
        self.aux = aux
        self.nclass = nclass
        self.se_weight = se_weight
        self.aux_weight = aux_weight
        self.bceloss = BCELoss(weight, size_average)

    # Note: _get_batch_label_vector is used below but is not part of this quote.
    def forward(self, *inputs):
        if not self.se_loss and not self.aux:
            # Plain cross entropy on (pred, target).
            return super(SegmentationLosses, self).forward(*inputs)
        elif not self.se_loss:
            # Main loss plus weighted auxiliary loss.
            pred1, pred2, target = tuple(inputs)
            loss1 = super(SegmentationLosses, self).forward(pred1, target)
            loss2 = super(SegmentationLosses, self).forward(pred2, target)
            return loss1 + self.aux_weight * loss2
        elif not self.aux:
            # Main loss plus weighted SE loss.
            pred, se_pred, target = tuple(inputs)
            se_target = self._get_batch_label_vector(target, nclass=self.nclass).type_as(pred)
            loss1 = super(SegmentationLosses, self).forward(pred, target)
            loss2 = self.bceloss(torch.sigmoid(se_pred), se_target)
            return loss1 + self.se_weight * loss2
        else:
            # Main loss plus weighted auxiliary loss plus weighted SE loss.
            pred1, se_pred, pred2, target = tuple(inputs)
            se_target = self._get_batch_label_vector(target, nclass=self.nclass).type_as(pred1)
            loss1 = super(SegmentationLosses, self).forward(pred1, target)
            loss2 = super(SegmentationLosses, self).forward(pred2, target)
            loss3 = self.bceloss(torch.sigmoid(se_pred), se_target)
            return loss1 + self.aux_weight * loss2 + self.se_weight * loss3
```

Thank you for your reply. Regarding the SE loss here, I don't know what pred1, pred2 and se_pred are exactly, and I couldn't find where they are generated.
I read the paper carefully but still did not understand this. Can you tell me what they mean?

@wuhuikai
Owner

wuhuikai commented Oct 8, 2019

Please read the EncNet paper to understand what SE loss is. See here for all the outputs.
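
As a rough illustration of the quoted loss (an assumption based on reading that code, not something confirmed in this thread): `pred1` appears to be the main segmentation output, `pred2` the auxiliary output, and `se_pred` a per-image vector of class-presence scores that is compared against a 0/1 target derived from the label map. The dependency-free sketch below uses hypothetical helper names and nested lists in place of tensors to show how such a target and the weighted sum could be formed.

```python
def batch_label_vector(target, nclass):
    """For each label map, mark which classes occur at least once.

    target: list of B label maps, each a list of rows of integer class ids.
    Returns a B x nclass list of 0.0/1.0 values.
    """
    vectors = []
    for label_map in target:
        present = [0.0] * nclass
        for row in label_map:
            for class_id in row:
                # Ids outside [0, nclass), such as ignore_index=-1, are skipped.
                if 0 <= class_id < nclass:
                    present[class_id] = 1.0
        vectors.append(present)
    return vectors


def combine_losses(loss1, loss2, loss3, aux_weight=0.4, se_weight=0.2):
    """Weighted sum used when both the auxiliary and SE losses are enabled."""
    return loss1 + aux_weight * loss2 + se_weight * loss3


# One 2x2 label map containing classes 0 and 1 plus one ignored pixel.
se_target = batch_label_vector([[[0, 1], [1, -1]]], nclass=4)
total = combine_losses(1.0, 0.5, 0.25)
```

The default weights mirror `aux_weight=0.4` and `se_weight=0.2` from the quoted constructor.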

@E18301194
Author

Thanks for your code. I want to add a module to your code (FastFCN), but my module is written for PyTorch 0.4.0. I don't know how to adapt FastFCN to PyTorch 0.4.0. Thank you very much for your reply.

@wuhuikai
Owner

You can directly plug your module (0.4.0) into FastFCN (1.x) without any modification. PyTorch 1.x can run 0.4.0 code with few changes.

@E18301194
Author

Thank you very much for your reply. What I want to change is the convolution kernel, so I need to use torch.utils.ffi. Our code for PyTorch 0.4.0 is written in C, while extensions for PyTorch 1.0 are written in C++, so the two versions are not compatible. I could port my module to version 1.0, but I'm not familiar with C++, so I would rather change your code to work with 0.4.0. Could you help me, please? Looking forward to your reply.

@wuhuikai
Owner

If you only want to use the JPU module, I think no modification is needed. However, if you want to use sync_bn, adapting it to 0.4.0 would take a lot of work.

@E18301194
Author

Could you explain what sync_bn is? How does it differ from regular BN, and why do you use it?

@wuhuikai
Owner

BN computes running_mean and running_std per GPU, while sync_bn computes them across all GPUs. Thus, sync_bn gives more stable statistics.
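
A small numeric sketch of that difference, under stated assumptions: plain Python lists stand in for the activations on each GPU, and the helper names `batch_stats` and `synced_stats` are hypothetical. It is not the sync_bn implementation used in this repository; it only shows why statistics reduced across devices differ from per-device ones.

```python
def batch_stats(values):
    """Mean and (biased) variance of the values seen on a single device."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, var


def synced_stats(shards):
    """Mean and variance over all devices.

    Each device contributes its count, sum and sum of squares; these are
    added up (the "sync" step) before the statistics are computed.
    """
    n = sum(len(s) for s in shards)
    total = sum(sum(s) for s in shards)
    total_sq = sum(sum(v * v for v in s) for s in shards)
    mean = total / n
    var = total_sq / n - mean ** 2
    return mean, var


# Two "GPUs", each holding a tiny slice of the batch.
shards = [[1.0, 2.0], [5.0, 6.0]]
per_gpu = [batch_stats(s) for s in shards]
synced = synced_stats(shards)
print(per_gpu, synced)
```

With only a few samples per device, the per-device means (1.5 and 5.5 here) can sit far from the mean of the whole batch (3.5), which is the instability that synchronizing is meant to reduce.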

@E18301194
Author

Hi, I don't see the labels in the pcontext dataset. How are the labels for pcontext loaded? I see a trainval_merged.json file that I don't quite understand.

@wuhuikai
Owner

See here.

@E18301194
Author

Your work has helped me immensely. If I want to increase speed, how should I go about it? My current idea is to replace ResNet with ShuffleNetV2. Do you have any other suggestions? I hope to get your reply.

@wuhuikai
Owner

wuhuikai commented Jun 2, 2020

One simple method is to prune the model.
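
The thread does not say which pruning scheme is meant. As one possible reading, the sketch below ranks convolution filters by L1 norm and keeps the largest ones; the function names, the `keep_ratio` parameter and the toy weights are all hypothetical, and plain lists stand in for weight tensors.

```python
def l1_norm(filter_weights):
    """Sum of absolute weights of one flattened filter."""
    return sum(abs(w) for w in filter_weights)


def prune_filters(filters, keep_ratio):
    """Keep the filters with the largest L1 norm.

    filters: list of flattened weight lists, one per output channel.
    Returns (kept_indices, kept_filters), with indices in original order.
    """
    n_keep = max(1, round(len(filters) * keep_ratio))
    ranked = sorted(range(len(filters)),
                    key=lambda i: l1_norm(filters[i]), reverse=True)
    kept = sorted(ranked[:n_keep])
    return kept, [filters[i] for i in kept]


# Four toy filters; the two with near-zero weights are dropped.
filters = [[0.9, -0.8], [0.01, 0.02], [-0.5, 0.4], [0.0, 0.03]]
kept, pruned = prune_filters(filters, keep_ratio=0.5)
print(kept)
```

Removing whole filters shrinks the layer itself, so it can reduce compute without sparse kernels; a pruned network normally needs fine-tuning afterwards to recover accuracy.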
