
train error: AttributeError: 'tuple' object has no attribute 'log_softmax' #6

Closed
lxy5513 opened this issue Jun 7, 2021 · 5 comments

lxy5513 commented Jun 7, 2021

Hi, thanks for your great work. When I run the training script, an error occurs: AttributeError: 'tuple' object has no attribute 'log_softmax'

with amp_autocast():
    output = model(input)
    loss = loss_fn(output, target)  # error occurs here

and the loss function is train_loss_fn = LabelSmoothingCrossEntropy(smoothing=0.0).cuda()
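
For reference, a minimal sketch that reproduces the same error, assuming timm's LabelSmoothingCrossEntropy and hypothetical output shapes (timm's log_softmax call fails as soon as the model returns a tuple instead of a tensor):

    import torch
    from timm.loss import LabelSmoothingCrossEntropy

    loss_fn = LabelSmoothingCrossEntropy(smoothing=0.0)
    target = torch.tensor([3, 7])

    # Hypothetical stand-in for the model output: an LV-ViT model trained for
    # token labeling returns a tuple (class_logits, aux_token_logits) rather
    # than a single logits tensor.
    output = (torch.randn(2, 10), torch.randn(2, 196, 10))

    loss = loss_fn(output, target)
    # AttributeError: 'tuple' object has no attribute 'log_softmax'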

By the way, could you please tell me why we need to specify smoothing=0.0?

zihangJiang (Owner) commented Jun 7, 2021

Hi,
Can you post your training script? The label smoothing is handled in Mixup & Cutmix here

TokenLabeling/main.py

Lines 484 to 488 in 9f71792

if mixup_active:
    mixup_args = dict(
        mixup_alpha=args.mixup, cutmix_alpha=args.cutmix, cutmix_minmax=args.cutmix_minmax,
        prob=args.mixup_prob, switch_prob=args.mixup_switch_prob, mode=args.mixup_mode,
        label_smoothing=args.smoothing, num_classes=args.num_classes)

and here

TokenLabeling/main.py

Lines 678 to 690 in 9f71792

    if args.token_label and args.token_label_data:
        target = create_token_label_target(target, num_classes=args.num_classes,
                                           smoothing=args.smoothing, label_size=args.token_label_size)
    if len(target.shape) == 1:
        target = create_token_label_target(target, num_classes=args.num_classes,
                                           smoothing=args.smoothing)
else:
    if args.token_label and args.token_label_data and not loader.mixup_enabled:
        target = create_token_label_target(target, num_classes=args.num_classes,
                                           smoothing=args.smoothing, label_size=args.token_label_size)
    if len(target.shape) == 1:
        target = create_token_label_target(target, num_classes=args.num_classes,
                                           smoothing=args.smoothing)
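
In other words, the smoothing is folded into the soft target tensors themselves (by Mixup/CutMix or by create_token_label_target) rather than applied inside the loss, which is why the loss can be constructed with smoothing=0.0. A minimal sketch of the Mixup side, assuming timm's Mixup API and illustrative hyperparameter values:

    import torch
    from timm.data import Mixup

    # label_smoothing here is what makes smoothing=0.0 safe in the loss:
    # the targets produced below are already smoothed soft distributions.
    mixup_fn = Mixup(mixup_alpha=0.8, cutmix_alpha=1.0,
                     label_smoothing=0.1, num_classes=1000)

    images = torch.randn(8, 3, 224, 224)       # batch size must be even
    labels = torch.randint(0, 1000, (8,))
    mixed_images, soft_targets = mixup_fn(images, labels)
    print(soft_targets.shape)  # torch.Size([8, 1000]): soft, smoothed targets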

If you train without token labeling, we suggest you add --mixup 0.8 or --cutmix 1.0 as regularization, and use an LV-ViT model. Models like lvvit_s return all tokens by default, which will cause an error if you train without token labeling. An illustrative command for this case is sketched below.
If you train with token labeling, please add the --token-label flag.
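
For the no-token-labeling case, a command along these lines might work (dataset path hypothetical; the flags are the ones suggested above):

    python main.py /path/to/your/dataset --model lvvit_s -b 64 \
        --apex-amp --mixup 0.8 --cutmix 1.0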

lxy5513 (Author) commented Jun 8, 2021

@zihangJiang Thanks for your response.
My training script is here:
python main.py /home/liuxingyu/data/patrol/comfort_classification/nursery_v16 --model lvvit_m -b 64 --apex-amp --img-size 384 --drop-path 0.2 --token-label-size 24 --model-ema --finetune lvvit_m-56M-384-85.4.pth.tar --token-label-data ''

I train without token labeling because I have not generated token labels for my custom dataset.

"Models like lvvit_s return all tokens by default, which will cause an error if you train without token labeling"

Does this mean that if I want to train without token labeling, I can't use the pretrained model?

zihangJiang (Owner) commented

We haven't tested on other datasets yet, but you can of course use the pre-trained model. I think you can add the --token-label --dense-weight 0 flags. This can work without token label data.

However, some other bugs may occur. You can wait for our further update supporting transfer learning.

lxy5513 (Author) commented Jun 8, 2021

Well, thanks. Looking forward to the further update.

zihangJiang (Owner) commented

Hi @lxy5513,

We've updated and tested support for transfer learning. The dataset folder structure should be the same as the ImageNet structure (https://github.com/zihangJiang/TokenLabeling#requirements), with train and val splits. You can clone the main branch, specify the --num-classes flag, and add --token-label --dense-weight 0 for fine-tuning.
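
For example, adapting the script posted earlier in this thread, a fine-tuning command might look like (dataset path and --num-classes value hypothetical):

    python main.py /path/to/custom_dataset --model lvvit_m -b 64 --apex-amp \
        --img-size 384 --drop-path 0.2 --model-ema \
        --finetune lvvit_m-56M-384-85.4.pth.tar \
        --num-classes 10 --token-label --dense-weight 0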

Let me know if you have any further questions.
