
TypeError: __init__() got an unexpected keyword argument 'in_chans' #24

Closed
Celestial-Bai opened this issue Nov 25, 2021 · 7 comments

Comments

@Celestial-Bai

Sharing an issue: at first I used the requirements.txt updated on 11.22, and it raised the following error:

File"/MAE-pytorch/modeling_pretrain.py", line 319, in pretrain_mae_base_patch16_224
**kwargs)
**kwargs)
TypeError: init() got an unexpected keyword argument 'in_chans'

I found that it seems to be caused by timm==0.3.2; upgrading to 0.4.12 solves the problem.
I tested on both V100 (CentOS) and A100 (Ubuntu), and the issue exists on both.
Uninstalling timm also seems to let it run... This is my first time reading CV code, so I don't know much yet.
By the way, did you use a batch size of 64 on the V100s?
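For anyone who hits the same error before the fix lands, here is a minimal, hypothetical sketch (not part of the repository) that checks the installed timm version up front and fails with a clear message. The 0.4.12 threshold is simply the version that worked above, and the sketch assumes the packaging package is available:

# Hypothetical version guard, not from modeling_pretrain.py: fail early with a
# clear message instead of the opaque 'in_chans' TypeError when timm is too old.
import timm
from packaging import version  # assumes the 'packaging' package is installed

MIN_TIMM = "0.4.12"  # version reported to work in this thread

if version.parse(timm.__version__) < version.parse(MIN_TIMM):
    raise ImportError(
        f"timm=={timm.__version__} is too old; "
        f"please install timm>={MIN_TIMM} (e.g. `pip install timm==0.4.12`) to avoid "
        "\"__init__() got an unexpected keyword argument 'in_chans'\"."
    )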

@pengzhiliang
Owner

OK, thank you!
I will fix it in a minute.
Total batch size: 4096.

@Celestial-Bai
Author

So if we use the command you provided and run it on 8 GPUs, we can reach a batch size of 4096, right? Roughly 64*8*8 = 4096?

@pengzhiliang
Owner

# Set the path to save checkpoints
OUTPUT_DIR='output/pretrain_mae_base_patch16_224'
# path to imagenet-1k train set
DATA_PATH='/path/to/ImageNet_ILSVRC2012/train'


# batch_size can be adjusted according to the graphics card; set --epochs (1600 here) as you need
OMP_NUM_THREADS=1 python -m torch.distributed.launch --nproc_per_node=8 run_mae_pretraining.py \
        --data_path ${DATA_PATH} \
        --mask_ratio 0.75 \
        --model pretrain_mae_base_patch16_224 \
        --batch_size 512 \
        --opt adamw \
        --opt_betas 0.9 0.95 \
        --warmup_epochs 40 \
        --epochs 1600 \
        --output_dir ${OUTPUT_DIR}
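As a quick sanity check of the numbers discussed above, here is a small sketch (assuming no gradient accumulation, i.e. one optimizer step per iteration) of how this launch command reaches the total batch size of 4096:

# Effective (total) batch size implied by the launch command above,
# assuming no gradient accumulation.
per_gpu_batch_size = 512                 # --batch_size
num_gpus = 8                             # --nproc_per_node=8
total_batch_size = per_gpu_batch_size * num_gpus
print(total_batch_size)                  # 4096, matching the reported total batch size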

@Celestial-Bai
Author

Haha, thanks! We have rarely used the --nproc_per_node=8 command before, but when I set 512 on both V100 and A100, GPU memory blew up every time, haha. I'm not sure whether it's a scheduling issue on our servers. Thanks a lot!

@Celestial-Bai
Author

Also, I'd like to ask how much memory your pre-training used. Sorry for asking so many questions, I really appreciate it.

@pengzhiliang
Owner

It's all right. You're welcome.

When the batch size is set to 512, GPU memory usage is approximately 30 GB.

@Celestial-Bai
Author

OK, got it! It was caused by our server.

Also, after you update modeling_pretrain.py, we no longer need to upgrade timm, so you won't need to update the requirements.
Thank you so much for your efforts!
