batchnorm error #5

Closed
dirtycomputer opened this issue Mar 28, 2022 · 2 comments

dirtycomputer commented Mar 28, 2022

When I run `python main.py --entry test --model-path runs/cutmix_r_pct_run_1/model_best_test.pth --exp-config configs/tent_cutmix/pct.yaml`, I get the following error:

ADAPT:
  ITER: 10
  METHOD: tent
AUG:
  BETA: 1.0
  MIXUPRATE: 0.4
  NAME: none
  PROB: 0.5
DATALOADER:
  MODELNET40_C:
    corruption: uniform
    severity: 1
    test_data_path: ./data/modelnet40_c/
  MODELNET40_DGCNN:
    num_points: 1024
    test_data_path: ./data/modelnet40_ply_hdf5_2048/test_files.txt
    train_data_path: ./data/modelnet40_ply_hdf5_2048/train_files.txt
    valid_data_path: ./data/modelnet40_ply_hdf5_2048/train_files.txt
  MODELNET40_PN2:
    num_points: 1024
    test_data_path: ./data/modelnet40_ply_hdf5_2048/test_files.txt
    train_data_path: ./data/modelnet40_ply_hdf5_2048/train_files.txt
    valid_data_path: ./data/modelnet40_ply_hdf5_2048/train_files.txt
  MODELNET40_RSCNN:
    data_path: ./data/
    num_points: 1024
    test_data_path: test_files.txt
    train_data_path: train_files.txt
    valid_data_path: train_files.txt
  batch_size: 32
  num_workers: 0
EXP:
  DATASET: modelnet40_c
  EXP_ID: c_pct_run_1
  LOSS_NAME: smooth
  METRIC: acc
  MODEL_NAME: pct
  OPTIMIZER: pct
  SEED: 1
  TASK: cls
EXP_EXTRA:
  no_test: False
  no_val: True
  save_ckp: 25
  test_eval_freq: 1
  val_eval_freq: 1
MODEL:
  MV:
    backbone: resnet18
    feat_size: 16
  PN2:
    version_cls: 1.0
  RSCNN:
    ssn_or_msn: True
TRAIN:
  early_stop: 300
  l2: 0.0001
  learning_rate: 0.0001
  lr_clip: 1e-05
  lr_decay_factor: 0.5
  lr_reduce_patience: 10
  num_epochs: 300
Pct(
  (model): Pct(
    (conv1): Conv1d(3, 64, kernel_size=(1,), stride=(1,), bias=False)
    (conv2): Conv1d(64, 64, kernel_size=(1,), stride=(1,), bias=False)
    (bn1): BatchNorm1d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (bn2): BatchNorm1d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (gather_local_0): Local_op(
      (conv1): Conv1d(128, 128, kernel_size=(1,), stride=(1,), bias=False)
      (conv2): Conv1d(128, 128, kernel_size=(1,), stride=(1,), bias=False)
      (bn1): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (bn2): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
    (gather_local_1): Local_op(
      (conv1): Conv1d(256, 256, kernel_size=(1,), stride=(1,), bias=False)
      (conv2): Conv1d(256, 256, kernel_size=(1,), stride=(1,), bias=False)
      (bn1): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (bn2): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
    (pt_last): Point_Transformer_Last(
      (conv1): Conv1d(256, 256, kernel_size=(1,), stride=(1,), bias=False)
      (conv2): Conv1d(256, 256, kernel_size=(1,), stride=(1,), bias=False)
      (bn1): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (bn2): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (sa1): SA_Layer(
        (q_conv): Conv1d(256, 64, kernel_size=(1,), stride=(1,), bias=False)
        (k_conv): Conv1d(256, 64, kernel_size=(1,), stride=(1,), bias=False)
        (v_conv): Conv1d(256, 256, kernel_size=(1,), stride=(1,))
        (trans_conv): Conv1d(256, 256, kernel_size=(1,), stride=(1,))
        (after_norm): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (act): ReLU()
        (softmax): Softmax(dim=-1)
      )
      (sa2): SA_Layer(
        (q_conv): Conv1d(256, 64, kernel_size=(1,), stride=(1,), bias=False)
        (k_conv): Conv1d(256, 64, kernel_size=(1,), stride=(1,), bias=False)
        (v_conv): Conv1d(256, 256, kernel_size=(1,), stride=(1,))
        (trans_conv): Conv1d(256, 256, kernel_size=(1,), stride=(1,))
        (after_norm): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (act): ReLU()
        (softmax): Softmax(dim=-1)
      )
      (sa3): SA_Layer(
        (q_conv): Conv1d(256, 64, kernel_size=(1,), stride=(1,), bias=False)
        (k_conv): Conv1d(256, 64, kernel_size=(1,), stride=(1,), bias=False)
        (v_conv): Conv1d(256, 256, kernel_size=(1,), stride=(1,))
        (trans_conv): Conv1d(256, 256, kernel_size=(1,), stride=(1,))
        (after_norm): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (act): ReLU()
        (softmax): Softmax(dim=-1)
      )
      (sa4): SA_Layer(
        (q_conv): Conv1d(256, 64, kernel_size=(1,), stride=(1,), bias=False)
        (k_conv): Conv1d(256, 64, kernel_size=(1,), stride=(1,), bias=False)
        (v_conv): Conv1d(256, 256, kernel_size=(1,), stride=(1,))
        (trans_conv): Conv1d(256, 256, kernel_size=(1,), stride=(1,))
        (after_norm): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (act): ReLU()
        (softmax): Softmax(dim=-1)
      )
    )
    (conv_fuse): Sequential(
      (0): Conv1d(1280, 1024, kernel_size=(1,), stride=(1,), bias=False)
      (1): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): LeakyReLU(negative_slope=0.2)
    )
    (linear1): Linear(in_features=1024, out_features=512, bias=False)
    (bn6): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (dp1): Dropout(p=0.5, inplace=False)
    (linear2): Linear(in_features=512, out_features=256, bias=True)
    (bn7): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (dp2): Dropout(p=0.5, inplace=False)
    (linear3): Linear(in_features=256, out_features=40, bias=True)
  )
)
Recovering model and checkpoint from /ModelNet40-C/runs/cutmix_r_pct_run_1/model_best_test.pth
Adaptation Done ...
N/A% (0 of 78) | | Elapsed Time: 0:00:00 ETA: --:--:--Adaptation Done ...
[... progress bar output repeated for each of the 78 test batches, printing "Adaptation Done ..." after every batch ...]
97% (76 of 78) |####################### | Elapsed Time: 0:01:52 ETA: 0:00:02
Traceback (most recent call last):
  File "main.py", line 684, in <module>
    entry_test(cfg, test_or_valid, cmd_args.model_path, cmd_args.confusion)
  File "main.py", line 572, in entry_test
    test_perf = validate(cfg.EXP.TASK, loader_test, model, cfg.EXP.DATASET, cfg.ADAPT, confusion)
  File "main.py", line 228, in validate
    model = adapt_tent(inp, model, adapt)
  File "main.py", line 45, in adapt_tent
    tent_helper.forward_and_adapt(data, model, optimizer_tent)
  File "/miniconda3/envs/modelnetc/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 96, in decorate_enable_grad
    return func(*args, **kwargs)
  File "/ModelNet40-C/third_party/tent_helper.py", line 52, in forward_and_adapt
    outputs = model(**x)
  File "/miniconda3/envs/modelnetc/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/miniconda3/envs/modelnetc/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 152, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/miniconda3/envs/modelnetc/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 162, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/miniconda3/envs/modelnetc/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
    output.reraise()
  File "/miniconda3/envs/modelnetc/lib/python3.7/site-packages/torch/_utils.py", line 394, in reraise
    raise self.exc_type(msg)
ValueError: Caught ValueError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/miniconda3/envs/modelnetc/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
    output = module(*input, **kwargs)
  File "/miniconda3/envs/modelnetc/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/ModelNet40-C/models/pct.py", line 29, in forward
    logit = self.model(pc)
  File "/miniconda3/envs/modelnetc/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/ModelNet40-C/PCT_Pytorch/model.py", line 69, in forward
    x = F.leaky_relu(self.bn6(self.linear1(x)), negative_slope=0.2)
  File "/miniconda3/envs/modelnetc/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/miniconda3/envs/modelnetc/lib/python3.7/site-packages/torch/nn/modules/batchnorm.py", line 107, in forward
    exponential_average_factor, self.eps)
  File "/miniconda3/envs/modelnetc/lib/python3.7/site-packages/torch/nn/functional.py", line 1666, in batch_norm
    raise ValueError('Expected more than 1 value per channel when training, got input size {}'.format(size))
ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 512])
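
For what it's worth, the final ValueError is BatchNorm1d refusing to compute batch statistics from a single sample while the model is in training mode (which tent requires). The traceback shows it was raised in replica 0 under nn.DataParallel, so a small final test batch split across several GPUs can leave one replica with exactly one sample, hence the input size of torch.Size([1, 512]). Below is a minimal sketch of the failure plus one possible workaround; the dataset and loader are illustrative stand-ins, not code from this repo:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# BatchNorm1d cannot estimate per-channel statistics from one sample in
# training mode -- this is exactly the error in the traceback above.
bn = nn.BatchNorm1d(512).train()
try:
    bn(torch.randn(1, 512))  # per-replica batch of size 1
except ValueError as e:
    print(e)  # Expected more than 1 value per channel when training, ...

# Possible workaround (illustrative, not from this repo): drop the trailing
# incomplete batch so no DataParallel replica ever receives a single sample.
dummy = TensorDataset(torch.randn(100, 3, 1024))  # stand-in point clouds
loader = DataLoader(dummy, batch_size=32, shuffle=False, drop_last=True)
```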

jiachens (Owner) commented

Thank you for your interest!

Could you please tell me which PyTorch and CUDA versions you used in your experiment?
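
For reference, they can be read directly from Python using standard PyTorch attributes (nothing here is specific to this repo):

```python
import torch

print(torch.__version__)               # PyTorch version
print(torch.version.cuda)              # CUDA version PyTorch was built with
print(torch.backends.cudnn.version())  # cuDNN version (None if unavailable)
```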

Best,

jiachens (Owner) commented

Since there has been no activity, I will close this issue. Feel free to open a new one if you run into any other questions.
