Issue with loading checkpoints #12

Closed
sguinard opened this issue Jul 20, 2023 · 13 comments
Labels
good first issue Good for newcomers

Comments

sguinard commented Jul 20, 2023

Hi Damien,

Thanks for the great work and the code!

I'm currently performing some experiments with a Dockerized SPT based on nvidia/cuda:11.8.0-devel-ubuntu22.04, on KITTI-360.

I have no problem running the training script (both the standard and 11g configs run smoothly); however, the evaluation script fails when reading the saved checkpoints, with the following error:

Traceback (most recent call last):
  File "/app/superpoint_transformer/src/models/segmentation.py", line 545, in load_state_dict
    super().load_state_dict(state_dict, strict=strict)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for PointSegmentationModule:
	Unexpected key(s) in state_dict: "criterion.criteria.0.weight", "criterion.criteria.1.weight". 

The same error happens with epoch-XXX.ckpt, latest.ckpt, and the pretrained weights downloaded from your repository.

A quick Google search suggests the loading code may be expecting a different model than the one that was stored, but since I didn't modify the config files except for adding the path to the data, this seems weird.

Any hints regarding this error?

Best regards,
Stephane
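
[Editor's note] A possible stop-gap while waiting for a fix, assuming Lightning stores the module weights under ckpt["state_dict"] and only the criterion.* keys are unexpected, is to filter those keys out before loading. The helper name and the toy dict below are illustrative, not SPT's API:

```python
# Sketch of a workaround, not the maintainer's fix: drop the "criterion.*"
# entries from the checkpoint's state_dict before calling load_state_dict.
# The key prefix comes from the error message above.
def strip_criterion_keys(state_dict):
    return {k: v for k, v in state_dict.items()
            if not k.startswith("criterion.")}

# Toy state_dict containing the two offending keys from the traceback
ckpt_state = {
    "net.some.weight": 0,
    "criterion.criteria.0.weight": 1,
    "criterion.criteria.1.weight": 2,
}
print(sorted(strip_criterion_keys(ckpt_state)))  # ['net.some.weight']
```

The filtered dict can then be passed to the model's load_state_dict.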

drprojects added the good first issue label on Jul 21, 2023
drprojects (Owner) commented Jul 21, 2023

Hi Stéphane,

Thanks for your interest in the project and for catching this error!

I had already encountered this issue and thought I had completely fixed it, but it came creeping back. If you are interested, the problem is simply that the model saves in its state_dict some attributes of criterion (i.e. the semantic losses in our case) which it cannot properly reload. Here, the problematic attribute is weight, which is used to weight the importance of each class in the semantic segmentation losses.

Long story short, I just pushed a new commit which should fix this. Would you mind testing it on your end and letting me know if it solves your issue?

Best,

Damien
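
[Editor's note] The mechanism Damien describes can be reproduced in a few lines: PyTorch's weighted losses register their `weight` tensor as a buffer, so a criterion stored as a submodule leaks that tensor into the parent module's state_dict. The `Toy` module below is illustrative, not SPT's actual class:

```python
import torch
from torch import nn

# A class-weighted loss kept as a submodule registers its `weight` tensor
# as a buffer, so it shows up in the parent module's state_dict.
class Toy(nn.Module):
    def __init__(self, class_weight=None):
        super().__init__()
        self.net = nn.Linear(4, 3)
        self.criterion = nn.CrossEntropyLoss(weight=class_weight)

with_weights = Toy(class_weight=torch.ones(3))
print("criterion.weight" in with_weights.state_dict())  # True

# A freshly built module without class weights lacks the key, which is
# exactly the "Unexpected key(s)" situation hit when reloading:
print("criterion.weight" in Toy().state_dict())  # False
```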

sguinard (Author) commented
Hi Damien,

Thanks for your quick reply!
I'm testing this now and will let you know as soon as possible whether it works.

Best,
Stephane

hyunkoome commented Jul 23, 2023

Hi @drprojects

Thank you for sharing your great work and code.

I trained on my server with an NVIDIA A100 GPU (80 GB VRAM) and then ran the evaluation. Training completes without problems, but the evaluation session fails.

Specifically, your S3DIS and DALES experiments work well, but KITTI-360 hits the same issue @sguinard mentioned above.

When running the eval script on KITTI-360, I get 'state_dict' and 'size mismatch' errors as follows:

[2023-07-23 15:59:37,100][src.utils.utils][ERROR] - 
Traceback (most recent call last):
  File "/home/hyunkoo/DATA/ssd1/Codes/SemanticSeg3D/superpoint_transformer/src/utils/utils.py", line 45, in wrap
    metric_dict, object_dict = task_func(cfg=cfg)
  File "src/eval.py", line 105, in evaluate
    trainer.test(model=model, datamodule=datamodule, ckpt_path=cfg.ckpt_path)
  File "/home/hyunkoo/anaconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 735, in test
    return call._call_and_handle_interrupt(
  File "/home/hyunkoo/anaconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 42, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/hyunkoo/anaconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 778, in _test_impl
    results = self._run(model, ckpt_path=ckpt_path)
  File "/home/hyunkoo/anaconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 939, in _run
    self._checkpoint_connector._restore_modules_and_callbacks(ckpt_path)
  File "/home/hyunkoo/anaconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/checkpoint_connector.py", line 396, in _restore_modules_and_callbacks
    self.restore_model()
  File "/home/hyunkoo/anaconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/checkpoint_connector.py", line 278, in restore_model
    trainer.strategy.load_model_state_dict(self._loaded_checkpoint)
  File "/home/hyunkoo/anaconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/strategies/strategy.py", line 352, in load_model_state_dict
    self.lightning_module.load_state_dict(checkpoint["state_dict"])
  File "/home/hyunkoo/DATA/ssd1/Codes/SemanticSeg3D/superpoint_transformer/src/models/segmentation.py", line 560, in load_state_dict
    super().load_state_dict(state_dict, strict=strict)
  File "/home/hyunkoo/anaconda3/envs/spt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for PointSegmentationModule:
	Missing key(s) in state_dict: "net.down_stages.0.transformer_blocks.0.ffn_norm.weight", "net.down_stages.0.transformer_blocks.0.ffn_norm.bias", "net.down_stages.0.transformer_blocks.0.ffn_norm.mean_scale", "net.down_stages.0.transformer_blocks.0.ffn.mlp.0.weight", "net.down_stages.0.transformer_blocks.0.ffn.mlp.0.bias", "net.down_stages.0.transformer_blocks.0.ffn.mlp.2.weight", "net.down_stages.0.transformer_blocks.0.ffn.mlp.2.bias", "net.down_stages.0.transformer_blocks.1.ffn_norm.weight", "net.down_stages.0.transformer_blocks.1.ffn_norm.bias", "net.down_stages.0.transformer_blocks.1.ffn_norm.mean_scale", "net.down_stages.0.transformer_blocks.1.ffn.mlp.0.weight", "net.down_stages.0.transformer_blocks.1.ffn.mlp.0.bias", "net.down_stages.0.transformer_blocks.1.ffn.mlp.2.weight", "net.down_stages.0.transformer_blocks.1.ffn.mlp.2.bias", "net.down_stages.0.transformer_blocks.2.ffn_norm.weight", "net.down_stages.0.transformer_blocks.2.ffn_norm.bias", "net.down_stages.0.transformer_blocks.2.ffn_norm.mean_scale", "net.down_stages.0.transformer_blocks.2.ffn.mlp.0.weight", "net.down_stages.0.transformer_blocks.2.ffn.mlp.0.bias", "net.down_stages.0.transformer_blocks.2.ffn.mlp.2.weight", "net.down_stages.0.transformer_blocks.2.ffn.mlp.2.bias", "net.down_stages.1.transformer_blocks.0.ffn_norm.weight", "net.down_stages.1.transformer_blocks.0.ffn_norm.bias", "net.down_stages.1.transformer_blocks.0.ffn_norm.mean_scale", "net.down_stages.1.transformer_blocks.0.ffn.mlp.0.weight", "net.down_stages.1.transformer_blocks.0.ffn.mlp.0.bias", "net.down_stages.1.transformer_blocks.0.ffn.mlp.2.weight", "net.down_stages.1.transformer_blocks.0.ffn.mlp.2.bias", "net.down_stages.1.transformer_blocks.1.ffn_norm.weight", "net.down_stages.1.transformer_blocks.1.ffn_norm.bias", "net.down_stages.1.transformer_blocks.1.ffn_norm.mean_scale", "net.down_stages.1.transformer_blocks.1.ffn.mlp.0.weight", "net.down_stages.1.transformer_blocks.1.ffn.mlp.0.bias", 
"net.down_stages.1.transformer_blocks.1.ffn.mlp.2.weight", "net.down_stages.1.transformer_blocks.1.ffn.mlp.2.bias", "net.down_stages.1.transformer_blocks.2.ffn_norm.weight", "net.down_stages.1.transformer_blocks.2.ffn_norm.bias", "net.down_stages.1.transformer_blocks.2.ffn_norm.mean_scale", "net.down_stages.1.transformer_blocks.2.ffn.mlp.0.weight", "net.down_stages.1.transformer_blocks.2.ffn.mlp.0.bias", "net.down_stages.1.transformer_blocks.2.ffn.mlp.2.weight", "net.down_stages.1.transformer_blocks.2.ffn.mlp.2.bias", "net.up_stages.0.transformer_blocks.0.ffn_norm.weight", "net.up_stages.0.transformer_blocks.0.ffn_norm.bias", "net.up_stages.0.transformer_blocks.0.ffn_norm.mean_scale", "net.up_stages.0.transformer_blocks.0.ffn.mlp.0.weight", "net.up_stages.0.transformer_blocks.0.ffn.mlp.0.bias", "net.up_stages.0.transformer_blocks.0.ffn.mlp.2.weight", "net.up_stages.0.transformer_blocks.0.ffn.mlp.2.bias". 
	size mismatch for net.down_stages.0.in_mlp.mlp.0.weight: copying a param with shape torch.Size([64, 132]) from checkpoint, the shape in current model is torch.Size([128, 132]).
	size mismatch for net.down_stages.0.in_mlp.mlp.1.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.0.in_mlp.mlp.1.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.0.in_mlp.mlp.1.mean_scale: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.0.in_mlp.mlp.3.weight: copying a param with shape torch.Size([64, 64]) from checkpoint, the shape in current model is torch.Size([128, 128]).
	size mismatch for net.down_stages.0.in_mlp.mlp.4.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.0.in_mlp.mlp.4.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.0.in_mlp.mlp.4.mean_scale: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.0.transformer_blocks.0.sa_norm.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.0.transformer_blocks.0.sa_norm.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.0.transformer_blocks.0.sa_norm.mean_scale: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.0.transformer_blocks.0.sa.qkv.weight: copying a param with shape torch.Size([192, 64]) from checkpoint, the shape in current model is torch.Size([256, 128]).
	size mismatch for net.down_stages.0.transformer_blocks.0.sa.qkv.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for net.down_stages.0.transformer_blocks.0.sa.v_rpe.weight: copying a param with shape torch.Size([64, 32]) from checkpoint, the shape in current model is torch.Size([128, 32]).
	size mismatch for net.down_stages.0.transformer_blocks.0.sa.v_rpe.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.0.transformer_blocks.0.sa.out_proj.weight: copying a param with shape torch.Size([64, 64]) from checkpoint, the shape in current model is torch.Size([128, 128]).
	size mismatch for net.down_stages.0.transformer_blocks.0.sa.out_proj.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.0.transformer_blocks.1.sa_norm.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.0.transformer_blocks.1.sa_norm.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.0.transformer_blocks.1.sa_norm.mean_scale: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.0.transformer_blocks.1.sa.qkv.weight: copying a param with shape torch.Size([192, 64]) from checkpoint, the shape in current model is torch.Size([256, 128]).
	size mismatch for net.down_stages.0.transformer_blocks.1.sa.qkv.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for net.down_stages.0.transformer_blocks.1.sa.v_rpe.weight: copying a param with shape torch.Size([64, 32]) from checkpoint, the shape in current model is torch.Size([128, 32]).
	size mismatch for net.down_stages.0.transformer_blocks.1.sa.v_rpe.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.0.transformer_blocks.1.sa.out_proj.weight: copying a param with shape torch.Size([64, 64]) from checkpoint, the shape in current model is torch.Size([128, 128]).
	size mismatch for net.down_stages.0.transformer_blocks.1.sa.out_proj.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.0.transformer_blocks.2.sa_norm.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.0.transformer_blocks.2.sa_norm.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.0.transformer_blocks.2.sa_norm.mean_scale: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.0.transformer_blocks.2.sa.qkv.weight: copying a param with shape torch.Size([192, 64]) from checkpoint, the shape in current model is torch.Size([256, 128]).
	size mismatch for net.down_stages.0.transformer_blocks.2.sa.qkv.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for net.down_stages.0.transformer_blocks.2.sa.v_rpe.weight: copying a param with shape torch.Size([64, 32]) from checkpoint, the shape in current model is torch.Size([128, 32]).
	size mismatch for net.down_stages.0.transformer_blocks.2.sa.v_rpe.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.0.transformer_blocks.2.sa.out_proj.weight: copying a param with shape torch.Size([64, 64]) from checkpoint, the shape in current model is torch.Size([128, 128]).
	size mismatch for net.down_stages.0.transformer_blocks.2.sa.out_proj.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.in_mlp.mlp.0.weight: copying a param with shape torch.Size([64, 68]) from checkpoint, the shape in current model is torch.Size([128, 132]).
	size mismatch for net.down_stages.1.in_mlp.mlp.1.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.in_mlp.mlp.1.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.in_mlp.mlp.1.mean_scale: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.in_mlp.mlp.3.weight: copying a param with shape torch.Size([64, 64]) from checkpoint, the shape in current model is torch.Size([128, 128]).
	size mismatch for net.down_stages.1.in_mlp.mlp.4.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.in_mlp.mlp.4.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.in_mlp.mlp.4.mean_scale: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.transformer_blocks.0.sa_norm.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.transformer_blocks.0.sa_norm.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.transformer_blocks.0.sa_norm.mean_scale: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.transformer_blocks.0.sa.qkv.weight: copying a param with shape torch.Size([192, 64]) from checkpoint, the shape in current model is torch.Size([256, 128]).
	size mismatch for net.down_stages.1.transformer_blocks.0.sa.qkv.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for net.down_stages.1.transformer_blocks.0.sa.v_rpe.weight: copying a param with shape torch.Size([64, 32]) from checkpoint, the shape in current model is torch.Size([128, 32]).
	size mismatch for net.down_stages.1.transformer_blocks.0.sa.v_rpe.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.transformer_blocks.0.sa.out_proj.weight: copying a param with shape torch.Size([64, 64]) from checkpoint, the shape in current model is torch.Size([128, 128]).
	size mismatch for net.down_stages.1.transformer_blocks.0.sa.out_proj.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.transformer_blocks.1.sa_norm.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.transformer_blocks.1.sa_norm.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.transformer_blocks.1.sa_norm.mean_scale: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.transformer_blocks.1.sa.qkv.weight: copying a param with shape torch.Size([192, 64]) from checkpoint, the shape in current model is torch.Size([256, 128]).
	size mismatch for net.down_stages.1.transformer_blocks.1.sa.qkv.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for net.down_stages.1.transformer_blocks.1.sa.v_rpe.weight: copying a param with shape torch.Size([64, 32]) from checkpoint, the shape in current model is torch.Size([128, 32]).
	size mismatch for net.down_stages.1.transformer_blocks.1.sa.v_rpe.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.transformer_blocks.1.sa.out_proj.weight: copying a param with shape torch.Size([64, 64]) from checkpoint, the shape in current model is torch.Size([128, 128]).
	size mismatch for net.down_stages.1.transformer_blocks.1.sa.out_proj.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.transformer_blocks.2.sa_norm.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.transformer_blocks.2.sa_norm.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.transformer_blocks.2.sa_norm.mean_scale: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.transformer_blocks.2.sa.qkv.weight: copying a param with shape torch.Size([192, 64]) from checkpoint, the shape in current model is torch.Size([256, 128]).
	size mismatch for net.down_stages.1.transformer_blocks.2.sa.qkv.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for net.down_stages.1.transformer_blocks.2.sa.v_rpe.weight: copying a param with shape torch.Size([64, 32]) from checkpoint, the shape in current model is torch.Size([128, 32]).
	size mismatch for net.down_stages.1.transformer_blocks.2.sa.v_rpe.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.transformer_blocks.2.sa.out_proj.weight: copying a param with shape torch.Size([64, 64]) from checkpoint, the shape in current model is torch.Size([128, 128]).
	size mismatch for net.down_stages.1.transformer_blocks.2.sa.out_proj.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.up_stages.0.in_mlp.mlp.0.weight: copying a param with shape torch.Size([64, 132]) from checkpoint, the shape in current model is torch.Size([128, 260]).
	size mismatch for net.up_stages.0.in_mlp.mlp.1.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.up_stages.0.in_mlp.mlp.1.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.up_stages.0.in_mlp.mlp.1.mean_scale: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.up_stages.0.in_mlp.mlp.3.weight: copying a param with shape torch.Size([64, 64]) from checkpoint, the shape in current model is torch.Size([128, 128]).
	size mismatch for net.up_stages.0.in_mlp.mlp.4.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.up_stages.0.in_mlp.mlp.4.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.up_stages.0.in_mlp.mlp.4.mean_scale: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.up_stages.0.transformer_blocks.0.sa_norm.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.up_stages.0.transformer_blocks.0.sa_norm.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.up_stages.0.transformer_blocks.0.sa_norm.mean_scale: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.up_stages.0.transformer_blocks.0.sa.qkv.weight: copying a param with shape torch.Size([192, 64]) from checkpoint, the shape in current model is torch.Size([256, 128]).
	size mismatch for net.up_stages.0.transformer_blocks.0.sa.qkv.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for net.up_stages.0.transformer_blocks.0.sa.v_rpe.weight: copying a param with shape torch.Size([64, 32]) from checkpoint, the shape in current model is torch.Size([128, 32]).
	size mismatch for net.up_stages.0.transformer_blocks.0.sa.v_rpe.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.up_stages.0.transformer_blocks.0.sa.out_proj.weight: copying a param with shape torch.Size([64, 64]) from checkpoint, the shape in current model is torch.Size([128, 128]).
	size mismatch for net.up_stages.0.transformer_blocks.0.sa.out_proj.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for head.0.classifier.weight: copying a param with shape torch.Size([13, 64]) from checkpoint, the shape in current model is torch.Size([15, 128]).
	size mismatch for head.0.classifier.bias: copying a param with shape torch.Size([13]) from checkpoint, the shape in current model is torch.Size([15]).
	size mismatch for head.1.classifier.weight: copying a param with shape torch.Size([13, 64]) from checkpoint, the shape in current model is torch.Size([15, 128]).
	size mismatch for head.1.classifier.bias: copying a param with shape torch.Size([13]) from checkpoint, the shape in current model is torch.Size([15]).
[2023-07-23 15:59:37,102][src.utils.utils][INFO] - Closing loggers...
Error executing job with overrides: ['experiment=kitti360', 'ckpt_path=/home/hyunkoo/DATA/ssd1/Codes/SemanticSeg3D/superpoint_transformer/logs/train/runs/2023-07-23_04-52-26/checkpoints/epoch_1419.ckpt']
Traceback (most recent call last):
  File "src/eval.py", line 117, in main
    evaluate(cfg)
  File "/home/hyunkoo/DATA/ssd1/Codes/SemanticSeg3D/superpoint_transformer/src/utils/utils.py", line 48, in wrap
    raise ex
  File "/home/hyunkoo/DATA/ssd1/Codes/SemanticSeg3D/superpoint_transformer/src/utils/utils.py", line 45, in wrap
    metric_dict, object_dict = task_func(cfg=cfg)
  File "src/eval.py", line 105, in evaluate
    trainer.test(model=model, datamodule=datamodule, ckpt_path=cfg.ckpt_path)
  File "/home/hyunkoo/anaconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 735, in test
    return call._call_and_handle_interrupt(
  File "/home/hyunkoo/anaconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 42, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/hyunkoo/anaconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 778, in _test_impl
    results = self._run(model, ckpt_path=ckpt_path)
  File "/home/hyunkoo/anaconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 939, in _run
    self._checkpoint_connector._restore_modules_and_callbacks(ckpt_path)
  File "/home/hyunkoo/anaconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/checkpoint_connector.py", line 396, in _restore_modules_and_callbacks
    self.restore_model()
  File "/home/hyunkoo/anaconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/checkpoint_connector.py", line 278, in restore_model
    trainer.strategy.load_model_state_dict(self._loaded_checkpoint)
  File "/home/hyunkoo/anaconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/strategies/strategy.py", line 352, in load_model_state_dict
    self.lightning_module.load_state_dict(checkpoint["state_dict"])
  File "/home/hyunkoo/DATA/ssd1/Codes/SemanticSeg3D/superpoint_transformer/src/models/segmentation.py", line 560, in load_state_dict
    super().load_state_dict(state_dict, strict=strict)
  File "/home/hyunkoo/anaconda3/envs/spt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for PointSegmentationModule:
	[same Missing key(s) and size mismatch listing as in the log above]
	size mismatch for net.down_stages.0.transformer_blocks.1.sa.qkv.weight: copying a param with shape torch.Size([192, 64]) from checkpoint, the shape in current model is torch.Size([256, 128]).
	size mismatch for net.down_stages.0.transformer_blocks.1.sa.qkv.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for net.down_stages.0.transformer_blocks.1.sa.v_rpe.weight: copying a param with shape torch.Size([64, 32]) from checkpoint, the shape in current model is torch.Size([128, 32]).
	size mismatch for net.down_stages.0.transformer_blocks.1.sa.v_rpe.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.0.transformer_blocks.1.sa.out_proj.weight: copying a param with shape torch.Size([64, 64]) from checkpoint, the shape in current model is torch.Size([128, 128]).
	size mismatch for net.down_stages.0.transformer_blocks.1.sa.out_proj.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.0.transformer_blocks.2.sa_norm.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.0.transformer_blocks.2.sa_norm.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.0.transformer_blocks.2.sa_norm.mean_scale: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.0.transformer_blocks.2.sa.qkv.weight: copying a param with shape torch.Size([192, 64]) from checkpoint, the shape in current model is torch.Size([256, 128]).
	size mismatch for net.down_stages.0.transformer_blocks.2.sa.qkv.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for net.down_stages.0.transformer_blocks.2.sa.v_rpe.weight: copying a param with shape torch.Size([64, 32]) from checkpoint, the shape in current model is torch.Size([128, 32]).
	size mismatch for net.down_stages.0.transformer_blocks.2.sa.v_rpe.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.0.transformer_blocks.2.sa.out_proj.weight: copying a param with shape torch.Size([64, 64]) from checkpoint, the shape in current model is torch.Size([128, 128]).
	size mismatch for net.down_stages.0.transformer_blocks.2.sa.out_proj.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.in_mlp.mlp.0.weight: copying a param with shape torch.Size([64, 68]) from checkpoint, the shape in current model is torch.Size([128, 132]).
	size mismatch for net.down_stages.1.in_mlp.mlp.1.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.in_mlp.mlp.1.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.in_mlp.mlp.1.mean_scale: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.in_mlp.mlp.3.weight: copying a param with shape torch.Size([64, 64]) from checkpoint, the shape in current model is torch.Size([128, 128]).
	size mismatch for net.down_stages.1.in_mlp.mlp.4.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.in_mlp.mlp.4.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.in_mlp.mlp.4.mean_scale: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.transformer_blocks.0.sa_norm.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.transformer_blocks.0.sa_norm.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.transformer_blocks.0.sa_norm.mean_scale: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.transformer_blocks.0.sa.qkv.weight: copying a param with shape torch.Size([192, 64]) from checkpoint, the shape in current model is torch.Size([256, 128]).
	size mismatch for net.down_stages.1.transformer_blocks.0.sa.qkv.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for net.down_stages.1.transformer_blocks.0.sa.v_rpe.weight: copying a param with shape torch.Size([64, 32]) from checkpoint, the shape in current model is torch.Size([128, 32]).
	size mismatch for net.down_stages.1.transformer_blocks.0.sa.v_rpe.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.transformer_blocks.0.sa.out_proj.weight: copying a param with shape torch.Size([64, 64]) from checkpoint, the shape in current model is torch.Size([128, 128]).
	size mismatch for net.down_stages.1.transformer_blocks.0.sa.out_proj.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.transformer_blocks.1.sa_norm.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.transformer_blocks.1.sa_norm.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.transformer_blocks.1.sa_norm.mean_scale: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.transformer_blocks.1.sa.qkv.weight: copying a param with shape torch.Size([192, 64]) from checkpoint, the shape in current model is torch.Size([256, 128]).
	size mismatch for net.down_stages.1.transformer_blocks.1.sa.qkv.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for net.down_stages.1.transformer_blocks.1.sa.v_rpe.weight: copying a param with shape torch.Size([64, 32]) from checkpoint, the shape in current model is torch.Size([128, 32]).
	size mismatch for net.down_stages.1.transformer_blocks.1.sa.v_rpe.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.transformer_blocks.1.sa.out_proj.weight: copying a param with shape torch.Size([64, 64]) from checkpoint, the shape in current model is torch.Size([128, 128]).
	size mismatch for net.down_stages.1.transformer_blocks.1.sa.out_proj.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.transformer_blocks.2.sa_norm.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.transformer_blocks.2.sa_norm.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.transformer_blocks.2.sa_norm.mean_scale: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.transformer_blocks.2.sa.qkv.weight: copying a param with shape torch.Size([192, 64]) from checkpoint, the shape in current model is torch.Size([256, 128]).
	size mismatch for net.down_stages.1.transformer_blocks.2.sa.qkv.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for net.down_stages.1.transformer_blocks.2.sa.v_rpe.weight: copying a param with shape torch.Size([64, 32]) from checkpoint, the shape in current model is torch.Size([128, 32]).
	size mismatch for net.down_stages.1.transformer_blocks.2.sa.v_rpe.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.down_stages.1.transformer_blocks.2.sa.out_proj.weight: copying a param with shape torch.Size([64, 64]) from checkpoint, the shape in current model is torch.Size([128, 128]).
	size mismatch for net.down_stages.1.transformer_blocks.2.sa.out_proj.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.up_stages.0.in_mlp.mlp.0.weight: copying a param with shape torch.Size([64, 132]) from checkpoint, the shape in current model is torch.Size([128, 260]).
	size mismatch for net.up_stages.0.in_mlp.mlp.1.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.up_stages.0.in_mlp.mlp.1.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.up_stages.0.in_mlp.mlp.1.mean_scale: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.up_stages.0.in_mlp.mlp.3.weight: copying a param with shape torch.Size([64, 64]) from checkpoint, the shape in current model is torch.Size([128, 128]).
	size mismatch for net.up_stages.0.in_mlp.mlp.4.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.up_stages.0.in_mlp.mlp.4.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.up_stages.0.in_mlp.mlp.4.mean_scale: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.up_stages.0.transformer_blocks.0.sa_norm.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.up_stages.0.transformer_blocks.0.sa_norm.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.up_stages.0.transformer_blocks.0.sa_norm.mean_scale: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.up_stages.0.transformer_blocks.0.sa.qkv.weight: copying a param with shape torch.Size([192, 64]) from checkpoint, the shape in current model is torch.Size([256, 128]).
	size mismatch for net.up_stages.0.transformer_blocks.0.sa.qkv.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for net.up_stages.0.transformer_blocks.0.sa.v_rpe.weight: copying a param with shape torch.Size([64, 32]) from checkpoint, the shape in current model is torch.Size([128, 32]).
	size mismatch for net.up_stages.0.transformer_blocks.0.sa.v_rpe.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for net.up_stages.0.transformer_blocks.0.sa.out_proj.weight: copying a param with shape torch.Size([64, 64]) from checkpoint, the shape in current model is torch.Size([128, 128]).
	size mismatch for net.up_stages.0.transformer_blocks.0.sa.out_proj.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
	size mismatch for head.0.classifier.weight: copying a param with shape torch.Size([13, 64]) from checkpoint, the shape in current model is torch.Size([15, 128]).
	size mismatch for head.0.classifier.bias: copying a param with shape torch.Size([13]) from checkpoint, the shape in current model is torch.Size([15]).
	size mismatch for head.1.classifier.weight: copying a param with shape torch.Size([13, 64]) from checkpoint, the shape in current model is torch.Size([15, 128]).
	size mismatch for head.1.classifier.bias: copying a param with shape torch.Size([13]) from checkpoint, the shape in current model is torch.Size([15]).

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

Best regards,
Hyunkoo

@drprojects
Owner

drprojects commented Jul 23, 2023

Hi @hyunkoome, thanks for your interest and feedback.

The error you encountered seems different: I think you may be using an S3DIS checkpoint for the KITTI-360 dataset. This would explain all the feature dimension mismatches, as well as the final classifier size mismatch. It seems you are using a checkpoint from a training you launched on your machine:
/home/hyunkoo/DATA/ssd1/Codes/SemanticSeg3D/superpoint_transformer/logs/train/runs/2023-07-23_04-52-26/checkpoints/epoch_1419.ckpt.
Are you certain you are using a KITTI-360 checkpoint?

@jaswanthbjk

I am still having the issues @drprojects

@drprojects
Owner

drprojects commented Jul 23, 2023

@jaswanthbjk, are you encountering the exact same error as @sguinard? That is:

Traceback (most recent call last):
  File "/app/superpoint_transformer/src/models/segmentation.py", line 545, in load_state_dict
    super().load_state_dict(state_dict, strict=strict)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for PointSegmentationModule:
	Unexpected key(s) in state_dict: "criterion.criteria.0.weight", "criterion.criteria.1.weight". 

Are you using the latest commit 6b9ac9aa0c96d843af7f50448a3fbf968263d56a ?

@jaswanthbjk

jaswanthbjk commented Jul 23, 2023

My issue is also the same:

File "/home/jba/learnings/superpoint_transformer/src/models/segmentation.py", line 560, in load_state_dict
    super().load_state_dict(state_dict, strict=strict)
  File "/home/jba/miniconda3/envs/spt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for PointSegmentationModule:
        Missing key(s) in state_dict: "criterion.criteria.0.weight", "criterion.criteria.1.weight". 

I cloned the repo yesterday, so I assumed it would include the latest commit.

@drprojects
Owner

drprojects commented Jul 24, 2023

Hi @jaswanthbjk, that is strange; I can successfully run:

# Evaluate SPT on S3DIS Fold 5
python src/eval.py experiment=s3dis datamodule.fold=5 ckpt_path=/path/to/your/checkpoint.ckpt

# Evaluate SPT on KITTI-360 Val
python src/eval.py experiment=kitti360  ckpt_path=/path/to/your/checkpoint.ckpt 

# Evaluate SPT on DALES
python src/eval.py experiment=dales ckpt_path=/path/to/your/checkpoint.ckpt

Are you using one of the .ckpt files provided here, or a .ckpt from your own pretraining?

Just to be 100% safe, please make sure you git pull the latest version.

@jaswanthbjk

jaswanthbjk commented Jul 24, 2023

I trained a model on S3DIS myself.

Yes, I am running the eval script as you mentioned.

I was only successful when I set strict=False, but that gives a wrong loss value during evaluation.
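The strict vs. non-strict behavior can be reproduced in isolation. A minimal sketch, using a hypothetical two-layer model as a stand-in for PointSegmentationModule (not the actual SPT code):

```python
import torch
import torch.nn as nn

# Hypothetical tiny model standing in for PointSegmentationModule.
model = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 2))

# Simulate a checkpoint carrying an extra key (e.g. class weights stored
# under the loss criterion) that the freshly-built model does not declare.
state_dict = dict(model.state_dict())
state_dict["criterion.weight"] = torch.ones(2)

# strict=True (the default) raises on the unexpected key ...
try:
    model.load_state_dict(state_dict)
    raised = False
except RuntimeError:
    raised = True

# ... while strict=False skips it and merely reports the mismatch.
result = model.load_state_dict(state_dict, strict=False)
print(raised, result.unexpected_keys)
```

This is why strict=False "works": the network weights load fine, but anything hanging off the criterion keys (e.g. class weights for the loss) is silently ignored, which matches the wrong loss values seen during evaluation.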

@drprojects
Owner

Yes, setting strict=False would bypass this issue, but I agree it is not a satisfying fix, especially if we are running load_state_dict to resume training or fine-tune.

Have you made any modification to the code other than that ?

Can you please share your pretrained S3DIS .ckpt (and specify the related datamodule.fold), so I can try loading it on my end?
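For context on why these criterion keys end up in a checkpoint at all: a weighted nn.CrossEntropyLoss registers its class weights as a buffer, so if the criterion is an attribute of the module, the buffer leaks into state_dict(). A hedged sketch (TinyModule and the key prefix are hypothetical stand-ins, not the actual SPT code) showing the leak and a key-filtering alternative to strict=False:

```python
import torch
import torch.nn as nn

# A weighted loss registers `weight` as a buffer, so it shows up in
# state_dict() when the criterion is an attribute of the module.
class TinyModule(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(4, 3)
        self.criterion = nn.CrossEntropyLoss(weight=torch.ones(3))

saved = TinyModule().state_dict()
print(sorted(saved))  # includes 'criterion.weight'

# A model built without criterion weights cannot strict-load that
# checkpoint, but filtering the criterion keys out keeps the load
# strict for the network parameters themselves.
class TinyModuleNoWeights(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(4, 3)
        self.criterion = nn.CrossEntropyLoss()

model = TinyModuleNoWeights()
filtered = {k: v for k, v in saved.items() if not k.startswith("criterion.")}
model.load_state_dict(filtered, strict=True)
```

Depending on which side declares the criterion buffer, the same mismatch surfaces as "Unexpected key(s)" or "Missing key(s)", which would explain the two variants of the error reported in this thread.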

@sguinard
Author

Hi @drprojects ,

Just wanted to confirm that your latest commit solved my ckpt reading issue.

Thanks a lot!

@drprojects
Owner

Thanks for the feedback @sguinard! I will wait a bit for @jaswanthbjk's .ckpt to make sure things are in order before closing the issue.

@jaswanthbjk

jaswanthbjk commented Jul 24, 2023

Hey @drprojects,

It worked. Sorry, I didn't load the checkpoint properly. You can close the issue now.
