This repository was archived by the owner on Feb 7, 2025. It is now read-only.

ValueError: hidden_size should be divisible by num_heads  #359

@CindyQi7788

While testing anomaly_detection_with_transformers.ipynb from /GenerativeModels/tutorials/generative/anomaly_detection/anomaly_detection_with_transformers, cell 34 ("Define network, inferer, optimizer, and loss function") raises ValueError: hidden_size should be divisible by num_heads.

ValueError Traceback (most recent call last)
File :3
1 device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
----> 3 transformer_model = DecoderOnlyTransformer(
4 num_tokens=16 + 1,
5 max_seq_len=spatial_shape[0] * spatial_shape[1],
6 attn_layers_dim=128,
7 attn_layers_depth=16,
8 attn_layers_heads=12,
9 )
10 transformer_model.to(device)
12 inferer = VQVAETransformerInferer()

File /dbfs/mnt/POC-data-science/MONAI_tutorials/GenerativeModels/generative/networks/nets/transformer.py:80, in DecoderOnlyTransformer.__init__(self, num_tokens, max_seq_len, attn_layers_dim, attn_layers_depth, attn_layers_heads, with_cross_attention, embedding_dropout_rate, use_flash_attention)
76 self.position_embeddings = AbsolutePositionalEmbedding(max_seq_len=max_seq_len, embedding_dim=attn_layers_dim)
77 self.embedding_dropout = nn.Dropout(embedding_dropout_rate)
79 self.blocks = nn.ModuleList(
---> 80 [
81 TransformerBlock(
82 hidden_size=attn_layers_dim,
83 mlp_dim=attn_layers_dim * 4,
84 num_heads=attn_layers_heads,
85 dropout_rate=0.0,
86 qkv_bias=False,
87 causal=True,
88 sequence_length=max_seq_len,
89 with_cross_attention=with_cross_attention,
90 use_flash_attention=use_flash_attention,
91 )
92 for _ in range(attn_layers_depth)
93 ]
94 )
96 self.to_logits = nn.Linear(attn_layers_dim, num_tokens)
...

How to fix the issue? Thank you very much.
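
Judging only from the traceback above, the call passes attn_layers_dim=128 as hidden_size and attn_layers_heads=12 as num_heads, and 128 is not a multiple of 12. A minimal sketch of that arithmetic follows; it does not import MONAI, and the helper name valid_head_counts is hypothetical, introduced here purely for illustration:

```python
def valid_head_counts(hidden_size: int, max_heads: int = 32) -> list[int]:
    """Return every head count up to max_heads that divides hidden_size evenly.

    Hypothetical helper for illustration; not part of MONAI.
    """
    return [h for h in range(1, max_heads + 1) if hidden_size % h == 0]


# Values copied from the failing cell in the traceback.
attn_layers_dim = 128
attn_layers_heads = 12

# A non-zero remainder means the hidden size cannot be split evenly across heads.
remainder = attn_layers_dim % attn_layers_heads
print(f"{attn_layers_dim} % {attn_layers_heads} = {remainder}")

# Head counts that would divide the current hidden size evenly.
print(valid_head_counts(attn_layers_dim))
```

Under that reading, choosing a head count that divides 128 (for example 8 or 16), or an attn_layers_dim that is a multiple of 12, should satisfy the divisibility check. Which values the tutorial authors intended is not confirmed here.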

Labels: bug