Unused cls_token in PatchEmbeddingBlock #3454

@night-gale

Describe the bug
When training ViT with torch DistributedDataParallel, the backward pass raises an error reporting:

Parameters which did not receive grad for rank 0: vit.patch_embedding.cls_token

This means that cls_token did not take part in the backward pass, so it never received a gradient.

I checked the implementations of ViT and PatchEmbeddingBlock and found that cls_token is defined but never used in PatchEmbeddingBlock (monai.networks.blocks.patchembedding.py).
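
The behavior described above can be illustrated without DistributedDataParallel. The sketch below is not MONAI's code: `ToyPatchEmbedding` and `params_without_grad` are hypothetical names made up for illustration, and the layer sizes are arbitrary. It only shows that a parameter registered in `__init__` but never referenced in `forward` still has `grad` set to `None` after a backward pass, which is the condition DDP reports.

```python
import torch
from torch import nn


class ToyPatchEmbedding(nn.Module):
    """Hypothetical stand-in for illustration; NOT MONAI's PatchEmbeddingBlock."""

    def __init__(self, in_features=8, hidden_size=4):
        super().__init__()
        self.proj = nn.Linear(in_features, hidden_size)
        # Registered as a trainable parameter but never referenced in forward().
        self.cls_token = nn.Parameter(torch.zeros(1, 1, hidden_size))

    def forward(self, x):
        return self.proj(x)


def params_without_grad(model, example_input):
    """Run one forward/backward pass and return the names of trainable
    parameters whose .grad is still None afterwards."""
    model.zero_grad(set_to_none=True)
    model(example_input).sum().backward()
    return [
        name
        for name, param in model.named_parameters()
        if param.requires_grad and param.grad is None
    ]


unused = params_without_grad(ToyPatchEmbedding(), torch.randn(2, 3, 8))
print(unused)  # ['cls_token']
```

Running a check like this on a single process before wrapping a model in DDP can point to the offending parameter directly. As a possible workaround on the user side, `DistributedDataParallel` accepts `find_unused_parameters=True`, at the cost of an extra traversal of the autograd graph on each iteration.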

To Reproduce
Steps to reproduce the behavior:

  1. Set the environment variable TORCH_DISTRIBUTED_DEBUG=INFO in the shell.
  2. Train ViT with torch DistributedDataParallel.

Labels: enhancement (New feature or request)
