Question / Confirmation on arange in encode_text #371

@AwePhD

Hello,

This is a minor question about the code; I want to be sure I am not missing any subtleties.

At L354 of model.py, the final step extracts the text features:

# x.shape = [batch_size, n_ctx, transformer.width]
x = x[torch.arange(x.shape[0]), text.argmax(dim=-1)] @ self.text_projection
  • self.text_projection is the final projection that produces the text features.
  • text.argmax(dim=-1) gives the position of the EOT token in each sequence, so the indexing picks that token's features.

Why is there a torch.arange(x.shape[0])? Couldn't it be x[:, text.argmax(dim=-1)]?
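
For reference, here is a quick shape check of the two indexing forms on toy tensors. The sizes and token ids are made up for illustration and are not taken from model.py. If I read it correctly, it suggests the two forms are not equivalent: the arange version pairs row i with position i's EOT index, while the slice version applies every index to every row.

```python
import torch

# Toy sizes, chosen only for illustration (not the real CLIP dimensions).
batch_size, n_ctx, width = 3, 5, 4
x = torch.randn(batch_size, n_ctx, width)

# Hypothetical token ids: the largest id in each row plays the role of EOT.
text = torch.tensor([
    [1, 7, 9, 0, 0],
    [1, 9, 0, 0, 0],
    [1, 4, 6, 8, 9],
])
eot = text.argmax(dim=-1)  # one position per sequence, shape [batch_size]

# Paired indexing: row i is matched with position eot[i].
paired = x[torch.arange(x.shape[0]), eot]
# Slice plus index: every row is indexed with every position in eot.
crossed = x[:, eot]

print(paired.shape)   # torch.Size([3, 4])
print(crossed.shape)  # torch.Size([3, 3, 4])

# The paired result is the "diagonal" of the crossed one over the first two dims.
diag = crossed[torch.arange(batch_size), torch.arange(batch_size)]
assert torch.equal(paired, diag)
```

So x[:, text.argmax(dim=-1)] would seem to return a [batch_size, batch_size, width] tensor rather than one feature vector per sequence, but I would appreciate confirmation that this is the reason for the arange.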

Thanks for the work, code and model.
