Hello,
This is a minor question about the code; I want to make sure I'm not missing any subtleties.
At L354 of model.py, there is the final step that extracts the text features:
```python
# x.shape = [batch_size, n_ctx, transformer.width]
x = x[torch.arange(x.shape[0]), text.argmax(dim=-1)] @ self.text_projection
```
`self.text_projection` is the final projection that produces the text features.
`text.argmax(dim=-1)` finds the position of the EOT token in each sequence (it has the highest token id), so this indexing picks the features of the EOT token.
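To make my reading concrete, here is a minimal toy sketch of how I understand the current indexing. All sizes and token ids below are made up for illustration; I only follow the convention that EOT has the highest id in the vocabulary.

```python
import torch

# Toy sizes, purely for illustration (the real model uses n_ctx=77 and a larger width)
batch_size, n_ctx, width = 2, 5, 3
x = torch.randn(batch_size, n_ctx, width)

# Fake token ids; EOT has the highest id, so argmax over the sequence finds its position
text = torch.tensor([
    [49406, 320, 49407, 0, 0],    # EOT at position 2
    [49406, 320, 589, 49407, 0],  # EOT at position 3
])

eot_pos = text.argmax(dim=-1)     # tensor([2, 3]): one position per sequence

# The indexing in model.py: batch index i is paired with eot_pos[i],
# so one width-vector is selected per sequence
pooled = x[torch.arange(batch_size), eot_pos]
print(pooled.shape)               # torch.Size([2, 3]) = [batch_size, width]
```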
Why is there a `torch.arange(x.shape[0])`? Couldn't it simply be `x[:, text.argmax(dim=-1)]`?
Thanks for the work, the code, and the model.