The original textual inversion implementation supports using more than one vector to represent the learned concept.
In the current implementation, if we simply extend the tokenizer vocabulary and the CLIP token embedding matrix, the concept would be represented by only a single vector.
What would be the best way to support this? cc @patil-suraj
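One possible direction, as a minimal sketch rather than a concrete proposal: register several placeholder tokens for the concept, resize the text encoder's embedding matrix so each token gets its own learnable row, and expand the placeholder in prompts to the full token sequence. The model checkpoint, the number of vectors, the initializer word, and the `expand_prompt` helper below are illustrative assumptions, not part of the current implementation.

```python
import torch
from transformers import CLIPTextModel, CLIPTokenizer

num_vectors = 4  # assumption: how many embedding vectors the concept should use
placeholder = "<concept>"
placeholder_tokens = [f"{placeholder}_{i}" for i in range(num_vectors)]

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

# Register one new token per vector and grow the embedding matrix accordingly.
num_added = tokenizer.add_tokens(placeholder_tokens)
assert num_added == num_vectors, "some placeholder tokens already exist in the vocab"
text_encoder.resize_token_embeddings(len(tokenizer))

# Initialize the new rows from an existing token (here "object", as an example)
# and train only these rows during textual inversion.
token_ids = tokenizer.convert_tokens_to_ids(placeholder_tokens)
init_id = tokenizer.encode("object", add_special_tokens=False)[0]
embeddings = text_encoder.get_input_embeddings().weight
with torch.no_grad():
    for tid in token_ids:
        embeddings[tid] = embeddings[init_id].clone()

def expand_prompt(prompt: str) -> str:
    # Replace the single placeholder with the multi-token sequence before
    # tokenization, e.g. "a photo of <concept>" ->
    # "a photo of <concept>_0 <concept>_1 <concept>_2 <concept>_3".
    return prompt.replace(placeholder, " ".join(placeholder_tokens))
```

With this approach, saving/loading the learned concept would mean serializing all `num_vectors` embedding rows instead of one, and inference would need the same prompt expansion step.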