Description
I'd like to propose an idea analogous to #369.
The current textual inversion fine-tuning script initialises the new `placeholder_token`'s embedding from an existing `initializer_token` (and enforces that the initializer token is exactly one token).
diffusers/examples/textual_inversion/textual_inversion.py, lines 409 to 411 at 84b9df5:

```python
# Initialise the newly added placeholder token with the embeddings of the initializer token
token_embeds = text_encoder.get_input_embeddings().weight.data
token_embeds[placeholder_token_id] = token_embeds[initializer_token_id]
```
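For context, the single-token constraint comes from a check earlier in the same script, roughly along these lines (paraphrased, not an exact excerpt):

```python
# Convert the initializer token to ids; the script rejects multi-token initializers.
token_ids = tokenizer.encode(args.initializer_token, add_special_tokens=False)
if len(token_ids) > 1:
    raise ValueError("The initializer token must be a single token.")
initializer_token_id = token_ids[0]
```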
I was curious whether we could initialise a new token from multiple existing ones. Let me give an example of a use case. Say I'm trying to add the concept of a "low camera angle". The existing model does have some semblance of this concept, but it's far from concrete. However, its existing knowledge is not captured by any single token in isolation.
My first thought was to take the embeddings of each token from `tokenizer.encode("low camera angle", add_special_tokens=False)` and average them, but that doesn't quite smell right. As I understand it, it's the `text_encoder` that's responsible for modelling the relationships between sequences of tokens. I wonder what the best strategy might be to initialise a new token from multiple existing ones; a rough sketch of the averaging idea follows.
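To make the idea concrete, here is a minimal sketch of the averaging approach, assuming a CLIP text encoder from a Stable Diffusion checkpoint. The checkpoint name and the placeholder string `<low-angle>` are purely illustrative, not what the script uses:

```python
from transformers import CLIPTextModel, CLIPTokenizer

# Illustrative checkpoint; the script takes this via --pretrained_model_name_or_path.
model_name = "CompVis/stable-diffusion-v1-4"
tokenizer = CLIPTokenizer.from_pretrained(model_name, subfolder="tokenizer")
text_encoder = CLIPTextModel.from_pretrained(model_name, subfolder="text_encoder")

# Add the new placeholder token and make room for its embedding row.
placeholder_token = "<low-angle>"
tokenizer.add_tokens(placeholder_token)
text_encoder.resize_token_embeddings(len(tokenizer))
placeholder_token_id = tokenizer.convert_tokens_to_ids(placeholder_token)

# Encode the multi-word phrase and average the corresponding embedding rows.
initializer_ids = tokenizer.encode("low camera angle", add_special_tokens=False)
token_embeds = text_encoder.get_input_embeddings().weight.data
token_embeds[placeholder_token_id] = token_embeds[initializer_ids].mean(dim=0)
```

The mean stays in the input-embedding space, which is what textual inversion optimises, so the shapes line up; whether it's actually a better starting point than a single hand-picked token is exactly what I'm unsure about.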
Thanks!