
(Textual Inversion) Initialise Vector For New Token From Multiple Existing Tokens #673

@rsomani95

Description


I'd like to propose an idea analogous to #369.

The current fine-tuning script for textual inversion initialises the new placeholder_token's embedding with that of an existing initializer_token (and enforces that the initializer token is exactly one token).

# Initialise the newly added placeholder token with the embeddings of the initializer token
token_embeds = text_encoder.get_input_embeddings().weight.data
token_embeds[placeholder_token_id] = token_embeds[initializer_token_id]
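
For reference, the single-token constraint is enforced just before that, roughly like so (paraphrasing from memory; the exact names in the script may differ):

# Reject initializers that tokenize to more than one token
token_ids = tokenizer.encode(initializer_token, add_special_tokens=False)
if len(token_ids) > 1:
    raise ValueError("The initializer token must be a single token.")
initializer_token_id = token_ids[0]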

I was curious whether we could initialise a new token from multiple existing ones. Let me give an example use case. Say I'm trying to add the concept of a "low camera angle". The existing model does have some semblance of this concept, but it's far from concrete, and its existing knowledge is not captured by any single token in isolation.


My first thought was to get the embeddings of each token from

tokenizer.encode("low camera angle", add_special_tokens=False)

and average them, but that doesn't quite smell right. As I understand it, it's the text_encoder that's responsible for modelling relationships between sequences of words, so a plain average of raw input embeddings may lose that structure. I wonder what the best strategy might be for initialising a new token from multiple existing ones.
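
Concretely, the naive version of what I have in mind is the sketch below, reusing the variables from the snippet above (tokenizer, text_encoder, and placeholder_token_id as set up by the training script):

# Naive sketch: initialise the new placeholder token with the mean of the
# embeddings of every token in the phrase, instead of a single token
init_ids = tokenizer.encode("low camera angle", add_special_tokens=False)

token_embeds = text_encoder.get_input_embeddings().weight.data
token_embeds[placeholder_token_id] = token_embeds[init_ids].mean(dim=0)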

Thanks!

cc @patil-suraj @isamu-isozaki

Labels: stale (issues that haven't received updates)
