Extracting EVO representations rather than logits #32
Here's how I'm currently solving this (adapted from the usage in the README):
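The idea is to swap the model's final unembedding projection for an identity, so the forward pass returns hidden states instead of vocabulary logits. A minimal sketch, assuming the README-style `Evo` API and that `StripedHyena` applies its final projection via `self.unembed.unembed(x)` (worth verifying against the version you have installed):

```python
import torch
from torch import nn
from evo import Evo

device = 'cuda:0'

# Load the model and tokenizer as in the README.
evo_model = Evo('evo-1-8k-base')
model, tokenizer = evo_model.model, evo_model.tokenizer
model.to(device)
model.eval()

# Swap the unembedding step for an identity so the forward pass
# returns final hidden states rather than vocabulary logits.
# (Assumes StripedHyena calls `self.unembed.unembed(x)` internally.)
class IdentityUnembed(nn.Module):
    def unembed(self, u):
        return u

model.unembed = IdentityUnembed()

sequence = 'ACGT'
input_ids = torch.tensor(
    tokenizer.tokenize(sequence),
    dtype=torch.int,
).to(device).unsqueeze(0)

with torch.inference_mode():
    embeddings, _ = model(input_ids)

print('Shape (batch, length, embed dim):', embeddings.shape)  # e.g. (1, 4, 4096)
```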
Note that this is for the model object returned by `evo-model`, which is an instance of `StripedHyena`. If you are using Hugging Face directly, that object is wrapped by `StripedHyenaModelForCausalLM`, so you first need to reach the inner model (see the sketch below).
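A sketch of what that might look like; the `backbone` attribute name here is an assumption about how `StripedHyenaModelForCausalLM` exposes the inner `StripedHyena` module, so check with `print(hf_model)` first:

```python
import torch
from transformers import AutoModelForCausalLM

hf_model = AutoModelForCausalLM.from_pretrained(
    'togethercomputer/evo-1-8k-base',  # illustrative checkpoint name
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)

# Assumed attribute: the wrapped StripedHyena instance.
inner_model = hf_model.backbone

# Then apply the same `unembed` patch as above to `inner_model`.
```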
Thanks @davidkell!
@davidkell I tried your code on an A100 40GB with the evo-8k model. Embedding the 4-letter sequence from the example costs over 400MB of GPU RAM, and the model itself needs 13GB. The embedding dimension is 4096, so I don't understand why it costs so much memory: 4 x 4096 in BF16 should only take 32KB, right? I tried to embed a 2kb sequence but always ran out of CUDA memory. Has anyone had a similar problem?
I had a similar experience. I was able to get inference working for 2k sequences on an A100 80GB (e.g. available on Paperspace), although around 2.5-3k I would get OOM. I haven't looked in depth at what is driving the memory requirement.
Quoting from issue #24:
So I think if you want to generate embeddings for longer sequences, you will need to manually shard across GPUs, set up CPU offloading, or something like that; a rough sketch of the offloading option is below.
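One way to try CPU offloading is Accelerate's automatic device map via `transformers`. This is only a sketch under the assumption that the Evo remote code tolerates layer-by-layer placement; the checkpoint name and memory budgets are illustrative:

```python
import torch
from transformers import AutoModelForCausalLM

# device_map='auto' lets Accelerate spread layers across the GPU and CPU,
# offloading whatever exceeds the per-device budgets in max_memory.
model = AutoModelForCausalLM.from_pretrained(
    'togethercomputer/evo-1-8k-base',   # illustrative checkpoint name
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map='auto',
    max_memory={0: '35GiB', 'cpu': '120GiB'},  # illustrative budgets
)
```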
Hi, thanks for your amazing work!
How can I extract representations rather than logits from the model?
I am using the Hugging Face version, and I see the model returns `logits` and `past_key_values`. Could you please explain what's in `past_key_values`, and whether either of those can be used as a sequence representation? Or maybe you can suggest other ways to access the representations of the model?
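For reference, a generic way to read intermediate representations out of any PyTorch model, independent of the `unembed` patch above, is a forward hook. A minimal sketch reusing `model` and `input_ids` from the earlier snippet; the module path `model.blocks[-1]` is an assumption about StripedHyena's layout, so inspect `print(model)` to pick the right submodule:

```python
import torch

activations = {}

def save_output(name):
    def hook(module, inputs, output):
        # Depending on the block, `output` may be a tensor or a tuple.
        activations[name] = output
    return hook

# Assumed module path; adjust after inspecting the model.
handle = model.blocks[-1].register_forward_hook(save_output('last_block'))

with torch.inference_mode():
    _ = model(input_ids)

handle.remove()
hidden = activations['last_block']  # candidate sequence representation
```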