Max Seq length for inference #24
Prompting with longer sequences requires sharding the model, which is currently not supported. However, you can generate much longer sequences, up to 500k tokens and beyond, on a single 80 GB GPU. If you'd like to test the model with longer prompts, I recommend Together's API.
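For the long-prompt route via Together's API, a minimal sketch might look like the following (the together Python SDK client calls are real, but that Evo is served under its HuggingFace name there is an assumption, not confirmed in this thread):

import os
from together import Together  # pip install together

client = Together(api_key=os.environ["TOGETHER_API_KEY"])
response = client.completions.create(
    model="togethercomputer/evo-1-131k-base",  # assumed model identifier on Together
    prompt="ACGT" * 25_000,  # a long (~100k-character) nucleotide prompt
    max_tokens=256,
)
print(response.choices[0].text)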
Could you elaborate on how to generate 500k tokens on a single 80 GB GPU? I got OOM on an A100 with a 3 kb sequence. Thank you.
@pan-genome we were able to just use the standard HuggingFace sampling API (e.g., loading with AutoModelForCausalLM.from_pretrained and calling model.generate, as in the example below).
Could you provide a working code example? Thank you.
Something like:

from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer
import torch

# Raise the maximum sequence length beyond the 131k default.
model_config = AutoConfig.from_pretrained(
    'togethercomputer/evo-1-131k-base',
    trust_remote_code=True,
    revision="1.1_fix",
)
model_config.max_seqlen = 500_000

model = AutoModelForCausalLM.from_pretrained(
    'togethercomputer/evo-1-131k-base',
    config=model_config,
    trust_remote_code=True,
    revision="1.1_fix",
    torch_dtype=torch.bfloat16,  # half precision helps long generations fit on one GPU
    device_map="auto",
)

tokenizer = AutoTokenizer.from_pretrained(
    'togethercomputer/evo-1-131k-base',
    trust_remote_code=True,
    revision="1.1_fix",
)
input_ids = tokenizer("ACGT", return_tensors="pt").input_ids.to(model.device)  # short placeholder prompt

outputs = model.generate(
    input_ids,
    max_new_tokens=500_000,
    do_sample=True,  # needed for temperature/top_k sampling to take effect
    temperature=1.,
    top_k=4,
)
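To recover the generated nucleotide string from the sampled token IDs, something like this should work (a sketch, reusing the tokenizer loaded above):

generated = tokenizer.decode(outputs[0][input_ids.shape[1]:])  # drop the prompt tokens
print(len(generated), generated[:100])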
This issue was moved to a discussion. You can continue the conversation there.
May I ask what the proper range of input sequence lengths is for inference with the evo-1-131k-base model?
I tried a single A100 and got CUDA Out of Memory when the input was a single sequence longer than 1,000 tokens.
Thank you!