Context window and max_tokens management #17

Open
synw opened this issue Aug 20, 2023 · 0 comments
synw commented Aug 20, 2023

Running the "Write unit tests" command with a local Llama 2 model, I get an error message because of the default max_tokens param of 1000:

llama_predict: error: prompt is too long (1133 tokens, max 1020)

I would like to be able to set the context window size of the model (Llama 2 is 4096 tokens). This way the max_tokens param value could be automatically calculated using the llama-tokenizer-js lib:

import llamaTokenizer from 'llama-tokenizer-js';

// Tokenize the prompt to know how much of the context window it occupies
const promptNTokens = llamaTokenizer.encode(prompt).length;
// The remaining room in the context window can be used as max_tokens
const maxTokens = model_context_window_size_param - promptNTokens;
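
With that value in hand, the request could clamp it so it never goes negative and keep a small margin for special tokens. A minimal sketch only; the actual request parameters used by the extension may differ:

// Hypothetical usage: clamp the budget and pass it as max_tokens
const requestParams = {
  prompt,
  max_tokens: Math.max(0, maxTokens - 8), // assumed safety margin for special tokens
};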

[Edit]: it would need another tokenizer for OpenAI; this one is for local models.
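
For OpenAI models, a tokenizer for their BPE encodings could be dropped in the same way; for example with the gpt-tokenizer package (an assumption here, any compatible tokenizer would work):

import { encode } from 'gpt-tokenizer';

// Same calculation as above, with a GPT-compatible tokenizer
const promptNTokens = encode(prompt).length;
const maxTokens = model_context_window_size_param - promptNTokens;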
