Hello. Thank you for your work.
In this line:
https://github.com/codelion/optillm/blob/12ac7863cf713c5cd417f81ec1e52ed28017bc16/optillm/plugins/longcepo/mapreduce.py#L224
The token budget is computed as `max_context_window - tok_len(collapse_prompt) - max_output_tokens` to estimate how many tokens are available to fit all the answers produced in the MAP stage. However, it should probably be `max_context_window - tok_len(reduce_prompt) - max_output_tokens`, because the combined answers are actually passed to the reduce prompt in the next step, so it is the reduce prompt's length that needs to be reserved.
Here is a "pseudopatch" to demonstrate what I mean:
```diff
 num_tokens = get_prompt_length(format_chunk_list(context_chunks), tokenizer)
 token_budget = (
     longcepo_config.max_context_window
-    - get_prompt_length(longcepo_config.collapse_prompt, tokenizer)
+    - get_prompt_length(longcepo_config.reduce_prompt, tokenizer)
     - longcepo_config.max_output_tokens
 )
 logger.info(f"Pre-collapse length of chunks {num_tokens}, allowed {token_budget}")
```
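For reference, here is a minimal sketch of the constraint this budget is meant to guarantee at the REDUCE stage. It reuses `get_prompt_length` and `longcepo_config` from the snippet above; the `fits_reduce_stage` function itself is hypothetical and not part of the codebase:

```python
# Hypothetical sketch, not actual optillm/LongCePO code: it only spells out
# the constraint that the collapse-stage token budget is meant to satisfy.
# Assumes get_prompt_length is the same helper used in mapreduce.py above.
def fits_reduce_stage(combined_answers: str, longcepo_config, tokenizer) -> bool:
    # The final prompt at the REDUCE stage is roughly reduce_prompt plus the
    # combined MAP answers, and max_output_tokens must still be left free
    # for generation.
    final_prompt_tokens = (
        get_prompt_length(longcepo_config.reduce_prompt, tokenizer)
        + get_prompt_length(combined_answers, tokenizer)
    )
    return (
        final_prompt_tokens + longcepo_config.max_output_tokens
        <= longcepo_config.max_context_window
    )

# Rearranging the inequality gives the budget for the combined answers:
#   tok_len(combined_answers) <= max_context_window - tok_len(reduce_prompt) - max_output_tokens
# which is why the collapse loop should subtract reduce_prompt rather than collapse_prompt.
```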