Conversation

@mitya52
Contributor

@mitya52 mitya52 commented Jun 8, 2025

No description provided.

@mitya52 mitya52 requested a review from humbertoyusta June 8, 2025 17:29
Contributor

@humbertoyusta humbertoyusta left a comment

Approved, but please take a look at my comment about the max allowed context.


- ALLOWED_N_CTX = [2 ** p for p in range(10, 20)]
+ # ALLOWED_N_CTX = [2 ** p for p in range(10, 20)]
+ ALLOWED_N_CTX = [1024, 2048, 4096] + [8192 * (t + 1) for t in range(0, 16)]
Contributor

Are we reducing the max allowed n_ctx from 2^20 to 2^17? There are already models with contexts of around 200,000 and 1,000,000 tokens.

Contributor Author

I'll increase it, but in fact we don't have models with context > 40k.
Large context requires some hacks.
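
The numbers discussed in this thread can be checked by evaluating both list definitions from the diff. The sketch below is only an illustration: the names `OLD_ALLOWED_N_CTX`, `NEW_ALLOWED_N_CTX`, and the helper `smallest_allowed` are hypothetical and not part of the PR. Note that `range(10, 20)` stops at `2 ** 19`, so the old maximum evaluates to 524288 rather than 2^20.

```python
# Old definition from the diff: powers of two from 2**10 up to 2**19.
OLD_ALLOWED_N_CTX = [2 ** p for p in range(10, 20)]

# New definition from the diff: three small sizes, then multiples of 8192.
NEW_ALLOWED_N_CTX = [1024, 2048, 4096] + [8192 * (t + 1) for t in range(0, 16)]

old_max = max(OLD_ALLOWED_N_CTX)  # 524288 == 2**19
new_max = max(NEW_ALLOWED_N_CTX)  # 131072 == 2**17


def smallest_allowed(n_tokens, allowed):
    """Hypothetical helper: smallest allowed n_ctx that fits n_tokens, else None."""
    return next((n for n in sorted(allowed) if n >= n_tokens), None)


print(old_max, new_max)
print(smallest_allowed(40_000, NEW_ALLOWED_N_CTX))   # 40960
print(smallest_allowed(200_000, NEW_ALLOWED_N_CTX))  # None: exceeds the new maximum
```

Under the new list, a 40k context still fits (rounded up to 40960), while a 200,000-token context has no allowed value until the upper bound is raised.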

@mitya52 mitya52 merged commit a59042e into dev Jun 9, 2025
9 checks passed
@mitya52 mitya52 deleted the qwen3-thinking-08-06-25 branch June 9, 2025 11:33

3 participants