System becomes unresponsive using large context size limit #125

@preshing

Thank you for this fine project!

On DGX Spark, as of commit eb3c0b8, launching the server with --ctx 1000000 and then sending a prompt from Pi makes the entire system unresponsive, and it needs a hard reset. The last line logged to the console is "ds4-server: chat ctx=0..5153:5153 TOOLS prefill chunk 0/5153 (0.0%) chunk=0.00 t/s avg=0.00 t/s 0.222s".

The issue doesn't occur with smaller values such as --ctx 250000.

It also doesn't occur at the previous commit.
