Q: Is this happening with other endpoints as well? Let's round them up and fix 🙏
logs
- llama_engine.cc:443
20240620 06:47:06.363966 UTC 1296850 INFO Request 7: Streamed, waiting for respone - llama_engine.cc:565
20240620 06:47:06.364026 UTC 1296850 DEBUG [makeHeaderString] send stream with transfer-encoding chunked - HttpResponseImpl.cc:535
20240620 06:47:06.364179 UTC 1297078 DEBUG [LaunchSlotWithData] slot 0 is processing [task id: 10] - llama_server_context.cc:602
20240620 06:47:06.404551 UTC 1297078 INFO kv cache rm [p0, end) - id_slot: 0, task_id: 10, p0: 0 - llama_server_context.cc:1522
20240620 06:47:35.917041 UTC 1297078 DEBUG [PrintTimings] PrintTimings: prompt eval time = 9200.033ms / 596 tokens (15.4362969799 ms per token, 64.7823763241 tokens per second) - llama_client_slot.cc:79
20240620 06:47:35.917858 UTC 1297078 DEBUG [PrintTimings] PrintTimings: eval time = 20352.056 ms / 425 runs (47.8871905882 ms per token, 20.882411094 tokens per second) - llama_client_slot.cc:86
20240620 06:47:35.917861 UTC 1297078 DEBUG [PrintTimings] PrintTimings: total time = 29552.089 ms - llama_client_slot.cc:92
20240620 06:47:35.917914 UTC 1297078 INFO slot released: id_slot: 0, id_task: 10, n_ctx: 2048, n_past: 1021, n_system_tokens: 0, n_cache_tokens: 0, truncated: 0 - llama_server_context.cc:1282
20240620 06:47:35.917957 UTC 1297079 INFO Request 7: End of result - llama_engine.cc:596
20240620 06:47:35.917994 UTC 1297079 INFO Request 7: Task completed, release it - llama_engine.cc:629
20240620 06:47:35.917996 UTC 1297079 INFO Request 7: Inference completed - llama_engine.cc:643
20240620 06:51:58.998618 UTC 1296851 INFO Request 8, model llama3-8b-instruct: Generating reponse for inference request - llama_engine.cc:428
20240620 06:51:58.999279 UTC 1296851 INFO Request 8: Stop words:[ "<|end_of_text|>", "<|eot_id|>" ] - llama_engine.cc:443
20240620 06:51:59.000155 UTC 1296851 INFO Request 8: Streamed, waiting for respone - llama_engine.cc:565
20240620 06:51:59.000455 UTC 1297078 INFO slot released: id_slot: 0, id_task: 10, n_ctx: 2048, n_past: 1021, n_system_tokens: 0, n_cache_tokens: 0, truncated: 0 - llama_server_context.cc:1282
20240620 06:51:59.000975 UTC 1296851 DEBUG [makeHeaderString] send stream with transfer-encoding chunked - HttpResponseImpl.cc:535
20240620 06:51:59.000995 UTC 1297078 DEBUG [LaunchSlotWithData] slot 0 is processing [task id: 12] - llama_server_context.cc:602
20240620 06:51:59.037585 UTC 1297078 INFO kv cache rm [p0, end) - id_slot: 0, task_id: 12, p0: 0 - llama_server_context.cc:1522
20240620 06:52:27.962016 UTC 1297078 DEBUG [PrintTimings] PrintTimings: prompt eval time = 9886.169ms / 1089 tokens (9.07820844812 ms per token, 110.153892777 tokens per second) - llama_client_slot.cc:79
20240620 06:52:27.962061 UTC 1297078 DEBUG [PrintTimings] PrintTimings: eval tim
2024-06-20T08:01:59.130Z [CORTEX]::Debug: Request to kill cortex
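For rounding up the other endpoints, here is a minimal sketch for reproducing the streamed request shown in the logs above. The base URL http://localhost:1337 and the OpenAI-compatible /v1/chat/completions route are assumptions about the local server setup (adjust to yours); the model name llama3-8b-instruct is taken from the logs.

```python
# Minimal repro sketch for a streamed chat-completion request.
# Assumptions: local server at http://localhost:1337 exposing an
# OpenAI-compatible /v1/chat/completions route.
import json
import requests

BASE_URL = "http://localhost:1337"  # assumed default; change if your server differs

payload = {
    "model": "llama3-8b-instruct",  # model name taken from the logs above
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": True,  # exercise the streamed (transfer-encoding: chunked) path
}

with requests.post(f"{BASE_URL}/v1/chat/completions", json=payload, stream=True, timeout=120) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if not line:
            continue
        chunk = line.decode("utf-8")
        # SSE chunks are prefixed with "data: "; the stream ends with "data: [DONE]"
        if chunk.startswith("data: "):
            chunk = chunk[len("data: "):]
        if chunk == "[DONE]":
            break
        delta = json.loads(chunk)["choices"][0].get("delta", {})
        print(delta.get("content", ""), end="", flush=True)
```

The same pattern (with `"stream": False` or a non-chat route) can be repeated against the other endpoints to check whether they show the same hang before the "Request to kill cortex" message.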
Resolved in Jan 0.5.1.