
bug: Anthropic stream broken (potentially other endpoints) #744

Closed
0xSage opened this issue Jun 20, 2024 · 1 comment
Labels
type: bug Something isn't working


0xSage (Contributor) commented Jun 20, 2024

  1. Add a valid Anthropic key
  2. Turn on streaming
  3. Chat with Claude
  4. No response is returned
  5. Errors appear in the logs

Q: Is this happening with other endpoints as well?
Let's round them up and fix them 🙏
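One way to narrow down whether the break is in the upstream stream or in the client's handling of it is to parse the raw server-sent events yourself and check that the text deltas accumulate cleanly. The sketch below is a minimal, hypothetical helper (the function name `accumulate_sse_text` and the sample lines are mine, not from this issue); the event names and delta shape follow Anthropic's Messages API streaming format as documented, but treat the exact fields as an assumption to verify against a real capture.

```python
import json

def accumulate_sse_text(lines):
    """Collect text deltas from Anthropic-style SSE 'data:' lines.

    Assumes the Messages API streaming format, where content_block_delta
    events carry {"delta": {"type": "text_delta", "text": ...}}.
    """
    out = []
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip 'event:' lines and blank keep-alives
        payload = line[len("data:"):].strip()
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue  # a broken stream can truncate a JSON chunk mid-line
        if event.get("type") == "content_block_delta":
            delta = event.get("delta", {})
            if delta.get("type") == "text_delta":
                out.append(delta.get("text", ""))
    return "".join(out)

# Hypothetical capture of a healthy stream fragment:
sample = [
    'event: content_block_delta',
    'data: {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "Hel"}}',
    'data: {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "lo"}}',
    'data: {"type": "message_stop"}',
]
print(accumulate_sse_text(sample))  # prints: Hello
```

If the raw capture reassembles fine here but the app still shows no response, the bug is likely in the client's stream handling rather than in the endpoint itself.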

Logs:

```
 - llama_engine.cc:443
20240620 06:47:06.363966 UTC 1296850 INFO  Request 7: Streamed, waiting for respone - llama_engine.cc:565
20240620 06:47:06.364026 UTC 1296850 DEBUG [makeHeaderString] send stream with transfer-encoding chunked - HttpResponseImpl.cc:535
20240620 06:47:06.364179 UTC 1297078 DEBUG [LaunchSlotWithData] slot 0 is processing [task id: 10] - llama_server_context.cc:602
20240620 06:47:06.404551 UTC 1297078 INFO  kv cache rm [p0, end) -  id_slot: 0, task_id: 10, p0: 0 - llama_server_context.cc:1522
20240620 06:47:35.917041 UTC 1297078 DEBUG [PrintTimings] PrintTimings: prompt eval time = 9200.033ms / 596 tokens (15.4362969799 ms per token, 64.7823763241 tokens per second) - llama_client_slot.cc:79
20240620 06:47:35.917858 UTC 1297078 DEBUG [PrintTimings] PrintTimings:        eval time = 20352.056 ms / 425 runs   (47.8871905882 ms per token, 20.882411094 tokens per second)
 - llama_client_slot.cc:86
20240620 06:47:35.917861 UTC 1297078 DEBUG [PrintTimings] PrintTimings:       total time = 29552.089 ms - llama_client_slot.cc:92
20240620 06:47:35.917914 UTC 1297078 INFO  slot released: id_slot: 0, id_task: 10, n_ctx: 2048, n_past: 1021, n_system_tokens: 0, n_cache_tokens: 0, truncated: 0 - llama_server_context.cc:1282
20240620 06:47:35.917957 UTC 1297079 INFO  Request 7: End of result - llama_engine.cc:596
20240620 06:47:35.917994 UTC 1297079 INFO  Request 7: Task completed, release it - llama_engine.cc:629
20240620 06:47:35.917996 UTC 1297079 INFO  Request 7: Inference completed - llama_engine.cc:643
20240620 06:51:58.998618 UTC 1296851 INFO  Request 8, model llama3-8b-instruct: Generating reponse for inference request - llama_engine.cc:428
20240620 06:51:58.999279 UTC 1296851 INFO  Request 8: Stop words:[
	"<|end_of_text|>",
	"<|eot_id|>"
]
 - llama_engine.cc:443
20240620 06:51:59.000155 UTC 1296851 INFO  Request 8: Streamed, waiting for respone - llama_engine.cc:565
20240620 06:51:59.000455 UTC 1297078 INFO  slot released: id_slot: 0, id_task: 10, n_ctx: 2048, n_past: 1021, n_system_tokens: 0, n_cache_tokens: 0, truncated: 0 - llama_server_context.cc:1282
20240620 06:51:59.000975 UTC 1296851 DEBUG [makeHeaderString] send stream with transfer-encoding chunked - HttpResponseImpl.cc:535
20240620 06:51:59.000995 UTC 1297078 DEBUG [LaunchSlotWithData] slot 0 is processing [task id: 12] - llama_server_context.cc:602
20240620 06:51:59.037585 UTC 1297078 INFO  kv cache rm [p0, end) -  id_slot: 0, task_id: 12, p0: 0 - llama_server_context.cc:1522
20240620 06:52:27.962016 UTC 1297078 DEBUG [PrintTimings] PrintTimings: prompt eval time = 9886.169ms / 1089 tokens (9.07820844812 ms per token, 110.153892777 tokens per second) - llama_client_slot.cc:79
20240620 06:52:27.962061 UTC 1297078 DEBUG [PrintTimings] PrintTimings:        eval tim
2024-06-20T08:01:59.130Z [CORTEX]::Debug: Request to kill cortex
```
0xSage added the type: bug label on Jun 20, 2024
Van-QA (Contributor) commented Jun 21, 2024

Resolved in Jan 0.5.1

Van-QA closed this as completed on Jun 21, 2024
No branches or pull requests · 2 participants