refactor assistant streaming and create OpenAI compliant base class #425
Conversation
ragna/assistants/_api.py
This file was renamed to `_http_api.py`, but has enough changes that git does not recognize it as a rename.
```python
class HttpApiAssistant(Assistant):
    _API_KEY_ENV_VAR: Optional[str]
```
The API key is now optional. See #375 for a discussion.
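To illustrate the two cases this enables, a minimal sketch; the subclass names and `SOME_PROVIDER_API_KEY` are made up, and the import path assumes the `_http_api.py` rename above:

```python
from typing import Optional

from ragna.assistants._http_api import HttpApiAssistant  # path per this PR


class LocalAssistant(HttpApiAssistant):
    # A locally hosted model needs no key at all.
    _API_KEY_ENV_VAR: Optional[str] = None


class HostedAssistant(HttpApiAssistant):
    # A hosted provider still declares which env var holds its key.
    _API_KEY_ENV_VAR: Optional[str] = "SOME_PROVIDER_API_KEY"  # hypothetical
```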
```diff
@@ -21,8 +21,8 @@ def _make_system_content(self, sources: list[Source]) -> str:
         )
         return instruction + "\n\n".join(source.content for source in sources)

     async def _call_api(
```
The `_call_api` abstraction for `answer` was just a remnant of an old implementation that I forgot to clean up earlier:
ragna/ragna/assistants/_api.py, lines 32 to 38 in 84cf4f6:

```python
async def answer(
    self, prompt: str, sources: list[Source], *, max_new_tokens: int = 256
) -> AsyncIterator[str]:
    async for chunk in self._call_api(
        prompt, sources, max_new_tokens=max_new_tokens
    ):
        yield chunk
```
This PR removes it, and all subclasses simply implement `answer` directly.
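For illustration, a sketch of the new shape of a subclass. Only `HttpApiAssistant`, `answer`, and `_make_system_content` come from this PR; the subclass, endpoint, and payload are invented:

```python
from typing import AsyncIterator

import httpx

from ragna.assistants._http_api import HttpApiAssistant  # path per this PR
from ragna.core import Source


class ExampleAssistant(HttpApiAssistant):
    _API_KEY_ENV_VAR = None

    async def answer(
        self, prompt: str, sources: list[Source], *, max_new_tokens: int = 256
    ) -> AsyncIterator[str]:
        # Talk to the provider API directly instead of going through the
        # removed _call_api indirection.
        async with httpx.AsyncClient() as client:
            async with client.stream(
                "POST",
                "https://api.example.com/v1/generate",  # hypothetical endpoint
                json={
                    "system": self._make_system_content(sources),
                    "prompt": prompt,
                    "max_tokens": max_new_tokens,
                    "stream": True,
                },
            ) as response:
                async for chunk in response.aiter_text():
                    yield chunk
```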
```diff
-class AnthropicApiAssistant(ApiAssistant):
+class AnthropicAssistant(HttpApiAssistant):
```
Drive-by rename to align it with the other provider base classes.
This is a demonstration of how easy it is, after this PR, to add new OpenAI-compliant assistants.
```python
    yield cast(str, choice["delta"]["content"])


class OpenaiAssistant(OpenaiCompliantHttpApiAssistant):
```
The public OpenAI API fits the new scheme nicely.
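As a sketch of what hooking up yet another OpenAI-compliant provider could look like: the provider, URL, and env var below are hypothetical, while `OpenaiCompliantHttpApiAssistant` and the `_url` hook are from this PR.

```python
class ExampleProviderAssistant(OpenaiCompliantHttpApiAssistant):
    _API_KEY_ENV_VAR = "EXAMPLE_PROVIDER_API_KEY"  # hypothetical

    @property
    def _url(self) -> str:
        # All streaming and response parsing is inherited from the base
        # class; only the endpoint differs per provider.
        return "https://api.example-provider.com/v1/chat/completions"
```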
```python
@pytest.mark.parametrize(
    "assistant",
    [assistant for assistant in HTTP_API_ASSISTANTS if assistant._API_KEY_ENV_VAR],
)
```
See #375.
```python
@property
def _url(self) -> str:
    base_url = os.environ.get("RAGNA_LLAMAFILE_BASE_URL", "http://localhost:8080")
```
@nenb, is port 8080 the default?
This seems to be the case.
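For reference, a tiny example of pointing ragna at a non-default address via the env var from the diff above (the address is made up):

```python
import os

# llamafile wraps llama.cpp's server, which serves on port 8080 by default,
# hence the fallback above. Override it for a non-default setup:
os.environ["RAGNA_LLAMAFILE_BASE_URL"] = "http://192.168.1.20:9090"  # example
```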
If this PR is accepted, I'll have a go at #376 and bring it up to speed.
Great work! Delighted that `ragna` can connect with many local LLMs so easily now, thank you!
Touching on #424 (comment):
That is certainly up for debate. I think the most practical definition, given the variety of cases here, is: any REST API is OpenAI-compliant if it uses the same request and response schema as OpenAI. For practicality, we allow the following deviations:
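For concreteness, the base schema referenced here is OpenAI's public chat completions API; a sketch of the streaming exchange (nothing below is ragna-specific):

```python
# Request body, sent to POST {base_url}/v1/chat/completions
request = {
    "model": "gpt-4",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    "max_tokens": 256,
    "stream": True,
}

# With stream=True, the response arrives as server-sent events whose data
# payloads are JSON chunks of this shape:
chunk = {
    "choices": [
        {"index": 0, "delta": {"content": "Hel"}, "finish_reason": None},
    ],
}

# which is exactly what the `yield cast(str, choice["delta"]["content"])`
# line above pulls the streamed text out of.
```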
This came from an offline discussion with @nenb and supersedes #424. It also paves the way for #375.
The two main changes are the refactor of the assistant streaming and the new OpenAI-compliant base class.
More details inline.