Implement fake-streaming for non-streaming tool models #251
Merged
+218
−37
20 commits
c92d249 WIP non streaming tool use support (baasitsharief)
05a1429 only fake-stream when the model doesn't support real streaming (jkwatson)
d34b520 ruff? (jkwatson)
0aa92f9 WIP error handling and non stream tool support (baasitsharief)
0b3cb5d return base_model_name without region prefix on error (baasitsharief)
faae6c2 ruff and mypy checks (baasitsharief)
b1f3b44 minor agent prompt changes (baasitsharief)
f97e1e0 pre-format the date in a more human-readable format (jkwatson)
efc5fde add directions in readme for modifying the UI in CML (ewilliams-cloudera)
9d2a2d5 do not yield responses with an empty string (ewilliams-cloudera)
13edacd Update llm-service/app/services/query/querier.py (jkwatson)
306f526 a bit of cleanup (jkwatson)
4e2b9a3 de-dupe some code (jkwatson)
ed47ee0 mypy (jkwatson)
cb27892 remove redundant constructor (jkwatson)
148821f put it back, mypy (jkwatson)
04b3fcb update name of session in form if it's updated on the session (ewilliams-cloudera)
f9c400d WIP slight prompt changes (baasitsharief)
9ccd3e8 custom prompt helper for openai models/summarization (jkwatson)
0a2ca23 fix bad imports (jkwatson)
97 changes: 97 additions & 0 deletions
llm-service/app/services/query/agents/non_streamer_bedrock_converse.py
@@ -0,0 +1,97 @@
from typing import (
    Sequence,
    Optional,
    Union,
    Any,
    List,
    AsyncGenerator,
)

from llama_index.core.base.llms.types import (
    ChatMessage,
    ChatResponse,
)
from llama_index.core.tools import BaseTool
from llama_index.llms.bedrock_converse import BedrockConverse


class FakeStreamBedrockConverse(BedrockConverse):
    """
    A class that inherits from BedrockConverse but overrides its astream_chat_with_tools function.
    This class is used to create a non-streaming version of the BedrockConverse.
    """

    def __init__(self, *args: Any, **kwargs: Any) -> None:
        """
        Initialize the FakeStreamBedrockConverse class.
        """
        super().__init__(*args, **kwargs)

    async def astream_chat_with_tools(
        self,
        tools: Sequence["BaseTool"],
        user_msg: Optional[Union[str, ChatMessage]] = None,
        chat_history: Optional[List[ChatMessage]] = None,
        verbose: bool = False,
        allow_parallel_tool_calls: bool = False,
        tool_required: bool = False,
        **kwargs: Any,
    ) -> AsyncGenerator[ChatResponse, None]:
        # This method is overridden to provide a non-streaming version of the chat with tools.
        # Here we yield a single ChatResponse object instead of streaming multiple responses.
        async def _fake_stream() -> AsyncGenerator[ChatResponse, None]:
            response = await self.achat_with_tools(
                tools=tools,
                user_msg=user_msg,
                chat_history=chat_history,
                verbose=verbose,
                allow_parallel_tool_calls=allow_parallel_tool_calls,
                tool_required=tool_required,
                **kwargs,
            )
            yield response

        return _fake_stream()

    @classmethod
    def from_bedrock_converse(
        cls, bedrock_converse: BedrockConverse
    ) -> "FakeStreamBedrockConverse":
        """
        Create a FakeStreamBedrockConverse object from a BedrockConverse object.

        Args:
            bedrock_converse: A BedrockConverse object

        Returns:
            A FakeStreamBedrockConverse object with the same public attributes as the input BedrockConverse
        """
        # Create a new instance of FakeStreamBedrockConverse with only the public parameters
        # Let the parent class handle initialization of private attributes
        return cls(
            model=bedrock_converse.model,
            temperature=bedrock_converse.temperature,
            max_tokens=bedrock_converse.max_tokens,
            additional_kwargs=bedrock_converse.additional_kwargs,
            callback_manager=bedrock_converse.callback_manager,
            system_prompt=bedrock_converse.system_prompt,
            messages_to_prompt=bedrock_converse.messages_to_prompt,
            completion_to_prompt=bedrock_converse.completion_to_prompt,
            pydantic_program_mode=bedrock_converse.pydantic_program_mode,
            output_parser=bedrock_converse.output_parser,
            profile_name=bedrock_converse.profile_name,
            aws_access_key_id=bedrock_converse.aws_access_key_id,
            aws_secret_access_key=bedrock_converse.aws_secret_access_key,
            aws_session_token=bedrock_converse.aws_session_token,
            region_name=bedrock_converse.region_name,
            api_version=bedrock_converse.api_version,
            use_ssl=bedrock_converse.use_ssl,
            verify=bedrock_converse.verify,
            endpoint_url=bedrock_converse.endpoint_url,
            timeout=bedrock_converse.timeout,
            max_retries=bedrock_converse.max_retries,
            guardrail_identifier=bedrock_converse.guardrail_identifier,
            guardrail_version=bedrock_converse.guardrail_version,
            application_inference_profile_arn=bedrock_converse.application_inference_profile_arn,
            trace=bedrock_converse.trace,
        )
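
For readers who want to try the fake-streaming pattern from non_streamer_bedrock_converse.py without AWS credentials, here is a minimal sketch of the same shape. It does not use llama_index: Response, NonStreamingModel, FakeStreamModel, collect and run_demo are hypothetical stand-ins invented for illustration, and the method signatures are simplified to a single user_msg argument. The point it is meant to show is that the override is a coroutine returning an async generator, so a caller awaits it first and then iterates, and receives exactly one chunk.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncGenerator, List


@dataclass
class Response:
    """Hypothetical stand-in for a chat response object."""

    content: str


class NonStreamingModel:
    """Hypothetical stand-in for a model that can only answer in one shot."""

    async def achat_with_tools(self, user_msg: str) -> Response:
        await asyncio.sleep(0)  # placeholder for the real model call
        return Response(content=f"answer to: {user_msg}")


class FakeStreamModel(NonStreamingModel):
    """Exposes a streaming-shaped method backed by the one-shot call."""

    async def astream_chat_with_tools(
        self, user_msg: str
    ) -> AsyncGenerator[Response, None]:
        async def _fake_stream() -> AsyncGenerator[Response, None]:
            # The complete response is yielded as the only "chunk".
            yield await self.achat_with_tools(user_msg)

        return _fake_stream()


async def collect(model: FakeStreamModel, user_msg: str) -> List[str]:
    # Await the coroutine to obtain the generator, then iterate over it.
    stream = await model.astream_chat_with_tools(user_msg)
    return [chunk.content async for chunk in stream]


def run_demo() -> List[str]:
    return asyncio.run(collect(FakeStreamModel(), "hello"))


if __name__ == "__main__":
    print(run_demo())
```

Because the consumer only sees an async generator, code written against a real streaming model should not need to change; it simply gets the whole answer in one iteration instead of many.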