
refactor(agent): Refactor & improve create_chat_completion #7082

Merged: @Pwuts merged 4 commits into master from refactor-llm-provider-stuff on Apr 16, 2024

Conversation

@Pwuts Pwuts (Member) commented Apr 15, 2024

  • feat(agent/utils): Improve conditions and errors in json_loads
  • refactor(agent/core): Rearrange and split up OpenAIProvider.create_chat_completion
  • fix(agent): Handle decoding of function call arguments in create_chat_completion

feat(agent/utils): Improve conditions and errors in `json_loads`

- Include all decoding errors when raising a ValueError on decode failure
- Use the errors returned by `return_errors` instead of an error buffer (see the sketch below)
- Fix the check for decode failure
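
A minimal sketch of that pattern, assuming demjson3's `return_errors=True` mode, which returns a result object exposing the decoded value as `.object` and the collected errors as `.errors` (the exact post-PR implementation may differ):

```python
from typing import Any

import demjson3


def json_loads(json_str: str) -> Any:
    """Parse a JSON string fault-tolerantly (illustrative sketch)."""
    result = demjson3.decode(json_str, return_errors=True)

    # `demjson3.undefined` is the sentinel for "no value could be decoded".
    if result is None or result.object is demjson3.undefined:
        # Raise with *all* collected decoding errors, not just the first one.
        errors = result.errors if result else []
        raise ValueError(f"Failed to parse JSON string: {json_str}", *errors)

    return result.object
```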
refactor(agent/core): Rearrange and split up `OpenAIProvider.create_chat_completion`

- Rearrange to reduce complexity, improve separation/abstraction of concerns, and allow multiple points of failure during parsing (see the sketch after these commit notes)
- Move the conversion from `ChatMessage` to `openai.types.ChatCompletionMessageParam` into `_get_chat_completion_args`
- Move the token usage and cost tracking boilerplate into `_create_chat_completion`
- Move tool call conversion/parsing into `_parse_assistant_tool_calls` (new)
fix(agent): Handle decoding of function call arguments in `create_chat_completion`

- Amend `model_providers.schema`: change the type of `arguments` from `str` to `dict[str, Any]` on `AssistantFunctionCall` and `AssistantFunctionCallDict`
- Implement robust and transparent parsing in `OpenAIProvider._parse_assistant_tool_calls`
- Remove now-unnecessary `json_loads` calls throughout the codebase
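
For orientation, a rough sketch of the resulting shape of the provider (the three helper names come from the commit notes above; the signatures and bodies here are illustrative, not the PR's exact code):

```python
from typing import Any, Callable, Optional, TypeVar

_T = TypeVar("_T")


class OpenAIProvider:
    async def create_chat_completion(
        self,
        model_prompt: list[Any],  # list[ChatMessage]
        model_name: str,
        completion_parser: Callable[[Any], _T] = lambda _: None,
        functions: Optional[list[Any]] = None,
        **kwargs: Any,
    ) -> Any:  # ChatModelResponse[_T]
        # 1. Prompt & argument preparation, including the ChatMessage ->
        #    openai.types.ChatCompletionMessageParam conversion.
        openai_messages, completion_kwargs = self._get_chat_completion_args(
            model_prompt, model_name, functions, **kwargs
        )

        # 2. The raw API call, plus token usage and cost tracking boilerplate.
        response = await self._create_chat_completion(
            openai_messages, **completion_kwargs
        )

        # 3. Tool call conversion/parsing; `arguments` are decoded into
        #    dict[str, Any] here, and all parse errors are surfaced together
        #    so a single retry can address every failure at once.
        tool_calls, parse_errors = self._parse_assistant_tool_calls(
            response.choices[0].message
        )
        ...
```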

netlify bot commented Apr 15, 2024

Deploy Preview for auto-gpt-docs canceled.

🔨 Latest commit: 8fdacdc
🔍 Latest deploy log: https://app.netlify.com/sites/auto-gpt-docs/deploys/661d805ad9750a00082152aa

@Pwuts Pwuts requested a review from kcze April 15, 2024 18:23
@Pwuts Pwuts changed the title Refactor & improve create_chat_completion refactor(agent): Refactor & improve create_chat_completion Apr 15, 2024

codecov bot commented Apr 15, 2024

Codecov Report

Attention: Patch coverage is 15.58442%, with 65 lines in your changes missing coverage. Please review.

Project coverage is 45.24%. Comparing base (d7f00a9) to head (8fdacdc).

Files Patch % Lines
...pt/autogpt/core/resource/model_providers/openai.py 7.81% 59 Missing ⚠️
autogpts/autogpt/autogpt/core/utils/json_utils.py 66.66% 1 Missing and 1 partial ⚠️
...autogpt/autogpt/agent_factory/profile_generator.py 0.00% 1 Missing ⚠️
...pt/core/planning/prompt_strategies/initial_plan.py 0.00% 1 Missing ⚠️
.../core/planning/prompt_strategies/name_and_goals.py 0.00% 1 Missing ⚠️
...pt/core/planning/prompt_strategies/next_ability.py 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #7082      +/-   ##
==========================================
- Coverage   45.44%   45.24%   -0.20%     
==========================================
  Files         139      139              
  Lines        6569     6595      +26     
  Branches      924      932       +8     
==========================================
- Hits         2985     2984       -1     
- Misses       3434     3461      +27     
  Partials      150      150              
Flag Coverage Δ
Linux 45.14% <15.58%> (-0.20%) ⬇️
Windows 43.39% <15.58%> (-0.19%) ⬇️
autogpt-agent 45.21% <15.58%> (-0.20%) ⬇️
macOS 44.59% <15.58%> (-0.20%) ⬇️

Flags with carried forward coverage won't be shown.


@kcze kcze (Contributor) left a comment


I think the changes are good; moving json_loads inside the provider makes it easier to use.
Regarding complexity: using create_chat_completion with a large parsing function and all the mixins made it difficult to understand the order of execution in the agent.
Maybe we could introduce some higher-level wrapper that handles the different LLM APIs? It could accept a parsing function and handle retries instead of the model_providers.
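
To illustrate the suggestion, a hypothetical sketch (all names here are invented, not existing code): a wrapper that routes each call to the right provider and owns the parse-and-retry loop itself:

```python
from typing import Any, Callable, Optional


class LLMMultiplexer:
    """Hypothetical higher-level wrapper over multiple model providers."""

    def __init__(self, providers: dict[str, Any]):
        # e.g. {"gpt-4-turbo": OpenAIProvider(...), "llama3": LlamafileProvider(...)}
        self.providers = providers

    async def create_chat_completion(
        self,
        model_prompt: list[Any],
        model_name: str,
        completion_parser: Optional[Callable[[Any], Any]] = None,
        max_retries: int = 3,
        **kwargs: Any,
    ) -> Any:
        provider = self.providers[model_name]
        for _ in range(max_retries):
            response = await provider.create_chat_completion(
                model_prompt, model_name, **kwargs
            )
            try:
                parsed = completion_parser(response) if completion_parser else None
                return response, parsed
            except Exception as e:
                # Append the parse error as feedback and regenerate.
                model_prompt = [*model_prompt, {"role": "system", "content": str(e)}]
        raise RuntimeError("LLM response could not be parsed after retries")
```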

Comment on lines -409 to +417
-        completion_kwargs = self._get_completion_kwargs(model_name, functions, **kwargs)
-        tool_calls_compat_mode = functions and "tools" not in completion_kwargs
-        if "messages" in completion_kwargs:
-            model_prompt += completion_kwargs["messages"]
-            del completion_kwargs["messages"]
+        openai_messages, completion_kwargs = self._get_chat_completion_args(
+            model_prompt, model_name, functions, **kwargs
+        )
+        tool_calls_compat_mode = bool(functions and "tools" not in completion_kwargs)
kcze (Contributor) commented:

This may be out of scope, but can we make it possible to use create_chat_completion without providing model_name? It could be a default or chosen by config / the agent.

@Pwuts Pwuts (Member, Author) commented Apr 16, 2024

Yes, but I'd rather implement that at a higher level. I want to have an LLM call multiplexer (a sort of wrapper function) through which it's possible to use all available models/providers. That way it would be easy to use multiple providers throughout the application.

> Maybe we could introduce some higher-level wrapper that would handle different LLM APIs?

Yes :)

> And it could accept a parsing function and handle retries instead of model_providers.

That would be a nice clean solution, but some ModelProviders will still need internal retry mechanisms to ensure reliable output. OpenAI's function call argument parsing is one example, and I think we'll find more once we have a working llamafile integration.

When there are multiple points at which parsing can fail, it's most time- and cost-efficient to parse with a fail-soft strategy: running as many of the parsing steps as possible before stopping and regenerating the response with the feedback collected from the failed parsing steps. This is only possible if the ModelProvider itself implements the retry mechanism, because it is not allowed to return broken output.
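
Sketching that fail-soft loop (illustrative only; `_parse_assistant_tool_calls` is from this PR, everything else is made up for the example):

```python
async def _chat_completion_with_retries(
    self, openai_messages: list, completion_parser, max_attempts: int = 3, **kwargs
):
    for attempt in range(1, max_attempts + 1):
        response = await self._create_chat_completion(openai_messages, **kwargs)
        assistant_msg = response.choices[0].message

        # Fail-soft: run every parsing step, collecting errors instead of
        # aborting at the first failure.
        errors: list[Exception] = []

        tool_calls, tool_call_errors = self._parse_assistant_tool_calls(assistant_msg)
        errors.extend(tool_call_errors)

        parsed_result = None
        try:
            parsed_result = completion_parser(assistant_msg)
        except Exception as e:
            errors.append(e)

        if not errors:
            return assistant_msg, parsed_result
        if attempt == max_attempts:
            raise errors[0]

        # Regenerate with feedback from *all* failed steps, so one retry
        # can fix multiple problems at once.
        openai_messages.append(
            {"role": "assistant", "content": assistant_msg.content or ""}
        )
        openai_messages += [
            {"role": "system", "content": f"Error parsing your response: {e}"}
            for e in errors
        ]
```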

An additional potential benefit of implementing the retry mechanism in the ModelProvider is that we can tailor the feedback prompt to the LLM that is being used.

What we could implement at a higher level is richer parsing functionality: for example, pass in a prompt and a Pydantic model, and it ensures that the output is converted into an instance of that model.
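
Roughly like this hypothetical helper (`create_typed_completion` does not exist in the codebase; it leans on the provider's internal retry mechanism to re-prompt on validation errors):

```python
from typing import TypeVar

from pydantic import BaseModel

M = TypeVar("M", bound=BaseModel)


async def create_typed_completion(
    prompt: str, output_model: type[M], provider, model_name: str
) -> M:
    """Prompt in, validated Pydantic model instance out (sketch)."""

    def parse(assistant_message) -> M:
        # Raises pydantic.ValidationError on malformed output, which the
        # provider's retry loop turns into feedback for the next attempt.
        return output_model.model_validate_json(assistant_message.content)

    response = await provider.create_chat_completion(
        model_prompt=[{"role": "user", "content": prompt}],  # simplified message type
        model_name=model_name,
        completion_parser=parse,
    )
    return response.parsed_result  # assumes the response carries the parsed value
```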

@Pwuts Pwuts merged commit 7082e63 into master Apr 16, 2024
20 of 22 checks passed
@Pwuts Pwuts deleted the refactor-llm-provider-stuff branch April 16, 2024 08:38