
feat: LocalAI functions #726

Merged
merged 18 commits into master from functions on Jul 9, 2023

Conversation

@mudler (Owner) commented Jul 6, 2023

Description

This PR closes #588 and #354

It implements LocalAI functions, which are compatible with OpenAI functions but work differently under the hood (there is no expectation that the model returns valid JSON on its own).

It is based on the fantastic work done in llama.cpp by @ejones ( ❤️ ) in ggerganov/llama.cpp#1773 and ggerganov/llama.cpp#1887

This PR wires up functions to a grammar generator. The functions passed via API are parsed, and a grammar is generated to constrain the LLM to output specific JSON fields.

Small but necessary trade-offs:

Extra:

  • a grammar and a grammar_json_functions parameter are added to /chat/completions to pass a raw grammar, or JSON from which a grammar is built, to constrain the output (a sketch follows this list)
  • Introduces Makefile variables to pin llama.cpp independently of the binding (useful for testing!)
  • Adds a function template to customize the function prompt
  • Various fixes, among them one to display the version correctly

Limitations:

  • works only on llama.cpp-based backends
  • Currently this implementation does not support token streaming
  • a "no action" function is injected into the set of functions to allow the LLM to just return text; this is not customizable yet and is left as a later enhancement
  • ARM/GPU seems not to work as expected here (gcc/nvcc toolchain? x86_64 and x86_64+GPU work)

Next, nice-to-have items:

  • OpenAI schema parsing and output
  • Refactoring (I'll keep the two concerns scoped separately and just add functionality here)

With this it is now possible to emulate OpenAI functions and run their examples directly, as in the screencasts below:

[demo screencasts: localai-functions-1, functions-2]

See also my tweet: https://twitter.com/mudler_it/status/1675524071457533953

Notes for Reviewers

Signed commits

  • Yes, I signed my commits.

@mudler linked an issue Jul 6, 2023 that may be closed by this pull request
@mudler (Owner, Author) commented Jul 6, 2023

Also tested on x86_64 with GPU and it works. Seems all good to go.

Add notice to documentation

@Rybens92 commented Jul 8, 2023

Great work! But I have a problem using this PR with the LangChain OpenAI Functions Agent.

My code:

```python
from langchain.agents import AgentType, initialize_agent
from langchain.chat_models import ChatOpenAI

import tools  # my own module that provides tux_load_tools

# Point LangChain's OpenAI wrapper at LocalAI; the API key is unused but required.
llm = ChatOpenAI(
    temperature=0.5,
    model="wizardlm-13b-v1.1.ggmlv3.q8_0.bin",
    openai_api_base="http://localhost:8080/v1",
    openai_api_key="ed",
)

llm_tools = tools.tux_load_tools(llm, memory=None)

mrkl = initialize_agent(
    llm_tools, llm, agent=AgentType.OPENAI_MULTI_FUNCTIONS, verbose=True
)

mrkl.run("What is the weather in LA and SF?")
```

Output:

```
> Entering new chain...
Traceback (most recent call last):
  File "/mnt/linux/AI/TuxMate/function.py", line 28, in <module>
    mrkl.run(
  File "/mnt/linux/AI/TuxMate/.conda/lib/python3.10/site-packages/langchain/chains/base.py", line 440, in run
    return self(args[0], callbacks=callbacks, tags=tags, metadata=metadata)[
  File "/mnt/linux/AI/TuxMate/.conda/lib/python3.10/site-packages/langchain/chains/base.py", line 243, in __call__
    raise e
  File "/mnt/linux/AI/TuxMate/.conda/lib/python3.10/site-packages/langchain/chains/base.py", line 237, in __call__
    self._call(inputs, run_manager=run_manager)
  File "/mnt/linux/AI/TuxMate/.conda/lib/python3.10/site-packages/langchain/agents/agent.py", line 987, in _call
    next_step_output = self._take_next_step(
  File "/mnt/linux/AI/TuxMate/.conda/lib/python3.10/site-packages/langchain/agents/agent.py", line 792, in _take_next_step
    output = self.agent.plan(
  File "/mnt/linux/AI/TuxMate/.conda/lib/python3.10/site-packages/langchain/agents/openai_functions_multi_agent/base.py", line 270, in plan
    predicted_message = self.llm.predict_messages(
  File "/mnt/linux/AI/TuxMate/.conda/lib/python3.10/site-packages/langchain/chat_models/base.py", line 399, in predict_messages
    return self(messages, stop=_stop, **kwargs)
  File "/mnt/linux/AI/TuxMate/.conda/lib/python3.10/site-packages/langchain/chat_models/base.py", line 349, in __call__
    generation = self.generate(
  File "/mnt/linux/AI/TuxMate/.conda/lib/python3.10/site-packages/langchain/chat_models/base.py", line 125, in generate
    raise e
  File "/mnt/linux/AI/TuxMate/.conda/lib/python3.10/site-packages/langchain/chat_models/base.py", line 115, in generate
    self._generate_with_cache(
  File "/mnt/linux/AI/TuxMate/.conda/lib/python3.10/site-packages/langchain/chat_models/base.py", line 262, in _generate_with_cache
    return self._generate(
  File "/mnt/linux/AI/TuxMate/.conda/lib/python3.10/site-packages/langchain/chat_models/openai.py", line 372, in _generate
    return self._create_chat_result(response)
  File "/mnt/linux/AI/TuxMate/.conda/lib/python3.10/site-packages/langchain/chat_models/openai.py", line 388, in _create_chat_result
    message = _convert_dict_to_message(res["message"])
  File "/mnt/linux/AI/TuxMate/.conda/lib/python3.10/site-packages/langchain/chat_models/openai.py", line 112, in _convert_dict_to_message
    return FunctionMessage(content=_dict["content"], name=_dict["name"])
KeyError: 'content'
```

Are there any compatibility problems between the JSON output and LangChain's function agents?
Or is it my error?

@mudler (Owner, Author) commented Jul 9, 2023

Good catch, thanks @Rybens92. This should be fixed now.

@mudler merged commit 7aaa106 into master on Jul 9, 2023
14 checks passed
@mudler deleted the functions branch on July 9, 2023 at 11:38
@mudler added the enhancement (New feature or request) label on Jul 16, 2023
Successfully merging this pull request may close these issues:

  • feature: Chat completion functions
  • feature: constrained grammars