Add initial support for Zephyr 7b Beta #41

Merged
merged 23 commits into main from me-zephyr-chat-model on Feb 29, 2024
Conversation

@brainlid (Owner) commented Nov 30, 2023

This is for running the model directly on hardware using Nx and Bumblebee.

The Zephyr 7B Beta LLM doesn't have all the capabilities of ChatGPT, nor the safeguards.

What works:

What doesn't work:

  • cancelling - we can kill the process handling the stream, but the GPU may keep generating until it finishes or reaches the token limit
  • no function support

Closes #26

- add nx dep
- make Message struct editable through changeset - helps with UI
- initial streaming support works
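Running the model locally follows Bumblebee's standard text-generation flow. A minimal sketch, assuming the `HuggingFaceH4/zephyr-7b-beta` Hugging Face repo and the EXLA compiler (both assumptions, not taken from this PR):

```elixir
# Minimal sketch: load Zephyr 7B Beta with Bumblebee and build a streaming
# text-generation serving. Repo name and serving options are assumptions.
{:ok, model_info} = Bumblebee.load_model({:hf, "HuggingFaceH4/zephyr-7b-beta"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "HuggingFaceH4/zephyr-7b-beta"})
{:ok, generation_config} = Bumblebee.load_generation_config({:hf, "HuggingFaceH4/zephyr-7b-beta"})

serving =
  Bumblebee.Text.generation(model_info, tokenizer, generation_config,
    stream: true,
    defn_options: [compiler: EXLA]
  )

# With stream: true the result is enumerable; each chunk arrives as the
# model generates tokens, which is what the streaming support hooks into.
serving
|> Nx.Serving.run("What is Elixir?")
|> Enum.each(&IO.write/1)
```

Note the cancellation caveat above: killing the Elixir process consuming this stream does not necessarily stop the GPU-side generation.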
@brainlid (Owner, Author) commented Dec 6, 2023

Zephyr 7B Beta does NOT support function calling. It doesn't understand how to do it and has not been trained for it.

There are alternate models that have fine-tuned Zephyr for function calling, but those have licensing problems: they were trained using OpenAI output, which violates OpenAI's terms of use.

- raise error when adding functions and not supported
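The "raise error" note above could look something like this hypothetical guard (module, field, and message names are illustrative, not the PR's actual code):

```elixir
# Hypothetical sketch: refuse to attach functions to a chat model that has
# no function-calling support. All names here are illustrative only.
defmodule MyApp.FunctionGuard do
  def ensure_supported!(%{supports_functions: false}, [_ | _] = _functions) do
    raise ArgumentError, "This chat model does not support functions"
  end

  def ensure_supported!(_model, _functions), do: :ok
end
```

Failing loudly at setup time is kinder than letting the model silently ignore function definitions it was never trained on.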
* main: (22 commits)
  prep for v0.1.6
  Fix Req retry delay
  updated for v0.1.5 release
  updated for v0.1.5 release
  update to 0.1.5
  upgrade Req to v0.4.8 - contains a retry fix
  fix: remove unecessary api_key from json payload
  updated changelog
  prep for new v.0.1.4 release
  document overriding the api endpoint
  allow overriding OpenAI compatible API endpoint
  Update Req to 0.4.7
  Update Req to version 0.46
  expanded comment
  Pass api_key to request if present in chat
  Allow passing api_key to ChatOpenAI
  added Utils.ChainResult module - helper functions for working with an LLMChain's result value
  preparation for v0.1.3 release
  Lessen retry delay to 300ms
  Add retry strategy to OpenAI Chat API requests
  ...
@acalejos commented Jan 14, 2024

> Zephyr 7B Beta does NOT support function calling. It doesn't understand how to do it and has not been trained for it.
>
> There are alternate models that have fine-tuned Zephyr for function calling, but those have licensing problems. They trained the model using OpenAI, which is a violation of the terms of use.

@brainlid

Have you thought about any ways to support functions by rolling a custom dispatcher? My thought is to use something like Instructor to coerce the LLM into classifying the task that's being asked into a set of functions. You add the task description as one of the possible outputs, then map each task to its respective function. Hugging Face has a diagram that sort of shows what I'm referring to here. You could also coerce the parameters using Instructor as well.

I haven't tried this yet, but just wanted to throw the idea out there.
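A minimal sketch of that dispatch idea, assuming a caller-supplied `classify_fn` that sends a prompt to the LLM and returns its text reply (all module and function names here are hypothetical):

```elixir
# Hypothetical sketch of the routing idea: ask the LLM only to pick a task
# name from a fixed list, then dispatch to a plain Elixir function.
defmodule MyApp.Router do
  @routes %{
    "weather" => &MyApp.Weather.lookup/1,
    "search" => &MyApp.Search.run/1
  }

  def dispatch(prompt, classify_fn) do
    choices = @routes |> Map.keys() |> Enum.join(", ")

    # The model never emits JSON or arguments; it only names a category,
    # which is far easier to constrain than full function calling.
    category =
      classify_fn.("Reply with exactly one of: #{choices}.\n\nRequest: #{prompt}")
      |> String.trim()

    case Map.fetch(@routes, category) do
      {:ok, fun} -> fun.(prompt)
      :error -> {:error, :unrecognized_task}
    end
  end
end
```

This mirrors the RoutingChain/PromptRoute direction that shows up in the merge log later in this thread: classify first, then hand the prompt to the matching handler.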

@brainlid (Owner, Author) commented:

@acalejos Yes! As you probably know by now, I interviewed Thomas Millar about InstructorEx in the episode that came out today.

The challenge is that Instructor doesn't work with Bumblebee yet; it relies on llama.cpp's ability to restrict the output grammar, forcing it into a compliant JSON structure.

I'm very interested in the work going on there and this direction. It's very cool.

* main: (27 commits)
  fixed documentation warning
  updated changelog
  prep for v0.1.7 release
  retry connection when underlying mint connection closed - does a limited retry count of 3
  be more permissive with ecto dep
  updated deps
  updated ex_doc
  fix: rebase and integrate merge conflicts
  feat: add unit tests, fix errors
  feat: streaming support for Google AI
  feat: Google AI support without streaming.
  updated to use req streaming api - detects Mint :closed error and does a retry which worked in local tests
  added test for expected response from streamed response body
  Update ecto 3.10.3 -> 3.11.1
  Cleanup non-api test warning output
  link UI display text for a function to the function itself
  cleanup
  add new RoutingChain with PromptRoute - important for more complex assistants - first pass operation classifies which direction the user's prompt should go - return the desired chain for performing the user's request
  ChatOpenAI update for fake API responses - support returning fake error responses
  add TextToTitleChain - simple helper chain for summarizing a user's prompt into a title
  ...
* main:
  handle receiving JSON data broken up over multiple messages
  updated changelog
  update for v0.1.8 release
  code formatting
  Add mistral chat
  updated changelog
  updated changelog
  doc updates
  breaking change for routing_chain - RoutingChain now takes a default_route instead of default_chain - takes the default route's name into account in the generated LLM prompt - returns the selected route instead of route.chain
  Add max_tokens option for OpenAI calls.
  Add clause to match call_response spec
  Update lib/chat_models/chat_ollama_ai.ex
  Add support for Ollama open source models
- includes tests
- some docs included for serving settings
@brainlid brainlid merged commit ba1efba into main Feb 29, 2024
1 check passed
@brainlid brainlid deleted the me-zephyr-chat-model branch February 29, 2024 22:10
Development

Successfully merging this pull request may close these issues.

Support Mistral via Bumblebee