Add initial support for Zephyr 7b Beta #41

Merged
merged 23 commits into main from me-zephyr-chat-model on Feb 29, 2024
Conversation

@brainlid (Owner) commented Nov 30, 2023

This is for running the model directly on hardware using Nx and Bumblebee.

The Zephyr 7B Beta LLM doesn't have all the capabilities of ChatGPT, nor the safeguards.

What works:

What doesn't work:

  • cancelling - we can kill the process handling the stream, but the GPU may keep generating until it finishes or reaches the token limit
  • no function support

Closes #26

- add nx dep
- make Message struct editable through changeset - helps with UI
- initial streaming support works
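Running the model locally follows Bumblebee's standard text-generation flow. A minimal sketch, assuming the `HuggingFaceH4/zephyr-7b-beta` Hugging Face repo and the EXLA compiler (both assumptions, not taken from this PR):

```elixir
# Minimal sketch: load Zephyr 7B Beta with Bumblebee and build a streaming
# text-generation serving. Repo name and serving options are assumptions.
{:ok, model_info} = Bumblebee.load_model({:hf, "HuggingFaceH4/zephyr-7b-beta"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "HuggingFaceH4/zephyr-7b-beta"})
{:ok, generation_config} = Bumblebee.load_generation_config({:hf, "HuggingFaceH4/zephyr-7b-beta"})

serving =
  Bumblebee.Text.generation(model_info, tokenizer, generation_config,
    stream: true,
    defn_options: [compiler: EXLA]
  )

# With stream: true the result is enumerable; each chunk arrives as the
# model generates tokens, which is what the streaming support hooks into.
serving
|> Nx.Serving.run("What is Elixir?")
|> Enum.each(&IO.write/1)
```

Note the cancellation caveat above: killing the Elixir process consuming this stream does not necessarily stop the GPU-side generation.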
@brainlid (Owner, Author) commented Dec 6, 2023

Zephyr 7B Beta does NOT support function calling. It doesn't understand how to do it and has not been trained for it.

There are alternate models that have fine-tuned Zephyr for function calling, but those have licensing problems: they were trained using OpenAI output, which violates OpenAI's terms of use.

- raise error when adding functions and not supported
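The "raise error" note above could look something like this hypothetical guard (module, field, and message names are illustrative, not the PR's actual code):

```elixir
# Hypothetical sketch: refuse to attach functions to a chat model that has
# no function-calling support. All names here are illustrative only.
defmodule MyApp.FunctionGuard do
  def ensure_supported!(%{supports_functions: false}, [_ | _] = _functions) do
    raise ArgumentError, "This chat model does not support functions"
  end

  def ensure_supported!(_model, _functions), do: :ok
end
```

Failing loudly at setup time is kinder than letting the model silently ignore function definitions it was never trained on.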
* main: (22 commits)
  prep for v0.1.6
  Fix Req retry delay
  updated for v0.1.5 release
  updated for v0.1.5 release
  update to 0.1.5
  upgrade Req to v0.4.8 - contains a retry fix
  fix: remove unecessary api_key from json payload
  updated changelog
  prep for new v.0.1.4 release
  document overriding the api endpoint
  allow overriding OpenAI compatible API endpoint
  Update Req to 0.4.7
  Update Req to version 0.46
  expanded comment
  Pass api_key to request if present in chat
  Allow passing api_key to ChatOpenAI
  added Utils.ChainResult module - helper functions for working with an LLMChain's result value
  preparation for v0.1.3 release
  Lessen retry delay to 300ms
  Add retry strategy to OpenAI Chat API requests
  ...
@acalejos commented Jan 14, 2024

> Zephyr 7B Beta does NOT support function calling. It doesn't understand how to do it and has not been trained for it.
>
> There are alternate models that have fine-tuned Zephyr for function calling, but those have licensing problems. They trained the model using OpenAI, which is a violation of the terms of use.

@brainlid

Have you thought about any ways to support functions by rolling a custom dispatcher? My thought is to use something like Instructor to coerce the LLM into classifying the task that's being asked into a set of functions. You add the task description as one of the possible outputs, then map each task to its respective function. Hugging Face has a diagram that sort of shows what I'm referring to here. You could also coerce the parameters using Instructor as well.

I haven't tried this yet, but just wanted to throw the idea out there.
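A minimal sketch of that dispatch idea, assuming a caller-supplied `classify_fn` that sends a prompt to the LLM and returns its text reply (all module and function names here are hypothetical):

```elixir
# Hypothetical sketch of the routing idea: ask the LLM only to pick a task
# name from a fixed list, then dispatch to a plain Elixir function.
defmodule MyApp.Router do
  @routes %{
    "weather" => &MyApp.Weather.lookup/1,
    "search" => &MyApp.Search.run/1
  }

  def dispatch(prompt, classify_fn) do
    choices = @routes |> Map.keys() |> Enum.join(", ")

    # The model never emits JSON or arguments; it only names a category,
    # which is far easier to constrain than full function calling.
    category =
      classify_fn.("Reply with exactly one of: #{choices}.\n\nRequest: #{prompt}")
      |> String.trim()

    case Map.fetch(@routes, category) do
      {:ok, fun} -> fun.(prompt)
      :error -> {:error, :unrecognized_task}
    end
  end
end
```

This mirrors the RoutingChain/PromptRoute direction that shows up in the merge log later in this thread: classify first, then hand the prompt to the matching handler.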

@brainlid (Owner, Author) commented:

@acalejos Yes! As you probably know by now, I interviewed Thomas Millar about InstructorEx in the episode that came out today.

The challenge is that Instructor doesn't work with Bumblebee yet; it relies on llama.cpp's ability to restrict the output grammar, forcing it into a compliant JSON structure.

I'm very interested in the work going on there and this direction. It's very cool.

* main: (27 commits)
  fixed documentation warning
  updated changelog
  prep for v0.1.7 release
  retry connection when underlying mint connection closed - does a limited retry count of 3
  be more permissive with ecto dep
  updated deps
  updated ex_doc
  fix: rebase and integrate merge conflicts
  feat: add unit tests, fix errors
  feat: streaming support for Google AI
  feat: Google AI support without streaming.
  updated to use req streaming api - detects Mint :closed error and does a retry which worked in local tests
  added test for expected response from streamed response body
  Update ecto 3.10.3 -> 3.11.1
  Cleanup non-api test warning output
  link UI display text for a function to the function itself
  cleanup
  add new RoutingChain with PromptRoute - important for more complex assistants - first pass operation classifies which direction the user's prompt should go - return the desired chain for performing the user's request
  ChatOpenAI update for fake API responses - support returning fake error responses
  add TextToTitleChain - simple helper chain for summarizing a user's prompt into a title
  ...
* main:
  handle receiving JSON data broken up over multiple messages
  updated changelog
  update for v0.1.8 release
  code formatting
  Add mistral chat
  updated changelog
  updated changelog
  doc updates
  breaking change for routing_chain - RoutingChain now takes a default_route instead of default_chain - takes the default route's name into account in the generated LLM prompt - returns the selected route instead of route.chain
  Add max_tokens option for OpenAI calls.
  Add clause to match call_response spec
  Update lib/chat_models/chat_ollama_ai.ex
  Add support for Ollama open source models
- includes tests
- some docs included for serving settings
@brainlid brainlid merged commit ba1efba into main Feb 29, 2024
1 check passed
@brainlid brainlid deleted the me-zephyr-chat-model branch February 29, 2024 22:10
Development

Successfully merging this pull request may close these issues.

Support Mistral via Bumblebee