
#1155: Add support for OpenAI-compatible endpoint in LLM and Embed #1197

Merged 3 commits into embedchain:main on Mar 5, 2024

Conversation

@bioshazard (Contributor) commented Jan 19, 2024

Description

Adds support using the OpenAI-specific kwargs: base_url for the LLM and api_base for the embedder. Tested against LocalAI.io for both LLM and Embed. The original issue asked for endpoint support, but base_url seemed the better fit since it matches the OpenAI kwargs.

Fixes #1155
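
For context, here is a minimal sketch of what these kwargs accomplish, assuming the configured values end up on an OpenAI-compatible client. This is illustrative rather than the PR's actual wiring, and it reuses the LocalAI endpoint and dummy key from the test config below:

# Sketch only: "base_url" (LLM) / "api_base" (embedder) point the OpenAI
# SDK at any OpenAI-compatible server such as LocalAI, instead of api.openai.com.
from openai import OpenAI

client = OpenAI(
    api_key="sk-xxxx",                    # dummy key; LocalAI does not validate it
    base_url="http://localhost:8180/v1",  # OpenAI-compatible endpoint
)
reply = client.chat.completions.create(
    model="gpt-4",  # LocalAI model alias, not the real GPT-4
    messages=[{"role": "user", "content": "How are you doing?"}],
)
print(reply.choices[0].message.content)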

Type of change

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

  • Tested locally against LocalAI.io with the following config:
import os

from embedchain import App

os.environ["OPENAI_API_KEY"] = "sk-xxxx"  # dummy key; LocalAI does not validate it

app = App.from_config(config={
  "app": {
    "config": {
      "id": "test"
    }
  },
  "llm": {
    "provider": "openai",
    "config": {
      "model": "gpt-4",
      "temperature": 0.1,
      "max_tokens": 1000,
      "top_p": 1,
      "stream": False,
      "base_url": "http://localhost:8180/v1"
    },
  },
  "embedder": {
    "config": {
      "api_base": "http://localhost:8180/v1"
    }
  }
})

app.add("https://www.forbes.com/profile/elon-musk")
app.add("https://en.wikipedia.org/wiki/Elon_Musk")
out = app.query("What is the net worth of Elon Musk today?")
# Answer: The net worth of Elon Musk today is $258.7 billion.
print(out)

It appears perfectly functional from the LocalAI logs. The RAG query itself returned "I dont know Elon's net worth", but all of the RAG search results and LLM templating looked right in the debug output. I personally plan to use EmbedChain with this change for my RAG ingest and search, but with my own template for synthesizing search results: running the default prompt against a 7B model looks like it will need more tinkering than expected to behave well (maybe there is an obvious way to change the prompt that I am missing?). I did not run any unit tests.

EDIT: Never mind, I found the prompt override: https://github.com/embedchain/embedchain/blob/9afc6878c82ee71332fa09aebecea93dd7829e7f/configs/full-stack.yaml#L18C5-L28C102
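
For anyone else looking, here is a hedged example of that override in dict-config form. The $context and $query placeholders follow the linked full-stack.yaml; the prompt wording itself is only illustrative:

# Illustrative prompt override via the llm config; embedchain substitutes
# $context (retrieved chunks) and $query (the user question) at query time.
config = {
  "llm": {
    "provider": "openai",
    "config": {
      "prompt": (
        "Use the following pieces of context to answer the query at the end.\n"
        "$context\n\n"
        "Query: $query\n\n"
        "Helpful Answer:"
      )
    }
  }
}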

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings

Maintainer Checklist

  • closes XXX (Replace xxxx with the GitHub issue number)
  • Made sure Checks passed

@dosubot added the size:S label (This PR changes 10-29 lines, ignoring generated files.) on Jan 19, 2024
@bioshazard (Contributor, Author) commented:

LocalAI configs, in case you want to try it yourself:

gpt-4.yaml

name: gpt-4
mmap: true
parameters:
  # model: huggingface://TheBloke/Mistral-7B-OpenOrca-GGUF/mistral-7b-openorca.Q6_K.gguf
  model: openhermes-2.5-mistral-7b.Q6_K.gguf
  temperature: 0.2
  top_k: 40
  top_p: 0.95
template:
  chat_message: |
    <|im_start|>{{if eq .RoleName "assistant"}}assistant{{else if eq .RoleName "system"}}system{{else if eq .RoleName "user"}}user{{end}}
    {{if .Content}}{{.Content}}{{end}}
    <|im_end|>
    
  chat: |
    {{.Input}}
    <|im_start|>assistant
    
  completion: |
    {{.Input}}
context_size: 4096
gpu_layers: 55
f16: true
stopwords:
- <|im_end|>

usage: |
  curl http://localhost:8180/v1/chat/completions -H "Content-Type: application/json" -d '{
      "model": "gpt-4",
      "temperature": 0.1,
      "messages": [{"role": "user", "content": "How are you doing?"}]
  }'

text-embedding-ada-002.yaml

name: text-embedding-ada-002
backend: sentencetransformers
embeddings: true
parameters:
  model: all-MiniLM-L6-v2
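
And a quick smoke test sketch for this model, assuming LocalAI exposes the standard OpenAI-compatible /v1/embeddings route (endpoint and dummy key are taken from the PR's test config):

# Smoke test sketch for the embedding model defined above.
from openai import OpenAI

client = OpenAI(api_key="sk-xxxx", base_url="http://localhost:8180/v1")
resp = client.embeddings.create(
    model="text-embedding-ada-002",  # LocalAI alias defined above
    input="What is the net worth of Elon Musk?",
)
print(len(resp.data[0].embedding))  # all-MiniLM-L6-v2 produces 384-dim vectors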

@thhung commented Mar 3, 2024

Can we merge this PR? It seems to only add more config and change the OpenAI chat and embedding classes to support local endpoints.

@deshraj (Collaborator) left a comment:


Thanks for adding support for this.

@dosubot added the lgtm label (This PR has been approved by a maintainer.) on Mar 4, 2024
Review thread on embedchain/llm/openai.py (outdated, resolved)
@bioshazard (Contributor, Author) commented:

The failing test is an easy fix: the base URL should not be required via config or an environment variable; the default should be None. I might get to this this week. It's a one-line change, in my opinion.
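
Roughly like this hypothetical sketch (the helper name and the OPENAI_API_BASE fallback are assumptions, not the actual patch that later landed):

import os
from typing import Optional

def resolve_base_url(config) -> Optional[str]:
    # Hypothetical helper mirroring the described fix: the base URL is
    # optional and defaults to None, so the SDK falls back to api.openai.com.
    return getattr(config, "base_url", None) or os.environ.get("OPENAI_API_BASE")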

@deshraj (Collaborator) commented Mar 5, 2024

Yes, merging this PR for now. I will fix the tests in a follow-up PR.

@deshraj merged commit 11f4ce8 into embedchain:main on Mar 5, 2024 (0 of 3 checks passed)
Labels: lgtm (This PR has been approved by a maintainer.), size:S (This PR changes 10-29 lines, ignoring generated files.)
Successfully merging this pull request may close these issues.

Support custom endpoint for OpenAI (#1155)