Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make main.py compatible with OpenAI compatible APIs #189

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

hmellor
Copy link

@hmellor hmellor commented Jan 23, 2024

Solves #161 and #148 and is an alternative to #179.

Employs the DRY principle by only changing the creation of the Evaluator class in main.py and generation.parallel_generations function. Therefore, won't need to maintain multiple Evaluator classes in parallel.

Using the completions instead of chat.completions was a design choice because it eliminates errors/confusion from additional chat templating taking place behind the API.

If you want to evaluate a model running behind an OpenAI compatible API, then you can use base_url to send any generation requests to that URL.

  • If you are self-hosting an OpenAI compatible API:
    • Set base_url to the url you are hosting with (i.e. http://localhost:8000/v1).
    • Set model to the served name of your model.
  • If you are using OpenAI's API:
    • Set the environment variable OPENAI_API_KEY.
    • Set base_url to https://api.openai.com/v1.
    • Set model to the name of the OpenAI model you want to use (e.g. gpt-3.5-turbo-1106).

@hmellor
Copy link
Author

hmellor commented Jan 23, 2024

@loubnabnl, if you have time I'd appreciate a review, thanks!

@tshrjn
Copy link

tshrjn commented Feb 1, 2024

Seems like there is an issue with chat format:

    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'error': {'message': "'messages' is a required property", 'type': 'invalid_request_error', 'param': None, 'code': None}}
  0%|                                                                                                                                                      | 0/164 [00:02<?, ?it/s]
Task exception was never retrieved
future: <Task finished name='Task-4' coro=<tqdm_asyncio.gather.<locals>.wrap_awaitable() done, defined at /opt/homebrew/anaconda3/envs/7diamond/lib/python3.10/site-packages/tqdm/asyncio.py:75> exception=BadRequestError('Error code: 400 - {\'error\': {\'message\': "\'messages\' is a required property", \'type\': \'invalid_request_error\', \'param\': None, \'code\': None}}')>
Traceback (most recent call last):
  File "/opt/homebrew/anaconda3/envs/env_name/lib/python3.10/site-packages/tqdm/asyncio.py", line 76, in wrap_awaitable
    return i, await f
  File "/opt/homebrew/anaconda3/envs/env_name/lib/python3.10/site-packages/openai/resources/completions.py", line 1020, in create
    return await self._post(
  File "/opt/homebrew/anaconda3/envs/env_name/lib/python3.10/site-packages/openai/_base_client.py", line 1705, in post
    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
  File "/opt/homebrew/anaconda3/envs/env_name/lib/python3.10/site-packages/openai/_base_client.py", line 1408, in request
    return await self._request(
  File "/opt/homebrew/anaconda3/envs/7diamond/lib/python3.10/site-packages/openai/_base_client.py", line 1499, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'error': {'message': "'messages' is a required property", 'type': 'invalid_request_error', 'param': None, 'code': None}}

@hmellor
Copy link
Author

hmellor commented Feb 2, 2024

@tshrjn you're going to need to provide more context, the word chat doesn't feature in my PR at all.

In the PR description I explicitly state that I am not using the chat endpoint, so I don't know what you did to get a chat error.

@nielstron
Copy link

I tested this branch and it worked perfectly fine. Only caveat, it really only works with completion models (i.e. babbage, davinci at OpenAI) and not with chat models! But this is expected due to the format of the benchmark.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants