
Implement a JSON responding OpenAI LLM as JSONOpenAILLM #331

Merged

Conversation

burtenshaw
Contributor

This PR implements a new class, `JSONOpenAILLM`, in `distilabel.llm.openai`.

The class uses the JSON response format feature from OpenAI and validates that the selected model is present in a hard-coded list of models that support this feature.
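For context, OpenAI's JSON mode is requested by passing `response_format={"type": "json_object"}` to the chat completions endpoint. A minimal sketch of the request construction plus the hard-coded model check (`build_chat_kwargs` and `JSON_SUPPORTING_MODELS` are illustrative names, not the PR's actual API):

```python
# Models the PR lists as supporting OpenAI's JSON mode at the time of writing.
JSON_SUPPORTING_MODELS = [
    "gpt-4-0125-preview",
    "gpt-4-turbo-preview",
    "gpt-4-1106-preview",
    "gpt-3.5-turbo-1106",
]

def build_chat_kwargs(model: str, prompt: str) -> dict:
    """Build kwargs for `client.chat.completions.create` with JSON mode enabled."""
    if model not in JSON_SUPPORTING_MODELS:
        raise ValueError(
            f"`{model}` does not support JSON output; "
            f"available models are {JSON_SUPPORTING_MODELS}"
        )
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        # JSON mode: the API then guarantees the message content is valid JSON.
        "response_format": {"type": "json_object"},
    }
```

The kwargs would then be passed to an `openai.OpenAI()` client's `chat.completions.create` call.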

@burtenshaw burtenshaw self-assigned this Feb 6, 2024
Contributor

@plaguss plaguss left a comment


Thanks @burtenshaw! I left some comments. Also, we use pytest for the tests instead of unittest (apart from the mock module), so please update those. Once the tests pass, this looks good to me.

Comment on lines 270 to 277
self.json_supporting_models = [
    "gpt-4-0125-preview",
    "gpt-4-turbo-preview",
    "gpt-4-1106-preview",
    "gpt-3.5-turbo-1106",
]
assert model in self.json_supporting_models, (
    f"Provided `model` does not support JSON output; "
    f"available models are {self.json_supporting_models}"
)
Member


Maybe it's just better to override the available_models property instead?

Member


i.e., pull the models from OpenAI and filter the default models so that only those are used, and keep the fine-tunes, as we're unsure about those

Member


Meaning the same property as available_models within OpenAILLM but with a slight filtering on the proprietary models offered by OpenAI

Contributor Author

@burtenshaw burtenshaw Feb 6, 2024


@alvarobartt Thanks for your feedback.

Like this?

    @cached_property
    def available_models(self) -> List[str]:
        """Returns the list of available models in your OpenAI account."""
        all_available_models = super().available_models
        json_supporting_models = [
            "gpt-4-0125-preview",
            "gpt-4-turbo-preview",
            "gpt-4-1106-preview",
            "gpt-3.5-turbo-1106",
        ]
        return list(set(all_available_models) & set(json_supporting_models))

It makes sense to combine the functionality. I'm just worried about making the logging clear when the model name is wrong.

In the situation where the user gives an incorrect model name, we should make it clear to the user whether it's incorrect because it doesn't support JSON or it isn't available on their account.
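One way to keep the two failure modes distinct would be to check them separately and raise different messages. A minimal sketch under that assumption (`validate_model` is an illustrative name, not the merged implementation):

```python
# Illustrative hard-coded set of JSON-mode-capable models from the PR.
JSON_SUPPORTING_MODELS = {
    "gpt-4-0125-preview",
    "gpt-4-turbo-preview",
    "gpt-4-1106-preview",
    "gpt-3.5-turbo-1106",
}

def validate_model(model: str, account_models: list) -> None:
    """Raise a distinct ValueError depending on why `model` is unusable."""
    if model not in account_models:
        # First failure mode: the model is not available on this account.
        raise ValueError(f"`{model}` is not available in your OpenAI account.")
    if model not in JSON_SUPPORTING_MODELS:
        # Second failure mode: available, but does not support JSON output.
        raise ValueError(
            f"`{model}` does not support JSON output; "
            f"JSON-supporting models are {sorted(JSON_SUPPORTING_MODELS)}"
        )
```

Checking availability first matches the ordering described above (account first, then JSON compatibility), so each error message points at the actual cause.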

Contributor Author

@burtenshaw burtenshaw Feb 6, 2024


I initially went for the separate approach because availability on the account is asserted first, and then JSON compatibility is asserted.

🤔 I can of course just log from the available_models method if there are no JSON-compatible models among the account's models.

Member


Hmm, fair, I get your point now :/ You can leave it as is for the moment; if any issue arises we can revisit, but the current approach is fine too!

@plaguss
Contributor

plaguss commented Feb 7, 2024

Would close #328

@burtenshaw burtenshaw merged commit d9e54e9 into argilla-io:main Feb 7, 2024
4 checks passed
jphme pushed a commit to jphme/distilabel that referenced this pull request Feb 20, 2024
* Implement JSON OpenAILLM by inheriting from OpenAILLM

* mock and test JSONOpenAILLM

* refactor test to decouple available models

* add documentation for JSONOpenAILLM

* expose JSONOpenAILLM in distilabel.llms

* sort and format

* sort and format

* Update tests/llm/test_openai.py

Co-authored-by: Agus <agustin.piqueres@gmail.com>

* switch testing from unittest to pytest

* sorting

* extra assert in test_generate

* format

* refactor tests to mock out OpenAI client

* Update src/distilabel/llm/openai.py

Co-authored-by: Alvaro Bartolome <alvarobartt@gmail.com>

* combine available models method

* use mocked openai in tests

* testing: move assert raise catch to fixture

* test: refactor fixture to not use globals

* refactor mocking for simplicity share assertion

---------

Co-authored-by: Agus <agustin.piqueres@gmail.com>
Co-authored-by: Alvaro Bartolome <alvarobartt@gmail.com>