
A brand-new DatasetGenerator using gpt-3.5-turbo and JSON #20

Merged: 91 commits into main, Apr 28, 2023

Conversation

@zhaochenyang20 zhaochenyang20 commented Apr 24, 2023

Description

After contacting the authors of Can ChatGPT Reproduce Human-Generated Labels? A Study of Social Computing Tasks, I thoroughly refactored our DatasetGenerator. We now use a prompt template and ask the LLM to return the generated examples in JSON format, which is more controllable than our previous method of returning natural-language examples and regenerating labels in a separate API call.
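The "ask for JSON, then parse it" pattern can be sketched as follows. The template text and the `"input"`/`"output"` key names are illustrative assumptions, not the PR's actual prompt:

```python
import json

# Hypothetical prompt template, shown only to illustrate the pattern;
# the real template lives in the PR's OpenAIDatasetGenerator.
PROMPT_TEMPLATE = (
    "{instruction}\n\n"
    "Here are some examples:\n{few_shot_examples}\n\n"
    'Return ONE new example as a JSON object: {{"input": ..., "output": ...}}'
)

def build_prompt(instruction: str, few_shot_examples: str) -> str:
    """Fill the template before sending it to the chat model."""
    return PROMPT_TEMPLATE.format(
        instruction=instruction, few_shot_examples=few_shot_examples
    )

def parse_example(raw_content: str) -> dict:
    """Parse the model's reply, which should be a single JSON object."""
    return json.loads(raw_content)
```

Because the label arrives inside the same JSON object as the input, no second API call is needed to regenerate it.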

The detailed changes follow:

To be Discussed

  • NA

Already Discussed

  1. I fixed a bug in the DatasetGenerator by including split.value in the code. This ensures that DatasetSplit can be serialized with the save_to_disk method.
  2. Following the previous change, I updated Vijay's run_locally.py by adding .value to the lines that define the generated_training, validation, and testing sets.
  3. I created OpenAIDatasetGenerator and InputOutputGenerator and added a unit test for InputOutputGenerator. However, I have some questions and concerns that need to be addressed:
  • Is the split argument necessary in the generate_examples function?
  • Currently, we use gpt-3.5-turbo rather than text-davinci-002 because turbo is derived from Codex and can handle JSON responses, while text-davinci can't. However, this narrows the range of models users can choose from.
  • I mocked the behavior of openai.ChatCompletion through a MockCompletion class for our unit test. 🚀🚀🚀
  • I hard-coded natural_instruction and few_shot_examples in MockPromptSpec.
  • Although the current response mining is much better than before, there are still some issues. First, the key the LLM returns in the JSON isn't always the expected one. Second, the code for extracting the expected key from the JSON response is not very Pythonic.
  • The exception handling in my code is not optimal, as I try to cover all potential errors that might occur in the try block. If any error other than json.JSONDecodeError, IndexError, TypeError, ValueError, or AttributeError occurs, the program will terminate. However, I cannot use a simple except: statement to catch all errors because it is blocked by mypy.
  • The generate_examples function is not very Pythonic in the for _ in tqdm(range(num_examples), desc="Generating examples"): loop. However, I do not have a better way to make it more Pythonic while also displaying the generation progress, which is essential.
  4. I removed the use of pandas and directly returned a Dataset object created from a dictionary.
  5. I added max_api_call to set an upper bound on the number of API calls.
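The split.value fix in item 1 can be illustrated with a small, self-contained sketch. The enum member names are assumptions, and json.dumps stands in here for the serialization save_to_disk performs internally:

```python
import json
from enum import Enum

class DatasetSplit(Enum):  # sketch; member names are assumptions
    TRAIN = "train"
    VAL = "val"
    TEST = "test"

split = DatasetSplit.TRAIN

# An Enum member itself is not JSON-serializable (json.dumps(split)
# raises TypeError), but its `.value` is a plain string, which is why
# the PR passes `split.value` through before calling `save_to_disk`.
serialized = json.dumps({"split": split.value})
print(serialized)  # {"split": "train"}
```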
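A minimal sketch of the MockCompletion idea: a stub object that mirrors the `choices[0].message.content` attribute path of a real openai.ChatCompletion response. The class body and payload below are assumptions, not the PR's exact code:

```python
from unittest import mock

class MockCompletion:
    """Stub mirroring the attribute shape of an openai.ChatCompletion
    response: response.choices[0].message.content (sketch only)."""

    def __init__(self, content: str):
        message = mock.Mock(content=content)
        self.choices = [mock.Mock(message=message)]

def extract_content(response) -> str:
    # The same attribute path production code would use on a real response.
    return response.choices[0].message.content

fake = MockCompletion('{"input": "1 + 1 = ?", "output": "2"}')
assert extract_content(fake) == '{"input": "1 + 1 = ?", "output": "2"}'
```

In a unit test, `mock.patch("openai.ChatCompletion.create", return_value=fake)` would make the generator consume this stub without touching the network.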
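The JSON-key extraction with a narrow exception tuple might look roughly like this. The expected key names and the None-on-failure behavior are assumptions for illustration:

```python
import json
from typing import Optional

EXPECTED_KEYS = ("input", "output")  # hypothetical key names

def parse_json_response(raw: str) -> Optional[dict]:
    """Return the parsed example, or None if the reply is unusable.

    Catching a named tuple of exceptions, instead of a bare `except:`,
    satisfies mypy/linters while still covering malformed JSON,
    missing keys, and non-dict replies.
    """
    try:
        parsed = json.loads(raw)
        return {key: parsed[key] for key in EXPECTED_KEYS}
    except (json.JSONDecodeError, KeyError, TypeError):
        return None
```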
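The tqdm loop and the max_api_call budget might combine roughly as follows. The stub example payload is an assumption, and tqdm degrades to a no-op wrapper if it is not installed:

```python
try:
    from tqdm import tqdm
except ImportError:  # fall back to a no-op wrapper if tqdm is absent
    def tqdm(iterable, **kwargs):
        return iterable

def generate_examples(num_examples: int, max_api_call: int) -> list:
    """Sketch of the generation loop: a progress bar plus an API budget."""
    examples = []
    api_calls = 0
    # tqdm only wraps the range for display; the loop body is unchanged,
    # which is why a more "Pythonic" rewrite is hard to find.
    for _ in tqdm(range(num_examples), desc="Generating examples"):
        if api_calls >= max_api_call:
            break  # budget exhausted; stop early
        api_calls += 1
        examples.append({"input": f"stub {api_calls}", "output": "..."})
    return examples
```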

References

Blocked by

  • Issue 26. Currently we just mock the behavior of prompt_spec.parse_from_prompt.

@viswavi (Collaborator) left a comment

Overall, things are looking pretty good! I have a couple of comments (primarily, I still think the iogenerator.py class can be incorporated into the openai.py class, and I have some suggestions about variable naming), but I think these should be easy to resolve. Hopefully we'll be good to merge after that!

Resolved review threads (outdated):
  • prompt2model/dataset_generator/iogenerator.py (3 threads)
  • prompt2model/dataset_generator/openai.py
  • tests/dataset_generator_test.py

@viswavi (Collaborator) left a comment

Looks good to me (LGTM)!

I made one typo correction, which you can accept before merging.

@viswavi requested a review from neubig April 27, 2023 02:51

@neubig (Collaborator) left a comment

Looking nice! I had a few comments.

Resolved review threads (outdated): prompt2model/dataset_generator/openai.py (7 threads)

@neubig (Collaborator) left a comment

Looking nice! I just have one final suggestion about the place where we add mocked code.

Resolved review threads (outdated): prompt2model/dataset_generator/openai.py (2 threads)

@neubig (Collaborator) left a comment

Looks great, please go ahead and merge :)

@zhaochenyang20 merged commit e636192 into main Apr 28, 2023
@zhaochenyang20 deleted the Eren_Dataset_Generator_JSON branch April 28, 2023 12:48