
chatgpt_fn returned json parsing error #80

Closed
hl395 opened this issue Jul 12, 2023 · 2 comments


hl395 commented Jul 12, 2023

It seems that OpenAI changed the results returned for ChatGPT queries when the server is overloaded (or the query limit is exceeded)? I am now getting a JSON parsing error after completing partial annotations with chatgpt_fn.

Error traceback attached below.

```
INFO:root:Creating the annotator from chatgpt_fn.
INFO:root:Saving annotations to /home/liu/.conda/envs/hao_alpaca_eval_py310/lib/python3.10/site-packages/alpaca_eval/evaluators_configs/chatgpt_fn/annotations_seed0_configs.json.
INFO:root:Loading all annotations from /home/liu/.conda/envs/hao_alpaca_eval_py310/lib/python3.10/site-packages/alpaca_eval/evaluators_configs/chatgpt_fn/annotations_seed0_configs.json.
WARNING:root:The length of outputs before and after merge are not the same. We have len(outputs_1)==
805, len(outputs_2)==657, and len(df_annotated)==657.
This means that there are missing examples or duplicates. We are taking a SQL inner join.

INFO:root:Annotating 640 examples with chatgpt_fn
INFO:root:Using openai_completions on 640 prompts using gpt-3.5-turbo-16k-0613.
INFO:root:Kwargs to completion: {'max_tokens': 50, 'temperature': 0, 'function_call': {'name': 'print_best_model'}, 'functions': [{'name': 'print_best_model', 'description': 'Print the best model given the preferred output.', 'parameters': {'type': 'object', 'properties': {'best_output': {'type': 'string', 'description': "Name of the best output, should be 'Output (a)' or 'Output (b)'"}}}, 'required': ['best_output']}]}
INFO:root:Kwargs to completion: {'n': 1, 'model': 'gpt-3.5-turbo-16k-0613', 'is_chat': True, 'max_tokens': 50, 'temperature': 0, 'function_call': {'name': 'print_best_model'}, 'functions': [{'name': 'print_best_model', 'description': 'Print the best model given the preferred output.', 'parameters': {'type': 'object', 'properties': {'best_output': {'type': 'string', 'description': "Name of the best output, should be 'Output (a)' or 'Output (b)'"}}}, 'required': ['best_output']}]}
prompt_batches: 15%|████████████████████▌ | 99/640 [00:12<01:08, 7.90it/s]
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/home/liu/.conda/envs/hao_alpaca_eval_py310/lib/python3.10/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/home/liu/.conda/envs/hao_alpaca_eval_py310/lib/python3.10/site-packages/alpaca_eval/decoders/openai.py", line 205, in _openai_completion_helper
all_args = json.loads(choice.message.function_call.arguments)
File "/home/liu/.conda/envs/hao_alpaca_eval_py310/lib/python3.10/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
File "/home/liu/.conda/envs/hao_alpaca_eval_py310/lib/python3.10/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/home/liu/.conda/envs/hao_alpaca_eval_py310/lib/python3.10/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/home/liu/.conda/envs/hao_alpaca_eval_py310/bin/alpaca_eval", line 8, in
sys.exit(main())
File "/home/liu/.conda/envs/hao_alpaca_eval_py310/lib/python3.10/site-packages/alpaca_eval/main.py", line 483, in main
fire.Fire(evaluate)
File "/home/liu/.conda/envs/hao_alpaca_eval_py310/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/home/liu/.conda/envs/hao_alpaca_eval_py310/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/home/liu/.conda/envs/hao_alpaca_eval_py310/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/home/liu/.conda/envs/hao_alpaca_eval_py310/lib/python3.10/site-packages/alpaca_eval/main.py", line 126, in evaluate
annotations = annotator.annotate_head2head(
File "/home/liu/.conda/envs/hao_alpaca_eval_py310/lib/python3.10/site-packages/alpaca_eval/annotators/pairwise_evaluator.py", line 316, in annotate_head2head
out = self.annotate_pairs(df_to_annotate, **decoding_kwargs)
File "/home/liu/.conda/envs/hao_alpaca_eval_py310/lib/python3.10/site-packages/alpaca_eval/annotators/pairwise_evaluator.py", line 346, in annotate_pairs
df_annotated = self._annotate(df_to_annotate, **decoding_kwargs)
File "/home/liu/.conda/envs/hao_alpaca_eval_py310/lib/python3.10/site-packages/alpaca_eval/annotators/pairwise_evaluator.py", line 437, in _annotate
curr_annotated = self.annotators[annotator](df_annotated.loc[curr_idcs, self.all_keys], **decoding_kwargs)
File "/home/liu/.conda/envs/hao_alpaca_eval_py310/lib/python3.10/site-packages/alpaca_eval/annotators/pairwise_evaluator.py", line 676, in call
completions = self.fn_completions(prompts=prompts, **self.completions_kwargs, **decoding_kwargs)
File "/home/liu/.conda/envs/hao_alpaca_eval_py310/lib/python3.10/site-packages/alpaca_eval/decoders/openai.py", line 140, in openai_completions
completions = list(
File "/home/liu/.conda/envs/hao_alpaca_eval_py310/lib/python3.10/site-packages/tqdm/std.py", line 1178, in iter
for obj in iterable:
File "/home/liu/.conda/envs/hao_alpaca_eval_py310/lib/python3.10/multiprocessing/pool.py", line 873, in next
raise value
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
```
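
The traceback shows the crash happens when `alpaca_eval/decoders/openai.py` calls `json.loads` on `choice.message.function_call.arguments` and the model has not returned valid JSON. A minimal sketch of a defensive wrapper (a hypothetical helper for illustration, not the library's actual code) that would log the offending payload instead of killing the worker pool:

```python
import json
import logging


def parse_function_call_arguments(choice):
    """Hypothetical guard around the json.loads call shown in the traceback.

    Returns the parsed arguments as a dict, or None when the model returned
    something that is not valid JSON (e.g. an empty or truncated string).
    """
    raw = choice.message.function_call.arguments
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        logging.warning("Could not parse function_call arguments: %r", raw[:200])
        return None
```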


YannDubs (Collaborator) commented Jul 13, 2023

strange, it works for me...

Can you try `alpaca_eval --model_outputs 'example/outputs.json' --max_instances 10 --annotators_config chatgpt_fn`?
If that doesn't work, try setting `num_procs` in the config like this:

```yaml
chatgpt_fn:
  prompt_template: "chatgpt_fn/basic_function_prompt.txt"
  fn_completions: "openai_completions"
  completions_kwargs:
    model_name: "gpt-3.5-turbo-16k-0613"
    max_tokens: 50
    num_procs: 1
```

This will make the error more readable. You can then paste the error here (in a code block, please)!


hl395 (Author) commented Jul 13, 2023

Thanks @YannDubs. Setting `max_instances` worked and helped me locate the failing instance. It turns out the problem is with the output of my model, which exploded while drawing ASCII art and appended a huge run of slashes to the end of the answer (example below). Because of that, ChatGPT did not return proper JSON. After fixing the output, the evaluation works fine. Please feel free to close this issue.

```json
{
        "instruction": "Make ASCII art of a cat",
        "output": "|      /      \\\n        |     /       \\\n        | / \\       / \\\n      | / \\ \\   / \\ / \\ /\n       |/ \\ \\ / / \\ / / \n        / \\ \\ / / / / / / \n      / \\ \\ / / / / / / / \n       \\ \\ \\ / / / / / / / \n         \\ \\ \\ / / / / / / / \n          \\ \\ \\ / / / / / / / / \n           \\ \\ \\ / / / / / / / / / \n             \\ \\ \\ / / / / / / / / / / / \n               \\ \\ \\ / / / / / / / / / / / / / / \n                 \\ \\ \\ / / / / / / / / / / / / / / / / / \n                   \\ \\ \\ / / / / / / / / / / / / / / / / / / / / / / / \n                     \\ \\ \\ / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / \n                       \\ \\ \\ / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /",
        "generator": "Model_test",
        "dataset": "oasst"
    },
```
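
For anyone hitting the same failure mode, a minimal sketch of a pre-processing pass that truncates runaway outputs like the one above before evaluation (the file name and length cap below are arbitrary assumptions, not part of alpaca_eval):

```python
import json

MAX_OUTPUT_CHARS = 4000  # arbitrary cap, not an alpaca_eval setting

# Path is an example; point it at your own model outputs file.
with open("model_outputs.json") as f:
    examples = json.load(f)

# Truncate degenerate outputs (such as the runaway ASCII art above) so the
# annotator prompt stays well-formed and within a reasonable length.
for example in examples:
    if len(example["output"]) > MAX_OUTPUT_CHARS:
        example["output"] = example["output"][:MAX_OUTPUT_CHARS]

with open("model_outputs_truncated.json", "w") as f:
    json.dump(examples, f, indent=2)
```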

hl395 closed this as completed Jul 13, 2023