llama3 does not return pure json #521

Closed
barsuna opened this issue May 21, 2024 · 5 comments

barsuna commented May 21, 2024

Testing gpt-researcher with llama3, I found that about 3 times out of 4, llama3 responds to generate_search_queries_prompt with JSON plus some extra verbiage.

I'm not sure it is worth changing the prompt for the sake of llama3 alone, but for documentation purposes, here is the updated prompt that seems to work every time.

Before:

f'You must respond with a list of strings in the following format: ["query 1", "query 2", "query 3"].'

After:

f'Your response must include list of the query strings in json format and nothing else. For example: ["query 1", "query 2", "query 3"]'
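
In case it helps others, here is a minimal sketch of a defensive parser for this situation; the helper name and the regex fallback are my own illustration, not part of gpt-researcher:

```python
import json
import re

def extract_query_list(response: str) -> list[str]:
    """Best-effort extraction of a JSON list of strings from an LLM reply
    that may wrap the JSON in extra verbiage (hypothetical helper, not
    part of gpt-researcher)."""
    # Happy path: the whole reply is already valid JSON.
    try:
        parsed = json.loads(response)
        if isinstance(parsed, list):
            return [str(q) for q in parsed]
    except json.JSONDecodeError:
        pass
    # Fallback: take the first [...] span and parse just that.
    match = re.search(r"\[.*?\]", response, re.DOTALL)
    if match:
        return [str(q) for q in json.loads(match.group(0))]
    raise ValueError("no JSON list found in response")

# Example with verbiage around the JSON, as llama3 often produces:
print(extract_query_list('Sure! Here you go: ["query 1", "query 2", "query 3"]. Enjoy!'))
```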

Dilip-17 commented May 22, 2024

Hey @barsuna. I was searching for how to use llama with gpt-researcher and stumbled upon this post. If possible, could you tell me how to get gpt-researcher to work with llama3?


barsuna commented May 23, 2024

@Dilip-17 there was the same question on another issue; I added some pointers there:

#520

The challenge is mostly not how to run it, but having the GPU memory necessary to run llama3: even the borderline-usable (IMO; opinions are divided on this) 4-bit quantized 70B model takes about 43GB. I'd recommend Q6, which is close to 60GB.
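
For context, a quick back-of-the-envelope for the weight memory alone; the bits-per-weight figures are approximate averages I'm assuming for the GGUF quant types, and real usage adds KV cache and runtime overhead on top:

```python
# Rough weight-memory estimate for a 70B-parameter model.
# Effective bits/weight are assumed approximate averages for GGUF quants.
PARAMS = 70e9

for quant, bits in [("Q4_K_M", 4.8), ("Q6_K", 6.6)]:
    gb = PARAMS * bits / 8 / 1e9  # decimal gigabytes
    print(f"{quant}: ~{gb:.0f} GB of weights")
# Q4_K_M: ~42 GB, Q6_K: ~58 GB -- in line with the ~43GB / ~60GB figures above.
```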

assafelovic (Owner) commented:

Hey, working with different LLMs (other than the default OpenAI) requires extra manual tweaking. Would love to learn from your experience if you find ways to make the code more generic!


barsuna commented May 28, 2024

To its credit, llama3 worked pretty much out of the box with gpt-researcher (the only tweak needed was the prompt change above). It also seems possible to stretch the context window to 16k without fine-tuning, though I've done very limited testing of that.

So far, progress with llama3 has been difficult for anything requiring function calling and in-prompt memory, i.e. autonomous agents; with single-prompt or one-by-one prompting agents, things seem better.

Of course, the main challenge remains the quality of the reports. I'm currently comparing llama3 vs. gpt4; both seem somewhat challenged, and my belief is that the likely way forward is to rebalance automation and augmentation, i.e. let users do more if they wish.

I haven't measured the quality of the embeddings and its impact on report quality much either.

assafelovic (Owner) commented:

Great, thank you for the feedback @barsuna! Closing for now, but feel free to open new threads if needed.
