llama3 does not return pure json #521
Comments
Hey @barsuna. I was searching for how to use Llama with gpt-researcher and stumbled upon this post. If possible, could you tell me how you got gpt-researcher to work with Llama 3?
@Dilip-17 there was the same question on another issue; I added some pointers there. The challenge is mostly not how to run it, but having the GPU memory necessary to run Llama 3. Even the borderline-usable (IMO; opinions are divided on this) 4-bit quantized 70B model takes about ~43 GB. I'd recommend Q6, which is closer to 60 GB.
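The memory figures above follow roughly from parameter count times bits per weight. A minimal back-of-the-envelope sketch (the bits-per-weight averages for the llama.cpp quant formats are my assumption, and KV cache / runtime overhead adds several more GB on top):

```python
def approx_model_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights alone, in decimal GB.

    Excludes KV cache and runtime overhead, which add several more GB.
    """
    return n_params * bits_per_weight / 8 / 1e9

# Assumed average bits-per-weight: Q4_K_M ~4.8 bpw, Q6_K ~6.56 bpw.
print(approx_model_size_gb(70e9, 4.8))   # ~42 GB, in line with the ~43 GB figure
print(approx_model_size_gb(70e9, 6.56))  # ~57 GB, close to the ~60 GB figure
```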
Hey, working with different LLMs (other than the default OpenAI) requires extra manual tweaking. Would love to learn from your experience if you find ways to make the code more generic!
To its credit, Llama 3 worked pretty much out of the box with gpt-researcher (the only tweak needed was the prompt change above). It also seems possible to stretch the context window to 16k without fine-tuning, though I've done very limited testing of that. So far, progress with Llama 3 has been difficult for things requiring function calling and in-prompt memory, i.e. autonomous agents; with single-prompt or one-by-one prompting agents, things seem better. Of course, the main challenge remains the quality of the reports. I'm currently comparing Llama 3 vs. GPT-4: both seem somewhat challenged, and I believe the likely direction to solve this is to balance automation and augmentation, letting the user do more if they wish. I haven't measured the quality of embeddings and its impact on report quality much either.
Great, thank you for the feedback @barsuna! Closing for now, but feel free to open new threads if needed.
Testing gpt-researcher with Llama 3, I found that three times out of four, Llama 3 responds to the prompt in generate_search_queries_prompt with JSON plus some extra verbiage.
I'm not sure it is worth changing the prompt for the sake of Llama 3 alone, but for documentation purposes, here is the updated prompt that seems to work every time.
before:
after:
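An alternative (or complement) to changing the prompt is to tolerate the verbiage on the caller's side and pull out whatever JSON the model did emit. A minimal sketch of such a defensive parser (`extract_json` is a hypothetical helper, not part of gpt-researcher):

```python
import json
import re


def extract_json(text: str):
    """Best effort: parse the whole reply as JSON, else extract the first
    JSON object or array embedded in surrounding prose."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Greedily match the outermost {...} or [...] span; DOTALL covers newlines.
        match = re.search(r"\{.*\}|\[.*\]", text, re.DOTALL)
        if match is None:
            raise
        return json.loads(match.group(0))
```

This would rescue replies like `Here are your queries: ["q1", "q2"]. Hope that helps!`, at the cost of silently ignoring the extra text.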