Feature/benchmark gpt pilot #5184
This is still a work in progress and I may need help. Locally I'm working on the config files here: https://github.com/Significant-Gravitas/Auto-GPT/tree/bd42e25916f3c4bba9b86de227428482482b507b/benchmark/agent/GPT-Pilot/agbenchmark ... But it seems that they should ultimately be in the ... Currently the test is failing:
|
I'm making progress on this. The documentation on adding new agents could be clearer, though. |
@merwanehamadi anything we can do to help out here? |
GPT-Pilot is very new, only 25 days old. I think it needs some architectural changes before the benchmarks can be run against it. |
Hi @nalbion, thank you for doing this. The team is currently working on fixing the benchmarking system to work smoothly with the monorepo; see #5194. Also, I noticed you have your agent in benchmark/agents; we are going to be keeping the agents in autogpts/{agent_name}. |
This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request. |
@Swiftyos the rules for the competition seem to be contradictory: "The creation of new agents should strictly be facilitated through the AutoGPT repo" but also "Participants may begin with an existing agent or start fresh." and "Please note that you are free to utilize any AI technology, ..."
As long as Pilot-GPT implements the Agent Protocol, it should be eligible, right? |
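For context on what "implements the Agent Protocol" would involve: the protocol centers on two operations, creating a task and executing steps against it. Below is a minimal in-memory sketch, assuming the v1 routes `POST /ap/v1/agent/tasks` and `POST /ap/v1/agent/tasks/{task_id}/steps`. The class `InMemoryAgent`, its method names, and the echo step logic are hypothetical placeholders, not GPT-Pilot's or AutoGPT's actual code, and no HTTP layer is included.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Step:
    step_id: str
    task_id: str
    output: str
    is_last: bool


@dataclass
class Task:
    task_id: str
    input: str
    steps: list = field(default_factory=list)


class InMemoryAgent:
    """Hypothetical stand-in for an agent exposing Agent Protocol-style operations."""

    def __init__(self):
        self.tasks = {}

    def create_task(self, input_text):
        # Would back a handler for POST /ap/v1/agent/tasks (assumed route).
        task = Task(task_id=str(uuid.uuid4()), input=input_text)
        self.tasks[task.task_id] = task
        return task

    def execute_step(self, task_id):
        # Would back POST /ap/v1/agent/tasks/{task_id}/steps (assumed route).
        # Placeholder logic: echo the task input and finish in a single step.
        task = self.tasks[task_id]
        step = Step(
            step_id=str(uuid.uuid4()),
            task_id=task_id,
            output=f"echo: {task.input}",
            is_last=True,
        )
        task.steps.append(step)
        return step
```

A benchmark harness would presumably create a task, then request steps until one reports `is_last`; a real agent would replace the echo with its own planning and execution.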
Conflicts have been resolved! A maintainer will review the pull request shortly. |
This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request. |
Background
GPT-Pilot benchmark config
Changes
Because Auto-GPT-Benchmarks is deprecated, I have had to make some additional code changes here to update the paths.