Some questions on library choice #8

Open
BradKML opened this issue Apr 23, 2024 · 2 comments
Labels
good first issue Good for newcomers

Comments

BradKML commented Apr 23, 2024

  1. What are the differences between Anterion's components and tools like ChatDev, AgentGPT, MetaGPT, and AutoGen?
  2. How can one get AI to do web design and web development (or more generally UX development and accessibility)?
  3. What are the criteria for picking an open model for use in Anterion? (e.g. Qwen, DeepSeek, Phind, Wizard, LLaMA)
  4. Can this project handle reading documents and browsing the internet for technical updates?
  5. Is automated fine-tuning or reranking possible to handle data science related tasks? (e.g. LDB, LATS, Parsel)
MiscellaneousStuff (Owner)

Whoo, that's a lot of questions! I'll try my best to answer all of them.

  1. The main difference between Anterion and the other projects you mentioned is that we're using SWE-agent as the base layer, and we will soon include planning as well. In our experience, SWE-agent's approach of heavily utilising per-task demonstrations and restricting inputs and outputs greatly improves its ability to solve novel programming tasks. It also makes the agent more robust to errors: it knows when to stop attempting something entirely, or to backtrack and try another approach to a problem.
  2. One can improve AI web design and development by integrating multi-modal components into agents. This could involve using something like GPT-4V or Claude's vision API to give the agent a general semantic understanding of the browser, and then using more specific tools like AgentQL to interface with specific parts of the website. You can also give the agent a sense of time by recording roughly how long things take to happen, which is quite important in web design as well. Combining visual and spatial/geometric information, a better presentation of the DOM (whether the raw DOM, some React-based representation, etc.), and the robustness of something like SWE-agent would be a good angle for improving AI-based web development. We'd like to try some of these methods ourselves within Anterion.
  3. Ideally we would like to integrate LiteLLM (for API-based LLMs) and Ollama (for local LLMs), which should abstract access to a wide range of LLMs for end-users. This is the best approach as it lets users use whatever they want in a way that is easiest for us to implement (as long as those libraries cover the LLMs people want to use). As for which LLMs work well in general, that is simply a question of performance. The ideal LLM will work well with the prompts used by SWE-agent and will have the context length to support them. At the moment, the average SWE-agent call requires around 10,000 input tokens and may produce up to 1,000 output tokens (depending on the individual action the agent is performing). This favours bigger models, which handle these larger calls better.
  4. This project can indirectly handle reading documents using bash tools or Python. If this is a feature you'd like to see implemented more explicitly, we can definitely add it to the list of features we'll add to the agent's command specification next. As for browsing the internet, this is something we'll explicitly add to the agent's capabilities next; refer to issue #3 (Add Browser Support During Web Development Tasks) for further context.
  5. I think you'll need to expand on this for me to understand. As I read it, you're asking whether the agent can dynamically adjust its behaviour during data science tasks, e.g. dynamically analysing data or changing how it performs certain tasks? The answer is generally yes, but it's hard to say for sure without knowing exactly what you had in mind. If you join our Discord at https://discord.com/invite/nbY6njCuxh, maybe we can give you a better answer.
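To make point 2 more concrete, here is a minimal sketch of one way to "present the DOM" to an agent: flattening a page into a short, indexed list of interactive elements that a prompt can reference by number. The class and function names (`DOMSummariser`, `summarise`) are hypothetical illustrations, not part of Anterion, SWE-agent, or AgentQL; a real implementation would handle far more element types and attributes.

```python
from html.parser import HTMLParser

# Tags an agent is most likely to want to act on.
INTERACTIVE = {"a", "button", "input", "select", "textarea"}

class DOMSummariser(HTMLParser):
    """Hypothetical sketch: collect interactive elements as indexed lines."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self._pending = None  # index of the element awaiting its inner text

    def handle_starttag(self, tag, attrs):
        if tag in INTERACTIVE:
            idx = len(self.lines)
            attr_map = dict(attrs)
            label = attr_map.get("aria-label") or attr_map.get("placeholder") or ""
            self.lines.append(f"[{idx}] <{tag}> {label}".rstrip())
            self._pending = idx

    def handle_data(self, data):
        text = data.strip()
        if text and self._pending is not None:
            # Attach the element's visible text to its summary line.
            self.lines[self._pending] += f" {text}"
            self._pending = None

def summarise(html: str) -> str:
    """Return a compact, numbered view of a page's interactive elements."""
    parser = DOMSummariser()
    parser.feed(html)
    return "\n".join(parser.lines)
```

An agent prompt could then say "click element [1]" instead of receiving the raw DOM, keeping observations small and unambiguous.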
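The abstraction described in point 3 can be sketched as a small routing layer: LiteLLM accepts provider-prefixed model strings (e.g. `ollama/...` for a local Ollama server), so the choice between API-based and local models reduces to building the right identifier, plus a sanity check against the 10,000-input / 1,000-output token figures above. The `route_model` and `fits_context` helpers, the `API_MODELS` set, and the ~4-characters-per-token estimate are all illustrative assumptions, not anything from Anterion's codebase.

```python
# Illustrative constants: which names are served via provider APIs, and the
# rough call sizes quoted in the answer above.
API_MODELS = {"gpt-4", "claude-3-opus"}
MAX_OUTPUT_TOKENS = 1_000  # typical upper bound per SWE-agent action

def route_model(name: str) -> str:
    """Return a LiteLLM-style model string for an API or local model.

    LiteLLM routes "ollama/<name>" identifiers to a local Ollama server,
    so anything not in API_MODELS is treated as a local model here.
    """
    return name if name in API_MODELS else f"ollama/{name}"

def fits_context(prompt: str, context_window: int) -> bool:
    """Rough check that a prompt plus output budget fits the model's window.

    Assumes ~4 characters per token, a crude heuristic rather than a real
    tokenizer; a production check would count tokens properly.
    """
    est_tokens = len(prompt) // 4 + MAX_OUTPUT_TOKENS
    return est_tokens <= context_window
```

With this shape, a ~10,000-token SWE-agent prompt fails the check for an 8k-context model but passes for a 32k one, which is the "bigger models handle these larger calls better" point in concrete terms.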
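Point 4's "indirect" document reading can be illustrated with the kind of restricted viewer SWE-agent-style tools use: rather than dumping a whole file into the prompt, show a fixed-size window of numbered lines so the observation stays within the agent's input-token budget. `view_window` is a hypothetical sketch, not an actual Anterion or SWE-agent command.

```python
def view_window(text: str, start_line: int = 0, window: int = 100) -> str:
    """Return a numbered window of `text` suitable for an agent observation.

    Hypothetical sketch: a header states which slice is visible, so the
    agent knows it can "scroll" by re-calling with a different start_line.
    """
    lines = text.splitlines()
    end = min(start_line + window, len(lines))
    header = f"[showing lines {start_line + 1}-{end} of {len(lines)}]"
    body = [f"{i + 1}| {lines[i]}" for i in range(start_line, end)]
    return "\n".join([header] + body)
```

Capping the window keeps each observation a predictable size, which matters given the ~10,000-token input budget per call mentioned above.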

Thanks for your interest in the project! Stick around, we've got exciting updates coming soon.

@MiscellaneousStuff MiscellaneousStuff self-assigned this Apr 24, 2024
@MiscellaneousStuff MiscellaneousStuff added the good first issue Good for newcomers label Apr 24, 2024
BradKML (Author) commented Apr 25, 2024

For 4, it is mostly about reading downloaded documentation, plus knowing how to browse the web to find updated documentation. Either way, thanks for the cross-link to the other issue.
For 5, it is mostly inspired by this benchmark, where there is a whole field of optimization techniques that might be good to look at: https://paperswithcode.com/sota/code-generation-on-humaneval
