
Constraints awareness #3466

Closed
1 task done
Boostrix opened this issue Apr 28, 2023 · 18 comments

@Boostrix
Contributor

Boostrix commented Apr 28, 2023

Duplicates

  • I have searched the existing issues

Summary 💡

There are a bunch of RFEs here using the term "maximum FOO" (context, tokens, time, memory, space, etc.).

Thus, more broadly, it might make sense to encode support for actual constraints as a first-class concept in the design, so that under the hood, the system is aware of its own resource utilization (execution time, space, traffic, tokens, context, API usage and so on).

As for API usage/billing, the system would currently have to scrape some of the OpenAI pages, apparently.

That way, planning would also be better informed and simplified, because the system could take into account the "costs" of its operations.

This sort of thing would also make it possible to constrain the system when it is behind a low-bandwidth connection, e.g. to prioritize other work in order to reduce bandwidth utilization.

This could be thought of as the equivalent of Unix/Linux "quotas" - i.e. a way to monitor utilization across different types of resources - which would also be a great thing for benchmarking purposes.
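To make this concrete, here is a minimal sketch of what such quota definitions could look like (all names are hypothetical; nothing like this exists in the codebase yet):

```python
from dataclasses import dataclass
from enum import Enum, auto


class Resource(Enum):
    """Resource types an agent could be metered on (illustrative list)."""
    API_TOKENS = auto()
    API_COST_USD = auto()
    WALL_CLOCK_SECONDS = auto()
    DISK_BYTES = auto()
    NETWORK_BYTES = auto()
    STEPS = auto()


@dataclass
class Quota:
    """A per-resource budget, analogous to Unix/Linux quotas."""
    resource: Resource
    soft_limit: float  # exceeding this should trigger a warning
    hard_limit: float  # exceeding this should stop the agent

    def check(self, used: float) -> str:
        """Classify current usage as an 'ok', 'soft' or 'hard' state."""
        if used >= self.hard_limit:
            return "hard"
        if used >= self.soft_limit:
            return "soft"
        return "ok"


# Example: warn at $4 of API spend, stop at $5.
budget = Quota(Resource.API_COST_USD, soft_limit=4.0, hard_limit=5.0)
print(budget.check(4.50))  # -> "soft"
```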

Examples 🌈

No response

Motivation 🔦

No response

@Boostrix
Contributor Author

This is somewhat related, but not identical: #2237

@johnisanerd
Contributor

johnisanerd commented May 2, 2023

I want to echo my support for this idea. It would be great to be able to constrain resources. Any developer who has worked with AWS or GCP has a nightmare story of accidentally running up a huge bill.

@Boostrix
Contributor Author

Boostrix commented May 2, 2023

Note that this implies being able to specify custom API keys for sub-agents, too (as mentioned yesterday in a PR).

Also see: #3313

PS: This should probably be renamed to "quotas" instead of "constraints", because "constraints" means a different thing in the GPT/LLM context?

@Boostrix
Contributor Author

Boostrix commented May 5, 2023

This posting sums up the typical thinking quite well:

#15 (comment)

  • "Supervise his work (kind of reporting in a file).
  • Give him restrictions (highly recommended restriction/config/settings file), including the use of tokens and external connections that can provide financial risk (debt)."

In other words, constraints would be analogous to "quotas", with an option to set soft/hard quotas, and a violation would trigger asking the human in the loop for feedback.

Sub-agents running into constraint violations would need to notify their parent agent via inter-agent messaging; the violation would then either be handled internally, or simply passed on to the next parent agent, and so on, until the human is consulted once again. Obviously, the equivalent of a "project manager agent" could be given some leeway to control its sub-agents with different constraints/API budgets and handle violations gracefully, without interrupting the main loop.
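A rough sketch of that escalation chain, assuming each agent holds a reference to its parent (all names hypothetical, not actual AutoGPT classes):

```python
from typing import Optional


class Agent:
    """Minimal agent with a parent link, to illustrate quota escalation."""

    def __init__(self, name: str, parent: Optional["Agent"] = None):
        self.name = name
        self.parent = parent

    def can_handle_violation(self, violation: str) -> bool:
        # A "project manager agent" might absorb soft violations within its
        # own budget leeway; here we pretend no agent can, to show the full
        # escalation path up to the human.
        return False

    def on_quota_violation(self, violation: str) -> None:
        if self.can_handle_violation(violation):
            print(f"{self.name}: handling '{violation}' internally")
        elif self.parent is not None:
            # Pass the violation up the chain via inter-agent messaging.
            self.parent.on_quota_violation(violation)
        else:
            # Top of the hierarchy: consult the human in the loop.
            print(f"{self.name}: asking the human about: {violation}")


root = Agent("project-manager")
worker = Agent("browser-agent", parent=root)
worker.on_quota_violation("soft API budget exceeded")
```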

@johnisanerd
Contributor

Not knowing the architecture here very well, could it be accomplished as a plugin?

@Boostrix
Contributor Author

Boostrix commented May 5, 2023

The plugin interface is in the process of being revamped as part of #3652.
So, I guess, yes, it would be possible - the question is whether that's sensible, given that these "quotas" (I'll be using that term from now on, since "constraints" has a different meaning in the context of LLMs) will need to be monitored/tracked basically ALL THE TIME while an agent is running a task/objective or in the planning stage.

Thus, while the code might conceptually reside in plug-in space, it would de facto be a core component - since it would need to be called permanently by the core to track the costs of different actions.

Then again, if the core devs end up not being supportive of the idea, that's certainly an option - but based on some comments I've seen in a few PRs, there's related work going on anyway, so what seems more likely is that this will be implemented "soonish".

Initially, the focus would be on tracking "obvious costs" (API use), but probably with means to extend this over time.
I've seen comments and PRs related to tracking API tokens and the number of steps, which I consider rather promising and most urgent for the time being.

From an execution standpoint, unnecessary looping is the one thing that most people find annoying, so tracking looping (which is a form of taking the same step over and over again, once you think about it) would be highly useful. Consider it a way to not just track the number of steps, but to track the maximum number of identical steps, where "identical" would be determined by hashing the name and arguments of the action to be taken, while always getting the same response from the LLM: #3668
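A minimal sketch of that idea, hashing the command name plus its arguments and counting repeats (the threshold and names are illustrative):

```python
import hashlib
import json
from collections import Counter

MAX_IDENTICAL_STEPS = 3
step_counts: Counter = Counter()


def step_fingerprint(command: str, arguments: dict) -> str:
    """Hash the command name and its arguments to identify 'identical' steps."""
    payload = json.dumps({"cmd": command, "args": arguments}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def register_step(command: str, arguments: dict) -> bool:
    """Return False once the same step has been proposed too often."""
    key = step_fingerprint(command, arguments)
    step_counts[key] += 1
    return step_counts[key] <= MAX_IDENTICAL_STEPS


ok = True
for _ in range(4):
    ok = register_step("browse_website", {"url": "https://example.com"})
print(ok)  # -> False: the loop would be flagged on the 4th identical step
```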

To literally track bandwidth, CPU/RAM utilization and disk space, we'd want to wrap a library like psutil to do so in a multi-platform fashion.
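For instance, a thin adaptor over psutil (psutil is a real library; the wrapper itself is just a sketch) might look like this:

```python
import psutil  # cross-platform system stats; pip install psutil


def system_stats() -> dict:
    """Snapshot of system-level resource utilization."""
    net = psutil.net_io_counters()
    return {
        "cpu_percent": psutil.cpu_percent(interval=0.1),
        "ram_percent": psutil.virtual_memory().percent,
        "disk_percent": psutil.disk_usage("/").percent,
        "net_bytes_sent": net.bytes_sent,
        "net_bytes_recv": net.bytes_recv,
    }


print(system_stats())
```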

@johnisanerd
Contributor

You raise an interesting point regarding the use of quotas as a means of controlling robots within a software framework. Quotas can be a powerful tool in limiting the behavior of robots, especially when they are tied to a reporting system that allows for effective oversight.

However, as you suggest, it may be more effective to implement quotas at the main software level, rather than within a plugin. This would ensure that all sub-agents within the system are subject to the same constraints and reporting mechanisms, and that violations are handled consistently across the board.

It's also worth considering the potential trade-offs of using quotas in this way. On the one hand, quotas can help to prevent robots from engaging in harmful or risky behaviors, and can provide a mechanism for catching and correcting errors. On the other hand, overly restrictive quotas could limit the effectiveness of the system and prevent it from achieving its goals.

Ultimately, the decision to use quotas in this way will depend on the specific context and goals of the software framework in question. It may be helpful to experiment with different levels of constraint and to solicit feedback from stakeholders and users to determine the optimal balance between control and flexibility.

@johnisanerd
Contributor

I guess I was thinking a plugin would be an easy way to deploy it quickly and get user feedback; once demonstrated, it could become a core feature.

@Boostrix
Contributor Author

Boostrix commented May 5, 2023

For now, it seems API/cost tracking and tracking of steps are in the pipeline - the rest, we'll see. But given the agility/pace of the project, it will probably take just a few weeks to get this implemented.

Regarding tracking of API costs and steps, these are the PRs that I am aware of:

So, in essence, a number of folks have come up with this idea previously - just not with an overly broad focus. But given the current evolution towards a multi-agent system, tracking resource utilization seems logical, and important.

There's now initial work to track memory per agent: #3844

@zachary-kaelan

When it comes to human input, it may be worth phrasing it in the prompt as a "very slow but very powerful AGI model, for use when you get stuck", which, when called, just sends you a message asking for input. We could integrate that into the quota system with some fake numbers to make it tunable in the same way: "flat $30 per generation, with a latency of 5 minutes".
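As a toy sketch of that framing (the class and the fake cost/latency numbers are purely illustrative):

```python
import time


class HumanOracle:
    """The 'very slow but very powerful AGI model': really a human prompt.

    Fake cost/latency figures let the quota system treat human input like
    any other metered model call, and make it tunable in the same way.
    """

    cost_per_generation_usd = 30.0   # flat fake price per consultation
    expected_latency_seconds = 300   # "5 minutes of thinking"

    def generate(self, question: str) -> str:
        started = time.monotonic()
        answer = input(f"[stuck] {question}\n> ")
        elapsed = time.monotonic() - started
        print(f"charged ${self.cost_per_generation_usd:.2f}, took {elapsed:.0f}s")
        return answer
```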

@Boostrix
Contributor Author

Boostrix commented May 5, 2023

Latency is indeed an important consideration - not just network latency, but also "thinking" latency, i.e. the pure process of delegating a task to the [remote] LLM/GPT (and that would apply even if it were local).

@anonhostpi

This will be partially mitigated by the introduction of workspaces in the re-arch. Of course a REST API is also good for linking more than one instance of AutoGPT to another

@Boostrix
Contributor Author

Boostrix commented May 5, 2023

The following RFE also discusses how actions/commands may have their own associated "costs" and may be subject to constraints, too: #3945

Of course a REST API is also good for linking more than one instance of AutoGPT to another

Also see:

@Boostrix
Contributor Author

Boostrix commented May 9, 2023

The new budget manager implementation (#4040) is likely to provide a good foundation to experiment with the concept of gathering stats and monitoring/tracking those to comply with some constraints.

From an architectural perspective, it would probably make sense to have the equivalent of a StatsProvider (to capture/provide data), and an actual StatsObserver/Monitor to check whether the system remains within some well-defined bounds.

While this is straightforward to do for simple metrics like API tokens, number of steps taken, or duration of an execution, system-specific stats would be better captured by coming up with an adaptor class to wrap psutil accordingly.

That way, the system could also be told to observe CPU/RAM/disk utilization etc.
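A rough sketch of that split (the StatsProvider/StatsObserver names are from the paragraph above; the rest is hypothetical):

```python
from abc import ABC, abstractmethod


class StatsProvider(ABC):
    """Captures and provides raw usage data for one metric."""

    @abstractmethod
    def current_value(self) -> float: ...


class StatsObserver:
    """Checks a provider's readings against a well-defined bound."""

    def __init__(self, provider: StatsProvider, limit: float):
        self.provider = provider
        self.limit = limit

    def within_bounds(self) -> bool:
        return self.provider.current_value() <= self.limit


class TokenUsageProvider(StatsProvider):
    """Trivial provider, fed by the agent loop after each LLM call."""

    def __init__(self) -> None:
        self.tokens_used = 0

    def current_value(self) -> float:
        return float(self.tokens_used)


tokens = TokenUsageProvider()
observer = StatsObserver(tokens, limit=10_000)
tokens.tokens_used += 2_500
print(observer.within_bounds())  # -> True
```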

@Boostrix
Contributor Author

Boostrix commented Jul 3, 2023

This article goes into detail about the lack of constraint awareness/budget management (beyond just API tokens): https://lorenzopieri.com/autogpt_fix/

AutoGPT does subgoals creation and any other activity by prompting LLMs without any real concern for resource usage and user preferences. As well known from decision theory and reinforcement learning, we can encode the preference of the user in a scalar utility (or reward) function. Moreover we can estimate the cost incurred when executing an action, in terms of compute, time, processes, money or similar scarce resources. Finally, we can take note of the confidence of the LLM model inference to any given statement, as everything else being equal it makes sense to build plans with greater confidence. Plugging all together, armed with this generalised utility function we now have a criteria to guide subgoal creation and planning, since any LLMs statement is now weighted. Notably it is the LLM itself doing this value assignment, even though building custom classifiers is also an option, similarly to the reward preference models of RLHF.
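As a toy illustration of the generalised utility function described there (the weights and numbers are made up, not from the article):

```python
from dataclasses import dataclass


@dataclass
class CandidateAction:
    description: str
    estimated_cost: float   # e.g. dollars of compute/API spend
    llm_confidence: float   # 0..1, the model's confidence in this step
    user_value: float       # scalar reward encoding user preferences


def utility(action: CandidateAction, cost_weight: float = 1.0) -> float:
    """Confidence-weighted value minus weighted cost, per the quoted article."""
    return action.llm_confidence * action.user_value - cost_weight * action.estimated_cost


plans = [
    CandidateAction("scrape the docs site", estimated_cost=0.10,
                    llm_confidence=0.9, user_value=1.0),
    CandidateAction("brute-force web search", estimated_cost=2.00,
                    llm_confidence=0.6, user_value=1.5),
]
best = max(plans, key=utility)
print(best.description)  # -> "scrape the docs site"
```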

@github-actions
Contributor

github-actions bot commented Sep 6, 2023

This issue has automatically been marked as stale because it has not had any activity in the last 50 days. You can unstale it by commenting or removing the label. Otherwise, this issue will be closed in 10 days.

@github-actions github-actions bot added the Stale label Sep 6, 2023
@github-actions
Contributor

This issue was closed automatically because it has been stale for 10 days with no activity.

@github-actions github-actions bot closed this as not planned (stale) Sep 19, 2023
@Boostrix
Contributor Author

Boostrix commented Oct 4, 2023

For the record, I don't agree with this being marked as stale - it should probably be re-opened and added to some future milestone?
