Description of the new feature / enhancement
Advanced Paste is a powerful feature, but the hard-coded model, GPT-4o, has a 30K TPM (tokens per minute) limit at Tier 1, which is the tier most users will be on.
The limit per day is also only 90K compared to 2M for mini models.
Users should be able to choose the model that best suits their needs based on cost, speed, complexity, and required context.
e.g.
- o4-mini is far more advanced than 4o at less than half the cost.
- 4o-mini is less than 1/16th the cost of 4o while being much faster. A 200K token action would still cost ~2.5x less than the maximum 30K task possible with 4o.
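
To make the cost comparison above concrete, here is a rough sketch of the arithmetic. The per-token prices are my assumption (input pricing of roughly $2.50 per 1M tokens for 4o and $0.15 per 1M tokens for 4o-mini) and may not match current pricing; only the 30K and 200K token counts come from the points above.

```python
# Assumed input prices in USD per 1M tokens. These are illustrative values,
# not figures taken from this issue, and may be out of date.
PRICE_PER_MILLION = {"gpt-4o": 2.50, "gpt-4o-mini": 0.15}


def input_cost(model: str, tokens: int) -> float:
    """Return the input cost in USD for sending `tokens` tokens to `model`."""
    return PRICE_PER_MILLION[model] * tokens / 1_000_000


# Largest task that fits in the 30K TPM limit for 4o.
cost_4o = input_cost("gpt-4o", 30_000)
# A much larger 200K token task sent to 4o-mini instead.
cost_mini = input_cost("gpt-4o-mini", 200_000)

price_ratio = PRICE_PER_MILLION["gpt-4o"] / PRICE_PER_MILLION["gpt-4o-mini"]
cost_ratio = cost_4o / cost_mini

print(f"4o is {price_ratio:.1f}x the per-token price of 4o-mini")
print(f"30K on 4o: ${cost_4o:.3f}, 200K on 4o-mini: ${cost_mini:.3f}")
print(f"The 4o task costs {cost_ratio:.1f}x more")
```

Under those assumed prices, the 200K token task on 4o-mini still comes out cheaper than the 30K token task on 4o.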
Scenario when this would be used?
Ideally, the global model could be overridden for individual custom actions, allowing even greater control.
I'd immediately switch to 4o-mini as my default, since it would be faster and cheaper and could complete larger tasks instead of returning a rate limit message.
Having Advanced Paste automatically switch to a mini model when the token count is too high would also be much preferred to an error.
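
As a rough illustration of the behavior described above (a global default, an optional per-action override, and an automatic fallback to a mini model), the selection logic could look something like this. It is only a sketch: `resolve_model`, `TOKEN_LIMITS`, and the parameter names are hypothetical and not part of Advanced Paste, and the 200K limit for 4o-mini is an assumption.

```python
from typing import Optional

# Hypothetical per-request token budgets. The 30K value is the Tier 1 TPM
# limit for 4o mentioned in this issue; the 200K value for 4o-mini is an
# assumption made for illustration.
TOKEN_LIMITS = {"gpt-4o": 30_000, "gpt-4o-mini": 200_000}


def resolve_model(
    token_count: int,
    global_model: str = "gpt-4o",
    action_model: Optional[str] = None,
    fallback_model: str = "gpt-4o-mini",
) -> str:
    """Pick the model for a paste action.

    A per-action model, if set, overrides the global default. If the request
    is too large for the preferred model, fall back to the mini model rather
    than failing with a rate limit error.
    """
    preferred = action_model or global_model
    if token_count <= TOKEN_LIMITS[preferred]:
        return preferred
    if token_count <= TOKEN_LIMITS[fallback_model]:
        return fallback_model
    raise ValueError(
        f"{token_count} tokens exceeds the limit of every configured model"
    )


print(resolve_model(10_000))                              # global default is used
print(resolve_model(50_000))                              # too large, falls back
print(resolve_model(10_000, action_model="gpt-4o-mini"))  # per-action override
```

An error would then only be shown when the request is too large for the fallback model as well.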