Advanced Paste: Add model picker to avoid rate-limits and reduce costs (GPT-4o is 30K TPM, 4o-mini/4.1-mini/o4-mini are 200K TPM) #39700

@ThoughtPhotography

Description of the new feature / enhancement

Advanced Paste is a powerful feature, but sadly the hard-coded model, GPT-4o, has a 30K TPM (tokens per minute) limit on Tier 1, which is where the majority of users will be.

The daily limit is also only 90K, compared to 2M for the mini models.

Users should be able to choose the model that best suits their needs based on cost, speed, complexity, and the context required.

e.g.

  • o4-mini is far more advanced than 4o at less than half the cost.
  • 4o-mini is less than 1/16th the cost of 4o while being much faster. A 200K-token action would still cost ~2.5x less than the largest 30K-token task possible with 4o.
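
For reference, the cost comparison above can be reproduced with a few lines of Python. This is only a sketch: the per-token prices are assumptions based on the input prices listed on the pricing page at the time of writing, and `input_cost` is a made-up helper, not part of any library.

```python
# Assumed input prices in USD per 1M tokens (illustrative; check the pricing page).
PRICE_PER_MILLION = {
    "gpt-4o": 2.50,
    "gpt-4o-mini": 0.15,
}


def input_cost(model: str, tokens: int) -> float:
    """Return the input-token cost in USD for one request (hypothetical helper)."""
    return PRICE_PER_MILLION[model] * tokens / 1_000_000


# Largest task that fits in GPT-4o's 30K TPM limit vs. a full 200K task on 4o-mini.
cost_4o = input_cost("gpt-4o", 30_000)
cost_mini = input_cost("gpt-4o-mini", 200_000)
ratio = cost_4o / cost_mini

print(f"gpt-4o, 30K tokens:       ${cost_4o:.3f}")
print(f"gpt-4o-mini, 200K tokens: ${cost_mini:.3f}")
print(f"4o costs {ratio:.1f}x more for a task less than 1/6th the size")
```

Output-token prices are ignored here, so the real ratio depends on how much text the action generates.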

Scenario when this would be used?

Ideally, the global model could be overridden per custom action, allowing even greater control.

I'd immediately switch to 4o-mini as my default since it would be faster, cheaper, and could complete larger tasks instead of returning a rate-limit message.

Having Advanced Paste automatically switch to a mini model when the token count is too high would also be far preferable to an error.
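
To illustrate, the per-action override and automatic fallback described above could work roughly like this. It is a minimal sketch, not how Advanced Paste is implemented: `CustomAction`, `choose_model`, and the `fallback` parameter are hypothetical names, and the limits are the Tier 1 TPM figures quoted in the title.

```python
from dataclasses import dataclass
from typing import Optional

# Tier 1 tokens-per-minute limits as quoted in this issue (assumed, may change).
TPM_LIMITS = {
    "gpt-4o": 30_000,
    "gpt-4o-mini": 200_000,
    "gpt-4.1-mini": 200_000,
    "o4-mini": 200_000,
}


@dataclass
class CustomAction:
    """Hypothetical custom action with an optional per-action model."""
    name: str
    model_override: Optional[str] = None


def choose_model(global_model: str, action: CustomAction,
                 estimated_tokens: int, fallback: str = "gpt-4o-mini") -> str:
    """Pick the action's model, else the global one; fall back if over the limit."""
    preferred = action.model_override or global_model
    if estimated_tokens <= TPM_LIMITS[preferred]:
        return preferred
    if estimated_tokens <= TPM_LIMITS[fallback]:
        return fallback  # switch to the mini model instead of erroring out
    raise ValueError(f"{estimated_tokens} tokens exceeds every configured limit")


# A 50K-token paste is too large for GPT-4o, so the fallback is used.
print(choose_model("gpt-4o", CustomAction("Summarize"), 50_000))
```

An action that sets `model_override="o4-mini"` would keep that model for the same 50K-token input, since it fits within that model's limit.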

Supporting information

https://platform.openai.com/docs/pricing
