
Only require max_tokens when token rate limits apply #3771

Merged

virajmehta merged 15 commits into main from gb/fix-3648 on Oct 1, 2025

Conversation

@GabrielBianconi
Member

@GabrielBianconi commented Oct 1, 2025

Fix #3648


Important

Modify rate limiting to require max_tokens only when token rate limits apply, updating RateLimitedRequest implementations and adding relevant tests.

  • Behavior:
    • The RateLimitedRequest trait's estimated_resource_usage method now takes a resources parameter to determine whether max_tokens is required.
    • The EmbeddingRequest and ModelInferenceRequest implementations were updated to require max_tokens only when RateLimitResource::Token is present.
    • RateLimitingConfig::get_rate_limited_resources method added to determine active rate-limited resources.
  • Tests:
    • Added test_model_provider_infer_max_tokens_check in model.rs to validate behavior when max_tokens is missing or provided.
    • Added test_max_tokens_validation_with_rate_limits in rate_limiting/mod.rs to ensure correct resource inclusion based on rate limits.
  • Misc:
    • Renamed RateLimitResourceUsage to EstimatedRateLimitResourceUsage in several places for clarity.

This description was created by Ellipsis for d7da1ef.
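The conditional validation described in the summary can be sketched as follows. This is a hypothetical minimal version: the type and method names (RateLimitResource, EstimatedRateLimitResourceUsage, estimated_resource_usage) follow the summary above, but the fields, signatures, and error handling are illustrative assumptions, not the actual tensorzero-core code.

```rust
// Hypothetical sketch of resource-aware validation; names follow the PR
// summary, but all fields and signatures are assumptions for illustration.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RateLimitResource {
    ModelInference,
    Token,
}

#[derive(Debug, PartialEq)]
struct EstimatedRateLimitResourceUsage {
    model_inferences: u64,
    tokens: Option<u64>,
}

struct ModelInferenceRequest {
    max_tokens: Option<u32>,
    estimated_input_tokens: u64,
}

impl ModelInferenceRequest {
    /// max_tokens is only required when a token rate limit is active;
    /// otherwise token usage is simply not estimated.
    fn estimated_resource_usage(
        &self,
        resources: &[RateLimitResource],
    ) -> Result<EstimatedRateLimitResourceUsage, String> {
        let tokens = if resources.contains(&RateLimitResource::Token) {
            let max = self
                .max_tokens
                .ok_or_else(|| "max_tokens is required when token rate limits apply".to_string())?;
            Some(self.estimated_input_tokens + u64::from(max))
        } else {
            None
        };
        Ok(EstimatedRateLimitResourceUsage {
            model_inferences: 1,
            tokens,
        })
    }
}
```

Under this sketch, a request without max_tokens succeeds against an inference-only rate limit but fails once a token rate limit is configured, which matches the intended behavior of the fix.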

Copilot AI review requested due to automatic review settings October 1, 2025 14:43
Contributor

Copilot AI left a comment


Pull Request Overview

This PR fixes issue #3648 by making the max_tokens parameter only required when token rate limits are actually active. The change introduces conditional validation that checks which rate limit resources are configured before requiring specific parameters.

Key changes:

  • Added resource-aware validation for rate limiting requirements
  • Replaced fixed resource usage calculation with conditional estimation based on active rate limits
  • Updated the trait interface to pass rate-limited resources to estimation methods
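The third bullet implies the config layer must first determine which resources are rate limited before they can be passed to the estimation methods. A minimal sketch of that step, assuming an illustrative rule-based config shape (the RateLimitRule fields here are hypothetical, not the real tensorzero-core types):

```rust
// Hypothetical sketch of RateLimitingConfig::get_rate_limited_resources;
// the rule/config structure is assumed for illustration.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RateLimitResource {
    ModelInference,
    Token,
}

struct RateLimitRule {
    resource: RateLimitResource,
    capacity: u64,
}

struct RateLimitingConfig {
    rules: Vec<RateLimitRule>,
}

impl RateLimitingConfig {
    /// Collect the distinct resources that any configured rule limits,
    /// so callers can decide which usage estimates are actually needed.
    fn get_rate_limited_resources(&self) -> Vec<RateLimitResource> {
        let mut out = Vec::new();
        for rule in &self.rules {
            if !out.contains(&rule.resource) {
                out.push(rule.resource);
            }
        }
        out
    }
}
```

The result of this call is what would be threaded through to estimated_resource_usage, making the max_tokens requirement depend on whether RateLimitResource::Token appears in the active set.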

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

Reviewed files:

  • tensorzero-core/src/rate_limiting/mod.rs: adds a method to get the active rate-limited resources and makes the resource usage calculation conditional
  • tensorzero-core/src/model.rs: adds test coverage for max_tokens validation under different rate limiting configurations
  • tensorzero-core/src/inference/types/mod.rs: updates ModelInferenceRequest to estimate token usage only when token rate limits are active
  • tensorzero-core/src/embeddings.rs: updates EmbeddingRequest to estimate resource usage conditionally based on active rate limits


@GabrielBianconi GabrielBianconi marked this pull request as draft October 1, 2025 14:47
Copilot AI review requested due to automatic review settings October 1, 2025 15:14
Contributor

Copilot AI left a comment


Pull Request Overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.



Copilot AI review requested due to automatic review settings October 1, 2025 15:43
Contributor

Copilot AI left a comment


Pull Request Overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.


Copilot AI review requested due to automatic review settings October 1, 2025 15:53
Contributor

Copilot AI left a comment


Pull Request Overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.



Contributor

Copilot AI left a comment


Pull Request Overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.



@GabrielBianconi GabrielBianconi marked this pull request as ready for review October 1, 2025 16:20
Contributor

Copilot AI left a comment


Pull Request Overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.



@virajmehta virajmehta self-assigned this Oct 1, 2025
virajmehta
virajmehta previously approved these changes Oct 1, 2025
@virajmehta virajmehta enabled auto-merge October 1, 2025 16:40
@virajmehta virajmehta added this pull request to the merge queue Oct 1, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to a conflict with the base branch Oct 1, 2025
Copilot AI review requested due to automatic review settings October 1, 2025 21:22
Contributor

Copilot AI left a comment


Pull Request Overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.



@virajmehta virajmehta enabled auto-merge October 1, 2025 21:36
@virajmehta virajmehta added this pull request to the merge queue Oct 1, 2025
Merged via the queue into main with commit 0a94257 Oct 1, 2025
30 checks passed
@virajmehta virajmehta deleted the gb/fix-3648 branch October 1, 2025 22:20


Development

Successfully merging this pull request may close these issues.

max_tokens should not be required if there is no relevant rate limit

4 participants