Conversation

@luis5tb (Contributor) commented Oct 3, 2025

This patch ensures that if max_tokens is not defined, it is set to None instead of 0 when calling openai_chat_completion. This way, providers (like Gemini) that cannot handle max_tokens = 0 will not fail.

Issue: #3666
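
For illustration only, here is a minimal sketch of the idea behind the patch; the helper name `build_openai_params` and the dict-shaped `sampling_params` are hypothetical, not the actual llama-stack code:

```python
from typing import Any


def build_openai_params(sampling_params: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: translate internal sampling params into kwargs
    for an openai_chat_completion call."""
    max_tokens = sampling_params.get("max_tokens")

    # Before the patch, an unset value could fall through as 0, which providers
    # such as Gemini reject with a BadRequestError. Forwarding None instead
    # lets the provider apply its own default.
    if not max_tokens:  # covers both "missing" and 0
        max_tokens = None

    return {
        "max_tokens": max_tokens,
        "temperature": sampling_params.get("temperature"),
    }


# An unset or zero max_tokens is forwarded as None rather than 0.
assert build_openai_params({})["max_tokens"] is None
assert build_openai_params({"max_tokens": 0})["max_tokens"] is None
assert build_openai_params({"max_tokens": 256})["max_tokens"] == 256
```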

@meta-cla bot added the CLA Signed label on Oct 3, 2025
@luis5tb changed the title from "Fix BadRequestError due to unvalid max_tokens" to "fix: Avoid BadRequestError due to invalid max_tokens" on Oct 3, 2025
@mattf (Collaborator) left a comment

Holding this pending the Discord discussion.

@luis5tb force-pushed the max_tokens branch 4 times, most recently from 43fb189 to ea42cf6, on October 5, 2025 08:35
@mattf (Collaborator) left a comment

good find!

@leseb (Collaborator) commented Oct 15, 2025

@luis5tb see CI failure:

Run './scripts/integration-tests.sh --inference-mode record-if-missing' with required API keys to generate.

Thanks!

@luis5tb (Contributor, Author) commented Oct 15, 2025

The error in the unit test is weird.

> @luis5tb see CI failure:
>
> Run './scripts/integration-tests.sh --inference-mode record-if-missing' with required API keys to generate.
>
> Thanks!

I triggered the recording workflow and now it is green.

@leseb (Collaborator) commented Oct 15, 2025

@luis5tb the number of changed files seems off?

@mattf (Collaborator) left a comment

please use one of -

  • record-if-missing, or
  • record + git restore + git add

the restore will remove all the minor changes like call id and token counts

@luis5tb (Contributor, Author) commented Oct 15, 2025

> please use one of -
>
>   • record-if-missing, or
>   • record + git restore + git add
>
> the restore will remove all the minor changes like call id and token counts

I used the remote recording option (scripts/github/schedule-record-workflow.sh); there is no option for record-if-missing there. From what I see, when using that option "record" is hardcoded: https://github.com/llamastack/llama-stack/blob/main/.github/workflows/record-integration-tests.yml#L70

@luis5tb force-pushed the max_tokens branch 2 times, most recently from edf6e86 to 4c68892, on October 15, 2025 15:07
@luis5tb force-pushed the max_tokens branch 3 times, most recently from cec6d76 to 3cba12d, on October 16, 2025 06:16
@luis5tb (Contributor, Author) commented Oct 16, 2025

> please use one of -
>
>   • record-if-missing, or
>   • record + git restore + git add
>
> the restore will remove all the minor changes like call id and token counts

I used the remote recording option (scripts/github/schedule-record-workflow.sh); there is no option for record-if-missing there. From what I see, when using that option "record" is hardcoded: https://github.com/llamastack/llama-stack/blob/main/.github/workflows/record-integration-tests.yml#L70

To work around this, I created the recordings with the remote workflow, fetched the commit, reset it, and only added the new files (similar to record + git restore + git add). It is still adding a lot of files, though.

@luis5tb requested a review from mattf on October 16, 2025 14:04
This patch ensures that if max_tokens is not defined it is set to None.
This avoids some providers failing, as they don't have protection against
it being set to 0.

Issue: llamastack#3666
Removed the modification, just adding the new content
@luis5tb (Contributor, Author) commented Oct 27, 2025

> please use one of -
>
>   • record-if-missing, or
>   • record + git restore + git add
>
> the restore will remove all the minor changes like call id and token counts

@mattf, any more input on this? Is the PR ready? I have already added only the new recordings and not the minor changes (though it is still a long list of files).

@ashwinb (Contributor) left a comment

CI is green!

@ashwinb merged commit f18b5eb into llamastack:main on Oct 27, 2025
24 checks passed
ashwinb pushed a commit to ashwinb/llama-stack that referenced this pull request Oct 30, 2025
This patch ensures that if max_tokens is not defined, it is set to None
instead of 0 when calling openai_chat_completion. This way, providers
(like Gemini) that cannot handle `max_tokens = 0` will not fail.

Issue: llamastack#3666
ashwinb added a commit that referenced this pull request Oct 31, 2025
## Summary

Cherry-picks 5 critical fixes from main to the release-0.3.x branch for
the v0.3.1 release, plus CI workflow updates.

**Note**: This recreates the cherry-picks from the closed PR #3991, now
targeting the renamed `release-0.3.x` branch (previously
`release-0.3.x-maint`).

## Commits

1. **2c56a8560** - fix(context): prevent provider data leak between streaming requests (#3924)
   - **CRITICAL SECURITY FIX**: Prevents provider credentials from leaking between requests
   - Fixed import path for 0.3.0 compatibility

2. **ddd32b187** - fix(inference): enable routing of models with provider_data alone (#3928)
   - Enables routing for fully qualified model IDs with provider_data
   - Resolved merge conflicts, adapted for 0.3.0 structure

3. **f7c2973aa** - fix: Avoid BadRequestError due to invalid max_tokens (#3667)
   - Fixes failures with Gemini and other providers that reject max_tokens=0
   - Non-breaking API change

4. **d7f9da616** - fix(responses): sync conversation before yielding terminal events in streaming (#3888)
   - Ensures conversation sync executes even when streaming consumers break early

5. **0ffa8658b** - fix(logging): ensure logs go to stderr, loggers obey levels (#3885)
   - Fixes logging infrastructure

6. **75b49cb3c** - ci: support release branches and match client branch (#3990)
   - Updates CI workflows to support release-X.Y.x branches
   - Matches client branch from llama-stack-client-python for release testing
   - Fixes artifact name collisions

## Adaptations for 0.3.0

- Fixed import paths: `llama_stack.core.telemetry.tracing` → `llama_stack.providers.utils.telemetry.tracing`
- Fixed import paths: `llama_stack.core.telemetry.telemetry` → `llama_stack.apis.telemetry`
- Changed `self.telemetry_enabled` → `self.telemetry` (0.3.0 attribute name)
- Removed `rerank()` method that doesn't exist in 0.3.0

## Testing

All imports verified and tests should pass once CI is set up.