fix(tests): fix flaky tests to accept string or number #2319

Merged
lizradway merged 1 commit into strands-agents:main from lizradway:four on May 22, 2026

@lizradway lizradway (Member) commented May 22, 2026

Description

Fix flaky assertion in test_chat_completions_agent_invoke — the model sometimes responds with the word "four" instead of the digit "4" when asked "What is 2+2?". The assertion now accepts either form.
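
A minimal sketch of the widened check described above. The PR's actual diff is not shown on this page, so the helper name `mentions_four` and the sample replies are hypothetical; the only thing taken from the description is that both "4" and "four" should pass.

```python
def mentions_four(reply: str) -> bool:
    """Return True if the reply contains the digit "4" or the word "four".

    Hypothetical helper for illustration; not taken from the PR's diff.
    The comparison is case-insensitive so "Four" is accepted as well.
    """
    lowered = reply.lower()
    return "4" in lowered or "four" in lowered


# Both phrasings of the answer to "What is 2+2?" satisfy the check.
assert mentions_four("2 + 2 = 4")
assert mentions_four("The answer is Four.")
# An unrelated reply still fails, so the test keeps its value.
assert not mentions_four("I cannot help with that.")
```

This is a plain substring check, so a reply such as "14" would also pass; it tolerates phrasing differences rather than verifying the arithmetic strictly.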

Related Issues

N/A

Documentation PR

N/A

Type of Change

Bug fix

Testing

How have you tested the change? Verify that the changes do not break functionality or introduce warnings in consuming repositories: agents-docs, agents-tools, agents-cli

  • I ran hatch run prepare

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@lizradway lizradway deployed to manual-approval May 22, 2026 19:10 — with GitHub Actions Active
@lizradway lizradway changed the title from "fix(tests): update bedrock to accept string or number" to "fix(tests): fix flaky tests to accept string or number" May 22, 2026
@lizradway lizradway marked this pull request as ready for review May 22, 2026 19:12
@github-actions github-actions Bot added size/xs and removed size/xs labels May 22, 2026
@github-actions

Assessment: Comment

The fix is reasonable — LLMs are non-deterministic and can respond with either "4" or "four" to a math question. One missed instance of the same pattern exists in test_reasoning_content_multi_turn (line 72) that should also be updated for consistency.

codecov Bot commented May 22, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

@lizradway lizradway merged commit f6c3b57 into strands-agents:main May 22, 2026
19 of 21 checks passed
@lizradway lizradway deleted the four branch May 22, 2026 19:22
@notowen333
Contributor

I also wonder whether we should follow up with a scan for similar assertions that should be widened.
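
One way such a scan could look, as a rough sketch: a regular expression that flags `assert "<digits>" in ...` lines as candidates for manual review. The function name, the pattern, and the sample source are assumptions made for illustration; nothing here comes from the repository.

```python
import re

# Matches lines such as: assert "4" in text   or   assert '12' in reply
# (a quoted all-digit literal used as the left side of a membership check).
DIGIT_ASSERT = re.compile(r"""assert\s+(['"])\d+\1\s+in\b""")


def find_digit_only_assertions(source: str) -> list[tuple[int, str]]:
    """Return (line number, stripped line) for each candidate assertion."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if DIGIT_ASSERT.search(line):
            hits.append((lineno, line.strip()))
    return hits


# Hypothetical test source used only to exercise the scanner.
sample = (
    "def test_example():\n"
    '    assert "4" in text\n'
    '    assert "ok" in text\n'
    "    assert '12' in reply\n"
)
candidates = find_digit_only_assertions(sample)
```

To cover a whole test tree, the function could be applied to the text of each file found with `pathlib.Path(...).rglob("*.py")`. The pattern is a heuristic: it also flags assertions that have already been widened on the same line, so the results are a review list rather than a list of confirmed bugs.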
