fix(crewai-crews): harden against LLM blocking calls at import time by jpr5 · Pull Request #3974 · CopilotKit/CopilotKit

jpr5 · 2026-04-16T19:43:27Z

Summary

CrewAI's ChatWithCrewFlow.__init__ (invoked from ag_ui_crewai.endpoint.add_crewai_crew_fastapi_endpoint at module import in ag-ui-crewai <= 0.1.5) makes synchronous blocking LLM calls via crewai.cli.crew_chat.generate_input_description_with_ai and generate_crew_description_with_ai. ANY LLM hiccup — aimock regression, OpenAI outage, network blip, DNS failure — crashes the Python process BEFORE uvicorn can bind its port, causing Railway/Kubernetes health checks to fail and deploys to roll back.

This was the direct cause of the crewai-crews Railway crash fixed server-side in #3971. That fix patched the aimock response schema, but the underlying fragility in upstream CrewAI / ag-ui-crewai remained — a future blip would crash us again.

This PR adds a defensive monkey-patch in agent_server.py that replaces both generator functions with static-string returns BEFORE ag_ui_crewai is imported. The AI-generated descriptions are only surfaced in the CrewAI chat UI (which the CopilotKit runtime does not use), so static defaults are functionally equivalent for our showcase.

Upstream issue filed: crewAIInc/crewAI#5510

The long-term fix is deferred construction in ag-ui-crewai, which has landed on ag-ui main but is not yet released. Remove this shim once ag-ui-crewai > 0.1.5 ships.

Why a monkey-patch and not lazy-init

add_crewai_crew_fastapi_endpoint is the entry point and internally constructs ChatWithCrewFlow(crew) synchronously in ag-ui-crewai <= 0.1.5. Deferring that call would require either vendoring the endpoint function or reimplementing it. The monkey-patch is two lines and removes cleanly when the upstream fix ships.
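For contrast, the lazy-init alternative mentioned above would look roughly like the sketch below. It is a generic illustration only: Deferred and build_flow are hypothetical names, the string "flow" stands in for a ChatWithCrewFlow instance, and this is not a description of how ag-ui-crewai implements deferred construction.

```python
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class Deferred(Generic[T]):
    """Builds an expensive object on first use instead of at import time."""

    def __init__(self, factory: Callable[[], T]) -> None:
        self._factory = factory
        self._value: Optional[T] = None
        self._built = False

    def get(self) -> T:
        # If the factory raises, _built stays False, so a later call retries
        # rather than leaving the process permanently broken.
        if not self._built:
            self._value = self._factory()
            self._built = True
        return self._value


calls = []


def build_flow() -> str:
    # Stand-in for ChatWithCrewFlow(crew=crew), which may call an LLM.
    calls.append("built")
    return "flow"


flow = Deferred(build_flow)  # safe at import time: nothing is constructed yet
```

Wiring something like this into the endpoint is what would require vendoring or reimplementing add_crewai_crew_fastapi_endpoint, which is why the PR opts for the smaller patch.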

Test plan

Verified locally via Docker build + run with an intentionally broken LLM endpoint (OPENAI_BASE_URL=http://invalid-host/v1):

Unhardened (negative control):

  File "/app/agent_server.py", line 27, in <module>
    add_crewai_crew_fastapi_endpoint(app, LatestAiDevelopment(), "/")
  File ".../ag_ui_crewai/endpoint.py", line 250, in add_crewai_crew_fastapi_endpoint
    add_crewai_flow_fastapi_endpoint(app, ChatWithCrewFlow(crew=crew), path)
  File ".../ag_ui_crewai/crews.py", line 56, in __init__
    self.crew_chat_inputs = crew_chat_generate_crew_chat_inputs(...)
  File ".../crewai/cli/crew_chat.py", line 387, in generate_crew_chat_inputs
    description = generate_input_description_with_ai(input_name, crew, chat_llm)
  File ".../crewai/cli/crew_chat.py", line 481, in generate_input_description_with_ai
    response = chat_llm.call(...)
APIError

Container exits with code 1, never binds a port.

Hardened (this PR):

INFO:     Started server process [7]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)

curl http://localhost:PORT/api/health -> {"status":"ok","integration":"crewai-crews","agent":"ok","timestamp":"..."} (HTTP 200).
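The curl check above can also be scripted. The sketch below polls a health URL until it returns HTTP 200 with a JSON body. wait_for_health is a hypothetical helper, not part of the PR, and the caller supplies the real host, port, and path. Bypassing environment proxies is a design choice made here because the probe targets a local container.

```python
import json
import time
import urllib.error
import urllib.request

# An empty ProxyHandler disables proxy auto-detection, so a loopback probe
# is never routed through an HTTP proxy configured in the environment.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def wait_for_health(url: str, timeout_s: float = 30.0, interval_s: float = 0.5) -> dict:
    """Poll url until it answers HTTP 200 with a JSON body, or time out."""
    deadline = time.monotonic() + timeout_s
    last_error = None
    while time.monotonic() < deadline:
        try:
            with _opener.open(url, timeout=5) as resp:
                if resp.status == 200:
                    return json.loads(resp.read().decode("utf-8"))
        except (urllib.error.URLError, OSError, ValueError) as exc:
            # Server not up yet, connection refused, or body was not JSON.
            last_error = exc
        time.sleep(interval_s)
    raise TimeoutError(f"{url} not healthy after {timeout_s}s: {last_error}")
```

Against the unhardened image this would be expected to time out, since the process exits before binding a port; against the hardened image it should return the parsed health payload.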

Checklist

Docker build succeeds locally
Unhardened build crashes on import with broken LLM endpoint (negative control)
Hardened build starts cleanly with broken LLM endpoint and responds 200 on /api/health
Upstream issue filed on crewAIInc/crewAI

…ings

CrewAI's ChatWithCrewFlow.__init__ (called by ag_ui_crewai's add_crewai_crew_fastapi_endpoint at module import) makes blocking synchronous LLM calls via generate_input_description_with_ai and generate_crew_description_with_ai in crewai/cli/crew_chat.py. Any LLM hiccup (aimock regression, OpenAI outage, network blip) crashes the Python process before uvicorn can bind, causing Railway healthcheck failure and deploy rollback.

Patch both functions to return static strings before ag_ui_crewai is imported. The AI-generated descriptions are only used by the CrewAI chat UI (not the CopilotKit runtime), so static defaults are functionally equivalent for our showcase.

Verified via docker build: unhardened image crashes on import with APIError at crew_chat.py:481; hardened image starts cleanly with an invalid OPENAI_BASE_URL and responds on /api/health.

Upstream fix (deferred construction) landed on ag-ui main but has not yet shipped in a release newer than ag-ui-crewai 0.1.5. Remove shim when released.

Upstream issue: crewAIInc/crewAI#5510

…ai ceiling

Addresses CR R1 CRITICAL finding on PR #3974. setattr() on a Python module always succeeds regardless of prior attribute existence. Without a guard, an upstream rename of generate_input_description_with_ai or generate_crew_description_with_ai in a future crewai release would silently no-op the patch, leaving the real functions in place. The pre-bind LLM crash bug would quietly reappear in production with a green PR.

Changes:
- Add hasattr() guard in both agent_server.py files that raises RuntimeError with an actionable drift message if either symbol disappears upstream.
- Add post-assignment assert to defend against import-order weirdness or module re-imports shadowing the reference.
- Add an info-level log line so operators can see the shim is active and know to remove it after adoption.
- Add upstream issue link (crewAIInc/crewAI#5510) and explicit ag-ui-crewai release status to the comment block.
- Pin ag-ui-crewai upper bound to <0.1.6 in both requirements.txt files so the shim's applicability window is enforced by pip: upgrading past 0.1.5 forces the engineer to confront the version mismatch and remove the shim.

Applied to both showcase/packages/crewai-crews/src/agent_server.py and showcase/starters/crewai-crews/agent_server.py to keep the demo and starter trees in sync.

Verified locally:
- Docker build succeeds (showcase/packages/crewai-crews).
- Hardened container starts cleanly with OPENAI_BASE_URL set to an unreachable host; /health returns 200 on both agent (8000) and Next.js (10000) ports.
- Negative case: deleting generate_input_description_with_ai from the installed crewai module inside the container and re-importing agent_server raises RuntimeError with the upstream-drift message, as expected.
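The hasattr() guard and log line described in this commit message can be sketched as below. This is a minimal illustration under stated assumptions, not the code in agent_server.py: apply_guarded_patch, the logger name, and the message wording are all hypothetical.

```python
import logging
import types

logger = logging.getLogger("crewai_shim")  # hypothetical logger name

PATCHED_NAMES = (
    "generate_input_description_with_ai",
    "generate_crew_description_with_ai",
)


def apply_guarded_patch(module: types.ModuleType, replacement) -> None:
    """Replace each target function, failing loudly if upstream renamed it."""
    # setattr() succeeds even when the name no longer exists, so check every
    # target first and refuse to patch anything if one has disappeared.
    missing = [name for name in PATCHED_NAMES if not hasattr(module, name)]
    if missing:
        raise RuntimeError(
            f"{module.__name__} is missing {missing}: upstream may have "
            "renamed these functions, so the import-time shim no longer applies."
        )
    for name in PATCHED_NAMES:
        setattr(module, name, replacement)
    # Visible reminder for operators that the shim is active.
    logger.info("import-time LLM shim active for %s", module.__name__)
```

Checking all names before patching any of them keeps the module untouched when the guard fires, which makes the failure easier to reason about.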

jpr5 requested review from marthakelly, mme, ranst91 and tylerslaton as code owners April 16, 2026 19:43

jpr5 mentioned this pull request Apr 16, 2026

ChatWithCrewFlow.__init__ makes blocking LLM call at module import, crashes containers on any LLM hiccup crewAIInc/crewAI#5510

Closed

fix: replace assert patch-verification with explicit raise (survives python -O)

b970853
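The reasoning behind this commit is that Python removes assert statements when run with -O, so a verification written as an assert would vanish in optimized mode. A minimal sketch of the explicit-raise form follows; verify_patch and its error message are hypothetical, not the PR's code.

```python
import types


def verify_patch(module: types.ModuleType, name: str, expected) -> None:
    """Confirm the patched attribute is the replacement that was installed."""
    # An `assert` here would be compiled out under `python -O`;
    # an explicit raise runs regardless of optimization flags.
    if getattr(module, name, None) is not expected:
        raise RuntimeError(
            f"patch verification failed for {module.__name__}.{name}"
        )
```

The identity check (is not) rather than equality makes sure the module attribute is the exact replacement object, which also catches a re-import that rebinds the original function.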

jpr5 merged commit 4eeb8a4 into main Apr 16, 2026
16 checks passed

jpr5 deleted the fix/crewai-import-time-hardening branch April 16, 2026 20:15

jpr5 merged 3 commits into main from fix/crewai-import-time-hardening