feat(platform): add designer service for AI document transformation#759
feat(platform): add designer service for AI document transformation#759yannickmonney wants to merge 1 commit into
Conversation
New `services/designer` FastAPI service that accepts a prompt + DOCX and returns a transformed DOCX. The pipeline uses pandoc for DOCX ↔ Markdown conversion, the Pencil MCP server binary (bundled in the tarballs, no desktop app or Xvfb required) as the design tool, and an OpenAI-compatible model (configured via OPENAI_DESIGN_MODEL / OPENAI_BASE_URL) driving an agentic tool-use feedback loop against the Pencil MCP tools. - POST /api/v1/transform: multipart prompt + docx → docx - Pencil tarballs tracked via Git LFS (x64 + arm64) - Dockerfile: multi-arch, pandoc + Electron shared libs + Pencil MCP binary - docker-entrypoint.sh: auto-detects runtime arch for MCP server path - Graceful fallback when Pencil MCP is unavailable (content-only edit) - Port 8005, DESIGNER_ env prefix
There was a problem hiding this comment.
Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.
📝 WalkthroughWalkthroughThis pull request introduces a new Tale Designer service as a monorepo workspace package. The service implements an AI-powered document transformation pipeline built on FastAPI that accepts DOCX files with user prompts and returns transformed DOCX files. The implementation includes: a Python 3.11-based Docker container with Pencil MCP integration, configuration management via environment variables, service modules for orchestrating AI-assisted transformations (via OpenAI), converting documents between DOCX and Markdown formats (via pandoc), and managing Pencil MCP server sessions. The architecture includes routers for health checks and document transformation endpoints, comprehensive test coverage, and supporting infrastructure files for packaging and deployment. Estimated code review effort🎯 4 (Complex) | ⏱️ ~55 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 17
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@services/designer/app/config.py`:
- Around line 35-44: The get_design_model function currently reads directly from
os.environ and the BaseServiceSettings defers validation of required config
(e.g., openai_api_key) until runtime; update BaseServiceSettings to perform
fail-fast validation by adding Pydantic validators or a model_post_init that
asserts required fields (openai_api_key, fast_model/defaults) are present and
valid, and change get_design_model to consult the validated settings attributes
(not os.environ) and to fallback to self.get_fast_model() only after validation;
ensure references to get_openai_api_key(), get_fast_model(), and
get_design_model() rely on those validated fields so initialization raises clear
errors on missing config.
In `@services/designer/app/main.py`:
- Around line 45-51: The CORS middleware configuration in app.add_middleware
(using CORSMiddleware and settings.get_allowed_origins_list()) is overly
permissive; replace allow_methods=["*"] and allow_headers=["*"] with explicit,
minimal lists of allowed HTTP methods and headers required by the client (e.g.,
["GET","POST","PUT","DELETE"] and the specific headers your frontend uses), and
conditionally tighten these values for production (detect environment via your
settings or an is_production flag) so production uses the restrictive lists
while development can remain more permissive.
- Around line 29-32: Move the shutdown log so it runs before performing shutdown
operations: call logger.info("Shutting down Tale Designer service...") prior to
awaiting pencil_service.shutdown() and calling shutdown_telemetry(); update the
sequence around pencil_service.shutdown() and shutdown_telemetry() so the
informative log (logger.info) executes first to improve observability.
In `@services/designer/app/routers/transform.py`:
- Line 37: The log statement in transform.py currently logs the full prompt
(logger.info("Transform request: prompt={!r}, file={!r}, size={}B", prompt,
docx.filename, len(docx_bytes))), which can leak sensitive data; update this to
redact or truncate prompt before logging (e.g., produce a safe_preview from
prompt by truncating to a fixed length and/or masking sensitive parts) and log
safe_preview instead of prompt while keeping docx.filename and len(docx_bytes)
unchanged; implement the truncation/masking in a small helper near the transform
handler and use that helper when calling logger.info.
- Line 44: The Content-Disposition header uses user-controlled docx.filename
directly, risking header injection; before building headers (the line that sets
headers={"Content-Disposition": f'attachment; filename="{docx.filename}"'}),
sanitize or RFC5987-encode docx.filename: strip/replace CR, LF, and
quote/backslash characters (or percent-encode with urllib.parse.quote and send
as filename*), then use the sanitized/encoded value in the header (or include
filename* alongside a safe fallback) so the header cannot be broken by malicious
characters.
In `@services/designer/app/services/agent_service.py`:
- Around line 116-117: The code accesses response.choices[0] without guarding
against an empty list which can raise IndexError; in the function in
agent_service.py where you build messages (the lines assigning message =
response.choices[0].message and messages.append(message)), first verify that
response.choices exists and len(response.choices) > 0 (or use truthiness) before
indexing; if empty, handle gracefully—e.g., log a warning via the existing
logger, skip appending/continue, or use a safe fallback message—so the function
does not crash on an empty API choices list.
- Around line 121-134: The JSON parsing of tool arguments is unguarded and can
raise json.JSONDecodeError, so wrap the json.loads call for
tool_call.function.arguments within the same try/except that handles
session.call_tool (or add a specific except json.JSONDecodeError) to prevent
crashing the agent loop; specifically, update the block handling
message.tool_calls (references: message.tool_calls, tool_call, tool_name,
tool_args, session.call_tool) so that malformed JSON sets tool_result_content to
a clear error string (e.g., "Error: invalid tool arguments") and logs the
parsing error via logger.warning before skipping or avoiding the
session.call_tool invocation.
- Around line 56-63: Move the inline import to the module level: remove "import
re" from inside _extract_final_markdown and add a single "import re" at the top
of the file so the function uses the module-level re; keep the regex and logic
in _extract_final_markdown unchanged to preserve behavior (function name:
_extract_final_markdown, regex: r"```markdown\s*(.*?)\s*```", flag: re.DOTALL).
In `@services/designer/app/services/pandoc_service.py`:
- Around line 51-59: The markdown_to_docx function duplicates subprocess
creation and error handling already present in _run_pandoc; extend _run_pandoc
to accept an optional stdin argument (e.g., stdin_bytes: Optional[bytes] = None)
and an optional return_bytes flag, feed stdin_bytes to proc.communicate when
provided, and keep the same error handling (raise RuntimeError including
proc.returncode and decoded stderr) so all pandoc invocation logic is
centralized; then update markdown_to_docx to call _run_pandoc(args,
stdin_bytes=markdown.encode()) and use the returned bytes instead of creating
its own subprocess.
In `@services/designer/app/services/pencil_service.py`:
- Around line 28-37: The initialize method is declared async but performs only
synchronous file checks; change async def initialize(self) -> None to a
synchronous def initialize(self) -> None, keep the internal logic that uses
Path(settings.pencil_mcp_server_path), logger, self._mcp_server_path and
self._initialized unchanged, and update any callers of PencilService.initialize
to remove awaiting (i.e., call initialize() instead of await initialize()) so
call sites and type hints align with the new synchronous signature.
- Around line 66-69: Nested async context managers in pencil_service.py should
be combined into a single statement: replace the two-level "async with
stdio_client(params) as (read, write): async with ClientSession(read, write) as
session:" block with one "async with stdio_client(params) as (read, write),
ClientSession(read, write) as session:" form, then call await
session.initialize() and yield session as before; this removes the nested
context and satisfies SIM117 while preserving the use of stdio_client and
ClientSession.
- Around line 39-53: The create_session method currently declares a return type
of tuple[ClientSession, object] and a docstring saying it returns (session,
context_manager) but actually returns a StdioServerParameters instance; update
the signature to return StdioServerParameters (or the correct concrete type used
elsewhere), and revise the docstring to describe that it returns
StdioServerParameters (or the server params) and under what conditions it raises
RuntimeError; ensure references to _initialized and _mcp_server_path remain and
that the method description matches the actual behavior of create_session and
the returned StdioServerParameters value.
In `@services/designer/app/services/transform_service.py`:
- Around line 101-105: The code assumes response.choices[0] always exists which
can raise IndexError; update the block that sets content to first validate
response.choices and the first choice's message before accessing .content (e.g.,
if not response.choices or not getattr(response.choices[0], "message", None):
set content = "" or log a warning and return early), then call
_extract_final_markdown(content) as before; reference the symbols
response.choices, response.choices[0].message.content, and
_extract_final_markdown to locate where to add the defensive checks.
- Around line 66-73: The code uses an async generator get_pencil_session() as an
async-for that yields exactly once, which is confusing; refactor
get_pencil_session to be an async context manager (use `@asynccontextmanager`) so
callers can use "async with get_pencil_session() as session:" in
transform_service.run where agent_service.run(...) is called, or alternatively
update transform_service to explicitly document and handle the one-yield
generator; specifically change get_pencil_session to return a context manager
and replace the "async for session in get_pencil_session()" in transform_service
with "async with get_pencil_session() as session:" before calling
agent_service.run(prompt=..., markdown=..., pen_file_path=..., session=session).
In `@services/designer/Dockerfile`:
- Around line 15-36: The apt-get install command in the RUN instruction of the
Dockerfile installs recommended packages, inflating image size; modify the RUN
line that starts with "apt-get update && apt-get install -y \ curl \ pandoc \
..." to add the --no-install-recommends flag to apt-get install so only
essential packages are installed, then keep the existing cleanup (rm -rf
/var/lib/apt/lists/*) as-is to minimize final image size.
- Around line 102-103: Remove the HEALTHCHECK instruction from the Dockerfile
(the line starting with HEALTHCHECK --interval=30s ... CMD curl -f
http://localhost:8005/health || exit 1) and instead add a designer service entry
to compose.yml that mirrors other services' health check patterns: expose or map
port 8005, include a healthcheck section that uses curl -f
http://localhost:8005/health (with interval 30s, timeout 10s, start_period 30s,
retries 3) and the same service name used elsewhere ("designer"), ensuring the
compose healthcheck configuration matches team practice and remove the
Dockerfile HEALTHCHECK to avoid duplication.
In `@services/designer/pyproject.toml`:
- Around line 9-21: The pyproject is missing the runtime dependency for the
package imported by app/config.py; add "tale-shared" to the project dependencies
so the import from tale_shared.config (BaseServiceSettings) resolves at
runtime—update the dependencies array (the same place as "tale-telemetry") to
include an appropriate version specifier for tale-shared (e.g.,
"tale-shared>=<min-version>") so that the module is installed when the service
is built.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 1eb83d49-4502-44b9-9f44-7acae3be9d85
⛔ Files ignored due to path filters (2)
services/designer/Pencil-linux-arm64.taris excluded by!**/*.tarservices/designer/Pencil-linux-x64.taris excluded by!**/*.tar
📒 Files selected for processing (22)
.gitattributespackage.jsonservices/designer/Dockerfileservices/designer/app/__init__.pyservices/designer/app/config.pyservices/designer/app/main.pyservices/designer/app/models.pyservices/designer/app/routers/__init__.pyservices/designer/app/routers/health.pyservices/designer/app/routers/transform.pyservices/designer/app/services/__init__.pyservices/designer/app/services/agent_service.pyservices/designer/app/services/pandoc_service.pyservices/designer/app/services/pencil_service.pyservices/designer/app/services/transform_service.pyservices/designer/app/utils/__init__.pyservices/designer/app/utils/temp_files.pyservices/designer/docker-entrypoint.shservices/designer/package.jsonservices/designer/pyproject.tomlservices/designer/tests/__init__.pyservices/designer/tests/test_transform.py
| def get_design_model(self) -> str: | ||
| """Get the design agent model. | ||
|
|
||
| Checks OPENAI_DESIGN_MODEL first, then DESIGNER_DESIGN_MODEL, | ||
| then falls back to OPENAI_FAST_MODEL. | ||
| """ | ||
| model = os.environ.get("OPENAI_DESIGN_MODEL") or os.environ.get("DESIGNER_DESIGN_MODEL") | ||
| if model: | ||
| return model | ||
| return self.get_fast_model() |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if BaseServiceSettings validates OPENAI_API_KEY
ast-grep --pattern 'class BaseServiceSettings {
$$$
}'
rg -n "OPENAI_API_KEY|openai_api_key" packages/tale_shared/Repository: tale-project/tale
Length of output: 508
🏁 Script executed:
#!/bin/bash
# Get full BaseServiceSettings implementation
head -60 packages/tale_shared/src/tale_shared/config/base.py
# Search for get_openai_api_key() calls in designer service
rg -n "get_openai_api_key" services/designer/
# Check if there are model_post_init or validators in BaseServiceSettings
rg -n "model_post_init|@field_validator|@validator" packages/tale_shared/src/tale_shared/config/base.pyRepository: tale-project/tale
Length of output: 2495
🏁 Script executed:
#!/bin/bash
# Check the full get_design_model and related code in designer config.py
cat -n services/designer/app/config.py | head -60
# Check where get_design_model() is called
rg -n "get_design_model" services/designer/Repository: tale-project/tale
Length of output: 2327
Add Pydantic validators to BaseServiceSettings for fail-fast validation of required config.
The get_design_model() method uses os.environ.get() directly (line 41), bypassing pydantic-settings validation. More importantly, the pattern in BaseServiceSettings—where openai_api_key is an optional field with validation deferred to method invocation—means errors surface at runtime when services call get_openai_api_key() or get_fast_model(), not at startup.
To align with project conventions (fail-fast validation at service startup), BaseServiceSettings should validate required configuration via model_post_init or Pydantic field validators, ensuring deployment issues are caught immediately on service initialization rather than at first use.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@services/designer/app/config.py` around lines 35 - 44, The get_design_model
function currently reads directly from os.environ and the BaseServiceSettings
defers validation of required config (e.g., openai_api_key) until runtime;
update BaseServiceSettings to perform fail-fast validation by adding Pydantic
validators or a model_post_init that asserts required fields (openai_api_key,
fast_model/defaults) are present and valid, and change get_design_model to
consult the validated settings attributes (not os.environ) and to fallback to
self.get_fast_model() only after validation; ensure references to
get_openai_api_key(), get_fast_model(), and get_design_model() rely on those
validated fields so initialization raises clear errors on missing config.
|
|
||
| await pencil_service.shutdown() | ||
| shutdown_telemetry() | ||
| logger.info("Shutting down Tale Designer service...") |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Shutdown log message should precede actual shutdown operations.
The "Shutting down" log appears after shutdown_telemetry() completes. Consider logging before shutdown operations for better observability.
♻️ Proposed fix
await pencil_service.shutdown()
+ logger.info("Shutting down Tale Designer service...")
shutdown_telemetry()
- logger.info("Shutting down Tale Designer service...")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| await pencil_service.shutdown() | |
| shutdown_telemetry() | |
| logger.info("Shutting down Tale Designer service...") | |
| await pencil_service.shutdown() | |
| logger.info("Shutting down Tale Designer service...") | |
| shutdown_telemetry() |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@services/designer/app/main.py` around lines 29 - 32, Move the shutdown log so
it runs before performing shutdown operations: call logger.info("Shutting down
Tale Designer service...") prior to awaiting pencil_service.shutdown() and
calling shutdown_telemetry(); update the sequence around
pencil_service.shutdown() and shutdown_telemetry() so the informative log
(logger.info) executes first to improve observability.
| app.add_middleware( | ||
| CORSMiddleware, | ||
| allow_origins=settings.get_allowed_origins_list(), | ||
| allow_credentials=True, | ||
| allow_methods=["*"], | ||
| allow_headers=["*"], | ||
| ) |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Consider restricting CORS for production environments.
allow_methods=["*"] and allow_headers=["*"] are permissive. If this service handles sensitive document data, consider restricting to only the required methods and headers.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@services/designer/app/main.py` around lines 45 - 51, The CORS middleware
configuration in app.add_middleware (using CORSMiddleware and
settings.get_allowed_origins_list()) is overly permissive; replace
allow_methods=["*"] and allow_headers=["*"] with explicit, minimal lists of
allowed HTTP methods and headers required by the client (e.g.,
["GET","POST","PUT","DELETE"] and the specific headers your frontend uses), and
conditionally tighten these values for production (detect environment via your
settings or an is_production flag) so production uses the restrictive lists
while development can remain more permissive.
| detail="Uploaded file is empty", | ||
| ) | ||
|
|
||
| logger.info("Transform request: prompt={!r}, file={!r}, size={}B", prompt, docx.filename, len(docx_bytes)) |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Consider redacting or truncating prompt in logs.
Logging the full prompt may expose sensitive user data. Consider truncating or summarizing it.
♻️ Proposed fix
- logger.info("Transform request: prompt={!r}, file={!r}, size={}B", prompt, docx.filename, len(docx_bytes))
+ logger.info("Transform request: prompt_len={}, file={!r}, size={}B", len(prompt), docx.filename, len(docx_bytes))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@services/designer/app/routers/transform.py` at line 37, The log statement in
transform.py currently logs the full prompt (logger.info("Transform request:
prompt={!r}, file={!r}, size={}B", prompt, docx.filename, len(docx_bytes))),
which can leak sensitive data; update this to redact or truncate prompt before
logging (e.g., produce a safe_preview from prompt by truncating to a fixed
length and/or masking sensitive parts) and log safe_preview instead of prompt
while keeping docx.filename and len(docx_bytes) unchanged; implement the
truncation/masking in a small helper near the transform handler and use that
helper when calling logger.info.
| return Response( | ||
| content=result_bytes, | ||
| media_type=DOCX_CONTENT_TYPE, | ||
| headers={"Content-Disposition": f'attachment; filename="{docx.filename}"'}, |
There was a problem hiding this comment.
Sanitize filename in Content-Disposition header to prevent header injection.
User-controlled filenames could contain characters like ", \n, or \r that may lead to HTTP header injection. Sanitize or encode the filename.
🛡️ Proposed fix
+import re
+
+def _sanitize_filename(filename: str) -> str:
+ """Remove or replace characters that could cause header injection."""
+ # Remove control characters and quotes
+ return re.sub(r'[\x00-\x1f\x7f"\\]', '_', filename)
+
+
`@router.post`("/transform")
async def transform(
...
) -> Response:
...
return Response(
content=result_bytes,
media_type=DOCX_CONTENT_TYPE,
- headers={"Content-Disposition": f'attachment; filename="{docx.filename}"'},
+ headers={"Content-Disposition": f'attachment; filename="{_sanitize_filename(docx.filename)}"'},
)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@services/designer/app/routers/transform.py` at line 44, The
Content-Disposition header uses user-controlled docx.filename directly, risking
header injection; before building headers (the line that sets
headers={"Content-Disposition": f'attachment; filename="{docx.filename}"'}),
sanitize or RFC5987-encode docx.filename: strip/replace CR, LF, and
quote/backslash characters (or percent-encode with urllib.parse.quote and send
as filename*), then use the sanitized/encoded value in the header (or include
filename* alongside a safe fallback) so the header cannot be broken by malicious
characters.
| try: | ||
| async for session in get_pencil_session(): | ||
| return await agent_service.run( | ||
| prompt=prompt, | ||
| markdown=markdown, | ||
| pen_file_path=pen_file_path, | ||
| session=session, | ||
| ) |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Unusual pattern: async for with a single-yield async generator.
get_pencil_session() is an async generator designed to yield exactly once (acting like a context manager). Using async for works but is semantically confusing — the loop body executes once, returns early, and the generator cleanup runs. Consider converting get_pencil_session to a proper async context manager using @asynccontextmanager, or document this usage pattern clearly.
♻️ Proposed refactor using `@asynccontextmanager`
In pencil_service.py:
+from contextlib import asynccontextmanager
+
+@asynccontextmanager
async def get_pencil_session():
"""Async context manager that yields a live Pencil MCP ClientSession."""
params = await pencil_service.create_session()
async with stdio_client(params) as (read, write), ClientSession(read, write) as session:
await session.initialize()
yield sessionThen in transform_service.py:
- try:
- async for session in get_pencil_session():
- return await agent_service.run(
- prompt=prompt,
- markdown=markdown,
- pen_file_path=pen_file_path,
- session=session,
- )
+ try:
+ async with get_pencil_session() as session:
+ return await agent_service.run(
+ prompt=prompt,
+ markdown=markdown,
+ pen_file_path=pen_file_path,
+ session=session,
+ )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@services/designer/app/services/transform_service.py` around lines 66 - 73,
The code uses an async generator get_pencil_session() as an async-for that
yields exactly once, which is confusing; refactor get_pencil_session to be an
async context manager (use `@asynccontextmanager`) so callers can use "async with
get_pencil_session() as session:" in transform_service.run where
agent_service.run(...) is called, or alternatively update transform_service to
explicitly document and handle the one-yield generator; specifically change
get_pencil_session to return a context manager and replace the "async for
session in get_pencil_session()" in transform_service with "async with
get_pencil_session() as session:" before calling agent_service.run(prompt=...,
markdown=..., pen_file_path=..., session=session).
| ) | ||
| content = response.choices[0].message.content or "" | ||
| from .agent_service import _extract_final_markdown | ||
|
|
||
| return _extract_final_markdown(content) or content |
There was a problem hiding this comment.
Potential index error if choices is empty.
While rare, if the API returns an empty choices list, accessing choices[0] would raise an IndexError. Consider adding a defensive check.
🛡️ Proposed defensive check
- content = response.choices[0].message.content or ""
+ if not response.choices:
+ logger.warning("OpenAI returned empty choices; returning original markdown")
+ return markdown
+ content = response.choices[0].message.content or ""🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@services/designer/app/services/transform_service.py` around lines 101 - 105,
The code assumes response.choices[0] always exists which can raise IndexError;
update the block that sets content to first validate response.choices and the
first choice's message before accessing .content (e.g., if not response.choices
or not getattr(response.choices[0], "message", None): set content = "" or log a
warning and return early), then call _extract_final_markdown(content) as before;
reference the symbols response.choices, response.choices[0].message.content, and
_extract_final_markdown to locate where to add the defensive checks.
| RUN apt-get update && apt-get install -y \ | ||
| curl \ | ||
| pandoc \ | ||
| # Electron/Chromium shared libraries required by the Pencil MCP server binary | ||
| libgtk-3-0 \ | ||
| libgbm1 \ | ||
| libnspr4 \ | ||
| libnss3 \ | ||
| libasound2 \ | ||
| libxss1 \ | ||
| libxtst6 \ | ||
| libxrandr2 \ | ||
| libxcomposite1 \ | ||
| libxcursor1 \ | ||
| libxdamage1 \ | ||
| libxi6 \ | ||
| libx11-xcb1 \ | ||
| libxkbcommon0 \ | ||
| libdrm2 \ | ||
| libglib2.0-0 \ | ||
| libdbus-1-3 \ | ||
| && rm -rf /var/lib/apt/lists/* |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Add --no-install-recommends to reduce image size.
This prevents installation of suggested packages that aren't strictly required, reducing the final image size.
♻️ Proposed fix
-RUN apt-get update && apt-get install -y \
+RUN apt-get update && apt-get install -y --no-install-recommends \
curl \
pandoc \📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| RUN apt-get update && apt-get install -y \ | |
| curl \ | |
| pandoc \ | |
| # Electron/Chromium shared libraries required by the Pencil MCP server binary | |
| libgtk-3-0 \ | |
| libgbm1 \ | |
| libnspr4 \ | |
| libnss3 \ | |
| libasound2 \ | |
| libxss1 \ | |
| libxtst6 \ | |
| libxrandr2 \ | |
| libxcomposite1 \ | |
| libxcursor1 \ | |
| libxdamage1 \ | |
| libxi6 \ | |
| libx11-xcb1 \ | |
| libxkbcommon0 \ | |
| libdrm2 \ | |
| libglib2.0-0 \ | |
| libdbus-1-3 \ | |
| && rm -rf /var/lib/apt/lists/* | |
| RUN apt-get update && apt-get install -y --no-install-recommends \ | |
| curl \ | |
| pandoc \ | |
| # Electron/Chromium shared libraries required by the Pencil MCP server binary | |
| libgtk-3-0 \ | |
| libgbm1 \ | |
| libnspr4 \ | |
| libnss3 \ | |
| libasound2 \ | |
| libxss1 \ | |
| libxtst6 \ | |
| libxrandr2 \ | |
| libxcomposite1 \ | |
| libxcursor1 \ | |
| libxdamage1 \ | |
| libxi6 \ | |
| libx11-xcb1 \ | |
| libxkbcommon0 \ | |
| libdrm2 \ | |
| libglib2.0-0 \ | |
| libdbus-1-3 \ | |
| && rm -rf /var/lib/apt/lists/* |
🧰 Tools
🪛 Hadolint (2.14.0)
[warning] 15-15: Pin versions in apt get install. Instead of apt-get install <package> use apt-get install <package>=<version>
(DL3008)
[info] 15-15: Avoid additional packages by specifying --no-install-recommends
(DL3015)
🪛 Trivy (0.69.3)
[error] 15-36: 'apt-get' missing '--no-install-recommends'
'--no-install-recommends' flag is missed: 'apt-get update && apt-get install -y curl pandoc libgtk-3-0 libgbm1 libnspr4 libnss3 libasound2 libxss1 libxtst6 libxrandr2 libxcomposite1 libxcursor1 libxdamage1 libxi6 libx11-xcb1 libxkbcommon0 libdrm2 libglib2.0-0 libdbus-1-3 && rm -rf /var/lib/apt/lists/*'
Rule: DS-0029
(IaC/Dockerfile)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@services/designer/Dockerfile` around lines 15 - 36, The apt-get install
command in the RUN instruction of the Dockerfile installs recommended packages,
inflating image size; modify the RUN line that starts with "apt-get update &&
apt-get install -y \ curl \ pandoc \ ..." to add the --no-install-recommends
flag to apt-get install so only essential packages are installed, then keep the
existing cleanup (rm -rf /var/lib/apt/lists/*) as-is to minimize final image
size.
| HEALTHCHECK --interval=30s --timeout=10s --start-period=30s --retries=3 \ | ||
| CMD curl -f http://localhost:8005/health || exit 1 |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if docker-compose files exist and contain health check configuration
fd -t f -e yml -e yaml . | xargs -I {} sh -c 'echo "=== {} ===" && grep -l "healthcheck" {} 2>/dev/null || true'Repository: tale-project/tale
Length of output: 92
🏁 Script executed:
# Read compose.yml to check designer service health check configuration
cat -n compose.yml | grep -A 20 -B 5 "designer"Repository: tale-project/tale
Length of output: 43
🏁 Script executed:
# Also check if healthcheck is defined for designer service in compose.yml
rg -A 10 "designer" compose.yml | head -50Repository: tale-project/tale
Length of output: 43
🏁 Script executed:
# Find compose.yml and related docker compose files
fd -name "compose*" -o -name "docker-compose*"Repository: tale-project/tale
Length of output: 291
🏁 Script executed:
# Check the structure of services directory
ls -la services/Repository: tale-project/tale
Length of output: 562
🏁 Script executed:
# Read the compose.yml file
head -100 compose.ymlRepository: tale-project/tale
Length of output: 3130
🏁 Script executed:
# Find the designer service section in compose.yml
grep -n -A 30 "designer:" compose.ymlRepository: tale-project/tale
Length of output: 43
🏁 Script executed:
# Also check the Dockerfile to see the full context
head -110 services/designer/Dockerfile | tail -15Repository: tale-project/tale
Length of output: 463
🏁 Script executed:
# Extract all service names from compose.yml
grep "^ [a-z]" compose.yml | grep -v "^ " | cut -d: -f1Repository: tale-project/tale
Length of output: 221
🏁 Script executed:
# Get the full length and check for designer service anywhere in the file
wc -l compose.yml
grep -i designer compose.yml || echo "No designer found"Repository: tale-project/tale
Length of output: 94
🏁 Script executed:
# Search for any orchestration files that might define designer service
find . -type f \( -name "*deploy*" -o -name "*kube*" -o -name "*k8s*" -o -name "*.yaml" -o -name "*.yml" \) | grep -v node_modules | head -20Repository: tale-project/tale
Length of output: 421
🏁 Script executed:
# Check if designer is referenced in any compose or config files
rg "designer" -t yml -t yaml --max-count=5Repository: tale-project/tale
Length of output: 88
🏁 Script executed:
# Search for designer in yaml files with correct syntax
rg "designer" -t yaml --max-count=10Repository: tale-project/tale
Length of output: 43
🏁 Script executed:
# Also search in all files for designer service
rg "designer.*service\|service.*designer" --max-count=5Repository: tale-project/tale
Length of output: 43
🏁 Script executed:
# Check the deployment docs
grep -n -i "designer\|healthcheck" docs/zero-downtime-deployment.md | head -20Repository: tale-project/tale
Length of output: 43
🏁 Script executed:
# Search for any references to designer image or service across the codebase
rg "designer" --type rust --type ts --type js --max-count=10Repository: tale-project/tale
Length of output: 360
🏁 Script executed:
# Check if designer image is built anywhere
rg "designer.*image\|dockerfile.*designer\|services/designer" --max-count=10Repository: tale-project/tale
Length of output: 43
🏁 Script executed:
# Look at the services/designer/Dockerfile name - is it meant to be built?
ls -la services/designer/Repository: tale-project/tale
Length of output: 678
🏁 Script executed:
# Final check - see if designer is referenced as a service anywhere in the repo
rg "designer" compose.yml services/platform/convex/package.json .github/workflows/ 2>/dev/null | head -5Repository: tale-project/tale
Length of output: 43
🏁 Script executed:
# Check git status to see if this is a new file or modification
git log --oneline services/designer/Dockerfile | head -5Repository: tale-project/tale
Length of output: 136
Add designer service to compose.yml with health check configuration.
The designer service was recently added but is missing from compose.yml. Per team practice, health checks should be configured at the orchestration layer in compose.yml rather than in individual Dockerfiles. Add the designer service to compose.yml with the appropriate health check configuration, then remove the HEALTHCHECK instruction from the Dockerfile to maintain consistency with other services.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@services/designer/Dockerfile` around lines 102 - 103, Remove the HEALTHCHECK
instruction from the Dockerfile (the line starting with HEALTHCHECK
--interval=30s ... CMD curl -f http://localhost:8005/health || exit 1) and
instead add a designer service entry to compose.yml that mirrors other services'
health check patterns: expose or map port 8005, include a healthcheck section
that uses curl -f http://localhost:8005/health (with interval 30s, timeout 10s,
start_period 30s, retries 3) and the same service name used elsewhere
("designer"), ensuring the compose healthcheck configuration matches team
practice and remove the Dockerfile HEALTHCHECK to avoid duplication.
| dependencies = [ | ||
| "fastapi>=0.100.0", | ||
| "uvicorn[standard]>=0.20.0", | ||
| "pydantic>=2.0.0", | ||
| "pydantic-settings>=2.0.0", | ||
| "openai>=1.0.0", | ||
| "python-multipart>=0.0.20", | ||
| "python-dotenv>=1.2.0", | ||
| "loguru>=0.7.0", | ||
| "httpx>=0.28.0", | ||
| "mcp>=1.0.0", | ||
| "tale-telemetry", | ||
| ] |
There was a problem hiding this comment.
Missing tale-shared in dependencies.
tale-shared is defined in [tool.uv.sources] (Line 34) but not listed in [project.dependencies]. The app/config.py imports from tale_shared.config import BaseServiceSettings, which will cause an import error at runtime.
🐛 Proposed fix
dependencies = [
"fastapi>=0.100.0",
"uvicorn[standard]>=0.20.0",
"pydantic>=2.0.0",
"pydantic-settings>=2.0.0",
"openai>=1.0.0",
"python-multipart>=0.0.20",
"python-dotenv>=1.2.0",
"loguru>=0.7.0",
"httpx>=0.28.0",
"mcp>=1.0.0",
"tale-telemetry",
+ "tale-shared",
]Also applies to: 32-34
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@services/designer/pyproject.toml` around lines 9 - 21, The pyproject is
missing the runtime dependency for the package imported by app/config.py; add
"tale-shared" to the project dependencies so the import from tale_shared.config
(BaseServiceSettings) resolves at runtime—update the dependencies array (the
same place as "tale-telemetry") to include an appropriate version specifier for
tale-shared (e.g., "tale-shared>=<min-version>") so that the module is installed
when the service is built.
|
Found 5 test failures on Blacksmith runners: Failures
|
New
services/designerFastAPI service that accepts a prompt + DOCX and returns a transformed DOCX. The pipeline uses pandoc for DOCX ↔ Markdown conversion, the Pencil MCP server binary (bundled in the tarballs, no desktop app or Xvfb required) as the design tool, and an OpenAI-compatible model (configured via OPENAI_DESIGN_MODEL / OPENAI_BASE_URL) driving an agentic tool-use feedback loop against the Pencil MCP tools.Summary by CodeRabbit
New Features
Infrastructure