Skip to content

refactor: monorepo apps/packages worker stack#29

Merged
michaelmwu merged 3 commits into
mainfrom
michaelmwu/monorepo-compose-redis
Feb 20, 2026
Merged

refactor: monorepo apps/packages worker stack#29
michaelmwu merged 3 commits into
mainfrom
michaelmwu/monorepo-compose-redis

Conversation

@michaelmwu
Copy link
Copy Markdown
Member

@michaelmwu michaelmwu commented Feb 20, 2026

Description

Refactors the repo into a uv workspace monorepo with apps/discord_bot, apps/worker, and packages/shared using the five08 namespace layout under src/.
Adds Redis-backed worker API + consumer services, shared queue/logging/settings modules, and updates Docker/Compose for a single stack deployment.
Moves shared Espo/Kimai clients into packages/shared, updates imports/tests/scripts/CI/pre-commit/docs to the new paths, and cleans up stale path references.
Also aligns compose port wiring and removes worker-local Espo client duplication in favor of shared client usage.

Related Issue

N/A

How Has This Been Tested?

Ran ./scripts/format.sh, ./scripts/lint.sh, ./scripts/mypy.sh, and ./scripts/test.sh (193 passing tests).

Summary by CodeRabbit

  • New Features

    • Multi-service monorepo with Discord bot, worker API, and background job processing.
    • Webhook ingest API (health, generic webhook, EspoCRM, manual contact) that enqueues background jobs.
    • Resume/attachment processing and skills extraction (LLM + heuristic) with EspoCRM updates.
    • Docker & docker-compose support for containerized services.
  • Documentation

    • Rewritten README, development, and agent guides for monorepo workflows.
  • Tests

    • Added unit/integration tests covering worker API, processing, and shared queue.
  • Chores

    • New shared settings, packaging, and CI/lint/mypy scope updates; updated .env example and ignore rules.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Feb 20, 2026

Warning

Rate limit exceeded

@michaelmwu has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 6 minutes and 47 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📝 Walkthrough

Walkthrough

Restructures a single-bot repo into a three-component monorepo (discord_bot, worker, shared), adds Docker/docker-compose, workspace packaging, a webhook-ingest API, Redis/RQ job queue and consumer, document/skills extraction pipeline, shared clients/utilities, CI/tooling updates, and extensive documentation and tests.

Changes

Cohort / File(s) Summary
Project config & workspace
pyproject.toml, packages/shared/pyproject.toml, apps/discord_bot/pyproject.toml, apps/worker/pyproject.toml, .dockerignore, .env.example
Converted root project to a workspace-based monorepo; added per-app pyproject files; new .dockerignore; expanded example environment variables for bot, worker, and shared settings.
Container / compose
apps/discord_bot/Dockerfile, apps/worker/Dockerfile, docker-compose.yml
Added Dockerfiles for bot and worker and a docker-compose definition for redis, bot, worker-api, and worker-consumer services.
Shared package & utilities
packages/shared/src/five08/__init__.py, packages/shared/src/five08/clients/__init__.py, packages/shared/src/five08/settings.py, packages/shared/src/five08/queue.py, packages/shared/src/five08/logging.py
New shared package exposing settings, logging, Redis/RQ queue helpers, and client namespaces (espo, kimai).
Discord bot app
apps/discord_bot/src/five08/..., apps/discord_bot/Dockerfile, apps/discord_bot/pyproject.toml
Refactored imports to five08.discord_bot, switched config to use SharedSettings, added synchronous run entrypoint, and packaged the bot with a console script; Dockerfile updated for uv-based image.
Worker service
apps/worker/src/five08/worker/..., apps/worker/pyproject.toml
New worker service: aiohttp webhook API (health, ingest, espocrm, process-contact), RQ consumer entrypoint, config class, document processor, EspoCRM client, skills extractor (LLM + heuristic), job functions, and Pydantic models.
CI / tooling / scripts
.github/workflows/test.yml, .pre-commit-config.yaml, scripts/lint.sh, scripts/format.sh, scripts/mypy.sh
Expanded linting, formatting, mypy, and security scanning targets to include multiple service source trees and added dependencies for type-checking.
Docs & developer guides
README.md, DEVELOPMENT.md, AGENTS.md
Completely rewritten documentation to describe monorepo layout, service commands, docker-compose workflows, environment variables, and developer workflows.
Tests
tests/unit/*, tests/integration/*, tests/unit/test_shared_queue.py, tests/unit/test_worker_*
Updated import paths to new package namespaces; added unit tests for shared queue, worker API, models, processor, and integration tests for email monitor.

Sequence Diagram(s)

sequenceDiagram
    participant Client as External<br/>Webhook Client
    participant API as Worker<br/>API
    participant Queue as RQ<br/>Queue
    participant Redis as Redis
    participant Consumer as Worker<br/>Consumer
    participant Espo as EspoCRM<br/>API

    Client->>API: POST /webhooks/{source} (payload)
    API->>API: Validate X-Webhook-Secret (if set)
    API->>API: Validate payload (JSON / list)
    API->>Redis: Ensure connection
    API->>Queue: enqueue_job(process_webhook_event, source, payload)
    Queue->>Redis: Persist job
    API->>Client: 202 Accepted (job_id)
    Consumer->>Redis: Poll for jobs
    Redis->>Consumer: Return job
    Consumer->>Consumer: Execute process_webhook_event()
    Consumer->>Espo: (if relevant) call EspoCRM client to fetch/update contact
Loading
sequenceDiagram
    participant Consumer as Worker<br/>Consumer
    participant Processor as ContactSkills<br/>Processor
    participant CRM as EspoCRM<br/>Client
    participant DocProc as Document<br/>Processor
    participant SkillExt as Skills<br/>Extractor

    Consumer->>Processor: process_contact_skills_job(contact_id)
    Processor->>CRM: get_contact(contact_id)
    CRM->>Processor: ContactData (existing_skills)
    Processor->>CRM: get_contact_attachments(contact_id)
    CRM->>Processor: attachments list
    loop per resume attachment
        Processor->>CRM: download_attachment(id)
        CRM->>DocProc: file bytes
        DocProc->>Processor: extracted text
        Processor->>SkillExt: extract_skills(text)
        SkillExt->>Processor: ExtractedSkills
    end
    Processor->>CRM: update_contact_skills(contact_id, new_skills) (if any)
    Processor->>Consumer: return SkillsExtractionResult
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 A repo reborn from one into three,

Bot, worker, and shared, all hoppity-free.
Containers hum, queues bustle with cheer,
Resumes yield skills both far and near.
Hooray — five08 hops forward, bright and merry!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title accurately describes the primary change: a refactor to convert the codebase into a monorepo structure with separate apps and packages directories using a workspace layout.
Docstring Coverage ✅ Passed Docstring coverage is 82.72% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch michaelmwu/monorepo-compose-redis

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 16

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/integration/test_email_monitor.py (1)

43-47: ⚠️ Potential issue | 🟡 Minor

Vacuous isinstance assertion never fails.

isinstance(e, (imaplib.IMAP4.error, Exception)) is always True because every exception is an instance of Exception. The assertion provides zero signal.

🛠️ Suggested fix
-            assert isinstance(e, (imaplib.IMAP4.error, Exception))
+            assert isinstance(e, imaplib.IMAP4.error)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration/test_email_monitor.py` around lines 43 - 47, The test's
assertion is vacuous because isinstance(e, (imaplib.IMAP4.error, Exception))
will always be True; change the check to assert the specific expected IMAP error
type or inspect the exception message instead. Update the except block in
tests/integration/test_email_monitor.py that catches exceptions from
email_monitor.task_poll_inbox to assert isinstance(e, imaplib.IMAP4.error) (or
assert "expected substring" in str(e)) so the test actually verifies an
IMAP-specific failure rather than any Exception.
🧹 Nitpick comments (21)
scripts/mypy.sh (1)

5-5: tests/ is absent from mypy but present in lint.sh — likely intentional (pytest fixtures often emit false-positive mypy errors), but worth confirming. If test type coverage is desired, add tests/ to align with scripts/lint.sh.

🔧 Optional addition
-uv run mypy apps/discord_bot/src/five08/ apps/worker/src/five08/ packages/shared/src/five08/ --ignore-missing-imports
+uv run mypy apps/discord_bot/src/five08/ apps/worker/src/five08/ packages/shared/src/five08/ tests/ --ignore-missing-imports
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/mypy.sh` at line 5, The mypy invocation in scripts/mypy.sh currently
runs "uv run mypy" over the app and package paths but omits tests/ whereas
scripts/lint.sh includes tests/; to align type-checking decisions, either (a)
intentionally document why tests/ is excluded (e.g., note pytest fixture
false-positives) or (b) add the tests/ path to the mypy invocation so tests are
type-checked too—update the command string in scripts/mypy.sh (the line that
calls mypy) accordingly and ensure you preserve the --ignore-missing-imports
flag if keeping that behavior.
packages/shared/src/five08/settings.py (1)

6-21: SharedSettings bundles worker-only fields (Redis, webhooks) into the shared base — Settings in the Discord bot inherits them unnecessarily.

redis_url, redis_queue_name, webhook_ingest_host, etc. are consumed only by apps/worker. The Discord bot's Settings class inherits all these fields via SharedSettings, which pollutes the bot's configuration surface and means Redis/webhook env vars are silently accepted by the bot (even though it ignores them). Consider extracting a thinner SharedSettings (log level only) and a WorkerBaseSettings that adds Redis/webhook fields, so each service inherits only what it needs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/shared/src/five08/settings.py` around lines 6 - 21, SharedSettings
currently contains worker-only fields (redis_url, redis_queue_name,
redis_key_prefix, job_timeout_seconds, job_result_ttl_seconds,
webhook_ingest_host, webhook_ingest_port, webhook_shared_secret) which leak into
the Discord bot's Settings via inheritance; refactor by shrinking SharedSettings
to only truly common fields (e.g., log_level and model_config) and create a new
WorkerBaseSettings (or WorkerSettingsBase) that defines the Redis and webhook
fields and any job-specific config, then update the worker service to inherit
from WorkerBaseSettings and the Discord bot's Settings to inherit only from the
slimmed SharedSettings so the bot no longer accepts worker-only env vars.
.github/workflows/test.yml (1)

86-86: CI mypy is less strict than pre-commit mypy.

The pre-commit hook runs mypy --strict, but CI here runs only --ignore-missing-imports. Since CI installs all project dependencies via uv sync --all-extras, missing imports are already resolved — meaning --ignore-missing-imports doesn't add value, while the absence of --strict means CI silently misses any type issues the pre-commit hook would catch. This creates a gap where a commit passes CI but fails locally.

♻️ Align CI mypy with pre-commit strictness
-        uv run mypy apps/discord_bot/src/five08/ apps/worker/src/five08/ packages/shared/src/five08/ --ignore-missing-imports
+        uv run mypy apps/discord_bot/src/five08/ apps/worker/src/five08/ packages/shared/src/five08/ --strict
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/test.yml at line 86, The CI mypy step currently runs "uv
run mypy ... --ignore-missing-imports", which is less strict than the pre-commit
hook; update the workflow to call mypy with the same strictness as pre-commit
(remove "--ignore-missing-imports" and add "--strict" or the exact flags used by
pre-commit) so the CI mypy invocation matches the pre-commit mypy behavior for
the directories referenced in the current command (apps/discord_bot/src/five08/,
apps/worker/src/five08/, packages/shared/src/five08/).
.pre-commit-config.yaml (1)

18-25: Inconsistent version constraint style for new mypy deps.

The three new deps use open-minimum >= bounds while existing entries use compatible-release ~= (e.g., discord.py~=2.6.0, pydantic~=2.10). Either style is acceptable for pre-commit stubs, but mixing them reduces readability.

♻️ Proposed unification to `>=` for all stub deps
-          - aiohttp>=3.13.1
+          - aiohttp>=3.9
           - discord.py~=2.6.0
           - pydantic~=2.10
           - pydantic-settings~=2.8
-          - redis>=6.4.0
+          - redis>=5.0
           - requests~=2.31
-          - rq>=2.6.0
-          - types-requests>=2.32.4.20250913
+          - rq>=1.16
+          - types-requests>=2.32

Or alternatively align all to ~=. The important thing is internal consistency.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.pre-commit-config.yaml around lines 18 - 25, The version constraint styles
in the .pre-commit-config.yaml deps list are inconsistent (some use >= while
most use ~=); update the new entries (aiohttp, redis, rq, types-requests — and
requests if intended) to match the repository's existing convention by changing
their specifiers to the compatible-release operator (~=) so all package lines
(e.g., aiohttp, redis, rq, types-requests) use ~= for consistency with
discord.py~=2.6.0 and pydantic~=2.10 / pydantic-settings~=2.8; apply the same
operator to any other stub deps to keep the file uniform.
packages/shared/src/five08/logging.py (1)

8-10: logging.basicConfig silently no-ops when root handlers already exist — add force=True.

logging.basicConfig does nothing if the root logger already has handlers configured, unless force=True is passed (added in Python 3.8). Since multiple libraries (pytest's log-capture plugin, discord.py, rq) can attach root handlers before configure_logging reaches this call, the configured level and format may silently never apply. Since the codebase targets Python 3.10+ (str | None union syntax), force=True is unconditionally safe.

🔧 Proposed fix
 def configure_logging(level: str = "INFO") -> None:
     """Configure process-wide logging in a consistent way."""
     logging.basicConfig(
         level=level.upper(),
         format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
+        force=True,
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/shared/src/five08/logging.py` around lines 8 - 10, The logging
configuration currently calls logging.basicConfig(...) which is a no-op if the
root logger already has handlers; update the basicConfig call (in the
configure_logging function / module-level logging setup in five08.logging) to
pass force=True (i.e., logging.basicConfig(..., force=True)) so the provided
level and format are always applied even when other libraries have already
attached handlers.
apps/worker/src/five08/worker/consumer.py (2)

4-4: Use built-in list[str] instead of the deprecated typing.List.

The codebase already uses built-in generic types throughout (e.g., set[str] in config.py, list[dict[str, Any]] in crm.py). typing.List has been deprecated since Python 3.9; using it here is inconsistent.

♻️ Proposed fix
-from typing import List
-
 from rq import Queue, Worker

 ...

-def _queue_names() -> List[str]:
+def _queue_names() -> list[str]:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/worker/src/five08/worker/consumer.py` at line 4, The file imports
typing.List which is deprecated; replace that import and any uses of List[...]
with the native generic syntax (e.g., list[str]). Update the import line by
removing "from typing import List" and change all type annotations referencing
List (for example in functions or variables in consumer.py that use List[str] or
List[SomeType]) to use the built-in list[...], ensuring forward refs or Any
remain correct.

26-35: No guard against an empty queue list — worker starts silently with nothing to consume.

If settings.worker_queue_names is set to an empty string (or only commas), _queue_names() returns [], queues is empty, and Worker.work() starts successfully but consumes no queues — burning a process while doing nothing. A fast-fail guard is more operator-friendly than a silently idle worker.

🛡️ Proposed fix
 def run() -> None:
     """Start RQ worker and consume configured queues."""
     configure_logging(settings.log_level)

     redis_conn = get_redis_connection(settings)
     queue_names = _queue_names()
+    if not queue_names:
+        raise ValueError("No queue names configured; set WORKER_QUEUE_NAMES.")
     queues = [Queue(name, connection=redis_conn) for name in queue_names]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/worker/src/five08/worker/consumer.py` around lines 26 - 35, _worker
currently proceeds even when _queue_names() returns an empty list causing Worker
to run with no queues; add a fast-fail guard after computing queue_names (and
before constructing Queue/Worker) that checks if not queue_names, logs an error
including settings.worker_name and the raw settings.worker_queue_names, and
exits non‑zero (e.g., raise SystemExit or call sys.exit(1)); update the flow
around _queue_names(), queues = [Queue(...)] and Worker(...) so the process
never starts a worker with an empty queues list.
apps/discord_bot/src/five08/discord_bot/main.py (1)

15-15: configure_logging called at module scope — runs on import, inconsistent with worker pattern.

apps/worker/src/five08/worker/api.py and apps/worker/src/five08/worker/consumer.py both call configure_logging inside their run() function, not at module level. Calling it at module level here means:

  1. Any code that imports five08.discord_bot.main (including test modules) configures global logging as a side effect, polluting test output and potentially overriding test-suite log capture.
  2. Settings are evaluated at import time; missing required env vars will raise during import rather than at startup.

Move the call into run() to align with the worker pattern:

♻️ Proposed fix
 from five08.discord_bot.bot import create_bot
 from five08.discord_bot.config import settings
 from five08.logging import configure_logging
-
-configure_logging(settings.log_level)

 logger = logging.getLogger(__name__)


 async def main() -> None:
     ...


 def run() -> None:
     """Sync entrypoint for console script."""
+    configure_logging(settings.log_level)
     asyncio.run(main())
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/discord_bot/src/five08/discord_bot/main.py` at line 15, The call to
configure_logging(settings.log_level) is happening at module import time in
main.py which causes global logging configuration and settings evaluation as a
side effect; move that call into the bot's startup function (the run() function)
so logging is configured during runtime startup instead of import. Edit main.py
to remove or comment out the module-level configure_logging invocation and add a
call to configure_logging(settings.log_level) at the beginning of the run()
function (or whatever startup entrypoint is defined in main.py) so tests and
importers don't mutate global logging or evaluate settings prematurely.
apps/worker/src/five08/worker/config.py (1)

26-28: allowed_file_extensions doesn't filter empty strings — inconsistent with parsed_resume_keywords.

parsed_resume_keywords guards against empty entries with if keyword.strip(), but allowed_file_extensions does not. A trailing comma or double-comma in ALLOWED_FILE_TYPES (e.g., "pdf,,docx") will produce an empty string "" in the returned set, which could silently match against files with no extension.

♻️ Proposed fix
 `@property`
 def allowed_file_extensions(self) -> set[str]:
     """Allowed resume file extensions."""
-    return {ext.strip().lower() for ext in self.allowed_file_types.split(",")}
+    return {ext.strip().lower() for ext in self.allowed_file_types.split(",") if ext.strip()}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/worker/src/five08/worker/config.py` around lines 26 - 28,
allowed_file_extensions currently includes empty strings when ALLOWED_FILE_TYPES
has empty entries (e.g., "pdf,,docx"); mirror parsed_resume_keywords by
filtering out empty items: update allowed_file_extensions (the method named
allowed_file_extensions) to only include extensions where ext.strip() is truthy
(e.g., use a comprehension with if ext.strip()) and continue to .strip().lower()
each entry so the returned set contains no empty strings and all normalized
extensions.
.env.example (1)

1-44: dotenv-linter reports multiple UnorderedKey warnings — consider alphabetizing within each section.

The current grouping is intentional (logical sections: shared runtime, worker, discord, email, espo, kimai) and more readable than strict alphabetical order. However, if the linter runs in CI and is configured to fail on warnings, it will block pipelines. Consider either alphabetizing keys within each section, or adding a dotenv-linter skip comment / configuration exclusion to suppress these warnings explicitly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.env.example around lines 1 - 44, Dotenv-linter reports UnorderedKey
warnings for .env.example; fix by either (A) alphabetizing keys inside each
logical section (e.g., section groups containing LOG_LEVEL, REDIS_*, WEBHOOK_*,
WORKER_NAME, WORKER_QUEUE_NAMES, DISCORD_BOT_TOKEN,
EMAIL_USERNAME/EMAIL_PASSWORD/IMAP_SERVER/SMTP_SERVER,
ESPO_API_KEY/ESPO_BASE_URL, KIMAI_BASE_URL/KIMAI_API_TOKEN) so keys are
lexicographically ordered per section, or (B) suppressing the rule by adding a
dotenv-linter suppression (comments or config) to exclude .env.example or
disable the UnorderedKey rule (update dotenv-linter config file or add an inline
skip comment) so CI no longer fails on these warnings.
packages/shared/src/five08/queue.py (1)

24-36: enqueue_job has no **kwargs path for job-function keyword arguments.

Currently only positional args are forwarded; job functions that require keyword arguments cannot be enqueued through this helper. This is an intentional design choice if all current job signatures are positional-only, but worth noting as a future extensibility gap.

♻️ Suggested extension (if needed)
 def enqueue_job(
     queue: Queue,
     fn: Callable[..., Any],
     args: tuple[Any, ...],
     settings: SharedSettings,
+    kwargs: dict[str, Any] | None = None,
 ) -> Job:
     """Enqueue a job with shared timeout/TTL defaults."""
     return queue.enqueue(
         fn,
         *args,
+        **(kwargs or {}),
         job_timeout=settings.job_timeout_seconds,
         result_ttl=settings.job_result_ttl_seconds,
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/shared/src/five08/queue.py` around lines 24 - 36, The helper
enqueue_job currently accepts only positional args and loses any keyword
arguments needed by job functions; update the signature of enqueue_job to accept
**kwargs (e.g., kwargs: dict[str, Any] | None = None) and forward them into the
queue.enqueue call (alongside fn and *args) so job functions receiving keyword
params are supported; modify the function body that calls queue.enqueue to pass
**(kwargs or {}) and keep existing job_timeout and result_ttl behavior
unchanged.
packages/shared/pyproject.toml (1)

7-11: Consider tightening the redis lower bound to a compatible-release constraint.

redis 7.1.1 was released Feb 9, 2026 and redis-py 7.1.0 drops Python 3.9 support and changed compatibility requirements. redis>=6.4.0 currently resolves to 7.x, which may carry breaking API changes relative to 6.x. The uv.lock file pins the installed version and prevents surprises at install time, but the loose bound means a uv lock --upgrade will silently jump to the next major version. Consider redis~=6.4 (i.e., >=6.4, <7) or redis>=6.4.0,<8 if you've validated against 7.x already.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/shared/pyproject.toml` around lines 7 - 11, The redis dependency
bound in pyproject (the "redis>=6.4.0" entry) is too loose and may pick up
incompatible 7.x releases; update that entry to a compatible-release constraint
such as "redis~=6.4" (equivalent to >=6.4,<7) unless you have validated against
7.x, in which case set an explicit range like "redis>=6.4.0,<8" to prevent
accidental major bumps; modify the "redis>=6.4.0" line in
packages/shared/pyproject.toml accordingly.
tests/unit/test_shared_queue.py (2)

9-23: call_args is more idiomatic than indexing mock_calls for single-call assertions.

mock_calls[0] unpacking works, but queue.enqueue.call_args is the conventional way to inspect a single call's arguments and avoids the fragile positional tuple unpack.

♻️ Suggested refactor
     queue.enqueue.assert_called_once()
-    _, args, kwargs = queue.enqueue.mock_calls[0]
-    assert args[1] == "payload"
-    assert kwargs["job_timeout"] == 123
-    assert kwargs["result_ttl"] == 456
+    call_args = queue.enqueue.call_args
+    assert call_args.args[1] == "payload"
+    assert call_args.kwargs["job_timeout"] == 123
+    assert call_args.kwargs["result_ttl"] == 456
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/test_shared_queue.py` around lines 9 - 23, The test
test_enqueue_job_applies_shared_timeouts should use queue.enqueue.call_args
instead of indexing queue.enqueue.mock_calls[0]; update the call inspection
after enqueue_job to read args, kwargs = queue.enqueue.call_args (or args,
kwargs = queue.enqueue.call_args) and keep the same assertions (check args[1] ==
"payload", kwargs["job_timeout"] == 123, kwargs["result_ttl"] == 456) so you
reference the enqueue_job helper, SharedSettings, and queue.enqueue correctly.

1-23: No tests for get_redis_connection or get_queue.

get_redis_connection and get_queue are untested. While they are simple wrappers, a test for get_queue verifying that it reuses a supplied connection (the connection or get_redis_connection(settings) path) would cover both the happy path and the optional-connection parameter.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/test_shared_queue.py` around lines 1 - 23, Add unit tests for
get_redis_connection and get_queue: write one test that calls
get_redis_connection(settings) to ensure it returns a redis connection object
(or at least a non-None/expected type) and another test for get_queue that
verifies both code paths—when you pass an explicit connection and when you omit
it. For get_queue specifically, mock get_redis_connection and assert
get_queue(settings, connection=explicit_conn) returns a queue bound to
explicit_conn (e.g., queue.connection is explicit_conn or equivalent) and that
get_queue(settings) calls get_redis_connection(settings) and returns a queue
using that mocked connection; reference get_redis_connection, get_queue and
SharedSettings to locate code under test.
apps/discord_bot/Dockerfile (1)

6-12: Source files are copied before uv sync, defeating layer caching.

Any change to apps/discord_bot/src or packages/shared/src invalidates the RUN uv sync layer, forcing a full dependency re-download on every source-only change. The pattern should be: copy manifests → install deps → copy source.

♻️ Proposed fix
 COPY pyproject.toml uv.lock ./
 COPY apps/discord_bot/pyproject.toml apps/discord_bot/pyproject.toml
 COPY packages/shared/pyproject.toml packages/shared/pyproject.toml
+COPY apps/worker/pyproject.toml apps/worker/pyproject.toml
 
 RUN uv sync --frozen --no-dev --package discord-bot-app
 
+COPY apps/discord_bot/src apps/discord_bot/src
+COPY packages/shared/src packages/shared/src
-COPY apps/discord_bot/src apps/discord_bot/src
-COPY packages/shared/src packages/shared/src
-
-RUN uv sync --frozen --no-dev --package discord-bot-app

Note: this assumes uv installs workspace packages from the lock file without requiring the source tree at dependency-resolution time. If uv requires source for the local workspace packages during sync, a --no-install-workspace first pass followed by a second uv sync after copying sources would be needed instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/discord_bot/Dockerfile` around lines 6 - 12, Reorder the Dockerfile so
manifests are copied and dependencies are installed before app source: keep the
COPY pyproject.toml uv.lock ./ and the per-package manifest COPYs (COPY
apps/discord_bot/pyproject.toml and COPY packages/shared/pyproject.toml) first,
then run RUN uv sync --frozen --no-dev --package discord-bot-app to populate the
dependency layer, and only after that COPY the source trees (COPY
apps/discord_bot/src and COPY packages/shared/src). If uv requires local
workspace sources for resolution, perform an initial RUN uv sync
--no-install-workspace (or equivalent) then copy the source dirs and run RUN uv
sync --frozen --no-dev --package discord-bot-app again.
apps/discord_bot/pyproject.toml (1)

7-15: redis, rq, pydantic, and pydantic-settings are already transitive dependencies via five08.

All four are declared as direct dependencies of both apps/discord_bot and packages/shared. Keeping them explicit here is a valid choice (guards against accidental removals in the shared package), but it creates a secondary surface to keep version constraints in sync. If that's intentional, a brief comment would help future maintainers understand why.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/discord_bot/pyproject.toml` around lines 7 - 15, The dependency list
includes redis, rq, pydantic, and pydantic-settings even though they are
transitive via the five08 package; either remove these four explicit entries to
avoid duplicate version surfaces, or keep them but add a short inline comment
explaining the intentional duplication as a guard against accidental removal in
the shared package; update the apps/discord_bot pyproject.toml dependency block
around the five08, redis, rq, pydantic, and pydantic-settings entries
accordingly so maintainers understand the choice.
docker-compose.yml (1)

3-4: RDB-only persistence risks losing queued jobs on Redis restart.

With --save 60 1, jobs enqueued in the last 60 seconds before an unclean shutdown will not be persisted to disk. For an RQ-based job queue, this means silently dropped work. Consider enabling AOF for stronger durability:

⚙️ Suggested change
-    command: ["redis-server", "--save", "60", "1", "--loglevel", "warning"]
+    command: ["redis-server", "--save", "60", "1", "--appendonly", "yes", "--loglevel", "warning"]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docker-compose.yml` around lines 3 - 4, The current Redis service uses the
image declaration image: redis:7-alpine and command: ["redis-server", "--save",
"60", "1", "--loglevel", "warning"] which leaves Redis configured for RDB-only
persistence and risks losing recently enqueued RQ jobs; update the command (or
supply a redis.conf) to enable AOF (e.g., set appendonly yes and appendfsync
everysec) and adjust or remove the --save options so AOF is authoritative (or
configure both safely), ensuring Redis persists writes to disk with appendonly
enabled for stronger durability.
apps/worker/src/five08/worker/models.py (1)

27-35: Adding populate_by_name=True to ContactData is optional defensive programming, not required for current code.

The model is only instantiated via model_validate(raw) where raw is an EspoCRM API response containing alias keys (firstName, lastName, emailAddress). Pydantic v2's model_validate() handles aliases correctly without populate_by_name=True. This setting would only matter if code directly constructed the model with Python field names like ContactData(first_name="John", ...), which does not occur in the codebase.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/worker/src/five08/worker/models.py` around lines 27 - 35, The
ContactData model does not need populate_by_name=True because instances are
created via ContactData.model_validate(raw) with EspoCRM alias keys (firstName,
lastName, emailAddress) and Pydantic v2 will honor aliases; if you added
populate_by_name=True to the ContactData class, remove that setting and ensure
code continues to instantiate via ContactData.model_validate(raw) rather than
constructing with Python field names (first_name, last_name, etc.).
tests/unit/test_worker_models.py (1)

11-13: Consider asserting the second event's identity.

The second event {"id": "contact-2"} is never verified; payload.events[1].id is unchecked. Adding it closes the coverage gap at no cost.

🧪 Proposed addition
 assert len(payload.events) == 2
 assert payload.events[0].id == "contact-1"
 assert payload.events[0].name == "Jane"
+assert payload.events[1].id == "contact-2"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/test_worker_models.py` around lines 11 - 13, Add an assertion
verifying the second event's id to close the coverage gap: in the test (where
payload.events is inspected) assert that payload.events[1].id == "contact-2" so
the second event identity is validated alongside the existing checks of
payload.events[0].id and payload.events[0].name.
apps/worker/src/five08/worker/jobs.py (1)

20-30: process_webhook_event discards all payload values.

The returned dict includes payload_keys (a sorted list of key names) but not the actual payload content. If this is an intentional audit/acknowledgement stub, a comment or TODO clarifying that is useful. If callers or consumers expect the actual event data to be persisted in the job result for replay or routing, the current implementation silently drops it.

♻️ If payload content should be preserved
 def process_webhook_event(source: str, payload: dict[str, Any]) -> dict[str, Any]:
     """Process a generic webhook payload and return normalized metadata."""
     event_id = str(payload.get("id", "unknown"))
     received_at = datetime.now(timezone.utc).isoformat()
     logger.info("Processing webhook source=%s event_id=%s", source, event_id)
     return {
         "source": source,
         "event_id": event_id,
         "received_at": received_at,
         "payload_keys": sorted(payload.keys()),
+        "payload": payload,
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/worker/src/five08/worker/jobs.py` around lines 20 - 30,
process_webhook_event currently returns only payload_keys and discards all
payload values; update the function to include the actual event data (e.g., add
a "payload" field containing the payload or a normalized/redacted copy) so
callers can replay or route events, and ensure the returned dict still contains
"source", "event_id", "received_at", and "payload_keys"; alternatively, if
dropping payloads was intentional, add a clear TODO/comment in
process_webhook_event explaining it's an audit/ack stub and why the payload is
omitted.
apps/worker/src/five08/worker/crm/processor.py (1)

30-37: Silently swallowing API errors may produce misleading results.

When get_contact_attachments fails (e.g., network timeout, auth error), it returns [] and logs an error. The caller in process_contact_skills then reports "No resume attachments found" with success=False — masking the real root cause from the job result.

Consider re-raising or returning a sentinel that lets the caller distinguish "zero attachments" from "API failure" so the job result carries an accurate error message.

Proposed approach: let the exception propagate
     def get_contact_attachments(self, contact_id: str) -> list[dict[str, Any]]:
-        try:
-            raw = self.api.request("GET", f"Contact/{contact_id}/attachments")
-            attachments = raw.get("list", [])
-            return attachments if isinstance(attachments, list) else []
-        except EspoAPIError as exc:
-            logger.error("Error getting contact attachments %s: %s", contact_id, exc)
-            return []
+        try:
+            raw = self.api.request("GET", f"Contact/{contact_id}/attachments")
+            attachments = raw.get("list", [])
+            return attachments if isinstance(attachments, list) else []
+        except EspoAPIError as exc:
+            logger.error("Error getting contact attachments %s: %s", contact_id, exc)
+            raise ValueError(f"Failed to get attachments for contact {contact_id}: {exc}") from exc

The broad except Exception in process_contact_skills (line 133) already catches and wraps this into a structured failure with the real error string.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/worker/src/five08/worker/crm/processor.py` around lines 30 - 37,
get_contact_attachments currently catches EspoAPIError and returns an empty
list, which masks API failures from callers like process_contact_skills; change
it so API errors are not swallowed—either remove the try/except so EspoAPIError
propagates or re-raise the exception inside the except block (do not convert to
an empty list), so process_contact_skills' outer exception handling can record
the real error; reference get_contact_attachments and EspoAPIError (and
process_contact_skills) when making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.dockerignore:
- Around line 1-12: The .dockerignore is missing patterns to exclude local
environment files which can leak secrets during a Docker build; update
.dockerignore to add entries like ".env" and ".env*" (and any service-specific
env filenames you use) so that a "COPY . ." in Dockerfiles cannot include those
files in image layers; ensure you add both ".env" and a wildcard like ".env.*"
to cover variants (e.g., .env.local, .env.production) and commit the change.

In `@apps/discord_bot/Dockerfile`:
- Around line 1-14: The Dockerfile runs the app as root; add a non-root user and
switch to it: create a dedicated user/group (e.g., appuser/appgroup), chown the
application directory (/app) and any artifacts created by RUN steps, and add a
USER appuser directive before the CMD so "uv run --package discord-bot-app
discord-bot" executes as that non-root user; ensure the Dockerfile retains
WORKDIR /app and environment vars (UV_LINK_MODE) and that uv sync and subsequent
runtime files are owned by the new user.

In `@apps/discord_bot/src/five08/discord_bot/cogs/crm.py`:
- Around line 24-25: The module-level aliases EspoAPI and EspoAPIError are
defined but never used; change the instantiation that sets self.espo_api
(currently calling espo.EspoAPI) to use the local EspoAPI alias so tests that
patch five08.discord_bot.cogs.crm.EspoAPI will correctly mock the constructor,
and ensure any except clauses consistently reference EspoAPIError (the local
alias) rather than espo.EspoAPIError so the aliases are actually used and
mocking works as intended.

In `@apps/worker/Dockerfile`:
- Around line 1-14: The Dockerfile currently runs as root; create and switch to
a non-root user to satisfy Trivy DS-0002 by adding a non-root user creation and
ownership step and a USER instruction before CMD. Specifically, in the
Dockerfile that starts FROM ghcr.io/astral-sh/uv:python3.12-bookworm-slim and
sets WORKDIR /app, add steps to create a user/group (e.g., "integrations" or
"appuser"), chown /app and any runtime directories to that UID/GID, and then add
USER <username> (or USER <uid>:<gid>) prior to the existing CMD ["uv", "run",
"--package", "integrations-worker", "worker-api"] so the container process no
longer runs as root.

In `@apps/worker/src/five08/worker/api.py`:
- Around line 100-105: The handler currently only catches ValidationError from
EspoCRMWebhookPayload.from_list, but non-dict items will raise TypeError (e.g.,
"argument after ** must be a mapping") and leak a 500; update the try/except
around EspoCRMWebhookPayload.from_list to also catch TypeError (in addition to
ValidationError) and return the same web.json_response({"error":
"invalid_webhook_event", "detail": str(exc)}, status=400) so malformed list
elements produce a 400; reference EspoCRMWebhookPayload.from_list,
ValidationError, TypeError, and web.json_response when making the change.
- Around line 36-38: The code performs blocking Redis/RQ operations
(redis_conn.ping(), queue.enqueue(), and enqueue_job calls) on the aiohttp event
loop (e.g., health_handler and handlers that call enqueue_job in ingest,
espocrm, process-contact), which stalls concurrency; fix by offloading each
blocking call to a thread and awaiting it via asyncio.to_thread (for example
await asyncio.to_thread(redis_conn.ping) and await
asyncio.to_thread(queue.enqueue, ...)) and propagate/handle exceptions as
before; update all usages of enqueue_job and any direct
queue.enqueue()/redis_conn.ping() invocations to use asyncio.to_thread to avoid
blocking the event loop.

In `@apps/worker/src/five08/worker/crm/processor.py`:
- Around line 177-178: The current dedup of extracted skills uses
sorted(set(all_skills)) which is case-sensitive and allows near-duplicates like
"Python" and "python" to pass; change this to a case-insensitive dedup that
preserves a single representative for each normalized skill. Replace the
set-based line by building a casefold/lower-keyed ordered mapping from
all_skills to the first-seen original string (e.g., iterate all_skills, use key
= s.casefold() and only keep the first value for each key), then produce
unique_skills = sorted(mapping.values()) so downstream logic sees only one entry
per skill; leave avg_confidence/processed_count logic unchanged.

In `@apps/worker/src/five08/worker/crm/skills_extractor.py`:
- Around line 103-106: _in _extract_skills_heuristic the initial substring-based
detection (detected = {skill for skill in COMMON_SKILLS if skill in lowered})
makes the subsequent regex token loop redundant; change the logic to be
word-boundary–aware only: remove the initial plain substring pass and build
detected from the regex matches (use token_matches =
re.findall(r"\b[a-z][a-z0-9+#\-.]{2,24}\b", lowered) and add tokens that are in
COMMON_SKILLS), so detection uses true word-boundary matching rather than the
superset substring check.

In `@apps/worker/src/five08/worker/jobs.py`:
- Around line 12-17: The ContactSkillsProcessor() instantiation in
process_contact_skills_job is unguarded and can raise before
process_contact_skills() returns a structured error; wrap the creation in a
try/except that catches Exception, log the exception via logger.exception(...)
(or logger.error with details), create a SkillsExtractionResult representing
failure (populate its error/message fields with the exception info or use an
existing failure constructor on SkillsExtractionResult) and return that
result.model_dump() so the job always returns the same failure-dict contract.

In `@apps/worker/src/five08/worker/models.py`:
- Around line 20-24: The from_list classmethod on EspoCRMWebhookPayload
constructs WebhookEvent instances with WebhookEvent(**event), which raises
TypeError for non-dict elements and bypasses the caller's ValidationError
handling in espocrm_webhook_handler; change the construction to use
WebhookEvent.model_validate(event) (or equivalent pydantic/model_validate call)
for each item so type errors are wrapped as ValidationError and then return
cls(events=events) as before; update any import/usage so model_validate is used
on each element and preserve the same return value.

In `@packages/shared/src/five08/__init__.py`:
- Around line 3-5: Add a pkgutil-style namespace initializer to the two
distributions that also contribute to the five08 namespace (the discord-bot and
worker packages): create an __init__.py in each package's five08 package that
imports extend_path from pkgutil and sets __path__ = extend_path(__path__,
__name__) (match the existing implementation in the shared package's
__init__.py) so imports across the namespace resolve correctly.

In `@packages/shared/src/five08/queue.py`:
- Around line 13-15: get_redis_connection currently calls
Redis.from_url(settings.redis_url) which leaves
socket_connect_timeout/socket_timeout as None and can block indefinitely; add
configurable timeout fields (e.g., redis_socket_connect_timeout and
redis_socket_timeout) to SharedSettings and pass them into Redis.from_url as
socket_connect_timeout and socket_timeout (coerce to numbers/None as needed), so
get_redis_connection(settings: SharedSettings) uses these values when
constructing the client to avoid blocking on first I/O.

In `@packages/shared/src/five08/settings.py`:
- Around line 11-15: The default redis_url constant in settings.py (redis_url:
str = "redis://redis:6379/0") only resolves inside a Docker Compose stack named
"redis"; update the file by adding a clear inline comment next to redis_url (and
optionally redis_queue_name/redis_key_prefix) stating that this default is for
Docker Compose, that developers running the worker outside Docker must set
REDIS_URL (or override redis_url) to a reachable host, and suggest validating or
documenting the required environment variable so callers of the settings (e.g.,
any code reading redis_url) know to override it when not using Docker Compose.
- Line 19: The config field webhook_shared_secret currently defaults to None
which permits all webhooks; change this by either removing the default so
webhook_shared_secret: str is required (forcing failures when not set) or update
the authorization logic in _is_authorized (apps/worker/src/five08/worker/api.py)
to explicitly reject requests when webhook_shared_secret is falsy (e.g., return
False and log a clear error) and only validate signatures when a non-empty
secret is present; ensure you update any callers and tests to reflect the new
required configuration or new rejection behavior.

In `@README.md`:
- Around line 88-101: The README currently lists ESPO_BASE_URL and ESPO_API_KEY
only under "Discord Bot"; move those two environment variable entries into the
top-level "Shared" section so both the bot and the worker operators will see
them. Update the README by removing ESPO_BASE_URL and ESPO_API_KEY from the
"Discord Bot" list and adding them to the Shared env var list referenced by both
apps; note that these are consumed by the discord bot and by the worker (see
apps/worker/src/five08/worker/config.py and processor.py where ESPO_BASE_URL and
ESPO_API_KEY are used to initialize the EspoCRM client). Ensure the README text
makes clear the variables are required for both services.

In `@tests/unit/test_kimai.py`:
- Around line 26-29: The patch target string is wrong and will raise
AttributeError because the cog imports the kimai module, not the class; update
the patch call in the test (the context where
patch("five08.discord_bot.cogs.kimai.KimaiAPI") is used) to patch the real class
path—preferably "five08.clients.kimai.KimaiAPI" (or alternatively
"five08.discord_bot.cogs.kimai.kimai.KimaiAPI")—so that the Mock returned by
mock_api_class correctly replaces the KimaiAPI used by the code under test.

---

Outside diff comments:
In `@tests/integration/test_email_monitor.py`:
- Around line 43-47: The test's assertion is vacuous because isinstance(e,
(imaplib.IMAP4.error, Exception)) will always be True; change the check to
assert the specific expected IMAP error type or inspect the exception message
instead. Update the except block in tests/integration/test_email_monitor.py that
catches exceptions from email_monitor.task_poll_inbox to assert isinstance(e,
imaplib.IMAP4.error) (or assert "expected substring" in str(e)) so the test
actually verifies an IMAP-specific failure rather than any Exception.

---

Nitpick comments:
In @.env.example:
- Around line 1-44: Dotenv-linter reports UnorderedKey warnings for
.env.example; fix by either (A) alphabetizing keys inside each logical section
(e.g., section groups containing LOG_LEVEL, REDIS_*, WEBHOOK_*, WORKER_NAME,
WORKER_QUEUE_NAMES, DISCORD_BOT_TOKEN,
EMAIL_USERNAME/EMAIL_PASSWORD/IMAP_SERVER/SMTP_SERVER,
ESPO_API_KEY/ESPO_BASE_URL, KIMAI_BASE_URL/KIMAI_API_TOKEN) so keys are
lexicographically ordered per section, or (B) suppressing the rule by adding a
dotenv-linter suppression (comments or config) to exclude .env.example or
disable the UnorderedKey rule (update dotenv-linter config file or add an inline
skip comment) so CI no longer fails on these warnings.

In @.github/workflows/test.yml:
- Line 86: The CI mypy step currently runs "uv run mypy ...
--ignore-missing-imports", which is less strict than the pre-commit hook; update
the workflow to call mypy with the same strictness as pre-commit (remove
"--ignore-missing-imports" and add "--strict" or the exact flags used by
pre-commit) so the CI mypy invocation matches the pre-commit mypy behavior for
the directories referenced in the current command (apps/discord_bot/src/five08/,
apps/worker/src/five08/, packages/shared/src/five08/).

In @.pre-commit-config.yaml:
- Around line 18-25: The version constraint styles in the
.pre-commit-config.yaml deps list are inconsistent (some use >= while most use
~=); update the new entries (aiohttp, redis, rq, types-requests — and requests
if intended) to match the repository's existing convention by changing their
specifiers to the compatible-release operator (~=) so all package lines (e.g.,
aiohttp, redis, rq, types-requests) use ~= for consistency with
discord.py~=2.6.0 and pydantic~=2.10 / pydantic-settings~=2.8; apply the same
operator to any other stub deps to keep the file uniform.

In `@apps/discord_bot/Dockerfile`:
- Around line 6-12: Reorder the Dockerfile so manifests are copied and
dependencies are installed before app source: keep the COPY pyproject.toml
uv.lock ./ and the per-package manifest COPYs (COPY
apps/discord_bot/pyproject.toml and COPY packages/shared/pyproject.toml) first,
then run RUN uv sync --frozen --no-dev --package discord-bot-app to populate the
dependency layer, and only after that COPY the source trees (COPY
apps/discord_bot/src and COPY packages/shared/src). If uv requires local
workspace sources for resolution, perform an initial RUN uv sync
--no-install-workspace (or equivalent) then copy the source dirs and run RUN uv
sync --frozen --no-dev --package discord-bot-app again.

In `@apps/discord_bot/pyproject.toml`:
- Around line 7-15: The dependency list includes redis, rq, pydantic, and
pydantic-settings even though they are transitive via the five08 package; either
remove these four explicit entries to avoid duplicate version surfaces, or keep
them but add a short inline comment explaining the intentional duplication as a
guard against accidental removal in the shared package; update the
apps/discord_bot pyproject.toml dependency block around the five08, redis, rq,
pydantic, and pydantic-settings entries accordingly so maintainers understand
the choice.

In `@apps/discord_bot/src/five08/discord_bot/main.py`:
- Line 15: The call to configure_logging(settings.log_level) is happening at
module import time in main.py which causes global logging configuration and
settings evaluation as a side effect; move that call into the bot's startup
function (the run() function) so logging is configured during runtime startup
instead of import. Edit main.py to remove or comment out the module-level
configure_logging invocation and add a call to
configure_logging(settings.log_level) at the beginning of the run() function (or
whatever startup entrypoint is defined in main.py) so tests and importers don't
mutate global logging or evaluate settings prematurely.

In `@apps/worker/src/five08/worker/config.py`:
- Around line 26-28: allowed_file_extensions currently includes empty strings
when ALLOWED_FILE_TYPES has empty entries (e.g., "pdf,,docx"); mirror
parsed_resume_keywords by filtering out empty items: update
allowed_file_extensions (the method named allowed_file_extensions) to only
include extensions where ext.strip() is truthy (e.g., use a comprehension with
if ext.strip()) and continue to .strip().lower() each entry so the returned set
contains no empty strings and all normalized extensions.

In `@apps/worker/src/five08/worker/consumer.py`:
- Line 4: The file imports typing.List which is deprecated; replace that import
and any uses of List[...] with the native generic syntax (e.g., list[str]).
Update the import line by removing "from typing import List" and change all type
annotations referencing List (for example in functions or variables in
consumer.py that use List[str] or List[SomeType]) to use the built-in list[...],
ensuring forward refs or Any remain correct.
- Around line 26-35: _worker currently proceeds even when _queue_names() returns
an empty list causing Worker to run with no queues; add a fast-fail guard after
computing queue_names (and before constructing Queue/Worker) that checks if not
queue_names, logs an error including settings.worker_name and the raw
settings.worker_queue_names, and exits non‑zero (e.g., raise SystemExit or call
sys.exit(1)); update the flow around _queue_names(), queues = [Queue(...)] and
Worker(...) so the process never starts a worker with an empty queues list.

In `@apps/worker/src/five08/worker/crm/processor.py`:
- Around line 30-37: get_contact_attachments currently catches EspoAPIError and
returns an empty list, which masks API failures from callers like
process_contact_skills; change it so API errors are not swallowed—either remove
the try/except so EspoAPIError propagates or re-raise the exception inside the
except block (do not convert to an empty list), so process_contact_skills' outer
exception handling can record the real error; reference get_contact_attachments
and EspoAPIError (and process_contact_skills) when making the change.

In `@apps/worker/src/five08/worker/jobs.py`:
- Around line 20-30: process_webhook_event currently returns only payload_keys
and discards all payload values; update the function to include the actual event
data (e.g., add a "payload" field containing the payload or a
normalized/redacted copy) so callers can replay or route events, and ensure the
returned dict still contains "source", "event_id", "received_at", and
"payload_keys"; alternatively, if dropping payloads was intentional, add a clear
TODO/comment in process_webhook_event explaining it's an audit/ack stub and why
the payload is omitted.

In `@apps/worker/src/five08/worker/models.py`:
- Around line 27-35: The ContactData model does not need populate_by_name=True
because instances are created via ContactData.model_validate(raw) with EspoCRM
alias keys (firstName, lastName, emailAddress) and Pydantic v2 will honor
aliases; if you added populate_by_name=True to the ContactData class, remove
that setting and ensure code continues to instantiate via
ContactData.model_validate(raw) rather than constructing with Python field names
(first_name, last_name, etc.).

In `@docker-compose.yml`:
- Around line 3-4: The current Redis service uses the image declaration image:
redis:7-alpine and command: ["redis-server", "--save", "60", "1", "--loglevel",
"warning"] which leaves Redis configured for RDB-only persistence and risks
losing recently enqueued RQ jobs; update the command (or supply a redis.conf) to
enable AOF (e.g., set appendonly yes and appendfsync everysec) and adjust or
remove the --save options so AOF is authoritative (or configure both safely),
ensuring Redis persists writes to disk with appendonly enabled for stronger
durability.

In `@packages/shared/pyproject.toml`:
- Around line 7-11: The redis dependency bound in pyproject (the "redis>=6.4.0"
entry) is too loose and may pick up incompatible 7.x releases; update that entry
to a compatible-release constraint such as "redis~=6.4" (equivalent to >=6.4,<7)
unless you have validated against 7.x, in which case set an explicit range like
"redis>=6.4.0,<8" to prevent accidental major bumps; modify the "redis>=6.4.0"
line in packages/shared/pyproject.toml accordingly.

In `@packages/shared/src/five08/logging.py`:
- Around line 8-10: The logging configuration currently calls
logging.basicConfig(...) which is a no-op if the root logger already has
handlers; update the basicConfig call (in the configure_logging function /
module-level logging setup in five08.logging) to pass force=True (i.e.,
logging.basicConfig(..., force=True)) so the provided level and format are
always applied even when other libraries have already attached handlers.

In `@packages/shared/src/five08/queue.py`:
- Around line 24-36: The helper enqueue_job currently accepts only positional
args and loses any keyword arguments needed by job functions; update the
signature of enqueue_job to accept **kwargs (e.g., kwargs: dict[str, Any] | None
= None) and forward them into the queue.enqueue call (alongside fn and *args) so
job functions receiving keyword params are supported; modify the function body
that calls queue.enqueue to pass **(kwargs or {}) and keep existing job_timeout
and result_ttl behavior unchanged.

In `@packages/shared/src/five08/settings.py`:
- Around line 6-21: SharedSettings currently contains worker-only fields
(redis_url, redis_queue_name, redis_key_prefix, job_timeout_seconds,
job_result_ttl_seconds, webhook_ingest_host, webhook_ingest_port,
webhook_shared_secret) which leak into the Discord bot's Settings via
inheritance; refactor by shrinking SharedSettings to only truly common fields
(e.g., log_level and model_config) and create a new WorkerBaseSettings (or
WorkerSettingsBase) that defines the Redis and webhook fields and any
job-specific config, then update the worker service to inherit from
WorkerBaseSettings and the Discord bot's Settings to inherit only from the
slimmed SharedSettings so the bot no longer accepts worker-only env vars.

In `@scripts/mypy.sh`:
- Line 5: The mypy invocation in scripts/mypy.sh currently runs "uv run mypy"
over the app and package paths but omits tests/ whereas scripts/lint.sh includes
tests/; to align type-checking decisions, either (a) intentionally document why
tests/ is excluded (e.g., note pytest fixture false-positives) or (b) add the
tests/ path to the mypy invocation so tests are type-checked too—update the
command string in scripts/mypy.sh (the line that calls mypy) accordingly and
ensure you preserve the --ignore-missing-imports flag if keeping that behavior.

In `@tests/unit/test_shared_queue.py`:
- Around line 9-23: The test test_enqueue_job_applies_shared_timeouts should use
queue.enqueue.call_args instead of indexing queue.enqueue.mock_calls[0]; update
the call inspection after enqueue_job to read args, kwargs =
queue.enqueue.call_args (or args, kwargs = queue.enqueue.call_args) and keep the
same assertions (check args[1] == "payload", kwargs["job_timeout"] == 123,
kwargs["result_ttl"] == 456) so you reference the enqueue_job helper,
SharedSettings, and queue.enqueue correctly.
- Around line 1-23: Add unit tests for get_redis_connection and get_queue: write
one test that calls get_redis_connection(settings) to ensure it returns a redis
connection object (or at least a non-None/expected type) and another test for
get_queue that verifies both code paths—when you pass an explicit connection and
when you omit it. For get_queue specifically, mock get_redis_connection and
assert get_queue(settings, connection=explicit_conn) returns a queue bound to
explicit_conn (e.g., queue.connection is explicit_conn or equivalent) and that
get_queue(settings) calls get_redis_connection(settings) and returns a queue
using that mocked connection; reference get_redis_connection, get_queue and
SharedSettings to locate code under test.

In `@tests/unit/test_worker_models.py`:
- Around line 11-13: Add an assertion verifying the second event's id to close
the coverage gap: in the test (where payload.events is inspected) assert that
payload.events[1].id == "contact-2" so the second event identity is validated
alongside the existing checks of payload.events[0].id and
payload.events[0].name.

Comment thread .dockerignore
Comment thread apps/discord_bot/Dockerfile
Comment thread apps/discord_bot/src/five08/discord_bot/cogs/crm.py
Comment thread apps/worker/Dockerfile
Comment thread apps/worker/src/five08/worker/api.py Outdated
Comment thread packages/shared/src/five08/queue.py Outdated
Comment thread packages/shared/src/five08/settings.py Outdated
Comment thread packages/shared/src/five08/settings.py
Comment thread README.md
Comment thread tests/unit/test_kimai.py Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
apps/discord_bot/Dockerfile (1)

10-13: Source files are copied before uv sync, busting the dependency-install cache on every source change.

Lines 10–11 copy application source into the image, then line 13 runs uv sync. Because Docker invalidates a layer whenever any preceding layer changes, any edit to apps/discord_bot/src or packages/shared/src forces a full uv sync re-run — even though no dependency changed. The uv sync layer should run immediately after all pyproject.toml/uv.lock copies and before any source is added.

♻️ Proposed fix — hoist `uv sync` before source COPYs
 COPY pyproject.toml uv.lock ./
 COPY apps/discord_bot/pyproject.toml apps/discord_bot/pyproject.toml
 COPY apps/worker/pyproject.toml apps/worker/pyproject.toml
 COPY packages/shared/pyproject.toml packages/shared/pyproject.toml
-COPY apps/discord_bot/src apps/discord_bot/src
-COPY packages/shared/src packages/shared/src
 
 RUN uv sync --frozen --no-dev --package discord-bot-app
 
+COPY apps/discord_bot/src apps/discord_bot/src
+COPY packages/shared/src packages/shared/src
+
 CMD ["uv", "run", "--package", "discord-bot-app", "discord-bot"]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/discord_bot/Dockerfile` around lines 10 - 13, The Dockerfile currently
copies application source (COPY apps/discord_bot/src and COPY
packages/shared/src) before running the dependency step (RUN uv sync --frozen
--no-dev --package discord-bot-app), which busts the cache; fix by first copying
only dependency manifests (pyproject.toml and uv.lock) for the discord bot and
any shared package into the image, run RUN uv sync --frozen --no-dev --package
discord-bot-app to install deps, and only after that COPY the full source
directories (apps/discord_bot/src and packages/shared/src) so source changes do
not invalidate the dependency-install layer.
apps/worker/Dockerfile (1)

1-1: Pin the base image to a digest for reproducible, supply-chain-safe builds.

The tag python3.12-bookworm-slim is mutable; a silent upstream push can change the image between builds without any diff in this file.

🔒 Proposed fix — pin by digest
-FROM ghcr.io/astral-sh/uv:python3.12-bookworm-slim
+FROM ghcr.io/astral-sh/uv:python3.12-bookworm-slim@sha256:<digest>

Resolve the current digest with:

docker buildx imagetools inspect ghcr.io/astral-sh/uv:python3.12-bookworm-slim \
  --format '{{.Manifest.Digest}}'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/worker/Dockerfile` at line 1, The base image in the Dockerfile uses a
mutable tag (FROM ghcr.io/astral-sh/uv:python3.12-bookworm-slim); replace the
tag with the resolved digest (FROM ghcr.io/astral-sh/uv@sha256:<digest>) to pin
the image for reproducible builds and supply-chain safety; obtain the digest by
running docker buildx imagetools inspect
ghcr.io/astral-sh/uv:python3.12-bookworm-slim --format '{{.Manifest.Digest}}'
and update the FROM line accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@apps/discord_bot/Dockerfile`:
- Around line 1-15: The Dockerfile currently leaves the container running as
root; add a non-root user and switch to it after preparing files to satisfy
Trivy DS-0002. Modify the Dockerfile around the WORKDIR /app and after
copying/build steps (after the RUN uv sync --frozen --no-dev --package
discord-bot-app) to create a user/group (e.g., appuser), chown /app and any
runtime dirs to that user, and add a USER appuser directive before CMD so the uv
run command runs unprivileged; ensure file ownership changes reference /app and
any other runtime paths referenced by CMD ["uv", "run", "--package",
"discord-bot-app", "discord-bot"].

In `@apps/worker/Dockerfile`:
- Around line 1-15: The image runs as root because there's no USER directive;
update the Dockerfile to create a non-root runtime user, chown the WORKDIR
(/app) and required files, and switch to that user before CMD so the process
launched by CMD ["uv", "run", "--package", "integrations-worker", "worker-api"]
does not run as UID 0; specifically add RUN steps to create a user/group (e.g.,
useradd/addgroup or adduser), chown /app and any installed files, and then add a
USER <nonroot> line after the file copies and package install but before CMD.

---

Nitpick comments:
In `@apps/discord_bot/Dockerfile`:
- Around line 10-13: The Dockerfile currently copies application source (COPY
apps/discord_bot/src and COPY packages/shared/src) before running the dependency
step (RUN uv sync --frozen --no-dev --package discord-bot-app), which busts the
cache; fix by first copying only dependency manifests (pyproject.toml and
uv.lock) for the discord bot and any shared package into the image, run RUN uv
sync --frozen --no-dev --package discord-bot-app to install deps, and only after
that COPY the full source directories (apps/discord_bot/src and
packages/shared/src) so source changes do not invalidate the dependency-install
layer.

In `@apps/worker/Dockerfile`:
- Line 1: The base image in the Dockerfile uses a mutable tag (FROM
ghcr.io/astral-sh/uv:python3.12-bookworm-slim); replace the tag with the
resolved digest (FROM ghcr.io/astral-sh/uv@sha256:<digest>) to pin the image for
reproducible builds and supply-chain safety; obtain the digest by running docker
buildx imagetools inspect ghcr.io/astral-sh/uv:python3.12-bookworm-slim --format
'{{.Manifest.Digest}}' and update the FROM line accordingly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant