Skip to content

feat: add Dramatiq worker stack with Postgres#30

Merged
michaelmwu merged 4 commits into
mainfrom
michaelmwu/minio-postgres
Feb 21, 2026
Merged

feat: add Dramatiq worker stack with Postgres#30
michaelmwu merged 4 commits into
mainfrom
michaelmwu/minio-postgres

Conversation

@michaelmwu
Copy link
Copy Markdown
Member

@michaelmwu michaelmwu commented Feb 21, 2026

Description

  • Added Postgres-backed job persistence (jobs table schema) with Alembic migrations and worker API startup migrations.
  • Switched queue processing to Dramatiq with Redis transport and introduced queue adapter layer for future backend swaps.
  • Added MinIO as explicit internal transfer storage, loopback-friendly compose bindings, and clearer .env.example defaults and required/optional docs.
  • Updated shared settings to include internal transfer and database defaults plus docs in README.md, AGENTS.md, and DEVELOPMENT.md.
  • Added/updated worker runtime and tests for idempotent job enqueue, retry state handling, and consumer/API behavior.

Related Issue

  • None.

How Has This Been Tested?

  • docker compose up --build (stack builds and services start; bot requires valid DISCORD_BOT_TOKEN for full runtime).

Summary by CodeRabbit

  • New Features

    • Persistent, Postgres-backed job store with retry, idempotency, and scheduled runs.
    • Internal object storage integration for file transfers.
    • New attachment config options: max attachments, max file size, allowed types, resume keywords.
    • Enhanced health reporting now includes database and worker status.
  • Chores

    • Updated compose/deployment to include Postgres and MinIO and new environment variables.
    • Worker runtime moved to a new queue/back-end implementation for improved reliability.

@michaelmwu michaelmwu force-pushed the michaelmwu/minio-postgres branch from d6b0032 to a2e93c6 Compare February 21, 2026 02:42
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Feb 21, 2026

Warning

Rate limit exceeded

@michaelmwu has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 1 minutes and 21 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📝 Walkthrough

Walkthrough

Replaces RQ with Dramatiq and a PostgreSQL-backed job store, adds Alembic migrations and MinIO for internal transfers, introduces job retry/idempotency and health checks, updates worker API/consumer and shared queue abstractions, and extends docker-compose and settings for Postgres/MinIO.

Changes

Cohort / File(s) Summary
Configuration & Compose
\.env.example, docker-compose.yml
Added Postgres and MinIO services, new env vars for Postgres, MinIO, and job retry/idempotency; updated service dependencies and volumes.
Documentation
AGENTS.md, DEVELOPMENT.md, README.md
Documented new architecture: Postgres-backed jobs, Dramatiq queue delivery, MinIO internal transfers, Alembic migrations, job model, and expanded health checks.
Project & Dockerfiles
apps/discord_bot/pyproject.toml, apps/worker/pyproject.toml, packages/shared/pyproject.toml, apps/discord_bot/Dockerfile, apps/worker/Dockerfile
Removed RQ dependency, added alembic/dramatiq/psycopg entries where appropriate; pinned base images by digest for reproducibility.
Worker: actors / consumer / dispatcher / jobs
apps/worker/src/five08/worker/actors.py, .../consumer.py, .../dispatcher.py, .../jobs.py
Introduced Dramatiq actor entrypoint, retry/backoff and job routing logic; replaced RQ consumer with Dramatiq Worker; added QueueClient adapter (DramatiqQueueClient) and new job handlers.
Worker migrations & Alembic config
apps/worker/src/five08/worker/alembic.ini, .../migrations/*, .../migrations/versions/20260221_0100_create_jobs_table.py
Added Alembic configuration and migration package; new migration creates persistent jobs table, triggers, indexes, and status constraints.
Worker startup & DB migration runner
apps/worker/src/five08/worker/db_migrations.py, apps/worker/src/five08/worker/api.py
Added run_job_migrations and startup invocation; health endpoint now checks Postgres and Redis; queue client built via factory; idempotency keys propagated into enqueue flows.
Shared queue & settings
packages/shared/src/five08/queue.py, packages/shared/src/five08/settings.py
Major refactor to Postgres-backed persistence: JobStatus, JobRecord, EnqueuedJob, QueueClient protocol, create/get/mark job APIs, idempotency semantics; added postgres/minio and retry settings and helper to normalize SQLAlchemy DSNs.
Namespace packaging
apps/discord_bot/src/five08/__init__.py, apps/worker/src/five08/__init__.py
Added pkgutil.extend_path-based init.py files to establish namespace packages for distributed subpackages.
CRM/skills changes
apps/worker/src/five08/worker/crm/processor.py, .../skills_extractor.py, .../models.py
Improved case-insensitive deduplication preserving casing, tightened heuristic skill detection, and adjusted webhook payload validation to use model validation.
Tests
tests/unit/test_shared_queue.py, tests/unit/test_worker_api.py, tests/integration/test_email_monitor.py
Updated tests to reflect Postgres-backed enqueue behavior, mock Postgres health in API tests, and adjusted one integration test to allow exceptions to propagate.

Sequence Diagram(s)

sequenceDiagram
    participant API as API Handler
    participant DB as PostgreSQL
    participant Queue as Dramatiq Broker
    participant Worker as Worker Process
    participant Handler as Job Handler

    API->>DB: create_job_record (idempotent)
    DB-->>API: job_id, created

    alt created == true
        API->>Queue: enqueue execute_job(job_id)
        Queue-->>API: scheduled
    else created == false
        API-->>API: return existing job_id
    end

    Queue->>Worker: dispatch execute_job(job_id)
    Worker->>DB: mark_job_running(job_id)
    DB-->>Worker: ok

    Worker->>Handler: invoke handler(job_payload)

    alt handler succeeds
        Handler-->>Worker: result
        Worker->>DB: mark_job_succeeded(job_id)
        DB-->>Worker: ok
    else handler fails
        Handler-->>Worker: exception
        Worker->>DB: mark_job_retry_or_dead(job_id, attempts, run_after, last_error)
        DB-->>Worker: ok
        Worker->>Queue: schedule retry (backoff)
        Queue-->>Worker: scheduled
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Poem

A rabbit hops with code to spare 🐇
Postgres stores the jobs with care,
Dramatiq drums the work in line,
MinIO moves each file so fine,
Migrations march — the queues now sing! 🎶

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main change: introducing a Dramatiq-based worker architecture with Postgres persistence, which is the primary focus across all modified files.
Docstring Coverage ✅ Passed Docstring coverage is 88.24% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch michaelmwu/minio-postgres

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/worker/src/five08/worker/api.py (1)

166-172: ⚠️ Potential issue | 🟡 Minor

Permanent idempotency key blocks re-triggering the same contact.

idempotency_key=f"manual:{contact_id}" means once a job exists for a given contact (regardless of outcome), any subsequent manual trigger for the same contact_id will silently return the old job's ID instead of creating a new one. If the intent is to allow re-processing a contact after a previous run succeeded or failed, this key scheme prevents it.

Consider including a distinguishing component (e.g., a timestamp or caller-supplied nonce) if re-triggering is a valid use case:

-        idempotency_key=f"manual:{contact_id}",
+        # allows re-trigger; omit key or scope it to a time window

If permanent dedup per contact is intentional, this is fine as-is — just noting the implication.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/worker/src/five08/worker/api.py` around lines 166 - 172, The current
enqueue_job call uses a permanent idempotency_key=f"manual:{contact_id}" which
prevents re-triggering the same contact; change the idempotency key construction
in the enqueue_job invocation (where enqueue_job, process_contact_skills_job and
contact_id are used) to include a distinguishing component such as a timestamp
or a caller-supplied nonce (e.g., f"manual:{contact_id}:{nonce_or_timestamp}")
or expose an optional parameter to disable/override idempotency so manual
re-runs create new jobs when desired; if permanent deduplication per contact is
intentional, leave as-is.
🧹 Nitpick comments (10)
apps/worker/src/five08/worker/migrations/versions/20260221_0100_create_jobs_table.py (2)

42-47: updated_at has no auto-update trigger — it will always retain the insertion timestamp after the row is modified.

server_default=sa.text("NOW()") only fires on INSERT. Every UPDATE path in the application must set updated_at = NOW() explicitly, or the column becomes stale immediately. Adding a trigger here makes the contract DB-enforced and eliminates the risk of it being missed in any future query path.

♻️ Proposed fix: add an auto-update trigger to the migration
     op.create_index("idx_jobs_status_run_after", "jobs", ["status", "run_after"])
     op.create_index("idx_jobs_created_at", "jobs", ["created_at"])
+
+    op.execute("""
+        CREATE OR REPLACE FUNCTION set_updated_at()
+        RETURNS TRIGGER AS $$
+        BEGIN
+            NEW.updated_at = NOW();
+            RETURN NEW;
+        END;
+        $$ LANGUAGE plpgsql;
+    """)
+    op.execute("""
+        CREATE TRIGGER trg_jobs_set_updated_at
+        BEFORE UPDATE ON jobs
+        FOR EACH ROW EXECUTE FUNCTION set_updated_at();
+    """)

And in downgrade():

 def downgrade() -> None:
     """Drop the job persistence table."""
+    op.execute("DROP TRIGGER IF EXISTS trg_jobs_set_updated_at ON jobs")
+    op.execute("DROP FUNCTION IF EXISTS set_updated_at")
     op.drop_index("idx_jobs_created_at", table_name="jobs")
     op.drop_index("idx_jobs_status_run_after", table_name="jobs")
     op.drop_table("jobs")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/worker/src/five08/worker/migrations/versions/20260221_0100_create_jobs_table.py`
around lines 42 - 47, The migration defines jobs.updated_at with
server_default=sa.text("NOW()") but no auto-update on UPDATE; add a DB trigger
to enforce auto-updating. In upgrade(), create a SQL function (e.g.
set_updated_at or jobs_set_updated_at_fn) that sets NEW.updated_at = NOW() and
returns NEW, then create a BEFORE UPDATE trigger on the jobs table (e.g.
jobs_set_updated_at_tr) to EXECUTE that function FOR EACH ROW; in downgrade(),
DROP the trigger from jobs then DROP the function to fully revert the change.
Ensure the function/trigger names match between upgrade() and downgrade() so
Alembic migrations run cleanly.

30-31: attempts and max_attempts have no server_default — every INSERT must supply both values explicitly.

Adding server_default=sa.text("0") for attempts in particular removes the possibility of accidentally omitting it on enqueue.

♻️ Proposed fix
-        sa.Column("attempts", sa.Integer(), nullable=False),
-        sa.Column("max_attempts", sa.Integer(), nullable=False),
+        sa.Column("attempts", sa.Integer(), nullable=False, server_default=sa.text("0")),
+        sa.Column("max_attempts", sa.Integer(), nullable=False),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/worker/src/five08/worker/migrations/versions/20260221_0100_create_jobs_table.py`
around lines 30 - 31, The migration defines sa.Column("attempts", sa.Integer(),
nullable=False) and sa.Column("max_attempts", sa.Integer(), nullable=False)
without server defaults, forcing every INSERT to provide both; update the Column
definitions for attempts (and optionally max_attempts) to include a
server_default=sa.text("0") so new rows will default to 0 when omitted (modify
the sa.Column calls for "attempts" and "max_attempts" in the migration).
apps/worker/pyproject.toml (1)

10-10: dramatiq[redis]>=1.17.0 will resolve to the current latest 2.0.1, a major version with changed retry backoff behavior.

dramatiq 2.0.0 was released Nov 2025 and 2.0.1 on Jan 18 2026, so the loose >=1.17.0 floor installs the 2.x line today. Dramatiq 2.0.0 changed how retry backoff is calculated so the first backoff is no longer less than min_backoff, meaning each retry backoff is on average twice as long — halving min_backoff is suggested to avoid the effect.

If the worker relies on Dramatiq's built-in Retries middleware (separate from the Postgres-level retry settings), the backoff times will differ from what 1.x produced. Consider capping the constraint while verifying 2.x behavior:

📦 Proposed constraint tightening
-    "dramatiq[redis]>=1.17.0",
+    "dramatiq[redis]>=1.17.0,<3",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/worker/pyproject.toml` at line 10, The dependency line
"dramatiq[redis]>=1.17.0" is too loose and allows Dramatiq 2.x which changes
retry backoff behavior; tighten the constraint in pyproject.toml to avoid
unexpected 2.x behavior (for example change to "dramatiq[redis]>=1.17.0,<2.0" or
pin to a specific 1.x release) or, if you intentionally accept 2.x, update the
Dramatiq Retries middleware configuration (Retries) to adjust min_backoff (halve
existing min_backoff values) to preserve prior timing semantics and then run
tests to verify retry timing.
.env.example (1)

11-17: LGTM — Postgres defaults are clearly scoped to local compose.

MINIO_ROOT_PASSWORD=change-me nicely signals users to update it; consider adding a similar inline comment on POSTGRES_PASSWORD=postgres (e.g. # change in production) since the value itself gives no such hint, unlike the MinIO counterpart.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.env.example around lines 11 - 17, Add a short inline comment to the
POSTGRES_PASSWORD entry to warn users to change the default in production:
update the line with POSTGRES_PASSWORD=postgres to include a note like "# change
in production" (target the POSTGRES_PASSWORD variable in the .env.example so the
default is clearly scoped to local use).
apps/worker/src/five08/worker/migrations/env.py (1)

22-24: URL normalization logic is duplicated in db_migrations.py.

The postgresql://postgresql+psycopg:// substitution is copy-pasted verbatim into db_migrations._sqlalchemy_postgres_url(). Extract it to a single shared utility (e.g., in a db_utils.py module or in db_migrations.py and import it here).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/worker/src/five08/worker/migrations/env.py` around lines 22 - 24, The
postgres URL normalization (replacing "postgresql://" with
"postgresql+psycopg://") is duplicated; extract it to a single helper (e.g.,
normalize_sqlalchemy_postgres_url) and use it from both places: remove the
inline replace in apps/worker/src/five08/worker/migrations/env.py and instead
import and call the new helper, and update
db_migrations._sqlalchemy_postgres_url to call that same helper; ensure the
helper accepts the url string and returns the normalized string and remove the
duplicate logic from both locations.
apps/worker/src/five08/worker/db_migrations.py (1)

16-22: _sqlalchemy_postgres_url() duplicates _get_url() normalization from migrations/env.py.

The same postgresql://postgresql+psycopg:// replacement exists in both files verbatim. Consolidating into one shared helper eliminates a future drift risk (e.g., if asyncpg or another driver needs to be supported).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/worker/src/five08/worker/db_migrations.py` around lines 16 - 22, The
function _sqlalchemy_postgres_url duplicates URL normalization logic found in
migrations/env.py's _get_url; extract the normalization (the "postgresql://"' →
"postgresql+psycopg://" transformation) into a single shared helper (e.g.,
normalize_db_url or get_sqlalchemy_url) placed in a common module used by
migrations/env.py and apps/worker/src/five08/worker/db_migrations.py, then
update both _sqlalchemy_postgres_url and _get_url to call that helper (remove
the inline replace in both locations) so the normalization lives in one place;
ensure imports are added and behavior remains identical.
docker-compose.yml (1)

32-32: Pin minio/minio and minio/mc image tags for reproducible builds.

minio/minio:latest and minio/mc:latest can silently pull breaking changes; the other images (postgres:17-alpine, redis:7-alpine) are already pinned. Use specific digest-pinned or semantic version tags (e.g., minio/minio:RELEASE.2025-...).

Also applies to: 50-50

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docker-compose.yml` at line 32, Replace the floating :latest tags for MinIO
with explicit, pinned tags or digests to ensure reproducible builds: update the
"image: minio/minio:latest" entry (and the related "image: minio/mc:latest"
occurrence) to a specific release tag (e.g., RELEASE.YYYY-MM-DDT...), or use an
immutable sha256 digest (minio/minio@sha256:...) so the service definitions
(image: minio/minio and image: minio/mc) no longer reference :latest and will
always pull the exact same artifact.
packages/shared/src/five08/queue.py (1)

73-75: Every DB call opens a new connection — consider connection pooling.

get_postgres_connection creates a fresh psycopg.connect() connection each time. Every create_job_record, get_job, _mark_job, and is_postgres_healthy call opens and closes its own connection. Under load (especially the actor retry loop and sequential EspoCRM event enqueue), this creates significant connection churn.

psycopg 3 provides psycopg_pool.ConnectionPool which could be initialized once and passed through settings or as a module-level singleton.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/shared/src/five08/queue.py` around lines 73 - 75,
get_postgres_connection currently calls psycopg.connect() on every DB operation
causing heavy connection churn; replace this with a long-lived
psycopg_pool.ConnectionPool created once (either as a module-level singleton or
stored on SharedSettings) and update callers (create_job_record, get_job,
_mark_job, is_postgres_healthy) to acquire connections via pool.connection()
context managers instead of calling get_postgres_connection() directly;
initialize the pool with sensible sizing (min_size/max_size) at startup and
ensure it is properly closed on shutdown so type hints change from Connection to
the pool (or use a helper that returns a pooled connection context) and all
DB-using functions use that pooled context.
apps/worker/src/five08/worker/actors.py (1)

73-87: Consider adding a top-level try/except in _run_job to catch unexpected errors after mark_job_running.

Beyond the unknown-handler case flagged above, any unexpected exception (e.g., from _extract_call_args type errors) raised outside the inner try/except at line 94 will also leave the job stuck in RUNNING. A defensive wrapper around the entire post-mark_job_running block would ensure proper dead/retry marking for all failure modes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/worker/src/five08/worker/actors.py` around lines 73 - 87, After calling
mark_job_running in _run_job, wrap the remainder of the post-mark_job_running
logic in a top-level try/except so any unexpected exception (e.g., from
_extract_call_args or call execution) will be caught and the job transitioned
out of RUNNING (call mark_job_dead(job_id, reason) or mark_job_retry(job_id,
reason) as appropriate) instead of leaving it stuck; locate _run_job, the
existing inner try/except block, and mark_job_running to insert the outer
defensive try/except that logs the error and invokes the appropriate job-state
helper to mark the job dead or retry.
apps/worker/src/five08/worker/api.py (1)

129-138: Sequential per-event enqueue opens a new Postgres connection for each event.

Each iteration calls enqueue_jobcreate_job_recordget_postgres_connection, opening and closing a separate Postgres connection per event. For large EspoCRM payloads this adds latency and connection churn. Not critical for typical webhook sizes, but worth noting for future optimization (e.g., batch insert or connection pooling).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/worker/src/five08/worker/api.py` around lines 129 - 138, The loop
sequentially calls enqueue_job for each event which triggers create_job_record →
get_postgres_connection per event, causing one Postgres connection per event;
refactor to batch the work so a single connection is used: add a new function
(e.g., batch_enqueue_jobs or enqueue_jobs_batch) that accepts a list of event
IDs and other params, opens one connection via get_postgres_connection,
inserts/creates all job records in a single transaction (preserving
idempotency_key per event), returns the created job ids, and replace the
per-event asyncio.to_thread(enqueue_job, ...) calls in the payload.events loop
with a single call to this batch function (or use a connection-pool aware path)
to eliminate per-event connection churn.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/worker/src/five08/worker/actors.py`:
- Around line 89-92: The job can be left stuck RUNNING because
mark_job_running(settings, job_id, worker_name=settings.worker_name) is called
before checking _HANDLERS for the job.type and the subsequent ValueError escapes
the execute_job error handling; fix by resolving the handler lookup (handler =
_HANDLERS.get(job.type)) before calling mark_job_running so unknown types raise
before state change, or alternatively include the handler lookup inside the
execute_job try/except so the existing cleanup path runs for unknown job types
(refer to mark_job_running, _HANDLERS, handler, and execute_job).

In `@apps/worker/src/five08/worker/consumer.py`:
- Around line 14-32: The Worker instantiation is missing the required broker
positional argument and the "Worker burst mode enabled" log is misleading
because Dramatiq has no built-in burst mode; after importing
five08.worker.actors (so actors register and broker is configured) call
dramatiq.get_broker() and pass that broker as the first argument to Worker
(i.e., Worker(broker, queues=queue_set)), and either remove or change the
settings.worker_burst logging in the worker_burst branch (e.g., drop the log or
replace it with a note that burst mode is not supported) to avoid misleading
output.

In `@apps/worker/src/five08/worker/dispatcher.py`:
- Around line 20-23: The subtraction fails when run_at is timezone-naive; in the
dispatcher before computing delay, ensure run_at is normalized to an aware UTC
datetime: check if run_at.tzinfo is None and if so set run_at =
run_at.replace(tzinfo=timezone.utc), otherwise convert with run_at =
run_at.astimezone(timezone.utc); then compute delay = run_at -
datetime.now(tz=timezone.utc) and proceed with the existing
execute_job.send(job_id) logic.

In `@docker-compose.yml`:
- Around line 96-100: The services worker-api and worker-consumer currently
depend on redis and postgres but not on minio-init, so they may start before the
internal-transfers bucket is created; update each service's depends_on block
(the worker-api and worker-consumer service definitions) to include minio-init
with condition: service_completed_successfully (or the project’s equivalent
health/completion condition) so they wait for minio-init to finish before
starting.
- Line 85: Remove the embedded plaintext credentials from the POSTGRES_URL
default; instead either require POSTGRES_URL to be provided (remove the default
fallback) or build it from the existing variables (POSTGRES_USER,
POSTGRES_PASSWORD, POSTGRES_HOST, POSTGRES_DB) so no plaintext password appears
in the compose file — update the POSTGRES_URL environment entry (the
POSTGRES_URL key) accordingly and ensure any dependent services use the new
assembled variables.
- Around line 21-22: The ports mapping for Postgres currently exposes to all
interfaces via "${POSTGRES_PORT:-5432}:5432"; change it to include a host bind
prefix (e.g. "${POSTGRES_HOST_BIND:-127.0.0.1}:${POSTGRES_PORT:-5432}:5432") so
the container binds to the loopback by default; update the ports entry that
references POSTGRES_PORT and add a new POSTGRES_HOST_BIND env variable (default
127.0.0.1) to match the loopback-friendly pattern used for MINIO_HOST_BIND.

In `@packages/shared/src/five08/queue.py`:
- Around line 95-100: The _parse_status function currently maps any unknown DB
status to JobStatus.FAILED silently; update _parse_status to emit a warning when
ValueError is caught, logging the unexpected value and context (e.g., "Unknown
job status from DB: {value}") before returning JobStatus.FAILED; reference the
_parse_status function and the JobStatus enum and use the module logger (or add
a logger via logging.getLogger(__name__)) so unexpected/migrated DB values are
recorded for debugging.
- Around line 125-187: The insert in create_job_record passes a plain Python
dict for payload which psycopg3 won't adapt to a PostgreSQL JSON/JSONB column;
wrap the payload with psycopg.types.json.Json (or Jsonb if the column is JSONB)
before calling cursor.execute so the DB driver serializes it correctly — e.g.,
convert payload to Json/Jsonb when building the parameters for the INSERT in
create_job_record (the tuple passed to cursor.execute) and keep the rest of the
logic (idempotency handling, SELECT on idempotency_key) unchanged.

In `@packages/shared/src/five08/settings.py`:
- Line 16: The defaults for secret-containing settings are hardcoded; change the
declarations for postgres_url and minio_root_password (and any other secrets in
this module) to have no secret defaults by using None (or Field(default=None))
and require them to be supplied from environment variables (e.g., via pydantic
BaseSettings behavior) so secrets are not stored in source; update any
validation or initialization in the Settings class that uses
postgres_url/minio_root_password to raise a clear error if they remain unset at
runtime.

In `@README.md`:
- Around line 100-103: Update the README's environment variable defaults to
match docker-compose and the POSTGRES_URL: change POSTGRES_DB to default
"workflows", POSTGRES_USER to default "postgres", and POSTGRES_PASSWORD to
default "postgres" so they align with the existing POSTGRES_URL
(postgresql://postgres:postgres@postgres:5432/workflows) and docker-compose.yml;
verify the entries for POSTGRES_DB, POSTGRES_USER, POSTGRES_PASSWORD (and the
POSTGRES_URL note) are consistent and correct in the README.

---

Outside diff comments:
In `@apps/worker/src/five08/worker/api.py`:
- Around line 166-172: The current enqueue_job call uses a permanent
idempotency_key=f"manual:{contact_id}" which prevents re-triggering the same
contact; change the idempotency key construction in the enqueue_job invocation
(where enqueue_job, process_contact_skills_job and contact_id are used) to
include a distinguishing component such as a timestamp or a caller-supplied
nonce (e.g., f"manual:{contact_id}:{nonce_or_timestamp}") or expose an optional
parameter to disable/override idempotency so manual re-runs create new jobs when
desired; if permanent deduplication per contact is intentional, leave as-is.

---

Nitpick comments:
In @.env.example:
- Around line 11-17: Add a short inline comment to the POSTGRES_PASSWORD entry
to warn users to change the default in production: update the line with
POSTGRES_PASSWORD=postgres to include a note like "# change in production"
(target the POSTGRES_PASSWORD variable in the .env.example so the default is
clearly scoped to local use).

In `@apps/worker/pyproject.toml`:
- Line 10: The dependency line "dramatiq[redis]>=1.17.0" is too loose and allows
Dramatiq 2.x which changes retry backoff behavior; tighten the constraint in
pyproject.toml to avoid unexpected 2.x behavior (for example change to
"dramatiq[redis]>=1.17.0,<2.0" or pin to a specific 1.x release) or, if you
intentionally accept 2.x, update the Dramatiq Retries middleware configuration
(Retries) to adjust min_backoff (halve existing min_backoff values) to preserve
prior timing semantics and then run tests to verify retry timing.

In `@apps/worker/src/five08/worker/actors.py`:
- Around line 73-87: After calling mark_job_running in _run_job, wrap the
remainder of the post-mark_job_running logic in a top-level try/except so any
unexpected exception (e.g., from _extract_call_args or call execution) will be
caught and the job transitioned out of RUNNING (call mark_job_dead(job_id,
reason) or mark_job_retry(job_id, reason) as appropriate) instead of leaving it
stuck; locate _run_job, the existing inner try/except block, and
mark_job_running to insert the outer defensive try/except that logs the error
and invokes the appropriate job-state helper to mark the job dead or retry.

In `@apps/worker/src/five08/worker/api.py`:
- Around line 129-138: The loop sequentially calls enqueue_job for each event
which triggers create_job_record → get_postgres_connection per event, causing
one Postgres connection per event; refactor to batch the work so a single
connection is used: add a new function (e.g., batch_enqueue_jobs or
enqueue_jobs_batch) that accepts a list of event IDs and other params, opens one
connection via get_postgres_connection, inserts/creates all job records in a
single transaction (preserving idempotency_key per event), returns the created
job ids, and replace the per-event asyncio.to_thread(enqueue_job, ...) calls in
the payload.events loop with a single call to this batch function (or use a
connection-pool aware path) to eliminate per-event connection churn.

In `@apps/worker/src/five08/worker/db_migrations.py`:
- Around line 16-22: The function _sqlalchemy_postgres_url duplicates URL
normalization logic found in migrations/env.py's _get_url; extract the
normalization (the "postgresql://"' → "postgresql+psycopg://" transformation)
into a single shared helper (e.g., normalize_db_url or get_sqlalchemy_url)
placed in a common module used by migrations/env.py and
apps/worker/src/five08/worker/db_migrations.py, then update both
_sqlalchemy_postgres_url and _get_url to call that helper (remove the inline
replace in both locations) so the normalization lives in one place; ensure
imports are added and behavior remains identical.

In `@apps/worker/src/five08/worker/migrations/env.py`:
- Around line 22-24: The postgres URL normalization (replacing "postgresql://"
with "postgresql+psycopg://") is duplicated; extract it to a single helper
(e.g., normalize_sqlalchemy_postgres_url) and use it from both places: remove
the inline replace in apps/worker/src/five08/worker/migrations/env.py and
instead import and call the new helper, and update
db_migrations._sqlalchemy_postgres_url to call that same helper; ensure the
helper accepts the url string and returns the normalized string and remove the
duplicate logic from both locations.

In
`@apps/worker/src/five08/worker/migrations/versions/20260221_0100_create_jobs_table.py`:
- Around line 42-47: The migration defines jobs.updated_at with
server_default=sa.text("NOW()") but no auto-update on UPDATE; add a DB trigger
to enforce auto-updating. In upgrade(), create a SQL function (e.g.
set_updated_at or jobs_set_updated_at_fn) that sets NEW.updated_at = NOW() and
returns NEW, then create a BEFORE UPDATE trigger on the jobs table (e.g.
jobs_set_updated_at_tr) to EXECUTE that function FOR EACH ROW; in downgrade(),
DROP the trigger from jobs then DROP the function to fully revert the change.
Ensure the function/trigger names match between upgrade() and downgrade() so
Alembic migrations run cleanly.
- Around line 30-31: The migration defines sa.Column("attempts", sa.Integer(),
nullable=False) and sa.Column("max_attempts", sa.Integer(), nullable=False)
without server defaults, forcing every INSERT to provide both; update the Column
definitions for attempts (and optionally max_attempts) to include a
server_default=sa.text("0") so new rows will default to 0 when omitted (modify
the sa.Column calls for "attempts" and "max_attempts" in the migration).

In `@docker-compose.yml`:
- Line 32: Replace the floating :latest tags for MinIO with explicit, pinned
tags or digests to ensure reproducible builds: update the "image:
minio/minio:latest" entry (and the related "image: minio/mc:latest" occurrence)
to a specific release tag (e.g., RELEASE.YYYY-MM-DDT...), or use an immutable
sha256 digest (minio/minio@sha256:...) so the service definitions (image:
minio/minio and image: minio/mc) no longer reference :latest and will always
pull the exact same artifact.

In `@packages/shared/src/five08/queue.py`:
- Around line 73-75: get_postgres_connection currently calls psycopg.connect()
on every DB operation causing heavy connection churn; replace this with a
long-lived psycopg_pool.ConnectionPool created once (either as a module-level
singleton or stored on SharedSettings) and update callers (create_job_record,
get_job, _mark_job, is_postgres_healthy) to acquire connections via
pool.connection() context managers instead of calling get_postgres_connection()
directly; initialize the pool with sensible sizing (min_size/max_size) at
startup and ensure it is properly closed on shutdown so type hints change from
Connection to the pool (or use a helper that returns a pooled connection
context) and all DB-using functions use that pooled context.

Comment thread apps/worker/src/five08/worker/actors.py Outdated
Comment thread apps/worker/src/five08/worker/consumer.py
Comment thread apps/worker/src/five08/worker/dispatcher.py
Comment thread docker-compose.yml
Comment on lines +21 to +22
ports:
- "${POSTGRES_PORT:-5432}:5432"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Postgres port exposes to all interfaces by default — consider loopback binding.

"${POSTGRES_PORT:-5432}:5432" has no host prefix and defaults to 0.0.0.0, unlike MinIO which explicitly defaults to 127.0.0.1 via ${MINIO_HOST_BIND:-127.0.0.1}. The PR description mentions "loopback-friendly compose bindings", but this was missed for Postgres.

🔒 Proposed fix
-      - "${POSTGRES_PORT:-5432}:5432"
+      - "${POSTGRES_HOST_BIND:-127.0.0.1}:${POSTGRES_PORT:-5432}:5432"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
ports:
- "${POSTGRES_PORT:-5432}:5432"
ports:
- "${POSTGRES_HOST_BIND:-127.0.0.1}:${POSTGRES_PORT:-5432}:5432"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docker-compose.yml` around lines 21 - 22, The ports mapping for Postgres
currently exposes to all interfaces via "${POSTGRES_PORT:-5432}:5432"; change it
to include a host bind prefix (e.g.
"${POSTGRES_HOST_BIND:-127.0.0.1}:${POSTGRES_PORT:-5432}:5432") so the container
binds to the loopback by default; update the ports entry that references
POSTGRES_PORT and add a new POSTGRES_HOST_BIND env variable (default 127.0.0.1)
to match the loopback-friendly pattern used for MINIO_HOST_BIND.

Comment thread docker-compose.yml
environment:
REDIS_URL: ${REDIS_URL:-redis://redis:6379/0}
REDIS_QUEUE_NAME: ${REDIS_QUEUE_NAME:-jobs.default}
POSTGRES_URL: ${POSTGRES_URL:-postgresql://postgres:postgres@postgres:5432/workflows}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Embedded credentials in POSTGRES_URL default value flagged by static analysis (CKV_SECRET_4).

The inline fallback postgresql://postgres:postgres@postgres:5432/workflows bakes the password into the Compose file. Prefer assembling the URL from individual credential variables with no plaintext password default, or require POSTGRES_URL to be set explicitly (no default).

🔒 Proposed fix
-      POSTGRES_URL: ${POSTGRES_URL:-postgresql://postgres:postgres@postgres:5432/workflows}
+      POSTGRES_URL: ${POSTGRES_URL:?POSTGRES_URL must be set}

Alternatively, construct the URL from POSTGRES_USER, POSTGRES_PASSWORD, and POSTGRES_HOST which are already separately defined on the postgres service.

Also applies to: 115-115

🧰 Tools
🪛 Checkov (3.2.334)

[medium] 85-86: Basic Auth Credentials

(CKV_SECRET_4)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docker-compose.yml` at line 85, Remove the embedded plaintext credentials
from the POSTGRES_URL default; instead either require POSTGRES_URL to be
provided (remove the default fallback) or build it from the existing variables
(POSTGRES_USER, POSTGRES_PASSWORD, POSTGRES_HOST, POSTGRES_DB) so no plaintext
password appears in the compose file — update the POSTGRES_URL environment entry
(the POSTGRES_URL key) accordingly and ensure any dependent services use the new
assembled variables.

Comment thread docker-compose.yml
Comment on lines 96 to +100
depends_on:
redis:
condition: service_healthy
postgres:
condition: service_healthy
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

worker-api and worker-consumer should wait for minio-init to complete.

Neither service lists minio-init in depends_on, so the internal-transfers bucket may not exist when they start processing jobs that use MinIO.

🔒 Proposed fix (apply to both services)
     depends_on:
+      minio-init:
+        condition: service_completed_successfully
       redis:
         condition: service_healthy
       postgres:
         condition: service_healthy

Also applies to: 124-130

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docker-compose.yml` around lines 96 - 100, The services worker-api and
worker-consumer currently depend on redis and postgres but not on minio-init, so
they may start before the internal-transfers bucket is created; update each
service's depends_on block (the worker-api and worker-consumer service
definitions) to include minio-init with condition:
service_completed_successfully (or the project’s equivalent health/completion
condition) so they wait for minio-init to finish before starting.

Comment thread packages/shared/src/five08/queue.py
Comment thread packages/shared/src/five08/queue.py
redis_key_prefix: str = "jobs"
redis_socket_connect_timeout: float | None = 5.0
redis_socket_timeout: float | None = 5.0
postgres_url: str = "postgresql://postgres:postgres@postgres:5432/workflows"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Hardcoded credential defaults violate the secrets-in-env-vars guideline.

postgres_url embeds postgres:postgres in the URL default, and minio_root_password is hardcoded as "change-me". Both are secrets stored in source. Setting these to None (or using Field(default=None)) and requiring them to be supplied via environment variables prevents accidental use of insecure defaults in any environment.

🔒 Proposed fix
-    postgres_url: str = "postgresql://postgres:postgres@postgres:5432/workflows"
+    postgres_url: str  # Required: set POSTGRES_URL
...
-    minio_root_password: str = "change-me"
+    minio_root_password: str  # Required: set MINIO_ROOT_PASSWORD

As per coding guidelines: "Secrets must be stored in environment variables, never hardcoded in source files."

Also applies to: 23-24

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/shared/src/five08/settings.py` at line 16, The defaults for
secret-containing settings are hardcoded; change the declarations for
postgres_url and minio_root_password (and any other secrets in this module) to
have no secret defaults by using None (or Field(default=None)) and require them
to be supplied from environment variables (e.g., via pydantic BaseSettings
behavior) so secrets are not stored in source; update any validation or
initialization in the Settings class that uses postgres_url/minio_root_password
to raise a clear error if they remain unset at runtime.

Comment thread README.md Outdated
Comment thread apps/worker/src/five08/worker/jobs.py Outdated
Comment thread apps/worker/src/five08/worker/actors.py Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
tests/integration/test_email_monitor.py (1)

15-27: ⚠️ Potential issue | 🟠 Major

task_poll_inbox is replaced by AsyncMock, making both IMAP tests vacuous

Line 23 assigns monitor.task_poll_inbox = AsyncMock() on the instance. As a result, every call to await email_monitor.task_poll_inbox() in the two IMAP tests invokes the mock — the real task_poll_inbox implementation (which contains the imaplib.IMAP4_SSL call) is never executed.

Concretely in test_poll_inbox_handles_imap_errors:

  • The patch("imaplib.IMAP4_SSL", side_effect=imaplib.IMAP4.error(...)) context manager on lines 39–41 is dead code.
  • The direct await email_monitor.task_poll_inbox() on line 43 always succeeds trivially.
  • The test provides false confidence that IMAP errors are handled gracefully.

The fix is to not replace task_poll_inbox in the fixture (or create a separate fixture that doesn't replace it), and call the real coroutine directly:

🛠️ Suggested fixture split
 `@pytest.fixture`
 def email_monitor(self, mock_bot):
     """Create an EmailMonitor instance for testing."""
     with patch.object(
         EmailMonitor, "__init__", lambda self, bot: setattr(self, "bot", bot)
     ):
         monitor = EmailMonitor(mock_bot)
-        monitor.task_poll_inbox = AsyncMock()
-        monitor.task_poll_inbox.start = Mock()
-        monitor.task_poll_inbox.cancel = Mock()
-        monitor.task_poll_inbox.is_running = Mock(return_value=False)
+        # Keep task_poll_inbox as real method so IMAP tests exercise real logic.
+        # Provide the loop-control helpers as mocks so background scheduling doesn't start.
+        monitor.task_poll_inbox = Mock(
+            start=Mock(),
+            cancel=Mock(),
+            is_running=Mock(return_value=False),
+        )
         return monitor

+@pytest.fixture
+def email_monitor_real_poll(self, mock_bot):
+    """EmailMonitor with the real task_poll_inbox coroutine for IMAP tests."""
+    with patch.object(
+        EmailMonitor, "__init__", lambda self, bot: setattr(self, "bot", bot)
+    ):
+        monitor = EmailMonitor(mock_bot)
+        return monitor

Then in the IMAP error tests, use email_monitor_real_poll and call the underlying coroutine function directly (e.g. await EmailMonitor.task_poll_inbox.coro(monitor) or await monitor.task_poll_inbox.__wrapped__()).

Also applies to: 39-43

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration/test_email_monitor.py` around lines 15 - 27, The fixture
currently replaces EmailMonitor.task_poll_inbox with an AsyncMock, so
IMAP-related tests never execute the real coroutine; either stop overwriting
task_poll_inbox in the email_monitor fixture or add a separate fixture (e.g.,
email_monitor_real_poll) that leaves the real coroutine in place; update the
IMAP tests (like test_poll_inbox_handles_imap_errors) to use that real monitor
and invoke the underlying coroutine directly (for example await
EmailMonitor.task_poll_inbox.__wrapped__(monitor) or await
EmailMonitor.task_poll_inbox.coro(monitor)) so the imaplib.IMAP4_SSL patch is
exercised.
docker-compose.yml (1)

59-73: ⚠️ Potential issue | 🟠 Major

Remove unnecessary postgres dependency from bot service

The bot service declares depends_on: postgres but the bot code has no database connections—it only integrates with Discord, EspoCRM, Kimai, and email. The environment variables only include REDIS_URL and REDIS_QUEUE_NAME. Remove the postgres dependency to avoid coupling bot startup to an unused service.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docker-compose.yml` around lines 59 - 73, The bot service's docker-compose
depends_on includes an unnecessary postgres entry; edit the bot service block
(service name "bot") in docker-compose.yml and remove the postgres dependency
and its nested condition (the "postgres:" and "condition: service_healthy"
lines) so depends_on only references redis (or remove depends_on entirely if
only redis remains), making sure YAML indentation stays valid.
apps/worker/src/five08/worker/api.py (1)

128-154: ⚠️ Potential issue | 🟠 Major

Sequential per-event enqueue can stall the response for large payloads.

Each iteration awaits a DB insert + queue push via asyncio.to_thread. For a batch of N events this is O(N) blocking round-trips before the 202 is returned. Per the coding guideline, ingest endpoints should "return HTTP 202 status quickly without blocking on long processing."

Consider either:

  • Batching the DB inserts in a single transaction and then enqueuing, or
  • Enqueuing the entire list as a single job and fanning out inside the consumer.
Sketch: batch insert + batch enqueue
     queue = request.app[QUEUE_KEY]
-    jobs: list[dict[str, str]] = []
-    for event in payload.events:
-        job = await asyncio.to_thread(
-            enqueue_job,
-            queue=queue,
-            fn=process_contact_skills_job,
-            args=(event.id,),
-            settings=settings,
-            idempotency_key=f"espocrm:{event.id}",
-        )
-        jobs.append({"contact_id": event.id, "job_id": job.id})
+    jobs = await asyncio.to_thread(
+        _enqueue_espocrm_batch, queue, payload.events, settings
+    )

Where _enqueue_espocrm_batch creates all records in one transaction and then enqueues only the newly-created ones.

As per coding guidelines, "Worker ingest endpoints must validate input, enqueue jobs, and return HTTP 202 status quickly without blocking on long processing".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/worker/src/five08/worker/api.py` around lines 128 - 154, The current
loop awaits asyncio.to_thread(enqueue_job, ...) for each payload.events entry
causing sequential blocking; instead, stop awaiting per-event enqueues and kick
off a background batch worker: implement a helper (e.g., _enqueue_espocrm_batch)
that accepts the list of event IDs plus queue and settings, does a single DB
batch insert/transaction and enqueues jobs (or enqueues one batch job that fans
out), and from the request handler call
asyncio.create_task(_enqueue_espocrm_batch(queue, [e.id for e in
payload.events], settings)) then immediately return the 202 JSON; remove the
per-event await of enqueue_job and the jobs-building loop from the critical path
so the handler returns quickly while background processing handles DB inserts
and enqueue_job/process_contact_skills_job scheduling.
🧹 Nitpick comments (10)
tests/integration/test_email_monitor.py (1)

46-62: Inconsistent exception-handling pattern vs the updated sibling test

test_poll_inbox_handles_imap_errors (line 43) was updated to a bare await (correctly asserting no exception is raised), but this test still wraps the call in a try/except Exception: pass that silently swallows all errors. Once the fixture issue above is fixed and these tests exercise the real implementation, the two tests should settle on a consistent contract:

  • If the method is expected never to raise → bare await, no try/except.
  • If it may raise for parsing errors and that's acceptable → document this with pytest.raises or an explicit comment tying the behaviour to a spec.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration/test_email_monitor.py` around lines 46 - 62, Update the
test_poll_inbox_handles_email_parsing_errors test to follow the same exception
contract as its sibling test_poll_inbox_handles_imap_errors: remove the
try/except that swallows all exceptions and either (A) call await
email_monitor.task_poll_inbox() directly (if the method is expected never to
raise) or (B) explicitly assert the acceptable parsing failure using
pytest.raises and a short comment (if parsing errors are allowed); reference the
test function test_poll_inbox_handles_email_parsing_errors and the method
task_poll_inbox when making the change.
apps/worker/Dockerfile (1)

1-1: Consider adding a comment noting which tag this digest corresponds to.

Pinning by digest is great for reproducibility, but without a comment like # python3.12-bookworm-slim as of YYYY-MM-DD, maintainers won't know which upstream version is in use or when it's time to bump.

✏️ Suggested change
-FROM ghcr.io/astral-sh/uv@sha256:e5b65587bce7de595f299855d7385fe7fca39b8a74baa261ba1b7147afa78e58
+# python3.12-bookworm-slim as of 2026-02-21
+FROM ghcr.io/astral-sh/uv@sha256:e5b65587bce7de595f299855d7385fe7fca39b8a74baa261ba1b7147afa78e58
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/worker/Dockerfile` at line 1, Add a one-line comment above the FROM line
in the Dockerfile that documents the human-readable tag and date for the pinned
digest (e.g., the upstream image tag like "python3.12-bookworm-slim" and the
date verified), so maintainers can see which tag the digest corresponds to and
when it should be reviewed; update the comment whenever you intentionally bump
the digest referenced in the existing FROM
ghcr.io/astral-sh/uv@sha256:e5b65587bce7de595f299855d7385fe7fca39b8a74baa261ba1b7147afa78e58
entry.
apps/worker/src/five08/worker/migrations/versions/20260221_0100_create_jobs_table.py (1)

19-21: Consider adding a server_default for the UUID primary key as defense-in-depth.

If the application layer ever omits the id, the INSERT will fail with a NOT NULL violation rather than auto-generating a UUID. Adding server_default=sa.text("gen_random_uuid()") provides a safety net without changing application behavior.

✏️ Suggested change
         sa.Column(
-            "id", postgresql.UUID(as_uuid=True), nullable=False, primary_key=True
+            "id", postgresql.UUID(as_uuid=True), nullable=False, primary_key=True,
+            server_default=sa.text("gen_random_uuid()"),
         ),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/worker/src/five08/worker/migrations/versions/20260221_0100_create_jobs_table.py`
around lines 19 - 21, The UUID primary key column definition for "id" currently
lacks a server-side default; update the sa.Column("id",
postgresql.UUID(as_uuid=True), nullable=False, primary_key=True) definition to
include server_default=sa.text("gen_random_uuid()") so the DB will generate a
UUID if the application omits the id on INSERT; modify the column in the
migration where the "id" column is declared to add that server_default while
keeping nullable=False and primary_key=True.
apps/worker/src/five08/worker/crm/processor.py (1)

109-113: Inconsistent case-folding: .lower() here vs .casefold() in _extract_from_attachments.

Line 109 uses .lower() for the "existing skills" comparison, while line 179 uses .casefold() for deduplication. For most Latin text this won't matter, but casefold() handles certain Unicode cases differently (e.g., German ßss). Consider using casefold() consistently.

✏️ Suggested change
-            existing_lower = {item.lower() for item in existing_skills}
+            existing_lower = {item.casefold() for item in existing_skills}
             new_skills = [
                 skill
                 for skill in extracted.skills
-                if skill.lower() not in existing_lower
+                if skill.casefold() not in existing_lower
             ]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/worker/src/five08/worker/crm/processor.py` around lines 109 - 113, The
comparison uses .lower() when building existing_lower and .casefold() elsewhere
(_extract_from_attachments), causing inconsistent Unicode case handling; update
the existing_lower construction and any other .lower() usages in this
deduplication path (the existing_lower set creation and the new_skills
membership check in processor.py where extracted.skills is iterated) to use
.casefold() so both sides normalize text with the same Unicode-aware method.
apps/discord_bot/Dockerfile (1)

1-1: Same as the worker Dockerfile — add a comment noting the original tag for this pinned digest.

See the corresponding comment on apps/worker/Dockerfile.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/discord_bot/Dockerfile` at line 1, Add a one-line comment above the
existing FROM instruction that records the original image tag that was used to
produce the pinned sha256 digest (mirroring the change in the worker
Dockerfile); locate the FROM line (FROM ghcr.io/astral-sh/uv@sha256:...) in the
Dockerfile and insert a comment describing the corresponding original tag (e.g.,
the tag name or release) and any context so readers know which tag this digest
was pinned from.
apps/worker/pyproject.toml (1)

10-10: dramatiq[redis]==2.0.1 uses exact pinning — consider relaxing or documenting the reason.

All other dependencies use >= or ~= for range-based pinning (e.g., aiohttp>=3.13.1, pydantic~=2.10), but dramatiq is pinned to exactly 2.0.1. While 2.0.1 is currently the latest version available (as of Feb 2026), exact pins will prevent receiving bugfix and security patches once newer versions are released. If this is deliberate (e.g., to avoid breaking changes), add a comment explaining the constraint. Otherwise, consider relaxing to ~=2.0 to allow patch-level updates.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/worker/pyproject.toml` at line 10, The dependency pin
"dramatiq[redis]==2.0.1" in pyproject.toml is exact-pinned; either relax it to
allow safe patch updates (e.g., use "~=2.0" or ">=2.0.1,<3.0") or add an inline
comment explaining why the exact pin is required; update the line referencing
dramatiq[redis] accordingly and ensure any CI/docs note the intentional
constraint if you keep "==2.0.1".
packages/shared/src/five08/queue.py (2)

77-90: Every DB operation opens and closes a fresh connection — consider a connection pool.

get_postgres_connection is called from create_job_record, get_job, and every _mark_job variant. Under load, a single job lifecycle (create → running → succeeded) opens three separate TCP connections. A psycopg_pool.ConnectionPool (from the companion psycopg_pool package) would amortise that cost and is straightforward to integrate.

Not blocking, but worth planning for before scaling up.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/shared/src/five08/queue.py` around lines 77 - 90, Replace the
per-call connect() usage with a shared connection pool: introduce a module-level
psycopg_pool.ConnectionPool (create it from SharedSettings.postgres_url) and
update get_postgres_connection to return a pooled connection (or provide a small
wrapper like acquire_postgres_connection that yields pool.connection()). Update
callers (create_job_record, get_job, and the _mark_job* functions) to use the
pooled connection context instead of opening fresh TCP connections, and ensure
is_postgres_healthy uses the pool to acquire a connection for the health check;
also add safe pool teardown or lazy init tied to application lifecycle so tests
and startup work correctly.

282-300: mark_job_retry sets status to FAILED — consider documenting the state machine.

A reader might expect a retry to leave the job in QUEUED or a dedicated RETRYING state. Using FAILED for "most recent attempt failed, will retry" is a valid choice, but the distinction between a retryable FAILED (attempts < max_attempts) and a terminal DEAD job is implicit. A brief docstring or comment on the state transitions would help future maintainers.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/shared/src/five08/queue.py` around lines 282 - 300, Update the
mark_job_retry function docstring to explicitly document the state machine:
explain that mark_job_retry calls _mark_job with status=JobStatus.FAILED to
indicate a retryable failure (attempts < max attempts) rather than a terminal
dead job, and note that terminal jobs are represented by JobStatus.DEAD; include
a short note about when callers should use mark_job_retry vs the code path that
sets JobStatus.DEAD and reference the involved symbols (_mark_job,
mark_job_retry, JobStatus.FAILED, JobStatus.DEAD) so future maintainers can find
the lifecycle logic.
apps/worker/src/five08/worker/api.py (2)

49-67: Health check opens a new Postgres connection on every call.

is_postgres_healthy calls get_postgres_connection, which creates a fresh connection each time. Under frequent polling (e.g., Kubernetes liveness probes), this can create unnecessary connection churn. Consider caching or using a connection pool for health checks in the future.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/worker/src/five08/worker/api.py` around lines 49 - 67, Health check
currently opens a new Postgres connection on every call via
is_postgres_healthy/get_postgres_connection; change health_handler to use a
shared pool/connection stored on the app (e.g., app[POSTGRES_POOL_KEY] created
at startup) instead of creating a fresh connection per request: update startup
to create and store an asyncpg pool (or persistent connection), then have
health_handler call a lightweight check using that pool (e.g., with pool.acquire
or pool.fetchval) or call an adjusted is_postgres_healthy that accepts the pool;
ensure teardown closes the pool on shutdown.

157-190: Manual idempotency key uses wall-clock timestamp — not truly unique.

manual:<contact_id>:<isoformat> relies on datetime.now() precision. Two rapid requests for the same contact_id could collide within the same microsecond (unlikely, but possible under high concurrency). Since this is a manual endpoint the practical risk is very low; a short uuid4() suffix would eliminate it entirely if you ever need stronger guarantees.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/worker/src/five08/worker/api.py` around lines 157 - 190,
process_contact_handler builds an idempotency_key using manual_nonce =
datetime.now(...).isoformat() which can collide under high concurrency; change
the manual_nonce (or the idempotency component) to include a short uuid4 suffix
(e.g. append uuid.uuid4() or uuid.uuid4().hex) when calling enqueue_job so
idempotency_key uses f"manual:{contact_id}:{manual_nonce}:{uuid4}" (and add the
uuid import) to guarantee uniqueness without removing the timestamp.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.env.example:
- Line 15: The POSTGRES_PASSWORD line contains an inline comment that will be
parsed into the value; update the .env.example so the value is clean by either
removing the inline comment and placing the note on its own line as a standalone
comment (e.g. "# change in production") or by quoting the value exactly as
POSTGRES_PASSWORD="postgres" and moving the guidance to a separate comment;
target the POSTGRES_PASSWORD entry to apply this change.

In `@apps/discord_bot/src/five08/__init__.py`:
- Around line 1-5: Delete the explicit package initializers causing
duplicate-module errors: remove the __init__.py files from each five08 package
root (the files in apps/discord_bot/src/five08/, apps/worker/src/five08/, and
packages/shared/src/five08/) to convert five08 into an implicit namespace
package (PEP 420); if you cannot remove them, instead add mypy settings to
enable namespace packages (set namespace_packages = True, explicit_package_bases
= True and configure mypy_path to include
apps/discord_bot/src:apps/worker/src:packages/shared/src) so mypy will not
report duplicate module errors.

In `@apps/worker/src/five08/worker/crm/skills_extractor.py`:
- Around line 101-105: The heuristic regex in token_matches
(r"\b[a-z][a-z0-9+#\-.]{2,24}\b") excludes 2-letter skills like "go", so update
the pattern used when building token_matches in skills_extractor.py (the code
that feeds detected) to allow 2-character tokens (e.g., change the quantifier to
{1,24} or otherwise permit a single additional character) or add an explicit
check to include tokens of length 2 that appear in COMMON_SKILLS; ensure the
change is applied where token_matches is defined so detected can correctly
contain "go".

In `@README.md`:
- Line 117: Update the README note to clarify that MINIO_ACCESS_KEY and
MINIO_SECRET_KEY are code-level `@property` aliases on SharedSettings and not
environment-configurable variables: explain that pydantic-settings only loads
env values into attributes declared with Field(...) and therefore setting
MINIO_ACCESS_KEY or MINIO_SECRET_KEY in the environment will be ignored, and
instruct users to use MINIO_ROOT_USER and MINIO_ROOT_PASSWORD as the actual env
vars; reference the SharedSettings class and the alias property names
(MINIO_ACCESS_KEY, MINIO_SECRET_KEY) and the real env vars (MINIO_ROOT_USER,
MINIO_ROOT_PASSWORD) in the wording.

---

Outside diff comments:
In `@apps/worker/src/five08/worker/api.py`:
- Around line 128-154: The current loop awaits asyncio.to_thread(enqueue_job,
...) for each payload.events entry causing sequential blocking; instead, stop
awaiting per-event enqueues and kick off a background batch worker: implement a
helper (e.g., _enqueue_espocrm_batch) that accepts the list of event IDs plus
queue and settings, does a single DB batch insert/transaction and enqueues jobs
(or enqueues one batch job that fans out), and from the request handler call
asyncio.create_task(_enqueue_espocrm_batch(queue, [e.id for e in
payload.events], settings)) then immediately return the 202 JSON; remove the
per-event await of enqueue_job and the jobs-building loop from the critical path
so the handler returns quickly while background processing handles DB inserts
and enqueue_job/process_contact_skills_job scheduling.

In `@docker-compose.yml`:
- Around line 59-73: The bot service's docker-compose depends_on includes an
unnecessary postgres entry; edit the bot service block (service name "bot") in
docker-compose.yml and remove the postgres dependency and its nested condition
(the "postgres:" and "condition: service_healthy" lines) so depends_on only
references redis (or remove depends_on entirely if only redis remains), making
sure YAML indentation stays valid.

In `@tests/integration/test_email_monitor.py`:
- Around line 15-27: The fixture currently replaces EmailMonitor.task_poll_inbox
with an AsyncMock, so IMAP-related tests never execute the real coroutine;
either stop overwriting task_poll_inbox in the email_monitor fixture or add a
separate fixture (e.g., email_monitor_real_poll) that leaves the real coroutine
in place; update the IMAP tests (like test_poll_inbox_handles_imap_errors) to
use that real monitor and invoke the underlying coroutine directly (for example
await EmailMonitor.task_poll_inbox.__wrapped__(monitor) or await
EmailMonitor.task_poll_inbox.coro(monitor)) so the imaplib.IMAP4_SSL patch is
exercised.

---

Duplicate comments:
In `@apps/worker/src/five08/__init__.py`:
- Around line 1-5: This __init__.py is creating an explicit namespace package
via extend_path/__path__ and causing mypy "Duplicate module" errors; remove this
file (delete the five08/__init__.py that contains extend_path and __path__ =
extend_path(__path__, __name__)) so the package becomes an implicit namespace
package, and ensure the corresponding other five08/__init__.py is also removed
or aligned with implicit namespaces and your packaging config (pyproject/setup)
is updated accordingly.

In `@apps/worker/src/five08/worker/actors.py`:
- Around line 89-91: The handler lookup using _HANDLERS.get(job.type) can raise
a ValueError path that leaves the job permanently QUEUED/FAILED; instead, if no
handler exists you must mark the job DEAD and return before raising or
proceeding. In the execute_job flow (where mark_job_running is called earlier),
replace the current raise ValueError path by detecting handler is None, calling
the existing job-death helper (e.g., mark_job_dead or jobs.mark_job_dead) with
the job and a brief reason like "unknown job type", ensure any DB/state update
happens before returning so the message is terminal, and avoid re-raising the
exception; keep references to _HANDLERS, execute_job, and mark_job_running to
locate the change.

In `@docker-compose.yml`:
- Around line 96-100: The worker services (worker-api and worker-consumer) lack
a depends_on entry for minio-init, so they may start before MinIO buckets are
created; update the depends_on block for both service definitions (worker-api
and worker-consumer) to include minio-init with condition:
service_completed_successfully (or service_healthy if your compose version uses
healthchecks), ensuring minio-init runs to completion before those workers
start.
- Around line 21-22: The Postgres port mapping currently exposes the container
on all interfaces via the ports entry and POSTGRES_PORT; change it to include an
explicit host bind like MINIO's pattern so it defaults to loopback. Update the
ports mapping for Postgres (the ports key that references
${POSTGRES_PORT:-5432}) to include a host bind prefix using a new or existing
env var (e.g., POSTGRES_HOST_BIND) and default to 127.0.0.1 so the mapping reads
the host bind then port then container port; make sure to reference
POSTGRES_PORT and the ports key when applying the change.
- Line 85: The POSTGRES_URL default embeds plaintext credentials
("postgres:postgres"), so remove hardcoded credentials and make the URL
reference separate environment variables or secrets instead; update the
POSTGRES_URL definition to build from POSTGRES_USER and POSTGRES_PASSWORD (or
omit credentials and rely on service-level auth/secrets) and ensure any
duplicate occurrences (e.g., the second POSTGRES_URL usage) are changed
consistently, and add POSTGRES_USER/POSTGRES_PASSWORD (or secret) entries where
the compose file sets DB credentials.

In `@packages/shared/src/five08/queue.py`:
- Around line 130-192: create_job_record already wraps payload with
Jsonb(payload) and correctly implements idempotent insert (ON CONFLICT
(idempotency_key) DO NOTHING) with a fallback SELECT to load the existing job;
no code change required—leave function create_job_record, the Jsonb(payload)
use, and the SELECT fallback as-is since existing is fetched before the
cursor/connection close.
- Around line 99-105: The _parse_status function now logs a warning and returns
JobStatus.FAILED for unknown DB status strings; ensure this behavior is covered
by a unit test that calls _parse_status with an unexpected value and asserts it
returns JobStatus.FAILED and that logger.warning("Unknown job status from DB:
%s", value) is invoked; update or add tests referencing _parse_status to
validate both known enum conversions and the fallback logging/return path.

In `@packages/shared/src/five08/settings.py`:
- Line 23: The file currently hardcodes credentials in postgres_url and
minio_root_password; remove these secrets from defaults and instead load them
from environment variables (e.g., using os.getenv or the project’s settings
loader) and fail/raise validation if they are missing in non-local environments;
update the symbols postgres_url and minio_root_password so their defaults are
not secret values (use None or empty) and add runtime validation in the same
settings class or initializer to require the corresponding environment variables
(e.g., POSTGRES_URL, MINIO_ROOT_PASSWORD) before allowing production runs.

---

Nitpick comments:
In `@apps/discord_bot/Dockerfile`:
- Line 1: Add a one-line comment above the existing FROM instruction that
records the original image tag that was used to produce the pinned sha256 digest
(mirroring the change in the worker Dockerfile); locate the FROM line (FROM
ghcr.io/astral-sh/uv@sha256:...) in the Dockerfile and insert a comment
describing the corresponding original tag (e.g., the tag name or release) and
any context so readers know which tag this digest was pinned from.

In `@apps/worker/Dockerfile`:
- Line 1: Add a one-line comment above the FROM line in the Dockerfile that
documents the human-readable tag and date for the pinned digest (e.g., the
upstream image tag like "python3.12-bookworm-slim" and the date verified), so
maintainers can see which tag the digest corresponds to and when it should be
reviewed; update the comment whenever you intentionally bump the digest
referenced in the existing FROM
ghcr.io/astral-sh/uv@sha256:e5b65587bce7de595f299855d7385fe7fca39b8a74baa261ba1b7147afa78e58
entry.

In `@apps/worker/pyproject.toml`:
- Line 10: The dependency pin "dramatiq[redis]==2.0.1" in pyproject.toml is
exact-pinned; either relax it to allow safe patch updates (e.g., use "~=2.0" or
">=2.0.1,<3.0") or add an inline comment explaining why the exact pin is
required; update the line referencing dramatiq[redis] accordingly and ensure any
CI/docs note the intentional constraint if you keep "==2.0.1".

In `@apps/worker/src/five08/worker/api.py`:
- Around line 49-67: Health check currently opens a new Postgres connection on
every call via is_postgres_healthy/get_postgres_connection; change
health_handler to use a shared pool/connection stored on the app (e.g.,
app[POSTGRES_POOL_KEY] created at startup) instead of creating a fresh
connection per request: update startup to create and store an asyncpg pool (or
persistent connection), then have health_handler call a lightweight check using
that pool (e.g., with pool.acquire or pool.fetchval) or call an adjusted
is_postgres_healthy that accepts the pool; ensure teardown closes the pool on
shutdown.
- Around line 157-190: process_contact_handler builds an idempotency_key using
manual_nonce = datetime.now(...).isoformat() which can collide under high
concurrency; change the manual_nonce (or the idempotency component) to include a
short uuid4 suffix (e.g. append uuid.uuid4() or uuid.uuid4().hex) when calling
enqueue_job so idempotency_key uses
f"manual:{contact_id}:{manual_nonce}:{uuid4}" (and add the uuid import) to
guarantee uniqueness without removing the timestamp.

In `@apps/worker/src/five08/worker/crm/processor.py`:
- Around line 109-113: The comparison uses .lower() when building existing_lower
and .casefold() elsewhere (_extract_from_attachments), causing inconsistent
Unicode case handling; update the existing_lower construction and any other
.lower() usages in this deduplication path (the existing_lower set creation and
the new_skills membership check in processor.py where extracted.skills is
iterated) to use .casefold() so both sides normalize text with the same
Unicode-aware method.

In
`@apps/worker/src/five08/worker/migrations/versions/20260221_0100_create_jobs_table.py`:
- Around line 19-21: The UUID primary key column definition for "id" currently
lacks a server-side default; update the sa.Column("id",
postgresql.UUID(as_uuid=True), nullable=False, primary_key=True) definition to
include server_default=sa.text("gen_random_uuid()") so the DB will generate a
UUID if the application omits the id on INSERT; modify the column in the
migration where the "id" column is declared to add that server_default while
keeping nullable=False and primary_key=True.

In `@packages/shared/src/five08/queue.py`:
- Around line 77-90: Replace the per-call connect() usage with a shared
connection pool: introduce a module-level psycopg_pool.ConnectionPool (create it
from SharedSettings.postgres_url) and update get_postgres_connection to return a
pooled connection (or provide a small wrapper like acquire_postgres_connection
that yields pool.connection()). Update callers (create_job_record, get_job, and
the _mark_job* functions) to use the pooled connection context instead of
opening fresh TCP connections, and ensure is_postgres_healthy uses the pool to
acquire a connection for the health check; also add safe pool teardown or lazy
init tied to application lifecycle so tests and startup work correctly.
- Around line 282-300: Update the mark_job_retry function docstring to
explicitly document the state machine: explain that mark_job_retry calls
_mark_job with status=JobStatus.FAILED to indicate a retryable failure (attempts
< max attempts) rather than a terminal dead job, and note that terminal jobs are
represented by JobStatus.DEAD; include a short note about when callers should
use mark_job_retry vs the code path that sets JobStatus.DEAD and reference the
involved symbols (_mark_job, mark_job_retry, JobStatus.FAILED, JobStatus.DEAD)
so future maintainers can find the lifecycle logic.

In `@tests/integration/test_email_monitor.py`:
- Around line 46-62: Update the test_poll_inbox_handles_email_parsing_errors
test to follow the same exception contract as its sibling
test_poll_inbox_handles_imap_errors: remove the try/except that swallows all
exceptions and either (A) call await email_monitor.task_poll_inbox() directly
(if the method is expected never to raise) or (B) explicitly assert the
acceptable parsing failure using pytest.raises and a short comment (if parsing
errors are allowed); reference the test function
test_poll_inbox_handles_email_parsing_errors and the method task_poll_inbox when
making the change.

Comment thread .env.example
Comment thread apps/discord_bot/src/five08/__init__.py
Comment on lines 101 to 105
token_matches = re.findall(r"\b[a-z][a-z0-9+#\-.]{2,24}\b", lowered)
detected: set[str] = set()
for token in token_matches:
if token in COMMON_SKILLS:
detected.add(token)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

"go" is silently excluded from heuristic detection.

COMMON_SKILLS contains "go" (2 chars), but the regex r"\b[a-z][a-z0-9+#\-.]{2,24}\b" requires a minimum of 3 characters ([a-z] + {2,24}), so "go" will never be emitted by re.findall and can never enter detected. The LLM path is unaffected; only the heuristic fallback silently drops Go.

🐛 Proposed fix
-        token_matches = re.findall(r"\b[a-z][a-z0-9+#\-.]{2,24}\b", lowered)
+        token_matches = re.findall(r"\b[a-z][a-z0-9+#\-.]{1,24}\b", lowered)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
token_matches = re.findall(r"\b[a-z][a-z0-9+#\-.]{2,24}\b", lowered)
detected: set[str] = set()
for token in token_matches:
if token in COMMON_SKILLS:
detected.add(token)
token_matches = re.findall(r"\b[a-z][a-z0-9+#\-.]{1,24}\b", lowered)
detected: set[str] = set()
for token in token_matches:
if token in COMMON_SKILLS:
detected.add(token)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/worker/src/five08/worker/crm/skills_extractor.py` around lines 101 -
105, The heuristic regex in token_matches (r"\b[a-z][a-z0-9+#\-.]{2,24}\b")
excludes 2-letter skills like "go", so update the pattern used when building
token_matches in skills_extractor.py (the code that feeds detected) to allow
2-character tokens (e.g., change the quantifier to {1,24} or otherwise permit a
single additional character) or add an explicit check to include tokens of
length 2 that appear in COMMON_SKILLS; ensure the change is applied where
token_matches is defined so detected can correctly contain "go".

Comment thread README.md
- `MINIO_HOST_BIND` (default: `127.0.0.1`; set to `0.0.0.0` to expose MinIO)
- `MINIO_API_PORT` (default: `9000`)
- `MINIO_CONSOLE_PORT` (default: `9001`)
- `MINIO_ACCESS_KEY` / `MINIO_SECRET_KEY` (compatibility aliases; use `MINIO_ROOT_USER` and `MINIO_ROOT_PASSWORD` by default for internal transfers)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

MINIO_ACCESS_KEY / MINIO_SECRET_KEY are not settable environment variables.

Both are implemented as @property aliases on SharedSettings, not as Field(...) declarations. pydantic-settings only reads env vars into Field-backed attributes; setting MINIO_ACCESS_KEY or MINIO_SECRET_KEY in the environment will be silently ignored. Update the docs to clarify these are code-level aliases for MINIO_ROOT_USER / MINIO_ROOT_PASSWORD, not independently configurable env vars.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` at line 117, Update the README note to clarify that
MINIO_ACCESS_KEY and MINIO_SECRET_KEY are code-level `@property` aliases on
SharedSettings and not environment-configurable variables: explain that
pydantic-settings only loads env values into attributes declared with Field(...)
and therefore setting MINIO_ACCESS_KEY or MINIO_SECRET_KEY in the environment
will be ignored, and instruct users to use MINIO_ROOT_USER and
MINIO_ROOT_PASSWORD as the actual env vars; reference the SharedSettings class
and the alias property names (MINIO_ACCESS_KEY, MINIO_SECRET_KEY) and the real
env vars (MINIO_ROOT_USER, MINIO_ROOT_PASSWORD) in the wording.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant