@pjbgf pjbgf commented Sep 17, 2025

Running containers as a non-root user is a long-standing security practice. These changes ensure that the sourcebot user is created and has the permissions needed to run all of its dependencies (postgres, redis and node).

Please note that, as a side effect, existing mounted volumes need to have their ownership reviewed, or the container may not be able to access the files. This is especially the case for volumes from previous versions, which created those files as 0:0.

To fix that, users can run chown -R 1500:1500 /path/.sourcebot. The chmod may also need to be a bit more strict in such cases, so changing that is advised: chmod -R 0750 /path/.sourcebot.
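
Before changing ownership recursively, it can help to see which paths are actually affected. The following is a minimal sketch, not part of this PR; list_foreign_owned is a hypothetical helper name, and 1500 is the uid this PR assigns to the sourcebot user.

```shell
#!/bin/sh
# list_foreign_owned DIR OWNER_UID  (hypothetical helper, not from the PR)
# Prints every path under DIR whose owner is not OWNER_UID.
list_foreign_owned() {
    # find accepts a numeric uid for -user.
    find "$1" ! -user "$2"
}

# Example (host side): list_foreign_owned /path/.sourcebot 1500
```

An empty result suggests that ownership already matches and only the mode may need adjusting.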

Fixes #302.

Summary by CodeRabbit

  • New Features
    • Container now runs as a non-root user by default for safer execution.
  • Bug Fixes
    • More reliable database initialization and startup with automatic role creation when missing.
    • Readiness checks now detect and fail fast on early database process exits.
    • Reduced permission-related startup issues via tightened directory permissions and ownership.
  • Refactor
    • Improved process supervision and logging with a dedicated application log directory.
  • Chores
    • General container hardening and cleanup while preserving existing port and environment settings.

coderabbitai bot commented Sep 17, 2025

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

The Dockerfile now creates and switches to a non-root user (sourcebot), adjusts ownership/permissions for app and data paths, and adds supervision-related scripts/config. The entrypoint script changes PostgreSQL initialization/startup to run without su, sets directory permissions to 0750, creates the postgres role, adds process liveness checks, and retains DB existence handling.

Changes

Cohort / File(s) | Summary of Changes
Non-root container user and permissions (Dockerfile) | Creates group/user sourcebot (gid/uid 1500), sets /app and /data ownership to sourcebot, creates /var/log/sourcebot with ownership, adds sourcebot to postgres/redis/node groups, copies supervision assets, ensures prefix-output.sh is executable, inserts USER sourcebot, retains EXPOSE 3000 and env.
Entrypoint PostgreSQL init/run flow (entrypoint.sh) | Applies 0750 to DATA_CACHE_DIR, DATABASE_DATA_DIR, REDIS_DATA_DIR; runs initdb -D "$DATABASE_DATA_DIR" directly; starts postgres with postgres -D; adds liveness check during readiness loop; creates superuser role via createuser postgres -s; keeps DB existence check and creates DB if missing; minor log/flow tweaks; newline at EOF.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor U as Container Runtime
  participant E as entrypoint.sh
  participant PG as postgres (process)
  participant DB as sourcebot DB

  U->>E: Start container
  Note over E: Ensure directories exist (0750)<br/>DATA_CACHE_DIR, DATABASE_DATA_DIR, REDIS_DATA_DIR
  E->>E: initdb -D "$DATABASE_DATA_DIR" (if not initialized)
  E->>PG: Start postgres -D "$DATABASE_DATA_DIR"
  loop Wait for readiness
    E->>PG: Check port/ready
    E->>PG: Verify process alive
    alt PG stopped
      E->>U: Exit 1
    end
  end
  E->>PG: createuser postgres -s (if missing)
  E->>PG: Check DB existence
  alt DB missing
    E->>PG: createdb sourcebot
  end
  E->>U: Proceed to app startup (supervised)
  Note over E,PG: Running as non-root user "sourcebot"

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

Thump-thump! I hop through Docker’s den,
A non-root burrow, safe as Zen.
Postgres wakes without a su,
Checks its pulse—alive anew.
Permissions neat, logs softly hum—
My whiskers twitch: “It’s up! We run!” 🐇🚀

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)

  • Linked Issues Check (⚠️ Warning)
    Explanation: The PR implements key image-level fixes from issue #302: it creates a sourcebot non-root user, chowns internal directories (/app and /data), adds group memberships, and updates entrypoint logic which together address many permission problems inside the image. However the PR explicitly acknowledges that host-mounted volumes created as root:root will still require manual chown on the host and does not automate that remediation. Because the linked issue requests compatibility when running the container with arbitrary non-root UID:GID (including typical bind mounts), the PR only partially satisfies the issue's objectives.
    Resolution: To fully address #302, add startup logic that (when the container starts as root) chowns configured mount points to the runtime UID/GID before dropping privileges, or provide a documented helper script and clear remediation steps plus an integration test showing an arbitrary non-root UID:GID with bind mounts. Ensure entrypoint changes remain compatible with external Postgres/Redis deployments and add a short troubleshooting note in the README explaining ownership issues and remediation. These steps will close the remaining gap for users who rely on host-mounted volumes.
  • Out of Scope Changes Check (⚠️ Warning)
    Explanation: The changes include several modifications beyond permission fixes: the entrypoint replaces the su-based Postgres initialization with direct initdb/postgres calls and adds a createuser postgres step, and the Dockerfile copies supervisord configuration and related scripts, altering service startup semantics. These runtime and supervision changes change behavior rather than solely fixing file ownership and therefore appear out-of-scope relative to the linked issue, which focused on making the image usable as a non-root process. Such changes should be justified or separated because they can introduce regressions for users who run Postgres/Redis externally or expect the previous startup flow.
    Resolution: Split the PR: isolate non-root user and permission fixes (USER, chown, group memberships) into one change and move Postgres/initdb and supervisord/service supervision changes into a separate PR with explicit rationale, tests, and compatibility notes. Add integration tests covering both bundled and external Postgres/Redis scenarios and document any behavior changes in the README. If the Postgres/init changes are required for non-root support, include comments and tests demonstrating why they are necessary and how prior behavior is preserved.

✅ Passed checks (3 passed)

  • Description Check (✅ Passed): Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title Check (✅ Passed): The PR title "security: run Docker container as non-root user" concisely describes the primary change — running the container as a non-root user for security. It matches the Dockerfile and entrypoint edits that add a sourcebot user, adjust ownership, and switch to USER sourcebot. The "security:" prefix appropriately signals intent and is useful for scanning commit history.
  • Docstring Coverage (✅ Passed): No functions found in the changes. Docstring coverage check skipped.

Dockerfile Outdated
chown -R postgres:postgres /run/postgresql && \
chmod 775 /run/postgresql

# To run as non-root, the user must be past of postgres, redis and node.

Suggested change
-# To run as non-root, the user must be past of postgres, redis and node.
+# To run as non-root, the user must be part of postgres, redis and node groups

@brendan-kellam
Contributor

@coderabbitai review

coderabbitai bot commented Sep 17, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@brendan-kellam
Contributor

brendan-kellam commented Sep 17, 2025

Thanks for this PR!

Please note that, as a side effect, existing mounted volumes need to have their ownership reviewed, or the container may not be able to access the files. This is especially the case for volumes from previous versions, which created those files as 0:0.

What would the error message be in this case? And when in the execution flow would this be hit?

EDIT:
I tried the upgrade path from v4.6.8 -> image built against this branch and hit the following:

➜  /tmp docker run -p 3000:3000 --rm -v $(pwd):/data -e CONFIG_PATH=/data/config.json --name sourcebot sourcebot
[Info] Sourcebot version:
[Warning] NEXT_PUBLIC_POSTHOG_PAPIK was not set. Setting SOURCEBOT_TELEMETRY_DISABLED.
[Info] Disabling telemetry since SOURCEBOT_TELEMETRY_DISABLED was set.
[Warning] SOURCEBOT_ENCRYPTION_KEY is not set.
[Info] Loading environment variables from /data/.sourcebot/.secret
[Warning] AUTH_SECRET is not set.
[Info] Loading environment variables from /data/.sourcebot/.authjs-secret
[Warning] AUTH_URL is not set.
[Info] Upgraded from version v4.6.8 to
localhost:5432 - no response
[Info] Waiting for the database to be ready...
2025-09-17 19:39:54.733 UTC [14] FATAL:  data directory "/data/.sourcebot/db" has wrong ownership
2025-09-17 19:39:54.733 UTC [14] HINT:  The server must be started by the user that owns the data directory.
postgres failed to run

I spun down the instance and changed the owner:

sudo chown -R 1500:1500 .sourcebot

And then hit the following:

➜  /tmp docker run -p 3000:3000 --rm -v $(pwd):/data -e CONFIG_PATH=/data/config.json --name sourcebot sourcebot
[Info] Sourcebot version:
[Warning] NEXT_PUBLIC_POSTHOG_PAPIK was not set. Setting SOURCEBOT_TELEMETRY_DISABLED.
[Info] Disabling telemetry since SOURCEBOT_TELEMETRY_DISABLED was set.
[Info] Initializing database at \ATA_DIR...
mkdir: can't create directory '/data/.sourcebot/db': Permission denied

Any ideas on what's going on?

To fix that, users can run chown -R 1500:1500 /path/.sourcebot. The chmod may also need to be a bit more strict in such cases, so changing that is advised: chmod -R 0750 /path/.sourcebot.

Is this something that could be done automatically during startup in the entrypoint? If not, it would be great to have a small upgrade doc (example) that we can link in the changelog so that users know what to do if they hit an error.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (1)
Dockerfile (1)

242-252: Typos and group/ownership consistency for non-root setup

  • Fix typo “past” → “part”.
  • Prefer chown user:group for /data and /app to keep primary group consistent.
  • Use -p for /var/log/sourcebot creation and chown user:group.
-# To run as non-root, the user must be past of postgres, redis and node.
+# To run as non-root, the user must be part of the postgres, redis, and node groups.
 RUN addgroup -g 1500 sourcebot && \
     adduser -D -u 1500 -h /app -S sourcebot && \
     adduser sourcebot postgres && \
     adduser sourcebot redis && \
     adduser sourcebot node && \
-    chown -R sourcebot /data && \
-    chown -R sourcebot /app && \
-    mkdir /var/log/sourcebot && \
-    chown sourcebot /var/log/sourcebot
+    chown -R sourcebot:sourcebot /data && \
+    chown -R sourcebot:sourcebot /app && \
+    mkdir -p /var/log/sourcebot && \
+    chown -R sourcebot:sourcebot /var/log/sourcebot
🧹 Nitpick comments (4)
entrypoint.sh (4)

33-33: Quote paths and prefer install(1) for directory creation

Using mkdir -m with -p only applies the mode to the last component and paths aren’t quoted. Safer and clearer to use install -d with quotes.

-    mkdir -m 0750 -p "$DATA_CACHE_DIR"
+    install -d -m 0750 -- "$DATA_CACHE_DIR"
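
The mkdir behavior behind this nitpick can be reproduced with a small throwaway script. This is a sketch only: it assumes a Linux userland where stat -c is available (GNU coreutils or BusyBox), and the a/b/c path is purely illustrative.

```shell
#!/bin/sh
# Demonstrates SC2174: with -p, -m only applies to the deepest directory.
set -eu

umask 022                      # parent directories are created from the umask
demo_root=$(mktemp -d)         # scratch area, removed at the end

mkdir -m 0750 -p "$demo_root/a/b/c"

# stat -c is the GNU/BusyBox form; this is an assumption about the platform.
parent_mode=$(stat -c '%a' "$demo_root/a")
leaf_mode=$(stat -c '%a' "$demo_root/a/b/c")

echo "parent: $parent_mode, leaf: $leaf_mode"

rm -rf "$demo_root"
```

With a umask of 022, the parent is expected to come out as 755 and only the leaf as 750, which is why intermediate directories may end up more permissive than intended.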

46-46: Quote paths and prefer install(1) for directory creation

Same mkdir -m with -p pattern; quote the var.

-    mkdir -m 0750 -p $REDIS_DATA_DIR
+    install -d -m 0750 -- "$REDIS_DATA_DIR"

152-154: Unnecessary createuser after switching initdb to -U postgres; at minimum make idempotent

If you adopt initdb -U postgres, this block is redundant and should be removed. If you keep current initdb behavior, guard the call so it does not fail on restarts; createuser has no --if-not-exists option, so check pg_roles first.

Option A (preferred, when using initdb -U postgres):

-    # Running as non-root we need to ensure the postgres account is created.
-    createuser postgres -s

Option B (keep current initdb user, but fix readiness as above):

-    createuser postgres -s
+    psql -d postgres -tAc "SELECT 1 FROM pg_roles WHERE rolname = 'postgres'" | grep -q 1 || createuser postgres -s

139-141: POSIX sh: avoid echo -e; use printf

BusyBox/ash does not standardize echo -e. Use printf for portability.

-        echo -e "\e[34m[Info] Waiting for the database to be ready...\e[0m"
+        printf '\033[34m[Info] Waiting for the database to be ready...\033[0m\n'
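
If the entrypoint logs in several places, the printf form could be wrapped in a small helper. The sketch below is illustrative only; info is a hypothetical function name and is not taken from entrypoint.sh.

```shell
#!/bin/sh
# Portable colored log helper: printf with octal \033 instead of echo -e "\e".
info() {
    # %s keeps the message out of the format string, so a stray "%" or "\"
    # in a path is printed literally rather than interpreted.
    printf '\033[34m[Info] %s\033[0m\n' "$1"
}

info "Waiting for the database to be ready..."
```

Passing the message as an argument also sidesteps the kind of variable typo seen in the "Initializing database" log line, since the path would be a separately quoted argument.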
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 83a8d30 and dc29a13.

📒 Files selected for processing (2)
  • Dockerfile (2 hunks)
  • entrypoint.sh (3 hunks)
🧰 Additional context used
🪛 Shellcheck (0.10.0)
entrypoint.sh

[warning] 33-33: When used with -p, -m only applies to the deepest directory.

(SC2174)


[warning] 38-38: In POSIX sh, echo flags are undefined.

(SC3037)


[warning] 39-39: When used with -p, -m only applies to the deepest directory.

(SC2174)


[warning] 46-46: When used with -p, -m only applies to the deepest directory.

(SC2174)


[warning] 140-140: In POSIX sh, echo flags are undefined.

(SC3037)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (1)
Dockerfile (1)

261-261: LGTM: switching to non-root user in the final stage

USER sourcebot placement looks correct relative to chown operations and before ENTRYPOINT.

Comment on lines +38 to +41
echo -e "\e[34m[Info] Initializing database at $DATABASE_D\ATA_DIR...\e[0m"
mkdir -m 0750 -p $DATABASE_DATA_DIR

initdb -D "$DATABASE_DATA_DIR"

⚠️ Potential issue

Deadlock risk: initdb runs as “sourcebot” (superuser becomes sourcebot) but readiness/psql assume user “postgres”

As written, initdb creates DB superuser = current OS user (“sourcebot”). The subsequent until loop waits on pg_isready -U postgres (role doesn’t exist yet), so it can spin forever. Later createuser relies on connecting as “sourcebot”, but you never reach it if pg_isready is gated on a non-existent role. Also fix the log typo and avoid echo -e in /bin/sh.

Apply this to make the superuser “postgres”, fix the log, quote paths, and use printf:

-    echo -e "\e[34m[Info] Initializing database at $DATABASE_D\ATA_DIR...\e[0m"
-    mkdir -m 0750 -p $DATABASE_DATA_DIR
-
-    initdb -D "$DATABASE_DATA_DIR"
+    printf '\033[34m[Info] Initializing database at %s...\033[0m\n' "$DATABASE_DATA_DIR"
+    install -d -m 0750 -- "$DATABASE_DATA_DIR"
+    initdb -D "$DATABASE_DATA_DIR" -U postgres
🧰 Tools
🪛 Shellcheck (0.10.0)

[warning] 38-38: In POSIX sh, echo flags are undefined.

(SC3037)


[warning] 39-39: When used with -p, -m only applies to the deepest directory.

(SC2174)

🤖 Prompt for AI Agents
In entrypoint.sh around lines 38 to 41, the script currently uses echo -e,
mis-typed DATABASE_D\ATA_DIR, and runs initdb so the DB superuser becomes the
current OS user (sourcebot), which deadlocks later because pg_isready and
createuser expect the "postgres" role; update the lines to: use printf instead
of echo -e, fix the variable to $DATABASE_DATA_DIR and quote it everywhere
(mkdir -m 0750 -p "$DATABASE_DATA_DIR" and initdb -D "$DATABASE_DATA_DIR"), and
ensure initdb creates the postgres superuser by passing the username option
(initdb --username=postgres or -U postgres) so subsequent pg_isready -U postgres
and createuser steps succeed.

Comment on lines 138 to 151
postgres -D "$DATABASE_DATA_DIR" &
until pg_isready -h localhost -p 5432 -U postgres; do
echo -e "\e[34m[Info] Waiting for the database to be ready...\e[0m"
sleep 1

# As postgres runs in the background, we must check if it is still
# running, otherwise the "until" loop will be running indefinitely.
if ! pgrep -x "postgres" > /dev/null; then
echo "postgres failed to run"
exit 1
break
fi
done

⚠️ Potential issue

Readiness wait uses -U postgres before the role exists; also unreachable break

Wait without forcing a user, keep the liveness check, and use printf. Remove the unreachable break after exit 1.

-    postgres -D "$DATABASE_DATA_DIR" &
-    until pg_isready -h localhost -p 5432 -U postgres; do
-        echo -e "\e[34m[Info] Waiting for the database to be ready...\e[0m"
+    postgres -D "$DATABASE_DATA_DIR" &
+    until pg_isready -q -h localhost -p 5432; do
+        printf '\033[34m[Info] Waiting for the database to be ready...\033[0m\n'
         sleep 1
-
-        # As postgres runs in the background, we must check if it is still
-        # running, otherwise the "until" loop will be running indefinitely.
-        if ! pgrep -x "postgres" > /dev/null; then
-            echo "postgres failed to run"
-            exit 1
-        break
-    fi
+        # As postgres runs in the background, ensure it hasn't died.
+        if ! pgrep -x "postgres" > /dev/null; then
+            echo "postgres failed to run"
+            exit 1
+        fi
     done
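
The same wait-with-liveness pattern can be written generically by tracking the PID of the background process instead of matching on its name. This is a sketch under assumptions, not the PR's code: wait_until_ready is a hypothetical helper, and it relies on the shell reaping the exited child while the loop sleeps.

```shell
#!/bin/sh
# wait_until_ready PID CMD [ARGS...]  (hypothetical helper)
# Polls CMD until it succeeds, but gives up as soon as PID is gone.
wait_until_ready() {
    pid=$1
    shift
    until "$@"; do
        sleep 1
        # kill -0 sends no signal; it only tests that the process exists.
        if ! kill -0 "$pid" 2>/dev/null; then
            echo "process $pid exited before becoming ready" >&2
            return 1
        fi
    done
}

# Possible use in an entrypoint:
#   postgres -D "$DATABASE_DATA_DIR" &
#   wait_until_ready "$!" pg_isready -q -h localhost -p 5432 || exit 1
```

Checking the specific PID avoids a false positive from pgrep if another process with the same name happens to be running.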
🧰 Tools
🪛 Shellcheck (0.10.0)

[warning] 140-140: In POSIX sh, echo flags are undefined.

(SC3037)

🤖 Prompt for AI Agents
In entrypoint.sh around lines 138 to 151, the readiness loop calls pg_isready
with -U postgres before that role may exist and uses echo -e and an unreachable
break after exit; change the pg_isready invocation to not force a user (e.g.,
use: pg_isready -h localhost -p 5432), replace echo -e with printf for portable
colored output (e.g., printf '\033[34m[Info] Waiting for the database to be
ready...\033[0m\n'), keep the pgrep liveness check as-is to ensure postgres is
still running, and remove the unreachable break after the exit 1 so the control
flow is correct.

Running containers as a non-root user is a long standing security practice.
The changes ensure that the sourcebot user is created and has the correct
level of permissions to run all its dependencies (postgres, redis and node).

Please note that as a side effect, existing mounted volumes would need to
have their ownership reviewed or it may not be able to access the files.
This is specially the case for previous versions that would create said
files as 0:0.

To fix that, users can run chown -R 1500:1500 /path/.sourcebot. The chmod
may also need to be a bit more strict in such cases, so changing that is
advised: chmod -R 0750 /path/.sourcebot.

Signed-off-by: Paulo Gomes <pjbgf@linux.com>
@pjbgf
Author

pjbgf commented Sep 18, 2025

@brendan-kellam thanks for the review. Looking into your message, I noticed two problems: one regarding the migration from existing DBs created as root, and a second where the entrypoint attempted to create the user again on follow-up executions. Both are now fixed. I made some changes, and both workflows should now work:

Migrating from existing DB

docker run \
    -p 3000:3000 \
    --rm \
    -v .sourcebot:/data \
    --name sourcebot \
   ghcr.io/sourcebot-dev/sourcebot:latest

# Fix permissions
sudo chown -R 1500:1500 .sourcebot
sudo chmod -R 0750 .sourcebot

docker run \
    -p 3000:3000 \
    --rm \
    -v .sourcebot:/data \
    --name sourcebot \
   docker.io/paulinhu/sourcebot:non-root

Non-root from scratch

mkdir -m 0750 -p .sourcebot
sudo chown -R 1500:1500 .sourcebot

docker run \
    -p 3000:3000 \
    --rm \
    -v .sourcebot:/data \
    --name sourcebot \
   docker.io/paulinhu/sourcebot:non-root

Any ideas on what's going on?
Is this something that could be done automatically during startup in the entrypoint? If not, it would be great to have a small upgrade doc (example) that we can link in the changelog so that users know what to do if they hit an error.

Yes, the problem there is that you are mounting a dir which the application (running as 1500:1500) does not have permission to access or change.
We cannot fix this at startup as we are no longer running as root, nor do we have the CAP_DAC_OVERRIDE cap.
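
One way the error messaging could be improved is a fail-fast check near the top of the entrypoint. The following is only a sketch of that idea: check_data_dir is a hypothetical function, the message wording is illustrative, and stat -c assumes a GNU or BusyBox userland in the image.

```shell
#!/bin/sh
# check_data_dir DIR  (hypothetical helper)
# Fails with a remediation hint when DIR is missing or not writable
# by the user the container is running as.
check_data_dir() {
    dir=$1
    if [ -d "$dir" ] && [ -w "$dir" ]; then
        return 0
    fi
    # stat -c is the GNU/BusyBox form (an assumption about the image).
    owner=$(stat -c '%u:%g' "$dir" 2>/dev/null || echo "unknown")
    {
        echo "[Error] $dir is missing or not writable by $(id -u):$(id -g) (owner: $owner)."
        echo "        On the host, run: chown -R $(id -u):$(id -g) <path to the mounted directory>"
    } >&2
    return 1
}

# Possible use early in an entrypoint:
#   check_data_dir /data || exit 1
```

Because the check only reads metadata, it needs no extra privileges and could run before any attempt to create the database directory.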

However, we could improve the error messaging around it. For the doc changes, do you prefer that within the same PR or as a separate PR?

@msukkari msukkari self-requested a review September 18, 2025 22:50
Successfully merging this pull request may close these issues.

Selfhosted docker unable to run as non-root user