
feat(Content Analytics) #35525: Add docker-compose examples for Experiments and new CA infrastructure. #35624

Merged
jcastro-dotcms merged 6 commits into main from issue-35525-Add-docker-compose-examples-for-Experiments-and-new-CA-infrastructure
May 11, 2026

Conversation

@jcastro-dotcms
Member

@jcastro-dotcms jcastro-dotcms commented May 8, 2026

Summary

This PR restructures the Docker Compose developer environments under docker/docker-compose-examples/ by splitting what was a single mixed-purpose analytics/ stack into two focused, independent environments — one for the Content Analytics (CA) infrastructure and one for the Experiments feature stack.


What Changed

analytics/ — Redirects to dot-ca-event-manager

The Content Analytics infrastructure (ClickHouse cluster + ca-event-manager) has been moved to its own repository to avoid duplicating configuration files across repos:

https://github.com/dotCMS/dot-ca-event-manager

The analytics/ directory now contains only a README.md pointing there. Users should refer to the docker/ directory in that repository for the full setup, including the ClickHouse keeper, replica nodes, initialization scripts, and the event manager service.


experiments/ — New Directory for the Experiments Stack

All files related to the dotCMS Experiments feature (A/B testing, analytics integrations via Keycloak + Jitsu + CubeJS) have been moved from analytics/ into a new dedicated experiments/ directory.

Services included:

  • keycloak — Identity provider for JWT-based authentication
  • jitsu — Event collection pipeline (stream mode for local dev)
  • dotcms-analytics (Configurator) — Routes and configures analytics destinations
  • cube — CubeJS analytics read layer with hot-reload schema
  • ch_server — Single-node ClickHouse instance for Experiments data
  • redis — Jitsu message queue
  • analytics-postgres — Shared DB for Keycloak + Configurator
  • dotcms, dotcms-db, opensearch — Full dotCMS stack (behind --profile full)

Key behaviors:

  • dotCMS reaches analytics services via host.docker.internal (not container-to-container names), enabling the host-side app to share the same endpoints
  • Write URL updated to /api/v1/s2s/event (server-to-server path)
  • Startup script renamed from start-analytics.sh to start-experiments.sh

Updated README.md — Corrected analytics URL documentation to reflect actual host.docker.internal routing, fixed env var names (DOT_ANALYTICS_* prefix), and added POSTGRESQL_HOST_PORT to the environment variables reference.


Test Plan

  • cd docker/docker-compose-examples/experiments && docker compose up -d — all analytics-only services start without errors
  • Verify Keycloak is accessible at http://localhost:61111
  • docker compose --profile full up -d — dotCMS starts, analytics pre-configured, connection test passes in Apps → dotExperiments-config
  • Verify docker/docker-compose-examples/analytics/ contains only README.md with a link to dot-ca-event-manager

🤖 Generated with Claude Code

This PR fixes: #35525

@claude
Contributor

claude Bot commented May 8, 2026

Claude finished @jcastro-dotcms's task in 2m 40s —— View job


Review of latest push

The new commit (99536ceb) deletes the in-repo CA infrastructure example and replaces analytics/README.md with a pointer to dotCMS/dot-ca-event-manager. That side of the PR is clean — no orphan references to the deleted files anywhere in the repo, and the previously-flagged issues that lived in those files (50-users.sql missing IF NOT EXISTS, port 8124/8082 collisions, ANALYTICS_ASYNC_INSERT_* indirection, customer1 vs cust-001) are simply gone with the directory.

What remains in experiments/ still carries the previously-deferred items (acknowledged in earlier comments as a follow-up ticket). One small thing in the experiments script that wasn't called out before:

New finding — minor

experiments/start-experiments.sh:109 — argument-order bug in the post-startup URL output.

if [[ "$1" != "--experiments-only" ]]; then
    echo "  - dotCMS: http://localhost:8082 (admin:admin)"
    echo "  - Glowroot: http://localhost:4000"
fi

The arg parser at lines 47–65 accepts flags in any order (sets the EXPERIMENTS_ONLY boolean), but this block checks $1 directly. Running ./start-experiments.sh --force-recreate --experiments-only will incorrectly print the dotCMS / Glowroot URLs even though dotCMS was never started (because $1 is --force-recreate). Fix is one line — gate on $EXPERIMENTS_ONLY instead of $1. Cosmetic, but inconsistent with how the rest of the script handles args.
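
A minimal sketch of the suggested fix, assuming the parser records the flag in a boolean named EXPERIMENTS_ONLY as described above. The function names parse_args and print_urls and the FORCE_RECREATE variable are hypothetical and are not taken from the actual script.

```shell
#!/usr/bin/env bash
# Hypothetical reduction of start-experiments.sh; only the flag handling is shown.

# Accept flags in any order and record them in booleans.
parse_args() {
    EXPERIMENTS_ONLY=false
    FORCE_RECREATE=false
    local arg
    for arg in "$@"; do
        case "$arg" in
            --experiments-only) EXPERIMENTS_ONLY=true ;;
            --force-recreate)   FORCE_RECREATE=true ;;
            *) echo "Unknown option: $arg" >&2; return 1 ;;
        esac
    done
}

# Gate on the parsed boolean rather than on the position of "$1".
print_urls() {
    if [ "$EXPERIMENTS_ONLY" != "true" ]; then
        echo "  - dotCMS: http://localhost:8082 (admin:admin)"
        echo "  - Glowroot: http://localhost:4000"
    fi
}

parse_args --force-recreate --experiments-only
print_urls   # prints nothing: dotCMS was not started
```

With this shape the URL block stays consistent with the parser regardless of the order in which the flags are passed.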

Previously deferred (still open, per agreement to handle in a separate ticket)

For traceability, these are still present in this PR and were explicitly deferred:

  • experiments/docker-compose.yml (dotcms service) — no extra_hosts: ["host.docker.internal:host-gateway"]; silently fails on Linux Docker Engine.
  • experiments/start-experiments.sh:68,100 + README — uses the deprecated hyphenated docker-compose v1 binary instead of docker compose.
  • experiments/docker-compose.yml:183 - JWKS_URI defaults from ${JWKS_URL:-…}; host env var name vs container env var name mismatch.
  • experiments/docker-compose.yml:69–73 - deploy.resources.limits is a Swarm-only key, ignored by docker compose up.
  • experiments/docker-compose.yml:104 - dotcms/dotcms-test:1.0.0-SNAPSHOT image; confirm where this is published before users hit a pull failure.
  • experiments/docker-compose.yml:49–73 — OpenSearch 1.x with no plugins.security.disabled / OPENSEARCH_INITIAL_ADMIN_PASSWORD; default-config 1.x with security plugin enabled may not start cleanly with the credentials dotCMS is configured to use.

PR-description test plan items that referenced the now-deleted analytics/ stack are also moot — worth updating the description before merge so the test plan reflects only the experiments/ stack that actually ships in this PR.

Otherwise, the deletion is well-scoped and the change is straightforward.
• Branch: issue-35525-Add-docker-compose-examples-for-Experiments-and-new-CA-infrastructure

@jcastro-dotcms
Member Author

Claude finished @jcastro-dotcms's task in 3m 45s —— View job

PR Review

Setup is reasonable as a dev-only environment, but there are several real issues worth a second look.

Bugs / risks

  1. 50-users.sql is missing IF NOT EXISTS — contradicts the README and breaks on node 2. analytics/docker-compose.yml:58 mounts ./init on both clickhouse-01 and clickhouse-02. The README (lines 141–143, 190–191) tells you that's safe because "All CREATE statements use IF NOT EXISTS". They don't. init/50-users.sql:2 (CREATE USER 'cust-001' IDENTIFIED BY 'abc') and :6 (CREATE ROW POLICY 'cust-001-policy' ON analytics.*) have no IF NOT EXISTS. Row policies on a Replicated database engine will propagate via Keeper, so the second-node invocation will throw "policy already exists" and abort the entrypoint. Add IF NOT EXISTS (also on the GRANTs where it applies) or stop mounting init/ on clickhouse-02. The PR description's test plan even claims "no init/ mount on node 2" — that's not what the compose file does.

  2. Port collisions between the two stacks. Both compose files publish:

    • 8082: analytics/docker-compose.yml:84 (ca-event-manager) and experiments/docker-compose.yml:149 (dotcms)
    • 8124: analytics/docker-compose.yml:52 (clickhouse-02) and experiments/docker-compose.yml:265 (ch_server default)

    If a developer brings both up on the same host (which is the obvious workflow given the names "analytics" and "experiments" suggest they're complementary), the second up will fail. README should at least state these stacks can't coexist on the default ports.

  3. host.docker.internal does not resolve on Linux Docker Engine without extra_hosts: ["host.docker.internal:host-gateway"]. The dotcms service in experiments/docker-compose.yml:129–134 relies on this name for IDP/config/write/read URLs but never declares the extra host. Works on Docker Desktop, will silently fail on a Linux dev box. Add the extra_hosts entry to the dotcms service.

  4. docker-compose (legacy v1 binary) is hard-coded. experiments/start-experiments.sh:68,100 shells out to docker-compose (with hyphen). That binary was deprecated mid-2023 and is not installed on current Docker Desktop. The README (lines 81–91, 110–113, 191–203, 234–272) is full of docker-compose … examples too. Switch to docker compose (no hyphen) so the script runs on a fresh dev install.

  5. The ANALYTICS_ASYNC_INSERT_* env-var indirection is dead code. analytics/docker-compose.yml:78–80 sets ANALYTICS_ASYNC_INSERT_TIMEOUT_MS=200 and references ${ANALYTICS_ASYNC_INSERT_TIMEOUT_MS:-200} in the same environment: block. Compose substitution reads from the host shell / .env file at parse time, not from the service's own environment. So those two lines never feed into the URL substitution — the defaults are always used. Either move them to a .env file (so they actually override) or drop the indirection and hardcode the URL.

  6. JWKS variable name typo in experiments/docker-compose.yml:183. Container env name is JWKS_URI, defaulting from ${JWKS_URL:-…}. The host-side override variable is JWKS_URL, but cube on line 241 also reads JWKS_URL. So JWKS_URI exported on the host is silently ignored. Pick one name and stick with it.
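
To make item 5 concrete: Compose's ${VAR:-default} interpolation follows the same rule as the shell expansion of the same name, so plain shell can illustrate where the value has to live. This is only an analogy, not Compose itself. The URL and the render_url function are made-up placeholders; only the variable name and the default of 200 come from the finding above.

```shell
# Plain-shell illustration of the ${VAR:-default} rule.
# The URL is a hypothetical placeholder, not the one in the compose file.
render_url() {
    echo "http://example.invalid/?timeout_ms=${ANALYTICS_ASYNC_INSERT_TIMEOUT_MS:-200}"
}

# Nothing set in the environment that performs the expansion: default wins.
unset ANALYTICS_ASYNC_INSERT_TIMEOUT_MS
render_url

# Setting the variable where the expansion happens (the host shell or a
# .env file, in the Compose case) is what changes the result.
ANALYTICS_ASYNC_INSERT_TIMEOUT_MS=500
render_url
```

The first call falls back to 200 and the second picks up 500, which is why a value declared only inside the service's environment: block never reaches the substitution.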

Worth a second look

  1. dotcms/dotcms-test:1.0.0-SNAPSHOT (experiments/docker-compose.yml:104) — SNAPSHOT in a Docker tag is non-standard, and this image is not a public dotCMS release. Confirm it's actually published to the registry users will pull from (or document where to build it).
  2. OpenSearch 1.x image with no security override (experiments/docker-compose.yml:49–73). No plugins.security.disabled=true, no OPENSEARCH_INITIAL_ADMIN_PASSWORD. Default-config OS 1.x with the security plugin enabled requires demo certs / admin password setup — https://opensearch:9200 with DOT_ES_AUTH_BASIC_PASSWORD: 'admin' may or may not work depending on the exact image build. Worth verifying it actually starts cleanly on a fresh pull.
  3. Replicated row policy in 01-init.sql:10–14 (rp_admin_user) is scoped to customer_id = 'customer1' while the user created in 50-users.sql is cust-001. Two unrelated customer identities in the same dev seed — confirm that's intentional and not a leftover.
  4. CT 30-conversion-data.sql:101–102 comment is contradictory — "Refreshing every 30 seconds FOR LOCAL DEVELOPMENT ONLY! For DEV, use at least REFRESH EVERY 15 MINUTE". Reads like both clauses describe "DEV". Probably meant "for non-local / shared DEV use 15 minutes". Minor wording fix.

Minor

  • experiments/docker-compose.yml:69–73 — OpenSearch has deploy.resources.limits which only takes effect under Swarm; ignored by docker compose up. Use mem_limit / cpus if the cap is intended for plain compose.
  • experiments/start-experiments.sh:95 uses eval $COMPOSE_CMD — works, but eval on a constructed command string is unnecessary; an array would be safer.
  • README test plan (PR description) item 3 contradicts the compose file (claims node 2 has no init mount). Fix one or the other.
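
A rough sketch of the array approach mentioned for start-experiments.sh:95, assuming the script assembles a docker compose up command from two booleans. The function name build_compose_cmd and its parameters are hypothetical; the --profile full and --force-recreate flags are the ones already mentioned in this PR.

```shell
#!/usr/bin/env bash
# Hypothetical reduction: the real script's flags and variable names may differ.

# Build the command as an array so each argument stays a separate word.
build_compose_cmd() {
    local experiments_only="$1" force_recreate="$2"
    COMPOSE_CMD=(docker compose)
    if [[ "$experiments_only" != "true" ]]; then
        COMPOSE_CMD+=(--profile full)
    fi
    COMPOSE_CMD+=(up -d)
    if [[ "$force_recreate" == "true" ]]; then
        COMPOSE_CMD+=(--force-recreate)
    fi
}

build_compose_cmd true true
# Print the command with shell quoting instead of running it.
printf '%q ' "${COMPOSE_CMD[@]}"; echo
# To execute it, expand the array directly, with no eval:
#   "${COMPOSE_CMD[@]}"
```

Expanding "${COMPOSE_CMD[@]}" keeps every argument as its own word, so nothing is re-parsed by the shell the way an eval on a string would be.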

Todos

@claude based on your feedback:

The provided docker-compose.yml files are meant for local development only. The bugs and risks you mentioned are not relevant at this point as these services won't make it to production instances. The Platform Team will assemble the required services for official environments using the highest security standards.

@claude
Contributor

claude Bot commented May 8, 2026

Claude finished @jcastro-dotcms's task in 37s —— View job


Acknowledged — with one caveat

Fair point on the production security / hardening items (#7 OpenSearch security defaults, #8 SNAPSHOT image, parts of #2 about coexistence policy) — agreed, those belong to the Platform Team's hardening pass and aren't blocking for a dev-only stack.

That said, a few of my findings aren't about production safety — they're about whether the local dev environment actually works as documented. Wanted to flag them once more so you can decide explicitly to defer (totally valid) vs. fix (cheap):

| # | Finding | Why it still matters even for dev-only |
|---|---|---|
| 1 | 50-users.sql missing IF NOT EXISTS on CREATE USER / CREATE ROW POLICY | The README explicitly tells devs the init scripts are safe to run on both nodes because all CREATEs use IF NOT EXISTS. They don't, and row policies replicate via Keeper, so the second node's entrypoint will abort on a fresh docker compose up. This is a "stack doesn't start" bug, not a prod concern. |
| 3 | Missing extra_hosts: ["host.docker.internal:host-gateway"] on the dotcms service | Silently fails on Linux Docker Engine. Any developer not on Docker Desktop hits this on first run. |
| 4 | docker-compose (hyphenated v1 binary) hard-coded in start-experiments.sh | The v1 binary was deprecated mid-2023 and isn't shipped on current Docker Desktop. The script will fail with command not found on a fresh dev install. |
| (PR description) | Test plan claims "no init/ mount on node 2", but the compose file mounts it on both | Either fix the compose file or the test plan; right now they disagree. |

Items 2 (port collisions), 5 (env-var indirection), 6 (JWKS_URI vs JWKS_URL), 9 (customer1 vs cust-001), 10 (comment wording), and the minor items — I agree those are fine to leave or address opportunistically. Your call.

Happy to push fixes for #1, #3, and #4 (the "doesn't start cleanly" trio) on this branch if you want — just say the word. Otherwise approving the rationale: dev-only, Platform Team owns prod.
• Branch: issue-35525-Add-docker-compose-examples-for-Experiments-and-new-CA-infrastructure

@jcastro-dotcms
Member Author

@claude we'll take care of these changes in a separate ticket.

@claude
Contributor

claude Bot commented May 8, 2026

Claude finished @jcastro-dotcms's task in 19s —— View job


Acknowledged — deferred to separate ticket

Understood — the three "doesn't start cleanly" findings (#1 IF NOT EXISTS on users/policies, #3 extra_hosts for Linux, #4 docker compose v2) plus the PR-description test-plan inconsistency will be handled in a follow-up ticket. No code changes pushed on this PR.

For traceability when that ticket is opened, the concrete locations are:

  • docker/docker-compose-examples/analytics/init/50-users.sql:2,6 — add IF NOT EXISTS to CREATE USER 'cust-001' and CREATE ROW POLICY 'cust-001-policy'
  • docker/docker-compose-examples/experiments/docker-compose.yml (dotcms service) — add extra_hosts: ["host.docker.internal:host-gateway"]
  • docker/docker-compose-examples/experiments/start-experiments.sh:68,100 and experiments/README.md — replace docker-compose with docker compose
  • PR description test plan item — reconcile with compose file (init mounted on both nodes, or remove from node 2)
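
For the follow-up ticket, one possible shape for the docker compose switch is sketched below. The review only asks for replacing docker-compose with docker compose; the fallback to the legacy binary and the detect_compose function name are assumptions added here for illustration.

```shell
# Prefer the v2 plugin ("docker compose"); fall back to the legacy binary
# only if it is the sole option. Prints the command to use, or returns 1
# if neither is available.
detect_compose() {
    if command -v docker >/dev/null 2>&1 && docker compose version >/dev/null 2>&1; then
        echo "docker compose"
    elif command -v docker-compose >/dev/null 2>&1; then
        echo "docker-compose"
    else
        echo "Neither 'docker compose' nor 'docker-compose' was found" >&2
        return 1
    fi
}

if COMPOSE="$(detect_compose)"; then
    echo "Using: $COMPOSE"
fi
```

If the fallback is not wanted, the simpler option is to hard-code docker compose in the script and the README, as the review suggests.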

Closing the loop here. Ping me on the follow-up ticket if you'd like me to pick up the fixes there.
• Branch: issue-35525-Add-docker-compose-examples-for-Experiments-and-new-CA-infrastructure

@jcastro-dotcms jcastro-dotcms added this pull request to the merge queue May 11, 2026
Merged via the queue into main with commit c4b7be8 May 11, 2026
29 checks passed
@jcastro-dotcms jcastro-dotcms deleted the issue-35525-Add-docker-compose-examples-for-Experiments-and-new-CA-infrastructure branch May 11, 2026 19:44

Development

Successfully merging this pull request may close these issues.

Add docker-compose examples for Experiments and new CA infrastructure

2 participants