
Replace booster-ui with Horizon UI #187

Merged: wu-sheng merged 13 commits into master from migrate-to-horizon-ui on May 19, 2026

Conversation

@wu-sheng
Member

Summary

skywalking-booster-ui is deprecated, replaced upstream by Apache SkyWalking Horizon UI. This PR migrates the chart to ship Horizon UI by default.

Horizon is not a drop-in for booster-ui:

  • Bundles a Node-based BFF in front of the SPA. It listens on port 8081 and does not pass /graphql through to OAP.
  • Reads a horizon.yaml config file (Zod-validated, hot-reloaded) instead of env vars.
  • Connects to OAP on two ports: query 12800 (oap.ports.rest) and admin REST 17128 (new oap.ports.admin, requires OAP 10.5+).
  • Requires authentication. Ships with default users admin/admin and skywalking/skywalking so the chart boots out of the box; documentation warns to rotate.
  • Runs as the non-root horizon user; writes runtime state (audit log, setup, alarms, wire debug) to /data.
  • Release images: Docker Hub apache/skywalking-ui:horizon-x.y.z. Pre-release / dev: ghcr.io/apache/skywalking-horizon-ui.

Chart changes

  • New templates/ui-configmap.yaml renders horizon.yaml from a ui.config map deep-merged over chart defaults that wire oap.queryUrl/adminUrl to the in-cluster OAP service.
  • New templates/ui-pvc.yaml optionally backs /data with a PVC (ui.persistence.enabled).
  • templates/ui-deployment.yaml rewritten: port 8081, ConfigMap mount at /app/horizon.yaml, /data volume, HTTP readiness probe on /api/oap/info, Recreate strategy (in-memory session store), fsGroup: 101, optional envFromSecret / extraEnv for ${VAR} interpolation. Drops the booster-ui SW_OAP_ADDRESS / SW_ZIPKIN_ADDRESS env wiring.
  • oap.ports.admin: 17128 added so the OAP service exposes the admin REST port.
  • ui.enabled flag (default true) gates all UI resources for deployments that point an external UI at OAP directly.
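As a rough illustration of how these values fit together, the override below uses only keys named in this PR (ui.enabled, ui.persistence.enabled, ui.config, oap.ports.rest, oap.ports.admin). The OAP host name and the exact URL form are assumptions for illustration. The chart defaults already point queryUrl/adminUrl at the in-cluster OAP service, so the ui.config.oap block is only needed when overriding them.

```yaml
# Hypothetical values override; key names come from the PR description,
# concrete values are illustrative.
oap:
  ports:
    rest: 12800     # OAP query port
    admin: 17128    # OAP admin REST port (requires OAP 10.5+)

ui:
  enabled: true     # set to false to render no UI resources at all
  persistence:
    enabled: true   # back the BFF's /data directory with a PVC
  config:           # deep-merged over chart defaults, rendered as horizon.yaml
    oap:
      # Assumed host name and URL shape; omit to keep the chart's in-cluster defaults.
      queryUrl: http://my-oap.example.svc:12800
      adminUrl: http://my-oap.example.svc:17128
```

Because ui.config is deep-merged, keys left out of the override keep their chart defaults.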

E2E

Horizon's BFF does not expose /graphql, so the previous swctl --base-url=http://skywalking-ui/graphql pattern would 404. All three test/e2e/e2e-*.yaml files now:

  • Expose service/skywalking-oap:12800 instead of service/skywalking-ui:80.
  • Send every swctl query to OAP directly (service_skywalking_oap_host:service_skywalking_oap_12800/graphql).
  • Add deployments/skywalking-ui to wait: as a BFF smoke check (so a Horizon regression that breaks startup fails CI).
  • test/e2e/env: UI_REPO=ghcr.io/apache/skywalking-horizon-ui, UI_TAG=main.
  • test/e2e/values.yaml mirrors the same admin/admin + skywalking/skywalking credentials so a developer port-forwarding into the kind cluster can log in.
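The shape of the reworked e2e files might look roughly like the sketch below, assuming the usual skywalking-infra-e2e layout. The namespace, step name, install command, and expected-file path are placeholders and are not copied from this PR; only the exposed service, the extra wait entry, and the OAP-direct swctl URL reflect the changes listed above.

```yaml
setup:
  env: kind
  kind:
    expose-ports:
      - namespace: default                 # assumed namespace
        resource: service/skywalking-oap   # previously service/skywalking-ui
        port: 12800
  steps:
    - name: Install SkyWalking             # placeholder step
      command: helm install skywalking chart/skywalking   # placeholder command
      wait:
        - namespace: default
          resource: deployments/skywalking-oap
          for: condition=available
        - namespace: default
          resource: deployments/skywalking-ui   # BFF smoke check
          for: condition=available
verify:
  cases:
    # swctl talks to OAP directly because the BFF does not expose /graphql
    - query: swctl --display yaml --base-url=http://${service_skywalking_oap_host}:${service_skywalking_oap_12800}/graphql service ls
      expected: expected/service.yml       # placeholder path
```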

Docs

  • New chart/skywalking/values-horizon-ui.yaml: single-file helm install -f example mirroring the upstream horizon.example.yaml shipped at /app/horizon.example.yaml inside the image.
  • Root README has a new "Web UI (Horizon UI)" section explaining the credential defaults, one-line install, and ConfigMap+Secret rotation pattern.
  • Chart README parameter table updated for the new UI rows.

Test plan

  • helm dep up chart/skywalking
  • helm lint chart/skywalking (with and without ui.enabled=false, with and without -f values-horizon-ui.yaml)
  • helm template demo chart/skywalking -f chart/skywalking/values-horizon-ui.yaml: the horizon.yaml ConfigMap contains both default users with real argon2id hashes; OAP URLs point at the in-cluster service.
  • helm template demo chart/skywalking --set ui.enabled=false — zero ui-* resources rendered.
  • helm template against test/e2e/values.yaml succeeds (covers the e2e path).
  • Run e2e via .github/workflows/e2e.ci.yaml in this PR. Requires UI_TAG to be repinned from main to a stable SHA once horizon-ui has one.
  • Manual smoke: kubectl port-forward svc/<release>-skywalking-helm-ui 8080:80 → log in as admin/admin, confirm OAP query + admin endpoints reachable from the UI.

Caveats

  • UI_TAG=main in test/e2e/env drifts on every horizon-ui merge — pin to a SHA once one is known.
  • Default credentials admin/admin and skywalking/skywalking are publicly known in this repo. README, NOTES.txt, and values.yaml comments all warn to rotate before exposing the UI beyond a trusted network.
  • Requires OAP 10.5+ for the admin port (17128) to bind. Older OAP will leave Horizon's admin features unavailable but the rest of the UI still works.

wu-sheng added 8 commits May 18, 2026 16:51
skywalking-booster-ui is deprecated; this migrates the chart to ship
Apache SkyWalking Horizon UI (apache/skywalking-horizon-ui). Horizon
bundles a BFF in front of the SPA, listens on port 8081, requires a
horizon.yaml config file, talks to OAP on two ports (query 12800 +
admin 17128), and authenticates users — none of which booster-ui did.

Chart:
- New ui-configmap.yaml renders horizon.yaml (deep-merged from
  ui.config over chart defaults that wire queryUrl/adminUrl to the
  in-cluster OAP service)
- New ui-pvc.yaml for the BFF's /data state directory
- ui-deployment.yaml: port 8081, ConfigMap mount, /data volume,
  HTTP readiness probe on /api/oap/info, Recreate strategy, fsGroup
- Add oap.ports.admin: 17128 so the OAP service exposes the admin
  REST surface the BFF needs (OAP 10.5+)
- ui.enabled flag gates all UI resources, for deployments that point
  an external UI at OAP directly
- Ship default users admin/admin and skywalking/skywalking (real
  argon2id hashes) so the chart boots out of the box; NOTES.txt and
  README warn to rotate before production

E2E:
- Horizon does not pass-through /graphql, so swctl queries are
  redirected from the UI service to OAP's 12800 directly
- Kind expose-ports switched from service/skywalking-ui to
  service/skywalking-oap
- skywalking-ui added to the wait block as a BFF smoke check
- env: UI_REPO=ghcr.io/apache/skywalking-horizon-ui, UI_TAG=main
  (release images land at apache/skywalking-ui:horizon-x.y.z on
  Docker Hub)

Docs:
- New chart/skywalking/values-horizon-ui.yaml self-contained example
  mirroring the upstream horizon.example.yaml shape
- README sections explain credential rotation, ConfigMap+Secret
  pattern, and the bigger booster→horizon behavioral diff

- test/e2e/env: replace UI_TAG=main (mutable) with the most recent
  GHCR-built horizon-ui commit SHA. Matches how OAP/Satellite/BanyanDB
  images are pinned and makes e2e re-runs reproducible.
- values-horizon-ui.yaml: drop the oap.image.tag / oap.storageType
  block — this is meant to be a UI-only auth/RBAC overlay, not a
  full install template. The oap fields were copy-pasted from
  values-my-es.yaml where they belonged for a different reason.
- README install example updated: pass oap settings via --set
  alongside -f values-horizon-ui.yaml, mirroring how the file is
  actually intended to be consumed.
… file

Shipping real argon2id hashes in chart/skywalking/values.yaml weakened
Horizon's upstream "no default admin/admin" stance. Reverting:

- values.yaml: ui.config.auth.local.users back to [] with a commented
  example. The chart's defaults no longer give anyone a logged-in UI;
  the operator must consciously provide creds.
- values-horizon-ui.yaml stays as the recommended demo overlay (real
  admin/admin + skywalking/skywalking hashes). README install example
  now requires `-f values-horizon-ui.yaml` to get a working UI.
- NOTES.txt: two-path guidance (apply the example overlay for demos,
  Secret + ${VAR} for production) instead of advertising defaults.
- values-my-es.yaml: drop the placeholder hash; point at the example
  file or define users locally.

Pinning oap.image.tag and ui.image.tag in the example file means every
SkyWalking / Horizon UI release leaves these values stale. The chart
already requires these to be set explicitly at install time, so the
example file just needs to tell operators that — not pin a version
that'll be wrong in three months.

values-horizon-ui.yaml was ~75 lines, but only ~5 lines (two argon2id
hashes) were unique to this chart — the rest duplicated the server /
oap / auth / rbac / session shape that already lives in horizon-ui's
horizon.example.yaml and per-section docs. Two-copy maintenance with
nothing keeping them in sync.

- Delete chart/skywalking/values-horizon-ui.yaml.
- README "Web UI" section now shows the minimal copy-pastable
  demo-values.yaml snippet (the two hashes + the surrounding
  ui.config.auth shape) and links to the upstream horizon.example.yaml
  and docs/setup/horizon-yaml.md for the rest of the schema.
- values.yaml, values-my-es.yaml, NOTES.txt, chart README: drop refs
  to the deleted file, point at the README / upstream docs instead.
- Demo and production install snippets in README use <release> as the
  image-tag placeholder instead of pinning specific versions that go
  stale every cut.

cfed07a1 had the pnpm-deploy regression where api-client/src/index.ts
was shipped in the image instead of compiled JS, crashing the BFF at
import time. 80565f5 is the first GHCR build after horizon-ui's
Dockerfile move to copy-in-static-dist.

/api/oap/info is gated behind a logged-in session and returns 401 to
the kubelet's unauthenticated probe, so the UI deployment never became
ready and e2e's wait timed out. /api/auth/health is the right
unauthenticated liveness signal — verifies BFF is up + auth backend
is healthy without needing credentials.

Confirmed against horizon-ui 80565f5: GET /api/oap/info → 401
{"error":"unauthenticated"}; GET /api/auth/health → 200.
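
The probe change described in this commit could look roughly like the fragment below in the UI Deployment's container spec. The path and port come from the commit message and PR description; the timing values are illustrative assumptions, not the chart's actual settings.

```yaml
# Sketch of an unauthenticated readiness probe for the Horizon BFF.
readinessProbe:
  httpGet:
    path: /api/auth/health   # returns 200 without a session; /api/oap/info returns 401
    port: 8081               # BFF listen port
  initialDelaySeconds: 10    # assumed value
  periodSeconds: 10          # assumed value
  failureThreshold: 3        # assumed value
```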
wu-sheng added this to the 4.10.0 milestone May 19, 2026
wu-sheng added the enhancement label May 19, 2026
wu-sheng added 5 commits May 19, 2026 08:23
apache/skywalking-swck's UI CRD still deploys booster-ui (port 8080,
env-driven). The Horizon UI image needs a horizon.yaml ConfigMap +
port 8081 + configured auth users, none of which the operator wires.
Pointing it at the Horizon image leaves UI/skywalking-system stuck in
condition=Available=false, timing out the test setup.

Teaching SWCK to deploy Horizon is upstream-only — out of scope for
this helm chart PR. Until then, drop the UI block from the SWCK
component manifests and the UI/skywalking-system wait gates. The
verify cases query OAP's GraphQL on 12800 directly, so the UI was
never on the SWCK test path anyway.

apache/skywalking-swck's chart default is v0.9.0, which only deploys
booster-ui-style manifests (port 8080, env-driven). d299bc0 is the
first SWCK master commit that branches on Spec.Kind, defaults Kind
to horizon, emits port-8081 manifests, and renders a horizon.yaml
ConfigMap.

Override the operator image via --set on every SWCK e2e install so
this PR's CI can validate the horizon path before SWCK cuts a
release. The chart's own default (apache/skywalking-swck:v0.9.0)
stays untouched for end users. Bump SWCK_OPERATOR_TAG when SWCK
ships a release with Horizon support; drop the overrides entirely
once chart/operator's tag points at that release.

chart/operator vendored the v0.9.0 UI CRD, which has no Kind /
OAPServerAdminAddress / OAPServerZipkinAddress / Config fields. The
new operator's deployment template branches on .Spec.Kind, but with
the old CRD installed, k8s strips that field on apply — Kind always
arrives as "" at the controller and the booster (port-8080) path
is taken even for Horizon images, leaving the pod unable to pass the
readiness probe.

Replace the UI section of chart/operator/templates/crds.yaml with the
regenerated CRD from apache/skywalking-swck@d299bc0. Verified via
helm template that the new field set + default=horizon + enum
constraint render correctly. Other CRDs in the file are unchanged.

When I synced the UI CRD from apache/skywalking-swck d299bc0, I
copied the bare kubebuilder-generated file verbatim, which:
- dropped two annotations every other CRD in this chart carries:
    cert-manager.io/inject-ca-from
    controller-gen.kubebuilder.io/version
  Without inject-ca-from the conversion webhook talks to the operator
  over plain HTTP (no CA injected), and webhook calls fail.
- dropped the conversion: {strategy: Webhook, ...} block, which would
  break any future v1alpha1 ↔ vNext conversion the operator wires.
- left an in-file Apache license header in the middle of crds.yaml.

Re-added the annotations + conversion block to match the pattern used
by oapservers / satellites / banyandbs / etc. Stripped the duplicate
license header.
wu-sheng merged commit da0e267 into master May 19, 2026
9 checks passed
wu-sheng deleted the migrate-to-horizon-ui branch May 19, 2026 02:51
wu-sheng added a commit to apache/skywalking that referenced this pull request May 19, 2026
apache/skywalking-helm#187 (merged as da0e267) wraps
ui-deployment.yaml + ui-svc.yaml + ui-ingress.yaml in
`{{- if .Values.ui.enabled }}`, so `--set ui.enabled=false` alone is
enough to omit the UI Deployment from the rendered manifest. The
upstream chart is now Horizon-UI-ready.

* test/e2e-v2/script/env — SW_KUBERNETES_COMMIT_SHA bumped from
  2850db15 to da0e267.
* 21 e2e cases — drop the `ui.image.repository=placeholder` /
  `ui.image.tag=disabled` workaround we added when the chart still
  required the tag at template-eval time. Single line
  `--set ui.enabled=false` again.

Also genericize the /status/config/ttl rationale comments and docs
to "ecosystem tools that discover TTL via REST before /graphql"
instead of naming baseline-predictor specifically.