Merged
70 changes: 70 additions & 0 deletions .cursor/skills/rustfs-operator-contribute/SKILL.md
@@ -0,0 +1,70 @@
---
name: rustfs-operator-contribute
description: Commits, pushes, and opens pull requests for the RustFS Operator repo per CONTRIBUTING.md and AGENTS.md. Use when the user asks to commit, push to remote my, submit a PR upstream, or follow project contribution workflow.
---

# RustFS Operator — commit, push, PR

## Preconditions

- Run from repository root: `/home/jhw/my/operator` (or clone path).
- Source of truth: [`CONTRIBUTING.md`](../../../CONTRIBUTING.md), [`Makefile`](../../../Makefile), [`.github/pull_request_template.md`](../../../.github/pull_request_template.md).

## Before commit

1. Run **`make pre-commit`** (fmt-check → clippy → test → console-lint → console-fmt-check). Fix failures before committing.
2. User-visible changes: update **[`CHANGELOG.md`](../../../CHANGELOG.md)** under `[Unreleased]` (Keep a Changelog).
3. **Commit message**: [Conventional Commits](https://www.conventionalcommits.org/), **English**, subject **≤ 72 characters** (e.g. `fix(pool): align CEL with console validation`).

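The commit-message rule in step 3 can be checked mechanically. The following is a minimal sketch, not part of the repository: `check_commit_subject` is a hypothetical helper that accepts any lowercase type, an optional `(scope)`, and an optional `!`, and enforces the 72-character limit. It cannot tell whether the subject is written in English.

```rust
/// Hypothetical helper (not in the repo): checks that a commit subject has the
/// Conventional Commits shape `type(scope)!: description` and is at most 72 characters.
fn check_commit_subject(subject: &str) -> Result<(), String> {
    // Count characters rather than bytes so non-ASCII text is measured fairly.
    if subject.chars().count() > 72 {
        return Err("subject is longer than 72 characters".to_string());
    }
    // The type (and optional scope) is separated from the description by ": ".
    let Some((prefix, description)) = subject.split_once(": ") else {
        return Err("expected `type(scope): description`".to_string());
    };
    if description.trim().is_empty() {
        return Err("description is empty".to_string());
    }
    // A trailing `!` marks a breaking change and is allowed.
    let prefix = prefix.strip_suffix('!').unwrap_or(prefix);
    // Split an optional `(scope)` off the type.
    let (kind, scope) = match prefix.split_once('(') {
        Some((kind, rest)) => match rest.strip_suffix(')') {
            Some(scope) => (kind, Some(scope)),
            None => return Err("scope is missing its closing parenthesis".to_string()),
        },
        None => (prefix, None),
    };
    if kind.is_empty() || !kind.chars().all(|c| c.is_ascii_lowercase()) {
        return Err("type must be a lowercase word such as `fix` or `feat`".to_string());
    }
    if scope.is_some_and(|s| s.is_empty()) {
        return Err("scope is empty".to_string());
    }
    Ok(())
}
```

With this sketch, the example subject from step 3, `fix(pool): align CEL with console validation`, passes, while a subject without a type prefix is rejected.
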
## Commit

```bash
git add -A
git status
git commit -m "type(scope): short description"
```

## Push to fork (`my`)

Remote is typically `my` → `git@github.com:GatewayJ/operator.git` (verify with `git remote -v`).

```bash
git push my main
```

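If the push to `my` is rejected as non-fast-forward, integrate the remote changes first (fetch, then merge or rebase). Use `git push my main --force-with-lease` only when you intend to replace the fork's history (dangerous: commits that exist only on the fork are discarded).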

## Open PR upstream (`rustfs/operator`)

- **Target**: `rustfs/operator` branch **`main`**.
- **Head**: fork branch (e.g. `GatewayJ:main`).
- **PR title and body**: **English**.
- **Body**: Must follow **every section** in [`.github/pull_request_template.md`](../../../.github/pull_request_template.md); use **`N/A`** where not applicable; keep all headings.

**Do not** pass multiline `--body` to `gh` inline. Write a file and use `--body-file`:

```bash
cat > /tmp/pr_body.md <<'EOF'
## Type of Change
- [x] Bug Fix
...
EOF

gh pr create --repo rustfs/operator --head GatewayJ:main --base main \
--title "fix: concise English title" \
--body-file /tmp/pr_body.md
```

Adjust checkboxes and sections to match the change. Include **`make pre-commit`** under Verification.

## Quick checklist

- [ ] `make pre-commit` passed
- [ ] CHANGELOG updated if user-visible
- [ ] Commit message conventional, English
- [ ] PR template complete, English, `--body-file` used

## References

- [AGENTS.md](../../../AGENTS.md) — language, security, architecture notes
- [`.cursor/rules/pr.mdc`](../../../.cursor/rules/pr.mdc) — PR / path conventions (if present)
5 changes: 4 additions & 1 deletion .gitignore
@@ -18,7 +18,10 @@ console-web/.next/
console-web/docs/
console-web/out/
console-web/node_modules/
.cursor/
# Cursor IDE: ignore contents except versioned Agent skills
.cursor/*
!.cursor/skills/
!.cursor/skills/**

# Docs / summaries (local or generated)
CONSOLE-INTEGRATION-SUMMARY.md
14 changes: 14 additions & 0 deletions CHANGELOG.md
@@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Documentation

- Expanded root [`README.md`](README.md) with overview, quick start, development commands, CI vs `make pre-commit`, and documentation index.
- Aligned [`CLAUDE.md`](CLAUDE.md) and [`ROADMAP.md`](ROADMAP.md) with current code: Tenant status conditions and StatefulSet updates on the successful reconcile path are documented as implemented; remaining work (status on early errors, integration tests, rollout extras) is listed explicitly.
- Clarified the documentation map: [`CONTRIBUTING.md`](CONTRIBUTING.md) (quality gates and CI alignment), [`docs/DEVELOPMENT.md`](docs/DEVELOPMENT.md) (environment setup), [`docs/DEVELOPMENT-NOTES.md`](docs/DEVELOPMENT-NOTES.md) (historical notes, not normative).
- Updated [`examples/README.md`](examples/README.md): Tenant Services document S3 **9000** and RustFS Console **9001**; distinguished the Operator HTTP Console (default **9090**, `cargo run -- console`) from the Tenant `{tenant}-console` Service.
@@ -19,8 +20,21 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

- **`console-web` / `make pre-commit`**: `npm run lint` now runs `eslint .` (bare `eslint` only printed CLI help). Added `format` / `format:check` scripts; [`Makefile`](Makefile) `console-fmt` and `console-fmt-check` call them so Prettier resolves from `node_modules` after `npm install` in `console-web/`.

- **Tenant `Pool` CRD validation (CEL)**: Match the operator console API — require `servers × volumesPerServer >= 4` for every pool, and `>= 6` total volumes when `servers == 3` (fixes the previous 3-server rule using `< 4` in CEL). Regenerated [`deploy/rustfs-operator/crds/tenant-crd.yaml`](deploy/rustfs-operator/crds/tenant-crd.yaml) and [`tenant.yaml`](deploy/rustfs-operator/crds/tenant.yaml). Added [`validate_pool_total_volumes`](src/types/v1alpha1/pool.rs) as the shared Rust implementation used by [`src/console/handlers/pools.rs`](src/console/handlers/pools.rs).

- **Tenant name length**: [`validate_dns1035_label`](src/types/v1alpha1/tenant.rs) now caps `metadata.name` at **55** characters so derived names like `{name}-console` remain valid Kubernetes DNS labels (≤ 63).
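
The 55-character cap follows from simple arithmetic: a DNS-1035 label may be at most 63 characters, and the `-console` suffix uses 8 of them. A minimal sketch of that check, assuming `-console` is the longest suffix the operator appends (the changelog names only this one); `check_tenant_name` is a hypothetical stand-in, not the repository's `validate_dns1035_label`:

```rust
/// Maximum length of a Kubernetes DNS-1035 label.
const DNS1035_LABEL_MAX: usize = 63;
/// Suffix named in the changelog; assumed here to be the longest derived suffix.
const CONSOLE_SUFFIX: &str = "-console";
/// 63 - 8 = 55, the cap described above.
const TENANT_NAME_MAX: usize = DNS1035_LABEL_MAX - CONSOLE_SUFFIX.len();

/// Hypothetical stand-in for the real validator: DNS-1035 label rules plus the tenant cap.
fn check_tenant_name(name: &str) -> Result<(), String> {
    if name.is_empty() {
        return Err("name must not be empty".to_string());
    }
    if name.len() > TENANT_NAME_MAX {
        return Err(format!("name must be at most {TENANT_NAME_MAX} characters"));
    }
    let bytes = name.as_bytes();
    // DNS-1035: must start with a lowercase letter.
    if !bytes[0].is_ascii_lowercase() {
        return Err("name must start with a lowercase letter".to_string());
    }
    // DNS-1035: must end with a lowercase letter or digit.
    let last = bytes[bytes.len() - 1];
    if !(last.is_ascii_lowercase() || last.is_ascii_digit()) {
        return Err("name must end with a lowercase letter or digit".to_string());
    }
    // DNS-1035: only lowercase letters, digits, and '-' are allowed.
    if !bytes
        .iter()
        .all(|b| b.is_ascii_lowercase() || b.is_ascii_digit() || *b == b'-')
    {
        return Err("name may contain only lowercase letters, digits, and '-'".to_string());
    }
    Ok(())
}
```

A 55-character name plus `-console` is exactly 63 characters, so the derived Service name still fits in one label.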

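The pool volume rule from the CEL entry above can be expressed as a plain function. This is a sketch under stated assumptions: the real signature of `validate_pool_total_volumes` is not shown in this diff, so `check_pool_total_volumes`, its parameter types, and its error strings are hypothetical; only the thresholds come from the changelog and the CRD rules.

```rust
/// Hypothetical stand-in for the shared check; mirrors the CEL rules
/// `servers * volumesPerServer >= 4` and
/// `servers != 3 || servers * volumesPerServer >= 6`.
fn check_pool_total_volumes(servers: i32, volumes_per_server: i32) -> Result<(), String> {
    // The CRD carries a separate rule that servers must be greater than 0.
    if servers <= 0 {
        return Err("servers must be greater than 0".to_string());
    }
    // Widen before multiplying so large inputs cannot overflow.
    let total = i64::from(servers) * i64::from(volumes_per_server);
    if total < 4 {
        return Err(format!(
            "pool must have at least 4 total volumes (servers x volumesPerServer), got {total}"
        ));
    }
    if servers == 3 && total < 6 {
        return Err(format!(
            "pool with 3 servers must have at least 6 volumes in total, got {total}"
        ));
    }
    Ok(())
}
```

Under this sketch, 2 servers with 2 volumes each (4 total) pass and 2 servers with 1 volume each fail. For 3 servers, 1 volume each gives 3 total and already fails the first rule, so in this function the 3-server branch is never reached with integer inputs. In the CRD, both rules would report.
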
### Changed

- **Deploy scripts** ([`scripts/deploy/deploy-rustfs.sh`](scripts/deploy/deploy-rustfs.sh), [`deploy-rustfs-4node.sh`](scripts/deploy/deploy-rustfs-4node.sh)): Docker builds use **layer cache by default** (`docker_build_cached`); set `RUSTFS_DOCKER_NO_CACHE=true` for a full rebuild. Documented in [`scripts/README.md`](scripts/README.md).
- **4-node deploy**: Help text moved to an early heredoc (avoids trailing `case`/parse issues); see script header.
- **4-node cleanup** ([`cleanup-rustfs-4node.sh`](scripts/cleanup/cleanup-rustfs-4node.sh)): Host storage dirs under `/tmp/rustfs-storage-*` may require `sudo rm -rf` after Kind (root-owned bind mounts).
- **Dockerfile** (operator and [`console-web/Dockerfile`](console-web/Dockerfile)): Build caching and reproducibility tweaks (cargo-chef pin, pnpm in frontend image as applicable).

### Added

- Cursor Agent skill [`.cursor/skills/rustfs-operator-contribute/SKILL.md`](.cursor/skills/rustfs-operator-contribute/SKILL.md) for `make pre-commit`, commit, push to fork `my`, and opening PRs to `rustfs/operator` with the project template.

#### **StatefulSet Reconciliation Improvements** (2025-12-03, Issue #43)

Implemented intelligent StatefulSet update detection and validation to improve reconciliation efficiency and safety:
34 changes: 28 additions & 6 deletions Dockerfile
@@ -4,25 +4,47 @@ ARG BASE_IMAGE=debian:bookworm-slim
# Use rust:bookworm so the binary is linked against glibc 2.36, matching final image.
ARG RUST_BUILD_IMAGE=rust:bookworm

# When Docker build cannot reach crates.io (DNS/network), use host network:
# cargo-chef version (pin for reproducible builds; override if needed)
ARG CARGO_CHEF_VERSION=0.1.77

# When Docker build cannot reach crates.io (DNS/network), try:
# docker build --network=host -t rustfs/operator:dev .
# For China mirrors, mount or COPY a .cargo/config.toml (see docs) before cargo install.

# Shared Cargo settings for slow / flaky networks (applies to all Rust stages)
FROM ${RUST_BUILD_IMAGE} AS rust-base
RUN mkdir -p /usr/local/cargo && \
printf '%s\n' \
'[http]' \
'timeout = 300' \
'multiplexing = false' \
'' \
'[net]' \
'retry = 10' \
> /usr/local/cargo/config.toml
ENV CARGO_REGISTRIES_CRATES_IO_PROTOCOL=sparse

# Install cargo-chef once; planner + cacher only COPY the binary (avoids two slow installs)
FROM rust-base AS cargo-chef-installer
ARG CARGO_CHEF_VERSION
RUN cargo install cargo-chef --version "${CARGO_CHEF_VERSION}"

# Stage 1: Generate recipe for dependency caching
FROM ${RUST_BUILD_IMAGE} AS planner
FROM rust-base AS planner
COPY --from=cargo-chef-installer /usr/local/cargo/bin/cargo-chef /usr/local/cargo/bin/cargo-chef
WORKDIR /app
RUN cargo install cargo-chef
COPY . .
RUN cargo chef prepare --recipe-path recipe.json

# Stage 2: Build dependencies only (cached unless Cargo.lock changes)
FROM ${RUST_BUILD_IMAGE} AS cacher
FROM rust-base AS cacher
COPY --from=cargo-chef-installer /usr/local/cargo/bin/cargo-chef /usr/local/cargo/bin/cargo-chef
WORKDIR /app
RUN cargo install cargo-chef
COPY --from=planner /app/recipe.json recipe.json
RUN cargo chef cook --release --recipe-path recipe.json

# Stage 3: Build the binary
FROM ${RUST_BUILD_IMAGE} AS builder
FROM rust-base AS builder
WORKDIR /app
COPY . .
COPY --from=cacher /app/target target
75 changes: 74 additions & 1 deletion README.md
@@ -1,6 +1,64 @@
# RustFS Kubernetes Operator

RustFS Kubernetes operator (under development; not production-ready).
A Kubernetes operator for [RustFS](https://rustfs.com/) object storage, written in Rust with [kube-rs](https://github.com/kube-rs/kube). It reconciles a **`Tenant` custom resource** (`rustfs.com/v1alpha1`) and provisions ConfigMaps, Secrets, RBAC, Services, and StatefulSets so RustFS runs as an erasure-coded storage cluster inside your Kubernetes cluster.

**Status:** v0.1.0 pre-release — under active development, **not production-ready**.

## Features

- **Tenant CRD** — Declare pools, persistence, scheduling, credentials (Secret or env), TLS, and more; see [`examples/`](examples/).
- **Controller** — Reconciliation loop with status conditions (`Ready` / `Progressing` / `Degraded`), events, and safe StatefulSet update checks.
- **Operator HTTP console** — Optional management API (`cargo run -- console`, default port **9090**) used by [`console-web/`](console-web/) (Next.js UI).
- **Tooling** — CRD YAML generation, Docker multi-stage image, Kind-focused scripts under [`scripts/`](scripts/).

RustFS **S3 API** and **RustFS Console UI** inside a Tenant are exposed on **9000** and **9001** respectively; the operator’s own HTTP API is separate (typically **9090**). See [`CLAUDE.md`](CLAUDE.md) for ports and env vars.

## Requirements

- **Rust** — Toolchain from [`rust-toolchain.toml`](rust-toolchain.toml) (stable; edition 2024).
- **Kubernetes** — Target API **v1.30** (see `Cargo.toml` / `k8s-openapi` features); a reachable cluster for `server` mode.
- **console-web** (optional) — **Node.js ≥ 20** and `npm install` in `console-web/` if you run frontend lint/format or UI dev.

## Quick start

```bash
# Clone and build
git clone https://github.com/rustfs/operator.git
cd operator
cargo build --release

# Emit Tenant CRD YAML (stdout or file)
cargo run -- crd
cargo run -- crd -f tenant-crd.yaml

# Run the controller (needs kubeconfig / in-cluster config)
cargo run -- server

# Run the operator HTTP console API (default :9090)
cargo run -- console
```

**Docker**

```bash
docker build -t rustfs/operator:dev .
```

**End-to-end on Kind** (single-node or multi-node) — see [`scripts/README.md`](scripts/README.md).

## Development

From the repo root:

| Command | Purpose |
|--------|---------|
| `make pre-commit` | Full local gate: Rust `fmt` / `clippy` / `test` + `console-web` ESLint and Prettier (run after `npm install` in `console-web/`). |
| `make fmt` / `make clippy` / `make test` | Individual Rust checks. |
| `make console-lint` / `make console-fmt-check` | Frontend only. |

CI (`.github/workflows/ci.yml`) runs Rust tests (including `nextest`), `cargo fmt --check`, and `clippy`; it does **not** run `console-web` checks — use **`make pre-commit`** before opening a PR so frontend changes are validated.

Contribution workflow, commit style, and PR expectations: [`CONTRIBUTING.md`](CONTRIBUTING.md).

## Repository layout

@@ -13,4 +71,19 @@ RustFS Kubernetes operator (under development; not production-ready).
- `deploy/k8s-dev/` — Development Kubernetes YAML
- `deploy/kind/` — Kind cluster configs (e.g. 4-node)
- **examples/** — Sample Tenant CRs
- **console-web/** — Operator management UI (Next.js)
- **docs/** — Architecture and development documentation

## Documentation

| Doc | Content |
|-----|---------|
| [CLAUDE.md](CLAUDE.md) | Architecture, reconcile loop, CRD fields, RustFS ports and env (maintainer / AI context). |
| [CONTRIBUTING.md](CONTRIBUTING.md) | Quality gates, `make pre-commit`, PR rules. |
| [docs/DEVELOPMENT.md](docs/DEVELOPMENT.md) | Local environment (kind, IDE, workflows). |
| [docs/architecture-decisions.md](docs/architecture-decisions.md) | ADRs. |
| [CHANGELOG.md](CHANGELOG.md) | Release notes. |

## License

Licensed under the **Apache License 2.0** — see [LICENSE](LICENSE).
5 changes: 4 additions & 1 deletion console-web/Dockerfile
@@ -3,7 +3,10 @@ FROM node:22-alpine AS builder

WORKDIR /app

RUN corepack enable && corepack prepare pnpm@latest --activate
# Pin pnpm to package.json "packageManager" (avoid corepack fetching pnpm@latest from npm;
# that fetch can fail behind proxies / flaky TLS during docker build).
ARG PNPM_VERSION=10.28.1
RUN npm install -g pnpm@${PNPM_VERSION}

COPY package.json pnpm-lock.yaml* pnpm-workspace.yaml* ./
RUN pnpm install --frozen-lockfile
80 changes: 44 additions & 36 deletions deploy/rustfs-operator/crds/tenant-crd.yaml
@@ -96,33 +96,6 @@ spec:
format: int32
nullable: true
type: integer
securityContext:
description: |-
Override Pod SecurityContext when encryption is enabled.
If not set, the default RustFS Pod SecurityContext is used
(runAsUser/runAsGroup/fsGroup = 10001).
nullable: true
properties:
fsGroup:
description: GID applied to all volumes mounted in the Pod.
format: int64
nullable: true
type: integer
runAsGroup:
description: GID to run the container process as.
format: int64
nullable: true
type: integer
runAsNonRoot:
description: 'Enforce non-root execution (default: true).'
nullable: true
type: boolean
runAsUser:
description: UID to run the container process as.
format: int64
nullable: true
type: integer
type: object
vault:
description: 'Vault-specific settings (required when `backend: vault`).'
nullable: true
@@ -145,12 +118,21 @@ spec:
type: integer
type: object
authType:
default: token
description: |-
Authentication method: `token` (default, implemented) or `approle`
(type defined in rustfs-kms but backend not yet functional).
description: Authentication method. Defaults to `token` when not set.
enum:
- token
- approle
- null
nullable: true
type: string
customCertificates:
description: |-
Enable custom TLS certificates for the Vault connection.
When `true`, the operator mounts TLS certificate files from the KMS Secret
and configures the corresponding environment variables.
The Secret must contain: `vault-ca-cert`, `vault-client-cert`, `vault-client-key`.
nullable: true
type: boolean
endpoint:
description: Vault server endpoint (e.g. `https://vault.example.com:8200`).
type: string
@@ -167,7 +149,7 @@ spec:
nullable: true
type: string
tlsSkipVerify:
description: 'Enable TLS verification for Vault connection (default: true).'
description: Skip TLS certificate verification for Vault connection.
nullable: true
type: boolean
required:
@@ -1199,7 +1181,7 @@ spec:
format: int32
type: integer
x-kubernetes-validations:
- message: servers must be gather than 0
- message: servers must be greater than 0
rule: self > 0
tolerations:
description: Tolerations allow pods to schedule onto nodes with matching taints.
@@ -1314,12 +1296,12 @@ spec:
- servers
type: object
x-kubernetes-validations:
- messageExpression: '"pool " + self.name + " with 2 servers must have at least 4 volumes in total"'
- messageExpression: '"pool " + self.name + " must have at least 4 total volumes (servers × volumesPerServer)"'
reason: FieldValueInvalid
rule: '!(self.servers * self.persistence.volumesPerServer < 4 && self.servers == 2)'
rule: self.servers * self.persistence.volumesPerServer >= 4
- messageExpression: '"pool " + self.name + " with 3 servers must have at least 6 volumes in total"'
reason: FieldValueInvalid
rule: '!(self.servers * self.persistence.volumesPerServer < 4 && self.servers == 3)'
rule: self.servers != 3 || self.servers * self.persistence.volumesPerServer >= 6
type: array
x-kubernetes-validations:
- message: pools must be configured
@@ -1330,6 +1312,32 @@ spec:
scheduler:
nullable: true
type: string
securityContext:
description: |-
Override the default Pod SecurityContext (runAsUser/runAsGroup/fsGroup = 10001).
Applies to all RustFS pods in this Tenant.
nullable: true
properties:
fsGroup:
description: GID applied to all volumes mounted in the Pod.
format: int64
nullable: true
type: integer
runAsGroup:
description: GID to run the container process as.
format: int64
nullable: true
type: integer
runAsNonRoot:
description: 'Enforce non-root execution (default: true).'
nullable: true
type: boolean
runAsUser:
description: UID to run the container process as.
format: int64
nullable: true
type: integer
type: object
serviceAccountName:
nullable: true
type: string