feat(k8s): unify AWS (EKS) and on-premise Kubernetes deployment under bin/k8s #5891

Description

@aicam

Feature Summary

Today bin/k8s on main ships a single Helm chart that targets a local / on-premise Kubernetes cluster only. The same chart is used for our AWS (EKS) deployment, but everything underneath it — the EKS cluster, S3 bucket (LakeFS backend), IAM/IRSA roles, Elastic IP, DNS, node pools, cert-manager — is created by hand and lives nowhere in the repo. None of that substrate is captured as code, so standing up Texera on a fresh AWS account is an undocumented, manual prerequisite checklist that drifts and is hard to reproduce.

Goal: unify the local and AWS deployment stories so there is one documented, reproducible path from "empty cluster (or empty AWS account)" to "running Texera," while keeping the Helm chart itself cloud-agnostic and the on-prem/local install unchanged as the default.

This issue tracks the big picture; it follows the design discussion in #5641 (current lean: eksctl for cluster substrate, with the Helm chart staying portable, and a small amount of non-cluster substrate — S3 bucket, EIP, DNS — scripted alongside).

Proposed Solution or Design

Land the change as a series of small, independently reviewable, non-breaking PRs (on-prem stays the default throughout). Each becomes a sub-task of this issue:

  • Refactor: reorganize Helm templates into common/aws/onprem + per-component subfolders — establishes the structure later steps build on (no behavior change).
  • Make object storage pluggable — optional in-cluster MinIO + external S3 for the app, LakeFS, and Lakekeeper.
  • Core-services node placement + autoscaler safety: nodeSelector/tolerations for singletons + do-not-disrupt annotations (empty/no-op by default).
  • AWS load-balancer front door — extend gatewayConfig with an EnvoyProxy carrying AWS NLB/EIP/AZ annotations (renders nothing off-AWS).
  • Warm computing-unit pool — warm placeholder to mitigate autoscaler cold-start latency on CU launches.
  • eksctl provisioning + util manifests + AWS runbook — example eksctl ClusterConfig, EBS StorageClass / cert-manager util manifests, and an end-to-end provisioning doc.
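
As a rough illustration of the last step, an example eksctl ClusterConfig could look something like the sketch below. It is not the config proposed in the sub-PR: the cluster name, region, instance types, sizes, account ID, policy name, service-account name, namespace, and the `example.com/pool` label/taint are all placeholders assumed for illustration. Only the field names come from the eksctl `v1alpha5` schema.

```yaml
# cluster.yaml: hypothetical example; every value below is a placeholder.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: texera-example        # placeholder cluster name
  region: us-west-2           # placeholder region

iam:
  withOIDC: true              # OIDC provider, required for IRSA
  serviceAccounts:
    - metadata:
        name: lakefs          # assumed service-account name
        namespace: texera     # assumed namespace
      attachPolicyARNs:
        # placeholder policy granting access to the S3 bucket
        - arn:aws:iam::111122223333:policy/ExampleTexeraS3Access

managedNodeGroups:
  # Small fixed pool for singleton core services.
  - name: core
    instanceType: m5.large
    minSize: 1
    maxSize: 2
    desiredCapacity: 1
    labels:
      example.com/pool: core
    taints:
      - key: example.com/pool
        value: core
        effect: NoSchedule
  # Autoscaled pool for computing units.
  - name: compute
    instanceType: m5.xlarge
    minSize: 0
    maxSize: 5
    desiredCapacity: 0

addons:
  - name: aws-ebs-csi-driver  # backs the EBS StorageClass
```

A file like this would be applied with `eksctl create cluster -f cluster.yaml`. The non-cluster substrate (S3 bucket, EIP, DNS) is outside what this config covers and would still be scripted separately.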

Each step is verified to keep the default (on-prem) render unchanged and to render the AWS-specific resources only under an opt-in AWS values overlay.
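
To make the opt-in overlay idea concrete, here is a minimal sketch of what an AWS values file might contain. Apart from `gatewayConfig`, which the list above names, every key is a hypothetical name chosen for illustration and not the chart's actual schema. The bucket, region, and EIP allocation ID are placeholders, and the do-not-disrupt annotation assumes Karpenter is the autoscaler in use.

```yaml
# values-aws.yaml: hypothetical overlay; key names are assumptions.
objectStorage:
  minio:
    enabled: false                  # skip in-cluster MinIO on AWS
  external:
    enabled: true
    bucket: example-texera-bucket   # placeholder S3 bucket
    region: us-west-2               # placeholder region

coreServices:
  # Pin singletons to a dedicated node pool (placeholder label/taint).
  nodeSelector:
    example.com/pool: core
  tolerations:
    - key: example.com/pool
      operator: Equal
      value: core
      effect: NoSchedule
  podAnnotations:
    # Assumes Karpenter; other autoscalers use different annotations.
    karpenter.sh/do-not-disrupt: "true"

gatewayConfig:
  envoyProxy:
    enabled: true
    serviceAnnotations:
      # AWS Load Balancer Controller annotations for an NLB with an EIP.
      service.beta.kubernetes.io/aws-load-balancer-type: external
      service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
      service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
      service.beta.kubernetes.io/aws-load-balancer-eip-allocations: eipalloc-EXAMPLE
```

Under this assumption, the check described above amounts to rendering the chart twice with `helm template`, once with default values and once with `-f values-aws.yaml`, then confirming that the default output is unchanged from main and that the AWS-only resources appear only in the second render.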

Affected Area

Deployment / Infrastructure
