Restructures documentation for an open-source audience. The README is now
a slim landing page (project framing, why-Hoptimator, quickstart pointer,
status, license) instead of a mixed dev-guide. Detailed content moves into
a journey-based docs/ tree modeled on linkedin/venice's structure:
docs/
index.md -- top-level landing
getting-started/
index.md -- section landing
quickstart.md -- 5-minute walkthrough on Docker Desktop
concepts.md -- vocabulary reference
architecture.md -- life of a SQL statement, module map
resources/
learn-more.md -- engineering blog posts and case studies
Also cleans up CONTRIBUTING.md: removes placeholder "(link to more info)"
URLs and adds a how-to-file-an-issue / how-to-send-a-PR section.
Phase 2 (user guide) and beyond will follow on the same branch.
Adds the User guide section to docs/. Covers the three client interfaces
(SQL CLI, JDBC, MCP) and the two reference pages (DDL, Hints):
docs/user-guide/
  index.md          -- section landing
  sql-cli.md        -- ./hoptimator script, sqlline + custom commands
                       (!intro, !resolve, !pipeline, !specify)
  jdbc.md           -- jdbc:hoptimator:// URL format, full
                       connection-property reference, Java example,
                       system tables
  mcp-server.md     -- MCP tool surface (discovery / planning /
                       execution), recommended agent workflow
  ddl-reference.md  -- CREATE/DROP for views/materialized views/
                       triggers/functions/tables,
                       PAUSE/RESUME/REFRESH/FIRE, identifier rules,
                       k8s system schema, what isn't supported
  hints.md          -- template hints vs connector hints, where to set
                       them, how to read what was applied
Also promotes the docs/index.md "User guide" entry from "coming soon" to
linked, and updates the README's Documentation section to match.
Two corrections after a closer look at what the executor actually handles:
DDL reference:
- Drops CREATE FUNCTION as a documented form. The grammar accepts it,
  but no executor handler exists.
- Drops the standalone REFRESH/FIRE section and the "PAUSE/RESUME also
  works for materialized views" line. None of these have executor support.
- Adds a "Reserved syntax" section that lists what parses but does not
  execute today (REFRESH MATERIALIZED VIEW, FIRE *, PAUSE/RESUME
  MATERIALIZED VIEW, CREATE FUNCTION) so readers don't get a false
  positive from a successful parse.
MCP server:
- Rewrites the `query` description to explain why it's restricted to
  ADS / PROFILE / METADATA / K8S: not a safety allowlist, but the only
  schemas Hoptimator can answer queries from without a configured engine.
  The Zeppelin POC for notebook-style execution is acknowledged as
  incomplete.
- Updates the limitations bullet accordingly so it doesn't suggest
  "just widen the allowlist" as a fix.
Six corrections from review:
CONTRIBUTING:
- Add a "cover your changes" step. Document `make coverage` and an 80%
  line-coverage target on changed code (CI currently enforces a softer
  60% / 40%).
README:
- Badge: add explicit `&label=CI` so the build status renders as "CI"
  instead of the default "build".
- Drop the link to GitHub Packages. The page is empty in practice; only
  JFrog has artifacts today.
- Soften the "That one statement becomes:" list. The exact resources
  emitted depend on the registered Databases and templates, not on
  Hoptimator. Frame it as "with a typical Kafka + Flink setup" and add
  a paragraph explaining that the same SQL can target a different stack
  by swapping templates.
- Replace the "Kubernetes-native" why-bullet with a more honest
  "Kubernetes out of the box, not as a hard requirement": the bundled
  deployers target K8s, but `Deployer` is the actual extension point.
Quickstart:
- Add a note on `CREATE OR REPLACE MATERIALIZED VIEW`. Without it,
  re-running a CREATE for an existing view fails; with it, the
  development loop is much faster.
Concepts (engine clarification):
- Split "Engines and connectors" into separate "Connectors" and
  "Engines (optional)" sections. Connectors do not require an Engine
  to function; Hoptimator emits YAML (e.g. a FlinkSessionJob) and an
  unrelated operator (Flink Kubernetes Operator, etc.) runs it.
- The Engine CRD is specifically about *query* execution (e.g. running
  SELECT against tables that need a runtime). Pipeline materialization
  does not need one. Mark the Engine path as partially developed today.
- Strengthen the Deployers section to lead with "Kubernetes is the
  default, not a hard requirement."
- Rename the trailing "Engines today" section to "Bundled adapters and
  runtimes" and reframe accordingly.
Architecture:
- Step 4 (Deploy) now leads with "Kubernetes is the path of least
  resistance, not a hard requirement" and notes that the implementation
  resources Hoptimator emits aren't run by Hoptimator; the relevant
  operator (Strimzi, Flink Kubernetes Operator, etc.) runs them.
- Reword the "A new engine" extension bullet so it doesn't conflate
  pipeline runtimes with the Engine CRD's query path.
JDBC user guide:
- Drop the GitHub Packages link from the dependency section to match
  the README change.
… MCP DDL)
Four corrections from review.
DDL reference:
- Add a "Partial views (multiple pipelines into one sink)" subsection
  under CREATE MATERIALIZED VIEW. Explains the `$<suffix>` syntax,
  shows the multi-writer pattern with two views feeding the same
  VENICE.AUDIENCE sink, and recommends partial views as the default
  for production cases. Cross-link from the CREATE MATERIALIZED VIEW
  bullet list.
- Rewrite CREATE TABLE. The previous text underplayed it: CREATE
  TABLE goes through the Deployer SPI to actually provision real
  infrastructure (e.g. creating a Kafka topic via the Kafka deployer
  rather than a separate Strimzi manifest). Show the example of
  declaring a Kafka topic with partitions and then using a partial
  view to write to it. Note that AS <query> isn't supported today.
- Trigger section: add one-line framing about what triggers enable
  (backfills, rETL refreshes, downstream notifications, ops hooks)
  and link to the concepts page for the bigger picture.
Concepts:
- Expand the TableTrigger section. Lead with what triggers actually
  let you express (backfills tied to offline-tier arrivals, rETL on
  cron, downstream notifications, operational hooks). Explain the
  status-patch mechanism that fires triggers and how that makes them
  composable with whatever already owns the upstream system. Note
  that triggers can be auto-generated from TableTemplates so adapters
  can ship sensible defaults.
- Close with the design summary: "pipelines stay pure data-flow
  expressions, triggers carry the imperative side effects, and the
  two compose at the table level."
MCP server:
- Add a limitations bullet flagging that `modify` only accepts
  CREATE [OR REPLACE] MATERIALIZED VIEW and DROP today, not the full
  Hoptimator DDL surface. Triggers, plain views, tables, and the
  inspection-only DDL still need the JDBC driver or the SQL CLI.
Plus an intentional in-flight edit to docs/user-guide/sql-cli.md:
- Replace placeholder `MY.AUDIENCE` examples with `ADS.AUDIENCE`
  (matches the demo's registered schemas) and the elided `!pipeline`
  / `!specify` outputs with the actual ones produced by the demo.
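The multi-writer pattern mentioned above (two partial views feeding the same VENICE.AUDIENCE sink) can be pictured with a small sketch. This is illustrative Python, not Hoptimator code: it assumes the text before `$` names the shared sink and the text after it distinguishes the writer, and the suffixes `clicks` and `signups` are hypothetical.

```python
from collections import defaultdict

def split_view_name(name):
    """Split 'SINK$suffix' into (sink, suffix); suffix is None without '$'.

    Assumption for this sketch: the part before '$' is the shared sink.
    """
    sink, sep, suffix = name.partition("$")
    return sink, (suffix if sep else None)

def group_by_sink(view_names):
    """Group partial-view suffixes under the sink they write to."""
    writers = defaultdict(list)
    for name in view_names:
        sink, suffix = split_view_name(name)
        writers[sink].append(suffix)
    return dict(writers)

# Hypothetical view names: two writers, one sink.
views = ["VENICE.AUDIENCE$clicks", "VENICE.AUDIENCE$signups"]
grouped = group_by_sink(views)
print(grouped)
```

Both names resolve to one sink entry with two writers, which is the shape the partial-view subsection describes.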
The "Kubernetes-native control plane for multi-hop data pipelines" framing
oversold the K8s coupling and undersold the SQL-first part. Kubernetes is
the default deployer, not the differentiator — Hoptimator's job is to
compile SQL into multi-system pipelines, with the runtime substrate
pluggable underneath.
New phrasing:
- README h3: "A SQL control plane for multi-system data pipelines"
- README intro: "Hoptimator turns SQL into running, multi-hop data
pipelines that span Kafka, Flink, Venice, and anything
else you plug in."
- docs/index: "Hoptimator is a SQL control plane for multi-system data
pipelines. You write SQL; it figures out the topology
across Kafka, Flink, Venice, and whatever else you plug
in, generates the specs, deploys them, and reconciles
them."
This keeps the "control plane" framing (planner-not-runtime) while
removing the K8s lock-in suggestion and naming the actual systems
Hoptimator spans up front.
The previous section described LogicalTables in six lines, framed mostly
around the YAML shape. That undersells what they actually do: the
abstraction is the point, not the driver.
Rewrites the "Logical tables" section in concepts.md to cover:
- The abstraction value (one named entity, N physical backends;
  collapses the typical mess of three names + hand-built sync jobs into
  a single declaration).
- The tier model (nearline / online / offline) with a table mapping each
  tier to typical backends and the role it plays.
- What you get for free at deploy time: physical tier resources via the
  Deployer SPI, implicit inter-tier sync pipelines (nearline → online,
  nearline → offline), auto-backfill triggers when offline is bound, and
  one schema source-of-truth resolved from nearline.
- Why this matters as an abstraction: tier-agnostic application code,
  the right topology being the cheap path, and clean composition with
  partial views / materialized views.
- The classic use case (lambda / kappa for feature stores) so readers
  recognize the pattern.
- An explicit note that LogicalTables ship as a JDBC driver today but
  function as an abstraction model; the deployer does the heavy lifting
  at create time, not the driver at query time.
Also updates the at-a-glance table entry from "A single logical entity
that spans multiple physical storage tiers" to lead with "An abstraction
model" and call out the auto-sync/auto-backfill behaviors.
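The "what you get for free at deploy time" behavior can be read as a rule over which tiers are bound. The following is a rough Python model of that rule only, not of the deployer: the function name, the return shape, and the resource labels are hypothetical, and it assumes sync pipelines always originate at the nearline tier as the commit message states.

```python
def implied_resources(table, tiers):
    """List the sync pipelines and triggers implied by a set of bound tiers.

    Models only the rules quoted above:
      - nearline -> online and nearline -> offline sync pipelines
      - an auto-backfill trigger when the offline tier is bound
    """
    bound = set(tiers)
    pipelines = []
    if "nearline" in bound:
        for target in ("online", "offline"):
            if target in bound:
                pipelines.append(f"{table}: nearline -> {target}")
    # Label is hypothetical; the real trigger naming is not shown here.
    triggers = [f"{table}: backfill"] if "offline" in bound else []
    return {"pipelines": pipelines, "triggers": triggers}

# Hypothetical table name with all three tiers bound.
full = implied_resources("AUDIENCE", ["nearline", "online", "offline"])
print(full)
```

Binding fewer tiers shrinks the output accordingly; for example, nearline plus online alone yields one pipeline and no trigger.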
Hoptimator's parser uses Calcite's `'key' 'value'` form (whitespace, no
`=`) inside WITH clauses, not the `'key'='value'` form. The `=` form is
Flink's syntax and only shows up in auto-generated pipeline output.
Fixes the four user-facing DDL signatures and the example:
- CREATE MATERIALIZED VIEW
- CREATE TRIGGER
- CREATE TABLE (both signature and example)
- CREATE TABLE example: `'kafka.partitions' '8'`
Adds a "WITH options syntax" section explicitly noting the difference
and calling out that the `=` form readers see in `!pipeline` /
`!specify` output is the Flink engine's syntax, not Hoptimator's input
grammar.
The `=` instances elsewhere in the docs (auto-generated Flink SQL in
sql-cli.md output blocks, the truncated pipeline SQL in quickstart.md)
are correct as-is and were left alone.
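To make the two spellings concrete, here is a tiny Python helper that renders a single option pair in either form. It is only a formatting illustration: the function and its `style` labels are hypothetical, it does no escaping of embedded quotes, and it does not attempt to show how multiple pairs are separated.

```python
def format_option(key, value, style="hoptimator"):
    """Render one WITH option pair in the requested style.

    'hoptimator' -> 'key' 'value'   (whitespace, no '=', as input DDL)
    'flink'      -> 'key'='value'   (as seen in generated pipeline SQL)
    """
    if style == "hoptimator":
        return f"'{key}' '{value}'"
    if style == "flink":
        return f"'{key}'='{value}'"
    raise ValueError(f"unknown style: {style}")

print(format_option("kafka.partitions", "8"))
print(format_option("kafka.partitions", "8", style="flink"))
```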
Adds the Kubernetes section to docs/. Five new pages covering everything
needed to operate Hoptimator on a cluster:
docs/kubernetes/
  index.md          -- section landing
  operator.md       -- what hoptimator-operator does, controllers it
                       runs (PipelineReconciler, TableTriggerReconciler,
                       ViewReconciler), how to deploy it, RBAC,
                       namespace scoping, lifecycle of a pipeline,
                       when not to run the operator
  crd-reference.md  -- field-by-field for all 10 CRDs (Database, View,
                       Pipeline, TableTemplate, JobTemplate,
                       TableTrigger, Subscription, LogicalTable,
                       Engine, SqlJob) with spec/status/printer-column
                       tables and one example per kind
  templates.md      -- TableTemplate / JobTemplate authoring deep
                       dive: matching rules (databases, methods),
                       full placeholder syntax (subst, defaults,
                       conditionals, transforms, multiline), the
                       default placeholders K8sSourceDeployer and
                       K8sJobDeployer inject, where hint and
                       configmap values fit, common patterns
  triggers.md       -- TableTrigger operational guide: cron vs
                       status-driven firing, pause/resume,
                       jobProperties, common patterns (offline-tier
                       backfill, rETL, downstream notification, ops
                       hooks), when not to use a trigger
  configuration.md  -- hoptimator-configmap, ConfigProvider SPI,
                       three-source precedence (system properties <
                       configmap < hints), file-like keys and lazy
                       expansion, pod-namespace detection, writing
                       a custom ConfigProvider
Also promotes the docs/index.md and README.md "Kubernetes guide
(coming soon)" entries to live links.
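The three-source precedence listed for configuration.md (system properties < configmap < hints) amounts to a layered lookup in which the most specific source wins. A minimal Python sketch of that ordering, for illustration only: the function name and every key and value below are hypothetical, and this does not reproduce the ConfigProvider SPI or its lazy expansion of file-like keys.

```python
from collections import ChainMap

def resolve_config(system_props, configmap, hints):
    """Merge three sources so that higher-precedence sources win.

    Precedence, lowest to highest: system properties < configmap < hints.
    ChainMap searches its maps left to right, so the highest-precedence
    source is passed first.
    """
    return dict(ChainMap(hints, configmap, system_props))

# Hypothetical keys and values, chosen only to show the override order.
system_props = {"parallelism": "1", "region": "us-west"}
configmap = {"parallelism": "2", "image": "example:latest"}
hints = {"parallelism": "4"}

resolved = resolve_config(system_props, configmap, hints)
print(resolved["parallelism"])  # the hint overrides both lower layers
```

Keys that appear in only one layer pass through unchanged, so `region` and `image` survive the merge.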
- SqlJob: reframe as a primitive consumed by an external SqlJob operator
  that deploys Flink and Flink-Beam SQL jobs. Drop the "useful when a
  job doesn't fit CREATE MATERIALIZED VIEW" framing, which conflated
  SqlJob with materialized-view tooling.
- operator.md: drop the "does not yet emit Kubernetes events" line.
  Confirmed by grep that no events are emitted, but the term is jargon
  that doesn't help an open-source reader without a side explanation.
  The remaining "logs are the primary debugging surface" carries the
  practical guidance.
The full `k8s.*` connection-property table in the JDBC user guide
contradicted the "Kubernetes is the default deployer, not a hard
requirement" framing established elsewhere; those properties are
deployer-specific, not driver-specific.
Consolidates so the table lives in one place:
- jdbc.md: replaces the "Kubernetes context" subsection with a short
  "Deployer-specific properties" note pointing at the Kubernetes guide,
  and explicitly calls out that a different deployer would expose its
  own `<deployer>.*` properties.
- kubernetes/configuration.md: takes the full table (now with the
  Default column merged in from the jdbc.md version), replaces the
  pointer to jdbc.md with one that names the driver-level surface
  (catalogs, hints, fun) so readers know what each page covers.
The table content is unchanged; this is a re-homing edit.
The Java `V1alpha1*` model classes under hoptimator-k8s are generated
from the CRD YAMLs by `make generate-models` (which shells out to the
upstream Kubernetes Java client's `crd-model-gen` Docker image). Without
that callout, contributors who add a CRD field have no obvious way to
discover that they need to regenerate.
Adds the reference in the two natural places:
- CONTRIBUTING.md: new step in the PR checklist ("Regenerate Java models
if you touched a CRD"), with the command, the Docker requirement, and
a pointer to the upstream tool.
- docs/kubernetes/crd-reference.md: a callout block near the top of the
page so it's visible to anyone landing on the CRD reference while
modifying one.
The previous text described `{{var==value}}` as an inline conditional
that emits a block of the template when the condition matches. That's
wrong. Reading Template.java's `render` method:
- On a `==` / `!=` marker whose condition is satisfied, the marker
is erased and rendering continues normally.
- On a marker whose condition is *not* satisfied, the renderer
`return null;`s the entire template — it produces nothing.
So the markers are template-level guards, not inline blocks. The
real-world pattern is "two templates with mirrored guards; whichever
condition matches is the one that fires." The bundled flink-template
uses this to swap between SQL-job and Beam-job entry classes — there
isn't an `{{end}}` companion marker (which the docs invented) and the
syntax doesn't conditionally include a single line.
Fixes:
- Placeholder syntax table: drop the imagined `{{end}}` companion;
reword the `==` / `!=` rows as "template-level guard: render this
template only if X; otherwise skip the whole template."
- Drop `{{var toName}}` from the table — it's documented in
Template.java's javadoc but not implemented in `applyTransform()`,
which only handles `toLowerCase`, `toUpperCase`, and `concat`.
Documenting it would invite reports of a "broken" feature.
- Replace the "Conditional rendering example" section with one that
shows the actual pattern: two JobTemplates with mirroring `==`
guards on `flink.app.type`, only one of which renders for a given
pipeline.
- Rename the section "Conditional templates" so the heading no longer
implies block-level conditionals.
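The guard behavior described above can be sketched in a few lines. This is a simplified Python model of the semantics only, not a port of Template.java: the regexes, the treatment of missing variables as empty strings, the values `SQL` and `BEAM`, and the entry-class names are all assumptions made for illustration.

```python
import re

# Guard markers: {{name==value}} or {{name!=value}}.
GUARD = re.compile(r"\{\{\s*([\w.]+)\s*(==|!=)\s*([^}]*?)\s*\}\}")
# Plain substitution markers: {{name}}.
SUBST = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")

def render(template, env):
    """Return the rendered template, or None if any guard is not satisfied."""
    for name, op, expected in GUARD.findall(template):
        matches = env.get(name) == expected
        if matches != (op == "=="):
            return None  # a failed guard skips the whole template
    without_guards = GUARD.sub("", template)  # satisfied guards are erased
    return SUBST.sub(lambda m: env.get(m.group(1), ""), without_guards)

# Two templates with mirrored guards; only one renders for a given pipeline.
sql_template = "{{flink.app.type==SQL}}entryClass: {{sql.entry}}"
beam_template = "{{flink.app.type==BEAM}}entryClass: {{beam.entry}}"
env = {
    "flink.app.type": "SQL",
    "sql.entry": "com.example.SqlRunner",    # hypothetical class name
    "beam.entry": "com.example.BeamRunner",  # hypothetical class name
}

rendered = [r for r in (render(t, env) for t in (sql_template, beam_template))
            if r is not None]
print(rendered)
```

Because a failed guard returns nothing for the whole template, there is no way to drop a single line with this mechanism; the choice is always between whole templates.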
Adds the Extending section. Five pages covering the SPI surfaces a
contributor would touch when integrating a new system or customizing
behavior end-to-end:
docs/extending/
  index.md             -- section landing with a "pick the right
                          surface" decision table; explains the
                          ServiceLoader-based loading pattern and
                          the META-INF/services file layout
  data-sources.md      -- JDBC adapter walkthrough (DriverVersion,
                          Schema, register()), Database CRD wiring,
                          TableTemplate-only path for declarative
                          integrations, when to reach for a custom
                          Deployer instead, the connector-only
                          pattern for pre-existing infrastructure,
                          when ConnectorProvider applies
  deployers.md         -- Deployer interface + lifecycle (create /
                          update / delete / specify / restore),
                          DeployerProvider with priority semantics,
                          KafkaDeployerProvider as a concrete shape
                          to copy, opt-in Validation, testing
                          surface, common pitfalls (missing restore,
                          side effects in specify, wrong priority,
                          wrong exception types)
  validators.md        -- the three points validation runs
                          (parsed SQL, resolved object, deployer
                          collection), Issues tree API with severity
                          levels, the two participation patterns
                          (Validated on your own type vs.
                          ValidatorProvider for cross-cutting
                          policy), built-in providers as references,
                          authoring patterns and testing
  config-providers.md  -- ConfigProvider SPI mechanics, when to write
                          one (vault, central config service,
                          dynamic per-connection values), example
                          sketch, interaction with hints/JDBC props,
                          caveats (latency on every connection,
                          tolerated errors, no SPI ordering control)
Also promotes the docs/index.md and README.md "Extending Hoptimator
(coming soon)" entries to live links.
Lands a top-level CLAUDE.md so Claude Code agents pick up the
high-leverage context for this repo automatically: what the project is,
where the docs live, common commands, the active module layout (plus the
marked-for-deletion list), the gotchas that aren't obvious from reading
code, the patterns to prefer, distilled testing rules, and a "keep docs
in sync" reminder.
Sections:
- What this repo is: three roles (planner/adapter/operator); Kubernetes
  is default-not-required; Deployer SPI is the actual extension point.
- Read these first: pointers into docs/ rather than duplicated content.
- Common commands: make build/test/integration-tests/coverage/
  generate-models/deploy*; ./hoptimator; ./start-mcp-server. Calls out
  that generate-models is required after any CRD edit.
- Module layout: active modules with one-line purpose; explicit list of
  marked-for-deletion modules so contributions don't land there.
- Gotchas: the items an agent would otherwise discover by trial: WITH
  'key' 'value' (not =), partial-views as default, Engine CRD partially
  developed, reserved-but-unimplemented DDL, template guards are
  template-level not inline, toName documented but not implemented, MCP
  query allowlist isn't a safety mechanism, alpha status (don't add
  backcompat shims unless asked), Checkstyle + SpotBugs enforced.
- Patterns to prefer: declarative > imperative, validators > runtime
  checks, hints > template edits, configmap > hints, CREATE OR REPLACE
  for iteration.
- Testing: distilled from the internal testing-best-practices file.
  Focus on the non-obvious anti-patterns an agent would otherwise reach
  for: never @MockitoSettings LENIENT class-wide, no reflection field
  injection, no coverage exclusions for new files, doReturn for wildcard
  generics, MockedStatic as @Mock field not try-with-resources, "find
  bugs not coverage."
- Keep docs in sync: concrete mapping from change-type to which doc(s)
  to update, plus the "README is slim, docs/ is journey-based" framing
  reminder.
~175 lines. Designed to load fast and survive multiple sessions without
rotting.
…e code
Four targeted improvements from review.
Concepts:
- Add a Validators section. Validators were only mentioned in
  extending/validators.md, so the conceptual surface was missing from
  the place readers learn vocabulary. Covers the "Deployer makes things
  real, Validator says whether they're allowed" framing, when validation
  runs, and what the bundled validators do.
- Add a Validator row to the at-a-glance table.
- Open the page with an Apache Calcite acknowledgment + link to the
  Calcite reference, since the parser/planner/JDBC layer all build on it
  and readers should know where to look for SELECT/expression syntax.
DDL reference:
- Add a Quidem callout in the page intro pointing at the .id files under
  each module's src/test/resources/ as a fast way to see
  currently-passing examples of every DDL form.
CONTRIBUTING:
- Add a Quidem paragraph to the "Build and test locally" step explaining
  what .id files are, where they live, and that they're the right place
  to extend coverage when changing DDL parsing, planning, or any
  user-facing SQL behavior.
Extending → deployers:
- Drop the trimmed KafkaDeployerProvider Java snippet; replace with
  bullets pointing at the real KafkaDeployerProvider and KafkaDeployer
  files on GitHub. Reading the source is more reliable than reading a
  drift-prone copy of it.
Extending → data-sources:
- Drop the synthesized MySystemDriver code block; replace with a short
  list pointing at the bundled hoptimator-demodb / -kafka / -venice /
  -mysql modules as progressively richer reference adapters. Keep the
  prose description of the common shape.
Other illustrative code blocks (NamingPolicyValidator,
VaultConfigProvider, the deployer testing snippet) are kept; they're not
copies of repo code, they're shape-of-what-you'd-write examples that
don't drift with the implementation.
Five overlap cleanups identified in review. Net effect: concepts.md
goes from ~430 lines to 328 with no information loss — every cut
already lives in a more authoritative page.
Concepts:
- TableTriggers: drop the four-bullet "what triggers enable" list
and the status-patch explanation, since kubernetes/triggers.md
now owns the operational depth. Keep the one-line decoupling
philosophy ("pipelines stay pure data-flow expressions, triggers
carry the imperative side effects") and link out for patterns.
~35 lines removed.
- Logical tables: drop the "Why this matters as an abstraction"
three-bullet section, which restated "What you get for free" from
a different angle. Drop the "Implementation note" header and
fold its substance into a one-line aside. The "classic use case"
section becomes one paragraph instead of a bulleted breakdown.
~50 lines removed; the unique abstraction-model framing stays.
- Configuration and hints: replace the ~25-line section with a
5-line stub pointing at user-guide/hints.md (full hint mechanics)
and kubernetes/configuration.md (configmap + precedence). The
details lived in three places; now they live in two.
Quickstart:
- Drop the six-row CLI command table that duplicated
user-guide/sql-cli.md. Keep the !intro / !quit pointers and
link out for the rest. ~10 lines removed; one fewer place for
the table to drift.
Plus user's intentional in-flight edits to extending/data-sources.md
and extending/deployers.md trimming "Read the source rather than rely
on a snippet here" to cleaner intro lines.
Cleanups #6-#8 from review.
#6 Engine "partially developed" caveat:
- crd-reference.md kept its full Engine section content, but the
  "partially developed" framing was identical to concepts.md's Engines
  (optional) section. Replace with a one-line pointer to concepts; keep
  the field tables since those are the unique value of a CRD-reference
  page.
- Trim the at-a-glance row similarly.
- concepts.md remains the canonical home of the Engine framing;
  CLAUDE.md preserves it as a gotcha for agents.
#7 K8s-as-pluggable framing:
- architecture.md had it twice: once as prose in "Step 4 — Deploy" and
  once as a bullet in "Where to extend". Trim the Step 4 prose: drop the
  "path of least resistance, not a hard requirement" sentence and the
  "every page assumes a cluster" reasoning, since the same point is made
  in the extension bullet and on the README's Why-Hoptimator list. The
  Step 4 paragraph still notes that bundled deployers target K8s and
  that the SPI is the swap point.
- README's Why-Hoptimator bullet remains the canonical statement.
#8 DDL reference's two non-support sections:
- Merge "Reserved syntax" and "What is *not* supported" into one "What's
  not supported" section with two clearly labeled subgroups: "Out of
  scope" (INSERT/UPDATE/DELETE against arbitrary tables, ALTER TABLE,
  transactions, stored procedures) and "Parses but not yet executed"
  (REFRESH MATERIALIZED VIEW, FIRE *, PAUSE/RESUME MV, CREATE FUNCTION).
  Same content; readers no longer have to scroll past
  identifiers/WITH/system-tables to find the second list.
Two normalization passes.
hints.md (user guide):
- Drop the deep "Template hints" / "Connector hints" sub-sections that
duplicated kubernetes templates content; replace with a brief two-flavor
paragraph and a pointer to "Templates and configuration" for the full
story.
- Remove the connector-hint examples block and the source/sink
segment-meaning table (lives in templates/configuration now).
- Keep the user-side surface that's unique to this page: where to set
hints (CLI, JDBC, Subscription, MCP), format, advisory nature,
reading what was applied via `kubectl get pipeline ... -o yaml`,
and the hint-vs-template-default decision rule.
kubernetes/templates.md → kubernetes/configuration.md (merged):
- Templates were already part of "configuration" in the broader sense
(templates are the artifacts; configmap/hints/system-props supply the
values). Merge the two pages into a single "Templates and
configuration" reference, file named configuration.md (the more
general name).
- Top half: template authoring (matching, structure, placeholder
syntax, conditionals, patterns, tips) — content from the old
templates.md.
- Bottom half: where placeholder values come from — deployer-injected
defaults, the three configuration sources (hoptimator-configmap,
JDBC connection properties / Subscription hints, JVM system
properties), precedence, file-like keys, k8s.* connection-properties
reference, pod-namespace detection, ConfigProvider extension pointer.
- Update every cross-link: docs/index.md, docs/user-guide/hints.md,
docs/extending/{data-sources,deployers,index}.md,
docs/kubernetes/{crd-reference,index}.md.
- Delete docs/kubernetes/templates.md.
Plus user-side fixes in this round:
- ddl-reference.md: drop the standalone "WITH options syntax" section
(the form signatures already show the 'key' 'value' shape inline).
- architecture.md, jdbc.md: change example schema MY.AUDIENCE →
ADS.AUDIENCE to match the demo's registered schemas.
Collaborator
Nice! Wonder if we can set up some sort of automation to revisit this as changes are made.
Collaborator
Author
Automate, not sure. I did add a blurb to the CLAUDE.md that will hopefully help it not go out of date, assuming we use Claude. But I did go through this change and purposefully exclude really specific things that seemed likely to drift.
ryannedolan approved these changes Apr 28, 2026
Collaborator
ryannedolan
left a comment
Crazy to see how many features we have hidden in here :)
jogrogan added a commit that referenced this pull request May 1, 2026
* Docs: rewrite README and add docs/ tree (Phase 1)
Restructures documentation for an open-source audience. The README is now
a slim landing page (project framing, why-Hoptimator, quickstart pointer,
status, license) instead of a mixed dev-guide. Detailed content moves into
a journey-based docs/ tree modeled on linkedin/venice's structure:
docs/
index.md -- top-level landing
getting-started/
index.md -- section landing
quickstart.md -- 5-minute walkthrough on Docker Desktop
concepts.md -- vocabulary reference
architecture.md -- life of a SQL statement, module map
resources/
learn-more.md -- engineering blog posts and case studies
Also cleans up CONTRIBUTING.md: removes placeholder "(link to more info)"
URLs and adds a how-to-file-an-issue / how-to-send-a-PR section.
Phase 2 (user guide) and beyond will follow on the same branch.
* Docs: add user guide (Phase 2)
Adds the User guide section to docs/. Covers the three client interfaces
(SQL CLI, JDBC, MCP) and the two reference pages (DDL, Hints):
docs/user-guide/
  index.md          -- section landing
  sql-cli.md        -- ./hoptimator script, sqlline + custom commands
                       (!intro, !resolve, !pipeline, !specify)
  jdbc.md           -- jdbc:hoptimator:// URL format, full
                       connection-property reference, Java example,
                       system tables
  mcp-server.md     -- MCP tool surface (discovery / planning /
                       execution), recommended agent workflow
  ddl-reference.md  -- CREATE/DROP for views/materialized views/
                       triggers/functions/tables,
                       PAUSE/RESUME/REFRESH/FIRE, identifier rules,
                       k8s system schema, what isn't supported
  hints.md          -- template hints vs connector hints, where to set
                       them, how to read what was applied
Also promotes the docs/index.md "User guide" entry from "coming soon" to
linked, and updates the README's Documentation section to match.
* Docs: correct DDL/MCP coverage based on what's actually wired up
Two corrections after a closer look at what the executor actually handles:
DDL reference:
- Drops CREATE FUNCTION as a documented form. The grammar accepts it,
but no executor handler exists.
- Drops the standalone REFRESH/FIRE section and the "PAUSE/RESUME also
works for materialized views" line. None of these have executor support.
- Adds a "Reserved syntax" section that lists what parses but does not
execute today (REFRESH MATERIALIZED VIEW, FIRE *, PAUSE/RESUME
MATERIALIZED VIEW, CREATE FUNCTION) so readers don't get a false
positive from a successful parse.
MCP server:
- Rewrites the `query` description to explain why it's restricted to
ADS / PROFILE / METADATA / K8S: not a safety allowlist, but the only
schemas Hoptimator can answer queries from without a configured engine.
The Zeppelin POC for notebook-style execution is acknowledged as
incomplete.
- Updates the limitations bullet accordingly so it doesn't suggest
"just widen the allowlist" as a fix.
* Docs: address Phase 1 review feedback
Six corrections from review:
CONTRIBUTING:
- Add a "cover your changes" step. Document `make coverage` and an 80%
line-coverage target on changed code (CI currently enforces a softer
60% / 40%).
README:
- Badge: add explicit `&label=CI` so the build status renders as "CI"
instead of the default "build".
- Drop the link to GitHub Packages. The page is empty in practice; only
JFrog has artifacts today.
- Soften the "That one statement becomes:" list. The exact resources
emitted depend on the registered Databases and templates, not on
Hoptimator. Frame it as "with a typical Kafka + Flink setup" and add
a paragraph explaining that the same SQL can target a different stack
by swapping templates.
- Replace the "Kubernetes-native" why-bullet with a more honest
"Kubernetes out of the box, not as a hard requirement" — the bundled
deployers target K8s, but `Deployer` is the actual extension point.
Quickstart:
- Add a note on `CREATE OR REPLACE MATERIALIZED VIEW`. Without it,
re-running a CREATE for an existing view fails; with it, the
development loop is much faster.
Concepts (engine clarification):
- Split "Engines and connectors" into separate "Connectors" and
"Engines (optional)" sections. Connectors do not require an Engine
to function — Hoptimator emits YAML (e.g. a FlinkSessionJob) and an
unrelated operator (Flink Kubernetes Operator, etc.) runs it.
- The Engine CRD is specifically about *query* execution (e.g. running
SELECT against tables that need a runtime). Pipeline materialization
does not need one. Mark the Engine path as partially developed today.
- Strengthen the Deployers section to lead with "Kubernetes is the
default, not a hard requirement."
- Rename the trailing "Engines today" section to "Bundled adapters and
runtimes" and reframe accordingly.
Architecture:
- Step 4 (Deploy) now leads with "Kubernetes is the path of least
resistance, not a hard requirement" and notes that the implementation
resources Hoptimator emits aren't run by Hoptimator — the relevant
operator (Strimzi, Flink Kubernetes Operator, etc.) runs them.
- Reword the "A new engine" extension bullet so it doesn't conflate
pipeline runtimes with the Engine CRD's query path.
JDBC user guide:
- Drop the GitHub Packages link from the dependency section to match
the README change.
* Docs: Phase 2 review feedback (partial views, triggers, CREATE TABLE, MCP DDL)
Four corrections from review.
DDL reference:
- Add a "Partial views (multiple pipelines into one sink)" subsection
under CREATE MATERIALIZED VIEW. Explains the `$<suffix>` syntax,
shows the multi-writer pattern with two views feeding the same
VENICE.AUDIENCE sink, and recommends partial views as the default
for production cases. Cross-link from the CREATE MATERIALIZED VIEW
bullet list.
- Rewrite CREATE TABLE. The previous text underplayed it — CREATE
TABLE goes through the Deployer SPI to actually provision real
infrastructure (e.g. creating a Kafka topic via the Kafka deployer
rather than a separate Strimzi manifest). Show the example of
declaring a Kafka topic with partitions and then using a partial
view to write to it. Note that AS <query> isn't supported today.
- Trigger section: add one-line framing about what triggers enable
(backfills, rETL refreshes, downstream notifications, ops hooks)
and link to the concepts page for the bigger picture.
Concepts:
- Expand the TableTrigger section. Lead with what triggers actually
let you express (backfills tied to offline-tier arrivals, rETL on
cron, downstream notifications, operational hooks). Explain the
status-patch mechanism that fires triggers and how that makes them
composable with whatever already owns the upstream system. Note
that triggers can be auto-generated from TableTemplates so adapters
can ship sensible defaults.
- Close with the design summary: "pipelines stay pure data-flow
expressions, triggers carry the imperative side effects, and the
two compose at the table level."
MCP server:
- Add a limitations bullet flagging that `modify` only accepts
CREATE [OR REPLACE] MATERIALIZED VIEW and DROP today, not the full
Hoptimator DDL surface. Triggers, plain views, tables, and the
inspection-only DDL still need the JDBC driver or the SQL CLI.
Plus an intentional in-flight edit to docs/user-guide/sql-cli.md:
- Replace placeholder `MY.AUDIENCE` examples with `ADS.AUDIENCE`
(matches the demo's registered schemas) and the elided `!pipeline`
/ `!specify` outputs with the actual ones produced by the demo.
* Docs: switch tagline from "Kubernetes-native" to "SQL control plane"
The "Kubernetes-native control plane for multi-hop data pipelines" framing
oversold the K8s coupling and undersold the SQL-first part. Kubernetes is
the default deployer, not the differentiator — Hoptimator's job is to
compile SQL into multi-system pipelines, with the runtime substrate
pluggable underneath.
New phrasing:
- README h3: "A SQL control plane for multi-system data pipelines"
- README intro: "Hoptimator turns SQL into running, multi-hop data
pipelines that span Kafka, Flink, Venice, and anything
else you plug in."
- docs/index: "Hoptimator is a SQL control plane for multi-system data
pipelines. You write SQL; it figures out the topology
across Kafka, Flink, Venice, and whatever else you plug
in, generates the specs, deploys them, and reconciles
them."
This keeps the "control plane" framing (planner-not-runtime) while
removing the K8s lock-in suggestion and naming the actual systems
Hoptimator spans up front.
* Docs: expand LogicalTable concept to explain the abstraction model
The previous section described LogicalTables in six lines, framed mostly
around the YAML shape. That undersells what they actually do — the
abstraction is the point, not the driver.
Rewrites the "Logical tables" section in concepts.md to cover:
- The abstraction value (one named entity, N physical backends; collapses
the typical mess of three names + hand-built sync jobs into a single
declaration).
- The tier model (nearline / online / offline) with a table mapping each
tier to typical backends and the role it plays.
- What you get for free at deploy time: physical tier resources via the
Deployer SPI, implicit inter-tier sync pipelines (nearline → online,
nearline → offline), auto-backfill triggers when offline is bound, and
one schema source-of-truth resolved from nearline.
- Why this matters as an abstraction: tier-agnostic application code,
the right topology being the cheap path, and clean composition with
partial views / materialized views.
- The classic use case (lambda / kappa for feature stores) so readers
recognize the pattern.
- An explicit note that LogicalTables ship as a JDBC driver today but
function as an abstraction model — the deployer does the heavy lifting
at create time, not the driver at query time.
Also updates the at-a-glance table entry from "A single logical entity
that spans multiple physical storage tiers" to lead with "An abstraction
model" and call out the auto-sync/auto-backfill behaviors.
* Docs: fix WITH options syntax in DDL reference
Hoptimator's parser uses Calcite's `'key' 'value'` form (whitespace, no
`=`) inside WITH clauses, not the `'key'='value'` form. The `=` form is
Flink's syntax and only shows up in auto-generated pipeline output.
Fixes the four user-facing DDL signatures and the example:
- CREATE MATERIALIZED VIEW
- CREATE TRIGGER
- CREATE TABLE (both signature and example)
- CREATE TABLE example: `'kafka.partitions' '8'`
Adds a "WITH options syntax" section explicitly noting the difference and
calling out that the `=` form readers see in `!pipeline` / `!specify`
output is the Flink engine's syntax, not Hoptimator's input grammar.
The `=` instances elsewhere in the docs (auto-generated Flink SQL in
sql-cli.md output blocks, the truncated pipeline SQL in quickstart.md)
are correct as-is and were left alone.
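A minimal sketch of the difference, for readers who copy options out of `!pipeline` / `!specify` output and want to paste them back into a DDL statement. The helper name `to_hoptimator_with` is hypothetical and not part of Hoptimator. It rewrites only the separator between a quoted key and a quoted value, and leaves everything else in the text untouched.

```python
import re

# A single-quoted SQL literal, allowing '' as an escaped quote inside it.
_LITERAL = r"'(?:[^']|'')*'"

# A Flink-style pair: 'key'='value', with optional whitespace around '='.
_FLINK_PAIR = re.compile(rf"({_LITERAL})\s*=\s*({_LITERAL})")


def to_hoptimator_with(text: str) -> str:
    """Rewrite 'key'='value' pairs to the 'key' 'value' form.

    Hypothetical helper for illustration only. An '=' that sits inside a
    quoted value is not touched, because the pattern matches whole literals.
    """
    return _FLINK_PAIR.sub(r"\1 \2", text)


if __name__ == "__main__":
    print(to_hoptimator_with("'kafka.partitions'='8'"))
```

Text that is already in the whitespace form passes through unchanged, so the helper is safe to apply twice.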
* Docs: tone down LogicalTable opening paragraph
* manual cleanup
* Docs: add Kubernetes guide (Phase 3)
Adds the Kubernetes section to docs/. Five new pages covering everything
needed to operate Hoptimator on a cluster:
docs/kubernetes/
index.md -- section landing
operator.md -- what hoptimator-operator does, controllers it
runs (PipelineReconciler, TableTriggerReconciler,
ViewReconciler), how to deploy it, RBAC,
namespace scoping, lifecycle of a pipeline,
when not to run the operator
crd-reference.md -- field-by-field for all 10 CRDs (Database, View,
Pipeline, TableTemplate, JobTemplate,
TableTrigger, Subscription, LogicalTable,
Engine, SqlJob) with spec/status/printer-column
tables and one example per kind
templates.md -- TableTemplate / JobTemplate authoring deep
dive: matching rules (databases, methods),
full placeholder syntax (subst, defaults,
conditionals, transforms, multiline), the
default placeholders K8sSourceDeployer and
K8sJobDeployer inject, where hint and
configmap values fit, common patterns
triggers.md -- TableTrigger operational guide: cron vs
status-driven firing, pause/resume,
jobProperties, common patterns (offline-tier
backfill, rETL, downstream notification, ops
hooks), when not to use a trigger
configuration.md -- hoptimator-configmap, ConfigProvider SPI,
three-source precedence (system properties <
configmap < hints), file-like keys and lazy
expansion, pod-namespace detection, writing
a custom ConfigProvider
Also promotes the docs/index.md and README.md "Kubernetes guide
(coming soon)" entries to live links.
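The three-source precedence described for configuration.md can be pictured as a layered merge. This is a sketch under assumptions, not Hoptimator's ConfigProvider code: it reads `system properties < configmap < hints` as "sources further right win on conflict", and the function name and keys are made up for illustration.

```python
def resolve_config(system_props: dict, configmap: dict, hints: dict) -> dict:
    """Merge three sources, lowest precedence first.

    Assumed ordering: system properties < configmap < hints, so a key set
    in hints overrides the same key from the configmap or system properties.
    """
    merged = dict(system_props)   # lowest precedence
    merged.update(configmap)      # overrides system properties
    merged.update(hints)          # highest precedence
    return merged


if __name__ == "__main__":
    # Hypothetical keys, used only to show which source wins.
    resolved = resolve_config(
        system_props={"example.parallelism": "1", "example.region": "a"},
        configmap={"example.parallelism": "2"},
        hints={"example.parallelism": "4"},
    )
    print(resolved)
```

Keys that appear in only one source survive the merge regardless of that source's rank.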
* Docs: tighten SqlJob and operator-logging notes
- SqlJob: reframe as a primitive consumed by an external SqlJob
operator that deploys Flink and Flink-Beam SQL jobs. Drop the
"useful when a job doesn't fit CREATE MATERIALIZED VIEW" framing,
which conflated SqlJob with materialized-view tooling.
- operator.md: drop the "does not yet emit Kubernetes events" line.
Confirmed by grep that no events are emitted, but the term is
jargon that doesn't help an open-source reader without a side
explanation. The remaining "logs are the primary debugging surface"
carries the practical guidance.
* Docs: move k8s connection properties out of jdbc.md
The full `k8s.*` connection-property table in the JDBC user guide
contradicted the "Kubernetes is the default deployer, not a hard
requirement" framing established elsewhere — those properties are
deployer-specific, not driver-specific.
Consolidates so the table lives in one place:
- jdbc.md: replaces the "Kubernetes context" subsection with a short
"Deployer-specific properties" note pointing at the Kubernetes guide,
and explicitly calls out that a different deployer would expose its
own `<deployer>.*` properties.
- kubernetes/configuration.md: takes the full table (now with the
Default column merged in from the jdbc.md version), replaces the
pointer to jdbc.md with one that names the driver-level surface
(catalogs, hints, fun) so readers know what each page covers.
The table content is unchanged; this is a re-homing edit.
* Docs: document `make generate-models` for CRD model regeneration
The Java `V1alpha1*` model classes under hoptimator-k8s are generated
from the CRD YAMLs by `make generate-models` (which shells out to the
upstream Kubernetes Java client's `crd-model-gen` Docker image). Without
that callout, contributors who add a CRD field have no obvious way to
discover that they need to regenerate.
Adds the reference in the two natural places:
- CONTRIBUTING.md: new step in the PR checklist ("Regenerate Java models
if you touched a CRD"), with the command, the Docker requirement, and
a pointer to the upstream tool.
- docs/kubernetes/crd-reference.md: a callout block near the top of the
page so it's visible to anyone landing on the CRD reference while
modifying one.
* Docs: correct conditional template syntax (template-level, not inline)
The previous text described `{{var==value}}` as an inline conditional
that emits a block of the template when the condition matches. That's
wrong. Reading Template.java's `render` method:
- On a `==` / `!=` marker whose condition is satisfied, the marker
is erased and rendering continues normally.
- On a marker whose condition is *not* satisfied, the renderer
`return null;`s the entire template — it produces nothing.
So the markers are template-level guards, not inline blocks. The
real-world pattern is "two templates with mirrored guards; whichever
condition matches is the one that fires." The bundled flink-template
uses this to swap between SQL-job and Beam-job entry classes — there
isn't an `{{end}}` companion marker (which the docs invented) and the
syntax doesn't conditionally include a single line.
Fixes:
- Placeholder syntax table: drop the imagined `{{end}}` companion;
reword the `==` / `!=` rows as "template-level guard: render this
template only if X; otherwise skip the whole template."
- Drop `{{var toName}}` from the table — it's documented in
Template.java's javadoc but not implemented in `applyTransform()`,
which only handles `toLowerCase`, `toUpperCase`, and `concat`.
Documenting it would invite reports of a "broken" feature.
- Replace the "Conditional rendering example" section with one that
shows the actual pattern: two JobTemplates with mirroring `==`
guards on `flink.app.type`, only one of which renders for a given
pipeline.
- Rename the section "Conditional templates" so the heading no longer
implies block-level conditionals.
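The guard behavior described above can be sketched in a few lines. This is a Python illustration of the semantics only, not a port of Template.java. The marker regexes, the function names, and the choice to leave unknown placeholders untouched are all assumptions of the sketch.

```python
import re

# Assumed marker shapes: {{name==value}} / {{name!=value}} are guards,
# {{name}} is a plain substitution.
GUARD = re.compile(r"\{\{\s*([\w.]+)\s*(==|!=)\s*([^}\s]+)\s*\}\}")
VAR = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def render(template: str, env: dict):
    """Return the rendered text, or None if any guard is not satisfied."""
    for name, op, expected in GUARD.findall(template):
        equal = env.get(name) == expected
        if equal != (op == "=="):
            return None  # failed guard: the whole template produces nothing
    body = GUARD.sub("", template)  # satisfied guards are simply erased
    # Sketch choice: placeholders missing from env are left as-is.
    return VAR.sub(lambda m: env.get(m.group(1), m.group(0)), body)


def render_all(templates: list, env: dict) -> list:
    """Render every template and keep only those whose guards passed."""
    rendered = (render(t, env) for t in templates)
    return [out for out in rendered if out is not None]


if __name__ == "__main__":
    # Two templates with mirrored guards; the values are hypothetical.
    sql_t = "{{flink.app.type==SQL}}entry: SqlRunner for {{name}}"
    beam_t = "{{flink.app.type!=SQL}}entry: BeamRunner for {{name}}"
    print(render_all([sql_t, beam_t], {"flink.app.type": "SQL", "name": "p1"}))
```

With mirrored guards, exactly one of the two templates renders for any given value of `flink.app.type`, which is the "two templates, one fires" pattern the commit describes.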
* Docs: add Extending Hoptimator guide (Phase 4)
Adds the Extending section. Five pages covering the SPI surfaces a
contributor would touch when integrating a new system or customizing
behavior end-to-end:
docs/extending/
index.md -- section landing with a "pick the right
surface" decision table; explains the
ServiceLoader-based loading pattern and
the META-INF/services file layout
data-sources.md -- JDBC adapter walkthrough (DriverVersion,
Schema, register()), Database CRD wiring,
TableTemplate-only path for declarative
integrations, when to reach for a custom
Deployer instead, the connector-only
pattern for pre-existing infrastructure,
when ConnectorProvider applies
deployers.md -- Deployer interface + lifecycle (create /
update / delete / specify / restore),
DeployerProvider with priority semantics,
KafkaDeployerProvider as a concrete shape
to copy, opt-in Validation, testing
surface, common pitfalls (missing restore,
side effects in specify, wrong priority,
wrong exception types)
validators.md -- the three points validation runs
(parsed SQL, resolved object, deployer
collection), Issues tree API with severity
levels, the two participation patterns
(Validated on your own type vs.
ValidatorProvider for cross-cutting
policy), built-in providers as references,
authoring patterns and testing
config-providers.md -- ConfigProvider SPI mechanics, when to write
one (vault, central config service,
dynamic per-connection values), example
sketch, interaction with hints/JDBC props,
caveats (latency on every connection,
tolerated errors, no SPI ordering control)
Also promotes the docs/index.md and README.md "Extending Hoptimator
(coming soon)" entries to live links.
* Add CLAUDE.md for agent onboarding
Lands a top-level CLAUDE.md so Claude Code agents pick up the
high-leverage context for this repo automatically — what the project
is, where the docs live, common commands, the active module layout
(plus the marked-for-deletion list), the gotchas that aren't obvious
from reading code, the patterns to prefer, distilled testing rules,
and a "keep docs in sync" reminder.
Sections:
- What this repo is — three roles (planner/adapter/operator);
Kubernetes is default-not-required; Deployer SPI is the actual
extension point.
- Read these first — pointers into docs/ rather than duplicated
content.
- Common commands — make build/test/integration-tests/coverage/
generate-models/deploy*; ./hoptimator; ./start-mcp-server. Calls
out that generate-models is required after any CRD edit.
- Module layout — active modules with one-line purpose; explicit
list of marked-for-deletion modules so contributions don't land
there.
- Gotchas — the items an agent would otherwise discover by trial:
WITH 'key' 'value' (not =), partial-views as default, Engine CRD
partially developed, reserved-but-unimplemented DDL, template
guards are template-level not inline, toName documented but not
implemented, MCP query allowlist isn't a safety mechanism, alpha
status (don't add backcompat shims unless asked), Checkstyle +
SpotBugs enforced.
- Patterns to prefer — declarative > imperative, validators >
runtime checks, hints > template edits, configmap > hints,
CREATE OR REPLACE for iteration.
- Testing — distilled from the internal testing-best-practices file.
Focus on the non-obvious anti-patterns an agent would otherwise
reach for: never @MockitoSettings LENIENT class-wide, no reflection
field injection, no coverage exclusions for new files, doReturn
for wildcard generics, MockedStatic as @Mock field not
try-with-resources, "find bugs not coverage."
- Keep docs in sync — concrete mapping from change-type to which
doc(s) to update, plus the "README is slim, docs/ is journey-based"
framing reminder.
~175 lines. Designed to load fast and survive multiple sessions
without rotting.
* Docs: validators in concepts, Calcite + Quidem links, drop drift-prone code
Four targeted improvements from review.
Concepts:
- Add a Validators section. Validators were only mentioned in
extending/validators.md, so the conceptual surface was missing from
the place readers learn vocabulary. Covers the "Deployer makes things
real, Validator says whether they're allowed" framing, when validation
runs, and what the bundled validators do.
- Add a Validator row to the at-a-glance table.
- Open the page with an Apache Calcite acknowledgment + link to the
Calcite reference, since the parser/planner/JDBC layer all build on
it and readers should know where to look for SELECT/expression syntax.
DDL reference:
- Add a Quidem callout in the page intro pointing at the .id files
under each module's src/test/resources/ as a fast way to see
currently-passing examples of every DDL form.
CONTRIBUTING:
- Add a Quidem paragraph to the "Build and test locally" step
explaining what .id files are, where they live, and that they're
the right place to extend coverage when changing DDL parsing,
planning, or any user-facing SQL behavior.
Extending → deployers:
- Drop the trimmed KafkaDeployerProvider Java snippet; replace with
bullets pointing at the real KafkaDeployerProvider and KafkaDeployer
files on GitHub. Reading the source is more reliable than reading a
drift-prone copy of it.
Extending → data-sources:
- Drop the synthesized MySystemDriver code block; replace with a
short list pointing at the bundled hoptimator-demodb / -kafka /
-venice / -mysql modules as progressively richer reference
adapters. Keep the prose description of the common shape.
Other illustrative code blocks (NamingPolicyValidator,
VaultConfigProvider, the deployer testing snippet) are kept — they're
not copies of repo code, they're shape-of-what-you'd-write examples
that don't drift with the implementation.
* slim CLAUDE.md
* Docs: trim concepts.md and quickstart for duplication
Five overlap cleanups identified in review. Net effect: concepts.md
goes from ~430 lines to 328 with no information loss — every cut
already lives in a more authoritative page.
Concepts:
- TableTriggers: drop the four-bullet "what triggers enable" list
and the status-patch explanation, since kubernetes/triggers.md
now owns the operational depth. Keep the one-line decoupling
philosophy ("pipelines stay pure data-flow expressions, triggers
carry the imperative side effects") and link out for patterns.
~35 lines removed.
- Logical tables: drop the "Why this matters as an abstraction"
three-bullet section, which restated "What you get for free" from
a different angle. Drop the "Implementation note" header and
fold its substance into a one-line aside. The "classic use case"
section becomes one paragraph instead of a bulleted breakdown.
~50 lines removed; the unique abstraction-model framing stays.
- Configuration and hints: replace the ~25-line section with a
5-line stub pointing at user-guide/hints.md (full hint mechanics)
and kubernetes/configuration.md (configmap + precedence). The
details lived in three places; now they live in two.
Quickstart:
- Drop the six-row CLI command table that duplicated
user-guide/sql-cli.md. Keep the !intro / !quit pointers and
link out for the rest. ~10 lines removed; one fewer place for
the table to drift.
Plus the user's intentional in-flight edits to extending/data-sources.md
and extending/deployers.md trimming "Read the source rather than rely
on a snippet here" to cleaner intro lines.
* Docs: dedupe Engine caveat, K8s framing, and DDL non-support sections
Cleanups #6-#8 from review.
#6 Engine "partially developed" caveat:
- crd-reference.md kept its full Engine section content — but the
"partially developed" framing was identical to concepts.md's
Engines (optional) section. Replace with a one-line pointer to
concepts; keep the field tables since those are the unique value
of a CRD-reference page.
- Trim the at-a-glance row similarly.
- concepts.md remains the canonical home of the Engine framing; CLAUDE.md
preserves it as a gotcha for agents.
#7 K8s-as-pluggable framing:
- architecture.md had it twice — once as prose in "Step 4 — Deploy"
and once as a bullet in "Where to extend". Trim the Step 4 prose:
drop the "path of least resistance, not a hard requirement"
sentence and the "every page assumes a cluster" reasoning, since
the same point is made in the extension bullet and on the README's
Why-Hoptimator list. The Step 4 paragraph still notes that bundled
deployers target K8s and that the SPI is the swap point.
- README's Why-Hoptimator bullet remains the canonical statement.
#8 DDL reference's two non-support sections:
- Merge "Reserved syntax" and "What is *not* supported" into one
"What's not supported" section with two clearly labeled subgroups:
"Out of scope" (INSERT/UPDATE/DELETE against arbitrary tables,
ALTER TABLE, transactions, stored procedures) and "Parses but not
yet executed" (REFRESH MATERIALIZED VIEW, FIRE *, PAUSE/RESUME MV,
CREATE FUNCTION). Same content; readers no longer have to scroll
past identifiers/WITH/system-tables to find the second list.
* Docs: surface !tables and !schemas in the SQL CLI command table
* Docs: trim hints.md and merge templates.md into configuration.md
Two normalization passes.
hints.md (user guide):
- Drop the deep "Template hints" / "Connector hints" sub-sections that
duplicated kubernetes templates content; replace with a brief two-flavor
paragraph and a pointer to "Templates and configuration" for the full
story.
- Remove the connector-hint examples block and the source/sink
segment-meaning table (lives in templates/configuration now).
- Keep the user-side surface that's unique to this page: where to set
hints (CLI, JDBC, Subscription, MCP), format, advisory nature,
reading what was applied via `kubectl get pipeline ... -o yaml`,
and the hint-vs-template-default decision rule.
kubernetes/templates.md → kubernetes/configuration.md (merged):
- Templates were already part of "configuration" in the broader sense
(templates are the artifacts; configmap/hints/system-props supply the
values). Merge the two pages into a single "Templates and
configuration" reference, file named configuration.md (the more
general name).
- Top half: template authoring (matching, structure, placeholder
syntax, conditionals, patterns, tips) — content from the old
templates.md.
- Bottom half: where placeholder values come from — deployer-injected
defaults, the three configuration sources (hoptimator-configmap,
JDBC connection properties / Subscription hints, JVM system
properties), precedence, file-like keys, k8s.* connection-properties
reference, pod-namespace detection, ConfigProvider extension pointer.
- Update every cross-link: docs/index.md, docs/user-guide/hints.md,
docs/extending/{data-sources,deployers,index}.md,
docs/kubernetes/{crd-reference,index}.md.
- Delete docs/kubernetes/templates.md.
Plus user-side fixes in this round:
- ddl-reference.md: drop the standalone "WITH options syntax" section
(the form signatures already show the 'key' 'value' shape inline).
- architecture.md, jdbc.md: change example schema MY.AUDIENCE →
ADS.AUDIENCE to match the demo's registered schemas.
* Rename configuration -> templates
Summary
Provide comprehensive documentation for an open-source audience.
The README becomes a slim landing page; detailed content moves into a journey-based
docs/ tree.
Motivation
The previous README.md was ~230 lines of mixed elevator pitch, quickstart, Kafka/Flink runbooks, and extension API. No page explained concepts to a newcomer, and there was no CRD reference or extension guide. The two LinkedIn engineering blog posts position Hoptimator as a "control plane for data planes", a framing that wasn't reflected anywhere in the repo. New SQL syntax was undocumented, and there was no clear path for extending the repo.
What's new
docs/
├── index.md -- top-level landing
├── getting-started/
│ ├── index.md
│ ├── quickstart.md -- 5-min walkthrough on Docker Desktop
│ ├── concepts.md -- vocabulary reference
│ └── architecture.md -- life of a SQL statement, module map
├── user-guide/
│ ├── index.md
│ ├── sql-cli.md -- ./hoptimator + custom commands
│ ├── jdbc.md -- driver-level connection properties
│ ├── mcp-server.md -- MCP tool surface, agent workflow
│ ├── ddl-reference.md -- CREATE/DROP/PAUSE/RESUME, partial views, WITH syntax
│ └── hints.md -- template vs connector hints
├── kubernetes/
│ ├── index.md
│ ├── operator.md -- controllers, RBAC, lifecycle
│ ├── crd-reference.md -- field-by-field for all 10 CRDs
│ ├── templates.md -- TableTemplate / JobTemplate authoring
│ ├── triggers.md -- TableTrigger operational guide
│ └── configuration.md -- configmap, ConfigProvider, k8s.* properties
├── extending/
│ ├── index.md
│ ├── data-sources.md -- JDBC adapter + Database CRD
│ ├── deployers.md -- Deployer SPI
│ ├── validators.md -- Validator SPI
│ └── config-providers.md -- ConfigProvider SPI
└── resources/
└── learn-more.md -- engineering blog posts
Plus:
CLAUDE.md -- context file for Claude Code agents working in this repo.