
SQL: Query topics #574

Open
kbatuigas wants to merge 10 commits into rp-sql from DOC-1990-document-feature-query-redpanda-topics


@kbatuigas kbatuigas commented May 4, 2026

Description

This pull request updates and expands the Redpanda SQL documentation to clarify table mapping, schema requirements, and streaming topic queries. It refines the CREATE TABLE reference, introduces new how-to guides for querying topics, and streamlines catalog documentation.

Documentation improvements for querying and mapping topics:

  • Added a new "Query streaming topics" how-to guide (query-streaming-topics.adoc) that walks users through mapping a Redpanda topic to a SQL table and running analytical queries directly on live data. This guide covers prerequisites, table creation, querying, and links to further resources.
  • Introduced a new index page for querying data (query-data/index.adoc) to provide an entry point for users learning to query Redpanda topics with SQL.

Enhancements and clarifications in the SQL reference:

  • Updated the CREATE TABLE documentation to clarify that schema_subject is required and that Redpanda SQL needs a schema to query a topic. Improved the explanation of struct_mapping_policy, especially regarding handling of nested and recursive types, and added documentation for the confluent_wire_protocol option. [1] [2]
  • Improved and updated SQL usage examples to demonstrate required options and multi-message Protobuf schema usage in table creation.

Catalog documentation simplification:

  • Replaced the detailed "Redpanda Catalogs" reference page with a stub, likely to be reworked or replaced by more focused documentation elsewhere.

Resolves https://github.com/redpanda-data/documentation-private/issues/
Review deadline: 21 May

Page previews

Checks

  • New feature
  • Content gap
  • Support Follow-up
  • Small fix (typos, links, copyedits, etc)


netlify Bot commented May 4, 2026

Deploy Preview for rp-cloud ready!

Name Link
🔨 Latest commit 177b97c
🔍 Latest deploy log https://app.netlify.com/projects/rp-cloud/deploys/6a0f504201fc760008e08c39
😎 Deploy Preview https://deploy-preview-574--rp-cloud.netlify.app


coderabbitai Bot commented May 4, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 156f94da-4ac3-4e74-899e-c7b9171afb9f


@kbatuigas kbatuigas force-pushed the DOC-1990-document-feature-query-redpanda-topics branch 2 times, most recently from 4d7b551 to 20ad041 Compare May 11, 2026 20:00
@kbatuigas kbatuigas force-pushed the DOC-1990-document-feature-query-redpanda-topics branch from 20ad041 to ddefdad Compare May 14, 2026 02:01
kbatuigas and others added 7 commits May 18, 2026 20:28
Renames modules/sql/pages/query/ to modules/sql/pages/query-data/ and
renames the streaming-topic how-to from query-redpanda-topics.adoc to
query-streaming-topics.adoc to match the SQL GA IA. Retitles the page
"Query streaming topics" and reframes the description and learning
objectives around live streaming data; bridge-query and Iceberg content
stays out of this page (DOC-2006 owns the Iceberg-topics how-to).

Adds a pointer to the Iceberg topics how-to under the intro and lists
it under Next steps. Updates the enable-prereq xref to point to the
Enable Redpanda SQL page. Drops the CREATE REDPANDA CATALOG link from
Next steps to align with the v1 framing that users do not typically
create their own Redpanda catalog. Reframes the Query data index page
description for v1 Iceberg scope (live and historical data in Redpanda
topics; no external Iceberg lakehouse).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Conflicts:
#	modules/sql/pages/query-data/redpanda-catalogs.adoc
@kbatuigas kbatuigas force-pushed the DOC-1990-document-feature-query-redpanda-topics branch from 75ae890 to 48ead8c Compare May 19, 2026 03:30
@kbatuigas kbatuigas marked this pull request as ready for review May 19, 2026 23:39
@kbatuigas kbatuigas requested a review from a team as a code owner May 19, 2026 23:39

@Feediver1 Feediver1 left a comment


PR Review: SQL: Query topics (#574)

Files reviewed: 4 .adoc files (109 additions / 94 deletions)
Overall assessment: Solid documentation structure and content. Same integration-branch xref challenges as #571 — six unresolved cross-PR xrefs. One nav-linked stub page with no body. No What's New entry. A couple of em dashes that violate the style guide.

What this PR does

Expands Redpanda SQL query documentation on the rp-sql integration branch:

  • modules/sql/pages/query-data/index.adoc (new, 3 lines) — section index for "Query Data".
  • modules/sql/pages/query-data/query-streaming-topics.adoc (new, 80 lines) — how-to: map a topic to a SQL table and run analytical queries.
  • modules/sql/pages/query-data/redpanda-catalogs.adoc (1+ / 80−) — heavily reduced from a full reference to a 1-line stub.
  • modules/reference/pages/sql/sql-statements/create-table.adoc (25+ / 14−) — updated reference: schema_subject now required, expanded struct_mapping_policy (with cyclic-type guidance), new confluent_wire_protocol option, three full examples.

Jira ticket alignment

Ticket: DOC-1990 — "Document feature query Redpanda topics" (extracted from branch name).

Status: The PR delivers the planned query how-to and refreshes the CREATE TABLE reference. The stubbed redpanda-catalogs.adoc is mentioned in the PR description as "likely to be reworked" — worth confirming what the eventual replacement plan is before the integration branch lands.

Critical issues (must fix)

  1. Six broken xrefs to pages that aren't on rp-sql or in this branch:

    | File:line | xref target | Provided by |
    |---|---|---|
    | query-streaming-topics.adoc:10 | sql:query-data/query-iceberg-topics.adoc | PR #575 (still OPEN) |
    | query-streaming-topics.adoc:23 | sql:get-started/deploy-sql-cluster.adoc | PR #571 (still OPEN) |
    | query-streaming-topics.adoc:24 | sql:manage/manage-access.adoc | PR #580 (still OPEN) |
    | query-streaming-topics.adoc:25 | sql:get-started/sql-quickstart.adoc | PR #571 (still OPEN) |
    | query-streaming-topics.adoc:50 | sql:query-data/query-nested-fields.adoc | No known PR provides this; confirm it's planned, or remove the reference |
    | query-streaming-topics.adoc:77 | sql:query-data/query-iceberg-topics.adoc (Next steps) | PR #575 (still OPEN) |
    • Fix: Coordinate merge ordering. All sibling PRs need to land on rp-sql before rp-sql lands on main; otherwise the build will surface six "target of xref not found" errors. Specifically, check on query-nested-fields.adoc: if no PR is in flight for it, remove the inline reference at line 50 for now.
  2. redpanda-catalogs.adoc is a 1-line stub but nav.adoc:355 links to it as "Redpanda Catalogs". Users clicking that nav entry hit an empty page. The PR description acknowledges this is intentional ("likely to be reworked"), but a nav-linked empty page is bad UX.

    • Fix: Either (a) put a 2–3 sentence placeholder with "Coming soon — see [other page]" pointer, (b) leave the original content until the replacement lands and gut it in a later PR, or (c) remove the line from nav.adoc:355 and re-add when the page has content.
  3. Missing What's New entry. Same gap as #571: the May 2026 section of whats-new-cloud.adoc has no entry for the Redpanda SQL query workflow. Since this is GA documentation, a coordinated What's New entry should cover both PRs (and the broader SQL GA story across #571 / #575 / #580).

    • Fix: Add a single "Redpanda SQL: General availability" entry under == May 2026 that covers the get-started + query + auth pages together, rather than fragmenting into per-PR entries.
  4. Em dashes in create-table.adoc (style guide says no em dashes):

    • Line 7: "CREATE TABLE in Redpanda SQL maps Redpanda topics to SQL tables it does not create standalone tables with user-defined schemas."

    • Line 56: "Cyclic types are not supported in COMPOUND mode use JSON for recursive schemas."

    • Fix: Replace both em dashes with either a period + new sentence, a colon, or restructure the clause. Example for line 56: "Cyclic types are not supported in COMPOUND mode. Use JSON for recursive schemas."
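
One way to catch this class of issue before review is a small scan for the em dash character (U+2014). The sketch below is illustrative only: the function name `find_em_dashes` and the sample text are assumptions, not part of the docs tooling, and it makes no attempt to skip code blocks or attribute lines.

```python
# Minimal sketch: report every em dash (U+2014) in a block of text.
# find_em_dashes and SAMPLE are hypothetical names used for illustration.
EM_DASH = "\u2014"


def find_em_dashes(text):
    """Return (line_number, column) pairs, both 1-based, for each em dash."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        col = line.find(EM_DASH)
        while col != -1:
            hits.append((lineno, col + 1))
            col = line.find(EM_DASH, col + 1)
    return hits


SAMPLE = (
    "Cyclic types are not supported in COMPOUND mode. Use JSON.\n"
    "Cyclic types are not supported in COMPOUND mode \u2014 use JSON.\n"
)

if __name__ == "__main__":
    for lineno, col in find_em_dashes(SAMPLE):
        print(f"line {lineno}, column {col}: em dash found")
```

Running it against a page's text gives the line and column of each occurrence, which maps directly onto the line references used in this review.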

Suggestions (should consider)

  1. Page-title case mismatch on the index. query-data/index.adoc:1 has = Query data (sentence case), but nav.adoc:354 labels it as "Query Data" (title case). Per team convention, page titles use title case to match the nav label.

    • Current: = Query data
    • Suggested: = Query Data
  2. Stub page comment. The 1-line redpanda-catalogs.adoc uses // stub as the only body marker. If you keep the stub approach, consider a more user-facing placeholder (e.g., a NOTE block or an xref to the related how-to) so the rendered page isn't blank.

  3. The Checks boxes in the PR body are all empty. Tick the relevant one ("New feature" or "Content gap") for tracking.
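
The title/nav mismatch in suggestion 1 is the kind of thing a short script could flag. This is a rough sketch, assuming the page title is the first line starting with `= ` and that the nav label is supplied separately; `page_title` and `title_matches_nav` are hypothetical helpers, not existing tooling.

```python
# Minimal sketch: compare an AsciiDoc page title with its nav label.
# page_title and title_matches_nav are hypothetical helper names.


def page_title(adoc_text):
    """Return the text of the first '= ' title line, or None if absent."""
    for line in adoc_text.splitlines():
        if line.startswith("= "):
            return line[2:].strip()
    return None


def title_matches_nav(adoc_text, nav_label):
    """True when the page title is exactly the nav label (case-sensitive)."""
    return page_title(adoc_text) == nav_label


if __name__ == "__main__":
    index_page = "= Query data\n:page-topic-type: how-to\n"
    print(title_matches_nav(index_page, "Query Data"))
```

The comparison is deliberately case-sensitive, since the issue raised here is a casing difference rather than a wording difference.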

Impact on other files

  • modules/ROOT/nav.adoc ✓ — new pages already in nav at lines 354–357, including the (still-missing) query-iceberg-topics.adoc entry at line 357 — consistent with the rp-sql integration plan.
  • modules/get-started/pages/whats-new-cloud.adoc ❌ — no SQL GA entry (Critical #3).
  • Cross-component xrefs verified:
    • xref:reference:sql/sql-statements/create-table.adoc
    • xref:reference:sql/index.adoc
    • xref:reference:sql/sql-data-types/row.adoc (in create-table.adoc:56) — exists in rp-sql ✓
    • xref:reference:sql/sql-statements/create-redpanda-catalog.adoc (in create-table.adoc:7) — exists in rp-sql ✓
    • xref:sql:connect-to-sql/index.adoc
    • All other xref:sql:* xrefs — listed as broken in Critical #1.
  • Sibling PR dependencies: #571 (deploy + quickstart), #575 (query-iceberg), #580 (manage-access). Plus the unknown source for query-nested-fields.adoc.
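
The xref verification above could be approximated with a script like the following. It is a sketch under stated assumptions: the regular expression only handles the simple `xref:module:path.adoc[...]` form seen in this PR (no versions or fragments), and `KNOWN_PAGES` is a hypothetical, hand-built inventory rather than anything read from the Antora build.

```python
import re

# Simplified pattern for xrefs such as xref:sql:query-data/page.adoc[Label].
# Assumption: one module/component segment, then a path ending in .adoc.
XREF_RE = re.compile(r"xref:([\w-]+):([\w./-]+\.adoc)")


def find_xrefs(text):
    """Return (line_number, 'module:path') for each xref in the text."""
    found = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in XREF_RE.finditer(line):
            found.append((lineno, f"{match.group(1)}:{match.group(2)}"))
    return found


def unresolved_xrefs(text, known_pages):
    """Return the xrefs whose target is not in the known_pages set."""
    return [(n, target) for n, target in find_xrefs(text) if target not in known_pages]


# Hypothetical inventory of pages that exist on the integration branch.
KNOWN_PAGES = {
    "reference:sql/index.adoc",
    "reference:sql/sql-statements/create-table.adoc",
    "sql:connect-to-sql/index.adoc",
}

SAMPLE = (
    "See xref:reference:sql/sql-statements/create-table.adoc[CREATE TABLE].\n"
    "See xref:sql:query-data/query-iceberg-topics.adoc[Query Iceberg-enabled Topics].\n"
)

if __name__ == "__main__":
    for lineno, target in unresolved_xrefs(SAMPLE, KNOWN_PAGES):
        print(f"line {lineno}: target of xref not found: {target}")
```

In practice the inventory would need to be generated from the branch contents plus the files each sibling PR adds, so that only targets with no planned source (such as query-nested-fields.adoc) are reported.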

CodeRabbit findings worth considering

None. CodeRabbit's check passed with no review summary or actionable comments.

What works well

  • Clean module layout: index + how-to + reference, all in the right places.
  • Comprehensive prerequisites section lists exactly what a reader needs before they can succeed: SQL engine enabled, RBAC permission, psql connection, registered Schema Registry schema.
  • Real-world SQL examples beyond toy SELECT * — aggregation with GROUP BY, ORDER BY, WHERE filters, LIMIT.
  • CREATE TABLE reference is thorough: required/optional column in the options table, three full examples (basic, multi-message Protobuf, error handling) covering distinct use cases.
  • Frontmatter compliance: :page-topic-type: how-to for the how-to, :page-topic-type: reference for the reference, learning objectives observable and measurable, personas correctly scoped (app_developer, data_engineer — query-side audience, not platform admins).
  • Sentence case correct on every H2+ heading in the new content.
  • Source-block syntax is consistent with the rest of the SQL module (long-form [source,sql] — matches the convention used in get-started/*.adoc).
  • schema_subject is now correctly marked Required in the reference table, addressing the schema-required guidance that was unclear before.
  • Helpful guidance on cyclic types in struct_mapping_policy — clearly tells users to switch to JSON mode for recursive schemas.
  • confluent_wire_protocol option fully documented with defaults and when to use each value.
  • CI is fully green and Netlify preview links cover the two main new pages.

Final-pass review via /docs-team-standards:pr-review.

@Feediver1

@kbatuigas Ping me again after you get your SME approvals and I can do a more thorough review


Map a Redpanda topic to a SQL table to run analytical queries directly against live streaming data without building ETL pipelines. Redpanda SQL reads each record's fields from the topic's registered schema.

To extend queries past your Redpanda retention window by reading the Iceberg history of Iceberg-enabled topics, see xref:sql:query-data/query-iceberg-topics.adoc[Query Iceberg-enabled Topics].

Exactly @kbatuigas ! :)
