Draft
10 changes: 6 additions & 4 deletions modules/ROOT/nav.adoc
@@ -335,10 +335,12 @@
*** xref:develop:connect/cookbooks/jira.adoc[]

* xref:sql:index.adoc[Redpanda SQL]
-// ** quickstart.adoc
-** xref:sql:get-started/what-is-redpanda-sql.adoc[Overview]
-*** xref:sql:get-started/oltp-vs-olap.adoc[]
-*** xref:sql:get-started/redpanda-sql-vs-postgresql.adoc[]
+** xref:sql:get-started/index.adoc[Get Started]
+*** xref:sql:get-started/sql-quickstart.adoc[Quickstart]
+*** xref:sql:get-started/deploy-sql-cluster.adoc[Enable Redpanda SQL]
+*** xref:sql:get-started/what-is-redpanda-sql.adoc[Overview]
+**** xref:sql:get-started/oltp-vs-olap.adoc[]
+**** xref:sql:get-started/redpanda-sql-vs-postgresql.adoc[]
** xref:sql:connect-to-sql/index.adoc[Connect to Redpanda SQL]
*** xref:sql:connect-to-sql/language-clients/psycopg2.adoc[]
*** xref:sql:connect-to-sql/language-clients/java-jdbc.adoc[]
162 changes: 162 additions & 0 deletions modules/sql/pages/get-started/deploy-sql-cluster.adoc
@@ -0,0 +1,162 @@
= Enable Redpanda SQL on a BYOC cluster
:description: Enable the Redpanda SQL engine on a BYOC cluster so that users can query streaming data with standard PostgreSQL syntax.
:page-topic-type: how-to

Enable Redpanda SQL on a BYOC cluster so your team can query streaming data in Redpanda topics using standard PostgreSQL syntax.

== Prerequisites

To enable Redpanda SQL, you need:

* Admin permissions in your Redpanda Cloud organization.
* For the Cloud API path, a valid bearer token for the link:/api/doc/cloud-controlplane/topic/topic-cloud-api-overview[Cloud API]. See link:/api/doc/cloud-controlplane/authentication[Authenticate to the API].

== Enable Redpanda SQL

You can enable Redpanda SQL when you create a new BYOC cluster or on an existing cluster.

=== On a new cluster

[tabs]
=====
Cloud Console::
+
--
. Log in to https://cloud.redpanda.com[Redpanda Cloud^].
. Start creating a new BYOC cluster on AWS. For details and prerequisites, see xref:get-started:cluster-types/byoc/aws/create-byoc-cluster-aws.adoc[].
. In the cluster creation form, select the option to enable SQL.
. Complete the remaining cluster configuration and deploy.
--

Cloud API::
+
--
. Authenticate to the link:/api/doc/cloud-controlplane/topic/topic-cloud-api-overview[Cloud API]. For details, see link:/api/doc/cloud-controlplane/authentication[Authenticate to the API].
. Make a link:/api/doc/cloud-controlplane/operation/operation-clusterservice_createcluster[`POST /v1/clusters`] request with `oxla.enabled` set to `true` in the cluster spec:
+
[,bash]
----
curl -X POST "https://api.redpanda.com/v1/clusters" \
-H "Authorization: Bearer $AUTH_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "cluster": {
    "name": "<cluster-name>",
    "cloud_provider": "CLOUD_PROVIDER_AWS",
    "type": "TYPE_BYOC",
    "region": "<region>",
    "zones": [ <zones> ],
    "throughput_tier": "<tier>",
    "resource_group_id": "<resource-group-id>",
    "oxla": {
      "enabled": true
    }
  }
}'
----
+
For the full request body and field reference, see the link:/api/doc/cloud-controlplane/operation/operation-clusterservice_createcluster[Create cluster API].
. The request returns the ID of a long-running operation. Poll the link:/api/doc/cloud-controlplane/operation/operation-operationservice_getoperation[`GET /v1/operations/{operation.id}`] endpoint until the operation completes.
--
=====
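
If you script cluster creation, you can assemble the request body shown in the Cloud API tab from shell variables. The following is a minimal sketch, not an official tool. `build_create_payload` is a hypothetical helper, the field names are copied from the example request, and values are inserted without JSON escaping, so the sketch assumes they contain no quotes or backslashes.

```bash
#!/usr/bin/env bash
# Sketch: build the POST /v1/clusters request body from shell arguments.
# Field names mirror the example request; the helper name is hypothetical.
# Values are NOT JSON-escaped: assumes no quotes or backslashes in them.

# build_create_payload <name> <region> <tier> <resource-group-id> <zone>...
build_create_payload() {
  local name="$1" region="$2" tier="$3" group_id="$4"
  shift 4
  local zones="" zone
  for zone in "$@"; do
    # Quote each zone and separate entries with ", ".
    zones="${zones}${zones:+, }\"${zone}\""
  done
  cat <<EOF
{
  "cluster": {
    "name": "${name}",
    "cloud_provider": "CLOUD_PROVIDER_AWS",
    "type": "TYPE_BYOC",
    "region": "${region}",
    "zones": [${zones}],
    "throughput_tier": "${tier}",
    "resource_group_id": "${group_id}",
    "oxla": { "enabled": true }
  }
}
EOF
}
```

To send the generated body, pipe it to the same `curl` command with `-d @-`, which tells `curl` to read the request data from standard input: `build_create_payload <cluster-name> <region> <tier> <resource-group-id> <zone>... | curl -X POST "https://api.redpanda.com/v1/clusters" -H "Authorization: Bearer $AUTH_TOKEN" -H "Content-Type: application/json" -d @-`.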

=== On an existing cluster

To enable, scale, or disable SQL on an existing cluster, you also need the cluster ID, which you can find in the *Details* section of the cluster overview in the Cloud Console.

. Authenticate to the link:/api/doc/cloud-controlplane/topic/topic-cloud-api-overview[Cloud API]. For details, see link:/api/doc/cloud-controlplane/authentication[Authenticate to the API].
. Make a link:/api/doc/cloud-controlplane/operation/operation-clusterservice_updatecluster[`PATCH /v1/clusters/{cluster.id}`] request, replacing `{cluster.id}` with your cluster ID:
+
[,bash]
----
curl -X PATCH "https://api.redpanda.com/v1/clusters/{cluster.id}" \
-H "Authorization: Bearer $AUTH_TOKEN" \
-H "Content-Type: application/json" \
-d '{"oxla":{"enabled":true}}'
----
+
The request returns the ID of a long-running operation. Poll the link:/api/doc/cloud-controlplane/operation/operation-operationservice_getoperation[`GET /v1/operations/{operation.id}`] endpoint until the operation completes:
+
[,bash]
----
curl -X GET "https://api.redpanda.com/v1/operations/{operation.id}" \
-H "Authorization: Bearer $AUTH_TOKEN" \
-H "Content-Type: application/json"
----
+
When the operation is complete, the response shows `"state": "STATE_COMPLETED"`.
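
The polling step can be automated with a small loop. The following is a sketch under stated assumptions: `fetch_operation`, `extract_state`, and `wait_for_operation` are hypothetical helper names, the `STATE_FAILED` value is an assumed failure state that you should confirm in the API reference, and the state is extracted with `grep` and `sed` rather than a JSON parser to avoid extra dependencies.

```bash
#!/usr/bin/env bash
# Sketch: poll a long-running operation until it reports STATE_COMPLETED.
# Helper names are hypothetical; STATE_FAILED is an assumed failure state.

# Print the value of the first "state" field found in JSON read from stdin.
extract_state() {
  grep -o '"state"[[:space:]]*:[[:space:]]*"[A-Z_]*"' \
    | head -n 1 \
    | sed 's/.*"\([A-Z_]*\)"$/\1/'
}

# Fetch the operation as JSON. Expects AUTH_TOKEN to hold a bearer token.
fetch_operation() {
  curl -sS "https://api.redpanda.com/v1/operations/$1" \
    -H "Authorization: Bearer $AUTH_TOKEN"
}

# wait_for_operation <operation-id> [interval-seconds] [max-attempts]
# Returns 0 on completion, 1 on failure, 2 if attempts run out.
wait_for_operation() {
  local op_id="$1" interval="${2:-10}" max_attempts="${3:-60}"
  local attempt=1 state
  while [ "$attempt" -le "$max_attempts" ]; do
    state="$(fetch_operation "$op_id" | extract_state)"
    case "$state" in
      STATE_COMPLETED) echo "completed"; return 0 ;;
      STATE_FAILED) echo "failed" >&2; return 1 ;;
    esac
    # Any other state (or an empty response) means: wait and try again.
    sleep "$interval"
    attempt=$((attempt + 1))
  done
  echo "timed out after ${max_attempts} attempts" >&2
  return 2
}
```

For example, `wait_for_operation <operation-id> 15 40` checks every 15 seconds for up to 10 minutes. Because provisioning may take several minutes, choose an interval and attempt limit that comfortably cover that window.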

== Scale Redpanda SQL

Redpanda SQL supports horizontal scaling from 1 to 12 nodes per cluster. Scaling to 0 is not supported. To remove Redpanda SQL from a cluster, disable the SQL engine instead.

[tabs]
=====
Cloud Console::
+
--
. Log in to https://cloud.redpanda.com[Redpanda Cloud^].
. Go to your BYOC cluster and open the *SQL* tab.
. Set the node count to a value between 1 and 12, then save.
--

Cloud API::
+
--
Make a link:/api/doc/cloud-controlplane/operation/operation-clusterservice_updatecluster[`PATCH /v1/clusters/{cluster.id}`] request with the new replica count. Replace `{cluster.id}` with your cluster ID and `<n>` with a value between 1 and 12:

[,bash]
----
curl -X PATCH "https://api.redpanda.com/v1/clusters/{cluster.id}" \
-H "Authorization: Bearer $AUTH_TOKEN" \
-H "Content-Type: application/json" \
-d '{"oxla":{"replicas":<n>}}'
----

The request returns the ID of a long-running operation. Poll link:/api/doc/cloud-controlplane/operation/operation-operationservice_getoperation[`GET /v1/operations/{operation.id}`] until the operation completes.
--
=====
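
Because the API accepts only 1 to 12 nodes, a script can check the requested count before sending the request. The following is a minimal sketch: `scale_payload` is a hypothetical helper that validates its argument against the documented range and prints the same `PATCH` body used in the Cloud API tab.

```bash
#!/usr/bin/env bash
# Sketch: validate a node count against the documented 1-12 range, then
# print the PATCH body for scaling. The helper name is hypothetical.

# scale_payload <replicas>
scale_payload() {
  local replicas="$1"
  case "$replicas" in
    # Reject empty input, non-digits, leading zeros (including "0"),
    # and anything longer than two digits.
    ''|*[!0-9]*|0*|???*)
      echo "replicas must be a whole number from 1 to 12, got: '${replicas}'" >&2
      return 1
      ;;
  esac
  # At this point the value is between 1 and 99, so only the upper bound remains.
  if [ "$replicas" -gt 12 ]; then
    echo "replicas must be a whole number from 1 to 12, got: '${replicas}'" >&2
    return 1
  fi
  printf '{"oxla":{"replicas":%s}}\n' "$replicas"
}
```

You can then pass the result to the request with `-d "$(scale_payload 3)"`. If validation fails, the helper prints an error to standard error and produces no body, so check its exit status before calling `curl`.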

== Verify the SQL engine is running

After you enable Redpanda SQL, provisioning may take several minutes. To verify the SQL engine is running:

// TODO: Confirm with engineering if there are any other specific status indicators that users can see in the Console when SQL is running. Is the SQL tab only visible when SQL is enabled?

== Disable Redpanda SQL

[WARNING]
====
Disabling Redpanda SQL purges the SQL engine's stored catalog state and deletes its data from object storage. Queries that are in flight when you disable SQL fail. To reduce SQL compute without losing state, <<scale-redpanda-sql,scale the node count down>> instead. The minimum is 1 node.
====

[tabs]
=====
Cloud Console::
+
--
. Log in to https://cloud.redpanda.com[Redpanda Cloud^].
. Go to your BYOC cluster and open the *SQL* tab.
. Click *Remove* and confirm.
--

Cloud API::
+
--
Make a link:/api/doc/cloud-controlplane/operation/operation-clusterservice_updatecluster[`PATCH /v1/clusters/{cluster.id}`] request with `oxla.enabled` set to `false`. Replace `{cluster.id}` with your cluster ID:

[,bash]
----
curl -X PATCH "https://api.redpanda.com/v1/clusters/{cluster.id}" \
-H "Authorization: Bearer $AUTH_TOKEN" \
-H "Content-Type: application/json" \
-d '{"oxla":{"enabled":false}}'
----

The request returns the ID of a long-running operation. Poll link:/api/doc/cloud-controlplane/operation/operation-operationservice_getoperation[`GET /v1/operations/{operation.id}`] until the operation completes.
--
=====

== Next steps

* xref:sql:get-started/sql-quickstart.adoc[Quickstart]: Connect to Redpanda SQL with `psql` and run your first query.
3 changes: 3 additions & 0 deletions modules/sql/pages/get-started/index.adoc
@@ -0,0 +1,3 @@
= Get Started with Redpanda SQL
:description: Get started with Redpanda SQL, a column-oriented OLAP query engine built into Redpanda Cloud that lets you query streaming topics using standard SQL.
:page-layout: index
167 changes: 167 additions & 0 deletions modules/sql/pages/get-started/sql-quickstart.adoc
@@ -0,0 +1,167 @@
= Redpanda SQL quickstart
:description: Connect to Redpanda SQL on a BYOC cluster and run your first query on streaming data.
:page-topic-type: guide

Redpanda SQL is a PostgreSQL-compatible SQL engine built into Redpanda BYOC. It lets you query streaming data in your Redpanda topics with standard SQL, without building ETL pipelines or deploying a separate analytics system. In this quickstart, you connect with `psql` and run your first query against a Redpanda topic.

== Prerequisites

* A Redpanda BYOC cluster on AWS with Redpanda SQL enabled. See xref:sql:get-started/deploy-sql-cluster.adoc[].
* A Redpanda topic with a schema registered in Schema Registry. If you don't have one, follow the optional <<optional-produce-sample-data,Produce sample data>> section below to create a sample `orders` topic.
* https://www.postgresql.org/download/[`psql`^] (PostgreSQL client) installed on your local machine.

// TODO: Verify the exact connection string format and where users get credentials.
// From PRD: SCRAM auth preserved, connection string available in Cloud Console and API response.
// Confirm with engineering what SCRAM credentials does the user use - superuser auto-created by Control Plane?

== Get connection details

After Redpanda SQL is provisioned on your cluster:

. In the Redpanda Cloud Console, go to your cluster and open the *SQL* tab.
. Copy the connection string.

== Connect to Redpanda SQL

Use `psql` to connect to the SQL engine. Paste the connection string you copied from the Console:

// TODO: Replace with actual connection string format once confirmed with engineering.
[,bash]
----
psql "<connection-string>"
----

On a successful connection, you should see output similar to:

// TODO: Verify current psql banner text.
[.no-copy]
----
psql (14.x, server 16.0)
SSL connection (protocol: TLSv1.3)
Type "help" for help.

=>
----

[#optional-produce-sample-data]
== (Optional) Produce sample data

[TIP]
====
Skip this section if you already have a Redpanda topic with a schema registered in Schema Registry that you want to query.
====

If you don't have a schema-registered topic to query yet, follow these steps to create an `orders` topic with a small set of sample records. Redpanda SQL reads the topic's schema from Schema Registry to map fields to SQL columns, so the topic must have a registered schema before you can query it.

This section uses xref:reference:rpk/index.adoc[`rpk`], which you can install by following xref:get-started:rpk-install.adoc[]. You also need a user with permissions to create topics, register schemas, and produce records.

. Create a topic:
+
[,bash]
----
rpk topic create orders
----

. Save the following Protobuf schema as `order.proto`:
+
// TODO: Confirm the GA-supported schema format(s) with engineering. If JSON Schema is supported at GA, consider switching this example to JSON for simpler UX.
[,proto]
----
syntax = "proto3";

message Order {
  int64 order_id = 1;
  string customer = 2;
  string product = 3;
  int64 amount = 4; // amount in cents
  string status = 5; // "pending", "shipped", "completed"
}
----

. Register the schema against the topic's value subject:
+
[,bash]
----
rpk registry schema create orders-value --schema order.proto
----

. Produce a few sample records. The `--schema-id=topic` flag tells `rpk` to use the topic name strategy to look up the schema you just registered:
+
[,bash]
----
rpk topic produce orders --schema-id=topic <<EOF
{"order_id": 1, "customer": "alice", "product": "keyboard", "amount": 7500, "status": "completed"}
{"order_id": 2, "customer": "bob", "product": "monitor", "amount": 32000, "status": "shipped"}
{"order_id": 3, "customer": "carol", "product": "mouse", "amount": 4500, "status": "pending"}
{"order_id": 4, "customer": "alice", "product": "monitor", "amount": 32000, "status": "completed"}
{"order_id": 5, "customer": "dave", "product": "keyboard", "amount": 7500, "status": "pending"}
EOF
----

When you continue to the next section, use `orders` as the topic name in `CREATE KAFKA SOURCE`.
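
If five records are too few for your testing, you can generate a larger sample set in the same JSON shape. The following is a sketch only: `gen_orders` is a hypothetical helper, and the customers, products, amounts, and statuses are made-up values that cycle in a fixed pattern.

```bash
#!/usr/bin/env bash
# Sketch: print N sample order records as JSON lines matching order.proto.
# The helper name and all field values are made up for illustration.

# gen_orders <count>
gen_orders() {
  local count="$1" i=1 customer product amount status
  while [ "$i" -le "$count" ]; do
    # Cycle through four customers.
    case $(( (i - 1) % 4 )) in
      0) customer=alice ;;
      1) customer=bob ;;
      2) customer=carol ;;
      *) customer=dave ;;
    esac
    # Cycle through three product/amount/status combinations (amount in cents).
    case $(( (i - 1) % 3 )) in
      0) product=keyboard amount=7500 status=pending ;;
      1) product=monitor amount=32000 status=shipped ;;
      *) product=mouse amount=4500 status=completed ;;
    esac
    printf '{"order_id": %d, "customer": "%s", "product": "%s", "amount": %d, "status": "%s"}\n' \
      "$i" "$customer" "$product" "$amount" "$status"
    i=$((i + 1))
  done
}
```

Pipe the output to the produce command from the last step, for example `gen_orders 100 | rpk topic produce orders --schema-id=topic`.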

== Query a Redpanda topic

When you enable Redpanda SQL, default connections to your Redpanda cluster and Iceberg catalog are automatically configured. You don't need to set up any connections manually.

To query a Redpanda topic as a SQL table, create a Kafka source that maps the topic to a queryable table. The following example uses the `orders` topic from the previous section. Replace `orders` with the name of your topic if you're using your own data:

[,sql]
----
CREATE KAFKA SOURCE orders
TOPIC 'orders'
CONNECTION default_redpanda_connection;
----

The SQL engine fetches the topic's schema from Schema Registry automatically.

Redpanda SQL supports Protobuf, Avro, and JSON schemas registered in Schema Registry. The supported data types at GA are primitive scalar types (`INT`, `BIGINT`, `FLOAT`, `BOOLEAN`, `VARCHAR`, `TIMESTAMP`, and others) and `Struct`.

NOTE: Nested JSON objects may be surfaced as `VARCHAR` containing the raw JSON string rather than as a navigable `Struct` column. The behavior depends on the serialization format: Avro and Protobuf schemas define field types explicitly and map to `Struct` where supported. JSON Schema fields typed as `object` may fall back to `VARCHAR`.

== Run queries

After you create the source, query your topic data with standard SQL. The following examples use the `orders` schema from the optional sample data section. If you're using your own topic, substitute your own source name and column names.

View a sample of records:

[,sql]
----
SELECT * FROM orders LIMIT 10;
----

Count orders by status:

[,sql]
----
SELECT status, COUNT(*) AS total_orders
FROM orders
GROUP BY status;
----

Find orders with an amount over 10000 (in cents), largest first:

[,sql]
----
SELECT order_id, customer, product, amount
FROM orders
WHERE amount > 10000
ORDER BY amount DESC
LIMIT 20;
----
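
You can also run these queries from a script instead of an interactive session. The following is a minimal sketch that wraps `psql` in a hypothetical `run_sql` helper. `REDPANDA_SQL_URL` is an assumed environment variable that holds the connection string you copied from the Console.

```bash
#!/usr/bin/env bash
# Sketch: run one SQL statement non-interactively and print bare rows.
# run_sql and REDPANDA_SQL_URL are hypothetical names for illustration.

# run_sql <statement>
run_sql() {
  # -X  skip ~/.psqlrc so local settings do not change the output
  # -A  unaligned output (fields separated by "|")
  # -t  print rows only, without column headers or row-count footers
  # -c  run the given statement, then exit
  psql "$REDPANDA_SQL_URL" -X -A -t -c "$1"
}
```

For example, `run_sql "SELECT status, COUNT(*) FROM orders GROUP BY status;"` prints one `status|count` line per group, which is straightforward to process with standard shell tools.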

== Inspect your SQL cluster

Redpanda SQL provides built-in commands to inspect the state of your SQL cluster:

[,sql]
----
SHOW NODES; -- List SQL compute nodes and their status
SHOW QUERIES; -- List currently running queries
----

== Next steps

* xref:reference:sql/index.adoc[Redpanda SQL reference]: Explore the full SQL syntax, data types, functions, and clauses.
* xref:sql:connect-to-sql/language-clients/psycopg2.adoc[Connect with Python (psycopg2)]: Query Redpanda SQL programmatically.
* xref:sql:connect-to-sql/language-clients/java-jdbc.adoc[Connect with Java (JDBC)]: Integrate with Java applications.