docs(connectors): add Generic HTTP Sink page and update connectors table #41
mlevkov wants to merge 1 commit into
Surfaces the Generic HTTP Sink connector (shipped in 0.8.0 via apache/iggy#2925) on the docs site as a curated reference subset of the upstream README. Adds the new sinks/http page with a Callout linking to the canonical README, inserts it into the sidebar between iceberg and stdout, and lists "Generic HTTP" in the Available Connectors table. Closes apache#39

 |--------|----------------------------------------------------------------|
 | Source | PostgreSQL, Elasticsearch, Random |
-| Sink | PostgreSQL, MongoDB, Elasticsearch, Quickwit, Apache Iceberg, Stdout |
+| Sink | PostgreSQL, MongoDB, Elasticsearch, Quickwit, Apache Iceberg, Generic HTTP, Stdout |

the table cell here says "Generic HTTP" but the page it links to has frontmatter title "HTTP Sink" and never uses the word "Generic" anywhere in the body. clicking through gives the reader two different names for the same thing. either rename the new page's frontmatter to something like "HTTP Sink (Generic)" and add a one-liner in the opening paragraph noting it's the transport-level connector (vs Elasticsearch/Quickwit which speak HTTP under the hood), or drop "Generic" from this row.

### Configuration Options

| Option | Type | Default | Description |

this whole config table is a verbatim copy of the upstream core/connectors/sinks/http_sink/README.md. defaults, types, and the transient retry codes (429/500/502/503/504 a few sections down) are all hardcoded constants in lib.rs and will silently drift the moment that file changes. options: trim this to the 5-6 most-used knobs plus a clear pointer to the upstream README for the full list, or add a CI step that diffs this table against the upstream README on each build. note MAX_CONSECUTIVE_FAILURES = 3 is also a const but only surfaced in prose - if it changes upstream nothing flags it here.
personally, i prefer linking to the upstream README.
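
If the CI option is chosen instead, a drift check could look roughly like the sketch below. It compares only the backticked option names in the first column of each table and reports any difference. The function names and the two sample tables are hypothetical; a real step would read the docs page and the upstream README from disk, and would probably compare defaults and types too.

```python
import re


def option_names(markdown: str) -> set[str]:
    """Collect backticked names from the first column of markdown table rows."""
    names = set()
    for line in markdown.splitlines():
        # Matches rows such as: | `url` | string | **required** | ...
        match = re.match(r"\s*\|\s*`([^`]+)`\s*\|", line)
        if match:
            names.add(match.group(1))
    return names


def table_drift(docs_md: str, upstream_md: str) -> dict[str, list[str]]:
    """Report options present on one side only (hypothetical helper)."""
    docs, upstream = option_names(docs_md), option_names(upstream_md)
    return {
        "missing_from_docs": sorted(upstream - docs),
        "unknown_in_docs": sorted(docs - upstream),
    }


# Sample inputs for illustration only, not the real file contents.
docs_table = """
| Option | Type | Default | Description |
| ------ | ---- | ------- | ----------- |
| `url` | string | **required** | Target URL for HTTP requests |
| `method` | string | `POST` | HTTP method |
"""
upstream_table = docs_table + "| `retry_delay` | string | `1s` | Base delay between retries |\n"

drift = table_drift(docs_table, upstream_table)
print(drift)
```

A CI job could exit non-zero whenever either list is non-empty. Constants that only appear in prose, such as `MAX_CONSECUTIVE_FAILURES`, would still need a separate check.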

| Option | Type | Default | Description |
| ------ | ---- | ------- | ----------- |
| `url` | string | **required** | Target URL for HTTP requests |
| `method` | string | `POST` | HTTP method: `GET`, `HEAD`, `POST`, `PUT`, `PATCH`, `DELETE` |

the method list includes GET and HEAD without flagging that those methods with non-individual batch modes produce a warning at runtime ("may be rejected by the server") since GET/HEAD don't conventionally carry bodies. either drop GET/HEAD from this list (rarely useful for a sink) or add a one-liner about the batch-mode interaction.

| `retry_delay` | string | `1s` | Base delay between retries |
| `retry_backoff_multiplier` | u32 | `2` | Exponential backoff multiplier (min 1) |
| `max_retry_delay` | string | `30s` | Maximum retry delay cap |
| `success_status_codes` | [u16] | `[200, 201, 202, 204]` | Status codes considered successful |

the description is fine but misses a useful behavior: codes in this set are also never retried, even normally-transient ones like 429. so users who want to treat 429 as "queued/accepted" can put 429 here and it will short-circuit retries. worth a sentence on this - it's a non-obvious knob.
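
To make the ordering concrete, here is a small model of the decision as described in this review: membership in `success_status_codes` is checked first, and only then the transient set. This is an illustrative sketch, not the upstream Rust code. The function name is hypothetical, and the `"fail"` label for all other codes is a placeholder assumption, since this thread does not cover how the sink handles them.

```python
# Transient codes as listed in this review; upstream keeps them as constants in lib.rs.
TRANSIENT_STATUS_CODES = frozenset({429, 500, 502, 503, 504})
# Default from the Configuration Options table.
DEFAULT_SUCCESS_STATUS_CODES = frozenset({200, 201, 202, 204})


def classify(status: int, success_status_codes=DEFAULT_SUCCESS_STATUS_CODES) -> str:
    """Return "success", "retry", or "fail" for a response status (hypothetical helper)."""
    # Success is checked first, so a code listed here is never retried.
    if status in success_status_codes:
        return "success"
    if status in TRANSIENT_STATUS_CODES:
        return "retry"
    # Placeholder: behaviour for other codes is an assumption of this sketch.
    return "fail"


print(classify(429))                              # default config
print(classify(429, frozenset({200, 202, 429})))  # 429 treated as accepted
```

With the defaults a 429 is retried; once 429 is added to `success_status_codes`, the same response counts as delivered.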

- **`json_array`**: all messages as a single JSON array. Best for APIs that expect array payloads. `Content-Type: application/json`.
- **`raw`**: raw bytes, one request per message. For non-JSON payloads (Protobuf, FlatBuffers, binary). The metadata envelope is not applied. `Content-Type: application/octet-stream`.

For production throughput, prefer `ndjson` or `json_array` over `individual`; they collapse N round trips per poll cycle into one.

raw has the same N-round-trips problem as individual (one HTTP request per message - confirmed at send_raw in lib.rs) but isn't called out alongside in this throughput note. either include raw here or be explicit that raw shares the per-message cost.
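
The cost difference can be shown with a tiny model of requests per poll cycle. This is a sketch under two assumptions that this thread does not confirm: batched modes never split a poll into several requests, and an empty poll sends nothing. The function name is hypothetical.

```python
# Modes that send one HTTP request per message, per the page text and send_raw.
PER_MESSAGE_MODES = {"individual", "raw"}
# Modes that collapse a poll cycle into a single request.
BATCHED_MODES = {"ndjson", "json_array"}


def requests_per_poll(batch_mode: str, message_count: int) -> int:
    """Estimate HTTP requests issued for one poll cycle (hypothetical helper)."""
    if batch_mode in PER_MESSAGE_MODES:
        return message_count
    if batch_mode in BATCHED_MODES:
        # Assumption: one request for any non-empty poll, none for an empty one.
        return 1 if message_count > 0 else 0
    raise ValueError(f"unknown batch_mode: {batch_mode!r}")


for mode in ("individual", "raw", "ndjson", "json_array"):
    print(mode, requests_per_poll(mode, 500))
```

For a poll of 500 messages this gives 500 requests for `individual` and `raw`, and 1 for `ndjson` and `json_array`, which is why `raw` belongs in the throughput note.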
| ``` | ||

- `iggy_id` is a 32-character lowercase hex string (no dashes).
- For non-JSON payloads (`raw`, `flatbuffer`, `proto` schemas), the payload is base64-encoded and an `iggy_payload_encoding: "base64"` field is added.

the prose says "an iggy_payload_encoding: "base64" field is added" but never shows the actual JSON shape. for raw/flatbuffer/proto schemas the sink emits the payload as {"data": "<base64>", "iggy_payload_encoding": "base64"} (see EncodedPayload struct). worth a 4-line example block - non-obvious that the bytes live in a data field.
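
Something along these lines would illustrate the shape. The sketch rebuilds the two fields named above from a few sample bytes. It assumes the standard base64 alphabet with padding, which this thread does not confirm, and it does not show where the object sits inside the metadata envelope. The function name is hypothetical.

```python
import base64
import json


def encoded_payload(raw: bytes) -> dict:
    """Mirror the {"data", "iggy_payload_encoding"} shape described in this review."""
    return {
        # Assumption: standard base64 alphabet with padding.
        "data": base64.b64encode(raw).decode("ascii"),
        "iggy_payload_encoding": "base64",
    }


sample = encoded_payload(b"\x00\x01\x02")
print(json.dumps(sample, indent=2))
# Prints:
# {
#   "data": "AAEC",
#   "iggy_payload_encoding": "base64"
# }
```

A receiver would decode `data` back to the original bytes before handing it to a Protobuf or FlatBuffers parser.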

[plugin_config]
url = "https://hooks.slack.com/services/T00/B00/xxx"
batch_mode = "individual"
include_metadata = false # Slack expects bare JSON payload

the comment "Slack expects bare JSON payload" is true but easy to misread. flipping include_metadata = false doesn't transform arbitrary payloads into Slack's {"text": "..."} shape - the sink does no payload transformation on outbound. add a one-liner: "your producer must publish Slack-compatible JSON; the sink does not transform payloads."
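
On the producer side that could be as small as the sketch below, which serializes a message in the `{"text": "..."}` shape before it is published. The function name is hypothetical, and how the bytes reach the Iggy stream is out of scope here.

```python
import json


def slack_message(text: str) -> bytes:
    """Serialize a message body in the {"text": ...} shape mentioned above."""
    return json.dumps({"text": text}).encode("utf-8")


body = slack_message("deploy finished")
print(body.decode("utf-8"))
```

With `include_metadata = false` and `batch_mode = "individual"`, these bytes would be forwarded as the request body unchanged.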
Summary
- Adds `content/docs/connectors/sinks/http.mdx` (~179 lines): a curated subset of the upstream `core/connectors/sinks/http_sink/README.md`, with a `<Callout>` at the top pointing readers back to the canonical README for the full surface.
- Inserts `"http"` in `content/docs/connectors/sinks/meta.json` between `iceberg` and `stdout` so the sidebar reflects the new page.
- Adds `Generic HTTP` to the Sink row of the Available Connectors table in `content/docs/connectors/introduction.mdx`.

Naming and content-scope decisions (file slug, page title, table cell wording, what to include vs. summarize-and-link) are documented in #39, including the rationale for `Generic HTTP` (vs. plain `HTTP`): the qualifier disambiguates a transport-level connector from the several sinks that already speak HTTP under the hood (Elasticsearch, Quickwit).

Test plan
- `npm run build` passes locally: 77/77 static pages generated, no MDX/TS errors
- `/docs/connectors/sinks/http` renders at HTTP 200 with all expected section headings (Configuration, Configuration Options, Batch Modes, Metadata Envelope, Authentication, Retry & Delivery Semantics, Example Configurations, Deployment & Performance, Known Limitations)
- `/docs/connectors/introduction#available-connectors` shows `Apache Iceberg, Generic HTTP, Stdout` in the Sink row
- `sinks/*` page lists the new entry between `iceberg` and `stdout`
- `<Callout type="info">` renders at the top of the new page

Closes #39