ECI-1486 scaleway onboard setup script by rheei · Pull Request #166 · DataDog/integrations-management

rheei · 2026-04-17T15:23:08Z

Summary

Adds scaleway/log_forwarding/setup-logs.sh, an end-to-end setup script that provisions a Scaleway project for Datadog log forwarding with no manual tile entry required. Pre-filled credentials and the audit-trail toggle state come from the integration tile UI (web-ui#306167).

Bootstrap

scw CLI install (Linux only): if scw isn't on PATH, the script offers to install it via the official Scaleway install script. macOS users get a brew install scw hint. Skipped in --dry-run.
scw init (when needed): if scw is installed but has no credentials, runs scw init interactively so the user is prompted for their Scaleway access/secret keys, with a pointer to https://console.scaleway.com/iam/api-keys.

Pre-flights (fail fast before any Scaleway-side work)

Datadog integrations_manage permission: probes GET /api/v2/web-integrations/scaleway/accounts with the supplied DD_API_KEY/DD_APP_KEY. Bails on 401/403 with an actionable fix.
Scaleway IAM SSH key: when audit trail will provision an Instance, confirms at least one SSH key is registered in the project (Scaleway auto-installs registered keys on new Instances). Skipped when SCW_INSTANCE_IP is set (BYO) or audit trail is disabled.

Step 0 – IAM provisioning

Creates a least-privilege datadog-integration IAM application with an ObservabilityFullAccess policy scoped to the target project, generates an API key pair, and switches all subsequent calls to those credentials. Idempotent — reuses the app and policy if they already exist.

Part 1 – Cockpit native exports

Creates Datadog log exporters for all Scaleway-managed data sources across all supported regions. Supported regions are discovered dynamically from the scw CLI so the list stays in sync as Scaleway adds new ones. Idempotent per exporter.

Part 2 – Audit trail export

Gated by SCW_AUDIT_TRAIL_ENABLED=true (the UI toggle). The script:

Auto-provisions a Scaleway Instance (DEV1-S, ~€6.34/mo) tagged datadog-audit-trail via scw instance server create -w. Idempotent: subsequent runs reuse the same Instance by tag.
Or honors SCW_INSTANCE_IP for customers who want to deploy the collector to an existing Instance they already operate (advanced/BYO).
Builds an OpenTelemetry Collector locally with the scwaudittrail receiver and deploys it over SSH with a pinned host key.
The --teardown flag deletes only auto-provisioned Instances (filtered by tag) — never touches BYO Instances.

Part 3 – Datadog account registration

Calls POST /api/v2/web-integrations/scaleway/accounts to create (or PATCH to update) the Datadog integration account with the provisioned credentials. Skipped if Cockpit produced zero working exporters AND audit trail didn't deploy, so a permission failure or partial outage doesn't leave a dangling integration account record with no data flowing.

Flags & env vars

--dry-run: prints every API call (method, URL, body) without executing — fake credentials are accepted.
--teardown: deletes everything tagged datadog-audit-trail and exits.
SCW_AUDIT_TRAIL_ENABLED=true|false — pre-filled by the tile toggle.
SCW_INSTANCE_IP — BYO Instance escape hatch.
PROVISION_INSTANCE=auto|true|false — controls the cost-confirmation prompt.
SCW_AUDIT_INSTANCE_{TYPE,ZONE,IMAGE} — provisioning overrides.
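
A minimal sketch of how a dry-run switch like `--dry-run` could be wired. `api_call` and the `DRY_RUN` variable are illustrative names, not taken from setup-logs.sh, and the URL is a placeholder.

```shell
# Hypothetical dry-run wrapper: prints method, URL, and body instead of
# sending the request when DRY_RUN=true.
DRY_RUN="${DRY_RUN:-true}"

api_call() {
  local method="$1" url="$2" body="${3:-}"
  if [ "$DRY_RUN" = "true" ]; then
    printf '[dry-run] %s %s\n' "$method" "$url"
    if [ -n "$body" ]; then
      printf '[dry-run] body: %s\n' "$body"
    fi
    return 0
  fi
  # Real path: only reached when DRY_RUN is not "true".
  if [ -n "$body" ]; then
    curl -sS -X "$method" -H 'Content-Type: application/json' --data "$body" "$url"
  else
    curl -sS -X "$method" "$url"
  fi
}

api_call POST "https://api.example.invalid/v1/things" '{"name":"demo"}'
```

Because the wrapper returns before any network call, placeholder credentials are enough for a smoke test.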

Test plan

--dry-run smoke test passes with fake credentials: SCW_SECRET_KEY=x SCW_ACCESS_KEY=x SCW_ORGANIZATION_ID=x SCW_PROJECT_ID=x DD_API_KEY=x DD_APP_KEY=x DD_SITE=datadoghq.com SCW_AUDIT_TRAIL_ENABLED=false bash scaleway/log_forwarding/setup-logs.sh --dry-run
--dry-run --teardown smoke test passes
Parts 0–2 verified against a real Scaleway organization (IAM app + policy created, exporters detected as already-existing on re-run)
Part 3 verified end-to-end against a real Scaleway project: IAM key rotation, Cockpit exporter dedup, audit collector redeploy, and account PATCH all succeed
Auto-provisioning verified live: DEV1-S create → SSH ready in ~24s, idempotent re-run reused by tag, --teardown cleaned up Instance + IP + volumes
Pre-flight checks verified with bad creds (clear 401 message) and missing SSH key
bash -n and shellcheck -S warning clean
End-to-end run against a fresh Scaleway test project to validate the full happy path before marking ready for review

@rheei where do we anticipate this being run? In a local shell on the user's laptop, or in a cloud shell? If it's the latter, I'd check whether writing files/requests to external domains is blocked by default in that environment. Otherwise, I wouldn't have concerns with either of these approaches. The Azure install script does both.

For the sake of maintainability, I think it'd be preferable to not stick these files as strings in a bash script if possible, but also wouldn't be the end of the world if there aren't other good options.

gpalmz · 2026-04-23T19:51:56Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 428426fe78

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

rheei · 2026-04-27T19:07:47Z

@codex review

chatgpt-codex-connector · 2026-04-27T19:07:53Z

To use Codex here, create a Codex account and connect to github.

gpalmz · 2026-04-29T16:04:09Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bf14ee08d8


gpalmz · 2026-04-29T18:03:19Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9c30afcba8


gpalmz · 2026-05-11T18:21:45Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 96a227e641


chatgpt-codex-connector · 2026-05-11T18:26:23Z

+  local policy_resp
+  if [[ -n "$policy_id" ]]; then
+    log "Updating IAM policy '${IAM_POLICY_NAME}' (id=${policy_id})..."
+    policy_resp=$(scw_request PATCH "/iam/v1alpha1/policies/${policy_id}" "$policy_body") \


Update IAM policy rules through the rules endpoint

When the datadog-integration-policy already exists but is missing the current rule set, this PATCH does not actually refresh the permissions used by the new API key. Scaleway's IAM SDK/API docs describe UpdatePolicyRequest as only accepting name, description, tags, and principal fields, while rule replacement is done via SetRulesRequest (Rules []*RuleSpecs). In that scenario the script prints "Updated" but leaves stale rules in place, so later Cockpit or audit-trail calls can still fail with insufficient permissions.

Useful? React with 👍 / 👎.
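
A rough sketch of the request the suggestion points at. The `PUT /iam/v1alpha1/rules` path appears in a later commit message on this PR; the body shape (`policy_id` plus `rules`) is an assumption based on the SetRulesRequest description above, and `build_rules_body` is a hypothetical helper.

```shell
# Hypothetical helper: builds the JSON body for replacing a policy's rules.
# Field names are assumed from the SetRulesRequest description, not verified
# against the live API.
build_rules_body() {
  local policy_id="$1" project_id="$2"
  printf '{"policy_id":"%s","rules":[{"permission_set_names":["ObservabilityFullAccess","AllProductsReadOnly"],"project_ids":["%s"]}]}\n' \
    "$policy_id" "$project_id"
}

body="$(build_rules_body "11111111-1111-1111-1111-111111111111" "22222222-2222-2222-2222-222222222222")"
# Print the request instead of sending it; the real script would pass this to
# its own request helper.
printf 'PUT /iam/v1alpha1/rules\n%s\n' "$body"
```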

chatgpt-codex-connector · 2026-05-11T18:26:23Z

+    local old_key
+    while IFS= read -r old_key; do
+      [[ -z "$old_key" ]] && continue
+      if scw iam api-key delete "access-key=${old_key}" 2>/dev/null; then


Pass the API key as the delete positional argument

For rotations where _IAM_OLD_KEYS is non-empty, this cleanup never deletes the stale keys because the Scaleway CLI documents scw iam api-key delete <access-key ...> and shows scw iam api-key delete SCW000...; access-key=${old_key} is treated as the positional value itself rather than the access key. The setup still succeeds, but every rerun leaves the previous Datadog integration key active instead of revoking it.

Useful? React with 👍 / 👎.
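
A sketch of the cleanup loop with the access key passed positionally, as the comment suggests. `delete_old_keys` is a hypothetical wrapper; only the `scw iam api-key delete <access-key>` form comes from the review text.

```shell
# Reads one access key per line on stdin, revokes each one, and prints how
# many deletions succeeded.
delete_old_keys() {
  local old_key deleted=0
  while IFS= read -r old_key; do
    [ -z "$old_key" ] && continue
    # Access key goes in as the positional argument, not as access-key=...
    if scw iam api-key delete "$old_key" >/dev/null 2>&1; then
      deleted=$((deleted + 1))
    fi
  done
  printf '%s\n' "$deleted"
}
```

It could be called as `printf '%s\n' "$_IAM_OLD_KEYS" | delete_old_keys`, assuming `_IAM_OLD_KEYS` holds one key per line.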

gpalmz

Approving to unblock, but I have a few comments, and I see codex has some more too

gpalmz · 2026-05-11T18:31:49Z

+  audit_dir="${script_dir}/audit-trail"
+
+  local _audit_ref="${AUDIT_TRAIL_REF:-main}"
+  local _audit_base="https://raw.githubusercontent.com/DataDog/integrations-management/${_audit_ref}/scaleway/log_forwarding/audit-trail"


Since the in-code comment above flags this as a pre-GA TODO, dropping a reminder: any push to main can change what gets downloaded onto a deployed instance on the next re-run. Pinning to a tagged release or a SHA would close that. Easy to forget once this merges.
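
One way to make the reminder hard to miss is a guard that warns when the ref is mutable. This is only a sketch: `is_pinned_ref` is a hypothetical helper, and it accepts full lowercase commit SHAs only (tagged releases would need an extra rule).

```shell
# Succeeds only for a full 40-character lowercase hex commit SHA.
is_pinned_ref() {
  case "$1" in
    *[!0-9a-f]*) return 1 ;;   # contains a non-hex character
  esac
  [ "${#1}" -eq 40 ]
}

audit_ref="${AUDIT_TRAIL_REF:-main}"
if is_pinned_ref "$audit_ref"; then
  echo "Using pinned ref ${audit_ref}"
else
  echo "WARNING: AUDIT_TRAIL_REF='${audit_ref}' is a mutable ref; pin to a commit SHA before GA" >&2
fi
```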

gpalmz · 2026-05-11T18:31:49Z

+      application_id:  $app_id,
+      rules: [
+        {
+          permission_set_names: ["ObservabilityFullAccess", "AllProductsReadOnly"],


The Step 0 docstring describes the policy as "least-privilege," but AllProductsReadOnly here grants read access to every Scaleway product on the project, which is more than log forwarding needs. If something specific requires it (AMS tag resolution, account validation, etc.), a one-line note would help; otherwise it could be dropped to match the docstring.

EamonBrady1 · 2026-05-11T18:52:06Z

Would be great to test with another fresh user like @gpalmz before merging, just to see if they run into any more blockers

Addresses Grant's review comment on PR #166: after provision_iam_application swaps SCW_SECRET_KEY to the new app's secret (least-privilege), the cleanup loop calls `scw iam api-key delete` which would normally need IAM write permissions the app doesn't have. The script's bash-local var swap happens to "work" in interactive shells because the env vars aren't exported, so the `scw` child process falls back to the user's owner-level config-file creds. But in CI / containerized / fresh-env runs where SCW_SECRET_KEY is exported before invocation, the child process inherits the restricted app key and the delete 403s, leaking IAM keys across re-runs (matches the 4-key leak observed in earlier testing). Fix: stash _ORIG_SCW_{ACCESS,SECRET}_KEY in provision_iam_application before the swap, then run the cleanup-loop delete with an env-var override so the child `scw` process picks up the original (owner-level) creds.
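
The stash-and-override pattern described in this commit message can be illustrated roughly as follows. The `_ORIG_SCW_*` names come from the message; `stash_owner_creds`, `with_owner_creds`, and the key values are placeholders for illustration.

```shell
# Remember the owner-level credentials before they are swapped out.
stash_owner_creds() {
  _ORIG_SCW_ACCESS_KEY="${SCW_ACCESS_KEY:-}"
  _ORIG_SCW_SECRET_KEY="${SCW_SECRET_KEY:-}"
}

# Run one child process with the owner-level credentials in its environment,
# leaving the current shell's variables untouched.
with_owner_creds() {
  env SCW_ACCESS_KEY="$_ORIG_SCW_ACCESS_KEY" SCW_SECRET_KEY="$_ORIG_SCW_SECRET_KEY" "$@"
}

export SCW_ACCESS_KEY="SCWOWNERACCESS" SCW_SECRET_KEY="owner-secret"
stash_owner_creds
# Swap to the restricted application key; the variables stay exported.
SCW_ACCESS_KEY="SCWAPPACCESS"
SCW_SECRET_KEY="app-secret"

sh -c 'echo "child sees: $SCW_SECRET_KEY"'                     # app-secret
with_owner_creds sh -c 'echo "cleanup sees: $SCW_SECRET_KEY"'  # owner-secret
```

In the cleanup loop this would look like `with_owner_creds scw iam api-key delete "$old_key"`.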

rheei · 2026-05-14T15:16:55Z

@codex review

chatgpt-codex-connector · 2026-05-14T15:17:04Z

To use Codex here, create a Codex account and connect to github.

rheei · 2026-05-14T15:29:10Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4bcb529faf


chatgpt-codex-connector · 2026-05-14T15:32:18Z

+: "${DD_API_KEY:?DD_API_KEY is required (your Datadog API key)}"
+: "${DD_APP_KEY:?DD_APP_KEY is required (your Datadog application key)}"
+: "${DD_SITE:?DD_SITE is required (e.g. datadoghq.com)}"


Don't require Datadog keys for teardown

When the user runs the documented --teardown path, the script still evaluates these Datadog key assertions before main() gets a chance to return from the teardown branch. That makes bash setup-logs.sh --teardown fail immediately unless unrelated Datadog credentials are present, even though teardown only deletes Scaleway audit-trail instances; gate these checks on TEARDOWN=false or move them after the teardown early return.

Useful? React with 👍 / 👎.
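
A sketch of the gating described above, assuming the script tracks the flag in a `TEARDOWN` variable as the comment implies. `check_datadog_env` is a hypothetical wrapper around the three assertions from the diff.

```shell
# Validates the Datadog settings unless the run is a teardown, which only
# touches Scaleway resources.
check_datadog_env() {
  if [ "${TEARDOWN:-false}" = "true" ]; then
    return 0
  fi
  # ${VAR:?msg} aborts the (sub)shell with msg when VAR is unset or empty.
  : "${DD_API_KEY:?DD_API_KEY is required (your Datadog API key)}"
  : "${DD_APP_KEY:?DD_APP_KEY is required (your Datadog application key)}"
  : "${DD_SITE:?DD_SITE is required (e.g. datadoghq.com)}"
}

# Demo in a subshell so a failed assertion cannot end the calling shell.
( TEARDOWN=true; check_datadog_env ) && echo "teardown: Datadog checks skipped"
```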

chatgpt-codex-connector · 2026-05-14T15:32:18Z

+      if grep -qxF "$ds_id" <<< "$exported_ids"; then
+        ok "Already exported  project=$SCW_PROJECT_ID  region=$region  datasource=$ds_id"
+        skipped=$((skipped + 1))


Update existing Cockpit exporters on rerun

For an existing exporter, the script skips solely because the datasource ID is already exported with this name. On a rerun for the same DD_SITE after rotating DD_API_KEY or changing SCALEWAY_PRODUCTS, this leaves the old Datadog destination/products in Scaleway and logs can continue going to a revoked or unintended key; compare the existing exporter configuration and PATCH it instead of treating every name/datasource match as current.

Useful? React with 👍 / 👎.

rheei · 2026-05-15T14:15:27Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4bcb529faf


chatgpt-codex-connector · 2026-05-15T14:18:19Z

+          permission_set_names: ["ObservabilityFullAccess", "AllProductsReadOnly"],
+          project_ids:          [$project_id]


Grant instance provisioning permissions before switching keys

When SCW_SECRET_KEY/SCW_ACCESS_KEY are supplied as environment variables (the tile pre-filled flow), reassigning them to the new application key keeps them exported, so the later scw instance server create inherits this policy. The policy only grants Observability/AllProductsReadOnly/AuditTrailReadOnly, while Scaleway documents InstancesFullAccess as the permission set for “create, read, list, edit and delete Instances” (https://www.scaleway.com/en/docs/iam/reference-content/permission-sets/). With audit trail enabled and no SCW_INSTANCE_IP, auto-provisioning therefore fails with an authorization error; provision the VM before switching credentials or grant the instance permissions required by the create/delete path.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-15T14:18:19Z

+    local datasource_ids=()
+    while IFS= read -r _id; do
+      [[ -n "$_id" ]] && datasource_ids+=("$_id")
+    done < <(get_log_datasource_ids "$SCW_PROJECT_ID" "$region" 2>/dev/null || true)


Track Cockpit list failures instead of counting them as success

If listing data sources fails for a region (for example because the new app key lacks the expected Cockpit permission, the region is invalid, or Scaleway returns a transient error), this || true converts the failure into an empty list. The loop then increments _COCKPIT_SKIPPED, and main treats skipped entries as cockpit_working, so with audit trail disabled/not deployed the script can still register a Datadog account even though no exporter was created or confirmed. Propagate these list failures into _COCKPIT_FAILED separately from the legitimate empty-project case.

Useful? React with 👍 / 👎.
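
A sketch of keeping list failures separate from legitimately empty regions. The `get_log_datasource_ids` body here is a stand-in stub with made-up IDs, and the counter names are hypothetical (the comment's `_COCKPIT_FAILED` would play the role of `list_failed`).

```shell
# Stub for illustration: prints one ID per line, prints nothing for an empty
# region, and fails for a region it cannot list.
get_log_datasource_ids() {
  case "$2" in
    fr-par) printf 'ds-1\nds-2\n' ;;
    nl-ams) : ;;
    *) return 1 ;;
  esac
}

found=0
empty_regions=0
list_failed=0
for region in fr-par nl-ams pl-waw; do
  # Check the exit status instead of masking it with "|| true".
  if ! ids="$(get_log_datasource_ids "demo-project" "$region" 2>/dev/null)"; then
    list_failed=$((list_failed + 1))
    continue
  fi
  if [ -z "$ids" ]; then
    empty_regions=$((empty_regions + 1))
    continue
  fi
  # IDs contain no whitespace, so word splitting yields one ID per iteration.
  for id in $ids; do
    found=$((found + 1))
  done
done
echo "found=$found empty_regions=$empty_regions list_failed=$list_failed"
```

With the failure count kept apart, the registration step can refuse to treat a failed listing as a working exporter.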

End-to-end onboarding for Scaleway → Datadog log forwarding. Single bash script with no manual tile entry required.

Bootstrap
- Detects whether `scw` CLI is installed; offers to install via the official Scaleway installer (Linux) or Homebrew / direct binary download from GitHub releases (macOS, including Apple Silicon).
- Runs `scw init` if scw is installed but no credentials are configured.

Pre-flights (fail fast: no Scaleway resources created on failure)
- Verifies the Datadog API/App key has the `integrations_manage` permission via a side-effect-free DELETE probe against a zero-UUID.
- When audit trail will provision an Instance, confirms at least one Scaleway IAM SSH key is registered (Scaleway auto-installs them on new Instances so the script can SSH in to deploy the collector).

Step 0 – IAM provisioning
- Creates a least-privilege `datadog-integration` IAM application with policy rules scoped to the target project (ObservabilityFullAccess + AllProductsReadOnly) and org (AuditTrailReadOnly). Rules are set via `PUT /iam/v1alpha1/rules` so re-runs actually refresh them (Scaleway's policy PATCH silently drops the rules field).
- Generates an API key pair and switches subsequent calls to those credentials. Stashes the original (owner-level) creds for end-of-run IAM key cleanup, since the new app's policy lacks IAM write.

Part 1 – Cockpit native exports
- Creates Datadog log exporters for every Scaleway-managed Cockpit data source across all supported regions. Regions are discovered dynamically from the scw CLI so new regions are picked up automatically. Idempotent per exporter (dedup-by-name).

Part 2 – Audit trail collector
- Gated by the `SCW_AUDIT_TRAIL_ENABLED=true` toggle from the integration tile UI.
- Auto-provisions a small Scaleway Instance (DEV1-S, ~€6.34/mo) tagged `datadog-audit-trail`, or honors a user-supplied `SCW_INSTANCE_IP` for BYO use.
- Builds an OpenTelemetry Collector locally with the `scwaudittrail` receiver, deploys the binary over SSH with a pinned host key, installs a systemd unit.
- `--teardown` flag deletes only the auto-provisioned Instance (by tag) along with its IP and volumes; never touches BYO Instances.

Part 3 – Datadog account registration
- POST or PATCH to `/api/v2/web-integrations/scaleway/accounts` with the provisioned credentials. Paginated list to find an existing account by name. Skipped if Cockpit failed to produce any working exporters and audit trail didn't deploy, so a permission failure or partial outage doesn't leave a dangling integration entry.

Flags / env vars
- `--dry-run`: print every API call (method, URL, body) without executing; fake credentials accepted.
- `--teardown`: delete the auto-provisioned audit-trail Instance and exit.
- `SCW_AUDIT_TRAIL_ENABLED`, `SCW_INSTANCE_IP`, `PROVISION_INSTANCE`, `SCW_AUDIT_INSTANCE_{TYPE,ZONE,IMAGE}` for overrides.

Adds `@DataDog/emerging-cloud-integrations` as the CODEOWNER for `scaleway/`.

Verified end-to-end against a real Scaleway organization and Datadog account: provisioning ~24s to SSH-ready, idempotent re-runs reuse the Instance by tag, teardown leaves user-owned resources untouched, and audit trail events land in Datadog under `service:scaleway.audit-trail`.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

macOS ships bash 3.2.x by default. The only bash-4.2 dependency was `printf '%(...)T'` for log timestamps; swap to `date '+%H:%M:%S'` which works on bash 3.2. Drops the version gate, so macOS users can run the script without first installing a newer bash via Homebrew.
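
A minimal sketch of the swap described in this commit. The `log` name appears in the script's diff, but the exact output format shown here is an assumption.

```shell
# Prefixes a message with a wall-clock timestamp using date(1), which works on
# bash 3.2; printf '%(...)T' would need bash 4.2 or newer.
log() {
  printf '[%s] %s\n' "$(date '+%H:%M:%S')" "$*"
}

log "Creating exporters..."
```

The cost is one extra `date` process per log line, which is negligible for an interactive setup script.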

rheei changed the title ~~scaleway'~~ ECI-1486 scaleway onboard setup script Apr 22, 2026

rheei force-pushed the rheei/eci-1486-scaleway-integration-setup branch from 4a1f185 to 066c892 Compare April 22, 2026 20:49

rheei requested a review from EamonBrady1 April 23, 2026 13:25