fix(webhook): per-instance webhook secret separate from master API key (Phase 1.1c — Phase 1 complete) by mescon · Pull Request #189 · mescon/Healarr

mescon · 2026-05-21T17:06:33Z

Closes P0 finding S2 — the last Phase 1 item. With this merged, Phase 1 of the remediation plan is complete.

The webhook handler accepted the master API key as the credential for incoming Sonarr/Radarr webhook calls — one secret, two roles, no separation. Per the audit: anyone who could read the webhook config in a compromised *arr instance gained full Healarr admin access.

Changes:

Migration 006 adds a nullable webhook_secret column to arr_instances.
createArrInstance generates a per-instance secret (32-byte cryptographic token, same primitive as auth.GenerateAPIKey), stores encrypted, returns plaintext in response so the user can paste it into Sonarr/Radarr.
getArrInstances surfaces webhook_secret (decrypted for display); NULL on legacy rows so the UI can prompt to generate one.
New endpoint POST /api/config/arr/:id/webhook-secret rotates the secret for incident response.
handleWebhook auth: if the instance has a webhook_secret, ONLY that secret authenticates (master key rejected); if NULL (legacy), master key works with a Warnf nudge to generate a per-instance secret.
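The authentication rule in the last item can be sketched in isolation. This is a minimal illustration, not the handler from the PR: `checkWebhookCredential`, the `authPath` type, and its constants are hypothetical names, and the secret is assumed to be already decrypted. A nil `instanceSecret` stands in for a legacy row whose webhook_secret is NULL.

```go
package main

import (
	"crypto/subtle"
	"fmt"
)

// authPath records which credential path accepted or rejected a request.
// Hypothetical type, used only for this sketch.
type authPath int

const (
	authRejected authPath = iota
	authPerInstanceSecret
	authLegacyMasterKey
)

// checkWebhookCredential applies the rule described above:
//   - instance has a secret: only that secret authenticates
//   - instance has no secret (nil): the master key is accepted as a fallback
//
// Comparisons use constant-time equality to avoid leaking match length
// through timing.
func checkWebhookCredential(supplied string, instanceSecret *string, masterKey string) authPath {
	if supplied == "" {
		// Never let an empty credential match an empty stored value.
		return authRejected
	}
	if instanceSecret != nil {
		if subtle.ConstantTimeCompare([]byte(supplied), []byte(*instanceSecret)) == 1 {
			return authPerInstanceSecret
		}
		// The master key is deliberately not tried here.
		return authRejected
	}
	if subtle.ConstantTimeCompare([]byte(supplied), []byte(masterKey)) == 1 {
		// A real handler would also log a warning asking the operator
		// to generate a per-instance secret.
		return authLegacyMasterKey
	}
	return authRejected
}

func main() {
	secret := "instance-secret"
	fmt.Println(checkWebhookCredential("instance-secret", &secret, "master-key") == authPerInstanceSecret)
	fmt.Println(checkWebhookCredential("master-key", &secret, "master-key") == authRejected)
	fmt.Println(checkWebhookCredential("master-key", nil, "master-key") == authLegacyMasterKey)
}
```

The second call is the case this PR exists for: once an instance has its own secret, the master key no longer authenticates webhook calls for it.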

Tests: 3 existing webhook tests updated to create instances first (the reorder made instance lookup the first step). New TestWebhook_PerInstanceSecret_Accepted verifies the per-instance secret authenticates AND that the master key is rejected once the instance has one.

Backward-compat: pre-migration instances continue to work via the master-key fallback; new instances get a webhook_secret automatically and require it for webhook auth. Admin UI / CLI auth via master key is unaffected (this PR only changes webhook handler auth).

Phase 1 of the remediation plan is COMPLETE after this PR.

Summary by CodeRabbit

New Features
- Introduced per-instance webhook secrets for enhanced security and isolation between instances
- Added the ability to generate and regenerate webhook secrets for individual arr instances
- New API endpoint to manage webhook secret regeneration
- Maintained backward compatibility for instances without configured secrets

Closes the P0 finding S2 from the audit, the LAST remaining Phase 1 item.

The webhook handler accepted the master API key (used by the admin UI and CLI integrations) as the credential for incoming Sonarr/Radarr webhook calls. Anyone who could read the webhook config in a compromised *arr instance gained full Healarr admin access; the two credentials shared one secret with no separation.

Changes:

1. Migration 006_webhook_secret.sql: adds nullable webhook_secret column to arr_instances. Additive, no breakage for existing rows.
2. createArrInstance: generates a per-instance webhook secret via auth.GenerateAPIKey (32-byte cryptographic token), stores encrypted, and returns the plaintext in the response so the user can paste it into the Sonarr/Radarr webhook config. The instance ID is also returned so the UI can link to the new instance.
3. getArrInstances: surfaces webhook_secret (decrypted for display) alongside api_key. Legacy rows with NULL webhook_secret report webhook_secret=null so the UI can prompt the user to generate one.
4. regenerateWebhookSecret: new endpoint POST /api/config/arr/:id/webhook-secret that rotates the secret. Existing webhooks using the old value will start failing until the user updates their Sonarr/Radarr config; that's intended for incident response (e.g., if a secret leaked).
5. handleWebhook: authentication path changed:
   - If the instance has a webhook_secret, the supplied apikey MUST match THAT secret. The master key is no longer accepted for instances that have moved to per-instance secrets; that's the whole point of the separation.
   - If webhook_secret is NULL (legacy instances pre-migration), fall back to the master API key. A Warnf log line nudges the operator to generate a per-instance secret to close the gap.

The instance lookup now happens BEFORE credential validation (previously it was after) because the per-instance secret is the authoritative credential and we need to know which instance is being asked about before we can validate.

Tests:
- setupTestDB and the webhook test DB schema include webhook_secret.
- Three existing failure-mode webhook tests (TestWebhook_InvalidAPIKey, _DBError, _DecryptError) updated to create an arr instance first; previously they bypassed the lookup but my reorder made instance lookup the first step. The legacy master-key fallback path is what they're actually testing now.
- TestWebhook_PerInstanceSecret_Accepted (new): inserts an instance with a known webhook_secret, verifies that secret authenticates successfully, AND that the master key is rejected. This is the regression guarantee for this PR.

Backward compatibility:
- Existing webhooks (pre-migration) continue to authenticate via the master API key. The Warnf log indicates which instances need per-instance secret generation.
- Existing tools (CLI, manual curl) authenticating via the master API key for the admin UI / protected endpoints are unaffected; this PR only changes webhook handler auth, not the general authMiddleware.
- New instances created after this migration get a webhook_secret automatically and require it for webhook auth; the master key stops working for them immediately.

Frontend follow-up (deferred):
- UI display of webhook_secret per instance with copy-to-clipboard and "regenerate" button
- Setup wizard prompting users with legacy instances to generate per-instance secrets

Addresses Phase 1.1c, the last Phase 1 item. With this merged, Phase 1 (Critical correctness - security & silent failures) is COMPLETE.
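For readers unfamiliar with the "32-byte cryptographic token" mentioned above, a minimal sketch of such a generator follows. The PR does not show the body of auth.GenerateAPIKey, so the function name `generateWebhookSecret` and the hex encoding are assumptions made for illustration only; the real helper may encode its output differently.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// generateWebhookSecret returns 32 random bytes from the operating system's
// CSPRNG, hex-encoded to a 64-character string.
// Hypothetical stand-in for auth.GenerateAPIKey; the encoding is an assumption.
func generateWebhookSecret() (string, error) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		return "", fmt.Errorf("read random bytes: %w", err)
	}
	return hex.EncodeToString(buf), nil
}

func main() {
	secret, err := generateWebhookSecret()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Only the length is printed; a real secret should not be logged.
	fmt.Println(len(secret))
}
```

Rotation, as done by the new regeneration endpoint, would amount to generating a fresh value this way and overwriting the stored (encrypted) one.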

coderabbitai · 2026-05-21T17:06:45Z

📝 Walkthrough

The PR adds per-instance webhook secret support to Healarr. A database migration introduces a nullable webhook_secret column on arr_instances. Instance handlers generate and encrypt secrets at creation; webhook authentication checks per-instance secrets first, then falls back to the master API key for legacy instances. A new regeneration endpoint allows updating secrets. Tests cover both authentication paths.

Changes

Per-Instance Webhook Secrets

Layer / File(s)	Summary
Database schema and test setup `internal/db/migrations/006_webhook_secret.sql`, `internal/api/handlers_test.go`, `internal/api/handlers_webhook_test.go`	Migration adds `webhook_secret` TEXT column to `arr_instances`; API and webhook test schemas add the new column to their test table definitions.
Instance management with webhook secrets `internal/api/handlers_arr.go`	Imports `database/sql` and `auth` packages; updates `getArrInstances` to query and decrypt `webhook_secret` in instance listings; extends `createArrInstance` to auto-generate and encrypt per-instance secrets; adds new `regenerateWebhookSecret` handler to update and return plaintext secrets.
Webhook authentication with per-instance secret and legacy fallback `internal/api/handlers_webhook.go`	Parses `instance_id` up front, loads instance `enabled` state and nullable `webhook_secret`; authenticates using per-instance secret when present (decrypt and constant-time compare); falls back to master `settings.api_key` for legacy instances (with warning log); removes unconditional master key check from main flow.
API endpoint for webhook secret regeneration `internal/api/rest.go`	Registers new authenticated `POST /config/arr/:id/webhook-secret` route mapped to `regenerateWebhookSecret` handler.
Webhook tests for legacy and per-instance authentication `internal/api/handlers_webhook_test.go`	Updates `TestWebhook_InvalidAPIKey`, `TestWebhook_DBError`, and `TestWebhook_DecryptError` to create and use dynamic legacy instance ids instead of hardcoded paths; adds new `TestWebhook_PerInstanceSecret_Accepted` test verifying per-instance secret acceptance and master key rejection for configured instances.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 Secrets per instance, no more the single key,
Rabbits guard their gardens with authenticity,
Legacy paths fade, yet still the old way works,
In backward-compat magic, the gentle fallback lurks,
Fresh secrets bloom anew at regeneration's call! 🌱

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 71.43% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the main change: implementing per-instance webhook secrets separate from the master API key, which is the core focus of all modified files.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

internal/api/handlers_webhook.go (1)

64-67: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Differentiate sql.ErrNoRows from real DB failures.

This branch currently turns every query failure into 404. Database faults should return 500 to avoid masking operational issues.

Suggested fix

 if err != nil {
-	logger.Errorf("Webhook rejected: Instance %d not found", instanceID)
-	c.JSON(http.StatusNotFound, gin.H{"error": "Instance not found"})
+	if errors.Is(err, sql.ErrNoRows) {
+		logger.Errorf("Webhook rejected: Instance %d not found", instanceID)
+		c.JSON(http.StatusNotFound, gin.H{"error": "Instance not found"})
+	} else {
+		logger.Errorf("Webhook auth lookup failed for instance %d: %v", instanceID, err)
+		c.JSON(http.StatusInternalServerError, gin.H{"error": "Authentication error"})
+	}
 	return
 }
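The distinction the reviewer asks for can be shown on its own. This is a sketch under stated assumptions: `statusForInstanceLookup` and the two sample error variables are hypothetical and do not appear in the PR. It relies only on standard-library behavior, namely that `errors.Is` sees through errors wrapped with `%w`.

```go
package main

import (
	"database/sql"
	"errors"
	"fmt"
	"net/http"
)

// Sample errors for the demonstration (hypothetical values).
var (
	errWrappedNoRows = fmt.Errorf("load instance 7: %w", sql.ErrNoRows)
	errDiskFailure   = errors.New("disk I/O error")
)

// statusForInstanceLookup maps the result of an instance lookup to an HTTP
// status: a missing row is a client-visible 404, while any other database
// error is an operational fault and becomes a 500.
func statusForInstanceLookup(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, sql.ErrNoRows):
		return http.StatusNotFound
	default:
		return http.StatusInternalServerError
	}
}

func main() {
	fmt.Println(statusForInstanceLookup(nil))              // 200
	fmt.Println(statusForInstanceLookup(errWrappedNoRows)) // 404
	fmt.Println(statusForInstanceLookup(errDiskFailure))   // 500
}
```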

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/api/handlers_webhook.go` around lines 64 - 67, The current error
branch in the webhook handler treats any DB error as "not found"; change it to
distinguish sql.ErrNoRows from real DB failures by using errors.Is(err,
sql.ErrNoRows) (or equivalent) and only return 404
(c.JSON(http.StatusNotFound,...)) when the error is ErrNoRows for the instance
lookup (reference: instanceID, logger.Errorf, the err variable in the webhook
handler in internal/api/handlers_webhook.go); for any other error log it as a DB
failure and return 500 (http.StatusInternalServerError). Ensure you
import/qualify database/sql and/or errors as needed.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/api/handlers_arr.go`:
- Line 222: The code currently ignores errors from result.LastInsertId() (and
similarly result.RowsAffected()) which can mask DB driver failures; update the
handler(s) that call LastInsertId and RowsAffected (search for the statements
using result.LastInsertId() and result.RowsAffected()) to check and handle the
returned error explicitly: if LastInsertId/RowsAffected returns an error, log it
and return an appropriate HTTP error response (or propagate the error) instead
of proceeding with a zero value; ensure you use the actual id/rows only when err
== nil and adjust subsequent success/404 logic accordingly.

In `@internal/api/handlers_webhook_test.go`:
- Line 202: The webhook test builds URLs using string(rune('0'+arrID)) which
only works for single digit IDs; replace that with a numeric string conversion
(e.g., strconv.Itoa(arrID) or fmt.Sprintf("%d", arrID)) wherever used
(specifically in the POST request constructions that reference arrID) so IDs >=
10 produce correct path segments—apply the same replacement at all affected
occurrences mentioned in the comment.

---

Outside diff comments:
In `@internal/api/handlers_webhook.go`:
- Around line 64-67: The current error branch in the webhook handler treats any
DB error as "not found"; change it to distinguish sql.ErrNoRows from real DB
failures by using errors.Is(err, sql.ErrNoRows) (or equivalent) and only return
404 (c.JSON(http.StatusNotFound,...)) when the error is ErrNoRows for the
instance lookup (reference: instanceID, logger.Errorf, the err variable in the
webhook handler in internal/api/handlers_webhook.go); for any other error log it
as a DB failure and return 500 (http.StatusInternalServerError). Ensure you
import/qualify database/sql and/or errors as needed.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 206daab2-5011-4b50-b34f-918fd14cf10e

📥 Commits

Reviewing files that changed from the base of the PR and between f053cc9 and 9983477.

📒 Files selected for processing (6)

internal/api/handlers_arr.go
internal/api/handlers_test.go
internal/api/handlers_webhook.go
internal/api/handlers_webhook_test.go
internal/api/rest.go
internal/db/migrations/006_webhook_secret.sql

coderabbitai · 2026-05-21T17:10:45Z

 		return
 	}
-	c.Status(http.StatusCreated)
+	id, _ := result.LastInsertId()


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle LastInsertId / RowsAffected errors explicitly.

Both DB result calls ignore returned errors. If the driver cannot provide these values, handlers may return incorrect success/404 behavior.

Suggested fix

-	id, _ := result.LastInsertId()
+	id, err := result.LastInsertId()
+	if err != nil {
+		logger.Errorf("Failed to get inserted instance ID: %v", err)
+		c.JSON(http.StatusInternalServerError, gin.H{"error": "Failed to create instance"})
+		return
+	}

-	n, _ := result.RowsAffected()
+	n, err := result.RowsAffected()
+	if err != nil {
+		logger.Errorf("Failed to read update result for instance %s: %v", id, err)
+		c.JSON(http.StatusInternalServerError, gin.H{"error": "Failed to rotate webhook secret"})
+		return
+	}

As per coding guidelines, "Handle all errors explicitly in Go - do not ignore error returns".
Also applies to: 260-260
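To see why the ignored error matters, the sketch below uses a fake `sql.Result` whose `LastInsertId` fails, which the `database/sql` interface permits. Everything here is illustrative: `fakeResult`, `insertedID`, and `errUnsupported` are hypothetical names, not code from the PR.

```go
package main

import (
	"database/sql"
	"errors"
	"fmt"
)

// errUnsupported simulates a driver that cannot report the inserted ID.
var errUnsupported = errors.New("LastInsertId is not supported by this driver")

// fakeResult is a stand-in sql.Result used only for this demonstration.
type fakeResult struct {
	id  int64
	err error
}

func (r fakeResult) LastInsertId() (int64, error) { return r.id, r.err }
func (r fakeResult) RowsAffected() (int64, error) { return 1, r.err }

// insertedID returns the new row's ID, or an error the caller can turn into
// an HTTP 500 instead of silently responding with id 0.
func insertedID(res sql.Result) (int64, error) {
	id, err := res.LastInsertId()
	if err != nil {
		return 0, fmt.Errorf("read inserted instance ID: %w", err)
	}
	return id, nil
}

func main() {
	id, err := insertedID(fakeResult{id: 42})
	fmt.Println(id, err)

	_, err = insertedID(fakeResult{err: errUnsupported})
	fmt.Println(err != nil)
}
```

With `id, _ := result.LastInsertId()`, the second case would proceed with an ID of 0 and report success for an instance the client cannot then address.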

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate.

In `@internal/api/handlers_arr.go` at line 222, The code currently ignores errors from result.LastInsertId() (and similarly result.RowsAffected()) which can mask DB driver failures; update the handler(s) that call LastInsertId and RowsAffected (search for the statements using result.LastInsertId() and result.RowsAffected()) to check and handle the returned error explicitly: if LastInsertId/RowsAffected returns an error, log it and return an appropriate HTTP error response (or propagate the error) instead of proceeding with a zero value; ensure you use the actual id/rows only when err == nil and adjust subsequent success/404 logic accordingly.

coderabbitai · 2026-05-21T17:10:45Z


 	body := bytes.NewBufferString(`{"eventType": "Download"}`)
-	req, _ := http.NewRequest("POST", "/api/webhook/1", body)
+	req, _ := http.NewRequest("POST", "/api/webhook/"+string(rune('0'+arrID)), body)


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use numeric string conversion for arrID in webhook URLs.

Current URL construction converts arrID to a single rune, so IDs >= 10 generate invalid path segments and can fail for the wrong reason.

Suggested fix

-req, _ := http.NewRequest("POST", "/api/webhook/"+string(rune('0'+arrID)), body)
+req, _ := http.NewRequest("POST", "/api/webhook/"+strconv.FormatInt(arrID, 10), body)

Apply the same replacement to all affected webhook test requests.

Also applies to: 512-512, 556-556, 597-597, 607-607
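The failure mode is easy to reproduce outside the test suite. In the sketch below, `runePath` and `decimalPath` are hypothetical helpers, and `arrID` is assumed to be an `int64` (the suggested fix's use of `strconv.FormatInt` implies this, but the PR excerpt does not show the declaration).

```go
package main

import (
	"fmt"
	"strconv"
)

// runePath mirrors the construction flagged in the review: it adds arrID to
// the code point of '0' and converts the result to a one-character string.
func runePath(arrID int64) string {
	return "/api/webhook/" + string(rune('0'+arrID))
}

// decimalPath formats arrID as a base-10 number, which works for any ID.
func decimalPath(arrID int64) string {
	return "/api/webhook/" + strconv.FormatInt(arrID, 10)
}

func main() {
	fmt.Println(runePath(7), decimalPath(7))   // /api/webhook/7 /api/webhook/7
	fmt.Println(runePath(12), decimalPath(12)) // /api/webhook/< /api/webhook/12
}
```

For ID 12 the rune arithmetic yields code point 60, which is `<`, so the request would hit an unparseable path and the test would fail for a reason unrelated to authentication.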

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate.

In `@internal/api/handlers_webhook_test.go` at line 202, The webhook test builds URLs using string(rune('0'+arrID)) which only works for single digit IDs; replace that with a numeric string conversion (e.g., strconv.Itoa(arrID) or fmt.Sprintf("%d", arrID)) wherever used (specifically in the POST request constructions that reference arrID) so IDs >= 10 produce correct path segments. Apply the same replacement at all affected occurrences mentioned in the comment.

codecov · 2026-05-21T17:10:53Z

Codecov Report

❌ Patch coverage is 48.27586% with 45 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
internal/api/handlers_arr.go	28.07%	37 Missing and 4 partials ⚠️
internal/api/handlers_webhook.go	86.20%	3 Missing and 1 partial ⚠️

📢 Thoughts on this report? Let us know!

…on missing (#190)

Closes audit finding T1.

IsRecoverable and IsTrueCorruption used switch statements with implicit "default: false". When a new ErrorType constant was added without updating both switches, the new type silently classified as neither recoverable nor true-corruption, making it invisible to the remediation pipeline. New errors would just be ignored, with no signal to the operator or developer.

Replace with an authoritative errorCategories map keyed by ErrorType string:

    var errorCategories = map[string]ErrorCategory{
        ErrorTypeZeroByte: CategoryTrueCorruption,
        // ... every constant must appear here
        ErrorTypeMountLost: CategoryRecoverable,
    }

IsRecoverable and IsTrueCorruption now delegate to a category() helper that consults the map. On lookup miss:
- testing.Testing() == true → panic with a message naming the unregistered type and pointing at the source file to update. This is the property that closes T1: a new ErrorType added without map entry will fail every test that exercises remediation routing, caught at CI rather than in production.
- production binary → log Errorf and fall back to CategoryRecoverable (the conservative choice: retry rather than delete user data). Unknown errors no longer silently bypass remediation; they're treated as transient infrastructure issues with operator-visible logging.

Tests:
- Existing TestHealthCheckError_IsRecoverable / _IsTrueCorruption test tables continue to pass unchanged (registered types still classify the same way).
- New TestHealthCheckError_UnregisteredTypePanics asserts the panic message contains "unregistered error type" so future refactors can't accidentally weaken the regression guard.
- New TestHealthCheckError_CategoryConsistency asserts every entry in errorCategories has a non-Unknown category (a CategoryUnknown registration would defeat the purpose).

This is the first PR in Phase 2 (Foundational types) of the remediation plan. Phase 1 (Critical correctness - security & silent failures) is complete as of PR #189.

Co-authored-by: mescon <mescon@users.noreply.github.com>
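The lookup-miss behavior described in that commit message can be sketched as follows. This is an illustration under assumptions, not the code from #190: the string values of the ErrorType constants are invented, and a `strict` parameter stands in for the `testing.Testing()` check so the example runs the same way in any context. The `panics` helper is likewise hypothetical.

```go
package main

import "fmt"

type ErrorCategory int

const (
	CategoryUnknown ErrorCategory = iota
	CategoryRecoverable
	CategoryTrueCorruption
)

// The constant values below are assumptions made for this sketch.
const (
	ErrorTypeZeroByte  = "ZeroByte"
	ErrorTypeMountLost = "MountLost"
)

// errorCategories is the single authoritative registry of error types.
var errorCategories = map[string]ErrorCategory{
	ErrorTypeZeroByte:  CategoryTrueCorruption,
	ErrorTypeMountLost: CategoryRecoverable,
}

// category looks up errType. On a miss it panics when strict is true
// (standing in for testing.Testing()) and otherwise logs and falls back to
// CategoryRecoverable, the conservative choice.
func category(errType string, strict bool) ErrorCategory {
	if c, ok := errorCategories[errType]; ok {
		return c
	}
	if strict {
		panic(fmt.Sprintf("unregistered error type %q: add it to errorCategories", errType))
	}
	fmt.Printf("ERROR: unregistered error type %q, treating as recoverable\n", errType)
	return CategoryRecoverable
}

// panics reports whether calling f panics.
func panics(f func()) (didPanic bool) {
	defer func() { didPanic = recover() != nil }()
	f()
	return false
}

func main() {
	fmt.Println(category(ErrorTypeZeroByte, false) == CategoryTrueCorruption)
	fmt.Println(category("BrandNewType", false) == CategoryRecoverable)
	fmt.Println(panics(func() { category("BrandNewType", true) }))
}
```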

coderabbitai Bot reviewed May 21, 2026

mescon merged commit bca5ff7 into main May 21, 2026
7 of 9 checks passed

mescon deleted the fix/webhook-per-instance-secret branch May 21, 2026 17:40

mescon mentioned this pull request May 21, 2026

fix(integration): categorize HealthErrorType via explicit map, panic on missing (Phase 2.3: T1) #190

Merged

4 tasks

coderabbitai Bot mentioned this pull request Jun 3, 2026

fix(frontend): use per-instance webhook secret in copy-URL field #293

Merged

4 tasks

mescon merged 1 commit into main from fix/webhook-per-instance-secret
