Skip to content

feat: RAG service API key file management#302

Merged
tsivaprasad merged 11 commits intomainfrom
PLAT-490-rag-service-api-key-file-management
Apr 7, 2026
Merged

feat: RAG service API key file management#302
tsivaprasad merged 11 commits intomainfrom
PLAT-490-rag-service-api-key-file-management

Conversation

@tsivaprasad
Copy link
Copy Markdown
Contributor

@tsivaprasad tsivaprasad commented Mar 18, 2026

Summary

This PR adds RAGServiceKeysResource, a resource that stores provider API keys (OpenAI, Anthropic, etc.) as files on the host filesystem and mounts them into the RAG container in read-only mode. The keys are not exposed via Docker Swarm configs or API responses.

Changes

  • Add RAGServiceKeysResource (swarm.rag_service_keys) — manages the {DataDir}/services/{instanceID}/keys/ directory: Create writes key files at mode 0600 in a 0700 directory, Update rewrites them for key rotation, Delete removes the directory via os.RemoveAll
  • Add extractRAGAPIKeys helper that extracts API keys from a parsed RAGServiceConfig into a {pipeline_name}_embedding.key / {pipeline_name}_rag.key filename map
  • Extend ServiceContainerSpecOptions with KeysPath; when non-empty a read-only bind mount to /app/keys is added to the container spec
  • Wire RAGServiceKeysResource into generateRAGInstanceResources after RAGServiceUserRole; RAG config is now parsed at resource-generation time
  • Register the new resource type in RegisterResourceTypes

Testing

  1. Covered unit tests
  2. Manual Verification
  • Created cluster
  • Created DB using below config:
{
  "id": "storefront",
  "spec": {
    "database_name": "storefront",
    "database_users": [
      {
        "username": "admin",
        "password": "password",
        "db_owner": true,
        "attributes": ["SUPERUSER", "LOGIN"]
      }
    ],
    "port": 0,
    "nodes": [
      { "name": "n1", "host_ids": ["host-1"] }
    ],
    "services": [
      {
        "service_id": "rag",
        "service_type": "rag",
        "version": "latest",
        "host_ids": ["host-1"],
        "port": 0,
        "config": {
          "defaults": {
            "token_budget": 2000,
            "top_n": 10
          },
          "pipelines": [
            {
              "name": "default",
              "description": "Main RAG pipeline",
              "tables": [
                {
                  "table": "documents_content_chunks",
                  "text_column": "content",
                  "vector_column": "embedding"
                }
              ],
              "embedding_llm": {
                "provider": "openai",
                "model": "text-embedding-3-small",
                "api_key": "sk-proj-AAA"
              },
              "rag_llm": {
                "provider": "anthropic",
                "model": "claude-sonnet-4-5",
                "api_key": "sk-ant-XXX"
              },
              "token_budget": 4000,
              "top_n": 15,
              "search": {
                "hybrid_enabled": true,
                "vector_weight": 0.7
              }
            }
          ]
        }
      }
    ]
  }
}
  • No api_keys in response
  • Confirmed that the key files exists at path
# pwd
/host-1/services/storefront-rag-host-1/keys
# ls -la
total 8
drwx------ 4 root root 128 Mar 17 16:48 .
drwx------ 3 root root  96 Mar 17 16:48 ..
-rw------- 1 root root 164 Mar 17 16:48 default_embedding.key
-rw------- 1 root root 108 Mar 17 16:48 default_rag.key

Checklist

  • Tests added

Notes for Reviewers

  • RAGServiceKeysResource owns the keys/ subdirectory entirely (no separate DirResource), since no other resource shares it
  • The KeysPath field on ServiceContainerSpecOptions is currently unused by MCP (zero value skips the mount) — no behaviour change for existing MCP services
  • Key files are written as root-owned (UID 0); the RAG container UID and ownership will be addressed when container deployment is wired up in the next PR

PLAT-490

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 18, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds host-side RAG API key file management and wiring into RAG instance resource generation, refactors MCP per-node role and PostgREST authenticator generation, introduces a container KeysPath mount option, and adds unit tests for key handling, filename validation, and resource lifecycle.

Changes

Cohort / File(s) Summary
RAG Service Keys Resource
server/internal/orchestrator/swarm/rag_service_keys_resource.go, server/internal/orchestrator/swarm/rag_service_keys_resource_test.go, server/internal/orchestrator/swarm/resources.go
New ResourceTypeRAGServiceKeys, identifier factory, and RAGServiceKeysResource with Create/Update/Refresh/Delete. Persists API keys under a keys/ directory with filename validation and enforced permissions; registered in resource types and covered by unit tests.
Orchestrator — RAG instance generation
server/internal/orchestrator/swarm/orchestrator.go
generateRAGInstanceResources now parses RAG config with early failure on parse errors, creates a service data dir resource, extracts API keys via extractRAGAPIKeys, and appends a RAGServiceKeysResource.
Orchestrator — MCP / PostgREST refactor
server/internal/orchestrator/swarm/orchestrator.go
generateMCPInstanceResources iterates all spec.DatabaseNodes, skips only the canonical node when creating per-node ServiceUserRole entries, and emits per-node PostgRESTAuthenticatorResource using per-node RW role identifiers.
Service container spec
server/internal/orchestrator/swarm/service_spec.go
Adds KeysPath string to ServiceContainerSpecOptions and conditionally includes an additional read-only bind mount for API key files into the container when KeysPath is set.
Tests and validation adjustments
server/internal/orchestrator/swarm/rag_service_user_role_test.go, server/internal/orchestrator/swarm/rag_service_keys_resource_test.go
Updates RAG instance tests to expect appended dir and keys resources, adjusts per-node role assertions, and adds tests for extractRAGAPIKeys, validateKeyFilename, directory/file permission enforcement, and full resource lifecycle behavior.

Poem

🐇 I hop through keys and tidy trails,
I name each file with careful scales.
No ../ tricks, no stray slash games—
Directories snug, with guarded names.
I hum, I guard, the secrets stay—🐰 keys safe where I lay.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 18.18% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main change: adding RAG service API key file management functionality, which aligns with the core feature of storing and managing provider API keys on the host filesystem.
Description check ✅ Passed The PR description follows the template with all required sections: Summary, Changes, Testing, and Checklist. All critical information is provided including linked issue, testing details, and implementation notes for reviewers.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch PLAT-490-rag-service-api-key-file-management

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@tsivaprasad
Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 18, 2026

@tsivaprasad Sure! I'll review the changes in this PR now.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@server/internal/orchestrator/swarm/rag_service_keys_resource.go`:
- Around line 121-127: extractRAGAPIKeys currently uses pipeline names directly
to build filenames (e.g., keys[p.Name+"_embedding.key"]) and writeKeyFiles
writes them to disk, allowing path traversal via crafted pipeline names; fix by
sanitizing/validating pipeline names before using them in filenames (e.g.,
reject names containing path separators or non-alphanumeric chars, or replace
unsafe chars with underscores) and build file paths with filepath.Join then
verify the cleaned path is inside KeysDir (ensure filepath.Clean(joinedPath) has
KeysDir as a prefix) before writing; apply the same hardening wherever p.Name is
used to form filenames in extractRAGAPIKeys and writeKeyFiles.
- Around line 76-80: The Refresh loop in Refresh is only checking
os.IsNotExist(err) and ignores other stat errors; update the loop over r.Keys
(using r.KeysDir and filepath.Join) so that after calling os.Stat you: if err ==
nil continue; if os.IsNotExist(err) return resource.ErrNotFound; otherwise
return or wrap the encountered err (preserving context such as the key name) so
permission/I/O errors are surfaced instead of treated as present. Ensure you
reference the os.Stat call and return the real error for non-NotExist cases.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d7d38e6a-f745-4663-993e-e43256518cb2

📥 Commits

Reviewing files that changed from the base of the PR and between 4392b87 and 26eec5e.

📒 Files selected for processing (6)
  • server/internal/orchestrator/swarm/orchestrator.go
  • server/internal/orchestrator/swarm/rag_service_keys_resource.go
  • server/internal/orchestrator/swarm/rag_service_keys_resource_test.go
  • server/internal/orchestrator/swarm/rag_service_user_role_test.go
  • server/internal/orchestrator/swarm/resources.go
  • server/internal/orchestrator/swarm/service_spec.go

@tsivaprasad tsivaprasad force-pushed the PLAT-489-rag-service-service-user-provisioning branch 3 times, most recently from 759feca to 497d7b2 Compare March 30, 2026 14:16
@tsivaprasad tsivaprasad force-pushed the PLAT-490-rag-service-api-key-file-management branch from c384b7d to 901e64e Compare April 3, 2026 12:18
@codacy-production
Copy link
Copy Markdown

codacy-production bot commented Apr 3, 2026

Up to standards ✅

🟢 Issues 1 medium

Results:
1 new issue

Category Results
Complexity 1 medium

View in Codacy

🟢 Metrics 61 complexity · 14 duplication

Metric Results
Complexity 61
Duplication 14

View in Codacy

TIP This summary will be updated as you push new changes. Give us feedback

@tsivaprasad tsivaprasad changed the base branch from PLAT-489-rag-service-service-user-provisioning to main April 3, 2026 14:12
@tsivaprasad tsivaprasad force-pushed the PLAT-490-rag-service-api-key-file-management branch from 7a20ebb to da808b3 Compare April 3, 2026 14:48
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@server/internal/orchestrator/swarm/orchestrator.go`:
- Around line 685-703: The code creates a host-side keys directory
(RAGServiceKeysResource) but never appends a
ServiceInstanceSpecResource/ServiceInstanceResource pair so the keys never get
mounted; modify the generator to mirror the MCP/PostgREST flow by creating and
appending the same service spec/resource pair (use the same pattern that
produces ServiceInstanceSpecResource and ServiceInstanceResource) and set the
service's mount to consume the KeysPath from RAGServiceKeysResource (refer to
dataDirID, dataDir, extractRAGAPIKeys and KeysPath) before calling
buildServiceInstanceResources so the returned orchestratorResources include the
service that mounts the keys.

In `@server/internal/orchestrator/swarm/rag_service_keys_resource.go`:
- Around line 103-123: Create/Update currently only set permissions on newly
created entries, leaving existing keys/ directory or files with insecure modes;
ensure directory and files are explicitly chmod'd on every Create and Update. In
Create and Update call os.MkdirAll(keysDir, 0o700) (or ensure existence) and
then os.Chmod(keysDir, 0o700) to enforce directory perms before any writes;
after writing each key file in writeKeyFiles (or immediately after os.WriteFile)
call os.Chmod(filePath, 0o600) to enforce file perms; in Update ensure you
create/chmod the directory prior to calling removeStaleKeyFiles/writeKeyFiles so
permissions are always hardened. Include these changes in the
RAGServiceKeysResource methods Create, Update and in
writeKeyFiles/removeStaleKeyFiles as appropriate.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f2473c91-6b36-4833-bcf1-a5fadd99d2ee

📥 Commits

Reviewing files that changed from the base of the PR and between 26eec5e and da808b3.

📒 Files selected for processing (7)
  • server/internal/orchestrator/swarm/orchestrator.go
  • server/internal/orchestrator/swarm/rag_service_keys_resource.go
  • server/internal/orchestrator/swarm/rag_service_keys_resource_test.go
  • server/internal/orchestrator/swarm/rag_service_user_role.go
  • server/internal/orchestrator/swarm/rag_service_user_role_test.go
  • server/internal/orchestrator/swarm/resources.go
  • server/internal/orchestrator/swarm/service_spec.go
✅ Files skipped from review due to trivial changes (1)
  • server/internal/orchestrator/swarm/rag_service_keys_resource_test.go
🚧 Files skipped from review as they are similar to previous changes (3)
  • server/internal/orchestrator/swarm/resources.go
  • server/internal/orchestrator/swarm/service_spec.go
  • server/internal/orchestrator/swarm/rag_service_user_role_test.go

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
server/internal/orchestrator/swarm/orchestrator.go (1)

686-704: ⚠️ Potential issue | 🔴 Critical

RAG still never emits the service resources that would mount these keys.

This path appends only RO-role and filesystem resources, then returns; buildServiceInstanceResources just serializes that slice. Unlike generateMCPInstanceResources, there is no ServiceInstanceSpecResource/ServiceInstanceResource, so nothing can pass KeysPath into the container or start the RAG service at all.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/internal/orchestrator/swarm/orchestrator.go` around lines 686 - 704,
The RAG path builds only filesystem and keys resources (RAGServiceKeysResource)
then returns via buildServiceInstanceResources, so no
ServiceInstanceSpecResource/ServiceInstanceResource is emitted and KeysPath
never reaches the container; fix by mirroring the pattern used in
generateMCPInstanceResources: create and append a ServiceInstanceSpecResource
with KeysPath set to the keys subdir and then a ServiceInstanceResource (or
ServiceInstanceSpecResource wrapper) that references the same ServiceInstanceID
so the orchestrator will mount the keys and start the RAG service; ensure you
append these resources to orchestratorResources before calling
buildServiceInstanceResources.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@server/internal/orchestrator/swarm/orchestrator.go`:
- Around line 686-704: The RAG path builds only filesystem and keys resources
(RAGServiceKeysResource) then returns via buildServiceInstanceResources, so no
ServiceInstanceSpecResource/ServiceInstanceResource is emitted and KeysPath
never reaches the container; fix by mirroring the pattern used in
generateMCPInstanceResources: create and append a ServiceInstanceSpecResource
with KeysPath set to the keys subdir and then a ServiceInstanceResource (or
ServiceInstanceSpecResource wrapper) that references the same ServiceInstanceID
so the orchestrator will mount the keys and start the RAG service; ensure you
append these resources to orchestratorResources before calling
buildServiceInstanceResources.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 82bd302d-2d12-4869-b757-f61e6e25480b

📥 Commits

Reviewing files that changed from the base of the PR and between da808b3 and f8f2987.

📒 Files selected for processing (2)
  • server/internal/orchestrator/swarm/orchestrator.go
  • server/internal/orchestrator/swarm/rag_service_user_role_test.go
✅ Files skipped from review due to trivial changes (1)
  • server/internal/orchestrator/swarm/rag_service_user_role_test.go

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
server/internal/orchestrator/swarm/rag_service_keys_resource.go (1)

103-123: ⚠️ Potential issue | 🟠 Major

Permission hardening in Create/Update is still incomplete.

Create/Update should enforce directory and file modes on every reconciliation and Update should ensure keysDir exists before stale cleanup/write. This concern was raised earlier and is still unresolved in the current patch.

🔒 Proposed hardening
 func (r *RAGServiceKeysResource) Create(ctx context.Context, rc *resource.Context) error {
 	keysDir, err := r.keysDir(rc)
 	if err != nil {
 		return err
 	}
 	if err := os.MkdirAll(keysDir, 0o700); err != nil {
 		return fmt.Errorf("failed to create keys directory: %w", err)
 	}
+	if err := os.Chmod(keysDir, 0o700); err != nil {
+		return fmt.Errorf("failed to set keys directory permissions: %w", err)
+	}
 	return r.writeKeyFiles(keysDir)
 }

 func (r *RAGServiceKeysResource) Update(ctx context.Context, rc *resource.Context) error {
 	keysDir, err := r.keysDir(rc)
 	if err != nil {
 		return err
 	}
+	if err := os.MkdirAll(keysDir, 0o700); err != nil {
+		return fmt.Errorf("failed to create keys directory: %w", err)
+	}
+	if err := os.Chmod(keysDir, 0o700); err != nil {
+		return fmt.Errorf("failed to set keys directory permissions: %w", err)
+	}
 	if err := r.removeStaleKeyFiles(keysDir); err != nil {
 		return err
 	}
 	return r.writeKeyFiles(keysDir)
 }
@@
 		path := filepath.Join(keysDir, name)
 		if err := os.WriteFile(path, []byte(key), 0o600); err != nil {
 			return fmt.Errorf("failed to write key file %q: %w", name, err)
 		}
+		if err := os.Chmod(path, 0o600); err != nil {
+			return fmt.Errorf("failed to set key file %q permissions: %w", name, err)
+		}
 	}

Based on learnings: "implement idempotent provisioning semantics: re-running ProvisionServices on already-provisioned service instances should be safe and non-destructive."

Also applies to: 140-149

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/internal/orchestrator/swarm/rag_service_keys_resource.go` around lines
103 - 123, Create/Update currently don't enforce idempotent permission
hardening: ensure keysDir exists (create with 0o700) on every reconciliation and
enforce file modes after writing; in RAGServiceKeysResource.Create and
RAGServiceKeysResource.Update call mkdir (os.MkdirAll) with 0o700 before any
cleanup/write, have Update create keysDir before removeStaleKeyFiles, and after
r.writeKeyFiles ensure each key file written by writeKeyFiles is chmod'ed to a
strict mode (e.g., 0o600) or have writeKeyFiles return file paths so caller can
set modes; also ensure directory permissions are re-applied (os.Chmod(keysDir,
0o700)) so repeats are idempotent and non-destructive.
🧹 Nitpick comments (1)
server/internal/orchestrator/swarm/rag_service_keys_resource_test.go (1)

297-315: Add a directory-permission assertion in Create test.

At Line 297, the test verifies key file modes but not the keys/ directory mode. Add a Stat(keysDir) assertion for 0700 so permission regressions are caught.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/internal/orchestrator/swarm/rag_service_keys_resource_test.go` around
lines 297 - 315, Add a directory permission assertion for the keysDir in the
test: after constructing keysDir (as in the existing test in
rag_service_keys_resource_test.go) call os.Stat(keysDir) and assert its
Mode().Perm() equals 0o700 (0700) (use t.Errorf on mismatch), analogous to the
existing per-file Stat checks for each key name in r.Keys so directory
permission regressions are caught.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@server/internal/orchestrator/swarm/rag_service_keys_resource.go`:
- Around line 91-97: The loop in Refresh that iterates over r.Keys uses
filepath.Join(keysDir, name) without validating the key filename, allowing
directory traversal; before calling os.Stat or joining the path validate each
key name by calling validateKeyFilename(name) (the same validation used
elsewhere), and if validation fails return resource.ErrNotFound (or an
appropriate error) so malformed names are rejected and you never stat outside
keysDir; update the loop in the Refresh method to perform this validation for
each name.

---

Duplicate comments:
In `@server/internal/orchestrator/swarm/rag_service_keys_resource.go`:
- Around line 103-123: Create/Update currently don't enforce idempotent
permission hardening: ensure keysDir exists (create with 0o700) on every
reconciliation and enforce file modes after writing; in
RAGServiceKeysResource.Create and RAGServiceKeysResource.Update call mkdir
(os.MkdirAll) with 0o700 before any cleanup/write, have Update create keysDir
before removeStaleKeyFiles, and after r.writeKeyFiles ensure each key file
written by writeKeyFiles is chmod'ed to a strict mode (e.g., 0o600) or have
writeKeyFiles return file paths so caller can set modes; also ensure directory
permissions are re-applied (os.Chmod(keysDir, 0o700)) so repeats are idempotent
and non-destructive.

---

Nitpick comments:
In `@server/internal/orchestrator/swarm/rag_service_keys_resource_test.go`:
- Around line 297-315: Add a directory permission assertion for the keysDir in
the test: after constructing keysDir (as in the existing test in
rag_service_keys_resource_test.go) call os.Stat(keysDir) and assert its
Mode().Perm() equals 0o700 (0700) (use t.Errorf on mismatch), analogous to the
existing per-file Stat checks for each key name in r.Keys so directory
permission regressions are caught.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 09319c5f-063b-4cd9-bf98-f87d1bfcb5fd

📥 Commits

Reviewing files that changed from the base of the PR and between f8f2987 and fa8d812.

📒 Files selected for processing (2)
  • server/internal/orchestrator/swarm/rag_service_keys_resource.go
  • server/internal/orchestrator/swarm/rag_service_keys_resource_test.go

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
server/internal/orchestrator/swarm/rag_service_keys_resource_test.go (1)

323-355: Add a regression test for non-destructive Update failure path.

Current tests validate successful rotation/removal, but not the case where Update fails due to an invalid desired filename. A targeted test here would lock in the expectation that existing keys are not removed on validation failure.

🧪 Suggested test case
+func TestRAGServiceKeysResource_Update_InvalidFilename_DoesNotDeleteExistingKeys(t *testing.T) {
+	parentID := "inst1-data"
+	rc, parentPath := ragKeysRCWithTempDir(t, parentID)
+
+	r := &RAGServiceKeysResource{
+		ServiceInstanceID: "inst1",
+		HostID:            "host-1",
+		ParentID:          parentID,
+		Keys:              map[string]string{"default_rag.key": "sk-v1"},
+	}
+	if err := r.Create(context.Background(), rc); err != nil {
+		t.Fatalf("Create() error = %v", err)
+	}
+
+	r.Keys = map[string]string{"sub/dir.key": "sk-bad"} // invalid filename
+	if err := r.Update(context.Background(), rc); err == nil {
+		t.Fatal("Update() = nil, want error")
+	}
+
+	keysDir := filepath.Join(parentPath, "keys")
+	if _, err := os.Stat(filepath.Join(keysDir, "default_rag.key")); err != nil {
+		t.Fatalf("default_rag.key should remain after failed Update, got err = %v", err)
+	}
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/internal/orchestrator/swarm/rag_service_keys_resource_test.go` around
lines 323 - 355, Add a regression test that ensures Update is non-destructive
when validation fails: use ragKeysRCWithTempDir to create an initial
RAGServiceKeysResource (call Create) with a valid key file (e.g., "old_rag.key"
-> "sk-old"), then set r.Keys to an invalid desired filename (e.g., "../bad" or
another name that will trigger validation in RAGServiceKeysResource.Update) and
call Update expecting an error; assert the error is returned and that the
original file still exists with unchanged content (check
filepath.Join(parentPath,"keys","old_rag.key") exists and contains "sk-old") and
that no new invalid file was created. This locks in the guarantee that Update
does not remove/overwrite existing keys on validation failure.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@server/internal/orchestrator/swarm/rag_service_keys_resource.go`:
- Around line 193-196: The validateKeyFilename function currently allows "." and
".." which can create unsafe path resolution; update validateKeyFilename to
explicitly reject the dot-segments by returning an error when name == "." or
name == ".." (in addition to the existing filepath.Clean/IsAbs/ContainsAny
checks) so Refresh and any write paths cannot use these special segments.
- Around line 120-134: In Update, validate all desired key filenames before
performing any destructive work: after obtaining keysDir (in
RAGServiceKeysResource.Update) and before calling removeStaleKeyFiles, run a
validation step that checks each desired filename produced/used by writeKeyFiles
(e.g., validate keys' names/paths) and return an error on invalid names; only if
validation passes proceed to call removeStaleKeyFiles and then writeKeyFiles.
Apply the same ordering/validation change to the analogous block around the
other Update usage (the code at the 152-156 region) so deletions never happen
before filenames are validated.

---

Nitpick comments:
In `@server/internal/orchestrator/swarm/rag_service_keys_resource_test.go`:
- Around line 323-355: Add a regression test that ensures Update is
non-destructive when validation fails: use ragKeysRCWithTempDir to create an
initial RAGServiceKeysResource (call Create) with a valid key file (e.g.,
"old_rag.key" -> "sk-old"), then set r.Keys to an invalid desired filename
(e.g., "../bad" or another name that will trigger validation in
RAGServiceKeysResource.Update) and call Update expecting an error; assert the
error is returned and that the original file still exists with unchanged content
(check filepath.Join(parentPath,"keys","old_rag.key") exists and contains
"sk-old") and that no new invalid file was created. This locks in the guarantee
that Update does not remove/overwrite existing keys on validation failure.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 131486d4-8056-4359-9576-78b7da08721a

📥 Commits

Reviewing files that changed from the base of the PR and between fa8d812 and ea3a626.

📒 Files selected for processing (2)
  • server/internal/orchestrator/swarm/rag_service_keys_resource.go
  • server/internal/orchestrator/swarm/rag_service_keys_resource_test.go

@tsivaprasad tsivaprasad requested a review from jason-lynch April 4, 2026 14:03
Copy link
Copy Markdown
Member

@jason-lynch jason-lynch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! Thank you for addressing the other comments about the key file lifecycle. Most of my questions are pretty minor, but I do want to be really careful about the database user stuff.

if len(spec.DatabaseNodes) > 1 {
for _, nodeInst := range spec.DatabaseNodes[1:] {
perNodeRWID := ServiceUserRolePerNodeIdentifier(spec.ServiceSpec.ServiceID, ServiceUserRoleRW, nodeInst.NodeName)
for _, nodeInst := range spec.DatabaseNodes {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean to include these changes? If so, could you please request a review from Ryan as well? The way we manage these users is pretty tricky, and I want to make sure we don't accidentally break another service.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes @jason-lynch , this change is intentional. The old DatabaseNodes[1:] assumed the canonical node is always first, but for RAG the canonical is the co-located node (not necessarily index 0) — so we now skip by name instead. This produces identical behavior for MCP/PostgREST since their canonical is always nodeInstances[0]. Requesting @rshoemaker review now.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks ok.

return nil
}

// validateKeyFilename rejects filenames that could escape the keys directory via path traversal.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we validate these in the API layer as well? It might be worth being very restrictive about allowed characters in these key names.

That could definitely be a separate PR if not, I'm mostly curious what the current state is.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jason-lynch — makes sense. I’ve created a separate ticket: https://pgedge.atlassian.net/browse/PLAT-536

However, there are currently no pipeline name restrictions in the RAG server.

@tsivaprasad tsivaprasad requested a review from rshoemaker April 7, 2026 07:40
Copy link
Copy Markdown
Contributor

@rshoemaker rshoemaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

@tsivaprasad tsivaprasad merged commit acaaa66 into main Apr 7, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants