Skip to content

feat: Global Load Balancer integration, Multi-tenancy, and CI/CD optimizations#46

Merged
poyrazK merged 58 commits intomainfrom
feature/global-load-balancer
Feb 3, 2026
Merged

feat: Global Load Balancer integration, Multi-tenancy, and CI/CD optimizations#46
poyrazK merged 58 commits intomainfrom
feature/global-load-balancer

Conversation

@poyrazK
Copy link
Copy Markdown
Owner

@poyrazK poyrazK commented Feb 3, 2026

Overview

This PR transitions the Global Load Balancer service to a real-world integration testing strategy, implements multi-tenancy across the stack, and optimizes the CI/CD pipeline for faster execution.

Key Changes

  • Global Load Balancer: Switched from mocks to real PostgreSQL integration tests. Added multi-user isolation and cross-service security residency checks.
  • Multi-tenancy: Updated domains, ports, services, and repositories to support and enforcement.
  • CI/CD Optimizations:
    • Consolidated race detection jobs using a matrix.
    • Enabled Docker layer caching via the GitHub Actions () backend.
    • Added concurrency management to automatically cancel redundant runs.
  • Code Quality: Synchronized Swagger documentation and resolved all Smart, fast linters runner.

Usage:
golangci-lint [flags]
golangci-lint [command]

Available Commands:
cache Cache control and information.
completion Generate the autocompletion script for the specified shell
config Configuration file information and verification.
custom Build a version of golangci-lint with custom linters.
fmt Format Go source files.
formatters List current formatters configuration.
help Display extra help
linters List current linters configuration.
migrate Migrate configuration file from v1 to v2.
run Lint the code.
version Display the golangci-lint version.

Flags:
--color string Use color when printing; can be 'always', 'auto', or 'never' (default "auto")
-h, --help Help for a command
-v, --verbose Verbose output
--version Print version

Use "golangci-lint [command] --help" for more information about a command. issues.

Summary by CodeRabbit

  • New Features

    • Global Load Balancer: DNS-based multi-region traffic steering with per-endpoint health checks and routing policies; API to create/manage GLBs and endpoints.
  • Documentation

    • Added GLB guides, ADR, architecture updates, API reference and README entries; OpenAPI/Swagger updated.
  • Tests

    • New unit/integration tests and test mocks covering GLB service and handlers.
  • Chores

    • CI/CD: workflow concurrency, Docker Buildx, image caching, and improved e2e startup/health validation.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

Note

Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (7)
internal/repositories/postgres/rbac_repo_test.go (1)

15-23: ⚠️ Potential issue | 🟡 Minor

Surface cleanup errors instead of discarding them.

The cleanup section uses blank identifiers, which hides SQL failures and can leave stale data behind.

Suggested change
-	_, _ = db.Exec(ctx, "DELETE FROM role_permissions")
-	_, _ = db.Exec(ctx, "DELETE FROM roles")
+	_, err := db.Exec(ctx, "DELETE FROM role_permissions")
+	require.NoError(t, err)
+	_, err = db.Exec(ctx, "DELETE FROM roles")
+	require.NoError(t, err)

As per coding guidelines: Do not use silent failures - avoid blank identifier assignment like _ = someFunc().

internal/repositories/postgres/secret_repo_test.go (1)

17-26: ⚠️ Potential issue | 🟡 Minor

Don’t ignore cleanup errors when deleting secrets.

The cleanup currently discards errors, which can mask data leakage between tests.

Suggested change
-	_, _ = db.Exec(context.Background(), "DELETE FROM secrets")
+	_, err := db.Exec(context.Background(), "DELETE FROM secrets")
+	require.NoError(t, err)

As per coding guidelines: Do not use silent failures - avoid blank identifier assignment like _ = someFunc().

internal/repositories/postgres/test_helpers.go (1)

91-126: ⚠️ Potential issue | 🟡 Minor

Only ignore missing‑table errors during cleanup; fail on other errors.

The current code logs and continues for any error, masking real cleanup failures. Implement specific error checking to only suppress PostgreSQL error code 42P01 (undefined_table), and fail on all other errors:

Suggested change
 import (
 	"context"
+	"errors"
 	"log/slog"
 	"os"
 	"testing"
 	"time"

 	"github.com/google/uuid"
+	"github.com/jackc/pgx/v5/pgconn"
 	"github.com/jackc/pgx/v5/pgxpool"
 	appcontext "github.com/poyrazk/thecloud/internal/core/context"
 	"github.com/poyrazk/thecloud/internal/core/domain"
 	"github.com/pkg/testutil"
 	"github.com/stretchr/testify/require"
 )
@@
 	for _, q := range queries {
 		_, err := db.Exec(ctx, q)
-		// Suppress errors if a table does not exist (PostgreSQL error 42P01).
-		// This ensures stability when migrations are partially applied.
 		if err != nil {
-			t.Logf("Cleanup query failed (ignoring): %s - %v", q, err)
+			var pgErr *pgconn.PgError
+			if errors.As(err, &pgErr) && pgErr.Code == "42P01" {
+				t.Logf("Cleanup query failed (missing table, ignoring): %s - %v", q, err)
+				continue
+			}
+			require.NoError(t, err)
 		}
 	}
internal/repositories/postgres/multitenancy_integration_test.go (1)

17-24: ⚠️ Potential issue | 🟡 Minor

Check TRUNCATE errors instead of discarding them.
Ignoring cleanup failures can invalidate test results and mask database issues.

Proposed fix
-	_, _ = db.Exec(ctx, "TRUNCATE users, tenants, tenant_members, vpcs, instances, volumes CASCADE")
+	_, err := db.Exec(ctx, "TRUNCATE users, tenants, tenant_members, vpcs, instances, volumes CASCADE")
+	require.NoError(t, err)
internal/repositories/postgres/identity_repo_test.go (1)

16-38: ⚠️ Potential issue | 🟡 Minor

Handle cleanup errors instead of silently discarding them.
Line 23 ignores the Exec error using _, _, which violates the guideline: "Do not use silent failures - avoid blank identifier assignment like _ = someFunc()". Swallowing this error can leave stale data and cause non-deterministic test results.

Proposed fix
-	// Cleanup
-	_, _ = db.Exec(context.Background(), "DELETE FROM api_keys")
+	// Cleanup
+	_, err := db.Exec(context.Background(), "DELETE FROM api_keys")
+	require.NoError(t, err)
@@
-	err := userRepo.Create(context.Background(), testUser)
+	err = userRepo.Create(context.Background(), testUser)
internal/repositories/postgres/function_repo_test.go (1)

17-27: ⚠️ Potential issue | 🟡 Minor

Avoid silent cleanup failures.
The cleanup Exec calls at lines 25-26 discard errors; failed cleanup can hide schema/data issues and make the test flaky. This violates the guideline: do not use silent failures with blank identifier assignment.

Proposed fix
-	// Cleanup
-	_, _ = db.Exec(context.Background(), "DELETE FROM invocations")
-	_, _ = db.Exec(context.Background(), "DELETE FROM functions")
+	// Cleanup
+	_, err := db.Exec(context.Background(), "DELETE FROM invocations")
+	require.NoError(t, err)
+	_, err = db.Exec(context.Background(), "DELETE FROM functions")
+	require.NoError(t, err)
internal/api/setup/dependencies.go (1)

258-269: ⚠️ Potential issue | 🔴 Critical

Assign the DNS service to Services.DNS in the svcs initialization.

The DNS field is defined in the Services struct (line 140) and the dnsSvc is created (lines 197–200), but it is never assigned in the svcs struct literal at lines 258–269. When router.go:106 attempts to initialize the DNS handler with svcs.DNS, it will be nil, causing a panic at runtime. Add DNS: dnsSvc, to the svcs initialization.

🤖 Fix all issues with AI agents
In `@docs/swagger/swagger.yaml`:
- Around line 2876-3033: The OpenAPI operations for the Global Load Balancer
routes (paths: /global-lb, /global-lb/{id}, /global-lb/{id}/endpoints,
/global-lb/{id}/endpoints/{epID}) are missing the security annotation; update
each operation (get/post/delete under those paths) to include security: -
APIKeyAuth: [] so they are treated as protected endpoints (ensure the APIKeyAuth
security scheme exists elsewhere in the spec); add the annotation under the
operation objects for Create/List/Get/Delete/AddEndpoint/RemoveEndpoint to match
other protected endpoints.

In `@internal/adapters/dns/powerdns_geo.go`:
- Around line 41-43: CreateGeoRecord currently returns nil when the incoming
records slice is empty, leaving stale A records in PowerDNS; change
CreateGeoRecord to detect len(records)==0 and call DeleteGeoRecord to remove any
existing geo A records for the given fqdn/zone instead of returning early. Also
extract the hard-coded TTL into a package-level constant named MaxDNSRecordTTL
(or similar) and use that constant in CreateGeoRecord. Finally update
DeleteGeoRecord so it does not silently swallow invalid-hostname errors —
propagate and return errors instead of returning nil for invalid inputs so
callers (like CreateGeoRecord) can react to deletion failures.

In `@internal/core/services/global_lb.go`:
- Around line 25-39: Change NewGlobalLBService to accept a single params struct
(e.g., GlobalLBServiceParams) instead of five separate arguments: define a
params type with fields for repo (ports.GlobalLBRepository), lbRepo
(ports.LBRepository), geoDNS (ports.GeoDNSBackend), auditSvc
(ports.AuditService) and logger (*slog.Logger), then update NewGlobalLBService
to take that params struct and assign its fields to the GlobalLBService struct
(repo, lbRepo, geoDNS, auditSvc, logger). Ensure all call sites constructing
GlobalLBService are updated to pass the params struct.
- Around line 133-187: Load the Global Load Balancer before creating the
endpoint in AddEndpoint: call s.Get(ctx, glbID) early, check error and return
wrapped error if not found, then verify the returned glb.UserID (and tenant if
applicable) matches appcontext.UserIDFromContext(ctx) to authorize the caller;
only after ownership is confirmed, create and persist the domain.GlobalEndpoint
(s.repo.AddEndpoint). After adding, reuse the previously loaded glb for the DNS
refresh (avoid calling s.Get again) and guard the s.geoDNS.CreateGeoRecord call
on the loaded glb being non-nil and on CreateGeoRecord returning an error,
logging appropriately. Ensure any pointer dereferences (e.g., targetID) remain
guarded as before.
- Around line 48-52: The call to s.repo.GetByHostname currently ignores
repository errors; update the check in the create/update flow to treat
non-NotFound errors as internal failures and only accept a NotFound as OK: after
calling s.repo.GetByHostname(ctx, hostname) use errors.Is(err, errors.NotFound)
to allow not-found, but if err != nil and not NotFound return a wrapped internal
error (e.g., errors.Wrap(errors.Internal, "failed to check hostname uniqueness",
err)); then if existing != nil return the Conflict error as before. This change
should be applied around the existing GetByHostname call (variables existing,
err and method s.repo.GetByHostname).

In `@internal/handlers/global_lb_handler.go`:
- Around line 198-201: The RemoveEndpoint handler currently ignores the route
param by assigning _ = c.Param("id"); replace that with actual validation: read
id := c.Param("id"), reject empty or malformed values (e.g., validate
UUID/expected format) and return a 400 on invalid input; then enforce that the
endpoint being removed belongs to that GLB by fetching the endpoint (e.g.,
GetEndpointByID / endpointService.Get) and comparing its GlbID to the route id,
returning a 404/400 if they don’t match or call the service with both endpointID
and glbID (e.g., globalLBService.RemoveEndpoint(endpointID, glbID)) so the
service can enforce the association; remove the blank identifier assignment and
use proper error logging and responses in RemoveEndpoint.

In `@internal/repositories/postgres/global_lb_repo_integration_test.go`:
- Around line 72-75: The tests call repo.List and repo.Delete without the
required userID parameter; update the calls to pass the existing userID variable
from the test context (declare/obtain at top) so signatures match the repository
interface—e.g., change repo.List(ctx) to repo.List(ctx, userID) and
repo.Delete(ctx, userID, id) (adjust the Delete call to include userID before
the resource id) referencing the repo.List and repo.Delete methods in the test.

In `@internal/repositories/postgres/global_lb_repo.go`:
- Around line 56-72: The List method in globalLBRepository iterates rows but
doesn't check rows.Err() after the loop, which can hide iteration errors; update
the List(ctx, userID uuid.UUID) implementation to call and return rows.Err() (or
wrap it) after the for rows.Next() loop before returning the list, and ensure
rows.Close() is still deferred; apply the same post-iteration rows.Err() check
to the other repository method that uses scanGlobalLB and iterates rows (the
analogous rows-processing block referenced in the review).
- Around line 41-58: The queries in GetByID, GetByHostname and List use "SELECT
*" which breaks scanGlobalLB when schema changes; replace each "SELECT * FROM
global_load_balancers ..." with an explicit, ordered column list that exactly
matches the field order expected by scanGlobalLB (e.g., id, user_id, hostname,
created_at, updated_at, ...), ensuring the same column order in all three
methods so scanGlobalLB continues to map fields correctly.
- Around line 41-99: The service is exposing GLB data without verifying
ownership: update the Get() and AddEndpoint() flows to enforce user ownership by
either (A) extracting userID from ctx inside the service methods and returning
an error if the GLB's UserID != userID (follow the Delete() pattern), or (B)
change repository lookups to include user_id filters by adding userID parameters
to GetByID() and GetByHostname() and using those in the SQL WHERE clauses; also
ensure the HTTP handler passes the authenticated userID into the service if you
choose option B. Locate uses of Get(), AddEndpoint(), GetByID(),
GetByHostname(), and Delete() to implement the ownership check consistently and
return a clear permission error when ownership fails.
🟡 Minor comments (17)
.github/workflows/e2e.yml-56-56 (1)

56-56: ⚠️ Potential issue | 🟡 Minor

Typo in comment: "thecompose" → "the compose".

📝 Proposed fix
-      # The 'api' service in thecompose file already uses the latest build
+      # The 'api' service in the compose file already uses the latest build
tests/gateway_e2e_test.go-47-48 (1)

47-48: ⚠️ Potential issue | 🟡 Minor

Don't silently ignore response body close errors.

_ = resp.Body.Close() (and deferred wrappers that discard errors) violates the no-silent-failure guideline and can mask transport issues in tests. Apply a helper function to enforce error handling across all 20 occurrences.

Per coding guidelines: "Do not use silent failures - avoid blank identifier assignment like _ = someFunc()."

Suggested pattern
+func closeBody(t *testing.T, resp *http.Response) {
+	t.Helper()
+	if err := resp.Body.Close(); err != nil {
+		t.Fatalf("close response body: %v", err)
+	}
+}

-defer func() { _ = resp.Body.Close() }()
+t.Cleanup(func() { closeBody(t, resp) })

-_ = resp.Body.Close()
+closeBody(t, resp)

Applies to lines 47, 59, 86, 112, 121, 128, 146, 153, 179, 187, 210, 222, 231, 253, 261, 283, 293, 302, 315, 329

docs/FEATURES.md-150-162 (1)

150-162: ⚠️ Potential issue | 🟡 Minor

Renumber sections after adding GLB.

“Managed Kubernetes” still shows as 9, so numbering is now duplicated. Consider bumping subsequent section numbers to keep the outline consistent.

docs/services/cloud-global-lb.md-16-18 (1)

16-18: ⚠️ Potential issue | 🟡 Minor

Fix “hexadecimal” → “hexagonal” architecture typo.

Tiny wording fix for accuracy.

✏️ Suggested edit
-GLB follows the standard hexadecimal architecture of the platform:
+GLB follows the standard hexagonal architecture of the platform:
internal/handlers/global_lb_handler.go-42-147 (1)

42-147: ⚠️ Potential issue | 🟡 Minor

Add @Security directives to Swagger annotations.
Handlers include @Summary, @Tags, and @Router, but the @Security directive is missing.

As per coding guidelines: Include Swagger documentation comments on handler methods with @Summary, @Tags, @Security, and @Router directives.

Also applies to: 188-196

internal/adapters/dns/powerdns_geo.go-58-62 (1)

58-62: ⚠️ Potential issue | 🟡 Minor

Return an error for invalid hostnames in DeleteGeoRecord.
Currently it silently succeeds on invalid input, which hides failures.

🛠️ Suggested fix
 	parts := strings.Split(hostname, ".")
 	if len(parts) < 2 {
-		return nil
+		return fmt.Errorf("invalid hostname: %s", hostname)
 	}

As per coding guidelines: Do not use silent failures - avoid blank identifier assignment like _ = someFunc().

internal/core/services/global_lb_test.go-76-77 (1)

76-77: ⚠️ Potential issue | 🟡 Minor

Don’t ignore Create errors in the list filtering test.
Ignoring errors can make the test pass even when setup fails.

🛠️ Suggested fix
-		_, _ = svc.Create(ctx1, "lb1", "lb1.com", domain.RoutingLatency, domain.GlobalHealthCheckConfig{})
-		_, _ = svc.Create(ctx2, "lb2", "lb2.com", domain.RoutingLatency, domain.GlobalHealthCheckConfig{})
+		_, err := svc.Create(ctx1, "lb1", "lb1.com", domain.RoutingLatency, domain.GlobalHealthCheckConfig{})
+		require.NoError(t, err)
+		_, err = svc.Create(ctx2, "lb2", "lb2.com", domain.RoutingLatency, domain.GlobalHealthCheckConfig{})
+		require.NoError(t, err)

As per coding guidelines: Do not use silent failures - avoid blank identifier assignment like _ = someFunc().

internal/handlers/global_lb_handler.go-160-168 (1)

160-168: ⚠️ Potential issue | 🟡 Minor

Avoid magic defaults for endpoint weight/priority.
Consider named constants (or config) for defaults.

🛠️ Suggested fix
+const (
+	defaultEndpointWeight   = 1
+	defaultEndpointPriority = 1
+)
@@
-	weight := 1
+	weight := defaultEndpointWeight
 	if req.Weight > 0 {
 		weight = req.Weight
 	}
-	priority := 1
+	priority := defaultEndpointPriority
 	if req.Priority > 0 {
 		priority = req.Priority
 	}

As per coding guidelines: Do not use magic numbers - use named constants instead.

internal/adapters/dns/powerdns_geo.go-47-52 (1)

47-52: ⚠️ Potential issue | 🟡 Minor

Replace hard-coded TTL with a named constant (or config).
This makes TTL intent clearer and easier to adjust.

🛠️ Suggested fix
+const defaultGeoDNSTTL = 60
@@
 	rs := ports.RecordSet{
 		Name:    hostname,
 		Type:    "A",
-		TTL:     60, // Short TTL for dynamic responses
+		TTL:     defaultGeoDNSTTL, // Short TTL for dynamic responses
 		Records: records,
 	}

As per coding guidelines: Do not use magic numbers - use named constants instead.

internal/handlers/global_lb_handler.go-65-66 (1)

65-66: ⚠️ Potential issue | 🟡 Minor

Replace numeric HTTP status codes with http.Status* constants.

All instances of magic numbers (201, 200, 204) in httputil.Success() calls should use the corresponding Go standard library constants for clarity and consistency with the project's coding guidelines.

🛠️ Suggested fix
 import (
+	"net/http"
 	"github.com/gin-gonic/gin"
@@
-	httputil.Success(c, 201, glb)
+	httputil.Success(c, http.StatusCreated, glb)
@@
-	httputil.Success(c, 200, glb)
+	httputil.Success(c, http.StatusOK, glb)
@@
-	httputil.Success(c, 200, glbs)
+	httputil.Success(c, http.StatusOK, glbs)
@@
-	httputil.Success(c, 204, nil)
+	httputil.Success(c, http.StatusNoContent, nil)
@@
-	httputil.Success(c, 201, ep)
+	httputil.Success(c, http.StatusCreated, ep)
@@
-	httputil.Success(c, 204, nil)
+	httputil.Success(c, http.StatusNoContent, nil)

Occurs at lines 65, 90, 108, 132, 185, 213.

docs/swagger/docs.go-1856-2096 (1)

1856-2096: ⚠️ Potential issue | 🟡 Minor

Add auth metadata for /global-lb endpoints in Swagger.

These new paths omit security: APIKeyAuth while other authenticated routes include it. If GLB endpoints are protected, add @Security APIKeyAuth in the handler annotations so swag regenerates the spec correctly.

internal/core/services/global_lb_integration_test.go-75-90 (1)

75-90: ⚠️ Potential issue | 🟡 Minor

Fix error-ignoring patterns in test setup calls.

Three service calls ignore returned errors using blank identifiers, which can hide setup failures and cause confusing failures later:

  • Line 84: listA, _ := svc.List(ctxA, userA)
  • Line 153: glb, _ := svc.Create(ctx, "life", ...)
  • Line 157: ep1, _ := svc.AddEndpoint(ctx, glb.ID, ...)

Use require.NoError(t, err) to catch setup errors immediately.

Example fixes
-		listA, _ := svc.List(ctxA, userA)
+		listA, err := svc.List(ctxA, userA)
+		require.NoError(t, err)
 		assert.Len(t, listA, 1)
-		glb, _ := svc.Create(ctx, "life", hostname, domain.RoutingLatency, domain.GlobalHealthCheckConfig{Protocol: "HTTP", Port: 80})
+		glb, err := svc.Create(ctx, "life", hostname, domain.RoutingLatency, domain.GlobalHealthCheckConfig{Protocol: "HTTP", Port: 80})
+		require.NoError(t, err)
 
 		// Add 2 endpoints
 		ip1 := "1.2.3.4"
 		ip2 := "5.6.7.8"
-		ep1, _ := svc.AddEndpoint(ctx, glb.ID, "us-east-1", "IP", nil, &ip1, 1, 1)
+		ep1, err := svc.AddEndpoint(ctx, glb.ID, "us-east-1", "IP", nil, &ip1, 1, 1)
+		require.NoError(t, err)
 		_, err := svc.AddEndpoint(ctx, glb.ID, "eu-west-1", "IP", nil, &ip2, 1, 1)
 		require.NoError(t, err)
internal/repositories/mock/global_lb_repo.go-63-68 (1)

63-68: ⚠️ Potential issue | 🟡 Minor

Return explicit errors in mock Delete for not-found and unauthorized cases.

Delete currently returns nil regardless of whether the LB exists or the userID matches. This silently masks scenarios that should be errors, making tests less robust. Return appropriate errors to keep the mock defensive and aligned with the guideline against silent failures.

🔧 Example: return explicit errors
+import "errors"
 
 func (m *MockGlobalLBRepo) Delete(ctx context.Context, id uuid.UUID, userID uuid.UUID) error {
-	if glb, ok := m.GLBs[id]; ok && glb.UserID == userID {
-		delete(m.GLBs, id)
-	}
-	return nil
+	glb, ok := m.GLBs[id]
+	if !ok {
+		return errors.New("global load balancer not found")
+	}
+	if glb.UserID != userID {
+		return errors.New("unauthorized")
+	}
+	delete(m.GLBs, id)
+	return nil
 }
docs/swagger/swagger.yaml-3007-3033 (1)

3007-3033: ⚠️ Potential issue | 🟡 Minor

Missing 404 response for DELETE /global-lb/{id}/endpoints/{epID}.

The endpoint deletion should return 404 when either the GLB or the endpoint doesn't exist, similar to the POST endpoint on /global-lb/{id}/endpoints which includes a 404 response (line 2996-2999).

📝 Proposed fix
   /global-lb/{id}/endpoints/{epID}:
     delete:
       parameters:
       - description: Global LB ID
         in: path
         name: id
         required: true
         type: string
       - description: Endpoint ID
         in: path
         name: epID
         required: true
         type: string
       responses:
         "204":
           description: No Content
         "400":
           description: Bad Request
           schema:
             $ref: '#/definitions/httputil.Response'
+        "404":
+          description: Not Found
+          schema:
+            $ref: '#/definitions/httputil.Response'
         "500":
           description: Internal Server Error
           schema:
             $ref: '#/definitions/httputil.Response'
docs/swagger/swagger.yaml-2922-2943 (1)

2922-2943: ⚠️ Potential issue | 🟡 Minor

Missing 404 response for DELETE /global-lb/{id}.

The DELETE endpoint for a specific GLB should return 404 when the resource doesn't exist, consistent with the GET endpoint on the same path (lines 2962-2965) and other DELETE operations in this spec (e.g., /dns/zones/{id} at line 2686).

📝 Proposed fix
   /global-lb/{id}:
     delete:
       parameters:
       - description: Global LB ID
         in: path
         name: id
         required: true
         type: string
       responses:
         "204":
           description: No Content
         "400":
           description: Bad Request
           schema:
             $ref: '#/definitions/httputil.Response'
+        "404":
+          description: Not Found
+          schema:
+            $ref: '#/definitions/httputil.Response'
         "500":
           description: Internal Server Error
           schema:
             $ref: '#/definitions/httputil.Response'
internal/core/services/global_lb.go-79-82 (1)

79-82: ⚠️ Potential issue | 🟡 Minor

Don’t ignore audit log failures.

Using _ = s.auditSvc.Log(...) silently drops audit errors. Please surface or log failures to keep audit trails reliable.

✅ Suggested pattern
-	_ = s.auditSvc.Log(ctx, glb.UserID, "global_lb.create", "global_lb", glb.ID.String(), map[string]interface{}{
+	if err := s.auditSvc.Log(ctx, glb.UserID, "global_lb.create", "global_lb", glb.ID.String(), map[string]interface{}{
 		"hostname": hostname,
 		"policy":   policy,
-	})
+	}); err != nil {
+		s.logger.Error("audit log failed", "action", "global_lb.create", "error", err)
+	}
As per coding guidelines: Do not use silent failures - avoid blank identifier assignment like `_ = someFunc()`.

Also applies to: 128-129, 188-191, 232-234

docs/swagger/swagger.json-1845-2085 (1)

1845-2085: ⚠️ Potential issue | 🟡 Minor

Add APIKeyAuth security to the Global LB endpoints.

Most authenticated resources in this spec declare APIKeyAuth; the new /global-lb routes omit it, which can mislead clients and SDKs. Please add the security block (and 401/403 if applicable) consistently across these operations.

🔐 Example (apply similarly to the other /global-lb operations)
 "/global-lb": {
     "get": {
+        "security": [
+            {
+                "APIKeyAuth": []
+            }
+        ],
         "produces": [
             "application/json"
         ],
🧹 Nitpick comments (20)
internal/repositories/postgres/migrations/079_create_global_lb.up.sql (2)

34-35: Consider adding an index on tenant_id for multi-tenancy query performance.

Given that multi-tenancy is a core feature of this PR, queries will frequently filter by tenant_id. Without an index, these queries may result in full table scans as the data grows.

📈 Proposed fix to add tenant_id index
 CREATE INDEX idx_global_lb_user_id ON global_load_balancers(user_id);
+CREATE INDEX idx_global_lb_tenant_id ON global_load_balancers(tenant_id);
 CREATE INDEX idx_global_lb_endpoints_glb_id ON global_lb_endpoints(global_lb_id);

20-32: Add CHECK constraints to enforce data integrity for target_type discriminator.

The target_type column serves as a discriminator, but there's no enforcement that:

  1. target_type only contains valid values ('LB' or 'IP')
  2. target_id is populated when target_type = 'LB'
  3. target_ip is populated when target_type = 'IP'

Without these constraints, invalid data states are possible at the database level.

🛡️ Proposed constraints
 CREATE TABLE global_lb_endpoints (
     id UUID PRIMARY KEY,
     global_lb_id UUID NOT NULL REFERENCES global_load_balancers(id) ON DELETE CASCADE,
     region VARCHAR(50) NOT NULL,
-    target_type VARCHAR(10) NOT NULL, -- 'LB' or 'IP'
-    target_id UUID, -- Can be null if target_type is IP
-    target_ip INET, -- Can be null if target_type is LB
+    target_type VARCHAR(10) NOT NULL CHECK (target_type IN ('LB', 'IP')),
+    target_id UUID,
+    target_ip INET,
     weight INT DEFAULT 1,
     priority INT DEFAULT 1,
     healthy BOOLEAN DEFAULT true,
     last_health_check TIMESTAMPTZ,
-    created_at TIMESTAMPTZ DEFAULT NOW()
+    created_at TIMESTAMPTZ DEFAULT NOW(),
+    CONSTRAINT chk_target_id_required CHECK (target_type != 'LB' OR target_id IS NOT NULL),
+    CONSTRAINT chk_target_ip_required CHECK (target_type != 'IP' OR target_ip IS NOT NULL)
 );
.github/workflows/ci.yml (1)

75-106: Matrix consolidation looks good, but note the shell word-splitting dependency.

The matrix-based approach effectively consolidates the previous three separate race-detection jobs. However, the third matrix entry "./cmd/... ./pkg/..." relies on shell word-splitting when interpolated unquoted at line 106. This works but is somewhat fragile.

Consider using YAML array syntax or separate matrix entries for cleaner semantics:

♻️ Optional: Split combined package pattern into separate entries
     strategy:
       matrix:
         package: [
           "./internal/core/services/...",
           "./internal/repositories/...",
-          "./cmd/... ./pkg/..."
+          "./cmd/...",
+          "./pkg/..."
         ]

This would make each matrix entry self-contained and avoid relying on shell word-splitting behavior.

internal/pkg/testutil/test_infra.go (1)

35-35: Silent failure violates coding guidelines.

The blank identifier assignment _ = cli.Close() explicitly discards the error, which conflicts with the project's coding guideline prohibiting silent failures. While Close() errors in test utilities are often non-critical, consider logging them for debugging purposes.

♻️ Suggested fix to log close errors
-	defer func() { _ = cli.Close() }()
+	defer func() {
+		if err := cli.Close(); err != nil {
+			// Log but don't fail - this is cleanup code
+			fmt.Fprintf(os.Stderr, "warning: failed to close docker client: %v\n", err)
+		}
+	}()

Note: You'll need to add "os" to the imports.

As per coding guidelines: "Do not use silent failures - avoid blank identifier assignment like _ = someFunc()"

docs/adr/ADR-012-global-load-balancer.md (1)

14-14: Minor grammar improvement.

Consider hyphenating the compound adjective for clarity.

📝 Suggested fix
-    *   The GLB service acts as an orchestrator for a **Geo-route capable DNS backend** (initially implemented via PowerDNS).
+    *   The GLB service acts as an orchestrator for a **Geo-route-capable DNS backend** (initially implemented via PowerDNS).
internal/repositories/mock/logger.go (1)

5-13: Consider using io.Discard from standard library.

The standard library provides io.Discard which serves the same purpose as NoopWriter and eliminates the need for a custom type.

♻️ Simplified implementation
 package mock
 
-import "log/slog"
+import (
+	"io"
+	"log/slog"
+)
 
 func NewNoopLogger() *slog.Logger {
-	return slog.New(slog.NewTextHandler(&NoopWriter{}, nil))
+	return slog.New(slog.NewTextHandler(io.Discard, nil))
 }
-
-type NoopWriter struct{}
-
-func (w *NoopWriter) Write(p []byte) (n int, err error) {
-	return len(p), nil
-}
internal/repositories/postgres/test_helpers.go (2)

20-47: Thread context into SetupDB instead of hardcoding context.Background().

This helper creates its own context, so callers can’t set deadlines or cancel. Consider accepting context.Context as the first parameter and using it for all blocking calls (update call sites accordingly).

Suggested change
-func SetupDB(t *testing.T) *pgxpool.Pool {
-	ctx := context.Background()
+func SetupDB(ctx context.Context, t *testing.T) *pgxpool.Pool {
+	if ctx == nil {
+		ctx = context.Background()
+	}

As per coding guidelines: Do not skip context.Context as the first parameter in functions; Propagate context.Context to all blocking calls.


49-89: Pass context into SetupTestUser instead of creating a new background context.

This keeps the helper cancellable and consistent with context propagation rules (update call sites accordingly).

Suggested change
-func SetupTestUser(t *testing.T, db *pgxpool.Pool) context.Context {
-	ctx := context.Background()
+func SetupTestUser(ctx context.Context, t *testing.T, db *pgxpool.Pool) context.Context {
+	if ctx == nil {
+		ctx = context.Background()
+	}

As per coding guidelines: Do not skip context.Context as the first parameter in functions; Propagate context.Context to all blocking calls.

internal/repositories/postgres/rbac_repo_test.go (1)

25-120: Consider table‑driven cases for the RBAC CRUD subtests.

A table‑driven structure would reduce repetition and aligns with the test‑style guideline.

As per coding guidelines: Use table-driven tests in test files.

internal/repositories/postgres/global_lb_repo_integration_test.go (1)

34-129: Consider table-driven structure for consistency.

This file uses multiple subtests but not a table-driven format. If you want to align with project testing conventions, a table of cases would help.
As per coding guidelines: Use table-driven tests in test files.

internal/handlers/global_lb_handler_test.go (1)

68-110: Prefer table-driven structure for handler tests.

This is a single-case subtest; a table-driven style will scale better as you add error cases.
As per coding guidelines: Use table-driven tests in test files.

internal/api/setup/router.go (1)

330-340: Consider enforcing tenant context on GLB routes.

If GLB is tenant-scoped (TenantID exists in the domain), you may want to align with other tenant resources by requiring a tenant context and membership.

🔐 Possible adjustment
- glbGroup := r.Group("/global-lb")
- glbGroup.Use(httputil.Auth(svcs.Identity, svcs.Tenant))
+ glbGroup := r.Group("/global-lb")
+ glbGroup.Use(
+     httputil.Auth(svcs.Identity, svcs.Tenant),
+     httputil.RequireTenant(),
+     httputil.TenantMember(svcs.Tenant),
+ )
internal/repositories/mock/geodns.go (1)

12-33: Protect MockGeoDNS.Records from concurrent access and slice aliasing.
If tests run with t.Parallel or the service invokes GeoDNS in goroutines, unsynchronized map writes can race, and storing the endpoints slice by reference can leak later mutations into recorded state.

♻️ Suggested refactor
 import (
 	"context"
 	"log/slog"
+	"sync"

 	"github.com/poyrazk/thecloud/internal/core/domain"
 	"github.com/poyrazk/thecloud/internal/core/ports"
 )
@@
 type MockGeoDNS struct {
+	mu      sync.RWMutex
 	Records map[string][]domain.GlobalEndpoint
 }
@@
 func (m *MockGeoDNS) CreateGeoRecord(ctx context.Context, hostname string, endpoints []domain.GlobalEndpoint) error {
 	slog.Info("GeoDNS: Creating record", "hostname", hostname, "endpoints", len(endpoints))
-	m.Records[hostname] = endpoints
+	m.mu.Lock()
+	defer m.mu.Unlock()
+	copied := append([]domain.GlobalEndpoint(nil), endpoints...)
+	m.Records[hostname] = copied
 	return nil
 }
@@
 func (m *MockGeoDNS) DeleteGeoRecord(ctx context.Context, hostname string) error {
 	slog.Info("GeoDNS: Deleting record", "hostname", hostname)
+	m.mu.Lock()
+	defer m.mu.Unlock()
 	delete(m.Records, hostname)
 	return nil
 }
internal/core/services/global_lb_test.go (2)

16-24: Use testify/mock-based mocks for service tests.
Current map-backed mocks don’t leverage testify/mock expectations and don’t satisfy the testing guideline.

As per coding guidelines: Use testify/mock for creating mock objects in tests.


27-167: Consider table-driven tests to reduce repetition.
The scenarios in TestGlobalLBCreate, TestGlobalLBAddEndpoint, and TestGlobalLBRemoveEndpoint are good candidates for a table-driven structure.

As per coding guidelines: Use table-driven tests in test files.

internal/repositories/mock/lb_repo.go (1)

11-73: Implement target persistence in the mock repo.

AddTarget, RemoveTarget, ListTargets, UpdateTargetHealth, and GetTargetsForInstance are currently no-ops, which can let target-related tests pass even if target logic is broken. Consider adding an in-memory target store (or returning explicit errors) so the mock mirrors real behavior.

♻️ Example in-memory target store
 type MockLBRepo struct {
 	LBs map[uuid.UUID]*domain.LoadBalancer
+	Targets map[uuid.UUID][]*domain.LBTarget
 }
 
 func NewMockLBRepo() *MockLBRepo {
 	return &MockLBRepo{
 		LBs: make(map[uuid.UUID]*domain.LoadBalancer),
+		Targets: make(map[uuid.UUID][]*domain.LBTarget),
 	}
 }
 
-func (m *MockLBRepo) AddTarget(ctx context.Context, target *domain.LBTarget) error       { return nil }
-func (m *MockLBRepo) RemoveTarget(ctx context.Context, lbID, instanceID uuid.UUID) error { return nil }
-func (m *MockLBRepo) ListTargets(ctx context.Context, lbID uuid.UUID) ([]*domain.LBTarget, error) {
-	return nil, nil
-}
-func (m *MockLBRepo) UpdateTargetHealth(ctx context.Context, lbID, instanceID uuid.UUID, health string) error {
-	return nil
-}
-func (m *MockLBRepo) GetTargetsForInstance(ctx context.Context, instanceID uuid.UUID) ([]*domain.LBTarget, error) {
-	return nil, nil
-}
+func (m *MockLBRepo) AddTarget(ctx context.Context, target *domain.LBTarget) error {
+	m.Targets[target.LBID] = append(m.Targets[target.LBID], target)
+	return nil
+}
+func (m *MockLBRepo) RemoveTarget(ctx context.Context, lbID, instanceID uuid.UUID) error {
+	targets := m.Targets[lbID]
+	filtered := targets[:0]
+	for _, t := range targets {
+		if t.InstanceID != instanceID {
+			filtered = append(filtered, t)
+		}
+	}
+	m.Targets[lbID] = filtered
+	return nil
+}
+func (m *MockLBRepo) ListTargets(ctx context.Context, lbID uuid.UUID) ([]*domain.LBTarget, error) {
+	return m.Targets[lbID], nil
+}
+func (m *MockLBRepo) UpdateTargetHealth(ctx context.Context, lbID, instanceID uuid.UUID, health string) error {
+	for _, t := range m.Targets[lbID] {
+		if t.InstanceID == instanceID {
+			t.Health = health
+			break
+		}
+	}
+	return nil
+}
+func (m *MockLBRepo) GetTargetsForInstance(ctx context.Context, instanceID uuid.UUID) ([]*domain.LBTarget, error) {
+	var out []*domain.LBTarget
+	for _, targets := range m.Targets {
+		for _, t := range targets {
+			if t.InstanceID == instanceID {
+				out = append(out, t)
+			}
+		}
+	}
+	return out, nil
+}
internal/core/services/global_lb_integration_test.go (3)

20-39: Prefer testify/mock for GeoDNS stub to align with test guidelines.

The hand-rolled stub makes it harder to assert call expectations and inputs. Consider switching to testify/mock (or wrapping this stub with it) so interactions can be asserted explicitly.

As per coding guidelines: Use testify/mock for creating mock objects in tests.


41-62: Ensure DB-backed integration tests are gated from unit runs.

This test constructs real PostgreSQL repositories. Please verify it is isolated behind an integration build tag or a dedicated CI job so unit runs stay hermetic and don’t require a database.

As per coding guidelines: Do not test with real external dependencies in unit tests - use mocks instead.


64-179: Consider table-driven subtests for the three scenarios.

The three scenarios follow the same setup/act/assert pattern; a table-driven structure would reduce repetition and make future cases easier to add.

As per coding guidelines: Use table-driven tests in test files.

docs/swagger/swagger.yaml (1)

1251-1272: Consider documenting conditional field requirements.

When target_type is "LB", the target_id field should be provided; when "IP", the target_ip field is needed. OpenAPI 2.0 cannot express this conditional validation, so consider adding a description noting that server-side validation enforces this, or upgrade to OpenAPI 3.0+ for oneOf/discriminator support.

📝 Suggested documentation improvement
   httphandlers.AddGlobalEndpointRequest:
     properties:
       priority:
         description: Default 1
         type: integer
       region:
         type: string
       target_id:
-        type: string
+        description: Required when target_type is "LB"
+        type: string
       target_ip:
-        type: string
+        description: Required when target_type is "IP"
+        type: string
       target_type:
         enum:
         - LB
         - IP
         type: string

Comment thread docs/swagger/swagger.yaml
Comment thread internal/adapters/dns/powerdns_geo.go
Comment thread internal/core/services/global_lb.go
Comment thread internal/core/services/global_lb.go
Comment thread internal/core/services/global_lb.go
Comment thread internal/handlers/global_lb_handler.go Outdated
Comment thread internal/repositories/postgres/global_lb_repo_integration_test.go
Comment thread internal/repositories/postgres/global_lb_repo.go
Comment thread internal/repositories/postgres/global_lb_repo.go
Comment thread internal/repositories/postgres/global_lb_repo.go
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a new Global Load Balancer (GLB) capability (DNS/GeoDNS-driven multi-region steering), expands multi-tenant test infrastructure, and updates CI/CD workflows to improve speed and determinism.

Changes:

  • Added GLB domain model, ports, service logic, HTTP handlers/routes, and PowerDNS GeoDNS adapter support.
  • Added PostgreSQL migrations and repository implementation + integration/unit test coverage scaffolding for GLB.
  • Updated Swagger outputs, docs, docker-compose, and GitHub Actions (concurrency + caching + race job matrix).

Reviewed changes

Copilot reviewed 48 out of 48 changed files in this pull request and generated 15 comments.

Show a summary per file
File Description
tests/gateway_e2e_test.go Safer response-body close handling in gateway E2E tests.
internal/repositories/postgres/vpc_repo_test.go Updates integration test to use exported DB/user helpers.
internal/repositories/postgres/test_helpers.go Exports DB/user/cleanup helpers and adds documentation.
internal/repositories/postgres/secret_repo_test.go Updates integration test to use exported DB/user helpers.
internal/repositories/postgres/rbac_repo_test.go Updates integration test DB setup helper name.
internal/repositories/postgres/notify_repo_test.go Updates integration test to use exported DB/user helpers.
internal/repositories/postgres/multitenancy_integration_test.go Updates integration test DB setup helper name.
internal/repositories/postgres/migrations/079_create_global_lb.up.sql Adds GLB + endpoint tables and indexes.
internal/repositories/postgres/migrations/079_create_global_lb.down.sql Adds rollback for GLB tables.
internal/repositories/postgres/migration_test.go Updates integration test DB setup helper name.
internal/repositories/postgres/instance_repo_integration_test.go Updates integration test to use exported DB/user helpers.
internal/repositories/postgres/identity_repo_test.go Updates integration test to use exported DB/user helpers.
internal/repositories/postgres/global_lb_repo_integration_test.go Adds GLB repository integration test suite.
internal/repositories/postgres/global_lb_repo.go Adds PostgreSQL implementation of GLB repository.
internal/repositories/postgres/gateway_repo_test.go Updates integration test to use exported DB/user helpers.
internal/repositories/postgres/function_repo_test.go Updates integration test to use exported DB/user helpers.
internal/repositories/postgres/database_repo_test.go Updates integration test to use exported DB/user helpers.
internal/repositories/postgres/cron_repo_test.go Updates integration test to use exported DB/user helpers.
internal/repositories/postgres/container_repo_test.go Updates integration test to use exported DB/user helpers.
internal/repositories/mock/logger.go Adds a no-op slog logger for tests.
internal/repositories/mock/lb_repo.go Adds a mock LB repository implementation for tests.
internal/repositories/mock/global_lb_repo.go Adds a mock GLB repository implementation for tests.
internal/repositories/mock/geodns.go Adds a mock GeoDNS backend for tests.
internal/repositories/mock/audit_service.go Adds a mock audit service for tests.
internal/pkg/testutil/test_infra.go Safer Docker client close handling.
internal/handlers/global_lb_handler_test.go Adds handler-level unit test for GLB create endpoint.
internal/handlers/global_lb_handler.go Adds HTTP API for GLB CRUD + endpoint management.
internal/core/services/setup_test.go Extends service-test DB cleanup to include GLB tables.
internal/core/services/global_lb_test.go Adds unit tests for GLB service behavior using mocks.
internal/core/services/global_lb_integration_test.go Adds service-level integration scenarios for GLB.
internal/core/services/global_lb.go Implements GLB business logic, DNS sync, and auditing.
internal/core/ports/global_lb.go Introduces ports for GLB repo/service and GeoDNS backend.
internal/core/domain/global_lb.go Adds GLB and endpoint domain types + routing policy enum.
internal/api/setup/router.go Registers GLB routes into the API router.
internal/api/setup/dependencies.go Wires GLB repository/service into dependency graph.
internal/adapters/dns/powerdns_geo.go Implements GeoDNSBackend on PowerDNS backend.
docs/swagger/swagger.yaml Swagger updates for GLB endpoints and types.
docs/swagger/swagger.json Swagger updates for GLB endpoints and types.
docs/swagger/docs.go Generated swagger docs updated with GLB endpoints/types.
docs/services/cloud-global-lb.md Adds GLB service documentation/guide.
docs/architecture.md Updates architecture docs to include GLB service.
docs/api-reference.md Updates API reference with GLB endpoints.
docs/adr/ADR-012-global-load-balancer.md Adds ADR describing GLB design decisions.
docs/FEATURES.md Documents GLB as a platform feature.
docker-compose.yml Pins API image tag for CI and local compose usage.
README.md Adds GLB to top-level feature list and docs index.
.github/workflows/e2e.yml Adds concurrency + Buildx + GHA cache for E2E build/start.
.github/workflows/ci.yml Adds concurrency, pins swag, consolidates race jobs, adds Docker layer cache.
Comments suppressed due to low confidence (1)

internal/repositories/postgres/test_helpers.go:126

  • The comment says cleanup errors are only suppressed for missing tables (PostgreSQL 42P01), but the implementation currently logs and ignores all errors. This can hide real cleanup failures and cause flaky tests. Check the error type/code and only ignore 42P01; fail the test for other errors.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread internal/core/services/global_lb.go Outdated

// Check for hostname uniqueness
existing, err := s.repo.GetByHostname(ctx, hostname)
if err == nil && existing != nil {
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hostname uniqueness check ignores repository errors other than "found". If GetByHostname returns an internal/DB error, this code will proceed to create a GLB anyway (and may later fail with a less clear error). Treat non-NotFound errors as fatal and only allow creation when the error indicates "not found".

Suggested change
if err == nil && existing != nil {
if err != nil {
return nil, errors.Wrap(errors.Internal, "failed to check hostname uniqueness", err)
}
if existing != nil {

Copilot uses AI. Check for mistakes.
Comment on lines +123 to +125
t.Run("Delete", func(t *testing.T) {
err := repo.Delete(ctx, glbID, userID)
require.NoError(t, err)
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

repo.Delete requires both id and userID (Delete(ctx, id, userID)), but this test calls repo.Delete(ctx, glbID) with no user. Pass the owner userID (or adjust the repo signature) to compile and properly test authorization scoping.

Copilot uses AI. Check for mistakes.
Comment on lines +41 to +47
func (r *globalLBRepository) GetByID(ctx context.Context, id uuid.UUID) (*domain.GlobalLoadBalancer, error) {
query := `SELECT * FROM global_load_balancers WHERE id = $1`

// Implementation Note: Manual field mapping is utilized here to ensure
// schema robustness and decoupled domain-to-storage mapping.
row := r.db.QueryRow(ctx, query, id)
return scanGlobalLB(row)
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This repository uses SELECT * but scanGlobalLB relies on a specific column order (see the comment in scanGlobalLB). Other repositories select explicit columns (e.g. internal/repositories/postgres/lb_repo.go:41-46) to avoid breakage when schema/order changes. Prefer selecting explicit columns in the required order instead of SELECT *.

Copilot uses AI. Check for mistakes.
Comment on lines +41 to +47
func (r *globalLBRepository) GetByID(ctx context.Context, id uuid.UUID) (*domain.GlobalLoadBalancer, error) {
query := `SELECT * FROM global_load_balancers WHERE id = $1`

// Implementation Note: Manual field mapping is utilized here to ensure
// schema robustness and decoupled domain-to-storage mapping.
row := r.db.QueryRow(ctx, query, id)
return scanGlobalLB(row)
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetByID is not scoped by user_id (or tenant) and can return another user's GLB. Most repositories scope GetByID using appcontext.UserIDFromContext(ctx) in the WHERE clause (e.g. internal/repositories/postgres/lb_repo.go:39-46). Consider adding user/tenant scoping here (and similarly for endpoint queries) to enforce data isolation at the storage layer.

Copilot uses AI. Check for mistakes.
Weight int `json:"weight"` // 1-100, default 1
Priority int `json:"priority"` // Lower is higher priority (for FAILOVER)
Healthy bool `json:"healthy"`
LastHealthCheck time.Time `json:"last_health_check"`
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

last_health_check is nullable in the DB (TIMESTAMPTZ with no NOT NULL), but the domain model uses a non-pointer time.Time. When the DB value is NULL, this will serialize as the zero time (e.g. year 0001) which is misleading for API clients. Consider using *time.Time (and omitempty) for LastHealthCheck.

Suggested change
LastHealthCheck time.Time `json:"last_health_check"`
LastHealthCheck *time.Time `json:"last_health_check,omitempty"`

Copilot uses AI. Check for mistakes.
Comment on lines +31 to +37
func (m *MockGlobalLBRepo) GetByID(ctx context.Context, id uuid.UUID) (*domain.GlobalLoadBalancer, error) {
if glb, ok := m.GLBs[id]; ok {
// return copy
return glb, nil
}
return nil, nil // simplified
}
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetByID returns (nil, nil) when a GLB is missing. Service code (Get, Delete, etc.) assumes a non-nil GLB on err == nil and can panic. Return a not-found error instead to match the postgres implementation (which returns errors.NotFound).

Copilot uses AI. Check for mistakes.
Lifecycle: services.NewLifecycleService(c.Repos.Lifecycle, c.Repos.Storage),
DNS: dnsSvc,
InstanceType: services.NewInstanceTypeService(c.Repos.InstanceType),
GlobalLB: glbSvc,
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dnsSvc is constructed above but never assigned into the returned Services struct. This will leave svcs.DNS as nil and cause a panic when InitHandlers calls httphandlers.NewDNSHandler(svcs.DNS) or when DNS routes are hit. Add DNS: dnsSvc to the svcs := &Services{...} literal.

Suggested change
GlobalLB: glbSvc,
GlobalLB: glbSvc,
DNS: dnsSvc,

Copilot uses AI. Check for mistakes.
Comment on lines +26 to +31
func (m *MockLBRepo) GetByID(ctx context.Context, id uuid.UUID) (*domain.LoadBalancer, error) {
if lb, ok := m.LBs[id]; ok {
return lb, nil
}
return nil, nil
}
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the LB ID is not found, this mock returns (nil, nil). Callers (e.g. GlobalLBService.AddEndpoint) assume a non-nil LB when err == nil and will panic on lb.UserID. Return a proper not-found error when missing to match the real repository behavior.

Copilot uses AI. Check for mistakes.
Comment on lines +89 to +98
func (m *MockGlobalLBRepo) GetEndpointByID(ctx context.Context, endpointID uuid.UUID) (*domain.GlobalEndpoint, error) {
for _, eps := range m.Endpoints {
for _, ep := range eps {
if ep.ID == endpointID {
return ep, nil
}
}
}
return nil, nil
}
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetEndpointByID returns (nil, nil) when the endpoint is missing, which can lead to nil dereferences in GlobalLBService.RemoveEndpoint. Return a not-found error when the endpoint doesn't exist so behavior matches the real repository.

Copilot uses AI. Check for mistakes.
Comment on lines +41 to +43
func TestGlobalLBServiceIntegration(t *testing.T) {
db := setupDB(t)
// Add global lb tables to cleanup if needed, but cleanDB in setup_test.go should be updated
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The DB pool created by setupDB(t) is never closed in this test. Add defer db.Close() (or t.Cleanup(db.Close)) after creating the pool to avoid leaking connections/resources across tests.

Copilot uses AI. Check for mistakes.
Copilot AI review requested due to automatic review settings February 3, 2026 09:18
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In @.github/workflows/ci.yml:
- Around line 26-27: The CI fails because swag's -g flag points to main.go
relative to the repo root while the actual entrypoint is cmd/api/main.go; update
the swag invocation (the "swag init" command) so the -g flag references the
correct file path (cmd/api/main.go) while keeping the -d directories
(cmd/api,internal/handlers) and other flags intact to ensure Swagger can find
the proper entrypoint.

In `@docs/swagger/docs.go`:
- Around line 1856-2097: The swagger docs are missing security annotations for
GLB handlers; add a line comment "// `@Security` APIKeyAuth" to the doc comments
for each GLB handler function (ListGlobalLB / List, CreateGlobalLB / Create,
GetGlobalLB / Get, DeleteGlobalLB / Delete, AddGlobalEndpoint / AddEndpoint,
RemoveGlobalEndpoint / RemoveEndpoint) so the generated swagger includes the
APIKeyAuth requirement (update the Go doc comment immediately above each handler
method referenced by registerGlobalLBRoutes and re-run the swagger generator).

In `@docs/swagger/swagger.json`:
- Around line 1845-2085: The GLB operations (paths "/global-lb" [get, post],
"/global-lb/{id}" [get, delete], "/global-lb/{id}/endpoints" [post], and
"/global-lb/{id}/endpoints/{epID}" [delete]) are missing the security stanza;
update each operation object to include the OpenAPI security entry referencing
APIKeyAuth (e.g., add "security": [{"APIKeyAuth": []}]) so these endpoints
require the same API key auth as other resources like /lb; ensure you add this
exact security field to the operations named in the diff (get, post, delete
methods) without altering other fields.
🧹 Nitpick comments (6)
tests/instance_types_e2e_test.go (1)

49-55: Extract the port mapping into a named constant.
The literal "0:80" is a magic value and is now repeated across E2E tests. Consider defining a shared constant (here or in testutil) and reuse it.

♻️ Suggested tweak
+const defaultPortMapping = "0:80"
...
-        "ports":         "0:80",
+        "ports":         defaultPortMapping,

As per coding guidelines, do not use magic numbers - use named constants instead.

docs/swagger/swagger.yaml (1)

1250-1271: Consider documenting conditional field requirements.

When target_type is "LB", target_id is expected; when "IP", target_ip is expected. OpenAPI 2.0 cannot express this conditional validation, so ensure the handler layer enforces this logic and consider adding clarifying descriptions to the field definitions.

📝 Suggested documentation enhancement
   httphandlers.AddGlobalEndpointRequest:
     properties:
       priority:
         description: Default 1
         type: integer
       region:
         type: string
       target_id:
+        description: Required when target_type is "LB"
         type: string
       target_ip:
+        description: Required when target_type is "IP"
         type: string
docs/swagger/swagger.json (3)

5334-5375: Align domain.GlobalEndpoint.target_type with the request enum.

The response schema is missing the enum that exists on the request type, which can confuse clients and generators.

♻️ Proposed fix
                 "target_type": {
                     "description": "\"LB\" or \"IP\"",
-                    "type": "string"
+                    "type": "string",
+                    "enum": [
+                        "LB",
+                        "IP"
+                    ]
                 },

6646-6677: Clarify the conditional requirements for target_id vs target_ip.

Swagger 2.0 can’t express the conditional requirement, but adding descriptions will prevent client confusion.

📝 Proposed clarification
                 "target_id": {
-                    "type": "string"
+                    "description": "Required when target_type is \"LB\"",
+                    "type": "string"
                 },
                 "target_ip": {
-                    "type": "string"
+                    "description": "Required when target_type is \"IP\"",
+                    "type": "string"
                 },

6795-6826: Avoid redundant enum + allOf in policy.

domain.RoutingPolicy already defines the enum. Keeping both can drift later; a single $ref is cleaner.

♻️ Proposed fix
                 "policy": {
-                    "enum": [
-                        "LATENCY",
-                        "GEOLOCATION",
-                        "WEIGHTED",
-                        "FAILOVER"
-                    ],
-                    "allOf": [
-                        {
-                            "$ref": "#/definitions/domain.RoutingPolicy"
-                        }
-                    ]
+                    "$ref": "#/definitions/domain.RoutingPolicy"
                 }
docs/swagger/docs.go (1)

6806-6838: Consider aligning request field naming with routing_policy.

CreateGlobalLBRequest uses policy, while the response model exposes routing_policy. If feasible, align naming (or document the mismatch) to reduce client confusion. Update handler struct tags/annotations and re-run swagger generation accordingly.

Comment thread .github/workflows/ci.yml Outdated
Comment thread docs/swagger/docs.go
Comment thread docs/swagger/swagger.json
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 52 out of 52 changed files in this pull request and generated 10 comments.

Comments suppressed due to low confidence (2)

docs/FEATURES.md:163

  • Section numbering is now inconsistent: there are two "### 9." headings (Global LB and Managed Kubernetes). Renumber subsequent sections to keep headings unique and ordered.
### 9. Managed Kubernetes (KaaS) 🆕
**What it is**: Provision and manage production-ready Kubernetes clusters.
**Tech Stack**: Kubeadm, Containerd, LoadBalancer, Go Workers.

internal/repositories/postgres/test_helpers.go:126

  • The cleanup loop logs and ignores all query errors, but the comment says errors are only suppressed for missing tables (42P01). This can hide real cleanup failures (e.g., FK violations) and make tests flaky. Consider checking the Postgres error code and only ignoring 42P01; fail the test for other errors.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +133 to +155
func (s *GlobalLBService) AddEndpoint(ctx context.Context, glbID uuid.UUID, region string, targetType string, targetID *uuid.UUID, targetIP *string, weight, priority int) (*domain.GlobalEndpoint, error) {
// Validate target
userID := appcontext.UserIDFromContext(ctx)
switch targetType {
case "LB":
if targetID == nil {
return nil, errors.New(errors.InvalidInput, "target_id required for LB endpoint")
}
// Verify LB exists and belongs to user
lb, err := s.lbRepo.GetByID(ctx, *targetID)
if err != nil {
return nil, errors.Wrap(errors.NotFound, "target load balancer not found", err)
}
if lb.UserID != userID {
return nil, errors.New(errors.Unauthorized, "unauthorized access to regional load balancer")
}
case "IP":
if targetIP == nil || *targetIP == "" {
return nil, errors.New(errors.InvalidInput, "target_ip required for IP endpoint")
}
default:
return nil, errors.New(errors.InvalidInput, "invalid target type")
}
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AddEndpoint never verifies that glbID belongs to the authenticated user/tenant. That allows a user to add endpoints to another user's Global LB (and trigger DNS sync/audit) if they know the ID. Fetch the Global LB first and enforce ownership before accepting endpoint mutations.

Copilot uses AI. Check for mistakes.
lb, err := s.lbRepo.GetByID(ctx, *targetID)
if err != nil {
return nil, errors.Wrap(errors.NotFound, "target load balancer not found", err)
}
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When targetType == "LB", lbRepo.GetByID is only checked for err != nil, but lb is not checked for nil before dereferencing lb.UserID. Some repository implementations/mocks return (nil, nil) for not-found, which would panic here. Treat lb == nil as not found and return an appropriate error.

Suggested change
}
}
if lb == nil {
return nil, errors.New(errors.NotFound, "target load balancer not found")
}

Copilot uses AI. Check for mistakes.
- **Unified Hostname**: Provide a single global hostname for your application (e.g., `api.global.example.com`).

## Architecture
GLB follows the standard hexadecimal architecture of the platform:
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: "hexadecimal architecture" should be "hexagonal architecture".

Suggested change
GLB follows the standard hexadecimal architecture of the platform:
GLB follows the standard hexagonal architecture of the platform:

Copilot uses AI. Check for mistakes.
}
}

_ = s.auditSvc.Log(ctx, glb.UserID, "global_lb.endpoint_add", "global_lb", glbID.String(), map[string]interface{}{
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

glb is used after glb, err := s.Get(ctx, glbID) even when err != nil (e.g., in the audit log call). This can panic if Get fails (or after tightening authorization checks). Guard the audit/DNS sync block on err == nil or use userID from context for audit logging.

Suggested change
_ = s.auditSvc.Log(ctx, glb.UserID, "global_lb.endpoint_add", "global_lb", glbID.String(), map[string]interface{}{
_ = s.auditSvc.Log(ctx, userID, "global_lb.endpoint_add", "global_lb", glbID.String(), map[string]interface{}{

Copilot uses AI. Check for mistakes.
Comment on lines +33 to +37
if ep.TargetType == "IP" && ep.TargetIP != nil && *ep.TargetIP != "" {
records = append(records, *ep.TargetIP)
}
// Note: Load Balancer (LB) targets are temporarily deferred. These require
// sophisticated CNAME or ALIAS record management to avoid conflicts at the zone root.
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CreateGeoRecord only emits DNS records for TargetType == "IP". As a result, Global LB endpoints that reference regional LBs (TargetType == "LB") will never be published to DNS, even though the service/handlers support creating them. Either resolve LB endpoints to an address/hostname before calling the GeoDNS backend, or implement LB target handling here (e.g., CNAME/ALIAS/A to the regional LB IP).

Suggested change
if ep.TargetType == "IP" && ep.TargetIP != nil && *ep.TargetIP != "" {
records = append(records, *ep.TargetIP)
}
// Note: Load Balancer (LB) targets are temporarily deferred. These require
// sophisticated CNAME or ALIAS record management to avoid conflicts at the zone root.
// Emit A records for any healthy endpoint that has a concrete IP target.
// This allows LB-type endpoints that have been resolved to IPs upstream
// to be published here without needing special-case handling.
if ep.TargetIP != nil && *ep.TargetIP != "" {
records = append(records, *ep.TargetIP)
}

Copilot uses AI. Check for mistakes.
Comment thread internal/adapters/dns/powerdns_geo.go Outdated
}

if len(records) == 0 {
return nil
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When records is empty, CreateGeoRecord returns nil without updating PowerDNS. This leaves stale A records behind when the last endpoint is removed or all endpoints become unhealthy, breaking DNS synchronization guarantees. Consider calling DeleteGeoRecord (or issuing a REPLACE/DELETE RRSet) when there are no records to publish.

Suggested change
return nil
// No healthy IP endpoints remain; ensure any existing A records are removed
return b.DeleteGeoRecord(ctx, hostname)

Copilot uses AI. Check for mistakes.
Comment thread .github/workflows/e2e.yml
cache-from: type=gha
cache-to: type=gha,mode=max

# The 'api' service in thecompose file already uses the latest build
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in comment: "thecompose" → "the compose".

Suggested change
# The 'api' service in thecompose file already uses the latest build
# The 'api' service in the compose file already uses the latest build

Copilot uses AI. Check for mistakes.
if err != nil {
return nil, err
}

Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Get does not enforce ownership/tenant scoping. Because the repository GetByID is currently not scoped by user, any authenticated user who can guess an ID could fetch another user's Global LB and its endpoints. Enforce that glb.UserID (and ideally TenantID) matches the caller context, and return NotFound/Unauthorized when it doesn't.

Suggested change
// Enforce ownership and tenant scoping
currentUserID := appcontext.UserIDFromContext(ctx)
currentTenantID := appcontext.TenantIDFromContext(ctx)
// If the record is associated with a specific user, ensure it matches the caller.
if glb.UserID != uuid.Nil && glb.UserID != currentUserID {
// Return NotFound to avoid leaking the existence of the resource.
return nil, errors.New(errors.NotFound, "global load balancer not found")
}
// If the record is associated with a specific tenant, ensure it matches the caller.
if glb.TenantID != uuid.Nil && glb.TenantID != currentTenantID {
// Return NotFound to avoid leaking the existence of the resource.
return nil, errors.New(errors.NotFound, "global load balancer not found")
}

Copilot uses AI. Check for mistakes.
Comment on lines +41 to +54
func (r *globalLBRepository) GetByID(ctx context.Context, id uuid.UUID) (*domain.GlobalLoadBalancer, error) {
query := `SELECT * FROM global_load_balancers WHERE id = $1`

// Implementation Note: Manual field mapping is utilized here to ensure
// schema robustness and decoupled domain-to-storage mapping.
row := r.db.QueryRow(ctx, query, id)
return scanGlobalLB(row)
}

func (r *globalLBRepository) GetByHostname(ctx context.Context, hostname string) (*domain.GlobalLoadBalancer, error) {
query := `SELECT * FROM global_load_balancers WHERE hostname = $1`
row := r.db.QueryRow(ctx, query, hostname)
return scanGlobalLB(row)
}
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetByID (and GetByHostname/List) are not scoped by the caller identity and use SELECT *. This breaks the established repo pattern (e.g., lb_repo.go scopes queries by appcontext.UserIDFromContext(ctx)) and allows cross-user reads if called from services/handlers. Scope queries by user_id (and tenant if applicable) and select explicit columns to avoid brittle column-order coupling.

Copilot uses AI. Check for mistakes.
Comment on lines +2 to +10
id UUID PRIMARY KEY,
user_id UUID NOT NULL,
tenant_id UUID NOT NULL,
name VARCHAR(255) NOT NULL,
hostname VARCHAR(255) UNIQUE NOT NULL,
policy VARCHAR(50) NOT NULL,
health_check_protocol VARCHAR(10),
health_check_port INT,
health_check_path VARCHAR(255),
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

global_load_balancers.user_id and tenant_id are declared NOT NULL but are missing foreign key constraints, unlike other multi-tenant tables (e.g., dns_zones). Adding REFERENCES users(id) / REFERENCES tenants(id) helps enforce integrity. Also, the health_check_* columns are nullable while the Go model/repo scan uses non-nullable fields; consider making them NOT NULL with defaults (or updating the Go scan/model to handle NULLs).

Copilot uses AI. Check for mistakes.
Copilot AI review requested due to automatic review settings February 3, 2026 10:23
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@internal/core/services/global_lb_integration_test.go`:
- Around line 41-45: Tests are leaking Global LB state: update the shared
cleanDB utility to include cleanup for the new Global LB tables referenced by
TestGlobalLBServiceIntegration (or add explicit teardown in that test);
specifically, modify the cleanDB function (used by setupDB) to truncate/drop the
new Global LB tables (or delete rows) and include any related sequences/index
cleanup so subsequent runs are isolated, or add a defer in
TestGlobalLBServiceIntegration to perform the same cleanup if you prefer
test-local handling.
- Around line 26-32: The CreateGeoRecord implementation stores the passed
endpoints slice by reference despite the "Copy to simulate persistence" comment;
change StubGeoDNSBackend.CreateGeoRecord to allocate a new slice and copy the
endpoints into it before assigning to s.CreatedRecords[hostname] (if
domain.GlobalEndpoint is a struct a shallow copy of each element is sufficient;
if it contains pointer fields perform a per-element deep copy) so mutations to
the original endpoints after the call do not affect the stored record.
🧹 Nitpick comments (8)
.github/workflows/ci.yml (1)

309-318: Consider adding Buildx setup to deploy jobs for consistency.

The build job explicitly sets up Buildx via docker/setup-buildx-action@v3, but the deploy jobs rely on docker/build-push-action to implicitly handle it. While this works, explicitly adding the Buildx setup step would improve consistency and give you explicit control over the Buildx version across all jobs.

💡 Optional: Add Buildx setup to deploy-staging
     steps:
       - uses: actions/checkout@v6
 
+      - name: Set up Docker Buildx
+        uses: docker/setup-buildx-action@v3
+
       - name: Lowercase Repo Name

Apply the same pattern to deploy-production.

internal/core/services/global_lb_integration_test.go (1)

1-8: Consider guarding this integration test with a build tag.

This test wires real PostgreSQL repositories; gating it keeps default unit-test runs mock-only.

🔧 Suggested build tag
+//go:build integration
+// +build integration
+
 package services_test
As per coding guidelines, Do not test with real external dependencies in unit tests - use mocks instead; Mock repositories in service tests.
internal/handlers/global_lb_handler.go (2)

66-67: Use net/http status constants for success responses.

Hard-coded 201/200/204 are magic numbers; prefer http.StatusCreated, http.StatusOK, and http.StatusNoContent for clarity.

♻️ Suggested update
 import (
+	"net/http"
 	"github.com/gin-gonic/gin"
 	"github.com/google/uuid"
 	appcontext "github.com/poyrazk/thecloud/internal/core/context"
 	"github.com/poyrazk/thecloud/internal/core/domain"
 	"github.com/poyrazk/thecloud/internal/core/ports"
 	"github.com/poyrazk/thecloud/internal/errors"
 	"github.com/poyrazk/thecloud/pkg/httputil"
 )
@@
-	httputil.Success(c, 201, glb)
+	httputil.Success(c, http.StatusCreated, glb)
@@
-	httputil.Success(c, 200, glb)
+	httputil.Success(c, http.StatusOK, glb)
@@
-	httputil.Success(c, 200, glbs)
+	httputil.Success(c, http.StatusOK, glbs)
@@
-	httputil.Success(c, 204, nil)
+	httputil.Success(c, http.StatusNoContent, nil)
@@
-	httputil.Success(c, 201, ep)
+	httputil.Success(c, http.StatusCreated, ep)
@@
-	httputil.Success(c, 204, nil)
+	httputil.Success(c, http.StatusNoContent, nil)

As per coding guidelines: Do not use magic numbers - use named constants instead; Use httputil.Success(c, http.StatusOK, data) for successful handler responses.

Also applies to: 92-93, 111-112, 136-137, 190-191, 221-222


165-173: Make endpoint defaults explicit via named constants.

The default weight/priority values are magic numbers; extracting them clarifies intent and eases reuse.

♻️ Suggested update
 type GlobalLBHandler struct {
 	svc ports.GlobalLBService
 }
+
+const (
+	defaultGlobalEndpointWeight   = 1
+	defaultGlobalEndpointPriority = 1
+)
@@
-	weight := 1
+	weight := defaultGlobalEndpointWeight
 	if req.Weight > 0 {
 		weight = req.Weight
 	}
-	priority := 1
+	priority := defaultGlobalEndpointPriority
 	if req.Priority > 0 {
 		priority = req.Priority
 	}

As per coding guidelines: Do not use magic numbers - use named constants instead.

internal/core/services/global_lb.go (2)

143-148: Reconsider error wrapping semantics.

If s.Get returns Unauthorized (user doesn't own the GLB), wrapping it as NotFound obscures the actual error type. While this could be intentional for security (preventing enumeration), it's inconsistent with the explicit Unauthorized errors returned elsewhere in this file (e.g., lines 102, 166, 235).

Consider either:

  1. Returning the error directly to preserve semantics, or
  2. Adding a comment clarifying the intentional security-by-obscurity pattern.
♻️ Option: preserve error semantics
 glb, err := s.Get(ctx, glbID)
 if err != nil {
-    return nil, errors.Wrap(errors.NotFound, "global load balancer not found", err)
+    return nil, err // Preserves NotFound vs Unauthorized distinction
 }

85-88: Consider handling audit log errors.

The blank identifier assignment _ = s.auditSvc.Log(...) silently discards errors. While audit failures shouldn't block business operations, the coding guidelines recommend avoiding this pattern. Consider logging the error instead.

♻️ Proposed fix to log audit errors
-_ = s.auditSvc.Log(ctx, glb.UserID, "global_lb.create", "global_lb", glb.ID.String(), map[string]interface{}{
-    "hostname": hostname,
-    "policy":   policy,
-})
+if err := s.auditSvc.Log(ctx, glb.UserID, "global_lb.create", "global_lb", glb.ID.String(), map[string]interface{}{
+    "hostname": hostname,
+    "policy":   policy,
+}); err != nil {
+    s.logger.Warn("audit log failed", "action", "global_lb.create", "error", err)
+}

This pattern should be applied consistently to the other audit log calls at lines 138, 207, and 256.

docs/swagger/swagger.json (2)

5367-5486: Add enum constraint to target_type in domain.GlobalEndpoint response schema.

The request schema (httphandlers.AddGlobalEndpointRequest) already constrains target_type to ["LB", "IP"]. Adding the same enum to the response keeps both schemas consistent and follows OpenAPI 2.0 best practices for code generation and API documentation clarity.

♻️ Suggested schema tweak
                 "target_type": {
                     "description": "\"LB\" or \"IP\"",
-                    "type": "string"
+                    "type": "string",
+                    "enum": [
+                        "LB",
+                        "IP"
+                    ]
                 },

6828-6859: Remove redundant inline enum from policy field—use $ref alone to prevent drift.

The inline enum duplicates domain.RoutingPolicy (both define the same four values). Maintaining separate copies risks divergence if the enum evolves. Since the referenced schema already defines the enum, the inline duplicate is unnecessary.

♻️ Suggested schema simplification
                 "policy": {
-                    "enum": [
-                        "LATENCY",
-                        "GEOLOCATION",
-                        "WEIGHTED",
-                        "FAILOVER"
-                    ],
                     "allOf": [
                         {
                             "$ref": "#/definitions/domain.RoutingPolicy"
                         }
                     ]
                 }

Comment on lines +26 to +32
func (s *StubGeoDNSBackend) CreateGeoRecord(ctx context.Context, hostname string, endpoints []domain.GlobalEndpoint) error {
if s.CreatedRecords == nil {
s.CreatedRecords = make(map[string][]domain.GlobalEndpoint)
}
// Copy to simulate persistence
s.CreatedRecords[hostname] = endpoints
return nil
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

cat -n internal/core/services/global_lb_integration_test.go

Repository: PoyrazK/thecloud

Length of output: 6821


🏁 Script executed:

rg "CreatedRecords" internal/core/services/global_lb_integration_test.go -A 3 -B 1

Repository: PoyrazK/thecloud

Length of output: 1093


Copy endpoints before storing to match the comment's intent and prevent aliasing.

The comment says "Copy to simulate persistence," but the slice is stored by reference. If the service code mutates the endpoints slice after CreateGeoRecord is called, the stub would reflect those mutations instead of persisting independent data.

Suggested fix
-	// Copy to simulate persistence
-	s.CreatedRecords[hostname] = endpoints
+	// Copy to simulate persistence
+	copied := append([]domain.GlobalEndpoint(nil), endpoints...)
+	s.CreatedRecords[hostname] = copied
🤖 Prompt for AI Agents
In `@internal/core/services/global_lb_integration_test.go` around lines 26 - 32,
The CreateGeoRecord implementation stores the passed endpoints slice by
reference despite the "Copy to simulate persistence" comment; change
StubGeoDNSBackend.CreateGeoRecord to allocate a new slice and copy the endpoints
into it before assigning to s.CreatedRecords[hostname] (if domain.GlobalEndpoint
is a struct a shallow copy of each element is sufficient; if it contains pointer
fields perform a per-element deep copy) so mutations to the original endpoints
after the call do not affect the stored record.

Comment on lines +41 to +45
func TestGlobalLBServiceIntegration(t *testing.T) {
db := setupDB(t)
// Add global lb tables to cleanup if needed, but cleanDB in setup_test.go should be updated
// For now we manually clean them or ensure cleanDB is updated.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Resolve the cleanup TODO to keep tests isolated.

Please update cleanDB to cover the new Global LB tables (or add explicit cleanup here) so scenarios don’t leak state between runs. Do you want me to draft the cleanDB update or open an issue to track it?

🤖 Prompt for AI Agents
In `@internal/core/services/global_lb_integration_test.go` around lines 41 - 45,
Tests are leaking Global LB state: update the shared cleanDB utility to include
cleanup for the new Global LB tables referenced by
TestGlobalLBServiceIntegration (or add explicit teardown in that test);
specifically, modify the cleanDB function (used by setupDB) to truncate/drop the
new Global LB tables (or delete rows) and include any related sequences/index
cleanup so subsequent runs are isolated, or add a defer in
TestGlobalLBServiceIntegration to perform the same cleanup if you prefer
test-local handling.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 55 out of 55 changed files in this pull request and generated 14 comments.

Comments suppressed due to low confidence (1)

internal/repositories/postgres/test_helpers.go:113

  • CleanDB is intended to remove standard resource tables for test isolation, but it doesn’t delete from global_lb_endpoints/global_load_balancers. This can leave leftover GLB data between integration tests. Consider adding deletes for the new GLB tables (endpoints first, then load balancers).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +218 to +222
return
}

httputil.Success(c, 204, nil)
}
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This endpoint returns HTTP 204 but uses httputil.Success, which writes a JSON body. Prefer c.Status(http.StatusNoContent) here as well (or adjust httputil.Success to special-case 204).

Copilot uses AI. Check for mistakes.
Comment on lines +108 to +112
_, err := r.db.Exec(ctx, query,
glb.Name, glb.Policy,
glb.HealthCheck.Protocol, glb.HealthCheck.Port, glb.HealthCheck.Path,
glb.HealthCheck.IntervalSec, glb.HealthCheck.TimeoutSec,
glb.HealthCheck.HealthyCount, glb.HealthCheck.UnhealthyCount,
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update does not check RowsAffected; updating a non-existent GLB would return nil and silently succeed. Consider checking cmd.RowsAffected() and returning a conflict/not-found error when 0 rows are affected (consistent with other postgres repos).

Copilot uses AI. Check for mistakes.
Comment on lines +118 to +122
func (r *globalLBRepository) Delete(ctx context.Context, id uuid.UUID, userID uuid.UUID) error {
query := `DELETE FROM global_load_balancers WHERE id = $1 AND user_id = $2`
_, err := r.db.Exec(ctx, query, id, userID)
return err
}
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete ignores the Exec command tag and returns nil even when no rows were deleted (e.g., wrong userID). Please check cmd.RowsAffected() and return a NotFound/Unauthorized-style error when 0 rows are affected, consistent with other repos (e.g., LBRepository.Delete).

Copilot uses AI. Check for mistakes.
Comment on lines +181 to +185
func (r *globalLBRepository) UpdateEndpointHealth(ctx context.Context, epID uuid.UUID, healthy bool) error {
query := `UPDATE global_lb_endpoints SET healthy = $1, last_health_check = $2 WHERE id = $3`
_, err := r.db.Exec(ctx, query, healthy, time.Now(), epID)
return err
}
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UpdateEndpointHealth does not check RowsAffected; updating a missing endpoint would return nil. Consider checking cmd.RowsAffected() and returning NotFound when 0 rows are affected.

Copilot uses AI. Check for mistakes.
Comment on lines +128 to +132
// Synchronously remove the associated GeoDNS record set.
if err := s.geoDNS.DeleteGeoRecord(ctx, glb.Hostname); err != nil {
s.logger.Error("geo-dns record deletion failed", "hostname", glb.Hostname, "error", err)
// Proceed with database deletion to maintain system consistency and prevent orphaned resources.
}
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GlobalLBService.Delete deletes the DNS record before deleting the DB row. If repo.Delete fails (or becomes a no-op due to a mismatched userID), DNS will be removed while the GLB still exists in the DB. Consider deleting from the DB first (and ensuring repo.Delete returns NotFound when nothing is deleted), then deleting DNS as a best-effort step.

Copilot uses AI. Check for mistakes.
Comment on lines +157 to +167
if targetID == nil {
return nil, errors.New(errors.InvalidInput, "target_id required for LB endpoint")
}
// Verify LB exists and belongs to user
lb, err := s.lbRepo.GetByID(ctx, *targetID)
if err != nil {
return nil, errors.Wrap(errors.NotFound, "target load balancer not found", err)
}
if lb.UserID != userID {
return nil, errors.New(errors.Unauthorized, "unauthorized access to regional load balancer")
}
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AddEndpoint supports targetType "LB", but the PowerDNS GeoDNS adapter currently ignores LB endpoints when building DNS answers. This means a GLB with only LB endpoints will end up with no DNS records (and may trigger deletion). Either disallow LB endpoints until they can be resolved to an address (return InvalidInput), or implement resolution (e.g., map LB to an IP/CNAME) so DNS sync reflects LB endpoints.

Suggested change
if targetID == nil {
return nil, errors.New(errors.InvalidInput, "target_id required for LB endpoint")
}
// Verify LB exists and belongs to user
lb, err := s.lbRepo.GetByID(ctx, *targetID)
if err != nil {
return nil, errors.Wrap(errors.NotFound, "target load balancer not found", err)
}
if lb.UserID != userID {
return nil, errors.New(errors.Unauthorized, "unauthorized access to regional load balancer")
}
// LB endpoints are not yet supported by the GeoDNS adapter, and would
// result in GLBs without usable DNS records. Disallow them until proper
// resolution (e.g., LB -> IP/CNAME) is implemented.
_ = userID // reserved for future use when LB resolution is supported
return nil, errors.New(errors.InvalidInput, "LB endpoints are not currently supported for global load balancers")

Copilot uses AI. Check for mistakes.
Comment on lines +138 to +142
func (r *globalLBRepository) RemoveEndpoint(ctx context.Context, endpointID uuid.UUID) error {
query := `DELETE FROM global_lb_endpoints WHERE id = $1`
_, err := r.db.Exec(ctx, query, endpointID)
return err
}
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RemoveEndpoint deletes by endpoint id but does not check RowsAffected; callers can't distinguish "already removed / not found" from success. Consider checking cmd.RowsAffected() and returning a NotFound error when 0.

Copilot uses AI. Check for mistakes.
Comment on lines +134 to +137
}

httputil.Success(c, 204, nil)
}
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These handlers return HTTP 204 but use httputil.Success, which always writes a JSON body. For 204 No Content, prefer c.Status(http.StatusNoContent) (or adjust httputil.Success to special-case 204) to avoid sending a response body for 204.

Copilot uses AI. Check for mistakes.
Comment thread docs/FEATURES.md
Comment on lines +150 to +159
- **Global Scope**: Use Global Load Balancers (GLB) for multi-region traffic distribution across different regional ELBs.

### 9. Global Load Balancer (GLB) 🆕
**What it is**: Multi-region traffic steering at the DNS level.
**Tech Stack**: PowerDNS (LUA/Geo), Go, PostgreSQL.
**Implementation**:
- **GeoDNS Orchestration**: Dynamically manages DNS A/CNAME records based on regional health.
- **Routing Policies**: Supports `LATENCY`, `GEOLOCATION`, `WEIGHTED`, and `FAILOVER`.
- **Health Tracking**: Global probes monitor regional endpoints; unhealthy targets are automatically pulled from DNS.
- **Hybrid Support**: Can route to internal Regional LBs or external static IPs.
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section describes GLB health probes, automatic unhealthy endpoint removal, and routing to regional LBs, but the current implementation doesn’t perform health checking and the PowerDNS GeoDNS adapter only publishes IP endpoints (LB endpoints are ignored). Please align the documentation with current behavior or implement the missing features so the docs remain accurate.

Suggested change
- **Global Scope**: Use Global Load Balancers (GLB) for multi-region traffic distribution across different regional ELBs.
### 9. Global Load Balancer (GLB) 🆕
**What it is**: Multi-region traffic steering at the DNS level.
**Tech Stack**: PowerDNS (LUA/Geo), Go, PostgreSQL.
**Implementation**:
- **GeoDNS Orchestration**: Dynamically manages DNS A/CNAME records based on regional health.
- **Routing Policies**: Supports `LATENCY`, `GEOLOCATION`, `WEIGHTED`, and `FAILOVER`.
- **Health Tracking**: Global probes monitor regional endpoints; unhealthy targets are automatically pulled from DNS.
- **Hybrid Support**: Can route to internal Regional LBs or external static IPs.
- **Global Scope**: Use Global Load Balancers (GLB) for multi-region traffic distribution via DNS-based routing to regional endpoints.
### 9. Global Load Balancer (GLB) 🆕
**What it is**: Multi-region traffic steering at the DNS level.
**Tech Stack**: PowerDNS (LUA/Geo), Go, PostgreSQL.
**Implementation**:
- **GeoDNS Orchestration**: Manages DNS A/CNAME records for configured regional IP endpoints via PowerDNS GeoDNS.
- **Routing Policies**: Uses PowerDNS GeoDNS capabilities (for example, latency/geolocation-aware responses) applied to the statically configured regional endpoints.
- **Health & Failover**: The current implementation does not perform active health checks; endpoints must be updated manually or by external automation when their health changes.
- **Endpoint Support**: Publishes IP-based endpoints only; direct integration with internal Regional LBs as first-class GLB targets is not yet implemented.

Copilot uses AI. Check for mistakes.
Comment thread docs/api-reference.md
Comment on lines +337 to +341
## Global Load Balancers 🆕

**Headers Required:** `X-API-Key: <your-api-key>`

### GET /loadbalancers
List load balancers.
### GET /global-lb
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The API reference introduces Global Load Balancers but no longer documents the existing regional load balancer endpoints (e.g., /lb in swagger). Consider adding a separate section for regional LBs instead of replacing/removing them, so the API reference stays complete.

Copilot uses AI. Check for mistakes.
@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented Feb 3, 2026

Quality Gate Failed Quality Gate failed

Failed conditions
11 Security Hotspots

See analysis details on SonarQube Cloud

@poyrazK poyrazK merged commit 6b145d9 into main Feb 3, 2026
35 of 36 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants