fix: prevent Namespace informer cache errors and fallbacks#1686

Merged
EItanya merged 3 commits intokagent-dev:mainfrom
supreme-gg-gg:fix/controller-namespace
Apr 21, 2026

Conversation

@supreme-gg-gg
Contributor

Summary

In namespaced RBAC mode, any code path that read a Namespace object via the cached client would lazily start a cluster-scoped list/watch informer. Since a Role cannot grant access to cluster-scoped resources, this informer repeatedly fails with a 403 error. A simple repro is to list namespaces from the UI; the following then shows up repeatedly in the controller logs:

failed to list *v1.Namespace: namespaces is forbidden: User \"system:serviceaccount:kagent:kagent-controller\" cannot list resource \"namespaces\" in API group \"\" at the cluster

This PR disables controller-runtime's informer cache for Namespace objects, so Namespace reads go directly to the API server. That lets us catch authorization errors, use a fallback, and avoid the error loop. When available, watchedNamespaces serves as the fallback for the isInternalK8sURL and /api/namespaces call sites. With AllowedNamespaces.from=Selector, a clear error is returned instead, since reading namespace labels requires a cluster role.

Testing

The following scenarios have been manually tested:

  • Listing namespaces from the UI no longer causes UI or controller errors
  • The informer error spam above no longer appears
  • Cross-namespace references with AllowedNamespaces.from=All and AllowedNamespaces.from=Selector (both allowed and rejected paths, plus unwatched-namespace rejection)
  • isInternalK8sURL falls back to watchedNamespaces during agent translation

Alternatives

Other than catching the error, I also thought about using some mechanism to "inform" the controller whether it has cluster-wide access. I considered watchedNamespaces (which is always set when using namespaced RBAC) or adding another flag, but both involve complex logic or troublesome edge cases, so gracefully handling errors with a fallback seems the best solution to me right now.

Copilot AI review requested due to automatic review settings April 16, 2026 23:36
Contributor

Copilot AI left a comment

Pull request overview

Disables controller-runtime’s cached client for Namespace objects to prevent cluster-scoped Namespace informer 403 loops in namespaced RBAC mode, and adds watched-namespace-based fallbacks where Namespace reads may be forbidden.

Changes:

  • Configure manager client to bypass the informer cache for corev1.Namespace.
  • Add watched-namespace fallback logic to namespace listing (/api/namespaces) and internal-K8s-URL detection.
  • Improve behavior when AllowedNamespaces.from=Selector is used without Namespace read permissions (clear error) and add/extend related tests.
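The internal-URL fallback listed above might look roughly like the sketch below. It is an illustration under stated assumptions, not the PR's implementation: `isInternalURLSketch`, `namespaceExists`, and `errForbidden` are hypothetical names, and the host layout `<service>.<namespace>[.svc...]` is assumed rather than taken from the source.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// errForbidden is a stand-in (assumption) for a Kubernetes 403 response.
var errForbidden = errors.New("forbidden")

// namespaceExists is a hypothetical lookup standing in for a direct,
// uncached Namespace Get.
type namespaceExists func(name string) (bool, error)

// isInternalURLSketch treats the second host label as a namespace name.
// If the Namespace read is forbidden, membership in the watched
// namespaces decides the answer instead.
func isInternalURLSketch(raw string, exists namespaceExists, watched []string) (bool, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return false, err
	}
	parts := strings.Split(u.Hostname(), ".")
	if len(parts) < 2 {
		return false, nil // a bare host carries no namespace label
	}
	ns := parts[1]
	ok, err := exists(ns)
	if err == nil {
		return ok, nil
	}
	if errors.Is(err, errForbidden) {
		for _, w := range watched {
			if w == ns {
				return true, nil
			}
		}
		return false, nil
	}
	return false, err
}

func main() {
	forbidden := func(string) (bool, error) { return false, errForbidden }
	ok, err := isInternalURLSketch("http://tools.kagent.svc.cluster.local:8080/mcp", forbidden, []string{"kagent"})
	fmt.Println(ok, err) // prints "true <nil>"
}
```

A consequence of this design is that, under namespaced RBAC, only URLs pointing into watched namespaces are recognized as internal; anything else is treated as external rather than raising an error.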

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| go/core/pkg/app/app.go | Disables cached client for Namespace; passes watched namespaces into translator constructor. |
| go/core/internal/httpserver/handlers/namespaces.go | Adds Forbidden/Unauthorized fallback to configured watched namespaces; introduces helper for name-only responses. |
| go/core/internal/httpserver/handlers/handlers.go | Adds WatchedNamespaces onto handler base dependencies. |
| go/core/internal/httpserver/handlers/agents.go | Constructs translator with watched namespaces from handler base. |
| go/core/internal/httpserver/handlers/namespaces_test.go | Adds test covering fallback to watched namespaces when Namespace reads are forbidden. |
| go/core/internal/controller/translator/agent/adk_api_translator.go | Adds watched-namespace fallback in isInternalK8sURL; introduces new constructor variant. |
| go/core/internal/controller/translator/agent/proxy_test.go | Adds test ensuring proxy translation falls back to watched namespaces when Namespace reads are forbidden. |
| go/api/v1alpha2/common_types.go | Returns a clearer error when selector-based allowed namespaces can’t read namespace labels due to RBAC. |

Comment thread go/core/pkg/app/app.go Outdated
Comment on lines +419 to +427
Client: client.Options{
	Cache: &client.CacheOptions{
		// Prevent the cached client from starting a cluster-scoped
		// Namespace informer. In namespaced RBAC mode a Role cannot
		// grant access to cluster-scoped resources, so an informer
		// list/watch would keep crashing and cannot be handled.
		DisableFor: []client.Object{&corev1.Namespace{}},
	},
},
Contributor

Can this be optional based on the clusterRole setting? Otherwise caching this is definitely better

Signed-off-by: Jet Chiang <pokyuen.jetchiang-ext@solo.io>
Signed-off-by: Jet Chiang <pokyuen.jetchiang-ext@solo.io>
Signed-off-by: Jet Chiang <pokyuen.jetchiang-ext@solo.io>
@supreme-gg-gg supreme-gg-gg force-pushed the fix/controller-namespace branch from 3098a15 to 6ef0e38 Compare April 20, 2026 21:58
@EItanya EItanya merged commit b345356 into kagent-dev:main Apr 21, 2026
23 checks passed
Huimintai pushed a commit to Huimintai/kagent that referenced this pull request Apr 21, 2026
…v#1686)
