Periscope v1.1 — five questions every EKS SRE bounces tabs to answer, in one place #216
gnana997
announced in
Announcements
v1.0 of Periscope shipped six months ago with one constraint at its core: a Kubernetes console that can never hold static AWS credentials. Pod Identity / IRSA only on EKS; agent-tunneled mTLS for everything else. That's the foundation.
Since then, the question I kept hearing from operators was the same shape every time: "I can connect to my clusters, but for the actual question I'm trying to answer, I'm still in the AWS console."
Five questions in particular. Today's v1.1 release closes the last one. The four before it shipped through v1.0's patch series.
The five questions
1. What CVEs are running in my pods?
You're triaging a vulnerability disclosure. The customer's security team wants a list of every workload affected. Without Periscope: `aws inspector2 list-findings` → page through 200+ raw findings → correlate by ECR image digest → manually map digests to pods.
With Periscope: every list page (Pods, Nodes, Karpenter, Workloads) carries an inline severity chip, `2C · 5H · 12M`, sourced from Amazon Inspector v2. Click into any pod / node / Deployment / StatefulSet / DaemonSet, the Security detail-pane tab opens, and the 200 raw CVEs are grouped server-side by package so the wall collapses to ~10 actionable bumps. Each group header carries the suggested upgrade version that closes every finding inside it, e.g. "upgrade `go/stdlib` 1.16.1 → 1.26.3 closes 116 CVEs."
Filter chips slice by severity / `exploits-only` / `fixable-only`. Refresh is entity-scoped and audited; reads aren't (CloudTrail records the underlying Inspector calls against periscope-server's role). One Helm flag (`inspector.enabled: true`) wires it on per cluster.
The grouping logic is server-side at `internal/cve/findings_group.go`, so the same wire shape feeds the SPA today and the future MCP / AI agent tool layer (v1.4 epic). One source of truth for "what to fix first." Docs →
2. Is this cluster ready to bump K8s minors?
EKS minor releases drop ~3 times a year. The pre-flight question is always the same: do I have any API-deprecation insights, are my managed node groups on the latest AMI, what add-ons need to move before the cluster upgrade?
Periscope surfaces all three on one page:
- EKS Upgrade Insights inline, with per-insight severity, the K8s-version step it blocks, and the remediation link.
- Managed node group AMI drift detection, so operators see at a glance which groups are weeks behind the latest AMI release (the rotation work that has to happen before the cluster bump).
- Add-on freshness (health, installed version, latest available, K8s-compat window) joined to each cluster's `DescribeAddonVersions`.
The whole readiness story is one route. No separate console. Docs →
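As a rough illustration of the add-on freshness check, the sketch below decides whether an installed add-on version would block a move to the next K8s minor. It is not Periscope's implementation: the `AddonVersion` type is a hypothetical, simplified stand-in for the compatibility data that `DescribeAddonVersions` returns, and the version strings are made up for the example.

```go
package main

import "fmt"

// AddonVersion is a hypothetical, simplified view of one add-on version and
// the cluster (K8s) versions it is compatible with.
type AddonVersion struct {
	Version         string
	ClusterVersions []string
}

// blocksUpgrade reports whether the installed add-on version lacks
// compatibility with the target K8s minor. An installed version that is
// missing from the catalog is treated as blocking (an assumption of this sketch).
func blocksUpgrade(installed string, catalog []AddonVersion, target string) bool {
	for _, v := range catalog {
		if v.Version != installed {
			continue
		}
		for _, cv := range v.ClusterVersions {
			if cv == target {
				return false
			}
		}
		return true
	}
	return true
}

func main() {
	// Illustrative data only; real values come from the EKS API.
	catalog := []AddonVersion{
		{Version: "v1.11.1-eksbuild.4", ClusterVersions: []string{"1.29", "1.30"}},
		{Version: "v1.11.3-eksbuild.1", ClusterVersions: []string{"1.30", "1.31"}},
	}
	fmt.Println(blocksUpgrade("v1.11.1-eksbuild.4", catalog, "1.31")) // true
	fmt.Println(blocksUpgrade("v1.11.3-eksbuild.1", catalog, "1.31")) // false
}
```

Treating an unknown installed version as blocking is the conservative choice for a pre-flight page: it surfaces the add-on for review instead of silently passing it.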
3. What add-ons are out of date — and can I upgrade them from here?
Three years ago this was a kubectl-and-pray operation. Today the EKS add-on API exposes a clean lifecycle, but the AWS console UX still makes you context-switch.
Periscope's EKS Add-ons page browses installed add-ons with health + installed version + latest available + compatibility window + "blocks next K8s minor" warnings. The catalog tab browses every AWS-published add-on for the cluster's K8s version, filterable by type. Install / upgrade / delete actions are wired with a schema-aware configuration editor (the values screen Periscope uses for Helm) and the `resolveConflicts` choice surfaced inline. The right-edge detail pane shows the operator's stored `configurationValues`, the add-on's IAM service-account role + Pod Identity associations, and full version history with one-click "upgrade to" on newer versions. Status-aware polling watches `CREATING` / `UPDATING` / `DELETING` flips so the UI stays in sync without manual refresh.
Docs →
4. Why isn't Karpenter scheduling this pod?
The Karpenter answer used to live in three places: `kubectl get nodepools` for the topology, `karpenter-controller` logs for the scheduling decisions, and `kubectl describe pod` for the `FailedScheduling` events. Periscope joins all three on one route.
- NodePool table with weight, disruption budgets, current/limit usage, per-pool `$/hr` and spot savings (the controller's `/metrics` exposition scraped via apiserver service-proxy under impersonation, so the scrape is per-user-authorized).
- NodeClaims grouped by NodePool with `Drifted` / `Initialized` / `Launched` conditions surfaced as badges; pools with any drifted claim auto-expand.
- Pending pods waiting on Karpenter with the per-NodePool incompatibility breakdown extracted from the `FailedScheduling` apiserver event. Operators no longer grep karpenter-controller logs to find out why a pod isn't being scheduled: each rejected NodePool renders inline next to the pod with the wrapped reason ("incompatible requirements," "insufficient capacity," "taint mismatch") parsed for them.
Auto-detected via the `karpenter.sh/v1` CRD probe: the sidebar entry only appears on Karpenter-enabled clusters. No-op on the others. Docs →
5. Which IAM grants does this pod have? Which workloads can do action X? (NEW in v1.1)
This is the question v1.0 didn't answer yet. EKS is the only managed Kubernetes where pods inherit cloud-provider identity through three layers (Kubernetes RBAC, IRSA or Pod Identity, then IAM), and the auth model itself just changed (`aws-auth` → Access Entries). The answers to "who can do what" live across the AWS Console, the IAM API, `kubectl`, and a ConfigMap nobody wants to touch.
v1.1 ships three surfaces that close the loop:
- Cluster Access page: reconciles EKS Access Entries with the legacy `kube-system/aws-auth` ConfigMap, builds a unified SA → IAM Role index across IRSA annotations and Pod Identity associations, surfaces the Pod Identity view by role, and rolls migration health into one chip on the header (`aws-auth-only · dual · entries-only` counts). The view EKS shipped without, especially if you're mid-migration.
- AWS Access tab on every Pod / ServiceAccount / Deployment / StatefulSet / DaemonSet detail pane: resolves the workload's identity chain server-side, fetches every inline + managed IAM policy attached to the bound role(s), groups by AWS service, and chips against an 18-action sensitive-permissions catalog (privilege-escalation, data, cross-account, destructive, cluster self-modify, wildcard). The "what does my `cron-rotator` actually grant in AWS" answer in one tab.
- Reverse lookup at `/clusters/<c>/reverse-lookup`: type an action (`s3:DeleteBucket`, `iam:PassRole`, anything else), get back every pod in the cluster whose bound IAM role grants it, with binding-source attribution (IRSA / Pod Identity) and a wildcard chip on rows where the role grants via `Action: "*"` rather than the explicit action. The "which workloads can do X?" answer.
All three are powered by the same server-side IAM policy resolution engine, which is exposed at `GET /api/identity/sensitive-catalog` as a stable cluster-agnostic wire shape so future MCP / AI agent integrations index against the same chip metadata the SPA renders from.
Blog walkthrough → · Docs →
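To make the reverse lookup concrete, here is a minimal sketch of the matching step, assuming each pod's bound role has already been flattened into a list of allowed action patterns. The `Binding` and `Hit` types and both function names are hypothetical, not Periscope's wire shape. The matcher follows IAM's documented behaviour for the `Action` element (case-insensitive, `*` and `?` wildcards). Unlike the chip described above, this sketch flags any wildcard pattern, not only `Action: "*"`, and it ignores `Deny`, `NotAction`, resources and conditions.

```go
package main

import (
	"fmt"
	"strings"
)

// matchAction reports whether an IAM action pattern matches a concrete action.
// '*' matches any run of characters, '?' matches exactly one.
func matchAction(pattern, action string) bool {
	p, a := strings.ToLower(pattern), strings.ToLower(action)
	pi, ai, star, mark := 0, 0, -1, 0
	for ai < len(a) {
		switch {
		case pi < len(p) && (p[pi] == '?' || p[pi] == a[ai]):
			pi++
			ai++
		case pi < len(p) && p[pi] == '*':
			star, mark = pi, ai // remember the star so we can backtrack to it
			pi++
		case star >= 0:
			mark++ // let the last star absorb one more character
			pi, ai = star+1, mark
		default:
			return false
		}
	}
	for pi < len(p) && p[pi] == '*' {
		pi++
	}
	return pi == len(p)
}

// Binding is a hypothetical flattened view of one pod's bound role.
type Binding struct {
	Pod     string
	Source  string   // "IRSA" or "Pod Identity"
	Actions []string // action patterns from the role's Allow statements
}

// Hit is one reverse-lookup row.
type Hit struct {
	Pod, Source string
	Wildcard    bool // true when only wildcard patterns grant the action
}

// reverseLookup returns every binding whose patterns grant the action.
func reverseLookup(bindings []Binding, action string) []Hit {
	var hits []Hit
	for _, b := range bindings {
		matched, explicit := false, false
		for _, pat := range b.Actions {
			if matchAction(pat, action) {
				matched = true
				explicit = explicit || !strings.ContainsAny(pat, "*?")
			}
		}
		if matched {
			hits = append(hits, Hit{Pod: b.Pod, Source: b.Source, Wildcard: !explicit})
		}
	}
	return hits
}

func main() {
	bindings := []Binding{
		{Pod: "cron-rotator", Source: "IRSA", Actions: []string{"s3:*", "kms:Decrypt"}},
		{Pod: "backup", Source: "Pod Identity", Actions: []string{"s3:DeleteBucket"}},
		{Pod: "web", Source: "IRSA", Actions: []string{"s3:GetObject"}},
	}
	for _, h := range reverseLookup(bindings, "s3:DeleteBucket") {
		fmt.Printf("%s via %s wildcard=%v\n", h.Pod, h.Source, h.Wildcard)
	}
	// cron-rotator via IRSA wildcard=true
	// backup via Pod Identity wildcard=false
}
```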
What v1.1 is honestly not (and what comes next)
Same shape as the v1.0 announcement's "what's missing" section. Better to call out the limits up front than have you discover them mid-incident.
v1.1's AWS Access surfaces show identity-side IAM grants, not effective access:
- No condition evaluation (`aws:RequestTag`, `aws:PrincipalTag`, `aws:SourceVpc`). We display the condition; we don't tell you whether it applies to a specific request.
- No SCPs or permission boundaries. Your AWS org may deny what the identity policies appear to allow.
- No `sts:AssumeRole` chain walking. We flag the action; we don't follow it across accounts.
- No resource-based policies (bucket policies, KMS key policies). Identity-side only.
- No fleet-wide reverse lookup (per-cluster in v1.1).
- No real-time IAM updates: 5-minute cache + manual refresh.
If you're security-auditing for compliance, treat the AWS Access tab as the upper bound of what the role's identity policies grant. The effective surface is smaller; the inventory on the identity side is exhaustive.
v1.3 is the effective-access release: conditions evaluated, SCPs + boundaries applied, cross-account `sts:AssumeRole` chains walked, fleet-wide reverse-lookup, CloudTrail-driven cache invalidation. Plus the AWS compliance lens (CloudTrail pod-correlation + cluster-wide kube-apiserver audit ingestion, so every action by every identity lands in one feed) and the related-resources graph so every detail pane gets a Related tab.
v1.2 lands first with a different focus: GPU + AI workload visibility (Pod ↔ GPU map, Idle GPU finder, DCGM reconciler; #202), in-browser cluster shell with single-row audit (#104), SSM into EKS nodes with per-user impersonation (#105), and Helm private OCI auth via Pod Identity / IRSA (#121).
v1.4 wraps the arc with an agent-native chat surface and an MCP-style tool registry over the wire shapes shipped in v1.1–v1.3 (#151).
The full release roadmap with theme bands is at periscopehq.dev/roadmap.
The integration philosophy, briefly
The five answers above aren't a feature checklist; they're a posture. Periscope's design principle is: if the question is about your EKS cluster, the answer lives in Periscope. The AWS Console becomes the place you go for write operations on AWS primitives (creating an IAM role from scratch, configuring an Inspector v2 organization, provisioning a NAT gateway) — not the place you go to read state about your running clusters.
The mechanism that makes this work without leaking credentials is the same one from v1.0:
- Cluster reads go through Pod Identity / IRSA, scoped to the periscope-server role
- Every action carries the human's OIDC identity via K8s impersonation, so the audit row says `alice@corp`, never `periscope-bot`
- Pre-flight `SelfSubjectAccessReview` checks so disabled UI explains itself instead of failing on click
- The same audit trail covers EKS surfaces, IAM surfaces, Inspector reads, and add-on writes: one log, one schema
The new v1.1 IAM-policy reads emit a new audit verb (`aws_iam_read`) distinct from the existing `aws_identity_read` (used by the cluster-identity endpoints). Operators with audit filters on `aws_identity_read` should add the new verb.
Try it
The image at `ghcr.io/gnana997/periscope:v1.1.0` is cosign-signed with SBOM + in-toto attestations on the GitHub Release.
If you'd rather see the surfaces light up against a throwaway cluster before integrating, the periscope-demo-eks-antipatterns repo provisions a small EKS cluster with all 12 of the IAM antipatterns the v1.1 surfaces detect (broad wildcards, dual-source SAs, orphan Pod Identity, stale IRSA annotations, broad OIDC trust, default-SA blind spot) plus four vulnerable container images for the Security tab. ~$2 to spin up and tear down in a single capture session.
The IAM grant periscope-server needs is documented at `docs/setup/cluster-rbac.md`, including the optional `iam:SimulatePrincipalPolicy` line that powers the locked-feature pane (tells operators exactly which IAM action is missing, instead of a generic 403, when permissions aren't fully wired yet).
Questions I'd love discussion on
A few that came up internally and I don't have firm answers to:
- Sensitive-permissions catalog: `dynamodb:Scan` on a broad scope? Specific KMS grant operations? I'd rather argue about a 19th entry than over-extend the default.
- `aws-auth` → Access Entries migration status: where are most teams? Early (mostly aws-auth still), mid (heavy dual), done (entries-only)? The migration-health chip surfaces per-cluster but I'm curious about the aggregate.
- What's the next EKS / AWS surface that would bring you back? That is, what's the sixth question that drives you to the AWS console that we haven't yet integrated? Honest "this is still painful" feedback is more useful than feature votes on the roadmap.
Drop a reply, open an issue, or grab the demo cluster and tell me what's broken.
— @gnana997