
Feast Operator should auto-create RBAC for Feast service account to access KubeRay cluster #6408

@ntkathole


Problem

When the Feast Operator deploys a FeatureStore CR with batchEngine configured to use a KubeRay cluster, the Feast server pods fail to connect to the Ray cluster because the Feast deployment's service account lacks the necessary RBAC permissions to interact with Ray resources.

Currently, users must manually create Roles and RoleBindings to grant the Feast service account access to rayclusters.ray.io and secrets resources. Without these grants, the Kubernetes API returns 403 Forbidden when the CodeFlare SDK attempts to discover and connect to the Ray cluster.

Scenario

A user creates a FeatureStore CR whose batchEngine references a Ray batch engine ConfigMap:

spec:
  batchEngine:
    configMapRef:
      name: feast-ray-batch-engine

The ConfigMap contains KubeRay configuration:

use_kuberay: true
kuberay_conf:
  cluster_name: feast-ray
  namespace: feast-demo

The Feast Operator creates a Deployment with service account feast-demo, but does not grant it any permissions to access Ray resources.

When materialization runs, the CodeFlare SDK (via kube-authkit) authenticates using the pod's service account and attempts to:

  • GET /apis/ray.io/v1/namespaces/feast-demo/rayclusters/feast-ray (to discover the Ray cluster)
  • GET /api/v1/namespaces/feast-demo/secrets (to retrieve mTLS certificates for the Ray client connection)

Both calls fail with 403 Forbidden because the service account has no RBAC grants for these resources.
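
To make the link between the ConfigMap and the failing requests explicit, here is a minimal Python sketch that derives the two API requests, and the RBAC attributes they are authorized against, from a kuberay_conf mapping. The helper name `required_access` is hypothetical and not part of Feast or the CodeFlare SDK. Note that a GET on a collection such as `/secrets` is evaluated as the `list` verb, while a GET on a named resource is evaluated as `get`.

```python
def required_access(kuberay_conf):
    """Map a kuberay_conf mapping to the API requests described above.

    Hypothetical helper for illustration only; it performs no network calls.
    """
    ns = kuberay_conf["namespace"]
    cluster = kuberay_conf["cluster_name"]
    return [
        {
            # GET on a single named object is authorized as verb "get".
            "request": f"GET /apis/ray.io/v1/namespaces/{ns}/rayclusters/{cluster}",
            "apiGroup": "ray.io",
            "resource": "rayclusters",
            "verb": "get",
        },
        {
            # GET on a collection is authorized as verb "list".
            "request": f"GET /api/v1/namespaces/{ns}/secrets",
            "apiGroup": "",  # the core API group is the empty string
            "resource": "secrets",
            "verb": "list",
        },
    ]


if __name__ == "__main__":
    # Values taken from the ConfigMap shown in the Scenario section.
    conf = {"cluster_name": "feast-ray", "namespace": "feast-demo"}
    for access in required_access(conf):
        print(access["verb"], access["resource"], "<-", access["request"])
```

Each entry names a verb and resource that a Role bound to the Feast service account would need to allow.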

Expected Behavior

When the Feast Operator detects that batchEngine is configured with a KubeRay cluster, it should automatically create the following RBAC resources for the Feast deployment's service account:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: <featurestore-name>-ray-access
rules:
- apiGroups: ["ray.io"]
  resources: ["rayclusters"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: <featurestore-name>-ray-secrets
rules:
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get", "list", "watch", "create", "update", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: <featurestore-name>-ray-access
subjects:
- kind: ServiceAccount
  name: <featurestore-name>
roleRef:
  kind: Role
  name: <featurestore-name>-ray-access
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: <featurestore-name>-ray-secrets
subjects:
- kind: ServiceAccount
  name: <featurestore-name>
roleRef:
  kind: Role
  name: <featurestore-name>-ray-secrets
  apiGroup: rbac.authorization.k8s.io

These resources should be:

  • Owned by the FeatureStore CR (via ownerReferences) so they are garbage-collected on deletion
  • Only created when batchEngine is configured with a KubeRay-type ConfigMap
  • Scoped to the FeatureStore namespace (namespace-scoped Roles, not ClusterRoles)
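
The three requirements above can be sketched as a small manifest builder. This is an illustration in Python rather than the operator's actual code: the function name `build_ray_rbac`, its parameters, the default `owner_api_version` value, and the choice to set `controller` and `blockOwnerDeletion` on the owner reference are all assumptions. The service account name is passed in explicitly because the operator's naming scheme is not specified here.

```python
RBAC_API = "rbac.authorization.k8s.io/v1"


def build_ray_rbac(fs_name, namespace, fs_uid, service_account,
                   batch_engine_conf,
                   owner_api_version="feast.dev/v1alpha1"):  # assumed default
    """Return Role/RoleBinding manifests, or [] if KubeRay is not configured."""
    # Requirement: only create RBAC for a KubeRay-type batch engine config.
    if not batch_engine_conf.get("use_kuberay"):
        return []

    # Requirement: owned by the FeatureStore CR so deletion garbage-collects.
    owner_ref = {
        "apiVersion": owner_api_version,
        "kind": "FeatureStore",
        "name": fs_name,
        "uid": fs_uid,
        "controller": True,          # assumption
        "blockOwnerDeletion": True,  # assumption
    }

    rules = {
        f"{fs_name}-ray-access": {
            "apiGroups": ["ray.io"],
            "resources": ["rayclusters"],
            "verbs": ["get", "list", "watch"],
        },
        f"{fs_name}-ray-secrets": {
            "apiGroups": [""],
            "resources": ["secrets"],
            "verbs": ["get", "list", "watch", "create", "update", "delete"],
        },
    }

    manifests = []
    for name, rule in rules.items():
        # Requirement: namespace-scoped Role and RoleBinding, no ClusterRole.
        metadata = {
            "name": name,
            "namespace": namespace,
            "ownerReferences": [owner_ref],
        }
        manifests.append({
            "apiVersion": RBAC_API,
            "kind": "Role",
            "metadata": metadata,
            "rules": [rule],
        })
        manifests.append({
            "apiVersion": RBAC_API,
            "kind": "RoleBinding",
            "metadata": dict(metadata),
            "subjects": [{
                "kind": "ServiceAccount",
                "name": service_account,
                "namespace": namespace,
            }],
            "roleRef": {
                "kind": "Role",
                "name": name,
                "apiGroup": "rbac.authorization.k8s.io",
            },
        })
    return manifests
```

Because an owner reference on a namespaced object can only point to an owner in the same namespace, owning these objects from the FeatureStore CR fits naturally with keeping them in the FeatureStore namespace.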

Current Workaround

Users must manually create the Roles and RoleBindings shown above before running materialization.


Labels: Operator (Feast operator related issues)
