# RBAC Setup - Role-Based Access Control

This notebook introduces Weaviate's Role-Based Access Control (RBAC) system, which provides fine-grained access control based on user roles and permissions.

## What you'll learn:
- Basic RBAC concepts in Weaviate
- How to set up users and roles
- Creating custom permissions
- Managing access to collections and data

## Weaviate Python Client & RBAC


RBAC (Role-Based Access Control) requires:
- Weaviate v1.29+ (the release in which RBAC became generally available)
- Root user access or role management permissions
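
One way to confirm the version requirement is to compare the server's reported version against the minimum. The sketch below uses hypothetical helpers (`parse_version` and `meets_minimum` are not part of the Weaviate client) and assumes a version string shaped like `1.29.0`, such as the `version` field returned by `client.get_meta()`.

```python
import re


def parse_version(version: str) -> tuple[int, int]:
    """Extract (major, minor) from a string such as '1.29.0' or 'v1.30.1-rc.0'."""
    match = re.match(r"v?(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version: str, minimum: tuple[int, int] = (1, 29)) -> bool:
    """Return True when the version is at least the given (major, minor) pair."""
    return parse_version(version) >= minimum
```

With a connected client, this could be called as `meets_minimum(client.get_meta()["version"])`.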

In [None]:
import os
from dotenv import load_dotenv

load_dotenv()

WEAVIATE_KEY = os.getenv("WEAVIATE_KEY")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
OPENAI_URL = os.getenv("OPENAI_URL")

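# Show only a short prefix of each secret and tolerate missing variables
print(f"Weaviate Key: {(WEAVIATE_KEY or '')[:4]}...")
print(f"OpenAI API Key: {(OPENAI_API_KEY or '')[:20]}...")
print(f"OpenAI URL: {OPENAI_URL}")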



In [None]:
import weaviate
from weaviate.classes.init import Auth

# Connect to the local instance
client = weaviate.connect_to_local(
  host="127.0.0.1", # the address to the learner's instance
  port=8080,
  grpc_port=50051,
  auth_credentials=Auth.api_key(WEAVIATE_KEY),
  headers={
    "X-OpenAI-Api-Key": OPENAI_API_KEY
  }
)

print(client.is_ready())

In [None]:
# Creating a local config for later
LOCAL_CONFIG = {
    "host": "localhost",
    "port": 8080,
    "grpc_port": 50051,
    "headers": {
        "X-OpenAI-Api-Key": OPENAI_API_KEY,
        "X-OpenAI-BaseURL": OPENAI_URL
    }
}

## Understanding RBAC Components

* [Weaviate Docs - Role-Based Access Control (RBAC) Overview](https://docs.weaviate.io/weaviate/configuration/rbac)
* [Weaviate Docs - Managing Users](https://docs.weaviate.io/weaviate/configuration/rbac/manage-users)

Weaviate RBAC consists of three main components:

1. **Users**: Individual accounts with API keys
2. **Roles**: Collections of permissions
3. **Permissions**: Specific actions on resources
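
To make the relationship between the three components concrete, here is a toy model in plain Python. It is not the Weaviate API: the `Permission`, `Role`, and `User` classes are hypothetical, and the sketch assumes that a user's effective permissions are the union of the permissions of all assigned roles.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Permission:
    collection: str  # resource the permission applies to
    action: str      # e.g. "read", "create", "update", "delete"


@dataclass
class Role:
    name: str
    permissions: set[Permission] = field(default_factory=set)


@dataclass
class User:
    user_id: str
    roles: list[Role] = field(default_factory=list)

    def effective_permissions(self) -> set[Permission]:
        """Combine the permissions of every role assigned to this user."""
        combined: set[Permission] = set()
        for role in self.roles:
            combined |= role.permissions
        return combined

    def can(self, action: str, collection: str) -> bool:
        """Check whether any assigned role grants the action on the collection."""
        return Permission(collection, action) in self.effective_permissions()


# Two illustrative roles assigned to one illustrative user
reader = Role("reader", {Permission("PublicInfo", "read")})
editor = Role("editor", {Permission("PublicInfo", "update")})
sam = User("sam", [reader, editor])
```

Here `sam.can("read", "PublicInfo")` and `sam.can("update", "PublicInfo")` both hold because each is granted by one of the two roles, while anything not granted by a role is denied.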

### Predefined Roles

* [Weaviate Docs - Managing Roles](https://docs.weaviate.io/weaviate/configuration/rbac/manage-roles)

The predefined roles are:

- `root`: Full system access
- `viewer`: Read-only access to all resources

## Create a Test Collection

Let's create a collection we can use for our RBAC exercises.

In [None]:
from weaviate.classes.config import Configure, Property, DataType

# Clean up any existing collections
collection_names = ["CompanyData", "PublicInfo"]
for name in collection_names:
    if client.collections.exists(name):
        client.collections.delete(name)

# Create CompanyData collection (sensitive data)
client.collections.create(
    name = "CompanyData",
    vector_config=Configure.Vectors.self_provided(),
    properties=[
        Property(name="employee_name", data_type=DataType.TEXT),
        Property(name="salary", data_type=DataType.NUMBER),
        Property(name="department", data_type=DataType.TEXT),
    ],
)

# Create PublicInfo collection (public data)
client.collections.create(
   name = "PublicInfo",
    vector_config=Configure.Vectors.self_provided(),
    properties=[
        Property(name="title", data_type=DataType.TEXT),
        Property(name="content", data_type=DataType.TEXT),
    ],
)

print("Collections created successfully!")

## Add Sample Data

Let's add some sample data to our collections for testing.

In [None]:
# Add sample data to our collections
company_data = client.collections.get("CompanyData")
company_data.data.insert_many([
    {
        "employee_name": "Alice Johnson",
        "salary": 85000,
        "department": "Engineering"
    },
    {
        "employee_name": "Jim Smith", 
        "salary": 92000,
        "department": "Engineering"
    },
    {
        "employee_name": "Carol Davis",
        "salary": 78000,
        "department": "Marketing"
    }
])

# Add data to PublicInfo
public_info = client.collections.get("PublicInfo")
public_info.data.insert_many([
    {
        "title": "Company Mission",
        "content": "To build the world's best vector database"
    },
    {
        "title": "Office Hours", 
        "content": "Monday to Friday, 9 AM to 5 PM"
    }
])

print(f"CompanyData objects: {company_data.aggregate.over_all().total_count}")
print(f"PublicInfo objects: {public_info.aggregate.over_all().total_count}")

## List Existing Roles

Let's see what roles are currently available.

> **Note**: If RBAC is not enabled on your Weaviate instance, this will show an error - that's expected!
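
Whether RBAC is enabled is decided by the server's configuration, not by the client. The sketch below checks a dictionary of environment settings for an API-key setup like the one in this notebook. The helper `rbac_config_problems` is hypothetical, and the variable names (`AUTHENTICATION_APIKEY_ENABLED`, `AUTHORIZATION_RBAC_ENABLED`, `AUTHORIZATION_RBAC_ROOT_USERS`) are assumptions about a typical Docker-based deployment that should be confirmed against the Weaviate configuration docs.

```python
def rbac_config_problems(env: dict[str, str]) -> list[str]:
    """Return human-readable problems found in the given environment settings."""
    problems = []
    if env.get("AUTHENTICATION_APIKEY_ENABLED", "").lower() != "true":
        problems.append("API key authentication is not enabled")
    if env.get("AUTHORIZATION_RBAC_ENABLED", "").lower() != "true":
        problems.append("RBAC authorization is not enabled")
    if not env.get("AUTHORIZATION_RBAC_ROOT_USERS", "").strip():
        problems.append("no root users are configured")
    return problems


# Illustrative values only; real deployments will use their own user names
example_env = {
    "AUTHENTICATION_APIKEY_ENABLED": "true",
    "AUTHORIZATION_RBAC_ENABLED": "true",
    "AUTHORIZATION_RBAC_ROOT_USERS": "root-user",
}

for problem in rbac_config_problems(example_env):
    print(f"- {problem}")
```

An empty result means none of the three checks failed. It does not prove the server is configured correctly.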

In [None]:
# Clean up existing custom roles first
custom_roles_to_delete = ["ceo", "dept_manager", "employee", "hr_manager"]

existing_roles = client.roles.list_all()
for role_name in custom_roles_to_delete:
    if role_name in existing_roles:
        client.roles.delete(role_name=role_name)
            

In [None]:
roles = client.roles.list_all()
print("Current roles:")
for role in roles:
    print(f"- {role}")

## Create Custom Roles

Now let's create some custom roles for different types of users in our organization.

In [None]:
from weaviate.classes.rbac import Permissions

# Role 1: HR Manager - Can access all company data
hr_permissions = [
    Permissions.data(
        collection="CompanyData",
        create=True,
        read=True,
        update=True,
        delete=True
    ),
    Permissions.data(
        collection="PublicInfo",
        read=True
    )
]

client.roles.create(role_name="hr_manager", permissions=hr_permissions)
print("Created 'hr_manager' role")

# Role 2: Employee - Can only read public information
employee_permissions = [
    Permissions.data(
        collection="PublicInfo",
        read=True
    )
]

client.roles.create(role_name="employee", permissions=employee_permissions)
print("Created 'employee' role")

# Role 3: Department Manager - Can read company data but not modify
dept_manager_permissions = [
    Permissions.data(
        collection="CompanyData",
        read=True
    ),
    Permissions.data(
        collection="PublicInfo",
        read=True
    )
]

client.roles.create(role_name="dept_manager", permissions=dept_manager_permissions)
print("Created 'dept_manager' role")

In [None]:
# Check what roles were created
all_roles = client.roles.list_all()
print("Created roles:")
for role_name, role in all_roles.items():
    print(f"- {role_name}")

In [None]:
all_roles = client.roles.list_all()
for role_name, role in all_roles.items():
    print(role_name, role)

In [None]:
# Create users and assign roles

# Clean up any existing user first
existing_users = client.users.db.list_all()
existing_user_ids = [user.user_id for user in existing_users]

if "hr_alice" in existing_user_ids:
    client.users.db.delete(user_id="hr_alice")
    print("Deleted existing hr_alice user")

# Create new user; create() returns the user's API key
hr_alice_key = client.users.db.create(user_id="hr_alice")
print("Created user: hr_alice")

client.users.db.assign_roles(user_id="hr_alice", role_names=["hr_manager"])
print("Assigned hr_manager role to hr_alice")

print(f"API Key: {hr_alice_key[:20]}...")

In [None]:
# Create more users and assign roles

# Clean up any existing users first
existing_users = client.users.db.list_all()
existing_user_ids = [user.user_id for user in existing_users]

if "employee_jim" in existing_user_ids:
    client.users.db.delete(user_id="employee_jim")
    print("Deleted existing employee_jim user")

if "manager_carol" in existing_user_ids:
    client.users.db.delete(user_id="manager_carol")
    print("Deleted existing manager_carol user")

# Create employee_jim
employee_jim_key = client.users.db.create(user_id="employee_jim")
print(f"Created user: employee_jim")

client.users.db.assign_roles(user_id="employee_jim", role_names=["employee"])
print(f"Assigned employee role to employee_jim")

# Create manager_carol
manager_carol_key = client.users.db.create(user_id="manager_carol")
print(f"Created user: manager_carol")

client.users.db.assign_roles(user_id="manager_carol", role_names=["dept_manager"])
print(f"Assigned dept_manager role to manager_carol")

print(f"Jim API Key: {employee_jim_key[:20]}...")
print(f"Carol API Key: {manager_carol_key[:20]}...")

## Test Access Control

Now let's test our RBAC setup by connecting as different users and trying to access data.
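
Before running the tests, it helps to write down which calls should succeed. The sketch below restates the three role definitions from this notebook as a plain dictionary and derives the expected outcome of each test. The helper `is_expected_to_succeed` is hypothetical, and the sketch assumes that inserting an object requires the `create` data permission.

```python
# Data permissions per role, copied from the role definitions in this notebook
ROLE_PERMISSIONS = {
    "hr_manager": {
        "CompanyData": {"create", "read", "update", "delete"},
        "PublicInfo": {"read"},
    },
    "employee": {
        "PublicInfo": {"read"},
    },
    "dept_manager": {
        "CompanyData": {"read"},
        "PublicInfo": {"read"},
    },
}


def is_expected_to_succeed(role: str, action: str, collection: str) -> bool:
    """Return True when the role grants the action on the collection."""
    return action in ROLE_PERMISSIONS.get(role, {}).get(collection, set())


# Print the expected outcome for the read and insert ("create") tests
for role in ROLE_PERMISSIONS:
    for collection in ("PublicInfo", "CompanyData"):
        for action in ("read", "create"):
            allowed = is_expected_to_succeed(role, action, collection)
            outcome = "allowed" if allowed else "denied"
            print(f"{role:<13} {action:<7} {collection:<12} {outcome}")
```

Any test result that disagrees with this table points to a role that was defined or assigned differently than intended.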

### Test with the HR user: full access to CompanyData, but read-only access to PublicInfo

In [None]:
# Create new client with hr_alice's API key

alice_client = weaviate.connect_to_local(
    **LOCAL_CONFIG,
    auth_credentials=Auth.api_key(hr_alice_key),
)

alice_client.is_ready()

In [None]:
# Test 1: Try to read PublicInfo 

public_collection = alice_client.collections.get("PublicInfo")
result = public_collection.query.fetch_objects(limit=1)
    
print(result)

In [None]:
# Test 2: Try to read CompanyData 

company_collection = alice_client.collections.get("CompanyData")
result = company_collection.query.fetch_objects(limit=1)

print(result)

In [None]:
# Test 3: Try to insert into CompanyData 

company_collection = alice_client.collections.get("CompanyData")
company_collection.data.insert({
    "employee_name": "Test Employee",
    "salary": 50000,
    "department": "Test"
})

In [None]:
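# Test 4: Try to insert into PublicInfo
# Expected to be denied: hr_manager only has read access to PublicInfo

public_collection = alice_client.collections.get("PublicInfo")
try:
    public_collection.data.insert({
        "title": "Donut Wednesdays",
        "content": "Free donuts in the break room all day on Wednesdays"
    })
finally:
    # Close Alice's connection even when the insert is rejected
    alice_client.close()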

In [None]:
# Set up friendly error handling for Jupyter
from IPython.core.interactiveshell import InteractiveShell
from weaviate.exceptions import InsufficientPermissionsError

def custom_exception_handler(self, exc_type, exc_value, exc_traceback, tb_offset=None):
    if issubclass(exc_type, InsufficientPermissionsError):
        print("🧙🏻‍♂️You shall not pass - insufficient permissions!")
        return
    
    # For all other exceptions, use the default handler
    self.showtraceback((exc_type, exc_value, exc_traceback), tb_offset=tb_offset)

# Get the current IPython instance and override the exception handler
shell = InteractiveShell.instance()
shell.set_custom_exc((InsufficientPermissionsError,), custom_exception_handler)

# Important: this replaces the traceback for every InsufficientPermissionsError (HTTP 403),
# including those raised by schema operations

### Now test with the employee user, who should have limited access based on role and permissions
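
Several calls in this section are expected to be denied. The notebook-wide exception handler set up earlier covers them, but a narrower alternative is to catch the error only around the calls that should fail. The sketch below shows that pattern with a context manager. `expect_denied` is a hypothetical helper, and `PermissionDenied` is a stand-in class so the example runs without a server. With a live instance, `weaviate.exceptions.InsufficientPermissionsError` could be passed as `error_type` instead.

```python
from contextlib import contextmanager


class PermissionDenied(Exception):
    """Stand-in for weaviate.exceptions.InsufficientPermissionsError."""


@contextmanager
def expect_denied(description: str, error_type: type[Exception] = PermissionDenied):
    """Report whether the wrapped block was rejected with error_type."""
    outcome = {"denied": False}
    try:
        yield outcome
    except error_type:
        # The expected error is swallowed here; other exceptions still propagate
        outcome["denied"] = True
        print(f"Denied as expected: {description}")
    else:
        print(f"Unexpectedly allowed: {description}")


# Simulated denied call
with expect_denied("employee reads CompanyData") as first:
    raise PermissionDenied("forbidden")

# Simulated allowed call
with expect_denied("employee reads PublicInfo") as second:
    pass
```

Unlike the global handler, this leaves every other traceback in the notebook untouched.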

In [None]:
# Create new client with Jim's API key

jim_client = weaviate.connect_to_local(
    **LOCAL_CONFIG,
    auth_credentials=Auth.api_key(employee_jim_key),
)

jim_client.is_ready()

In [None]:
# Test 1: Try to read PublicInfo 

public_collection = jim_client.collections.get("PublicInfo")
result = public_collection.query.fetch_objects(limit=1)
    
print(result)

In [None]:
# Test 2: Try to read CompanyData 

company_collection = jim_client.collections.get("CompanyData")
result = company_collection.query.fetch_objects(limit=1)

print(result)

In [None]:
# Test 3: Try to insert into CompanyData (expected to be denied for the employee role)

company_collection = jim_client.collections.get("CompanyData")
try:
    company_collection.data.insert({
        "employee_name": "Test Employee 1",
        "salary": 50000,
        "department": "Test"
    })
finally:
    # Close Jim's connection even when the insert is rejected
    jim_client.close()

### Now test with a manager, who should have more access than the employee but less than HR

In [None]:
# Test manager_carol access (Manager)

# Create new client with manager_carol's API key
carol_client = weaviate.connect_to_local(
    **LOCAL_CONFIG,
    auth_credentials=Auth.api_key(manager_carol_key),
)

# Test 1: Try to read PublicInfo (should work)
print("Testing PublicInfo access...")
try:
    public_collection = carol_client.collections.get("PublicInfo")
    result = public_collection.query.fetch_objects(limit=1)
    print("✅ Can read PublicInfo")
except Exception as e:
    print(f"❌ Cannot read PublicInfo: {e}")

# Test 2: Try to read CompanyData (should work: dept_manager has read access)
print("Testing CompanyData access...")
try:
    company_collection = carol_client.collections.get("CompanyData")
    result = company_collection.query.fetch_objects(limit=1)
    print("✅ Can read CompanyData")
except Exception as e:
    print(f"❌ Cannot read CompanyData: {e}")

# Test 3: Try to insert into CompanyData (should fail: dept_manager is read-only)
print("Testing CompanyData insert...")
try:
    company_collection = carol_client.collections.get("CompanyData")
    company_collection.data.insert({
        "employee_name": "Test Employee",
        "salary": 50000,
        "department": "Test"
    })
    print("✅ Can insert into CompanyData")
except Exception as e:
    print(f"❌ Cannot insert into CompanyData: {e}")

carol_client.close()

# BONUS: Create a Super User with Data Access to All Collections
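
The `ceo` role in this section uses `collection="*"` so that one permission entry covers every collection. As a rough mental model, the sketch below approximates wildcard matching with Python's `fnmatchcase`. This is an assumption made for illustration: Weaviate's own matching rules may differ, and `collection_matches` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def collection_matches(pattern: str, collection: str) -> bool:
    """Return True when the collection name fits the (case-sensitive) pattern."""
    return fnmatchcase(collection, pattern)


# "*" covers every collection, while a prefix pattern covers only a subset
for pattern in ("*", "Company*", "PublicInfo"):
    matched = [
        name
        for name in ("CompanyData", "CompanyDataNew", "PublicInfo")
        if collection_matches(pattern, name)
    ]
    print(f"{pattern!r} matches {matched}")
```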

In [None]:
ceo_permissions = [
    Permissions.data(
        collection="*", #This will give access to all collections
        create=True,
        read=True,
        update=True,
        delete=True
    ),
]

client.roles.create(role_name="ceo", permissions=ceo_permissions)
print("Created 'ceo' role")

In [None]:
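# Clean up any existing user first
existing_user_ids = [user.user_id for user in client.users.db.list_all()]

if "ceo_bob" in existing_user_ids:
    client.users.db.delete(user_id="ceo_bob")
    print("Deleted existing ceo_bob user")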

# Create new user
ceo_bob_key = client.users.db.create(user_id="ceo_bob")
print(f"Created user: ceo_bob")

client.users.db.assign_roles(user_id="ceo_bob", role_names=["ceo"])
print(f"Assigned ceo role to ceo_bob")

print(f"API Key: {ceo_bob_key[:20]}...")

In [None]:
# Test: insert into PublicInfo as ceo_bob (allowed by the wildcard data permission)
bob_client = weaviate.connect_to_local(
    **LOCAL_CONFIG,
    auth_credentials=Auth.api_key(ceo_bob_key),
)

public_collection = bob_client.collections.get("PublicInfo")
public_collection.data.insert({
        "title": "Vinyl Wednesdays",
        "content": "Vinyl only music played in the breakroom on Wednesdays"
    })


bob_client.close()

# BONUS: Creating Collections
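
The cells in this section run the same `collections.create` call as `hr_alice`, as `ceo_bob`, and as the root client. The custom roles in this notebook grant only data permissions (object-level create, read, update, delete), which are a separate permission type from collection-level (schema) permissions, so only the root client is expected to succeed. The toy check below illustrates that separation. The `GRANTS` table and `allowed` helper are hypothetical and do not reflect how Weaviate stores permissions.

```python
DATA_ACTIONS = {"create", "read", "update", "delete"}

# Each grant is (permission type, collection pattern, action)
GRANTS = {
    "hr_manager": {("data", "CompanyData", action) for action in DATA_ACTIONS}
    | {("data", "PublicInfo", "read")},
    "ceo": {("data", "*", action) for action in DATA_ACTIONS},
}


def allowed(role: str, permission_type: str, collection: str, action: str) -> bool:
    """Check for an exact grant or a '*' grant of the same permission type."""
    grants = GRANTS.get(role, set())
    return (
        (permission_type, collection, action) in grants
        or (permission_type, "*", action) in grants
    )


# A wildcard data grant lets ceo insert objects anywhere...
print(allowed("ceo", "data", "PublicInfo", "create"))
# ...but says nothing about creating collections
print(allowed("ceo", "collections", "CompanyDataNew", "create"))
```

In the Python client, collection-level permissions are expressed with `Permissions.collections(...)` (for example with `create_collection=True`). Adding such a permission to a role is what would let a non-root user create `CompanyDataNew`.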

In [None]:
# hr_alice holds data permissions only, so creating a collection is expected to be denied
alice_client = weaviate.connect_to_local(
    **LOCAL_CONFIG,
    auth_credentials=Auth.api_key(hr_alice_key),
)

alice_client.collections.create(
    name = "CompanyDataNew",
    vector_config=Configure.Vectors.self_provided(),
    properties=[
        Property(name="employee_name", data_type=DataType.TEXT),
        Property(name="salary", data_type=DataType.NUMBER),
        Property(name="department", data_type=DataType.TEXT),
    ],
)

In [None]:
# ceo_bob's wildcard covers data operations only, so this is also expected to be denied
bob_client = weaviate.connect_to_local(
    **LOCAL_CONFIG,
    auth_credentials=Auth.api_key(ceo_bob_key),
)

bob_client.collections.create(
    name = "CompanyDataNew",
    vector_config=Configure.Vectors.self_provided(),
    properties=[
        Property(name="employee_name", data_type=DataType.TEXT),
        Property(name="salary", data_type=DataType.NUMBER),
        Property(name="department", data_type=DataType.TEXT),
    ],
)


In [None]:
# The root client can manage collections, so this create call should succeed
if client.collections.exists("CompanyDataNew"):
    client.collections.delete("CompanyDataNew")

client.collections.create(
    name = "CompanyDataNew",
    vector_config=Configure.Vectors.self_provided(),
    properties=[
        Property(name="employee_name", data_type=DataType.TEXT),
        Property(name="salary", data_type=DataType.NUMBER),
        Property(name="department", data_type=DataType.TEXT),
    ],
)

## Close the Client

In [None]:
alice_client.close()

In [None]:
bob_client.close()

In [None]:
client.close()