# L3 M2.2: Authorization & Multi-Tenant Access Control

**Track:** GCC Compliance Basics  
**Module:** M2 - Security & Access Control  
**Video:** M2.2 - Authorization & Multi-Tenant Access Control  
**Level:** L3 (Production-Ready)

---

## LEARNING ARC

By completing this notebook, you will:

1. **Design and implement RBAC** with three roles (Admin, Analyst, Compliance Officer) for RAG operations
2. **Build namespace-based multi-tenant isolation** in Pinecone vector database
3. **Configure ABAC using Open Policy Agent (OPA)** for context-aware access control
4. **Prove zero cross-tenant data leakage** through penetration testing and namespace enforcement
5. **Implement immutable audit logging** with 7-year retention for regulatory compliance
6. **Understand the distinction** between authentication (who are you) and authorization (what can you access)
7. **Deploy policy-as-code** using Rego language for version-controlled authorization rules

---

## PREREQUISITES

- ✅ Completed Generic CCC M1-M4 (RAG MVP implementation)
- ✅ Completed GCC Compliance M2.1 (Authentication & Identity Management)
- ✅ Understanding of OAuth 2.0/OIDC and JWT tokens
- ✅ Basic knowledge of multi-tenant architectures

---

**TIME:** 40-45 minutes

**ARCHITECTURE:**
```
User Request (JWT) → RBAC Check → ABAC Policy → Namespace Isolation → Audit Log → Response
```

In [None]:
# Environment Setup and OFFLINE Mode Guard

import os
import sys
from pathlib import Path

# Add parent directory to path for imports
parent_dir = Path().resolve().parent
sys.path.insert(0, str(parent_dir))

# Import from package
from src.l3_m2_security_access_control import (
    AuthorizationManager,
    NamespaceManager,
    AuditLogger,
    query_with_authorization,
    check_rbac_permission,
    evaluate_abac_policy,
)

# Check service availability
try:
    import config
    PINECONE_ENABLED = config.PINECONE_ENABLED
    POSTGRES_ENABLED = config.POSTGRES_ENABLED
    OPA_ENABLED = config.OPA_ENABLED
except (ImportError, AttributeError):
    # config is missing or incomplete: fall back to offline demo mode
    PINECONE_ENABLED = False
    POSTGRES_ENABLED = False
    OPA_ENABLED = False

if not PINECONE_ENABLED:
    print("⚠️ Pinecone disabled - running in demo mode with mock data")
if not POSTGRES_ENABLED:
    print("⚠️ PostgreSQL disabled - using in-memory storage")
if not OPA_ENABLED:
    print("⚠️ OPA disabled - using RBAC only (no ABAC)")

print("\n✅ Environment check complete")
print(f"   Services: Pinecone={PINECONE_ENABLED}, PostgreSQL={POSTGRES_ENABLED}, OPA={OPA_ENABLED}")

## Section 1: Introduction & Hook

### The Multi-Tenant Challenge

You've deployed a RAG system for your organization's Government Community Cloud (GCC). M2.1 solved **authentication** - you know WHO each user is via JWT tokens.

But now you face a critical question:

**"What happens when a Finance analyst queries the system and accidentally (or maliciously) tries to access HR employee records?"**

Without proper authorization:
- ❌ Cross-tenant data leakage
- ❌ Regulatory violations (SOX, GDPR, DPDPA)
- ❌ Audit trail gaps
- ❌ No context-aware access control

### The Building Badge Analogy

- **Authentication (M2.1):** Your badge proves WHO you are
- **Authorization (M2.2):** Your badge level determines WHAT floors you can access

**Example:**
- Finance analyst → Access only `finance-prod` namespace
- HR analyst → Access only `hr-prod` namespace
- Admin → Access ALL namespaces
- Compliance Officer → Read-only access to ALL + audit logs

## Section 2: Conceptual Foundation

### Authentication vs. Authorization

| Aspect | Authentication (M2.1) | Authorization (M2.2) |
|--------|----------------------|----------------------|
| **Question** | Who are you? | What can you access? |
| **Technology** | OAuth 2.0, JWT, OIDC | RBAC, ABAC, OPA |
| **Output** | User identity + claims | Permission decision |
| **Example** | "alice@company.com" | "Can alice query finance-prod?" |

### RBAC (Role-Based Access Control)

**Three-Role Hierarchy:**

1. **Admin:**
   - Full control over all namespaces
   - User management (assign users to namespaces)
   - Policy configuration

2. **Analyst:**
   - Query access ONLY to assigned namespace
   - No cross-tenant access
   - Typical use: Finance analyst queries `finance-prod`

3. **Compliance Officer:**
   - Read-only access across ALL namespaces
   - Audit log export permissions
   - Cannot modify data
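
The three roles above can be sketched as a small lookup table. This is only an illustration of the idea, not the implementation behind `check_rbac_permission`: the `Permission` names, the `ROLE_PERMISSIONS` table, and `is_permitted` are all hypothetical, and the exact permission set for each role is an assumption read off the bullet points.

```python
from enum import Enum


class Permission(Enum):
    QUERY = "query"
    MODIFY_DATA = "modify_data"
    MANAGE_USERS = "manage_users"
    CONFIGURE_POLICY = "configure_policy"
    EXPORT_AUDIT = "export_audit"


# Hypothetical role table: which actions a role may take, and whether
# the role may reach namespaces other than its own.
ROLE_PERMISSIONS = {
    "admin": {
        "permissions": {
            Permission.QUERY,
            Permission.MODIFY_DATA,
            Permission.MANAGE_USERS,
            Permission.CONFIGURE_POLICY,
        },
        "all_namespaces": True,
    },
    "analyst": {
        "permissions": {Permission.QUERY},
        "all_namespaces": False,
    },
    "compliance_officer": {
        "permissions": {Permission.QUERY, Permission.EXPORT_AUDIT},
        "all_namespaces": True,
    },
}


def is_permitted(role, permission, user_namespace, target_namespace):
    """Return True only if the role holds the permission AND may reach the namespace."""
    entry = ROLE_PERMISSIONS.get(role)
    if entry is None:
        return False  # unknown role: deny by default
    if permission not in entry["permissions"]:
        return False
    return entry["all_namespaces"] or user_namespace == target_namespace
```

Keeping the "which action" check separate from the "which namespace" check makes it harder for a read-only role such as Compliance Officer to gain write access just because it can see every namespace.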

### ABAC (Attribute-Based Access Control)

Context-aware policies using Open Policy Agent (OPA):

```rego
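# Example OPA policy (OPA 1.0+ requires the form `allow if { ... }`)
allow {
    input.user.role == "analyst"
    input.user.location == "US"
    input.resource.classification == "internal"
    # helper rule, assumed to be defined elsewhere in the policy
    time_within_business_hours
}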
```

**Evaluated Attributes:**
- User: role, location, department
- Resource: namespace, classification (confidential/internal)
- Environment: time, IP address, device type
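
To make the attribute list concrete, here is a minimal sketch of how those attributes might be packed into an OPA `input` document and wrapped in a request for OPA's Data API (`POST /v1/data/<path>` with a JSON body of the form `{"input": ...}`). The helper names, the `rag/authz` policy path, and the field names inside `environment` are assumptions for illustration. The request is only built, not sent.

```python
import json
from urllib.request import Request

# Assumed local OPA address and a hypothetical policy path (package rag.authz, rule allow).
OPA_URL = "http://localhost:8181/v1/data/rag/authz/allow"


def build_opa_input(role, location, department, namespace,
                    classification, time_of_day, ip, device_type):
    """Group attributes into the user / resource / environment shape the policy reads."""
    return {
        "user": {"role": role, "location": location, "department": department},
        "resource": {"namespace": namespace, "classification": classification},
        "environment": {"time": time_of_day, "ip": ip, "device_type": device_type},
    }


def build_opa_request(opa_input, url=OPA_URL):
    """Wrap the input document in the {"input": ...} envelope the Data API expects."""
    body = json.dumps({"input": opa_input}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


opa_input = build_opa_input(
    role="analyst", location="US", department="finance",
    namespace="finance-prod", classification="internal",
    time_of_day="business_hours", ip="10.0.1.50", device_type="corporate_laptop",
)
request = build_opa_request(opa_input)
# Sending it would be urllib.request.urlopen(request); OPA replies with a JSON
# object whose "result" key holds the value of the queried rule.
```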

## Section 3: Technology Stack & Setup

### Core Components

1. **Pinecone Vector Database:** Multi-tenant namespace isolation
2. **PostgreSQL 15:** User/role/permission database + immutable audit logs
3. **Open Policy Agent (OPA):** ABAC policy engine
4. **FastAPI:** REST API framework
5. **JWT Tokens:** From M2.1 authentication

### Architecture Flow

```
1. User submits query with JWT token
   ↓
2. Extract user_id, role, namespace from token
   ↓
3. RBAC: Check role-permission mapping
   ↓
4. ABAC: Evaluate OPA policy (if enabled)
   ↓
5. Execute Pinecone query with namespace filter
   ↓
6. Write immutable audit log
   ↓
7. Return results or 403 Forbidden
```
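
The flow above can be sketched as a single function that takes each stage as a callable. This is a simplified illustration, not the body of `query_with_authorization`: `run_pipeline`, the claim keys (`sub`, `role`, `namespace`), and the shape of the audit entry are assumptions, and JWT validation (step 1) is assumed to have happened upstream.

```python
from datetime import datetime, timezone


def run_pipeline(claims, target_namespace, query, rbac, search, audit, abac=None):
    """Steps 2-7 of the flow, with each stage injected as a callable."""
    # Step 2: read identity fields from already-verified claims (key names assumed)
    user_id, role, home_ns = claims["sub"], claims["role"], claims["namespace"]

    # Step 3: RBAC decision
    policy_used = "RBAC"
    allowed = rbac(role, home_ns, target_namespace)

    # Step 4: ABAC runs only if a policy callable was supplied and RBAC passed
    if allowed and abac is not None:
        policy_used = "RBAC+ABAC"
        allowed = abac(claims, target_namespace)

    # Step 5: the search is always scoped to exactly one namespace
    results = search(query, namespace=target_namespace) if allowed else None

    # Step 6: record the decision for both allowed and denied requests
    audit.append({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "action": "query",
        "namespace": target_namespace,
        "decision": "allowed" if allowed else "denied",
        "policy_used": policy_used,
    })

    # Step 7: results or 403
    if not allowed:
        return {"status": 403, "reason": f"denied by {policy_used}"}
    return {"status": 200, "results": results}
```

Note that the audit entry is written before the function returns on either path, so a denied request leaves a trace just like a successful one.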

In [None]:
# Initialize Authorization Manager

auth_manager = AuthorizationManager(
    pinecone_client=None,  # Will use mock in demo mode
    db_engine=None,
    opa_client=None,
)

namespace_manager = NamespaceManager(
    pinecone_client=None,
    db_engine=None,
)

audit_logger = AuditLogger(db_engine=None)

print("✅ Managers initialized successfully")

## Section 4: RBAC Implementation

### Role-Permission Mapping

Let's implement the three-role hierarchy with granular permissions.

In [None]:
# Test RBAC Permission Checks

# Scenario 1: Admin accessing Finance namespace
result1 = check_rbac_permission(
    user_role="admin",
    user_namespace="admin-prod",
    target_namespace="finance-prod"
)
print("Scenario 1 - Admin → Finance:")
print(f"  Allowed: {result1['allowed']}")
print(f"  Reason: {result1['reason']}")
print()

# Scenario 2: Analyst accessing own namespace
result2 = check_rbac_permission(
    user_role="analyst",
    user_namespace="finance-prod",
    target_namespace="finance-prod"
)
print("Scenario 2 - Analyst → Own Namespace:")
print(f"  Allowed: {result2['allowed']}")
print(f"  Reason: {result2['reason']}")
print()

# Scenario 3: Analyst attempting cross-tenant access (ZERO LEAKAGE TEST)
result3 = check_rbac_permission(
    user_role="analyst",
    user_namespace="finance-prod",
    target_namespace="hr-prod"
)
print("Scenario 3 - Analyst → Cross-Tenant (HR):")
print(f"  Allowed: {result3['allowed']}")
print(f"  Reason: {result3['reason']}")
print()

# Scenario 4: Compliance Officer read-all
result4 = check_rbac_permission(
    user_role="compliance_officer",
    user_namespace="audit-prod",
    target_namespace="legal-prod"
)
print("Scenario 4 - Compliance Officer → Legal:")
print(f"  Allowed: {result4['allowed']}")
print(f"  Reason: {result4['reason']}")

# Expected:
# Scenario 1: Allowed (Admin has full access)
# Scenario 2: Allowed (Analyst accessing assigned namespace)
# Scenario 3: DENIED (Cross-tenant access blocked)
# Scenario 4: Allowed (Compliance has read-all)

## Section 5: Namespace Isolation

### Multi-Tenant Architecture

Each business unit receives a dedicated Pinecone namespace:

- `finance-prod` → Finance department (1500 documents)
- `hr-prod` → Human Resources (800 documents)
- `legal-prod` → Legal department (600 documents)
- `admin-prod` → Administration (300 documents)

**Isolation Guarantee:** Queries to `finance-prod` can NEVER retrieve documents from `hr-prod`.
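
The isolation guarantee follows from scoping every lookup to a single namespace. The toy in-memory store below is a stand-in for Pinecone, written only to show the idea; `NamespacedVectorStore` and its methods are hypothetical and do not mirror the Pinecone client API. In Pinecone itself, the `namespace` argument on a query plays the same role.

```python
import math
from collections import defaultdict


def _cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class NamespacedVectorStore:
    """Toy vector store: each namespace is a separate dict, never searched together."""

    def __init__(self):
        self._data = defaultdict(dict)  # namespace -> {doc_id: vector}

    def upsert(self, namespace, doc_id, vector):
        self._data[namespace][doc_id] = list(vector)

    def query(self, namespace, vector, top_k=3):
        if not namespace:
            # Refuse unscoped queries instead of silently searching everything
            raise ValueError("namespace is required")
        candidates = self._data.get(namespace, {})
        scored = [(doc_id, _cosine(vector, v)) for doc_id, v in candidates.items()]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


store = NamespacedVectorStore()
store.upsert("finance-prod", "fin-001", [1.0, 0.0])
store.upsert("hr-prod", "hr-001", [1.0, 0.0])  # identical vector, different tenant
finance_hits = store.query("finance-prod", [1.0, 0.0])
```

Even though `hr-001` has the same embedding as `fin-001`, it cannot appear in `finance_hits`, because the HR dictionary is never consulted for a `finance-prod` query.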

In [None]:
# Create Namespace for New Business Unit

new_namespace = namespace_manager.create_namespace(
    namespace="marketing-prod",
    business_unit="Marketing",
    region="US"
)

print("Namespace Created:")
print(f"  Name: {new_namespace['namespace']}")
print(f"  Business Unit: {new_namespace['business_unit']}")
print(f"  Region: {new_namespace['region']}")
print(f"  Status: {new_namespace['status']}")
print(f"  Created At: {new_namespace['created_at']}")

# Expected: Namespace created successfully with metadata

In [None]:
# List User-Accessible Namespaces

# Admin can see all
admin_namespaces = namespace_manager.list_user_namespaces(
    user_id="admin@company.com",
    user_role="admin"
)
print("Admin Accessible Namespaces:")
print(f"  {admin_namespaces}")
print()

# Analyst sees only assigned namespace
analyst_namespaces = namespace_manager.list_user_namespaces(
    user_id="bob@company.com",
    user_role="analyst"
)
print("Analyst Accessible Namespaces:")
print(f"  {analyst_namespaces}")

# Expected:
# Admin: ['finance-prod', 'hr-prod', 'legal-prod', 'admin-prod']
# Analyst: ['finance-prod'] (assigned namespace only)

## Section 6: ABAC Policy Evaluation

### Context-Aware Access Control

ABAC adds an extra layer by evaluating:
- **Location:** US vs. India vs. other
- **Time:** Business hours vs. after hours
- **Classification:** Internal vs. Confidential
- **Device:** Corporate laptop vs. personal device

**Use Case:** Even if an analyst has RBAC permission, ABAC might deny access if they're querying from an unauthorized location.
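
The time attribute has to be derived from a timestamp before a policy can evaluate it. A minimal sketch, assuming business hours of 09:00 to 18:00 Monday through Friday in the user's local time (these bounds and the `time_bucket` helper are assumptions, not values taken from the course policy):

```python
from datetime import datetime

BUSINESS_START_HOUR = 9   # assumed opening hour, inclusive
BUSINESS_END_HOUR = 18    # assumed closing hour, exclusive


def time_bucket(local_time):
    """Map a local datetime to the 'business_hours' / 'after_hours' labels."""
    is_weekday = local_time.weekday() < 5  # Monday=0 ... Friday=4
    in_hours = BUSINESS_START_HOUR <= local_time.hour < BUSINESS_END_HOUR
    return "business_hours" if is_weekday and in_hours else "after_hours"


# Build the context dict that accompanies an authorization request
context = {
    "location": "US",
    "time": time_bucket(datetime(2024, 1, 1, 10, 0)),  # a Monday morning
}
```

Computing the label on the server from a trusted clock, rather than accepting it from the client, keeps the caller from simply claiming it is business hours.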

In [None]:
# Simulate ABAC Policy Evaluation (Mock OPA)

# Note: In production, this would call real OPA endpoint
# Here we demonstrate the concept

def mock_abac_evaluation(user_location, classification, time_of_day):
    """Mock ABAC policy for demonstration."""
    # Policy: US users can access confidential data during business hours
    if user_location == "US" and time_of_day == "business_hours":
        return {"allowed": True, "reason": "ABAC policy satisfied"}
    elif user_location != "US" and classification == "confidential":
        return {"allowed": False, "reason": "Confidential data requires US location"}
    elif time_of_day == "after_hours":
        return {"allowed": False, "reason": "Access denied after business hours"}
    else:
        return {"allowed": True, "reason": "ABAC policy satisfied"}

# Test ABAC scenarios
print("ABAC Test 1 - US location, business hours, confidential:")
result1 = mock_abac_evaluation("US", "confidential", "business_hours")
print(f"  Allowed: {result1['allowed']} - {result1['reason']}")
print()

print("ABAC Test 2 - India location, confidential data:")
result2 = mock_abac_evaluation("IN", "confidential", "business_hours")
print(f"  Allowed: {result2['allowed']} - {result2['reason']}")
print()

print("ABAC Test 3 - After hours access attempt:")
result3 = mock_abac_evaluation("US", "internal", "after_hours")
print(f"  Allowed: {result3['allowed']} - {result3['reason']}")

# Expected:
# Test 1: Allowed
# Test 2: Denied (location restriction)
# Test 3: Denied (time restriction)

## Section 7: Authorized Query Execution

### Complete Authorization Flow

Now let's execute queries with full RBAC + ABAC + Namespace Isolation + Audit Logging.

In [None]:
# Authorized Query - Admin Accessing Finance

result = query_with_authorization(
    query="Show Q3 revenue projections",
    user_id="alice@company.com",
    user_role="admin",
    user_namespace="admin-prod",
    target_namespace="finance-prod",
    context={"location": "US", "time": "business_hours"},
    pinecone_client=None,  # Mock mode
    opa_client=None,
)

print("Query Result:")
print(f"  Status: {result['status']}")
if result['status'] == 'success':
    print(f"  Namespace: {result['results']['namespace']}")
    print(f"  Matches: {len(result['results'].get('matches', []))} documents")
    print(f"  Audit Log: {result['audit_log']['timestamp']}")
else:
    print(f"  Reason: {result['reason']}")

# Expected: Success - Admin has full access

In [None]:
# Authorized Query - Analyst Accessing Own Namespace

result = query_with_authorization(
    query="Show Q3 budget allocations",
    user_id="bob@company.com",
    user_role="analyst",
    user_namespace="finance-prod",
    target_namespace="finance-prod",
    context={"location": "US"},
    pinecone_client=None,
    opa_client=None,
)

print("Query Result:")
print(f"  Status: {result['status']}")
if result['status'] == 'success':
    print(f"  Namespace: {result['results']['namespace']}")
    print(f"  Audit Decision: {result['audit_log']['decision']}")

# Expected: Success - Analyst accessing assigned namespace

In [None]:
# CRITICAL TEST: Cross-Tenant Access Denial (Zero Leakage)

result = query_with_authorization(
    query="Show employee records",
    user_id="bob@company.com",
    user_role="analyst",
    user_namespace="finance-prod",
    target_namespace="hr-prod",  # Attempting cross-tenant access
    context={"location": "US"},
    pinecone_client=None,
    opa_client=None,
)

print("Cross-Tenant Access Test:")
print(f"  Status: {result['status']}")
print(f"  Reason: {result.get('reason', 'N/A')}")
print(f"  Audit Decision: {result['audit_log']['decision']}")
print(f"  Policy Used: {result['audit_log']['policy_used']}")

# Expected: DENIED - Cross-tenant access blocked by RBAC
# A denial here shows that RBAC stops the request before it reaches the hr-prod namespace

## Section 8: Immutable Audit Logging

### Compliance-Grade Audit Trail

Every authorization decision is logged immutably:
- **7-year retention** for regulatory compliance (SOX, GDPR, DPDPA)
- **Write-once:** PostgreSQL table with INSERT-only permissions
- **Correlation IDs:** Link to M2.3 (Encryption & Secrets Management)

**Audit Log Fields:**
- `timestamp` - When
- `user_id` - Who
- `action` - What (query, create_namespace, etc.)
- `namespace` - Where
- `decision` - Result (allowed, denied, error)
- `policy_used` - How (RBAC, ABAC, RBAC+ABAC)
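
The write-once idea can be tried locally without PostgreSQL. The sketch below uses SQLite from the standard library as a stand-in and enforces append-only behaviour with triggers. This is a different mechanism from the INSERT-only grants described above, chosen only because SQLite has no role-based permissions. The table layout follows the field list; `open_audit_db` and `log_decision` are hypothetical helpers.

```python
import sqlite3
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE audit_logs (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp   TEXT NOT NULL,
    user_id     TEXT NOT NULL,
    action      TEXT NOT NULL,
    namespace   TEXT NOT NULL,
    decision    TEXT NOT NULL CHECK (decision IN ('allowed', 'denied', 'error')),
    policy_used TEXT NOT NULL
);
-- Reject any attempt to change or remove an existing row
CREATE TRIGGER audit_no_update BEFORE UPDATE ON audit_logs
BEGIN
    SELECT RAISE(ABORT, 'audit_logs is append-only');
END;
CREATE TRIGGER audit_no_delete BEFORE DELETE ON audit_logs
BEGIN
    SELECT RAISE(ABORT, 'audit_logs is append-only');
END;
"""


def open_audit_db():
    """Create an in-memory database holding the append-only audit table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def log_decision(conn, user_id, action, namespace, decision, policy_used):
    """Insert one audit row; the timestamp is taken from the server clock in UTC."""
    conn.execute(
        "INSERT INTO audit_logs "
        "(timestamp, user_id, action, namespace, decision, policy_used) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(),
         user_id, action, namespace, decision, policy_used),
    )
    conn.commit()


conn = open_audit_db()
log_decision(conn, "bob@company.com", "query", "hr-prod", "denied", "RBAC")
```

After this, an `UPDATE` or `DELETE` against `audit_logs` fails with a `sqlite3` database error and the original row is left unchanged.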

In [None]:
# Create Audit Log Entries

# Log successful access
log1 = audit_logger.log_access_attempt(
    user_id="alice@company.com",
    action="query",
    namespace="finance-prod",
    resources_accessed=["fin-001", "fin-002"],
    decision="allowed",
    policy_used="RBAC",
    context={"location": "US", "ip": "10.0.1.50"}
)

print("Audit Log Entry 1 (Allowed):")
print(f"  Timestamp: {log1['timestamp']}")
print(f"  User: {log1['user_id']}")
print(f"  Decision: {log1['decision']}")
print(f"  Resources: {log1['resources_accessed']}")
print()

# Log denied access
log2 = audit_logger.log_access_attempt(
    user_id="bob@company.com",
    action="query",
    namespace="hr-prod",
    decision="denied",
    policy_used="RBAC",
    context={"attempted_cross_tenant": True}
)

print("Audit Log Entry 2 (Denied):")
print(f"  Timestamp: {log2['timestamp']}")
print(f"  User: {log2['user_id']}")
print(f"  Decision: {log2['decision']}")
print(f"  Context: {log2['context']}")

# Expected: Both entries logged with immutable timestamps
# In production, these would be INSERT-ed into PostgreSQL

## Section 9: Common Failures & Troubleshooting

### Production Issues and Solutions

| Failure | Cause | Solution |
|---------|-------|----------|
| **Cross-Tenant Data Leak** | Missing namespace filter | Always enforce `namespace=user_namespace` |
| **Permission Denied (403)** | Wrong role-permission mapping | Verify JWT `role` claim from M2.1 |
| **ABAC Policy Violations** | Context mismatch | Review OPA Rego policy syntax |
| **JWT Token Expiration** | Token expired (30 min default) | Implement refresh flow from M2.1 |
| **Audit Log Not Immutable** | UPDATE/DELETE grants | `REVOKE UPDATE, DELETE ON audit_logs FROM <app_role>` |
| **Namespace Race Condition** | Concurrent assignments | Use database transactions |
| **OPA Connection Timeout** | OPA container not running | Check `docker ps \| grep opa` |

## Section 10: GCC-Specific Enterprise Context

### Regulatory Compliance Mapping

**SOX (Sarbanes-Oxley):**
- ✅ 7-year audit retention enforced
- ✅ Immutable audit trail (write-once)
- ✅ Separation of duties (analyst vs. admin)

**GDPR (General Data Protection Regulation):**
- ✅ Right to access: Compliance officer queries
- ✅ Right to erasure: Admin namespace deletion
- ✅ Data minimization: Namespace isolation

**DPDPA (Digital Personal Data Protection Act - India):**
- ✅ Data localization: Pinecone India region
- ✅ Consent-based access: ABAC policies
- ✅ Audit trail for data access

### Cost Structure (Small GCC: 20 users, 50 tenants, 5K docs)

| Service | Monthly Cost |
|---------|-------------:|
| Pinecone (1 pod) | ₹5,500 ($70) |
| PostgreSQL RDS | ₹2,500 ($30) |
| OPA (self-hosted) | ₹500 ($5) |
| **Total** | **₹8,500 ($105)** |

### Performance SLA
- **Uptime:** 99.9%
- **Query Latency:** <200ms p95 (with RBAC+ABAC)
- **Audit Log Write:** <50ms

## Section 11: Decision Card & When to Use

### ✅ Use Multi-Tenant Authorization When:

1. **Serving 20+ business units** on shared infrastructure
2. **Regulatory compliance requires** proof of zero cross-tenant leakage
3. **Fine-grained permissions needed** beyond admin/user
4. **Audit trail is mandatory** with 7+ year retention
5. **Context-aware access control** needed (location, time, device)

### ❌ Avoid This Pattern When:

1. **Single-tenant applications** (overhead not justified)
2. **<10 users** with simple access needs
3. **No compliance requirements** (simpler auth may suffice)
4. **Performance MORE critical than security** (adds latency)
5. **Rapid prototyping phase** (add this when hardening for production, not in the MVP)

### Deployment Tiers

**Tier 1 - Basic RBAC (20-50 tenants):** ₹30K-50K/month
- RBAC only, basic audit logging

**Tier 2 - RBAC+ABAC (50-100 tenants):** ₹75K-1.25L/month
- OPA policies, 7-year audit retention

**Tier 3 - Enterprise (100+ tenants):** ₹1.5L+/month
- Custom policies, 24/7 monitoring

## Section 12: Summary & Next Steps

### What You've Learned

✅ **RBAC Implementation:** Three-role hierarchy with granular permissions  
✅ **Namespace Isolation:** Zero cross-tenant data leakage via Pinecone namespaces  
✅ **ABAC Policies:** Context-aware access control using Open Policy Agent  
✅ **Immutable Audit Logging:** 7-year retention for regulatory compliance  
✅ **Production Authorization:** Complete flow from JWT → RBAC → ABAC → Query  

### Real-World Impact

**Before M2.2:**
- ❌ Any authenticated user could query all data
- ❌ No cross-tenant isolation
- ❌ No audit trail

**After M2.2:**
- ✅ Role-based permissions enforced
- ✅ Cross-tenant access denied, verified by the test in Section 7
- ✅ Complete compliance-grade audit trail

### Next Steps

1. **Complete M2.3:** Encryption & Secrets Management
   - Encrypt audit logs at rest
   - Secure JWT secret key storage
   - Implement key rotation

2. **Implement Custom ABAC Policies:**
   - Write Rego policies for your business logic
   - Test with `opa eval` CLI
   - Deploy via GitOps

3. **Add Advanced Roles:**
   - Data Steward (can modify namespace data)
   - Security Auditor (read-only audit logs)
   - Custom roles per business unit

4. **Integrate Enterprise IdP:**
   - Okta, Azure AD, AWS Cognito for SSO
   - Sync roles from identity provider

5. **Deploy to Production:**
   - Infrastructure-as-code (Terraform)
   - CI/CD pipeline for policy updates
   - Monitoring and alerting (Grafana)

### Resources

- 📘 **Augmented Script:** [Augmented_GCC_Compliance_M2_2_Authorization_Multi.md](https://github.com/yesvisare/gcc_comp_ai_ccc_l2/blob/main/Augmented_GCC_Compliance_M2_2_Authorization_Multi.md)
- 📚 **Pinecone Namespaces:** https://docs.pinecone.io/docs/namespaces
- 🔐 **Open Policy Agent:** https://www.openpolicyagent.org/docs/latest/
- 🚀 **FastAPI Security:** https://fastapi.tiangolo.com/tutorial/security/

---

**Congratulations!** You've completed L3 M2.2: Authorization & Multi-Tenant Access Control. You're now ready to deploy production-grade authorization for GCC RAG systems.