```{contents}
```
## Authentication & Authorization

---

### 1. Why This Matters in Generative AI

Generative AI systems expose **high-value capabilities**:

* expensive model inference
* sensitive data access (documents, chats, embeddings, tools, APIs)
* powerful actions (code execution, database writes, browsing, transactions)

Without strong **Authentication** and **Authorization**, these systems become unsafe, expensive, and non-compliant.

---

### 2. Core Concepts

| Concept            | Purpose             | Question Answered             |
| ------------------ | ------------------- | ----------------------------- |
| **Authentication** | Verify identity     | *Who are you?*                |
| **Authorization**  | Control permissions | *What are you allowed to do?* |

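They are distinct concerns, but authorization depends on authentication: a permission check is only meaningful once the caller's identity has been verified.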

---

### 3. Authentication in GenAI

Authentication establishes **who is making the request** to the AI system.

#### Common Methods

| Method           | Used For                                |
| ---------------- | --------------------------------------- |
| API Keys         | Server-to-server LLM access             |
| OAuth 2.0 / OIDC | End-user login via Google, GitHub, etc. |
| JWTs             | Stateless session verification          |
| mTLS             | High-security service communication     |
| SSO              | Enterprise identity integration         |
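
As a rough illustration of the API-key row in the table above, a server might store only a hash of each key and compare hashes in constant time. This is a minimal sketch using the standard library; `KEY_STORE`, `hash_key`, `verify_api_key`, and the demo key are hypothetical names chosen for the example, not part of any particular product.

```python
import hashlib
import hmac
from typing import Optional


def hash_key(api_key: str) -> str:
    """Return the SHA-256 hex digest of an API key."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


# Hypothetical store mapping key hashes to client IDs.
# A real deployment would keep this in a database or secrets manager.
KEY_STORE = {hash_key("demo-key-123"): "billing-service"}


def verify_api_key(presented_key: str) -> Optional[str]:
    """Return the client ID for a known key, or None if the key is unknown."""
    presented_hash = hash_key(presented_key)
    for stored_hash, client_id in KEY_STORE.items():
        # compare_digest runs in constant time, which avoids timing leaks.
        if hmac.compare_digest(stored_hash, presented_hash):
            return client_id
    return None
```

Storing hashes rather than raw keys means a leaked key table does not directly expose usable credentials.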

#### Typical Authentication Flow

```
User → Login → Identity Provider → Token (JWT)
     → Send Token with each AI request
     → Backend verifies token
     → Request forwarded to LLM
```
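
The "send token with each AI request" step in the flow above is commonly done with an `Authorization: Bearer <token>` HTTP header. The helper below is a minimal sketch of how a backend might pull the token out before verifying it. The name `extract_bearer_token` is hypothetical, and the headers are assumed to arrive as a plain mapping keyed by `"Authorization"`; real web frameworks usually normalize header case for you.

```python
from typing import Mapping, Optional


def extract_bearer_token(headers: Mapping[str, str]) -> Optional[str]:
    """Return the token from an 'Authorization: Bearer <token>' header, else None."""
    value = headers.get("Authorization", "")
    # Split "Bearer abc" into the scheme ("Bearer") and the credential ("abc").
    scheme, _, token = value.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return None
    return token.strip()
```

A request with no usable token should be rejected before any prompt is built or any model is called.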

#### Example (JWT Verification)

```python
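from jose import JWTError, jwt  # python-jose

def authenticate(token, public_key):
    """Verify the token's signature and expiry, then return (subject, roles)."""
    try:
        payload = jwt.decode(token, public_key, algorithms=["RS256"])
    except JWTError as exc:
        raise PermissionError("invalid or expired token") from exc
    # "roles" is an application-specific claim; treat a missing claim as no roles.
    return payload["sub"], payload.get("roles", [])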
```

---

### 4. Authorization in GenAI

Authorization controls **what the authenticated entity may do**.

#### What Needs Authorization in GenAI?

* Which **models** a user can access (GPT-4, local model, fine-tuned model)
* Which **tools** the agent may use (browser, database, code executor)
* Which **documents / embeddings** can be retrieved (RAG systems)
* Which **actions** are allowed (write DB, send email, deploy code)
* Usage limits (tokens, rate, cost ceilings)
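
The usage-limit item in the list above can be sketched as a per-user token budget that is checked before each model call. This is an illustrative sketch only: `TokenBudget` and `try_consume` are hypothetical names, and a production system would persist the counters and reset them per billing period.

```python
from dataclasses import dataclass


@dataclass
class TokenBudget:
    """Tracks tokens consumed by one user against a fixed ceiling."""
    limit: int
    used: int = 0

    def try_consume(self, tokens: int) -> bool:
        """Reserve tokens if the budget allows it; return False otherwise."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if self.used + tokens > self.limit:
            return False  # deny the request instead of overspending
        self.used += tokens
        return True


budget = TokenBudget(limit=1000)
print(budget.try_consume(800))  # True: 800 of 1000 used
print(budget.try_consume(300))  # False: would exceed the ceiling
```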

#### Common Authorization Models

| Model         | Description                                 |
| ------------- | ------------------------------------------- |
| RBAC          | Role-Based Access Control                   |
| ABAC          | Attribute-Based Access Control              |
| Policy Engine | Declarative policies (OPA, Cedar, Zanzibar) |
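
To contrast ABAC with the role lookup used in RBAC, the sketch below decides access from attributes of the caller and the resource rather than from a role name. The attribute names (`department`, `clearance`, `sensitivity`) and the rule itself are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subject:
    user_id: str
    department: str
    clearance: int


@dataclass(frozen=True)
class Resource:
    owner_department: str
    sensitivity: int


def abac_allows(subject: Subject, resource: Resource, action: str) -> bool:
    """Allow reads within the same department when clearance covers sensitivity."""
    if action != "read":
        return False  # this sketch grants read access only
    return (
        subject.department == resource.owner_department
        and subject.clearance >= resource.sensitivity
    )
```

A policy engine expresses the same kind of rule declaratively, so that policies can change without redeploying application code.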

---

### 5. Example: RBAC in a GenAI App

| Role    | Permissions                       |
| ------- | --------------------------------- |
| User    | Chat, retrieve personal docs      |
| Analyst | Chat, RAG over org data           |
| Admin   | All models, tools, system prompts |
| Agent   | Limited tools only                |

#### Policy Check

```python
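# Role -> permitted actions. Anything not listed here is denied.
PERMISSIONS = {
    "user": {"chat"},
    "analyst": {"chat", "rag"},
    "admin": {"chat", "rag", "tool", "config"},
    "agent": {"tool"},
}

def authorize(user_role, action):
    """Return True only if the role is known and explicitly grants the action."""
    return action in PERMISSIONS.get(user_role, set())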
```

---

### 6. Combined Request Pipeline

```
1. Client sends request with token
2. Authentication: verify identity
3. Authorization: check policy
4. Prompt construction + safety filters
5. Model inference
6. Tool execution (if allowed)
7. Response returned
```
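
The first five steps of the pipeline above might be wired together roughly as follows. This is a sketch under stated assumptions: `TOKENS`, `PERMISSIONS`, `AuthError`, `handle_request`, and `echo_model` are hypothetical stand-ins for a real token verifier, policy store, and inference call.

```python
from typing import Callable, Dict, Optional, Set

# Hypothetical stand-ins for a real token verifier and policy store.
TOKENS: Dict[str, str] = {"token-alice": "analyst"}
PERMISSIONS: Dict[str, Set[str]] = {"user": {"chat"}, "analyst": {"chat", "rag"}}


class AuthError(Exception):
    """Raised when a request fails authentication or authorization."""


def handle_request(token: str, action: str, prompt: str,
                   model: Callable[[str], str]) -> str:
    # Step 2, authentication: map the token to an identity (here, a role).
    role: Optional[str] = TOKENS.get(token)
    if role is None:
        raise AuthError("unauthenticated")
    # Step 3, authorization: deny unless the role explicitly grants the action.
    if action not in PERMISSIONS.get(role, set()):
        raise AuthError(f"role {role!r} may not perform {action!r}")
    # Steps 4-5: build the prompt and call the model only after both checks pass.
    return model(f"[{action}] {prompt}")


def echo_model(text: str) -> str:
    """Placeholder for a real inference call."""
    return f"echo: {text}"
```

The ordering is the point of the sketch: no prompt is built and no inference cost is incurred for a request that fails either check.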

---

### 7. Authorization Inside the Model Workflow

Modern GenAI systems enforce authorization **at multiple layers**:

| Layer             | Enforcement                    |
| ----------------- | ------------------------------ |
| API Gateway       | Who may call which model       |
| Orchestrator      | Which tools can be invoked     |
| RAG Layer         | Which vectors / docs retrieved |
| Agent Controller  | Which actions allowed          |
| Logging & Billing | Usage limits & quotas          |
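
The orchestrator row above is the layer that stands between a model's tool request and the tool itself. A minimal sketch, assuming a hypothetical tool registry (`TOOLS`) and per-role allowlist (`ALLOWED_TOOLS`), could look like this:

```python
from typing import Callable, Dict, Set

# Hypothetical tool registry; each tool is a plain callable.
TOOLS: Dict[str, Callable[[str], str]] = {
    "search": lambda q: f"results for {q}",
    "run_code": lambda src: f"executed {len(src)} chars",
}

# Hypothetical per-role allowlist, kept outside the prompt and the model.
ALLOWED_TOOLS: Dict[str, Set[str]] = {
    "agent": {"search"},
    "admin": {"search", "run_code"},
}


def invoke_tool(role: str, tool_name: str, argument: str) -> str:
    """Run a tool the model requested, but only if the caller's role allows it."""
    if tool_name not in ALLOWED_TOOLS.get(role, set()):
        raise PermissionError(f"{role!r} may not call {tool_name!r}")
    return TOOLS[tool_name](argument)
```

Because the check runs in ordinary code, a prompt that persuades the model to request `run_code` still fails for a role that lacks that permission.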

---

### 8. RAG-Specific Authorization

Apply the permission check before retrieval, so the vector search only ever sees documents the caller may read:

```
Query → Auth Check → Allowed Document Set
      → Vector Search on allowed docs only
      → Context injected into prompt
```

```python
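# Placeholders: filter_docs_by_user and vector_db stand in for the app's own ACL lookup and vector store.
allowed_docs = filter_docs_by_user(user_id)
results = vector_db.search(query, allowed_docs)  # search only the permitted documents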
```

This prevents **cross-user data leakage**.
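
For a self-contained picture of filter-then-search, the sketch below uses a tiny in-memory index and cosine similarity. Everything in it is hypothetical: `INDEX`, `allowed_doc_ids`, and `search` are illustration names, the two-dimensional vectors are made up, and ownership is the only access rule. A real vector store would typically apply the same restriction through a metadata filter.

```python
import math
from typing import Dict, List, Set, Tuple

# Hypothetical in-memory index: doc_id -> (embedding, owner).
INDEX: Dict[str, Tuple[List[float], str]] = {
    "alice-notes": ([1.0, 0.0], "alice"),
    "bob-notes": ([0.9, 0.1], "bob"),
    "alice-plan": ([0.0, 1.0], "alice"),
}


def allowed_doc_ids(user_id: str) -> Set[str]:
    """Return the IDs of documents owned by the user."""
    return {doc_id for doc_id, (_, owner) in INDEX.items() if owner == user_id}


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec: List[float], user_id: str, k: int = 2) -> List[str]:
    """Rank only the caller's documents, most similar first."""
    candidates = allowed_doc_ids(user_id)  # authorization happens before ranking
    ranked = sorted(
        candidates,
        key=lambda doc_id: cosine(query_vec, INDEX[doc_id][0]),
        reverse=True,
    )
    return ranked[:k]
```

Here `bob-notes` is a close match for the query `[1.0, 0.0]`, yet it never appears in results for `alice` because it is excluded before similarity is computed.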

---

### 9. Threats Prevented

| Threat                          | Prevented By            |
| ------------------------------- | ----------------------- |
| Data leakage                    | Authorization           |
| Model abuse                     | Authentication + quotas |
| Tool abuse via prompt injection | Tool authorization      |
| Cost explosion                  | Usage authorization     |
| Agent takeover                  | Capability restrictions |

---

### 10. Summary Table

| Layer    | Authentication   | Authorization       |
| -------- | ---------------- | ------------------- |
| Identity | Who is calling   | —                   |
| API      | Token validation | Model access        |
| RAG      | User identity    | Document access     |
| Agent    | Service identity | Tool permissions    |
| System   | Key ownership    | Cost & quota limits |

---

### 11. Design Principles

* **Never trust the prompt alone**
* **Enforce permissions outside the model**
* **Least privilege for agents**
* **Separate identity from policy**
* **Audit everything**
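
As one way to picture the "audit everything" principle, each authorization decision can be written out as a structured record. The sketch below appends JSON lines to an in-memory list. `AUDIT_LOG` and `audit` are hypothetical names, and a real system would write to append-only, tamper-resistant storage instead.

```python
import json
from datetime import datetime, timezone
from typing import List

# Hypothetical sink; production systems would use durable, append-only storage.
AUDIT_LOG: List[str] = []


def audit(user_id: str, action: str, allowed: bool) -> str:
    """Record one authorization decision as a JSON line and return it."""
    entry = json.dumps({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user_id,
        "action": action,
        "allowed": allowed,
    })
    AUDIT_LOG.append(entry)
    return entry
```

Logging denials as well as approvals makes probing attempts visible, not only successful calls.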

---

### 12. Final Intuition

> **Authentication proves identity.
> Authorization limits power.
> Together they make Generative AI safe, scalable, and governable.**
