# Part X — Advanced Topics  
## 41. Multi‑Tenancy Patterns (SaaS Architectures) — Tenant Isolation, Scoping, Security, and “No Data Leaks”

You already built an org-scoped tasks system. That’s the *seed* of a SaaS
architecture. This chapter turns that seed into a **repeatable, safe multi-tenant
pattern** you can apply to:

- B2B SaaS apps (each customer = tenant)
- internal multi-department systems
- white-labeled platforms
- agencies managing multiple client workspaces

The #1 failure mode in multi-tenancy is catastrophic:
- **Tenant A sees Tenant B’s data** (a data leak)

This chapter is about making that failure mode hard to introduce and easy to
detect when it does slip in.

---

## 41.0 Learning Outcomes

By the end, you should be able to:

1. Explain the three major tenant isolation strategies:
   - shared DB, shared schema (tenant_id column)
   - shared DB, schema-per-tenant (PostgreSQL schemas)
   - DB-per-tenant
2. Choose a strategy based on:
   - security requirements
   - operational complexity
   - scaling needs
   - regulatory/compliance constraints
3. Implement **shared DB multi-tenancy** safely:
   - tenant identification (subdomain/path/header)
   - tenant context middleware
   - queryset scoping patterns that are difficult to bypass
4. Apply tenant isolation consistently to:
   - ORM queries
   - DRF APIs
   - admin
   - caching keys
   - background tasks
   - webhooks
   - realtime (Channels groups)
5. Write **regression tests** that catch tenant leaks reliably.
6. Produce a “multi-tenant security checklist” for your project.

---

# 41.1 What Is a Tenant (In Django Terms)?

A **tenant** is a boundary of data and permissions.

In your project, an `Organization` already behaves like a tenant:
- `Task.organization`
- `Membership.organization`
- task list URLs: `/orgs/<org_slug>/tasks/`

So: **Organization = Tenant** (shared DB).

That is the most common starting point.

---

# 41.2 The Three Tenant Isolation Strategies (Industry Reality)

## 41.2.1 Strategy A — Shared DB, Shared Schema (tenant_id column)
All customers’ data live in the same tables, separated by a tenant identifier
column like `organization_id`.

Example:
- `tasks_task(organization_id, ...)`

**Pros**
- simplest to build and operate
- easiest migrations (one schema)
- easiest analytics across tenants (with care)
- cheapest infrastructure

**Cons**
- highest risk of “bug leaks data across tenants”
- noisy-neighbor risk (one tenant’s heavy usage affects others)
- harder to meet strict compliance “hard isolation” requirements

**When to choose**
- most SaaS apps early and mid scale
- internal tools
- teams that want fastest time-to-market

You are already here.

---

## 41.2.2 Strategy B — Shared DB, Schema-per-tenant (PostgreSQL schemas)
Each tenant has its own schema:
- `tenant_acme.tasks_task`
- `tenant_beta.tasks_task`

Apps use PostgreSQL’s `search_path` to route queries to the correct tenant schema.

**Pros**
- much stronger isolation than shared tables
- tenant-specific migrations possible (still tricky)
- can drop/move a tenant’s data more cleanly

**Cons**
- more complex migrations and ops
- cross-tenant queries are harder
- tooling overhead
- requires Postgres and careful setup

**When to choose**
- multi-tenant apps with stronger isolation needs
- customers demanding separation but not full DB-per-tenant
- teams comfortable with Postgres ops

Common ecosystem packages that implement this approach: `django-tenants` and the older `django-tenant-schemas`. Neither is required.

---

## 41.2.3 Strategy C — DB-per-tenant
Each tenant has their own database.

**Pros**
- strongest isolation boundary (at DB level)
- easy “noisy neighbor” isolation
- per-tenant backups, restores, upgrades possible

**Cons**
- most operational complexity (many DBs)
- migrations become a fleet problem
- costs increase
- cross-tenant analytics requires a data warehouse/ETL pipeline

**When to choose**
- high compliance requirements
- very large tenants with dedicated DB needs
- enterprise SaaS with strict isolation contracts

---

# 41.3 The “Tenant Leak” Threat Model (How Leaks Actually Happen)

Tenant leaks usually happen due to:

1. **Missing scoping in a query**
   - `Task.objects.get(id=123)` instead of `Task.objects.get(id=123, organization=org)`
2. **Wrong tenant derived from request**
   - using user-selected org id without membership check
3. **Inconsistent scoping in background tasks**
   - task runs without tenant context and queries across all tenants
4. **Caching without tenant in key**
   - cached value for tenant A reused for tenant B
5. **Admin “shortcut” views**
   - staff tools that list all tenants but accidentally expose data to non-staff

Your goal is not just “remember to filter by organization.”  
Your goal is “make it hard to forget.”
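
Leak #1 from the list above is easy to reproduce outside Django. The following is a minimal sketch using the standard-library `sqlite3` module; the `tasks` table and both helper functions are hypothetical stand-ins for `tasks_task` and your own lookups, not project code.

```python
import sqlite3

# In-memory table standing in for tasks_task (assumed columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks (id INTEGER PRIMARY KEY, organization_id INTEGER, title TEXT)"
)
conn.executemany(
    "INSERT INTO tasks (id, organization_id, title) VALUES (?, ?, ?)",
    [(1, 10, "Org A task"), (2, 20, "Org B task")],
)


def get_task_unscoped(task_id):
    # Leaky: any caller who guesses an id gets the row, whichever tenant owns it.
    return conn.execute(
        "SELECT id, organization_id, title FROM tasks WHERE id = ?", (task_id,)
    ).fetchone()


def get_task_for_org(task_id, org_id):
    # Scoped: the tenant is part of the lookup, so a foreign id matches nothing.
    return conn.execute(
        "SELECT id, organization_id, title FROM tasks"
        " WHERE id = ? AND organization_id = ?",
        (task_id, org_id),
    ).fetchone()


# A member of org 10 guesses task id 2, which belongs to org 20.
leaked = get_task_unscoped(2)
blocked = get_task_for_org(2, org_id=10)
```

The scoped lookup returns no row for a foreign id, which is the behavior a view should turn into a 404.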

---

# 41.4 Tenant Identification (How You Decide Which Tenant the Request Is For)

There are three common patterns:

## 41.4.1 Path-based tenancy (what you already use)
Example:
- `/orgs/<org_slug>/tasks/`

**Pros**
- simple
- explicit
- easy to test
- works with one domain

**Cons**
- URLs are longer
- your frontend must always include org slug

This is a great default for most Django apps.

## 41.4.2 Subdomain-based tenancy (common SaaS UX)
Example:
- `https://acme.example.com/`
- `https://beta.example.com/`

**Pros**
- clean URLs
- strong “tenant feel” (white-label friendly)

**Cons**
- requires DNS/wildcard TLS setup
- requires careful cookie/CSRF settings
- local dev setup is more complex

## 41.4.3 Header-based tenancy (common for internal APIs)
Example:
- `X-Tenant: acme`

**Pros**
- works well for machine clients
- good for internal platforms

**Cons**
- easier to misuse
- must be strictly authorized (never “trust header = tenant”)

**Industry rule:** path/subdomain are generally safer for web apps; headers can be fine for internal APIs.
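
For subdomain-based tenancy, the first step is turning a host name into a candidate tenant slug. Here is one possible sketch in plain Python; the function name, the `example.com` default, and the reserved labels are assumptions for illustration. In Django you would likely pass it `request.get_host()`.

```python
from typing import Optional

# Labels that should never be treated as tenants (assumed list).
RESERVED_LABELS = {"www", "api"}


def tenant_slug_from_host(host: str, base_domain: str = "example.com") -> Optional[str]:
    """Return the tenant label for hosts like 'acme.example.com', else None."""
    # Drop an optional port, normalize case, and ignore a trailing dot.
    hostname = host.split(":", 1)[0].lower().rstrip(".")
    suffix = "." + base_domain
    if not hostname.endswith(suffix):
        return None
    label = hostname[: -len(suffix)]
    # Accept exactly one label; reject nested subdomains and reserved names.
    if not label or "." in label or label in RESERVED_LABELS:
        return None
    return label
```

The returned slug is only a claim about which tenant is being addressed. It still has to go through the same membership check as a path-based slug before any data is read.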

---

# 41.5 Implement Tenant Context (Shared DB Approach)

Even if you keep path-based tenancy, it’s very useful to have a “tenant context” so
you don’t pass `org` everywhere manually.

We’ll implement a safe, explicit approach:

- Middleware resolves tenant (Organization) and attaches to request:
  - `request.tenant`
- A small helper reads tenant from request in views
- Selectors always accept tenant explicitly (still recommended)

## 41.5.1 Create a request-local tenant context (thread-local)
Create `orgs/tenant_context.py`:

```python
from __future__ import annotations

import threading
from typing import Optional

_local = threading.local()


def set_current_tenant_id(value: Optional[int]) -> None:
    _local.tenant_id = value


def get_current_tenant_id() -> Optional[int]:
    return getattr(_local, "tenant_id", None)
```

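### Why thread-local exists here
- Logging filters, audit logs, and some service code can read the tenant id without
  it being passed through many function calls.
- Thread-local state is per-thread, and threads are reused across requests, so it must
  be cleared after every request or the next request on that thread inherits a stale tenant.
- Under ASGI or async views, several requests can share one thread, so `threading.local`
  is not a safe boundary there; `contextvars.ContextVar` is the standard-library alternative.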
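
If parts of your project run under ASGI or use async code, a `contextvars`-based variant of the same module is one option. This is a sketch under that assumption, not a drop-in replacement; the `tenant_context` helper is a hypothetical addition that sets the tenant and restores the previous value on exit.

```python
import contextvars
from contextlib import contextmanager
from typing import Iterator, Optional

# One variable per execution context (thread or asyncio task).
_tenant_id = contextvars.ContextVar("tenant_id", default=None)


def get_current_tenant_id() -> Optional[int]:
    return _tenant_id.get()


@contextmanager
def tenant_context(tenant_id: int) -> Iterator[None]:
    # set() returns a token that restores the previous value on reset(),
    # so nested blocks and exceptions cannot leave a stale tenant behind.
    token = _tenant_id.set(tenant_id)
    try:
        yield
    finally:
        _tenant_id.reset(token)


with tenant_context(42):
    inside = get_current_tenant_id()
after = get_current_tenant_id()
```

The same context manager can wrap the body of a background task or a management command, where no middleware runs to set the tenant.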

---

## 41.5.2 Middleware: resolve tenant from URL and enforce membership
You already have `get_org_for_user_or_404`. We’ll reuse it.

Create `orgs/middleware.py`:

```python
from __future__ import annotations

import re
from typing import Optional

from django.http import Http404, HttpRequest

from orgs.services import get_org_for_user_or_404
from orgs.tenant_context import set_current_tenant_id

ORG_PATH_RE = re.compile(r"^/orgs/(?P<org_slug>[-\w]+)/")


class TenantMiddleware:
    """
    Path-based tenant resolver.
    - If request path starts with /orgs/<org_slug>/..., resolve tenant.
    - Attach request.tenant (Organization).
    - Enforce membership for authenticated users via get_org_for_user_or_404.

    Policy:
    - If user is not authenticated and tries /orgs/<slug>/..., return 404
      (or redirect to login if you prefer).
    """

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request: HttpRequest):
        set_current_tenant_id(None)
        request.tenant = None  # type: ignore[attr-defined]

        match = ORG_PATH_RE.match(request.path)
        if match:
            # Apply the docstring policy explicitly so an AnonymousUser never
            # reaches the membership lookup.
            if not request.user.is_authenticated:
                raise Http404

            org_slug = match.group("org_slug")

            # This enforces membership; it raises 404 if org not found or not member.
            org = get_org_for_user_or_404(user=request.user, org_slug=org_slug)

            request.tenant = org  # type: ignore[attr-defined]
            set_current_tenant_id(org.id)

        try:
            response = self.get_response(request)
        finally:
            set_current_tenant_id(None)

        return response
```

### Where to register middleware
In `config/settings.py` (or base settings), place it **after auth middleware** so
`request.user` exists:

```python
MIDDLEWARE = [
    "django.middleware.security.SecurityMiddleware",
    "config.middleware.RequestIdMiddleware",
    "config.access_log_middleware.AccessLogMiddleware",
    "django.contrib.sessions.middleware.SessionMiddleware",
    "django.middleware.common.CommonMiddleware",
    "django.middleware.csrf.CsrfViewMiddleware",
    "django.contrib.auth.middleware.AuthenticationMiddleware",

    "orgs.middleware.TenantMiddleware",

    "django.contrib.messages.middleware.MessageMiddleware",
    "django.middleware.clickjacking.XFrameOptionsMiddleware",
]
```

### Why ordering matters
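- TenantMiddleware uses `request.user`, which requires `AuthenticationMiddleware`.
- It must run before your views so `request.tenant` is available.
- `AuthenticationMiddleware` only covers session login. DRF token or JWT authentication
  runs inside the view, so for those requests `request.user` is still anonymous here;
  API viewsets should resolve the tenant themselves, as in 41.7.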

---

## 41.5.3 Use tenant context in views (cleanly)
In a task view you can now do:

```python
from django.http import Http404

# Inside a tenant-scoped view:
org = request.tenant
if org is None:
    # This endpoint is tenant-scoped; if no tenant, 404.
    raise Http404
```

But: still keep explicit org lookup for non-path routes. TenantMiddleware only
resolves `/orgs/<slug>/...`.

---

# 41.6 Tenant-Scoped Query Patterns (How to Make Leaks Hard)

There are two main patterns. Many teams use both.

## 41.6.1 Pattern A (recommended): Selectors require tenant explicitly
This is safest because the function signature forces correct scoping.

Example (you already do something similar):

```python
# tasks/selectors.py
def task_qs_for_org(*, organization) -> QuerySet[Task]:
    return Task.objects.filter(organization=organization)  # ... rest of your existing chain
```

This prevents accidental leaks because:
- the only way to skip the tenant is to bypass the selector entirely, which stands out in code review.

## 41.6.2 Pattern B (advanced): A TenantQuerySet that auto-filters using context
This can reduce boilerplate but increases hidden magic. Use carefully.

Example:

```python
# core/tenancy.py
from django.db import models
from orgs.tenant_context import get_current_tenant_id


class TenantQuerySet(models.QuerySet):
    def for_current_tenant(self):
        tenant_id = get_current_tenant_id()
        if tenant_id is None:
            raise RuntimeError("No tenant in context")
        return self.filter(organization_id=tenant_id)


# Manager.from_queryset() copies the custom queryset methods onto the manager,
# so Task.objects.for_current_tenant() works directly on the manager.
TenantManager = models.Manager.from_queryset(TenantQuerySet)
```

Then in a tenant model:

```python
class Task(models.Model):
    objects = TenantManager()
```

Usage:

```python
Task.objects.for_current_tenant().filter(status="open")
```

### Why this is risky
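- The filter is opt-in: `Task.objects.filter(...)` or `Task.objects.all()` without
  `for_current_tenant()` is still unscoped, as is anything going through `Task._base_manager`.
- Admin, management commands, and scripts run without tenant context, so
  `for_current_tenant()` raises `RuntimeError` there.
- Filters that depend on ambient context make queries harder to read and debug.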

**Industry guidance:** Use Pattern A unless you have a strong team-wide convention and strong tests.

---

# 41.7 Tenant Isolation in DRF (API Multi-Tenancy)

API multi-tenancy tends to leak if you don’t enforce it at the queryset level.

## 41.7.1 A base mixin for tenant-scoped viewsets
Create `core/api/tenant.py`:

```python
from __future__ import annotations

from rest_framework.exceptions import NotFound

from orgs.services import get_org_for_user_or_404


class TenantFromUrlMixin:
    """
    Resolves tenant from URL kwarg `org_slug` and membership rules.
    Provides get_tenant(), which returns the Organization or raises a 404.
    """

    tenant_kwarg = "org_slug"

    def get_tenant(self):
        org_slug = self.kwargs.get(self.tenant_kwarg)
        if not org_slug:
            raise NotFound("Tenant not specified.")
        return get_org_for_user_or_404(user=self.request.user, org_slug=org_slug)
```

Then in your `TaskViewSet.get_queryset()`:

```python
org = self.get_tenant()
return task_qs_for_org(organization=org)
```

### Why this matters
- Multi-tenancy must be enforced at the query boundary.
- Permissions alone are not enough if queryset includes other tenants’ data.

---

# 41.8 Tenant Isolation in Caching (Common Leak Vector)

Any cache key must include tenant identity if the cached value is tenant-specific.

Bad key:
- `"top_tasks"` (shared across tenants)

Good keys:
- `f"top_tasks:org:{org.id}:v1"`
- `f"task_counts:org:{org.id}:status:{status}:v2"`

Example utility:

```python
def tenant_cache_key(org_id: int, *parts: str) -> str:
    # Join with ":" so a call without extra parts has no trailing colon.
    return ":".join(["myapp", "org", str(org_id), *parts])
```

Then:

```python
key = tenant_cache_key(org.id, "task_counts", "v1")
```

**Rule:** if you can’t prove a cache entry is public, include tenant id in key.

---

# 41.9 Tenant Isolation in Background Tasks (Celery)

Background tasks often run without a request object, so they can’t use
`request.tenant`. You must pass tenant identifiers explicitly.

## 41.9.1 Pass org_id into tasks
Good:

```python
run_task_export_job_task.delay(job.id)  # job stores organization_id
```

Or:

```python
some_task.delay(org_id=org.id, ...)
```

Inside tasks:
- always scope queries by org_id

Example safe pattern:

```python
Task.objects.filter(organization_id=org_id, ...)
```

## 41.9.2 Avoid “global queries” in tasks
Never do:

```python
Task.objects.filter(status="open")
```

in multi-tenant background jobs unless it’s truly cross-tenant (rare; typically only for staff/system operations).
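
One way to make the "always pass org_id" rule hard to forget is a small guard decorator. The sketch below is plain Python with hypothetical names (`requires_org_id`, `count_open_tasks`), and a list of dicts stands in for a queryset. With Celery, you would presumably stack `@shared_task` above the guard.

```python
import functools


def requires_org_id(func):
    """Reject calls that do not pass a non-None org_id keyword argument."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if kwargs.get("org_id") is None:
            raise ValueError(f"{func.__name__} must be called with org_id=...")
        return func(*args, **kwargs)

    return wrapper


@requires_org_id
def count_open_tasks(*, org_id, rows):
    # Mirrors Task.objects.filter(organization_id=org_id, status="open").count()
    return sum(
        1 for r in rows if r["organization_id"] == org_id and r["status"] == "open"
    )


rows = [
    {"organization_id": 1, "status": "open"},
    {"organization_id": 1, "status": "done"},
    {"organization_id": 2, "status": "open"},
]
open_for_org_1 = count_open_tasks(org_id=1, rows=rows)
```

The guard only checks that a tenant was supplied; the query inside the task still has to use it.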

---

# 41.10 Tenant Isolation in Webhooks

Webhooks often include tenant identifiers (account_id, org_slug, etc.). You must:

- verify signature
- deduplicate event id
- resolve tenant from webhook payload in a safe mapping (not “trust whatever they send”)
- scope all writes to that tenant

A good pattern:
- store a table mapping provider_account_id → organization_id
- do not use “org_slug from payload” unless your provider guarantees it and you validate it.

---

# 41.11 Tenant Isolation in Channels (WebSockets)

You already used group names:
- `org-<org_id>`

This is exactly correct.

**Rule:** group naming must include tenant boundary so you never broadcast tenant A
events to tenant B sockets.

Also:
- enforce org membership before adding the socket to the group (you did)

---

# 41.12 Admin and Tenant Isolation (A Common “Oops” Area)

Admin is often staff-only, but you may have “tenant admins” who are staff users.

Decide:
- Are tenant admins allowed into Django admin?
- If yes, you must filter admin querysets by tenant.

Example admin pattern (for TaskAdmin) if tenant admins exist:

```python
def get_queryset(self, request):
    qs = super().get_queryset(request)
    if request.user.is_superuser:
        return qs
    # If you store the user’s “current org” somewhere, filter by it.
    # Otherwise admin must be superuser only.
    return qs.none()
```

**Industry reality:** Many SaaS apps keep Django admin only for internal staff and
build a separate “tenant admin UI” for customers.

---

# 41.13 Tests That Catch Tenant Leaks (Non‑Optional)

You want tests that fail the moment someone forgets a tenant filter.

## 41.13.1 Regression test: guessing task id from another org must not work
Create `tests/test_tenant_isolation.py` (pytest) or `tasks/tests_tenant.py`.

Django `TestCase` example:

```python
from django.contrib.auth import get_user_model
from django.test import TestCase
from django.urls import reverse

from orgs.models import Membership, Organization
from tasks.models import Task


class TenantIsolationTests(TestCase):
    def setUp(self):
        User = get_user_model()

        self.alice = User.objects.create_user(username="alice", password="pass12345")
        self.bob = User.objects.create_user(username="bob", password="pass12345")

        self.org_a = Organization.objects.create(name="OrgA", slug="orga")
        self.org_b = Organization.objects.create(name="OrgB", slug="orgb")

        Membership.objects.create(
            organization=self.org_a, user=self.alice, role=Membership.Role.MEMBER
        )
        Membership.objects.create(
            organization=self.org_b, user=self.bob, role=Membership.Role.MEMBER
        )

        self.task_b = Task.objects.create(
            organization=self.org_b,
            title="B task",
            description="x",
            status=Task.Status.OPEN,
            priority=Task.Priority.MEDIUM,
            assigned_to=None,
            created_by=self.bob,
            updated_by=self.bob,
        )

    def test_member_cannot_access_other_org_task_by_id(self):
        self.client.login(username="alice", password="pass12345")

        url = reverse(
            "tasks:detail",
            kwargs={"org_slug": "orga", "task_id": self.task_b.id},
        )
        response = self.client.get(url)

        # Because detail lookup must include organization=org, this should be 404.
        self.assertEqual(response.status_code, 404)
```

### Why this test is powerful
It simulates the real attacker behavior:
- guess IDs
- call endpoints directly

It catches missing org scoping instantly.

---

# 41.14 Multi‑Tenancy Checklist (Use This Every Time)

### Tenant resolution
- [ ] Tenant is derived from path/subdomain and validated
- [ ] Membership/authorization is enforced at tenant resolution
- [ ] `request.tenant` exists for tenant-scoped routes
- [ ] Tenant context cleared after request (no leakage across requests)

### ORM scoping
- [ ] Every tenant model query includes tenant filter (selectors enforce it)
- [ ] `get_object_or_404` always filters by tenant
- [ ] No global `.all()` usage in tenant code paths

### APIs
- [ ] ViewSets filter querysets by tenant derived from URL
- [ ] Object permissions do not rely on unscoped querysets

### Caching
- [ ] Tenant included in cache keys for tenant-specific data
- [ ] Shared caches/CDN do not cache authenticated tenant pages publicly

### Background jobs
- [ ] Tasks receive org_id/job_id and scope queries
- [ ] No cross-tenant background queries unless explicitly intended and protected

### Realtime
- [ ] group names include tenant id
- [ ] membership enforced before join
- [ ] payload contains minimal data

### Webhooks
- [ ] signature verified and event id deduplicated
- [ ] tenant resolved through a server-side mapping, not trusted from the payload

### Admin
- [ ] non-superuser admin querysets are filtered by tenant, or admin is internal-staff only

### Tests
- [ ] regression tests for “guess id” across tenants
- [ ] regression tests for export/download across tenants
- [ ] regression tests for API list/detail scoping

---

# 41.15 Exercises (Do These Before Proceeding)

1. Implement `TenantMiddleware` and verify `request.tenant` exists on:
   - `/orgs/<slug>/tasks/`
2. Add a log enrichment:
   - in AccessLogMiddleware, include tenant_id if available
3. Add tenant-aware caching:
   - cache “task counts” per org
   - include org id in cache key
4. Add 3 tenant leak regression tests:
   - task detail id guessing
   - export download id guessing
   - task API detail id guessing
5. Decide and document your tenant identification strategy:
   - path vs subdomain
   - explain why it fits your product

---

## 41.16 Chapter Summary

- Multi-tenancy is primarily about **data isolation** and **preventing leaks**.
- Shared DB + tenant_id is the most common and practical strategy, but requires
  strict scoping discipline.
- Tenant resolution must be authenticated/authorized (membership checks), not just
  “read a slug.”
- The safest pattern is selectors/services that require tenant explicitly.
- Tenant context affects caching, background jobs, webhooks, and WebSockets.
- High-value regression tests are essential: they catch tenant leak bugs before production.

---

Next chapter: **42. Advanced Authorization (Policy‑Based Access)**  
We’ll move from ad-hoc checks to a scalable authorization system: object-level
policies, role definitions, attribute-based rules, auditing, and testing patterns
for permission correctness.

