# Part VI — Async, Realtime, and Background Work  
## 29. Background Tasks (Celery / RQ Concepts) — Production-Safe Jobs, Retries, Scheduling, Idempotency

Background work is how professional Django systems stay responsive and reliable.

You *do not* want to:
- send emails inside request/response in production
- generate large CSV exports while the user waits
- call slow external APIs inside a web request
- run scheduled jobs (cleanup, reminders) manually

Instead you use a **task queue**:
- the web process enqueues a job quickly
- a worker process executes it asynchronously
- failures are retried
- progress/status is tracked

This chapter will:
1) teach the concepts (so you can reason about any queue system),  
2) implement a real setup with **Celery + Redis** (the most common choice in Django projects), and  
3) show a lightweight alternative using **RQ** (simpler, fewer features).

---

## 29.0 Learning Outcomes

By the end, you should be able to:

- Explain the task queue architecture: **producer → broker → worker → result store**.
- Decide when to use background tasks vs async views vs cron.
- Set up **Celery + Redis** with Django correctly.
- Write tasks with:
  - retries + exponential backoff
  - time limits
  - idempotency (no double-sends / duplicate exports)
- Use `transaction.on_commit()` so tasks only run after DB commit.
- Build a real workflow:
  - “export tasks to CSV” as a background job with progress and download link
  - send realtime notification via Channels when job completes
- Test tasks reliably in CI (eager mode).
- Operate workers in production (separate process, monitoring, safe config).

---

## 29.1 Background Jobs: What They Are (and What They Are Not)

### 29.1.1 Background tasks are for “slow or unreliable work”
Good candidates:
- sending emails (SMTP/provider can be slow)
- generating reports/exports
- resizing images / video processing
- syncing with third-party APIs
- periodic reminders and cleanup tasks
- webhooks processing with retries

Bad candidates:
- tasks that must finish before you can respond safely to the user (then it’s not
  background work; it’s part of the request)
- CPU-heavy work on the same machine as your web processes without limits (it can
  starve the web processes and other workers)

### 29.1.2 Background tasks are not async views
- Async views help handle concurrent I/O **within a request**.
- Background tasks move work **out of the request**, so the request stays fast.

Often you use both:
- request enqueues task quickly
- task does the slow work later (and may itself do async I/O if you choose)

---

## 29.2 Task Queue Architecture (Industry Mental Model)

A typical Celery architecture:

```text
Django web process
  |
  |  enqueue task (message)
  v
Broker (Redis / RabbitMQ)
  |
  |  workers pull messages
  v
Celery workers (separate processes)
  |
  |  do work, possibly store results/status
  v
Result backend / DB / cache / files
```

Key components:

- **Broker**: a queue system (Redis or RabbitMQ).
- **Worker**: process that executes tasks.
- **Result backend**: optional; store task return values (often not needed if you
  store status in your DB models).

**Industry note:** Many teams use Redis as the broker; RabbitMQ is also common and is
often preferred when you need richer routing or stronger delivery guarantees. Start
with Redis for simplicity.
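
The flow in the diagram can be imitated with the standard library alone. This is a toy sketch, not how Celery works internally: `broker` stands in for Redis/RabbitMQ, `results` for the result store, and the names `enqueue` and `worker` are made up for illustration.

```python
import queue
import threading

# Toy stand-ins: `broker` plays Redis/RabbitMQ, `results` plays the result store.
broker = queue.Queue()
results = {}


def enqueue(task_id, x, y):
    """Producer side: put a small, serializable message on the broker and return."""
    broker.put({"id": task_id, "args": [x, y]})


def worker():
    """Worker side: pull messages until a None sentinel arrives."""
    while True:
        message = broker.get()
        if message is None:
            break
        x, y = message["args"]
        results[message["id"]] = x + y  # "do work", then record the outcome


thread = threading.Thread(target=worker)
thread.start()
enqueue("job-1", 2, 3)    # returns immediately, like the web process
enqueue("job-2", 10, 20)
broker.put(None)          # tell the toy worker to stop
thread.join()
print(results)
```

The point to notice is that `enqueue` never waits for the work: the producer only hands over a message, and the outcome shows up later in a separate store.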

---

## 29.3 Celery Setup (Django + Redis) — Step-by-Step

We’ll implement the “standard Celery layout” used in many Django projects.

### 29.3.1 Install dependencies

Add to `requirements.txt` (runtime deps):

```text
celery
redis
```

Install:

```bash
python -m pip install -r requirements.txt
python -m pip freeze > requirements.txt
```

You also need Redis running.

### 29.3.2 Run Redis locally (Docker)

```bash
docker run --rm -p 6379:6379 redis:7-alpine
```

Leave it running.

### 29.3.3 Add Celery config file: `config/celery.py`

Create `config/celery.py`:

```python
from __future__ import annotations

import os

from celery import Celery

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "config.settings")

app = Celery("config")

# Loads any CELERY_* settings from Django settings.py
app.config_from_object("django.conf:settings", namespace="CELERY")

# Auto-discover tasks.py in installed apps
app.autodiscover_tasks()
```

### 29.3.4 Ensure Celery app loads when Django starts

Edit `config/__init__.py`:

```python
from .celery import app as celery_app

__all__ = ["celery_app"]
```

Why this matters:
- Importing the app here guarantees it is loaded whenever Django starts, so
  `@shared_task` functions bind to this app.
- This is the conventional integration described in Celery’s Django guide.

### 29.3.5 Add Celery settings in Django settings

In `config/settings.py` (or `prod.py`/`dev.py`), add:

```python
import os

CELERY_BROKER_URL = os.environ.get("CELERY_BROKER_URL", "redis://127.0.0.1:6379/0")

# Optional: store task results (often you can skip this and store results in DB)
CELERY_RESULT_BACKEND = os.environ.get(
    "CELERY_RESULT_BACKEND",
    "redis://127.0.0.1:6379/1",
)

CELERY_ACCEPT_CONTENT = ["json"]
CELERY_TASK_SERIALIZER = "json"
CELERY_RESULT_SERIALIZER = "json"

CELERY_TIMEZONE = "UTC"

# Recommended safety defaults
CELERY_TASK_TRACK_STARTED = True
CELERY_TASK_TIME_LIMIT = 60 * 10      # hard timeout 10 min
CELERY_TASK_SOFT_TIME_LIMIT = 60 * 9  # soft timeout 9 min
```

#### Why JSON-only serialization is an industry standard
Celery supports pickle, but pickle is risky:
- unpickling data from an untrusted source can execute arbitrary code (security risk)
- JSON is safer and more portable
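
A practical consequence of JSON-only serialization is that task arguments should be plain values such as IDs and strings. The sketch below uses the standard-library `json` module to show the habit; `encode_task_args` is a hypothetical helper, and Celery's own JSON serializer may accept a few more types than the stdlib encoder does.

```python
import json
from datetime import datetime, timezone


def encode_task_args(*args):
    """Serialize task arguments with the stdlib JSON encoder."""
    return json.dumps(list(args))


# Plain IDs and strings serialize fine.
payload = encode_task_args(42, "published_email")

# Rich objects do not: the stdlib encoder rejects a datetime.
when = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
try:
    encode_task_args(when)
    datetime_ok = True
except TypeError:
    datetime_ok = False

# Pass an ISO string (or a database ID) and rebuild the object in the task.
safe_payload = encode_task_args(when.isoformat())
print(payload, datetime_ok, safe_payload)
```

Passing an ID and re-fetching the row inside the task also means the task sees current data rather than a stale copy.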

---

## 29.4 Run Celery Worker (Local Development)

In a separate terminal (with venv active):

```bash
celery -A config worker -l info
```

If you want scheduled tasks later, you’ll also run Celery Beat:

```bash
celery -A config beat -l info
```

**Important operational rule:** Django web process and Celery worker are separate
processes. If your worker isn’t running, tasks won’t execute.

---

## 29.5 Your First Task (Prove Wiring Works)

Create `pages/tasks.py`:

```python
from __future__ import annotations

import time

from celery import shared_task


@shared_task
def ping_task() -> dict:
    time.sleep(0.2)
    return {"status": "ok"}
```

### 29.5.1 Call it from Django shell

```bash
python manage.py shell
```

```python
from pages.tasks import ping_task

result = ping_task.delay()
result.id
result.get(timeout=5)
```

Expected:
- the worker logs execution
- `.get()` returns `{"status": "ok"}`

> In production, you often avoid calling `.get()` in web requests (it defeats the
> purpose of background work).

---

## 29.6 The Most Important Celery Practices (Non-Negotiable in Real Systems)

### 29.6.1 Use `transaction.on_commit()` for tasks triggered by DB writes

**Problem:** If you enqueue a task before the DB transaction commits, the worker may
run immediately and not find the DB row yet (or read stale data).

**Solution:** enqueue after commit:

```python
from django.db import transaction

transaction.on_commit(lambda: my_task.delay(obj_id))
```

This is one of the most important “professional Celery + Django” patterns.
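
To see why the ordering matters, here is a toy model of the behavior. `ToyTransaction` is invented for this illustration and is not Django's implementation; it only mimics the rule that callbacks run after a successful commit and are discarded on rollback.

```python
class ToyTransaction:
    """Toy stand-in for a DB transaction; NOT Django's implementation."""

    def __init__(self):
        self._callbacks = []

    def on_commit(self, func):
        self._callbacks.append(func)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            # Commit succeeded: now it is safe to enqueue.
            for func in self._callbacks:
                func()
        # On an exception the callbacks are simply dropped (rollback).
        self._callbacks.clear()
        return False  # never swallow the exception


enqueued = []

with ToyTransaction() as tx:
    tx.on_commit(lambda: enqueued.append("task-for-row-1"))
    assert enqueued == []  # nothing is enqueued while the transaction is open

try:
    with ToyTransaction() as tx:
        tx.on_commit(lambda: enqueued.append("task-for-row-2"))
        raise RuntimeError("simulated failure before commit")
except RuntimeError:
    pass

print(enqueued)
```

Only the first task is enqueued: the second transaction rolled back, so no worker ever goes looking for a row that was never saved.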

### 29.6.2 Make tasks idempotent (assume at-least-once delivery)
Task queues can deliver the same message more than once (worker restart, retry,
broker behavior). Your task should be safe if executed twice.

Common idempotency strategies:
- write a DB row with a unique constraint (e.g., “published_email_sent”)
- check and return if already done
- use unique keys for exports/events
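
The unique-constraint strategy can be tried out with the standard-library `sqlite3` module. This is a minimal sketch under assumed names (the `notification` table and `send_once` are hypothetical); the `sent` list stands in for the real side effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE notification ("
    " article_id INTEGER NOT NULL,"
    " kind TEXT NOT NULL,"
    " UNIQUE (article_id, kind))"
)

sent = []


def send_once(article_id, kind):
    """Perform the side effect only if this (article_id, kind) is not recorded yet."""
    try:
        with conn:  # commits on success, rolls back if the block raises
            conn.execute(
                "INSERT INTO notification (article_id, kind) VALUES (?, ?)",
                (article_id, kind),
            )
            sent.append((article_id, kind))  # stands in for the real side effect
    except sqlite3.IntegrityError:
        return False  # duplicate delivery: already handled
    return True


first = send_once(1, "published_email")
second = send_once(1, "published_email")
print(first, second, sent)
```

The database, not the application code, decides which delivery wins, so the guard also holds when two workers race.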

### 29.6.3 Retry only what is retryable
Don’t retry validation errors. Retry:
- network timeouts
- temporary provider failures
- transient DB connection errors
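
A minimal, framework-free sketch of that rule follows. `run_with_retries` and the `RETRYABLE` tuple are hypothetical names chosen for this example; a real worker would also wait between attempts.

```python
RETRYABLE = (TimeoutError, ConnectionError)


def run_with_retries(func, max_retries=3):
    """Call func(); retry only on RETRYABLE errors, up to max_retries extra attempts."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return func(), attempts
        except RETRYABLE:
            if attempts > max_retries:
                raise
            # A real worker would sleep here (with backoff) before retrying.


calls = {"flaky": 0, "invalid": 0}


def flaky():
    calls["flaky"] += 1
    if calls["flaky"] < 3:
        raise TimeoutError("provider timed out")
    return "sent"


def invalid():
    calls["invalid"] += 1
    raise ValueError("bad email address")


result, attempts = run_with_retries(flaky)

try:
    run_with_retries(invalid)
except ValueError:
    pass  # not retryable: fails on the first attempt

print(result, attempts, calls)
```

The validation error surfaces after a single call, while the timeout is retried until it succeeds.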

### 29.6.4 Keep tasks small and composable
Prefer:
- “send one email”
- “generate one export job”
over:
- “do everything for 10 minutes in one task”
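
One benefit of small tasks is that a failure affects only one unit of work. The sketch below illustrates this with plain functions; `plan_jobs`, `run_jobs`, and `fake_send` are hypothetical names, and no queue library is involved.

```python
def plan_jobs(recipient_ids):
    """One small message per recipient instead of one giant job."""
    return [{"task": "send_one_email", "args": [rid]} for rid in recipient_ids]


def run_jobs(jobs, send):
    """Run each job independently; collect the ones that need a retry."""
    failed = []
    for job in jobs:
        try:
            send(*job["args"])
        except ConnectionError:
            failed.append(job)
    return failed


delivered = []


def fake_send(recipient_id):
    if recipient_id == 2:
        raise ConnectionError("temporary provider failure")
    delivered.append(recipient_id)


jobs = plan_jobs([1, 2, 3])
to_retry = run_jobs(jobs, fake_send)
print(delivered, to_retry)
```

Only the job for recipient 2 needs another attempt; recipients 1 and 3 are not emailed twice.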

---

## 29.7 Refactor “Article Published Email” to Background Task (Real Upgrade)

You currently send “published” emails from your service layer. That’s fine for
learning but not ideal for production. We’ll enqueue a task instead.

### 29.7.1 Create a task: `articles/tasks.py`

```python
from __future__ import annotations

import logging

from celery import shared_task

from articles.models import Article
from articles.services_email import send_article_published_email

logger = logging.getLogger(__name__)


@shared_task(
    bind=True,
    autoretry_for=(Exception,),
    retry_backoff=True,
    retry_jitter=True,
    retry_kwargs={"max_retries": 5},
)
def send_article_published_email_task(self, article_id: int) -> None:
    """
    Background email sender with retries.
    Must be idempotent at the business rule layer (see next section).
    """
    try:
        article = Article.objects.select_related("author").get(id=article_id)
    except Article.DoesNotExist:
        logger.warning("Article not found for email task article_id=%s", article_id)
        return

    # Policy: only send for published articles.
    if article.status != Article.Status.PUBLISHED:
        logger.info(
            "Skipping published email; article not published id=%s status=%s",
            article.id,
            article.status,
        )
        return

    send_article_published_email(article=article)
```

#### Explanation of Celery retry options
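- `bind=True` passes the task instance as `self` (useful for `self.request` and
  manual `self.retry()` calls).
- `autoretry_for=(Exception,)` retries automatically whenever the task raises one of
  the listed exception types.
- `retry_backoff=True` uses exponential backoff: the delay starts at 1 second and
  doubles with each retry (capped by `retry_backoff_max`, 600 seconds by default).
- `retry_jitter=True` randomizes each delay (prevents retry storms).
- `retry_kwargs={"max_retries": 5}` caps the number of retries, so a task that keeps
  failing eventually gives up.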

In production you should be more selective than `(Exception,)`, but this is a good
workbook starting point.
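
To get a feel for the delays these options produce, here is an illustrative calculation. `backoff_delay` is a hypothetical helper written for this workbook, not Celery's code; its defaults (1-second base, 600-second cap, full jitter) are assumptions chosen to mirror the documented behavior of `retry_backoff`, `retry_backoff_max`, and `retry_jitter`.

```python
import random


def backoff_delay(retries, factor=1, maximum=600, jitter=True):
    """Seconds to wait before retry number `retries` (0-based)."""
    delay = min(maximum, factor * (2 ** retries))
    if jitter:
        # "Full jitter": pick a random delay between 0 and the computed value.
        delay = random.randint(0, delay)
    return delay


schedule = [backoff_delay(n, jitter=False) for n in range(5)]
print(schedule)  # [1, 2, 4, 8, 16]
```

With jitter enabled, each worker picks a different point inside that window, which spreads retries out instead of having them all hit a recovering provider at the same instant.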

### 29.7.2 Add idempotency so “published email” is not sent twice

Create a model to record sending. Add to `articles/models.py`:

```python
from django.db import models


class ArticleNotification(models.Model):
    class Kind(models.TextChoices):
        PUBLISHED_EMAIL = "published_email", "Published email"

    article = models.ForeignKey(
        "articles.Article",
        on_delete=models.CASCADE,
        related_name="notifications",
    )
    kind = models.CharField(max_length=50, choices=Kind.choices)
    created_at = models.DateTimeField(auto_now_add=True)

    class Meta:
        constraints = [
            models.UniqueConstraint(
                fields=["article", "kind"],
                name="unique_article_notification_kind",
            )
        ]

    def __str__(self) -> str:
        return f"{self.kind} for article {self.article_id}"
```

Migrate:

```bash
python manage.py makemigrations
python manage.py migrate
```

Now update the task to be idempotent:

```python
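from django.db import IntegrityError, transaction

from articles.models import Article, ArticleNotification

@shared_task(...)
def send_article_published_email_task(self, article_id: int) -> None:
    ...
    try:
        # Record the notification and send inside one atomic block: if sending
        # raises, the row is rolled back, so a retry can attempt the send again
        # instead of hitting the "already sent" branch.
        with transaction.atomic():
            ArticleNotification.objects.create(
                article=article,
                kind=ArticleNotification.Kind.PUBLISHED_EMAIL,
            )
            send_article_published_email(article=article)
    except IntegrityError:
        logger.info("Published email already sent for article_id=%s", article.id)
        return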
```

#### Why this is the “correct” pattern
Even if:
- the task retries
- the broker delivers twice
- you accidentally enqueue twice

…only one email is sent because the DB enforces uniqueness.

### 29.7.3 Enqueue the task after commit in your publish workflow

Wherever you detect “published_now” (in your article service), do:

```python
from django.db import transaction
from articles.tasks import send_article_published_email_task

if published_now:
    transaction.on_commit(
        lambda: send_article_published_email_task.delay(article.id)
    )
```

This ensures:
- the article row exists and is committed before the task runs
- “published email” delivery no longer adds to request latency
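
The single `lambda` above is fine, but registering callbacks inside a loop has a classic Python pitfall: a lambda looks up its variables when it runs, not when it is created. The sketch below demonstrates this with a hypothetical `fake_delay` standing in for a task's `.delay`; `functools.partial` binds the value immediately.

```python
from functools import partial

enqueued = []


def fake_delay(article_id):
    """Hypothetical stand-in for `some_task.delay`."""
    enqueued.append(article_id)


# Pitfall: every lambda looks up `article_id` when it finally runs.
late_callbacks = []
for article_id in [1, 2, 3]:
    late_callbacks.append(lambda: fake_delay(article_id))

# Safer: partial() binds the current value immediately.
bound_callbacks = []
for article_id in [1, 2, 3]:
    bound_callbacks.append(partial(fake_delay, article_id))

for callback in late_callbacks:
    callback()
late_result = list(enqueued)

enqueued.clear()
for callback in bound_callbacks:
    callback()
bound_result = list(enqueued)

print(late_result)   # [3, 3, 3]
print(bound_result)  # [1, 2, 3]
```

When several objects are saved in one transaction, `transaction.on_commit(partial(task.delay, obj.id))` avoids enqueuing the last ID repeatedly.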

---

## 29.8 Background Export Job (Tasks CSV) — Full Production Pattern

Your current CSV export is synchronous: user clicks export, server builds CSV in the
request. That’s okay for small data but not at scale.

We’ll build:

- a DB model tracking export jobs
- an endpoint to start export (returns 202 Accepted + job ID)
- a Celery task that generates file in background
- a download endpoint for completed jobs
- an optional realtime notification via Channels when complete

### 29.8.1 Create an ExportJob model

You could put this model in a separate `tasks/models_exports.py`, but to keep things
simple we’ll add it to `tasks/models.py` for now.

Add to `tasks/models.py`:

```python
from __future__ import annotations

from django.conf import settings
from django.db import models

from orgs.models import Organization


class TaskExportJob(models.Model):
    class Status(models.TextChoices):
        PENDING = "pending", "Pending"
        RUNNING = "running", "Running"
        DONE = "done", "Done"
        FAILED = "failed", "Failed"

    organization = models.ForeignKey(
        Organization,
        on_delete=models.CASCADE,
        related_name="task_exports",
    )
    created_by = models.ForeignKey(
        settings.AUTH_USER_MODEL,
        on_delete=models.PROTECT,
        related_name="task_exports",
    )

    # Store filters used for export so the result is reproducible/auditable.
    filters = models.JSONField(default=dict, blank=True)

    status = models.CharField(
        max_length=20,
        choices=Status.choices,
        default=Status.PENDING,
    )
    error = models.TextField(blank=True)

    file = models.FileField(
        upload_to="task_exports/",
        null=True,
        blank=True,
    )

    created_at = models.DateTimeField(auto_now_add=True)
    started_at = models.DateTimeField(null=True, blank=True)
    finished_at = models.DateTimeField(null=True, blank=True)

    class Meta:
        ordering = ["-created_at"]
        indexes = [
            models.Index(fields=["organization", "status", "-created_at"]),
        ]

    def __str__(self) -> str:
        return f"Export {self.id} for org {self.organization_id} ({self.status})"
```

Migrate:

```bash
python manage.py makemigrations
python manage.py migrate
```

### 29.8.2 Start export view (HTTP 202 + job id)

Create `tasks/views_exports.py`:

```python
from __future__ import annotations

from django.contrib.auth.decorators import login_required
from django.core.exceptions import PermissionDenied
from django.http import JsonResponse
from django.urls import reverse
from django.views.decorators.http import require_POST

from orgs.services import get_membership, get_org_for_user_or_404
from tasks.permissions import can_export_tasks
from tasks.models import TaskExportJob
from tasks.tasks_exports import run_task_export_job_task


@login_required
@require_POST
def task_export_start(request, org_slug: str):
    org = get_org_for_user_or_404(user=request.user, org_slug=org_slug)
    membership = get_membership(user=request.user, organization=org)

    if not can_export_tasks(membership=membership):
        raise PermissionDenied

    # Capture filter params from request.GET or request.POST as you prefer.
    # For exports, GET query params are common even if the start endpoint is POST.
    filters = {
        "q": request.GET.get("q", ""),
        "status": request.GET.get("status", ""),
        "priority": request.GET.get("priority", ""),
        "assigned_to": request.GET.get("assigned_to", ""),
    }

    job = TaskExportJob.objects.create(
        organization=org,
        created_by=request.user,
        filters=filters,
        status=TaskExportJob.Status.PENDING,
    )

    # Enqueue background job after commit (best practice).
    from django.db import transaction

    transaction.on_commit(lambda: run_task_export_job_task.delay(job.id))

    return JsonResponse(
        {
            "status": "accepted",
            "job_id": job.id,
            "job_status_url": reverse(
                "tasks:export_status",
                kwargs={"org_slug": org.slug, "job_id": job.id},
            ),
            "job_download_url": reverse(
                "tasks:export_download",
                kwargs={"org_slug": org.slug, "job_id": job.id},
            ),
        },
        status=202,
    )
```

### 29.8.3 Status and download endpoints

Add to `tasks/views_exports.py`:

```python
from django.http import FileResponse, Http404

@login_required
def task_export_status(request, org_slug: str, job_id: int):
    org = get_org_for_user_or_404(user=request.user, org_slug=org_slug)
    job = TaskExportJob.objects.filter(organization=org, id=job_id).first()
    if job is None:
        raise Http404

    # Authorization policy: members with export permission can see any export job;
    # everyone else can only see the jobs they created themselves.
    membership = get_membership(user=request.user, organization=org)
    if not can_export_tasks(membership=membership) and job.created_by_id != request.user.id:
        raise PermissionDenied

    return JsonResponse(
        {
            "job_id": job.id,
            "status": job.status,
            "error": job.error,
            "created_at": job.created_at.isoformat(),
            "started_at": job.started_at.isoformat() if job.started_at else None,
            "finished_at": job.finished_at.isoformat() if job.finished_at else None,
            "has_file": bool(job.file),
        }
    )


@login_required
def task_export_download(request, org_slug: str, job_id: int):
    org = get_org_for_user_or_404(user=request.user, org_slug=org_slug)
    job = TaskExportJob.objects.filter(organization=org, id=job_id).first()
    if job is None:
        raise Http404

    membership = get_membership(user=request.user, organization=org)
    if not can_export_tasks(membership=membership) and job.created_by_id != request.user.id:
        raise PermissionDenied

    if job.status != TaskExportJob.Status.DONE or not job.file:
        raise Http404("Export not ready.")

    # FileResponse streams file; better than reading into memory.
    return FileResponse(
        job.file.open("rb"),
        as_attachment=True,
        filename=f"{org.slug}-tasks-export-{job.id}.csv",
    )
```

### 29.8.4 Wire export URLs

In `tasks/urls.py` add:

```python
from tasks import views_exports

urlpatterns += [
    path("exports/start/", views_exports.task_export_start, name="export_start"),
    path("exports/<int:job_id>/", views_exports.task_export_status, name="export_status"),
    path("exports/<int:job_id>/download/", views_exports.task_export_download, name="export_download"),
]
```

Your org-scoped export endpoints now exist under:

- `/orgs/<org_slug>/tasks/exports/start/?status=open...`
- `/orgs/<org_slug>/tasks/exports/<job_id>/`
- `/orgs/<org_slug>/tasks/exports/<job_id>/download/`

(Your `orgs/urls.py` already scopes tasks URLs under `/orgs/<org_slug>/tasks/`.)
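
A client can then poll the status endpoint until the job reaches a terminal state. The sketch below keeps HTTP out of the picture: `fetch_status` is a hypothetical callable that would perform the GET request in a real client, and the fake responses only imitate the JSON keys returned by the status view.

```python
import time


def poll_until_finished(fetch_status, interval=0.0, max_attempts=10):
    """Call fetch_status() until the job is done/failed or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        payload = fetch_status()
        if payload["status"] in {"done", "failed"}:
            return payload, attempt
        time.sleep(interval)
    raise TimeoutError("export did not finish in time")


# Fake responses standing in for GET .../exports/<job_id>/
responses = iter(
    [
        {"status": "pending", "has_file": False},
        {"status": "running", "has_file": False},
        {"status": "done", "has_file": True},
    ]
)

final, attempts = poll_until_finished(lambda: next(responses))
print(final, attempts)
```

In a browser the same loop would live in JavaScript with a delay of a second or two between requests; the capped attempt count keeps a stuck job from being polled forever.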

---

## 29.9 The Export Worker Task (Generate CSV in Background)

Create `tasks/tasks_exports.py`:

```python
from __future__ import annotations

import csv
import io
import logging

from celery import shared_task
from django.core.files.base import ContentFile
from django.utils import timezone

from tasks.models import TaskExportJob
from tasks.selectors import filter_tasks, task_qs_for_org

logger = logging.getLogger(__name__)


@shared_task(
    bind=True,
    autoretry_for=(Exception,),
    retry_backoff=True,
    retry_jitter=True,
    retry_kwargs={"max_retries": 3},
)
def run_task_export_job_task(self, job_id: int) -> None:
    job = TaskExportJob.objects.select_related("organization", "created_by").get(
        id=job_id
    )

    if job.status in {TaskExportJob.Status.DONE, TaskExportJob.Status.RUNNING}:
        # Idempotency: do nothing if already done/running.
        return

    job.status = TaskExportJob.Status.RUNNING
    job.started_at = timezone.now()
    job.error = ""
    job.save(update_fields=["status", "started_at", "error"])

    try:
        org = job.organization
        filters = job.filters or {}

        qs = task_qs_for_org(organization=org).order_by("-created_at")
        qs = filter_tasks(
            qs=qs,
            q=filters.get("q") or "",
            status=filters.get("status") or None,
            priority=int(filters["priority"])
            if str(filters.get("priority") or "").isdigit()
            else None,
            assigned_to=filters.get("assigned_to") or None,
            actor=job.created_by,
        )

        # Write CSV to memory (fine for moderate sizes).
        # For huge exports, write to a temporary file on disk in chunks instead.
        buffer = io.StringIO()
        writer = csv.writer(buffer)
        writer.writerow(
            ["id", "title", "status", "priority", "assigned_to", "created_at"]
        )

        for t in qs.iterator(chunk_size=2000):
            writer.writerow(
                [
                    t.id,
                    t.title,
                    t.status,
                    t.priority,
                    t.assigned_to.username if t.assigned_to else "",
                    t.created_at.isoformat(),
                ]
            )

        content = buffer.getvalue().encode("utf-8")
        buffer.close()

        job.file.save(
            f"{org.slug}-tasks-export-{job.id}.csv",
            ContentFile(content),
            save=False,
        )
        job.status = TaskExportJob.Status.DONE
        job.finished_at = timezone.now()
        job.save(update_fields=["file", "status", "finished_at"])

        # Optional: realtime notification via Channels (if you built Chapter 28)
        try:
            from realtime.broadcast import broadcast_org_event

            broadcast_org_event(
                org_id=org.id,
                payload={
                    "type": "export_done",
                    "job_id": job.id,
                    "download_path": f"/orgs/{org.slug}/tasks/exports/{job.id}/download/",
                },
            )
        except Exception:
            logger.exception("Failed broadcasting export_done job_id=%s", job.id)

    except Exception as e:
        job.status = TaskExportJob.Status.FAILED
        job.finished_at = timezone.now()
        job.error = str(e)
        job.save(update_fields=["status", "finished_at", "error"])
        raise
```

### Critical explanations (why it’s written this way)

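- **Idempotency**:
  - If the task runs twice, it checks the status and skips work that is already done
    or in progress.
  - Caveat: a worker killed mid-run leaves the job in RUNNING, so later deliveries
    skip it; production systems usually add a staleness check on `started_at`.
- **Iterator**:
  - `.iterator(chunk_size=2000)` avoids loading all model instances into memory at
    once. The CSV text itself is still built in memory here.
- **DB status tracking**:
  - The web UI can poll the `export_status` endpoint (`exports/<job_id>/`).
- **FileField storage**:
  - In dev, exports are stored under `MEDIA_ROOT/task_exports/`.
  - In production you might store them in S3 and serve downloads through signed URLs.
- **Retries**:
  - The export job retries on failure up to 3 times (`autoretry_for=(Exception,)`
    does not distinguish transient from permanent errors).
  - Each failed attempt sets the status to FAILED and stores the error; a retry moves
    the job back to RUNNING. If every attempt fails, the job stays FAILED.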
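
The status check in the task reads the row and then saves it, which leaves a small window in which two deliveries can both see PENDING. A conditional update closes that window by letting the database pick one winner. The sketch below shows the idea with `sqlite3`; the `export_job` table and `claim_job` are hypothetical, and in Django the analogous call would be a `filter(...).update(...)` whose return value is the number of rows changed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE export_job (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
conn.execute("INSERT INTO export_job (id, status) VALUES (1, 'pending')")
conn.commit()


def claim_job(job_id):
    """Flip pending/failed -> running in one statement; True if we won the claim."""
    with conn:
        cursor = conn.execute(
            "UPDATE export_job SET status = 'running' "
            "WHERE id = ? AND status IN ('pending', 'failed')",
            (job_id,),
        )
    return cursor.rowcount == 1


first_claim = claim_job(1)
second_claim = claim_job(1)
print(first_claim, second_claim)
```

A task that fails to claim the job simply returns, which is the same outcome as the status check but without the race.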

---

## 29.10 Scheduled Tasks (Celery Beat) — Daily Cleanup and Reminders

Celery Beat is a scheduler that enqueues tasks at intervals (like cron, but in the
Celery ecosystem).

### 29.10.1 Example periodic task: remind about due tasks daily

Create `tasks/tasks_periodic.py`:

```python
from __future__ import annotations

import logging
from datetime import timedelta

from celery import shared_task
from django.utils import timezone

from tasks.models import Task

logger = logging.getLogger(__name__)


@shared_task
def daily_due_tasks_reminder() -> None:
    today = timezone.localdate()
    tomorrow = today + timedelta(days=1)

    qs = Task.objects.filter(due_date__in=[today, tomorrow]).select_related(
        "organization",
        "assigned_to",
    )

    count = qs.count()
    logger.info("daily_due_tasks_reminder due_count=%s", count)

    # In a real app:
    # - group by assigned_to
    # - send one email per user
    # - respect notification preferences
    # - be idempotent per day per user
```
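
The "real app" notes in the comments can be sketched without Django. The example below groups due tasks per user and uses a per-user, per-day key for idempotency; `build_reminders` and the sample rows are hypothetical, and the in-memory set stands in for what would be a database table with a unique constraint.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows: (assigned_user_id, task_title)
due_tasks = [(7, "Write report"), (9, "Review PR"), (7, "Send invoice")]


def build_reminders(rows, today, already_sent):
    """Return one reminder per user, skipping users already reminded today."""
    by_user = defaultdict(list)
    for user_id, title in rows:
        by_user[user_id].append(title)

    reminders = []
    for user_id, titles in sorted(by_user.items()):
        key = (user_id, today.isoformat())  # idempotency key: per user, per day
        if key in already_sent:
            continue
        already_sent.add(key)
        reminders.append({"user_id": user_id, "titles": titles})
    return reminders


sent_keys = set()
today = date(2024, 1, 1)
first_run = build_reminders(due_tasks, today, sent_keys)
second_run = build_reminders(due_tasks, today, sent_keys)
print(first_run)
print(second_run)
```

If Beat fires the task twice on the same day, the second run produces no reminders.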

### 29.10.2 Schedule it via settings (simple baseline)
In settings:

```python
from celery.schedules import crontab

CELERY_BEAT_SCHEDULE = {
    "daily-due-tasks-reminder": {
        "task": "tasks.tasks_periodic.daily_due_tasks_reminder",
        "schedule": crontab(hour=9, minute=0),
    },
}
```

Run beat:

```bash
celery -A config beat -l info
```

#### Production note: “settings-based beat schedule” vs DB-based schedules
- Settings-based is simple but requires a deploy to change the schedule.
- Many teams use `django-celery-beat` to manage schedules in the database via admin.

Use DB-based scheduling when:
- the ops team wants to change schedules without deployments
- you have multiple dynamic periodic jobs

---

## 29.11 Testing Celery Tasks (Deterministic CI)

### 29.11.1 Eager mode (run tasks inline during tests)
In test settings (or `settings.py` guarded by env var), use:

```python
CELERY_TASK_ALWAYS_EAGER = True
CELERY_TASK_EAGER_PROPAGATES = True
```

Meaning:
- `.delay()` runs immediately in the same process
- exceptions propagate (so your tests fail correctly)

### 29.11.2 Test that publish workflow enqueues task (or sends email once)
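If you use eager mode, `.delay()` executes inline and you can assert on the email
outbox length. Prefer enabling eager mode in your test settings module: Celery
typically reads its configuration once when the app is first used, so
`override_settings` applied afterwards (as in the decorator below) may not take
effect on its own.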

Example (Django TestCase):

```python
from django.core import mail
from django.test import TestCase, override_settings
from django.utils import timezone

from articles.models import Article
from tests.factories import ArticleFactory, UserFactory


@override_settings(CELERY_TASK_ALWAYS_EAGER=True, CELERY_TASK_EAGER_PROPAGATES=True)
class CeleryEmailTests(TestCase):
    def test_published_email_task_sends_once(self):
        user = UserFactory(email="u@example.com")

        article = ArticleFactory(
            status=Article.Status.PUBLISHED,
            published_at=timezone.now(),
            author=user,
        )

        from articles.tasks import send_article_published_email_task

        send_article_published_email_task.delay(article.id)

        self.assertEqual(len(mail.outbox), 1)

        # Calling again should not send again because of ArticleNotification uniqueness
        send_article_published_email_task.delay(article.id)
        self.assertEqual(len(mail.outbox), 1)
```

### 29.11.3 Test export job workflow
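- start the job (creates a `TaskExportJob`)
- run the Celery task eagerly; inside `TestCase`, `transaction.on_commit()` callbacks
  only fire if you wrap the request in `self.captureOnCommitCallbacks(execute=True)`
- assert the job is DONE and the file exists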

---

## 29.12 Worker Operations (Production Safety + Performance)

### 29.12.1 Run workers as separate services
In production you typically run:
- web (gunicorn/uvicorn)
- celery worker(s)
- celery beat (scheduler; run exactly one instance, or periodic tasks get enqueued more than once)
- redis (broker) (managed service often)

### 29.12.2 Concurrency settings matter
Celery workers use the prefork (multiprocessing) pool by default, with one child
process per CPU core unless you pass `--concurrency`.
You must tune:
- number of worker instances
- concurrency per worker
- task time limits

### 29.12.3 Avoid “worker death by memory”
If tasks leak memory (common in image processing), use:
- worker recycling options (`worker_max_tasks_per_child`, `worker_max_memory_per_child`)
- smaller tasks
- separate worker pool for heavy tasks

### 29.12.4 Monitoring
Common monitoring tools:
- logs + metrics
- Celery events
- Flower (a web UI) (commonly used)

Even without Flower, you should:
- log task starts/failures
- alert on failure spikes

---

## 29.13 RQ (Redis Queue) — A Simpler Alternative (Concept + Quickstart)

If you want simpler than Celery:
- RQ uses Redis directly
- fewer features than Celery (especially scheduling and complex routing)
- easier to understand initially

High-level pattern:
- create a job function
- enqueue it with `django-rq`
- run a worker with `python manage.py rqworker`

Use RQ when:
- you need “background jobs” but not complex workflows/schedules
- you want minimal moving parts
- you’re okay with a smaller ecosystem for retries and scheduling (RQ has a `Retry`
  helper and a built-in scheduler, but fewer surrounding tools than Celery)

Because your workbook already includes scheduling and advanced patterns, Celery is a
better “mastery-level” tool, but knowing RQ exists is valuable.

---

## 29.14 Chapter Capstone Lab (Do This to Lock It In)

1. **Celery wiring**
   - Add `config/celery.py` and `config/__init__.py` integration
   - Add broker settings
   - Start Redis + worker

2. **Email offloading**
   - Create `send_article_published_email_task`
   - Add idempotency via `ArticleNotification` unique constraint
   - Use `transaction.on_commit(...)` when enqueuing

3. **Export job**
   - Add `TaskExportJob` model
   - Add start/status/download endpoints
   - Add `run_task_export_job_task` that generates CSV + stores file
   - (Optional) broadcast `export_done` via Channels group

4. **Tests**
   - enable eager mode for tests
   - test email idempotency
   - test export job completion sets DONE and file exists

---

## 29.15 Common Background Job Mistakes (And How to Avoid Them)

### Mistake A: Tasks depend on uncommitted DB rows
Fix: always enqueue with `transaction.on_commit`.

### Mistake B: Tasks are not idempotent
Fix: unique constraints + “already processed” checks.

### Mistake C: Workers block on slow network and retry storms happen
Fix:
- use backoff + jitter
- cap retries
- add timeouts to external calls
- distinguish retryable vs non-retryable errors

### Mistake D: Huge exports blow memory
Fix:
- stream writing (iterator + file streaming)
- chunked processing
- background generation + download link
- consider object storage for huge files
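
The first two fixes can be combined by streaming rows into a temporary file rather than building one large string. This is a minimal stdlib sketch under assumed names: `write_csv_in_chunks` is hypothetical, and the generator stands in for `queryset.iterator()`.

```python
import csv
import os
import tempfile


def write_csv_in_chunks(rows, header, chunk_size=2000):
    """Stream rows to a temp file so the full CSV never sits in memory."""
    tmp = tempfile.NamedTemporaryFile(
        mode="w", newline="", encoding="utf-8", suffix=".csv", delete=False
    )
    with tmp:
        writer = csv.writer(tmp)
        writer.writerow(header)
        chunk = []
        for row in rows:
            chunk.append(row)
            if len(chunk) >= chunk_size:
                writer.writerows(chunk)
                chunk.clear()
        if chunk:
            writer.writerows(chunk)  # flush the final partial chunk
    return tmp.name


# A generator stands in for `queryset.iterator()`.
fake_rows = ((i, f"task {i}") for i in range(5))
path = write_csv_in_chunks(fake_rows, ["id", "title"], chunk_size=2)

with open(path, newline="", encoding="utf-8") as fh:
    lines = list(csv.reader(fh))
os.remove(path)  # the caller owns cleanup because delete=False
print(lines)
```

In the export task, the finished temp file could then be handed to the `FileField` (or uploaded to object storage) instead of wrapping an in-memory string in `ContentFile`.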

### Mistake E: Security leak in export jobs
Fix:
- store org_id and actor_id
- re-check permissions at execution time if needed
- scope queries by org always

---

## 29.16 Exercises (Do These Before Proceeding)

1. Add a background task for “send comment moderation alert to staff”:
   - when a comment is submitted, enqueue an email to staff
   - ensure it’s idempotent per comment

2. Add a periodic cleanup task:
   - delete failed export jobs older than 30 days
   - schedule via Celery beat

3. Add a separate Celery queue:
   - `emails` queue and `exports` queue
   - route tasks accordingly
   - run two workers with different concurrency

4. Add logging context:
   - pass `request_id` into export start task
   - include it in task logs (so you can trace from API request to job logs)

---

## 29.17 Chapter Summary

- Background tasks keep your web requests fast and your system resilient.
- Celery + Redis is a common professional stack.
- The three big rules:
  1) `transaction.on_commit()` for DB-triggered tasks  
  2) idempotency (assume tasks can run twice)  
  3) retries with backoff + timeouts for transient failures
- For large work (exports), use job models + async generation + download endpoints.
- You can integrate background jobs with realtime (Channels) for great UX.

---

Next chapter: **Part VII — 30. PostgreSQL for Django (Practical Database Mastery)**  
We’ll move beyond “ORM basics” into real production DB work: PostgreSQL setup,
indexes, query plans, transactions/locks, full-text search, and safe migrations for
large tables.

