# Part II — Core Django  
## 9. Django ORM (Querying Like a Pro)

This chapter teaches you how to **think in QuerySets**.

If you can design correct, efficient ORM queries, you can build:
- fast list/detail pages
- clean filtering/search
- correct permissions checks
- scalable APIs
- reports and dashboards

If you *can’t*, you’ll eventually run into:
- N+1 query performance disasters
- duplicated/missing results
- slow pages as data grows
- “works locally, times out in production” problems

We’ll use your existing models:

- `articles.models.Article`
- `articles.models.Tag`

---

## 9.0 Learning Outcomes

By the end of this chapter, you should be able to:

1. Explain what a **QuerySet** is (lazy, composable, cacheable).
2. Predict exactly **when** the database is hit (evaluation triggers).
3. Write expressive queries using:
   - filters/lookups (`icontains`, `gte`, `in`, `isnull`, etc.)
   - `Q` objects (OR/AND/NOT logic)
   - `F` expressions (field-to-field operations)
4. Use relationships correctly:
   - filtering across relations (`tags__slug="django"`)
   - avoiding duplicates with `.distinct()`
5. Use aggregation and annotation:
   - `Count`, `Max`, `Min`, `Avg`, `Sum`
   - annotate tag counts, article counts, etc.
6. Optimize queries with:
   - `select_related` (FK/O2O)
   - `prefetch_related` (M2M/reverse FK)
   - `Prefetch(...)` for filtered prefetches
7. Perform safe bulk operations:
   - `update()`, `delete()`
   - `bulk_create()`, `bulk_update()`
8. Debug ORM behavior:
   - inspect generated SQL
   - measure query counts
   - use `assertNumQueries` in tests
9. Understand transactions basics:
   - `atomic()`
   - `select_for_update()` (conceptually; DB-dependent)

---

## 9.1 ORM Mental Model: QuerySets Are **Lazy Programs**, Not Results

### 9.1.1 QuerySet = “a query description”
When you write:

```python
qs = Article.objects.filter(status=Article.Status.PUBLISHED)
```

you have **not** fetched articles yet. You have created a QuerySet object that
describes *how* to fetch them.

Only later—when you “evaluate” it—does Django hit the database.

### 9.1.2 Why laziness exists (real value in production)
Laziness allows:

- composing filters progressively (based on user inputs)
- avoiding unnecessary DB hits (don’t fetch if not needed)
- letting Django compile all chained conditions into a single SQL statement

Example pattern in real views:

```python
qs = Article.objects.filter(status=Article.Status.PUBLISHED)

tag = request.GET.get("tag")
if tag:
    qs = qs.filter(tags__slug=tag)

q = request.GET.get("q")
if q:
    qs = qs.filter(...)
```

If the user didn’t provide `tag`, you never add that constraint. All of this is
built *before* the DB is queried.
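
To make the idea concrete, here is a minimal pure-Python sketch of a lazy query object. `LazyQuery`, its `log` list, and the in-memory `table` are invented for illustration; this is a toy model of the behavior described above, not how Django implements QuerySets.

```python
class LazyQuery:
    """Toy stand-in for a QuerySet: it only records conditions until iterated."""

    def __init__(self, rows, conditions=(), log=None):
        self._rows = rows                      # pretend this is the database table
        self._conditions = tuple(conditions)   # accumulated (field, value) pairs
        self.log = log if log is not None else []  # shared record of "DB hits"

    def filter(self, **equals):
        # Return a NEW query object; nothing is executed here.
        new_conditions = self._conditions + tuple(equals.items())
        return LazyQuery(self._rows, new_conditions, self.log)

    def __iter__(self):
        # Evaluation happens only when iteration starts.
        self.log.append(f"SELECT ... WHERE {dict(self._conditions)}")
        for row in self._rows:
            if all(row.get(key) == value for key, value in self._conditions):
                yield row


table = [
    {"slug": "hello-django", "status": "published", "tag": "django"},
    {"slug": "draft-note", "status": "draft", "tag": "django"},
    {"slug": "cbv-guide", "status": "published", "tag": "cbv"},
]

qs = LazyQuery(table).filter(status="published")
tag = "django"  # imagine this came from request.GET
if tag:
    qs = qs.filter(tag=tag)

hits_before = len(qs.log)              # nothing has run yet
slugs = [row["slug"] for row in qs]    # iteration triggers the single "query"
hits_after = len(qs.log)
```

Two `filter()` calls produce one "query", and that query runs only when the result is actually consumed.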

---

## 9.2 When the DB Is Hit (QuerySet Evaluation Triggers)

You must know the evaluation triggers because accidental evaluation is a common source
of performance bugs.

A QuerySet is evaluated when you:

### 9.2.1 Iterate over it
```python
for a in Article.objects.all():
    print(a.title)
```

### 9.2.2 Convert to list
```python
articles = list(Article.objects.all())
```

### 9.2.3 Slice in a way that forces evaluation
- `qs[:10]` returns a sliced QuerySet (still lazy).
- but `qs[0]` fetches a single row immediately.

Examples:

```python
qs = Article.objects.order_by("-created_at")

first_ten = qs[:10]  # lazy
first = qs[0]        # evaluated immediately
```

### 9.2.4 Call methods that must return concrete values
These hit the DB:

- `.get(...)`
- `.count()`
- `.exists()`
- `.first()`, `.last()`
- `.aggregate(...)`

Examples:

```python
Article.objects.filter(status="published").count()
Article.objects.filter(slug="hello-django").exists()
```

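### 9.2.5 Beware `len(qs)` and `bool(qs)`
- `len(qs)` evaluates the QuerySet, loads every object, then counts in Python.
  - This is often a mistake; use `qs.count()` to let the database count.
- `bool(qs)` (including `if qs:`) also evaluates the whole QuerySet and loads every row
  just to test truthiness.
  - Use `qs.exists()` when you only need a yes/no answer; it runs a cheap existence query.
  - `if qs:` is fine when you will iterate the same `qs` right afterwards, because the
    results are then already cached.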

Bad:

```python
if qs:
    ...
```

Better:

```python
if qs.exists():
    ...
```

---

## 9.3 QuerySet Caching (Important Subtlety)

Once you evaluate a QuerySet, Django caches the results **inside that QuerySet
instance**.

Example:

```python
qs = Article.objects.filter(status=Article.Status.PUBLISHED)

list(qs)     # hits DB
list(qs)     # does NOT hit DB again (uses cached results)
```

But if you create a new QuerySet (even from the old one), it’s a new query program:

```python
qs2 = qs.filter(slug="hello-django")
list(qs2)  # new DB query
```
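
The two snippets above can be modeled with a small pure-Python sketch. `CachingQuery` and its `db_hits` counter are hypothetical names used only for this illustration; the class imitates the per-instance caching behavior and is not Django's implementation.

```python
class CachingQuery:
    """Toy model of per-instance result caching (illustration only)."""

    db_hits = 0  # stands in for "number of queries sent to the database"

    def __init__(self, rows, predicate=None):
        self._rows = rows
        self._predicate = predicate or (lambda row: True)
        self._result_cache = None  # filled on the first full evaluation

    def filter(self, predicate):
        # A derived query is a new object that starts with an EMPTY cache.
        parent = self._predicate

        def combined(row):
            return parent(row) and predicate(row)

        return CachingQuery(self._rows, combined)

    def __iter__(self):
        if self._result_cache is None:
            CachingQuery.db_hits += 1
            self._result_cache = [r for r in self._rows if self._predicate(r)]
        return iter(self._result_cache)


rows = [
    {"slug": "hello-django", "status": "published"},
    {"slug": "draft-note", "status": "draft"},
]

qs = CachingQuery(rows, lambda r: r["status"] == "published")
list(qs)
list(qs)
hits_same_instance = CachingQuery.db_hits   # second list() reused the cache

qs2 = qs.filter(lambda r: r["slug"] == "hello-django")
result2 = list(qs2)
hits_after_derived = CachingQuery.db_hits   # the derived query ran on its own
```

The cache belongs to one object. Deriving a new query from an evaluated one never reuses the parent's results.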

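**Practical implication:** If you reuse the same QuerySet object several times in a
view, you get the cached results for free. Only full evaluation fills the cache, though:
indexing or slicing an unevaluated QuerySet (`qs[0]`, `list(qs[:5])`) hits the database
every time. Don’t rely on caching as a “performance trick”; use proper optimization
patterns (`select_related`/`prefetch_related`) instead.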

---

## 9.4 Basic Retrieval: `all()`, `get()`, `filter()`, `exclude()`

### 9.4.1 `all()`
```python
Article.objects.all()
```

Returns a QuerySet for all rows.

### 9.4.2 `get()` (exactly one row)
```python
article = Article.objects.get(slug="hello-django")
```

- If 0 rows: raises `Article.DoesNotExist`
- If >1 row: raises `Article.MultipleObjectsReturned`

Use `get()` when you expect exactly one object (detail pages).

### 9.4.3 `filter()` (0..N rows)
```python
qs = Article.objects.filter(status=Article.Status.PUBLISHED)
```

Returns a QuerySet (possibly empty).

### 9.4.4 `exclude()`
```python
qs = Article.objects.exclude(status=Article.Status.ARCHIVED)
```

### 9.4.5 `first()` and `last()`
```python
latest = Article.objects.order_by("-created_at").first()
```

`first()` returns:
- an object or `None`
- hits the DB with `LIMIT 1`

---

## 9.5 Field Lookups (The Core Language of Filtering)

Lookups are specified using double underscores `__`.

### 9.5.1 String lookups
- `contains`: case-sensitive substring (on SQLite it behaves like `icontains`)
- `icontains`: case-insensitive substring (common search)
- `startswith`, `istartswith`
- `endswith`, `iendswith`
- `exact`, `iexact`

Examples:

```python
Article.objects.filter(title__icontains="django")
Article.objects.filter(slug__startswith="hello-")
Tag.objects.filter(name__iexact="django")
```

### 9.5.2 Numeric/date comparisons
- `gt`, `gte`, `lt`, `lte`
- `range`

Examples:

```python
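from datetime import timedelta

from django.utils import timezone

now = timezone.now()
start = now - timedelta(days=7)  # example window: the last 7 days
end = now

Article.objects.filter(created_at__lte=now)
Article.objects.filter(id__gte=10)
Article.objects.filter(created_at__range=(start, end))  # inclusive on both ends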
```

### 9.5.3 Membership: `in`
```python
Article.objects.filter(status__in=[Article.Status.DRAFT, Article.Status.PUBLISHED])
```

### 9.5.4 Null checks: `isnull`
```python
Article.objects.filter(published_at__isnull=True)
```

### 9.5.5 Relationship traversal (preview; we’ll use heavily)
```python
Article.objects.filter(tags__slug="django")
Tag.objects.filter(articles__status=Article.Status.PUBLISHED)
```
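
It can help to see roughly what SQL these lookups stand for. The sketch below uses Python's built-in `sqlite3` module on a simplified, made-up `article` table; the SQL fragments are approximations, and the statements Django actually generates differ by backend.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE article ("
    "id INTEGER PRIMARY KEY, title TEXT, status TEXT, published_at TEXT)"
)
conn.executemany(
    "INSERT INTO article VALUES (?, ?, ?, ?)",
    [
        (1, "Hello Django", "published", "2024-01-01"),
        (2, "Views and URLs", "published", "2024-01-02"),
        (3, "Draft Note", "draft", None),
    ],
)


def ids(where, params=()):
    """Run SELECT id ... WHERE <where> and return the matching ids in order."""
    sql = f"SELECT id FROM article WHERE {where} ORDER BY id"
    return [row[0] for row in conn.execute(sql, params)]


# title__icontains="django"  ->  LIKE with wildcards (case-insensitive in SQLite)
by_title = ids("title LIKE ?", ("%django%",))
# id__gte=2  ->  >=
by_id = ids("id >= ?", (2,))
# status__in=["draft", "archived"]  ->  IN (...)
by_status = ids("status IN (?, ?)", ("draft", "archived"))
# published_at__isnull=True  ->  IS NULL
unpublished = ids("published_at IS NULL")
```

Each lookup suffix is a name for a SQL operator; the ORM fills in table names, joins, and parameter escaping for you.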

---

## 9.6 AND / OR / NOT Logic with `Q` Objects (Non-Negotiable Skill)

### 9.6.1 Why `Q` exists
Normal `.filter(a=1, b=2)` is always AND:

```python
Article.objects.filter(status="published", slug="hello-django")
# status = published AND slug = hello-django
```

But for OR logic you need `Q`.

### 9.6.2 OR search example (title OR body)
This is the correct version of the search pattern you previewed earlier.

```python
from django.db.models import Q

qs = Article.objects.filter(status=Article.Status.PUBLISHED)

q = "views"
qs = qs.filter(Q(title__icontains=q) | Q(body__icontains=q))
```

### 9.6.3 Combine AND + OR
Example: published AND (tag=django OR tag=routing)

```python
from django.db.models import Q

qs = (
    Article.objects.filter(status=Article.Status.PUBLISHED)
    .filter(Q(tags__slug="django") | Q(tags__slug="routing"))
    .distinct()
)
```

Why `.distinct()`?
- Joining across many-to-many can produce duplicate rows.
- `.distinct()` removes duplicates at the SQL level.
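
To see where the duplicates come from, here is a small sketch of the underlying join using Python's built-in `sqlite3` module. The three tables are a simplified guess at the schema behind `Article`, `Tag`, and the many-to-many link table; the table and column names are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE article (id INTEGER PRIMARY KEY, slug TEXT);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, slug TEXT);
    CREATE TABLE article_tags (article_id INTEGER, tag_id INTEGER);
    INSERT INTO article VALUES (1, 'hello-django'), (2, 'cbv-guide');
    INSERT INTO tag VALUES (1, 'django'), (2, 'routing'), (3, 'cbv');
    INSERT INTO article_tags VALUES (1, 1), (1, 2), (2, 1), (2, 3);
""")

# One result row per (article, matching tag) pair.
JOIN_SQL = """
    SELECT {select} a.slug
    FROM article a
    JOIN article_tags m ON m.article_id = a.id
    JOIN tag t ON t.id = m.tag_id
    WHERE t.slug = 'django' OR t.slug = 'routing'
    ORDER BY a.slug
"""

without_distinct = [r[0] for r in conn.execute(JOIN_SQL.format(select=""))]
with_distinct = [r[0] for r in conn.execute(JOIN_SQL.format(select="DISTINCT"))]
```

`hello-django` carries both tags, so the plain join returns it twice; `DISTINCT` collapses the two rows into one.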

### 9.6.4 NOT logic
```python
from django.db.models import Q

qs = Article.objects.filter(~Q(status=Article.Status.ARCHIVED))
```

Or with exclude:

```python
qs = Article.objects.exclude(status=Article.Status.ARCHIVED)
```

Both work; choose the clearer one.
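
If the `|`, `&`, and `~` operators feel like magic, the toy class below shows the general idea: operator overloading builds a condition tree that is rendered to SQL later. `Cond` is a hypothetical class written for this illustration; it is not Django's `Q` and it skips parameter escaping entirely.

```python
class Cond:
    """Toy condition tree that renders to SQL text (illustration only)."""

    def __init__(self, sql):
        self.sql = sql

    def __and__(self, other):      # cond1 & cond2
        return Cond(f"({self.sql} AND {other.sql})")

    def __or__(self, other):       # cond1 | cond2
        return Cond(f"({self.sql} OR {other.sql})")

    def __invert__(self):          # ~cond
        return Cond(f"NOT ({self.sql})")


published = Cond("status = 'published'")
in_title = Cond("title LIKE '%views%'")
in_body = Cond("body LIKE '%views%'")
archived = Cond("status = 'archived'")

# Parentheses matter: in Python, & binds more tightly than |.
where = published & (in_title | in_body) & ~archived
where_sql = where.sql
```

The same precedence rule applies to real `Q` objects, so wrap every OR group in parentheses when you mix it with AND.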

---

## 9.7 Ordering, Slicing, Pagination (Query-Level Concepts)

### 9.7.1 Ordering
```python
Article.objects.order_by("-created_at")
Article.objects.order_by("title", "-created_at")
```

Notes:
- `-field` means descending.
- Ordering affects pagination correctness and performance.

### 9.7.2 Slicing
```python
qs = Article.objects.order_by("-created_at")
page_1 = qs[:20]
page_2 = qs[20:40]
```

This translates to SQL `LIMIT`/`OFFSET` (classic offset pagination). You learned the
limitations earlier: shifting results and slow large offsets.
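
The arithmetic behind offset pagination is small enough to sketch in plain Python. `page_bounds` and `limit_offset` are hypothetical helper names for this example; in a real project Django's `Paginator` does this bookkeeping for you.

```python
def page_bounds(page, per_page=20):
    """Return (start, stop) slice bounds for a 1-based page number."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    start = (page - 1) * per_page
    return start, start + per_page


def limit_offset(page, per_page=20):
    """Return the LIMIT/OFFSET values that the slice corresponds to."""
    start, stop = page_bounds(page, per_page)
    return {"LIMIT": stop - start, "OFFSET": start}


items = list(range(45))                     # stand-in for 45 ordered rows
last_page = items[slice(*page_bounds(3))]   # same as items[40:60]
```

Page 3 of 45 rows holds only 5 items, and its `OFFSET` of 40 means the database still has to walk past the first 40 rows to reach them.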

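### 9.7.3 “Stable ordering” is mandatory
Never paginate without an explicit ordering. Without it, the database may return rows in
any order, so the same article can show up on two pages or on none. If the sort field
is not unique (many rows can share a `created_at`), add a unique tie-breaker such as
`id`.

Bad:

```python
Article.objects.all()[0:20]
```

Good:

```python
Article.objects.order_by("-created_at", "-id")[0:20]
```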

---

## 9.8 Selecting Only What You Need: `values()`, `values_list()`, `only()`, `defer()`

### 9.8.1 `values()` and `values_list()`
These return dictionaries/tuples instead of model instances.

Use cases:
- APIs that don’t need model methods
- exporting data
- building lookup maps efficiently

Examples:

```python
Article.objects.filter(status="published").values("id", "slug", "title")
```

Returns a QuerySet of dicts like:

```python
{"id": 1, "slug": "hello-django", "title": "Hello Django"}
```

`values_list`:

```python
Article.objects.values_list("slug", flat=True)
```

Returns a list-like QuerySet of slugs.

### 9.8.2 `only()` and `defer()` (advanced, use carefully)
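- `only("title")` loads just the named fields up front; every other field is deferred.
- `defer("body")` is the inverse: it loads everything except the named fields.
- Accessing a deferred field later triggers one extra query per instance.

Inside a loop, that becomes a hidden N+1 problem.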

Rule of thumb:
- prefer `values()` when you truly want partial data
- use `only()`/`defer()` only when you fully understand access patterns

---

## 9.9 Relationships and Query Performance: `select_related` vs `prefetch_related`

This is the biggest ORM performance topic.

### 9.9.1 What problem are we solving? (N+1 queries)
If you load articles and then for each article load tags:

```python
articles = Article.objects.all()

for a in articles:
    list(a.tags.all())
```

You might cause:
- 1 query to fetch articles
- N queries to fetch tags for each article

Total: **1 + N** queries → slows dramatically as N grows.

### 9.9.2 `select_related` (for ForeignKey / OneToOne)
`select_related` performs a SQL join and fetches the related object in the same query.

Example (if Article had `author = ForeignKey(User)`):

```python
Article.objects.select_related("author").all()
```

Then `article.author` doesn’t require another query.

**Rule:** use `select_related` for single-valued relations (FK/O2O).

### 9.9.3 `prefetch_related` (for ManyToMany and reverse FK)
A many-to-many relation requires a separate query plus Python-side joining.

```python
qs = Article.objects.prefetch_related("tags")
```

This generally runs:
- one query for articles
- one query for the tags of all those articles

Then it attaches tags to each article instance without per-article queries.

**Rule:** use `prefetch_related` for multi-valued relations (M2M/reverse FK).
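
The difference between the two access patterns can be reproduced outside Django. The sketch below uses Python's built-in `sqlite3` module with a simplified, assumed schema and a hypothetical `run()` helper that counts statements. It imitates the prefetch strategy (one `IN (...)` query plus grouping in Python) rather than showing Django's internals.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE article (id INTEGER PRIMARY KEY, slug TEXT);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, slug TEXT);
    CREATE TABLE article_tags (article_id INTEGER, tag_id INTEGER);
    INSERT INTO tag VALUES (1, 'django'), (2, 'routing');
""")
for i in range(1, 11):
    conn.execute("INSERT INTO article VALUES (?, ?)", (i, f"article-{i}"))
    conn.execute("INSERT INTO article_tags VALUES (?, 1)", (i,))
conn.execute("INSERT INTO article_tags VALUES (1, 2)")

stats = {"queries": 0}


def run(sql, params=()):
    """Execute one SELECT and count it."""
    stats["queries"] += 1
    return conn.execute(sql, params).fetchall()


# Pattern 1: one query for the articles, then one query per article (1 + N).
naive = {}
for article_id, slug in run("SELECT id, slug FROM article ORDER BY id"):
    rows = run(
        "SELECT t.slug FROM tag t JOIN article_tags m ON m.tag_id = t.id "
        "WHERE m.article_id = ? ORDER BY t.slug",
        (article_id,),
    )
    naive[slug] = [r[0] for r in rows]
naive_queries = stats["queries"]

# Pattern 2: two queries in total, joined in Python (the prefetch idea).
stats["queries"] = 0
articles = run("SELECT id, slug FROM article ORDER BY id")
ids = [article_id for article_id, _ in articles]
placeholders = ", ".join("?" for _ in ids)
links = run(
    "SELECT m.article_id, t.slug FROM article_tags m "
    "JOIN tag t ON t.id = m.tag_id "
    f"WHERE m.article_id IN ({placeholders}) ORDER BY t.slug",
    ids,
)
tags_by_article = defaultdict(list)
for article_id, tag_slug in links:
    tags_by_article[article_id].append(tag_slug)
prefetched = {slug: tags_by_article[article_id] for article_id, slug in articles}
prefetch_queries = stats["queries"]
```

Both patterns produce the same data, but the first needs 11 statements for 10 articles while the second always needs 2, no matter how many articles there are.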

### 9.9.4 Prove it with a query count test
Add a test to show the effect. Create `articles/tests_queries.py`:

```python
from django.test import TestCase
from django.utils import timezone

from articles.models import Article, Tag


class PrefetchTests(TestCase):
    def setUp(self):
        django_tag = Tag.objects.create(name="Django", slug="django")

        for i in range(10):
            a = Article.objects.create(
                title=f"Article {i}",
                slug=f"article-{i}",
                body="Body",
                status=Article.Status.PUBLISHED,
                published_at=timezone.now(),
            )
            a.tags.add(django_tag)

    def test_n_plus_one_without_prefetch(self):
        qs = Article.objects.filter(status=Article.Status.PUBLISHED)

        with self.assertNumQueries(1 + 10):
            # 1 query for articles + 10 queries for tags (likely)
            for a in qs:
                list(a.tags.all())

    def test_prefetch_removes_n_plus_one(self):
        qs = Article.objects.filter(
            status=Article.Status.PUBLISHED
        ).prefetch_related("tags")

        with self.assertNumQueries(2):
            for a in qs:
                list(a.tags.all())
```

Notes:
- Query counts can vary depending on DB backend and middleware, but the pattern is
  what matters.
- This is the most concrete way to “feel” ORM performance.

### 9.9.5 Filtered prefetch with `Prefetch`
Sometimes you want to prefetch only certain related objects.

Example: prefetch only tags whose slug starts with `d`:

```python
from django.db.models import Prefetch

qs = Article.objects.prefetch_related(
    Prefetch("tags", queryset=Tag.objects.filter(slug__startswith="d"))
)
```

Use case in real apps:
- prefetch only “active” related items
- prefetch only recent comments
- prefetch only items visible to the user

---

## 9.10 Aggregation and Annotation (Reporting Without Leaving the ORM)

### 9.10.1 Aggregation = compute a single summary value
Example: count of published articles:

```python
from django.db.models import Count

Article.objects.filter(status=Article.Status.PUBLISHED).aggregate(total=Count("id"))
# {"total": 123}
```

### 9.10.2 Annotation = compute a value per row
Example: annotate each Tag with how many published articles use it:

```python
from django.db.models import Count, Q

qs = Tag.objects.annotate(
    published_articles_count=Count(
        "articles",
        filter=Q(articles__status=Article.Status.PUBLISHED),
        distinct=True,
    )
).order_by("-published_articles_count", "name")
```

Then:

```python
for tag in qs:
    print(tag.slug, tag.published_articles_count)
```

Why `distinct=True`?
- joins can duplicate rows; distinct avoids inflated counts.
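
For intuition, the annotation above corresponds roughly to a grouped SQL query. The sketch below runs one possible equivalent with Python's built-in `sqlite3` module on a simplified, assumed schema. The exact SQL Django emits depends on the backend, so treat this as an approximation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tag (id INTEGER PRIMARY KEY, slug TEXT);
    CREATE TABLE article (id INTEGER PRIMARY KEY, slug TEXT, status TEXT);
    CREATE TABLE article_tags (article_id INTEGER, tag_id INTEGER);

    INSERT INTO tag VALUES (1, 'django'), (2, 'routing'), (3, 'cbv'), (4, 'unused');
    INSERT INTO article VALUES
        (1, 'hello-django', 'published'),
        (2, 'views-and-urls', 'published'),
        (3, 'cbv-guide', 'published'),
        (4, 'draft-note', 'draft');
    INSERT INTO article_tags VALUES
        (1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 3), (4, 1);
""")

# LEFT JOIN keeps tags that have no articles; the CASE expression yields NULL
# for non-published rows, and COUNT ignores NULLs.
tag_counts = conn.execute("""
    SELECT t.slug,
           COUNT(DISTINCT CASE WHEN a.status = 'published' THEN a.id END)
               AS published_count
    FROM tag t
    LEFT JOIN article_tags m ON m.tag_id = t.id
    LEFT JOIN article a ON a.id = m.article_id
    GROUP BY t.id, t.slug
    ORDER BY published_count DESC, t.slug
""").fetchall()
```

The draft article does not count toward `django`, and the `unused` tag still appears with a count of 0 instead of disappearing from the result.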

### 9.10.3 Common annotation patterns
- count related items
- compute max/min dates
- compute sums (orders totals)
- flags like “has_published” (exists-like logic via annotation)

---

## 9.11 `exists()`, `count()`, and “Don’t Accidentally Load Everything”

### 9.11.1 `exists()` is the right way to ask “is there at least one?”
```python
has_any = Article.objects.filter(status="published").exists()
```

This is typically more efficient than:

```python
Article.objects.filter(status="published").count() > 0
```

because the database can stop at the first matching row instead of counting them all.

### 9.11.2 `count()` runs `COUNT(*)` (usually efficient)
```python
total = Article.objects.filter(status="published").count()
```

Avoid `len(qs)` unless you already evaluated the queryset for other reasons.
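
The three ways of asking "how many?" move very different amounts of data. This sketch uses Python's built-in `sqlite3` module on a made-up `article` table; the SQL is an approximation of what each ORM call boils down to, not a copy of Django's output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO article (status) VALUES (?)",
    [("published" if i % 2 == 0 else "draft",) for i in range(1000)],
)

# exists(): ask for at most one row; the database can stop at the first match.
has_any = conn.execute(
    "SELECT 1 FROM article WHERE status = 'published' LIMIT 1"
).fetchone() is not None

# count(): the database counts and sends back a single number.
total = conn.execute(
    "SELECT COUNT(*) FROM article WHERE status = 'published'"
).fetchone()[0]

# len(qs): every matching row is transferred, then counted in Python.
loaded = conn.execute(
    "SELECT id, status FROM article WHERE status = 'published'"
).fetchall()
rows_transferred = len(loaded)
```

All three agree on the answer, but only the last one pulls 500 rows into memory to get it.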

---

## 9.12 Bulk Updates and Deletes (Fast, but Know What You’re Skipping)

### 9.12.1 `update()` runs a single SQL UPDATE
```python
from django.utils import timezone

Article.objects.filter(status=Article.Status.DRAFT).update(
    status=Article.Status.ARCHIVED,
    updated_at=timezone.now(),
)
```

Important: `update()` does **not** call `save()` on each instance.
That means:
- model `save()` overrides won’t run
- `pre_save`/`post_save` signals won’t fire
- validation won’t run
- `auto_now` fields are not refreshed automatically; set them explicitly, as
  `updated_at` is above

This is good for performance, but you must be intentional.

### 9.12.2 `delete()` on QuerySets
```python
Article.objects.filter(status=Article.Status.ARCHIVED).delete()
```

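This is also a bulk operation: it does not call each model's `delete()` method, but it
does emit `pre_delete`/`post_delete` signals and cascades to related objects according
to their `on_delete` rules. It returns the total number of objects deleted plus a
per-model breakdown.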

---

## 9.13 `F` Expressions (Database-Side Updates, Avoid Race Conditions)

Suppose you had a view counter (not in your model yet). If you do:

```python
a = Article.objects.get(...)
a.views += 1
a.save()
```

Two concurrent requests can overwrite each other (lost update problem).

With `F`, you tell the DB to “increment the current value” atomically:

```python
from django.db.models import F

Article.objects.filter(id=1).update(views=F("views") + 1)
```

Even if two requests happen at once, DB increments correctly.

This pattern is essential for counters, balances, stock counts, etc.
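
The lost update is easy to reproduce deterministically. The sketch below uses Python's built-in `sqlite3` module and interleaves two pretend requests by hand; the `article` table with a `views` column is an assumption made for the example, just like the hypothetical counter above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, views INTEGER NOT NULL)")
conn.execute("INSERT INTO article VALUES (1, 0), (2, 0)")


def read_views(article_id):
    row = conn.execute(
        "SELECT views FROM article WHERE id = ?", (article_id,)
    ).fetchone()
    return row[0]


# Article 1: read-modify-write. Both requests read BEFORE either one writes.
seen_by_a = read_views(1)
seen_by_b = read_views(1)
conn.execute("UPDATE article SET views = ? WHERE id = 1", (seen_by_a + 1,))
conn.execute("UPDATE article SET views = ? WHERE id = 1", (seen_by_b + 1,))
lost_update_result = read_views(1)

# Article 2: the increment is computed by the database (what F("views") + 1 asks for).
for _ in range(2):
    conn.execute("UPDATE article SET views = views + 1 WHERE id = 2")
db_side_result = read_views(2)
```

Two page views leave article 1 at 1 but article 2 at 2: the database-side increment never works from a stale value.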

---

## 9.14 `get_or_create()` and `update_or_create()` (Common Productivity Tools)

### 9.14.1 `get_or_create()`
Useful for tags:

```python
tag, created = Tag.objects.get_or_create(
    slug="django",
    defaults={"name": "Django"},
)
```

Meaning:
- if tag exists, returns it
- else creates it using defaults

### 9.14.2 `update_or_create()`
```python
tag, created = Tag.objects.update_or_create(
    slug="django",
    defaults={"name": "Django Framework"},
)
```

Meaning:
- creates if missing
- updates fields if present

Caution in high-concurrency systems:
- can still race under heavy concurrent inserts without proper constraints
- `unique=True` on slug helps; DB enforces it
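
The role of the unique constraint can be sketched with Python's built-in `sqlite3` module. `get_or_create_tag` and its `simulate_race` hook are hypothetical names for this illustration; the function imitates the get, create, fall-back-to-get idea and is not Django's code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tag ("
    "id INTEGER PRIMARY KEY, slug TEXT UNIQUE NOT NULL, name TEXT NOT NULL)"
)

SELECT_TAG = "SELECT id, slug, name FROM tag WHERE slug = ?"


def get_or_create_tag(slug, default_name, simulate_race=None):
    """Return (row, created). `simulate_race` is a test-only hook."""
    row = conn.execute(SELECT_TAG, (slug,)).fetchone()
    if row is not None:
        return row, False
    if simulate_race is not None:
        simulate_race()  # pretend another request inserts right now
    try:
        conn.execute(
            "INSERT INTO tag (slug, name) VALUES (?, ?)", (slug, default_name)
        )
        created = True
    except sqlite3.IntegrityError:
        # The UNIQUE constraint rejected our duplicate: someone else won the race.
        created = False
    return conn.execute(SELECT_TAG, (slug,)).fetchone(), created


first, first_created = get_or_create_tag("django", "Django")
second, second_created = get_or_create_tag("django", "Ignored Name")


def other_writer():
    conn.execute("INSERT INTO tag (slug, name) VALUES ('routing', 'Other Writer')")


raced, raced_created = get_or_create_tag("routing", "Routing", other_writer)
routing_rows = conn.execute(
    "SELECT COUNT(*) FROM tag WHERE slug = 'routing'"
).fetchone()[0]
```

Without `UNIQUE` on `slug`, the raced insert would succeed and leave two `routing` rows. With it, the loser of the race gets an error it can recover from by fetching the existing row.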

---

## 9.15 Debugging ORM: See the SQL, Measure Queries, Understand Plans

### 9.15.1 Print generated SQL (quick inspection)
```python
qs = Article.objects.filter(status=Article.Status.PUBLISHED).order_by("-created_at")
print(qs.query)
```

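This prints a SQL-like representation of the query. Parameters are interpolated without
proper quoting, so the output is not always valid SQL for your database, but it is very
useful for checking joins, filters, and ordering.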

### 9.15.2 `explain()` (query plan insight)
Many backends support:

```python
print(qs.explain())
```

This helps you see:
- whether indexes are used
- whether full scans happen

In production tuning, this is extremely valuable (especially with PostgreSQL).

### 9.15.3 Count queries in tests: `assertNumQueries`
You already saw this above. It’s a professional technique to stop performance
regressions early.

### 9.15.4 Django Debug Toolbar (dev-only, very common)
You’ll later add it to see:
- SQL queries per request
- time per query
- duplicate queries
- template rendering time

For now, know: it’s the go-to tool to find N+1 queries.

---

## 9.16 Transactions (Basics You Need Now)

### 9.16.1 Why transactions matter
A transaction groups multiple DB operations into an “all succeed or all fail” unit.

Without transaction:
- operation 1 succeeds
- operation 2 fails
- DB is left in a half-updated state

### 9.16.2 `atomic()` (simple transaction)
```python
from django.db import transaction

with transaction.atomic():
    tag = Tag.objects.create(name="New", slug="new")
    article = Article.objects.create(
        title="Tx demo",
        slug="tx-demo",
        body="Body",
        status=Article.Status.DRAFT,
    )
    article.tags.add(tag)
```

If any line fails (constraint error, exception), the whole block rolls back.
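
The all-or-nothing behavior can be demonstrated without Django. In the sketch below, the `with conn:` block of Python's built-in `sqlite3` module plays the role that `transaction.atomic()` plays in the example above. The simplified tables and the `create_tag_and_article` helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tag (id INTEGER PRIMARY KEY, slug TEXT UNIQUE NOT NULL);
    CREATE TABLE article (id INTEGER PRIMARY KEY, slug TEXT UNIQUE NOT NULL);
    INSERT INTO article (slug) VALUES ('tx-demo');
""")
conn.commit()


def create_tag_and_article(tag_slug, article_slug):
    # `with conn:` commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute("INSERT INTO tag (slug) VALUES (?)", (tag_slug,))
        conn.execute("INSERT INTO article (slug) VALUES (?)", (article_slug,))


def tag_count():
    return conn.execute("SELECT COUNT(*) FROM tag").fetchone()[0]


try:
    # The article slug already exists, so the second INSERT violates UNIQUE.
    create_tag_and_article("new", "tx-demo")
    failed = False
except sqlite3.IntegrityError:
    failed = True
tags_after_failure = tag_count()

create_tag_and_article("new", "tx-demo-2")
tags_after_success = tag_count()
```

The tag insert itself succeeded the first time, yet no tag row survives, because the failure of the second statement rolled back the whole block.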

### 9.16.3 `select_for_update()` (advanced locking concept)
Used in financial/stock systems to prevent concurrent updates causing inconsistent
state.

Example concept (must run inside `atomic()`; PostgreSQL, MySQL, and Oracle support it,
while SQLite silently ignores it):

```python
from django.db import transaction

with transaction.atomic():
    a = Article.objects.select_for_update().get(id=1)
    # safely update based on current state
    a.status = Article.Status.ARCHIVED
    a.save()
```

You’ll use this later in “advanced concurrency” scenarios.

---

## 9.17 LAB: Build Real ORM Queries for Your Articles App

## Lab 1 — Seed realistic data (shell)
Run:

```bash
python manage.py shell
```

Paste:

```python
from django.utils import timezone
from articles.models import Article, Tag

Tag.objects.all().delete()
Article.objects.all().delete()

tags = {}
for name, slug in [
    ("Django", "django"),
    ("Routing", "routing"),
    ("CBV", "cbv"),
    ("Templates", "templates"),
]:
    tags[slug] = Tag.objects.create(name=name, slug=slug)

now = timezone.now()

def make_article(i, slug, title, status, tag_slugs):
    a = Article.objects.create(
        title=title,
        slug=slug,
        body=("Body " * 50) + f"#{i}",
        status=status,
        published_at=now if status == Article.Status.PUBLISHED else None,
    )
    a.tags.add(*[tags[s] for s in tag_slugs])

make_article(1, "hello-django", "Hello Django", Article.Status.PUBLISHED,
             ["django", "routing"])
make_article(2, "views-and-urls", "Views and URLs", Article.Status.PUBLISHED,
             ["django", "routing"])
make_article(3, "templates-101", "Templates 101", Article.Status.PUBLISHED,
             ["django", "templates"])
make_article(4, "cbv-guide", "CBV Guide", Article.Status.PUBLISHED, ["cbv", "django"])
make_article(5, "draft-note", "Draft Note", Article.Status.DRAFT, ["django"])

print("Seeded:", Article.objects.count(), "articles")
```

Now you have predictable data to query.

## Lab 2 — Write and verify queries (shell)
Still in shell:

### A) Published articles ordered newest first
```python
qs = Article.objects.filter(status=Article.Status.PUBLISHED).order_by("-created_at")
list(qs.values_list("slug", flat=True))
```

### B) Published articles tagged “django”
```python
qs = Article.objects.filter(
    status=Article.Status.PUBLISHED,
    tags__slug="django",
).distinct()
list(qs.values_list("slug", flat=True))
```

### C) Search: published AND (title OR body contains “urls”)
```python
from django.db.models import Q

qs = Article.objects.filter(status=Article.Status.PUBLISHED).filter(
    Q(title__icontains="urls") | Q(body__icontains="urls")
)
list(qs.values_list("slug", flat=True))
```

### D) Annotate tags with published article counts
```python
from django.db.models import Count, Q

qs = Tag.objects.annotate(
    published_count=Count(
        "articles",
        filter=Q(articles__status=Article.Status.PUBLISHED),
        distinct=True,
    )
).order_by("-published_count", "slug")

[(t.slug, t.published_count) for t in qs]
```

### E) Prefetch tags to avoid N+1
```python
qs = Article.objects.filter(status=Article.Status.PUBLISHED).prefetch_related("tags")

for a in qs:
    print(a.slug, [t.slug for t in a.tags.all()])
```

---

## 9.18 Exercises (Do These Before Proceeding)

1. **Write a function** `search_articles(tag=None, q=None)` that returns a QuerySet:
   - starts with `Article.objects.filter(status=PUBLISHED)`
   - if tag provided, filter tags by slug
   - if q provided, filter with OR across title/body using `Q`
   - ensure `.distinct()` is applied appropriately
   - ensure ordering is `-created_at`

2. Add a “top tags” panel query (for templates):
   - Return top 5 tags by published article count.
   - Use `annotate(Count(...))` and `order_by(...)`.

3. Prove you fixed N+1:
   - Render a page that lists published articles and their tags.
   - In a test, wrap the view call in `assertNumQueries(...)` and make sure it
     stays small (2–3 queries). (You’ll refine exact counts later.)

4. Write one query using `values()` that returns:
   - slug, title, and published_at for all published articles.

5. Explain (in 5–10 lines):
   - Why QuerySets are lazy and why that’s beneficial.
   - Why `.distinct()` is often needed with many-to-many joins.

---

## 9.19 Chapter Summary (What you must retain)

- QuerySets are lazy query programs; learn evaluation triggers.
- Use lookups (`__icontains`, `__gte`, `__in`, `__isnull`) to express filters.
- Use `Q` for OR/NOT logic; avoid “union QuerySets” for search when you need one
  coherent filter.
- Use `.distinct()` when many-to-many joins can duplicate rows.
- Use `select_related` for FK/O2O and `prefetch_related` for M2M/reverse relations.
- Use `annotate`/`aggregate` for reporting.
- Use `update()`/`F()` for safe bulk and concurrent updates.
- Debug by inspecting SQL, counting queries, and using `assertNumQueries`.

---

Next chapter: **Part II — Chapter 10: Admin Site (Productivity Power Tool)**  
We’ll register your `Article` and `Tag` models in Django admin and make it
production-friendly: search, filters, inlines, custom actions, and admin
performance tuning (including query optimizations).