# What is a database index

A database index is a data structure that lets the database locate rows in a table quickly, without having to search through every row. It trades extra disk space (needed to store the index) for faster lookups. A single table can have multiple indexes, and each index can cover one or more columns of that table.
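As a rough illustration of the difference an index makes, the sketch below uses Python's built-in `sqlite3` module (not Django) and asks SQLite how it plans to run the same lookup before and after an index exists. The `player` table and the `player_name_idx` index name are hypothetical, chosen only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE player (id INTEGER PRIMARY KEY, name TEXT, position TEXT)")
conn.executemany(
    "INSERT INTO player (name, position) VALUES (?, ?)",
    [(f"player-{i}", "GK") for i in range(1000)],
)


def query_plan(sql):
    """Return SQLite's description of how it would execute `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


lookup = "SELECT * FROM player WHERE name = 'player-500'"

# Without an index, SQLite has to scan the whole table.
plan_without_index = query_plan(lookup)

# With an index on `name`, it can search the index instead.
conn.execute("CREATE INDEX player_name_idx ON player (name)")
plan_with_index = query_plan(lookup)

print(plan_without_index)
print(plan_with_index)
```

The exact wording of the plan differs between SQLite versions, but the first plan should describe a scan of the table and the second a search that uses `player_name_idx`.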

# How to create an index in Django

In Django, indexes are declared when defining the model class. There are several ways to create one; the simplest is to pass the `db_index=True` kwarg when defining a model field.

You can also set a list of `indexes` in the model's `Meta` class, specifying which field(s) each index uses, as in the example below. More on the different types of indexes later.

```python
from django.db import models

class MyModel(models.Model):
    foo = models.CharField(max_length=255)
    bar = models.CharField(max_length=255)
    class Meta:
        indexes = [
            models.Index(fields=['foo']),
            models.Index(fields=['bar']),
        ]
```

However, Django may automatically create some indexes, even if not explicitly asked to, to optimize for the most common cases. Let's consider the following scenario:

```python
from django.db import models

PLAYER_POSITION_CHOICES = [
    ('GK', 'Goalkeeper'),
    ('RB', 'Right back'),
    ('CB', 'Center back'),
    ('LB', 'Left back'),
    ('DM', 'Defensive midfielder'),
    ('CM', 'Central midfielder'),
    ('AM', 'Attacking midfielder'),
    ('CF', 'Center forward'),
    ('RW', 'Right winger'),
    ('LW', 'Left winger'),
]

class City(models.Model):
    name = models.CharField(max_length=255)

class Team(models.Model):
    name = models.CharField(max_length=255)
    tv_name = models.CharField(max_length=3, unique=True)
    city = models.ForeignKey(City, on_delete=models.CASCADE)

class Player(models.Model):
    name = models.CharField(max_length=255)
    position = models.CharField(max_length=2, choices=PLAYER_POSITION_CHOICES)
    team = models.ForeignKey(Team, on_delete=models.CASCADE)

class Stadium(models.Model):
    name = models.CharField(max_length=255)
    nickname = models.CharField(max_length=255)
    city = models.ForeignKey(City, on_delete=models.CASCADE)
    teams = models.ManyToManyField(Team)
```

```sh
django-admin sqlmigrate teams 0001
```

```sql
BEGIN;
--
-- Create model City
--
CREATE TABLE "teams_city" ("id" bigserial NOT NULL PRIMARY KEY, "name" varchar(255) NOT NULL);
--
-- Create model Team
--
CREATE TABLE "teams_team" ("id" bigserial NOT NULL PRIMARY KEY, "name" varchar(255) NOT NULL, "tv_name" varchar(3) NOT NULL UNIQUE, "city_id" bigint NOT NULL);
--
-- Create model Stadium
--
CREATE TABLE "teams_stadium" ("id" bigserial NOT NULL PRIMARY KEY, "name" varchar(255) NOT NULL, "nickname" varchar(255) NOT NULL, "city_id" bigint NOT NULL);
CREATE TABLE "teams_stadium_teams" ("id" bigserial NOT NULL PRIMARY KEY, "stadium_id" bigint NOT NULL, "team_id" bigint NOT NULL);
--
-- Create model Player
--
CREATE TABLE "teams_player" ("id" bigserial NOT NULL PRIMARY KEY, "name" varchar(255) NOT NULL, "position" varchar(2) NOT NULL, "team_id" bigint NOT NULL);
ALTER TABLE "teams_team" ADD CONSTRAINT "teams_team_city_id_0a7041e6_fk_teams_city_id" FOREIGN KEY ("city_id") REFERENCES "teams_city" ("id") DEFERRABLE INITIALLY DEFERRED;
CREATE
```

By running `sqlmigrate`, we can inspect the SQL that Django will execute to create the tables.

Notice how it calls `CREATE INDEX` multiple times, even though we never explicitly defined a single index for any of our fields.

```sql
CREATE INDEX "teams_team_city_id_0a7041e6" ON "teams_team" ("city_id");
CREATE INDEX "teams_stadium_city_id_54e53781" ON "teams_stadium" ("city_id");
CREATE INDEX "teams_player_team_id_4ee5cf70" ON "teams_player" ("team_id");
CREATE INDEX "teams_stadium_teams_stadium_id_1521a159" ON "teams_stadium_teams" ("stadium_id");
CREATE INDEX "teams_stadium_teams_team_id_57c6bc0d" ON "teams_stadium_teams" ("team_id");
```

The indexes above are created to speed up the queries that follow our foreign key and many-to-many relations, but there is one more index, created for the `tv_name` field:

```sql
CREATE INDEX "teams_team_tv_name_7d091fee_like" ON "teams_team" ("tv_name" varchar_pattern_ops);
```

This last index is created because we declared `tv_name` with `unique=True` directly on the field. On PostgreSQL, Django adds this second index, built with the `varchar_pattern_ops` operator class (it is an operator class, not a function), so that pattern-matching lookups such as `LIKE 'SA%'` can use an index. Maybe you don't need that extra optimization. In our case, `tv_name` is a short string (only 3 characters) holding the abbreviation shown for each team in the TV broadcast of a football match (e.g. SAO for São Paulo, SAN for Santos, SAL for Salgueiro).
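To get an intuition for why an ordered index helps with prefix patterns, here is a small pure-Python sketch. It is an analogy only, not how PostgreSQL implements its indexes: a sorted list stands in for the index, and `bisect` finds the contiguous range of keys that start with a prefix. The `prefix_range` helper and the `FLA`, `PAL` and `SPO` codes are hypothetical additions for the example.

```python
from bisect import bisect_left


def prefix_range(sorted_keys, prefix):
    """Return every key in `sorted_keys` that starts with `prefix`.

    Because the keys are sorted, all matches sit next to each other,
    so two binary searches are enough to find them.
    """
    start = bisect_left(sorted_keys, prefix)
    # chr(0x10FFFF) is the largest code point, so this upper bound
    # sorts after every key that begins with `prefix`.
    end = bisect_left(sorted_keys, prefix + chr(0x10FFFF))
    return sorted_keys[start:end]


# A sorted list plays the role of the index on tv_name.
tv_name_index = sorted(["SAO", "SAN", "SAL", "FLA", "PAL", "SPO"])

matches = prefix_range(tv_name_index, "SA")
print(matches)
```

With only a handful of three-character codes, the gain over comparing every value is negligible, which is the point made about `tv_name` here.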

With names this short, some of which are nearly identical, a plain equality comparison is probably fast enough, and dropping the extra index saves some valuable database space. Since we still want `tv_name` to be unique, we can remove `unique=True` from the field and declare a unique constraint in the model's `Meta` class instead.

```python
class Team(models.Model):
    name = models.CharField(max_length=255)
    tv_name = models.CharField(max_length=3)
    city = models.ForeignKey(City, on_delete=models.CASCADE)

    class Meta:
        constraints = [
            models.UniqueConstraint(
                fields=['tv_name'],
                name='%(app_label)s_%(class)s_tv_name_unique',
            ),
        ]
```

```sh
django-admin sqlmigrate teams 0002
```

```sql
BEGIN;
--
-- Alter field tv_name on team
--
DROP INDEX IF EXISTS "teams_team_tv_name_7d091fee_like";
--
-- Create constraint teams_team_tv_name_unique on model team
--
ALTER TABLE "teams_team" ADD CONSTRAINT "teams_team_tv_name_unique" UNIQUE ("tv_name");
COMMIT;
```
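The constraint alone is enough to keep `tv_name` unique. The sketch below checks that behaviour with Python's built-in `sqlite3` module rather than PostgreSQL, so it is only an approximation of the setup above: SQLite cannot add a constraint with `ALTER TABLE`, so the constraint is declared when the table is created, and the `try_insert` helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE teams_team ("
    "id INTEGER PRIMARY KEY, "
    "tv_name VARCHAR(3) NOT NULL, "
    "CONSTRAINT teams_team_tv_name_unique UNIQUE (tv_name))"
)
conn.execute("INSERT INTO teams_team (tv_name) VALUES ('SAO')")


def try_insert(tv_name):
    """Insert a team and report whether the database accepted it."""
    try:
        conn.execute("INSERT INTO teams_team (tv_name) VALUES (?)", (tv_name,))
        return True
    except sqlite3.IntegrityError:
        # Raised when the unique constraint is violated.
        return False


duplicate_accepted = try_insert("SAO")  # same code as the existing row
new_value_accepted = try_insert("SAN")  # a code not stored yet

print(duplicate_accepted, new_value_accepted)
```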


# Types of indexes

# Tradeoffs

Space vs. time: show examples where optimizing for the fastest speed ends up taking a lot of space on disk.
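One way to observe the space side of this trade-off is to measure a database before and after adding indexes. The sketch below does that with Python's built-in `sqlite3` module, assuming a hypothetical `player` table with 20,000 generated rows; actual numbers depend on the database engine, the data, and the indexes chosen.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE player (id INTEGER PRIMARY KEY, name TEXT, position TEXT)")
conn.executemany(
    "INSERT INTO player (name, position) VALUES (?, ?)",
    [(f"player-{i:06d}", "CM") for i in range(20000)],
)


def database_size_bytes():
    """Size of the database, computed as number of pages times page size."""
    page_count = conn.execute("PRAGMA page_count").fetchone()[0]
    page_size = conn.execute("PRAGMA page_size").fetchone()[0]
    return page_count * page_size


size_without_indexes = database_size_bytes()

# Every extra index stores its own copy of the indexed column values.
conn.execute("CREATE INDEX player_name_idx ON player (name)")
conn.execute("CREATE INDEX player_name_position_idx ON player (name, position)")

size_with_indexes = database_size_bytes()

print(f"without indexes: {size_without_indexes} bytes")
print(f"with indexes:    {size_with_indexes} bytes")
```

The second index here largely overlaps with the first, which is the kind of redundancy that costs space without necessarily making any query faster.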