# Indexing

As the rate of data generation has grown substantially, performance tuning has become an important part of working with databases.

One of the most common performance optimization techniques is indexing.
An index allows the SQL engine to quickly look up specific records in a table.

# Creating an Index

```
CREATE INDEX <index name> ON <table name> (<column list>);
```

It is reasonable to assume that many queries will use the language field in
the WHERE clause, so creating an index on it should improve performance.

```
tesdb=# CREATE INDEX language_idx
        ON proglang_tbl(language);
```

Let’s verify that the index was created by listing the table description.
In Visual Studio Code, open a terminal and start psql:

```
tesdb=# \d proglang_tbl;
```

# Using EXPLAIN to See Indexes at Work

First, let’s create a large table to run our index-enabled queries on.

A quick and dirty way to get a large table is to use a cartesian product,
also known as a cross join. We already have our proglang_tbl with 5 columns
and 9 rows in it. Cross joining five copies of the table, and taking one field
from each copy, yields 9 to the power 5 = 59049 rows.

```
tesdb=# SELECT a.language,
                b.author,
                c.year,
                d.standard,
                e.id
        INTO biglang_tbl
        FROM proglang_tbl a, proglang_tbl b, proglang_tbl c, proglang_tbl d, proglang_tbl e;

tesdb=# SELECT count(*) FROM biglang_tbl;
```
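
The row count can be sanity-checked outside the database. The following is a minimal Python sketch, not part of the original session: it assumes only that the source table has 9 rows, and the `rows` list is a hypothetical stand-in for proglang_tbl.

```python
from itertools import product

# Hypothetical stand-in for proglang_tbl: only the row count (9) matters here.
rows = list(range(9))

# A five-way cross join pairs every row of each copy with every row
# of the other four copies.
combos = list(product(rows, repeat=5))

row_count = len(combos)
print(row_count)
assert row_count == 9 ** 5
```

The count grows as rows to the power of the number of joined copies, which is why a 9-row table is enough to produce a table large enough for experiments.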

Next, we use EXPLAIN to analyze how the query for finding all Fortran rows
would be executed and what it is estimated to cost:

```
tesdb=# EXPLAIN 
        SELECT * FROM biglang_tbl 
        WHERE language='Fortran';
```

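The plan mentions a Seq Scan, or sequential scan, which means the engine
reads every row of the table to find the matching ones.

Now let’s run the same query again after creating an index on the language field: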

```
tesdb=# CREATE INDEX biglang_idx 
        ON biglang_tbl (language);

tesdb=# EXPLAIN 
        SELECT * FROM biglang_tbl 
        WHERE language='Fortran';
```

The output now mentions Bitmap Heap Scan and Bitmap Index Scan, which shows
that the index is being used.

Both plans report a parameter of the form `cost=<1st value>..<2nd value>`.
The first value is the estimated startup cost and the second value is the
estimated total cost of query execution. The smaller the second value is,
the more efficient the query execution is expected to be.
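
The same before-and-after comparison can be reproduced without a PostgreSQL server. The sketch below is an assumption-laden stand-in: it uses Python's built-in sqlite3 module and SQLite's `EXPLAIN QUERY PLAN`, whose output format differs from PostgreSQL's EXPLAIN and does not report cost values. The table contents are made up for illustration.

```python
import sqlite3
from itertools import product

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE biglang_tbl (language TEXT, id INTEGER)")

# Made-up data: 3 languages x 1000 ids, just to have something to scan.
languages = ["Fortran", "Prolog", "APL"]
conn.executemany(
    "INSERT INTO biglang_tbl VALUES (?, ?)",
    product(languages, range(1000)),
)
conn.commit()


def plan(sql):
    """Return the human-readable 'detail' column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


query = "SELECT * FROM biglang_tbl WHERE language = 'Fortran'"

plan_before = plan(query)  # no index yet: a scan of the whole table
conn.execute("CREATE INDEX biglang_idx ON biglang_tbl (language)")
plan_after = plan(query)   # the planner can now search via biglang_idx

print(plan_before)
print(plan_after)
```

The exact wording of the plan lines varies between SQLite versions, but the first plan reports a scan of the table and the second names `biglang_idx`, mirroring the move from Seq Scan to an index-based scan.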

# Unique Indexes

You can optionally specify the keyword UNIQUE during index creation. A unique
index makes the engine reject duplicate values in the indexed columns.

```
CREATE UNIQUE INDEX <index name> ON <table name> (<column list>);

tesdb=# CREATE UNIQUE INDEX id_idx ON biglang_tbl (id);
```

The id field is actually duplicated many times because of our cross join,
so creating a unique index on this field results in an error.

Checking a table with `\d <table_name>` shows that primary key and unique constraints are also backed by indexes.
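
The failure on duplicated values can be sketched in a self-contained way. The example below uses Python's sqlite3 module as a stand-in for PostgreSQL, so the error text will differ from what psql prints. The three rows and the `language_uidx` name are hypothetical and exist only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE biglang_tbl (id INTEGER, language TEXT)")
conn.executemany(
    "INSERT INTO biglang_tbl VALUES (?, ?)",
    [(1, "Fortran"), (1, "Prolog"), (2, "APL")],  # id 1 appears twice
)
conn.commit()

# Creating a unique index over a column that already holds duplicates fails.
unique_error = None
try:
    conn.execute("CREATE UNIQUE INDEX id_idx ON biglang_tbl (id)")
except sqlite3.Error as exc:
    unique_error = exc

# A unique index on a column without duplicates is accepted.
conn.execute("CREATE UNIQUE INDEX language_uidx ON biglang_tbl (language)")

# List the indexes that actually exist on the table (name is column 1).
index_names = [row[1] for row in conn.execute("PRAGMA index_list('biglang_tbl')")]

print(unique_error)
print(index_names)
```

The rejected index is never created, so only the index on the duplicate-free column shows up in the listing.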

# How Do Indexes Work?

To understand how an index works, it is helpful to think
of it as an ordered lookup table.

The values of the field being indexed are
sorted and stored along with pointers to the locations of the actual
records in the base table.

With an index on the standard field, the
SQL engine would not have to traverse the whole
table to find the two rows with ANSI as the standard field.

![](images/working_of_index.png)

When someone writes a query whose WHERE clause looks for a specific
value of standard, this index comes into effect automatically;
the user does not have to ask for it.
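
The ordered lookup table idea can be sketched in a few lines of Python. This is a simplification under stated assumptions: a real PostgreSQL index is by default a B-tree, not a flat sorted list, and the table rows below are illustrative values rather than the actual contents of proglang_tbl.

```python
from bisect import bisect_left, bisect_right

# Hypothetical base table: (id, language, standard). Values are illustrative.
table = [
    (1, "Prolog", "ISO"),
    (2, "Fortran", "ANSI"),
    (3, "APL", "ISO"),
    (4, "JavaScript", "ECMA"),
    (5, "C", "ANSI"),
]

# The index: (value, row position) pairs sorted by value. The row position
# plays the role of the pointer into the base table.
index = sorted((row[2], pos) for pos, row in enumerate(table))
keys = [value for value, _ in index]


def lookup(value):
    """Binary-search the sorted keys, then follow the pointers to the rows."""
    lo = bisect_left(keys, value)
    hi = bisect_right(keys, value)
    return [table[pos] for _, pos in index[lo:hi]]


ansi_rows = lookup("ANSI")
print(ansi_rows)
```

Because the keys are sorted, the matching entries sit next to each other and can be found by binary search, so the lookup never has to visit the non-matching rows of the table.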

# Index Overheads

As with everything else in the world, there is no
free lunch.

If you create indexes on N fields,
then for every DML statement like INSERT, UPDATE, or DELETE, the N
indexes have to be kept in sync. This makes changing data slower for large
tables, sometimes annoyingly and sometimes worryingly so.
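
The write amplification described above can be made concrete with a toy model. The `IndexedTable` class below is a hypothetical sketch, not how PostgreSQL stores data: each index is a sorted list of (value, row position) pairs, and the counter records how many index updates the inserts trigger.

```python
from bisect import insort


class IndexedTable:
    """Toy table that keeps one sorted index per indexed column."""

    def __init__(self, columns, indexed):
        self.columns = columns
        self.rows = []
        # One sorted list of (value, row position) per indexed column.
        self.indexes = {col: [] for col in indexed}
        self.index_writes = 0

    def insert(self, row):
        pos = len(self.rows)
        self.rows.append(row)
        # Every index must be updated so that it stays in sync with the table.
        for col, entries in self.indexes.items():
            value = row[self.columns.index(col)]
            insort(entries, (value, pos))
            self.index_writes += 1


tbl = IndexedTable(["id", "language", "year"], indexed=["id", "language", "year"])
tbl.insert((1, "Fortran", 1957))
tbl.insert((2, "Prolog", 1972))

print(tbl.index_writes)
```

With three indexed columns, each inserted row costs one table write plus three index writes, so the bookkeeping grows linearly with the number of indexes.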


Another serious overhead that too many indexes bring is their storage
requirements. Indexes occupy physical space on the disk just like a table.



```
tesdb=# SELECT pg_size_pretty(pg_total_relation_size('biglang_tbl'));

tesdb=# SELECT pg_size_pretty(pg_relation_size('biglang_idx'));
```

Our index is roughly 11% of the size of our table! 

A good rule of thumb is to rely heavily on the primary key and unique indexes
in your queries.

Over time you will start seeing patterns of slow-running queries, and those
patterns point to the fields worth indexing.

# Deleting an Index

```
DROP INDEX <index name>;

tesdb=# DROP INDEX biglang_idx;

tesdb=# SELECT COUNT(*) FROM biglang_tbl;
```

DROP INDEX doesn’t change the contents of the
underlying table, so the row count is the same as before.
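
The claim that dropping an index leaves the data untouched can be checked with a small self-contained sketch. It uses Python's sqlite3 module as a stand-in for PostgreSQL, and the three rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE biglang_tbl (language TEXT)")
conn.executemany(
    "INSERT INTO biglang_tbl VALUES (?)",
    [("Fortran",), ("Prolog",), ("APL",)],
)
conn.execute("CREATE INDEX biglang_idx ON biglang_tbl (language)")
conn.commit()


def row_count():
    """Count the rows currently stored in the table."""
    return conn.execute("SELECT count(*) FROM biglang_tbl").fetchone()[0]


count_before = row_count()
conn.execute("DROP INDEX biglang_idx")
count_after = row_count()

print(count_before, count_after)
```

Only the auxiliary lookup structure is removed; queries still return the same rows, they just lose the faster access path.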

