# Different Tables and Views

- Base table :
    - Organized storage
    - Contains Data
    - Loaded with ETL process from physical storage
    - Sources are programs and systems
    - Used for base storage

- Temporary table :
    - Organized storage
    - Contains Data
    - Loaded by a query; held in transient storage for the session
    - Sources are base tables
    - Used for speeding up queries on large tables

- Standard View :
    - Stored query
    - Contains directions to the data, not the data itself
    - Never loaded (only the query definition is stored; it runs each time the view is queried)
    - Sources are base tables
    - Used for complex calculations

- Materialized view :
    - Stored query
    - Contains Data
    - Loads when refreshed
    - Sources are base tables
    - Used for complex calculations that consume too many resources and slow performance
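
The four object types above can be sketched in PostgreSQL as follows. This is a minimal illustration, not a recommended schema: the `orders` table, its columns, and the names of the derived objects are all hypothetical.

```sql
-- Hypothetical base table: persistent storage, normally loaded by an ETL process
CREATE TABLE orders (
    order_id integer PRIMARY KEY,
    region   text,
    amount   numeric
);

-- Temporary table: filled by a query, dropped when the session ends
CREATE TEMP TABLE africa_orders AS
SELECT order_id, amount
FROM orders
WHERE region = 'Africa';

-- Standard view: stores only the query, which runs each time the view is read
CREATE VIEW region_totals AS
SELECT region, SUM(amount) AS total_amount
FROM orders
GROUP BY region;

-- Materialized view: stores the query and its result
CREATE MATERIALIZED VIEW region_totals_mat AS
SELECT region, SUM(amount) AS total_amount
FROM orders
GROUP BY region;

-- The stored result changes only when the materialized view is refreshed
REFRESH MATERIALIZED VIEW region_totals_mat;
```

Reading `region_totals` always reflects the current rows of `orders`, while `region_totals_mat` returns whatever was computed at the last refresh.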


# Information schema

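Provides metadata about the database. The `table_type` column of `information_schema.tables` takes these values:
- `BASE TABLE` : base table
- `LOCAL TEMPORARY` : temporary table
- `VIEW` : view (some systems also list materialized views here; PostgreSQL lists them in `pg_matviews` instead)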
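
As a sketch, the type of every user-visible table can be looked up like this in PostgreSQL. The schema filter is an assumption that simply hides the system schemas; adjust it to the schemas you care about.

```sql
-- List tables and views with their table_type, skipping system schemas
SELECT table_schema, table_name, table_type
FROM information_schema.tables
WHERE table_schema NOT IN ('pg_catalog', 'information_schema')
ORDER BY table_type, table_name;

-- PostgreSQL keeps materialized views in its own catalog view
SELECT schemaname, matviewname
FROM pg_matviews;
```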

# Database storage types

- Row oriented storage:
    - 1 record = 1 row for all columns = 1 tuple
    - Relation between columns
    - Relational database like PostgreSQL
    - Transaction focus
    - Write Intensive
    - eg: Student information table
    - 1 row stored in same location
    - Fast to append / delete whole records
    - Retrieving 1 column takes about as long as retrieving all columns
    - Fast to perform row operations
    - Returning all rows is slow
    - Row aggregations
    - Improve performance by reducing the number of rows with : `WHERE` , `INNER JOIN` , `DISTINCT` , `LIMIT`
    - Example query : `SELECT * FROM some_table WHERE col = 'val' ORDER BY col2`
- Column-oriented storage:
    - 1 record = all rows for 1 column
    - Relation between rows
    - Analytical databases like Amazon Redshift, or wide-column stores like Cassandra
    - Analysis focus
    - Read Intensive
    - eg: Table of all zoo animal names
    - 1 column stored in same location
    - Fast to append / delete whole columns
    - Retrieving 1 row takes about as long as retrieving all rows
    - Fast to perform column operations / calculations
    - Returning all columns is slow
    - Column aggregations
    - Improve performance by :
        - Using `SELECT *` sparingly
        - Using `information_schema.columns` to explore the data :
          ```
          SELECT column_name, data_type FROM information_schema.columns
          WHERE table_schema = 'schema_name' AND table_name = 'some_table';
          ```
    - Example query : `SELECT MIN(col), MAX(col) FROM some_table WHERE col2 = 'val'`

# Database Optimization for Row-Oriented Database

- Requires set-up and maintenance 
- Any combination of columns can form partitions or indexes. Which ones exist is known only from the database administrator or the documentation

2 ways :
- Partitions :
    - Method of splitting one table into many smaller tables
    - Storage flexibility (partitions can be stored across multiple machines)
    - Faster queries [the query goes to the specific machine and partition that holds the data instead of scanning the whole table]
    - Common filter columns : dates, locations
- Indexes : 
    - Method of creating sorted column keys to improve search
    - Reference to data location. Referencing flexibility (pointers)
    - Faster queries [on tables that are slow to query]
    - Common filter columns : Primary Key

# Partition structure

- Parent table :
    - Visible in database front end
    - Write queries here (input records)

- Children tables (centered around a partition column/s):
    - Not visible in database front end
    - Queries search here (read records)
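
The parent/children layout can be sketched with PostgreSQL declarative partitioning (available from version 10). The `sales` table, its columns, and the region values are hypothetical and only illustrate the structure.

```sql
-- Parent table: holds no rows itself, only the partitioning rule
CREATE TABLE sales (
    sale_id integer,
    region  text NOT NULL,
    amount  numeric
) PARTITION BY LIST (region);

-- Children tables: one per value of the partition column
CREATE TABLE sales_africa PARTITION OF sales FOR VALUES IN ('Africa');
CREATE TABLE sales_europe PARTITION OF sales FOR VALUES IN ('Europe');

-- Writes go to the parent; PostgreSQL routes each row to the matching child
INSERT INTO sales (sale_id, region, amount)
VALUES (1, 'Africa', 100.00),
       (2, 'Europe', 250.00);

-- tableoid::regclass shows which child table actually stores each row
SELECT tableoid::regclass AS stored_in, sale_id, region, amount
FROM sales;
```

Inserting a region that has no matching child fails unless a `DEFAULT` partition exists, so the list of children has to be kept in step with the data.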

    

# Partition query assessment

```
EXPLAIN
SELECT some_col
FROM partition_parent_table
WHERE partition_col = 'Africa'
```

# Finding existing indexes

- See the `pg_tables` view (lists tables)
- See the `pg_indexes` view (lists indexes)
- Both are PostgreSQL catalog views holding metadata about the database

`SELECT * FROM pg_indexes`
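
On a database with many tables, the full listing gets long. A narrower sketch, assuming a table called `some_table` in the `public` schema (both names are placeholders):

```sql
-- Show the indexes on one table, with the statement that defines each one
SELECT indexname, indexdef
FROM pg_indexes
WHERE schemaname = 'public'
  AND tablename = 'some_table';
```

The `indexdef` column contains the `CREATE INDEX` statement, which shows the indexed columns and their order.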

# Creating Indexes

```
CREATE INDEX index_name
ON table_name (col_name);
```

- Use `CONCURRENTLY` to avoid blocking writes to the table while the index is being built

```
CREATE INDEX CONCURRENTLY index_name
ON table_name (col_name1, col_name2);
```

# Where not to use index

- Small tables
- Columns with many nulls
- Frequently updated tables:
    - Index will be fragmented (needs re-indexing: an additional step)
    - Write data twice : into the table, into the index
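
The re-indexing step mentioned above can be sketched as follows. `index_name` and `table_name` are the same placeholders used earlier in these notes, and the `CONCURRENTLY` form assumes PostgreSQL 12 or later.

```sql
-- Rebuild a single fragmented index (blocks writes to the table while it runs)
REINDEX INDEX index_name;

-- Rebuild every index on a table
REINDEX TABLE table_name;

-- PostgreSQL 12+: rebuild without blocking writes, at the cost of a slower rebuild
REINDEX INDEX CONCURRENTLY index_name;
```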