#### General
All the data needed for a database cluster is stored within the cluster's data directory, commonly referred to as PGDATA (after the name of the environment variable that can be used to define it). A common location for PGDATA is /var/lib/pgsql/data. Multiple clusters, managed by different server instances, can exist on the same machine.
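
A quick way to tell whether a path is a cluster data directory is to look for entries that every initialized cluster contains, such as the PG_VERSION file and the base and global subdirectories (all described in the table below). The following is a minimal sketch of that check. The function name `describe_pgdata` is hypothetical, and the test is a heuristic, not something PostgreSQL provides.

```shell
# Hypothetical helper: report whether a directory looks like a PostgreSQL
# data directory by checking for a few well-known entries.
describe_pgdata() {
    dir=$1
    # PG_VERSION, base and global exist in every initialized cluster.
    if [ -f "$dir/PG_VERSION" ] && [ -d "$dir/base" ] && [ -d "$dir/global" ]; then
        echo "data directory, major version $(cat "$dir/PG_VERSION")"
    else
        echo "not a data directory"
        return 1
    fi
}
```

Calling `describe_pgdata "$PGDATA"` only makes sense on the database server host, as a user allowed to read the directory.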

Contents of PGDATA:

Item | Description
--- | ---
PG_VERSION | A file containing the major version number of PostgreSQL
base | Subdirectory containing per-database subdirectories
global | Subdirectory containing cluster-wide tables, such as pg_database
pg_tblspc | Subdirectory containing symbolic links to tablespaces
pg_wal | Subdirectory containing WAL (Write Ahead Log) files
pg_xact | Subdirectory containing transaction commit status data
pg_commit_ts | Subdirectory containing transaction commit timestamp data
pg_stat | Subdirectory containing permanent files for the statistics subsystem
pg_stat_tmp | Subdirectory containing temporary files for the statistics subsystem
pg_dynshmem | Subdirectory containing files used by the dynamic shared memory subsystem
current_logfiles | File recording the log file(s) currently written to by the logging collector
pg_subtrans | Subdirectory containing subtransaction status data
pg_logical | Subdirectory containing status data for logical decoding
pg_multixact | Subdirectory containing multitransaction status data (used for shared row locks)
pg_notify | Subdirectory containing LISTEN/NOTIFY status data
pg_replslot | Subdirectory containing replication slot data
pg_serial | Subdirectory containing information about committed serializable transactions
pg_snapshots | Subdirectory containing exported snapshots
pg_twophase | Subdirectory containing state files for prepared transactions
postgresql.auto.conf | A file used for storing configuration parameters that are set by ALTER SYSTEM
postmaster.opts | A file recording the command-line options the server was last started with
postmaster.pid | A lock file recording the current postmaster process ID (PID), cluster data directory path, postmaster start timestamp, port number, Unix-domain socket directory path (could be empty), first valid listen_address (IP address or *, or empty if not listening on TCP), and shared memory segment ID (this file is not present after server shutdown)
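
Because postmaster.pid is a plain text file with one field per line, its contents can be labeled with a few lines of shell. The sketch below assumes the lines appear in the same order as the fields listed in the table and ignores any lines after the shared memory entry. The function name `describe_postmaster_pid` and the output labels are hypothetical.

```shell
# Hypothetical helper: label the lines of a postmaster.pid file.
# Assumption: line order matches the field order given in the table above.
describe_postmaster_pid() {
    file=$1
    n=0
    while IFS= read -r line || [ -n "$line" ]; do
        n=$((n + 1))
        case $n in
            1) label="pid" ;;
            2) label="data_directory" ;;
            3) label="start_time" ;;
            4) label="port" ;;
            5) label="socket_directory" ;;
            6) label="listen_address" ;;
            7) label="shared_memory" ;;
            *) break ;;   # ignore any further lines
        esac
        printf '%s=%s\n' "$label" "$line"
    done < "$file"
}
```

On a running server this could be called as `describe_postmaster_pid "$PGDATA/postmaster.pid"`. The file does not exist once the server has shut down.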

#### Directory tree visualization
![Directories](./helpers/directories-detailed.jpeg)

#### Base
For each database in the cluster there is a subdirectory within PGDATA/base, named after the database's OID in pg_database. This subdirectory is the default location for the database's files; in particular, its system catalogs are stored there.\
Each table and index is stored in a separate file named after the relation's filenode number, which can be found in pg_class.relfilenode. The filenode often starts out equal to the relation's OID, but operations such as TRUNCATE, REINDEX, and CLUSTER assign a new one, so the two can diverge over time.
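
To make the naming scheme concrete, the sketch below splits a relative path of the form base/&lt;database OID&gt;/&lt;filenode&gt;, which is what pg_relation_filepath() returns for a relation in the default tablespace, into its two parts. The function name `split_relation_path` is hypothetical, the OIDs in the usage note are made-up values, and paths under global or pg_tblspc are deliberately not handled.

```shell
# Hypothetical helper: split a relation path into database OID and filenode.
# Only handles the default-tablespace form base/<database OID>/<filenode>.
split_relation_path() {
    path=$1
    case $path in
        base/*/*)
            rest=${path#base/}              # <database OID>/<filenode>
            echo "database_oid=${rest%%/*}"
            echo "filenode=${rest#*/}"
            ;;
        *)
            echo "unsupported path: $path" >&2
            return 1
            ;;
    esac
}
```

For example, `split_relation_path base/16384/16385` reports database OID 16384 and filenode 16385.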


In [None]:
export PGHOST=db
export PGUSER=postgres
export PGDATABASE=postgres

In [None]:
psql << EOM
    DROP TABLE IF EXISTS my_table;
    CREATE TABLE my_table(
        ID INT,
        NAME VARCHAR(20)
    );

    SELECT
        oid
        ,relname
        ,relfilenode
    FROM pg_class
    WHERE relname = 'my_table';
EOM

In [None]:
psql << EOM
    SELECT pg_relation_filepath('my_table');
EOM

In [None]:
# TRUNCATE assigns a new filenode, so the file path changes
psql << EOM
    SELECT pg_relation_filepath('my_table');
    TRUNCATE my_table;
    SELECT pg_relation_filepath('my_table'); -- New file node
EOM

#### Relation forks
- main fork (no suffix) - the relation's actual data
- segments (suffix .1, .2, ...) - when a fork's file grows past 1 GB (the default segment size), it is split into additional segment files; the first segment has no numeric suffix
- free space map (suffix: _fsm) - for heaps and indexes (except hash indexes), keeps track of available space in the relation
- visibility map (suffix: _vm) - for heaps only, keeps track of which pages contain only tuples that are known to be visible to all active transactions
- initialization fork (suffix: _init) - for unlogged tables (tables not written to WAL) and their indexes, an empty table or index of the appropriate type, used to reset the relation after a crash

![RelationFork](./helpers/relation-fork.png)
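
The naming rules can be sketched in shell. On disk the fork suffixes are written with a leading underscore (for example 16385_fsm), and extra segments get a numeric suffix. The two function names below are hypothetical, the filenode 16385 is a made-up value, and the segment calculation assumes the default 1 GB segment size.

```shell
# Hypothetical helper: name the fork (and segment) a relation file belongs to,
# based only on its file name.
classify_relation_file() {
    name=$(basename "$1")
    case $name in
        *_fsm|*_fsm.*)   echo "free space map" ;;
        *_vm|*_vm.*)     echo "visibility map" ;;
        *_init|*_init.*) echo "initialization fork" ;;
        *.*)             echo "main fork, segment ${name##*.}" ;;
        *)               echo "main fork, first segment" ;;
    esac
}

# Assumption: the default segment size of 1 GB.
SEGMENT_BYTES=1073741824

# Hypothetical helper: print the main-fork file names expected for a
# relation with the given filenode and size in bytes.
segment_file_names() {
    filenode=$1
    size=$2
    # Round up; an empty relation still has one (empty) file.
    count=$(( (size + SEGMENT_BYTES - 1) / SEGMENT_BYTES ))
    if [ "$count" -lt 1 ]; then count=1; fi
    i=0
    while [ "$i" -lt "$count" ]; do
        if [ "$i" -eq 0 ]; then echo "$filenode"; else echo "$filenode.$i"; fi
        i=$((i + 1))
    done
}
```

For instance, `segment_file_names 16385 2500000000` lists three files, since 2.5 GB of data does not fit in two 1 GB segments.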

#### Tablespaces
Tablespaces in PostgreSQL allow database administrators to define locations in the file system where the files representing database objects can be stored. Once created, a tablespace can be referred to by name when creating database objects. By using tablespaces, an administrator can control the disk layout of a PostgreSQL installation. This is useful in at least two ways:
1. If the partition or volume on which the cluster was initialized runs out of space and cannot be extended, a tablespace can be created on a different partition and used until the system can be reconfigured.
1. Tablespaces allow an administrator to use knowledge of the usage pattern of database objects to optimize performance. For example, an index which is very heavily used can be placed on a very fast, highly available disk, such as an expensive solid state device. At the same time a table storing archived data which is rarely used or not performance critical could be stored on a less expensive, slower disk system.

The directory $PGDATA/pg_tblspc contains symbolic links that point to each of the non-built-in tablespaces defined in the cluster.
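
Each link in pg_tblspc is named after the OID of the tablespace it points to, so the mapping can be read straight from the file system. The following is a minimal sketch, with a hypothetical function name `list_tablespace_links`. It assumes the `readlink` utility is available and that it runs on the database server host.

```shell
# Hypothetical helper: print "<tablespace OID> -> <target directory>" for
# each symbolic link in a pg_tblspc directory.
list_tablespace_links() {
    tblspc_dir=$1
    for link in "$tblspc_dir"/*; do
        # Skip the unexpanded glob (empty directory) and non-symlinks.
        [ -L "$link" ] || continue
        echo "$(basename "$link") -> $(readlink "$link")"
    done
}
```

A call such as `list_tablespace_links "$PGDATA/pg_tblspc"` prints nothing when only the built-in tablespaces exist.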

In [None]:
# Requires root permissions; run this on the database server host (for example, inside the database container)
mkdir -p /ssd/postgresql/data
chown -R postgres /ssd/postgresql/data

In [None]:
# The directory must be owned by the operating system user that runs the PostgreSQL server
# The two DROP statements make this cell safe to re-run

psql << EOM
    DROP TABLE IF EXISTS foo;
    DROP TABLESPACE IF EXISTS fastspace;
    CREATE TABLESPACE fastspace LOCATION '/ssd/postgresql/data';
    CREATE TABLE foo(i int) TABLESPACE fastspace;
    INSERT INTO foo VALUES (1), (2), (3);
    SELECT * FROM foo;
EOM

In [None]:
# Check available tablespaces

psql << EOM
    SELECT spcname FROM pg_tablespace;
EOM

#### Important catalogs
- pg_class: describes tables and other objects that have columns or are otherwise similar to a table
- pg_database: stores information about the available databases
- pg_tablespace: stores information about the available tablespaces
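
These three catalogs can be combined to see where a relation lives. The sketch below wraps one such query in a hypothetical shell function, `show_relation_locations`, in the same psql heredoc style as the cells above. It assumes the tables `my_table` and `foo` from the earlier examples exist and that the PGHOST, PGUSER and PGDATABASE variables are set. A `reltablespace` of 0 in pg_class means the relation uses the database's default tablespace, which is why a LEFT JOIN is used.

```shell
# Hypothetical helper: show database OID, filenode and tablespace for the
# example tables by joining pg_class, pg_database and pg_tablespace.
show_relation_locations() {
    psql <<'EOM'
    SELECT
        d.oid AS database_oid
        ,c.relname
        ,c.relfilenode
        ,COALESCE(t.spcname, '(database default)') AS tablespace
    FROM pg_class AS c
    JOIN pg_database AS d ON d.datname = current_database()
    LEFT JOIN pg_tablespace AS t ON t.oid = c.reltablespace
    WHERE c.relname IN ('my_table', 'foo');
EOM
}
```

Running `show_relation_locations` after the earlier cells should list `foo` under `fastspace` and `my_table` under the database default.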