Creating Tables and Editing Data in PostgreSQL
---

# Tables

## Creation

The basic syntax for creating a table is:
```sql
CREATE TABLE tbl_name (
    column_name type [options][,
    additional columns separated by commas][,
    table constraints separated by commas]
);
```

The specific available types vary across dialects. PostgreSQL includes:
- `boolean`, with states `true`, `false`, and unknown (represented by `NULL`)
- `integer` or `int` (i32), plus `smallint` (i16) and `bigint` (i64)
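- Autoincrementing, non-null integers: `serial` (backed by `integer`), `smallserial` (`smallint`), and `bigserial` (`bigint`). PostgreSQL has no unsigned integer types; these are ordinary signed columns whose default is drawn from a sequence that starts at 1. Uniqueness is not enforced unless `PRIMARY KEY` or `UNIQUE` is added. These are frequently used as primary keys.
- Exact decimal numbers: `decimal` or `numeric`. These should be specified with `NUMERIC(precision, scale)`, where `precision` is the total number of significant digits to store and `scale` is how many of those digits fall to the right of the decimal point. Values with more fractional digits than `scale` are rounded. A negative scale (PostgreSQL 15 and later) rounds to the left of the decimal point instead. The `money` type stores a currency amount with a fixed fractional precision and locale-defined input and output formatting (the `lc_monetary` setting).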
- Floating point numbers: `real` (f32) and `double precision` (f64)
- Text types: `text` represents strings of any length, and in PostgreSQL is the native type for string functions. However, it is not in the SQL standard, and thus not all dialects use it. The standard types are `varchar(n)` and `char(n)`, where `n` is the maximum length in characters; `char(n)` pads shorter strings to the maximum with spaces. Without a length, `char` is equivalent to `char(1)`, while `varchar` accepts strings of any length, much like `text`. Note that `varchar` is an alias for `character varying` and `char` for `character`, with the additional PostgreSQL alias `bpchar` ("blank-padded char").
- Date and time: `timestamp`, `date`, `time`, `interval`. Dates and times are represented in input and output as strings. The standard format is ISO 8601: `TIMESTAMP 'YYYY-MM-DD HH:MM:SS'`. The `timestamp` and `time` types can be specified as `with time zone`.
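
As a rough illustration of the types above, the following sketch defines a table that uses several of them and inserts one row. The table name `product_demo` and its columns are hypothetical and chosen only for this example.

```sql
-- Hypothetical table; names are illustrative only.
CREATE TABLE product_demo (
    id          serial,                    -- integer filled from a sequence
    in_stock    boolean,
    quantity    integer,
    price       numeric(8, 2),             -- 8 digits in total, 2 after the point
    weight_kg   double precision,
    name        text,
    sku         char(8),                   -- shorter values are space-padded to 8
    added_at    timestamp with time zone,
    shelf_life  interval
);

INSERT INTO product_demo
    (in_stock, quantity, price, weight_kg, name, sku, added_at, shelf_life)
VALUES
    (TRUE, 12, 19.999, 0.25, 'Tea', 'TEA01',
     TIMESTAMP WITH TIME ZONE '2024-01-15 09:30:00+00',
     INTERVAL '18 months');

-- The stored price is rounded to the declared scale of 2.
SELECT id, price, sku, added_at FROM product_demo;
```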

Options and table constraints include:
- `PRIMARY KEY`, or as a table constraint `PRIMARY KEY(col)`. In the latter form, a list of multiple columns can be used to define a composite key.
- `UNIQUE`. Note that by default, two different `NULL` values are considered distinct and will not violate the constraint. To insist on at most one `NULL`, use `UNIQUE NULLS NOT DISTINCT` (PostgreSQL 15 and later). A single column or a combination of columns can be asserted distinct as a table constraint: `UNIQUE(col)`.
- `REFERENCES tbl(col)`: marks the column as a key linked to another table. This other table must be defined first, and the referenced column must be its primary key or carry a unique constraint. Any non-null value set in this column must then correspond to a value in the referenced column. The reference can point to the same table, for self-joins. In the absence of `(col)`, it is assumed to refer to the other table's primary key. To write as a table constraint (which can also use multiple columns as the key): `FOREIGN KEY(col) REFERENCES tbl(col)`.
- `DEFAULT val`
- `NOT NULL`.
- To define a column as a function of other values: `GENERATED ALWAYS AS (expression) STORED`. The parentheses around the expression are required.
- Constraining values: `[CONSTRAINT name] CHECK (condition)`. `CONSTRAINT name` is optional; the name appears in error output and can be used to drop the constraint later. Typically `condition` will be a test involving the value of the column being defined. If multiple columns are involved, it should be written as a table constraint instead, which has the same syntax. A check passes when the condition evaluates to true or to `NULL`, so `NULL` values generally *satisfy* a check.
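
A minimal sketch combining the column options and table constraints above. The tables `author_demo` and `book_demo`, their columns, and the 20% tax rate are all hypothetical. Generated columns are assumed to be available (PostgreSQL 12 or later).

```sql
-- Referenced table must exist before the table that points at it.
CREATE TABLE author_demo (
    id     serial PRIMARY KEY,
    email  text UNIQUE NOT NULL,
    name   text NOT NULL
);

CREATE TABLE book_demo (
    id         serial PRIMARY KEY,
    author_id  integer REFERENCES author_demo(id),   -- foreign key, may be NULL
    title      text NOT NULL,
    price      numeric(8, 2) DEFAULT 0 CHECK (price >= 0),
    tax        numeric(8, 2) GENERATED ALWAYS AS (price * 0.2) STORED,
    published  date,
    withdrawn  date,
    -- Table constraints: these involve more than one column.
    UNIQUE (author_id, title),
    CONSTRAINT withdrawn_after_published CHECK (withdrawn >= published)
);

INSERT INTO author_demo (email, name) VALUES ('a@example.com', 'A. Writer');

-- price falls back to its default; tax is computed; the NULL dates pass the check.
INSERT INTO book_demo (author_id, title) VALUES (1, 'Untitled');
```

Inserting a second row with the same `author_id` and `title`, or an `author_id` that does not exist in `author_demo`, would raise an error.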

## Modification

Commands to change the schema of a table begin with:
```sql
ALTER TABLE tbl ...;
```

Specific operations include:
- `ADD COLUMN ...`, with the same syntax as the original column definitions.
- `DROP COLUMN col`. The keyword `CASCADE` will also drop anything that depends on that column (e.g. constraints).
- Adding a table constraint: `ADD ...`, with the same syntax as when creating the table.
- Named constraints can be removed with `DROP CONSTRAINT name`. Unnamed constraints have auto-generated names that can be determined with `\d tbl`.
- Changing the properties of a column: `ALTER COLUMN col ...`. To change type, use `TYPE newtype`. To add or remove constraints or defaults use `SET` or `DROP`, e.g. `SET DEFAULT val` or `DROP NOT NULL`.
- Renaming: `RENAME TO newname` or `RENAME COLUMN col TO newname`
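
The operations above might be chained as follows. This is only a sketch; `member_demo` and the constraint name `member_email_unique` are made up for the example.

```sql
CREATE TABLE member_demo (
    id    serial PRIMARY KEY,
    name  varchar(50),
    age   smallint
);

-- Add a column, then a named table constraint on it.
ALTER TABLE member_demo ADD COLUMN email text;
ALTER TABLE member_demo ADD CONSTRAINT member_email_unique UNIQUE (email);

-- Change the type and properties of existing columns.
ALTER TABLE member_demo ALTER COLUMN age TYPE integer;
ALTER TABLE member_demo ALTER COLUMN age SET DEFAULT 18;
ALTER TABLE member_demo ALTER COLUMN name SET NOT NULL;

-- Remove the constraint and the column again.
ALTER TABLE member_demo DROP CONSTRAINT member_email_unique;
ALTER TABLE member_demo DROP COLUMN email;

-- Rename a column, then the table itself.
ALTER TABLE member_demo RENAME COLUMN name TO full_name;
ALTER TABLE member_demo RENAME TO person_demo;
```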

## Indexing

If a large table is going to be repeatedly queried in a way that involves searching a particular column or columns, the search can be made more efficient by constructing an index. By default this is a B-tree (a balanced search tree, not a binary tree) built from the data with pointers to the rows, though PostgreSQL offers other index types, e.g. hash indexes.
```sql
CREATE INDEX index_name ON tbl(col);
```
If multiple columns are supplied, this creates an index in which the key is defined by multiple values. The name is not strictly required.
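
For illustration, a multi-column index, a hash index, and an unnamed index could look like this. The table `visit_demo` and the index names are hypothetical.

```sql
CREATE TABLE visit_demo (
    id          serial PRIMARY KEY,
    user_id     integer,
    visited_on  date,
    token       text
);

-- Default B-tree index keyed on two columns.
CREATE INDEX visit_user_date_idx ON visit_demo (user_id, visited_on);

-- Hash index, selected with USING; suited to equality lookups.
CREATE INDEX visit_token_hash_idx ON visit_demo USING hash (token);

-- No name supplied: PostgreSQL generates one.
CREATE INDEX ON visit_demo (visited_on);
```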

Delete an index with
```sql
DROP INDEX index_name;
```

## Deletion

```sql
DROP TABLE tbl;
```

# Data

## Adding a row

```sql
INSERT INTO tbl (col1, col2, ...) VALUES (val1, val2, ...);
```
A `serial` (or identity) primary key will increment automatically, and any columns with defined defaults and no supplied value will use those defaults. Columns with neither a value nor a default are set to `NULL`. If a constraint is violated, for example a `NOT NULL` column left without a value, an error is raised.

If every column will be filled, the parenthetical clause after `tbl` can be omitted and values (or the keyword `DEFAULT`) provided in the order of definition. The phrase `DEFAULT VALUES` can be used in place of the `VALUES` clause when every column has a default.

Multiple rows can be inserted by separating parenthesized value lists with commas.
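
The variants described above, sketched against a hypothetical table `task_demo`:

```sql
CREATE TABLE task_demo (
    id     serial PRIMARY KEY,
    title  text NOT NULL DEFAULT 'untitled',
    done   boolean NOT NULL DEFAULT FALSE
);

-- Named column only; id and done use their defaults.
INSERT INTO task_demo (title) VALUES ('write report');

-- No column list: values follow the order of definition.
INSERT INTO task_demo VALUES (DEFAULT, 'file taxes', DEFAULT);

-- Every column has a default, so no values are needed at all.
INSERT INTO task_demo DEFAULT VALUES;

-- Several rows in one statement.
INSERT INTO task_demo (title, done)
VALUES ('buy milk', TRUE),
       ('call bank', FALSE);
```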

Instead of `VALUES`, it is also possible to use a query:
```sql
INSERT INTO tbl (col1, col2)
    SELECT col1, col2 FROM tbl2 WHERE ...;
```

## Reading data from a file

PostgreSQL, unlike SQLite, does not have a function to parse a `csv` file into a table directly. Instead, the schema of a table needs to be defined first and then data is copied from the file. If the `csv` file contains all of the columns in the table, in the same order, with a header:
```sql
COPY tbl
FROM 'file.csv'
WITH (
    FORMAT csv,      -- parse the file as CSV
    DELIMITER ',',
    HEADER true      -- skip the first line
);

As with `INSERT` statements, a subset of columns or a custom order can be defined with `tbl(col...)`. The default delimiter depends on the declared format rather than the file extension: it is a tab for the default text format and a comma for `FORMAT csv`, so `DELIMITER ','` is redundant here. `HEADER true` marks the first line as a header; in practice, this just means it will be skipped on input.
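
A sketch of loading only some columns. The table `person_demo_csv` and the path `/tmp/people.csv` are assumptions made for this example; the file is assumed to exist on the database server, be readable by it, and contain a header line followed by `name,age` rows.

```sql
CREATE TABLE person_demo_csv (
    id    serial PRIMARY KEY,
    name  text,
    age   integer
);

-- Assumed file contents:
--   name,age
--   Ada,36
--   Grace,45
-- id is not in the column list, so it is filled from its sequence.
COPY person_demo_csv (name, age)
FROM '/tmp/people.csv'
WITH (FORMAT csv, HEADER true);
```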

## Editing data points

```sql
UPDATE tbl SET col1 = val1 [WHERE ...];
```
Without a `WHERE` restriction, every row will be updated. To change one particular row, filter on the primary key.

The value expression can refer to the current value of the column (e.g. `col1 = col1 * 2`).

Updates to different columns can be separated by commas.
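
For example, with a hypothetical table `item_demo`, one statement can update several columns, refer to a column's current value, and target a single row by primary key:

```sql
CREATE TABLE item_demo (
    id       serial PRIMARY KEY,
    name     text,
    price    numeric(8, 2),
    on_sale  boolean DEFAULT FALSE
);

INSERT INTO item_demo (name, price)
VALUES ('kettle', 40.00),
       ('toaster', 25.00);

-- Only the row with id = 1 changes: 10% off, and flagged as on sale.
UPDATE item_demo
SET price = price * 0.9,
    on_sale = TRUE
WHERE id = 1;

-- No WHERE clause: every row is affected.
UPDATE item_demo SET name = upper(name);
```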

## Deleting rows

```sql
DELETE FROM tbl WHERE ...;
```

## Outputting results from data updating

`INSERT`, `UPDATE`, and `DELETE` commands all allow for a final `RETURNING` clause. This operates as a `SELECT` on the rows that were inserted, updated, or deleted.
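
A short sketch of `RETURNING` on each command, using a hypothetical table `note_demo`:

```sql
CREATE TABLE note_demo (
    id     serial PRIMARY KEY,
    body   text,
    edits  integer DEFAULT 0
);

-- Returns the generated key of the new row.
INSERT INTO note_demo (body)
VALUES ('first draft')
RETURNING id;

-- Returns the row as it looks after the update.
UPDATE note_demo
SET body = 'second draft',
    edits = edits + 1
WHERE id = 1
RETURNING id, body, edits;

-- Returns every column of each deleted row.
DELETE FROM note_demo
WHERE id = 1
RETURNING *;
```

Returning the generated `id` from an `INSERT` avoids a separate query to look up the key of the row just created.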