## Number Types
### Integer
<div style="display: inline-block">
    
| Canonical Type (Postgres) | Postgres Aliases     | MySQL Equivalent(s)        | SQLite Equivalent | Storage Size (typical) | Range (signed) |
|----------------------------|----------------------|----------------------------|-------------------|-------------------------|----------------|
| `smallint`                 | `int2`              | `SMALLINT`                 | `INTEGER` (dynamic) | 2 bytes                 | -32,768 → 32,767 |
| `integer`                  | `int`, `int4`       | `INT`, `INTEGER`           | `INTEGER` (dynamic) | 4 bytes                 | -2,147,483,648 → 2,147,483,647 |
| `bigint`                   | `int8`              | `BIGINT`                   | `INTEGER` (dynamic, up to 8 bytes) | 8 bytes | -9,223,372,036,854,775,808 → 9,223,372,036,854,775,807 |

What happens when we insert a value that is out of range for an integer type?

In [1]:
# %%
%load_ext sql

# %%
%sql postgresql://postgres:root@localhost:5432/dvdrental

In [8]:
%config SqlMagic.style = '_DEPRECATED_DEFAULT'

In [5]:
%%sql

DROP TABLE IF EXISTS integer_example;
CREATE TABLE integer_example (
  id int2,
  salary int
);

INSERT INTO integer_example (id, salary) VALUES (
  210000, 56422  
);

 * postgresql://postgres:***@localhost:5432/dvdrental
Done.
Done.
(psycopg2.errors.NumericValueOutOfRange) smallint out of range

[SQL: INSERT INTO integer_example (id, salary) VALUES (
  210000, 56422  
);]
(Background on this error at: https://sqlalche.me/e/20/9h9h)


We get an error indicating that the value is out of range for `smallint`.
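The ranges in the table follow from the storage size: a signed integer stored in n bytes spans -2^(8n-1) to 2^(8n-1)-1. As a minimal Python sketch (the helper `signed_range` is a made-up name for illustration, not a Postgres function), the same limits can be recomputed and the failing value checked against them:

```python
def signed_range(num_bytes: int) -> tuple[int, int]:
    """Return (min, max) of a two's-complement signed integer of the given width."""
    bits = 8 * num_bytes
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


# Recompute the limits listed in the table above.
for name, size in [("smallint", 2), ("integer", 4), ("bigint", 8)]:
    low, high = signed_range(size)
    print(f"{name:<8} {low:,} to {high:,}")

# The id value from the failing INSERT, checked against the smallint range.
low, high = signed_range(2)
print(low <= 210000 <= high)  # False
```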

### Numeric
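`NUMERIC` can store numbers with a very large number of digits, including a fractional part, and arithmetic on it (addition, subtraction, multiplication) returns exact results. The trade-off is that operations on this data type are slow compared to the integer and floating-point types.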
<div style="display: inline-block">

| Canonical Type (Postgres) | Postgres Aliases             | MySQL Equivalent(s)         | SQLite Equivalent         | Storage Size (typical) | Precision & Scale |
|----------------------------|------------------------------|-----------------------------|---------------------------|------------------------|------------------|
| `numeric(p, s)`            | `decimal(p, s)`              | `DECIMAL(p, s)`, `NUMERIC` | `NUMERIC`, `DECIMAL`*     | Variable (depends on digits) | Up to 131072 digits before the decimal point, up to 16383 after |
| `numeric` (no precision)   | same (`decimal`)             | `DECIMAL` (default p=10, s=0 if omitted) | `NUMERIC` (no strict enforcement) | Variable | Arbitrary precision in Postgres; in MySQL defaults to 10,0; in SQLite flexible (values stored as TEXT, REAL, or INTEGER) |

Precision and scale are defined as:
- **Precision (p):** the total number of significant digits a number can have (both left and right of the decimal point).
- **Scale (s):** the number of digits allowed to the right of the decimal point.

Examples:
- `NUMERIC(5,2)`
  - Fits exactly: 123.45, 99.9, 0.01
  - Does not fit: 1234.56 (more than 3 digits before the decimal point), 123.456 (more than 2 digits after it)
- `NUMERIC(10,0)`
  - Fits exactly: 9999999999
  - Does not fit: 1.5 (has a fractional part)

What happens when we insert numbers that violate precision and scale constraints?

In [None]:
%%sql

DROP TABLE IF EXISTS numeric_example;
CREATE TABLE numeric_example (
    val DECIMAL(5,2)
);

INSERT INTO numeric_example (val) VALUES
(123.456),  -- # rounded to 123.46
(1234.56);  -- # error, postgres does not truncate digits to the left of the decimal point
            -- # number is greater than the maximum allowed value of 999.99
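The behavior described in the comments above (extra fractional digits are rounded, extra integer digits are rejected) can be imitated in Python with the `decimal` module. This is only a sketch of the rule, not how Postgres implements it; `coerce_numeric` is a hypothetical helper, and round-half-up is assumed as the rounding mode.

```python
from decimal import Decimal, ROUND_HALF_UP


def coerce_numeric(value: str, precision: int, scale: int) -> Decimal:
    """Round to `scale` fractional digits, then reject values with too many integer digits."""
    quantum = Decimal(1).scaleb(-scale)  # scale=2 gives a quantum of 0.01
    rounded = Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)
    limit = Decimal(10) ** (precision - scale)  # (5, 2) gives 1000
    if abs(rounded) >= limit:
        raise OverflowError(f"{value} does not fit in NUMERIC({precision},{scale})")
    return rounded


print(coerce_numeric("123.456", 5, 2))  # 123.46

try:
    coerce_numeric("1234.56", 5, 2)
except OverflowError as exc:
    print(exc)
```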

The `NUMERIC` type can also store the special values `Infinity`, `-Infinity` and `NaN`.

In [9]:
%%sql

SELECT 'Infinity'::NUMERIC - 'Infinity'::NUMERIC; -- # results in NaN
                                                  -- # 'Infinity' can be shortened to 'inf'

 * postgresql://postgres:***@localhost:5432/dvdrental
1 rows affected.


?column?
""


### Floating Point
Floating-point types are variable-precision, inexact numeric types: some values cannot be stored exactly, so results can carry small rounding errors. Operations on them are much faster than on `NUMERIC`, at the expense of accuracy. Comparing two floating-point values for equality might not always work as expected.

<div style="display: inline-block">

| Canonical Type (Postgres) | Postgres Aliases | MySQL Equivalent                                 | SQLite Equivalent           |
|---------------------------|------------------|--------------------------------------------------|-----------------------------|
| REAL                      | FLOAT4           | FLOAT (≤ 24 precision bits)                      | REAL (always 8-byte double) |
| DOUBLE PRECISION          | FLOAT8           | DOUBLE, DOUBLE PRECISION, REAL (alias to DOUBLE) | REAL (always 8-byte double) |

These types also accept `Infinity`, `-Infinity` and `NaN`.
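The equality pitfall is not specific to Postgres. Python's `float` is also a binary double-precision number, so a short sketch in Python shows the same effect, with `decimal.Decimal` standing in for an exact type like `NUMERIC`:

```python
import math
from decimal import Decimal

# 0.1 and 0.2 have no exact binary representation, so the sum drifts slightly.
float_sum = 0.1 + 0.2
print(float_sum == 0.3)              # False
print(math.isclose(float_sum, 0.3))  # True: compare with a tolerance instead

# Decimal arithmetic is exact for these values.
decimal_sum = Decimal("0.1") + Decimal("0.2")
print(decimal_sum == Decimal("0.3"))  # True
```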

### Serial
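`SERIAL` is not a real data type but shorthand for an integer column whose default value comes from a sequence. It is often used for auto-incrementing primary keys, though Postgres now offers a better way to achieve the same thing: identity columns (`GENERATED ... AS IDENTITY`).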

In [None]:
%%sql

CREATE TABLE serial_example (
    id SERIAL
);

-- # equivalent to the following (shown for comparison, run one form or the other):
CREATE SEQUENCE serial_example_id_seq AS INTEGER;
CREATE TABLE serial_example (
    id INTEGER NOT NULL DEFAULT NEXTVAL('serial_example_id_seq')
);
ALTER SEQUENCE serial_example_id_seq OWNED BY serial_example.id;
-- # OWNED BY drops the sequence when the id column or the table is dropped

There is also a `BIGSERIAL` data type, which uses `BIGINT` for the sequence. Note that the autogenerated numbers may contain gaps when using these data types. This is because a `SEQUENCE` is not transaction-aware: a rolled-back transaction can consume one or more numbers from the sequence without those numbers ever appearing in the table.
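How a rollback leaves a gap can be shown with a toy model in Python. Nothing here talks to Postgres: the `Sequence` class and `insert_row` function are invented for illustration, and the `commit` flag simply stands in for whether the transaction commits or rolls back.

```python
class Sequence:
    """A counter that only moves forward, like a database sequence."""

    def __init__(self) -> None:
        self.last_value = 0

    def nextval(self) -> int:
        self.last_value += 1
        return self.last_value


def insert_row(table: list[int], seq: Sequence, commit: bool) -> None:
    """Draw an id from the sequence; keep the row only if the transaction commits."""
    new_id = seq.nextval()  # consumed immediately and never handed back
    if commit:
        table.append(new_id)


seq = Sequence()
table: list[int] = []
insert_row(table, seq, commit=True)
insert_row(table, seq, commit=False)  # rolled back, but id 2 is already used up
insert_row(table, seq, commit=True)
print(table)  # [1, 3]
```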

To get the autogenerated sequence name, use:

In [36]:
%%sql

SELECT PG_GET_SERIAL_SEQUENCE('serial_example', 'id');

 * postgresql://postgres:***@localhost:5432/dvdrental
1 rows affected.


pg_get_serial_sequence
public.serial_example_id_seq


## Character Types
### Char
Defined to be exactly $n$ characters long. When a value has fewer characters than the specified length, this type pads it with trailing spaces. NOT RECOMMENDED.

<div style="display: inline-block">

| Postgres Type           | Aliases     | MySQL Equivalent                | SQLite Equivalent | Notes                                                                 |
|--------------------------|------------|---------------------------------|-------------------|-----------------------------------------------------------------------|
| CHAR(n)                 | CHARACTER(n) | CHAR(n)                        | TEXT              | Fixed-length string, space-padded.                                   |

In [None]:
%%sql

DROP TABLE IF EXISTS char_example;
CREATE TABLE char_example (
	name CHAR(5)
);

INSERT INTO char_example (name) VALUES ('a'); -- # stored as 'a    '

### Varchar
Defined to be at most $n$ characters long. No padding is added in this case.

<div style="display: inline-block">

| Postgres Type | Aliases              | MySQL Equivalent | SQLite Equivalent | Notes                                     |
|---------------|----------------------|------------------|-------------------|------------------------------------------ |
| VARCHAR(n)    | CHARACTER VARYING(n) | VARCHAR(n)       | TEXT              | Variable-length string with length limit. |

In [10]:
%%sql

DROP TABLE IF EXISTS varchar_example;
CREATE TABLE varchar_example (name VARCHAR(5));

INSERT INTO varchar_example (name) VALUES ('HuckleBerry Finn'); -- # Error since length limit exceeded

 * postgresql://postgres:***@localhost:5432/dvdrental
Done.
Done.
(psycopg2.errors.StringDataRightTruncation) value too long for type character varying(5)

[SQL: INSERT INTO varchar_example (name) VALUES ('HuckleBerry Finn'); -- # Error]
(Background on this error at: https://sqlalche.me/e/20/9h9h)


### Text

<div style="display: inline-block">

| Postgres Type | Aliases | MySQL Equivalent            | SQLite Equivalent | Notes                             |
|---------------|---------|-----------------------------|-------------------|-----------------------------------|
| TEXT          | –       | TEXT (same as long VARCHAR) | TEXT              | Unbounded variable-length string. |  

Note that there is no performance benefit to using `CHAR` or `VARCHAR` over `TEXT`. In fact, `CHAR` often takes up more space because of the padding. Thus, it is better not to use a fixed-width type to enforce length constraints.

## Binary Type
The `BYTEA` data type stores binary data. It is similar to the `BLOB` type in other databases.

In [None]:
%%sql

DROP TABLE IF EXISTS bytea_example;
CREATE TABLE bytea_example (
    filename TEXT,
    data bytea
);

INSERT INTO bytea_example VALUES ('readme.txt', '\xaf24e5'); -- # Hex string starts with \x

Storing large file data in this field is not recommended; a dedicated file store like S3 is better suited for that purpose. One use case for this data type is storing file hashes:

In [13]:
%%sql

SELECT MD5('Some random text'); -- # this returns type text, use DECODE('hex str', 'hex') to convert to bytea
SELECT SHA256('Some random text'); -- # this returns type bytea

 * postgresql://postgres:***@localhost:5432/dvdrental
1 rows affected.
1 rows affected.


sha256
<memory at 0x0000026F6EAAAE00>
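The difference between a hex string and raw bytes can be seen with Python's `hashlib`. This is a sketch of the general idea, not of the Postgres functions themselves: `hexdigest()` plays the role of the text result, and `bytes.fromhex` does the kind of conversion that `DECODE(..., 'hex')` is described as doing above.

```python
import hashlib

data = b"Some random text"

md5_hex = hashlib.md5(data).hexdigest()     # text: 32 hexadecimal characters
md5_raw = bytes.fromhex(md5_hex)            # the same hash as 16 raw bytes
sha256_raw = hashlib.sha256(data).digest()  # raw bytes: 32 of them

print(len(md5_hex), len(md5_raw), len(sha256_raw))  # 32 16 32
```

Stored as raw bytes, a hash takes half the space of its hexadecimal text form.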


## UUID
Represents a universally unique identifier. Databases like MySQL and SQLite have no direct equivalent. UUIDs are 128 bits long and are written in the standard UUID string format. So why use the `UUID` data type instead of storing the same value as `TEXT`? Because `UUID` stores it compactly in 128 bits (16 bytes), whereas `TEXT` takes more space.

In [14]:
%%sql

DROP TABLE IF EXISTS uuid_example;
CREATE TABLE uuid_example (
  random_identifier UUID  
);

INSERT INTO uuid_example VALUES ('27c0b626-cd0e-4225-81aa-87af6082aa6b');
INSERT INTO uuid_example VALUES (GEN_RANDOM_UUID()); -- # out of the box function to generate UUID
SELECT * FROM uuid_example;

 * postgresql://postgres:***@localhost:5432/dvdrental
Done.
Done.
1 rows affected.
1 rows affected.
2 rows affected.


random_identifier
27c0b626-cd0e-4225-81aa-87af6082aa6b
958f06f7-3a42-4fb7-886a-5f822f19760e


In [16]:
%%sql

SELECT PG_COLUMN_SIZE('958f06f7-3a42-4fb7-886a-5f822f19760e') -- # 37 bytes
UNION
SELECT PG_COLUMN_SIZE('958f06f7-3a42-4fb7-886a-5f822f19760e'::UUID); -- # 16 bytes

 * postgresql://postgres:***@localhost:5432/dvdrental
2 rows affected.


pg_column_size
16
37
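The size gap comes from the representation: the text form spends one character per hex digit plus four hyphens, while the binary form holds the 128 bits directly. A small Python sketch with the standard `uuid` module, reusing the value from the query above, shows both lengths:

```python
import uuid

u = uuid.UUID("958f06f7-3a42-4fb7-886a-5f822f19760e")

print(len(str(u)))   # 36 characters in the text form
print(len(u.bytes))  # 16 bytes in the binary form
```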


## Boolean Type
Unlike MySQL, which uses `TINYINT` to represent booleans, Postgres provides a native boolean data type defined using the `BOOLEAN` keyword. It occupies 1 byte of space.

In [None]:
%%sql

DROP TABLE IF EXISTS boolean_example;
CREATE TABLE boolean_example (
  status BOOLEAN  
);


INSERT INTO boolean_example VALUES
(TRUE),
(FALSE),
('t'),     -- # true
('f'),     -- # false
('true'),  -- # true
('false'), -- # false
('1'),     -- # true
('0'),     -- # false
('on'),    -- # true
('off'),   -- # false
(NULL);    -- # unknown

In [17]:
%%sql

SELECT '1'::BOOLEAN;

 * postgresql://postgres:***@localhost:5432/dvdrental
1 rows affected.


bool
True


## Date and Time Types
Postgres provides multiple data types to store date and time information with or without time zones. You can use:
- `TIMESTAMP`: date and time information without a time zone attached to it
- `TIMESTAMPTZ`: date and time that is time zone aware. Postgres stores the value in UTC and converts it to the session time zone for display.

Comparing the two with MySQL and SQLite offerings:

<div style="display: inline-block">

| PostgreSQL   | Alias                         | MySQL       | SQLite                                 |
|--------------|-------------------------------|------------ |----------------------------------------|
| `TIMESTAMP`  | `TIMESTAMP WITHOUT TIME ZONE` | `DATETIME`  | `TEXT` (`'YYYY-MM-DD HH:MM:SS'`)       |
| `TIMESTAMPTZ`| `TIMESTAMP WITH TIME ZONE`    | `TIMESTAMP` | `INTEGER` (epoch), `REAL` (Julian day) |

**ISO 8601:** this format represents a date and time as `2025-12-24T12:45:55.122+05:30`. For the UTC time zone the offset can be shortened to `Z`, as in `2025-12-24T12:45:55.122Z`. The "T" can also be replaced with a space.

In [19]:
%%sql

DROP TABLE IF EXISTS timestamp_example;
CREATE TABLE timestamp_example (
    create_dt TIMESTAMP WITH TIME ZONE
);

INSERT INTO timestamp_example VALUES 
 	('23 March 2025'),
 	('2025-12-12T03:00:00.123+00:00'),
 	('03/04/2023');

SELECT * FROM timestamp_example;

 * postgresql://postgres:***@localhost:5432/dvdrental
3 rows affected.


create_dt
2025-03-23 00:00:00+05:30
2025-12-12 08:30:00.123000+05:30
2023-04-03 00:00:00+05:30


A few things to note here:
- All entries are displayed in ISO format, whatever format they were entered in
- All entries are converted to a single time zone (Asia/Kolkata in this case)
- The entry with explicitly specified milliseconds is shown with 6 fractional digits (up from 3)
- Postgres automatically parses a variety of input formats
- In the last entry specifically, it knows that the day is 3 and the month is 4
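The last point matters because `03/04/2023` is ambiguous on its own. A short Python sketch (using `datetime.strptime`, which needs the order spelled out explicitly) shows the two readings the same string can have:

```python
from datetime import datetime

text = "03/04/2023"

as_dmy = datetime.strptime(text, "%d/%m/%Y")  # day first
as_mdy = datetime.strptime(text, "%m/%d/%Y")  # month first

print(as_dmy.date())  # 2023-04-03
print(as_mdy.date())  # 2023-03-04
```

The stored value `2023-04-03` in the output above matches the day-first reading.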

How does this happen? If we type `SHOW DATESTYLE`, we get the following:

In [20]:
%%sql

SHOW DATESTYLE;

 * postgresql://postgres:***@localhost:5432/dvdrental
1 rows affected.


DateStyle
"ISO, DMY"


This tells Postgres to return dates and times in ISO format and, whenever the input is ambiguous between day and month, to assume that the day is written first, then the month, and then the year (DMY).  
What about the time zone?

In [21]:
%%sql

SHOW TIME ZONE;

 * postgresql://postgres:***@localhost:5432/dvdrental
1 rows affected.


TimeZone
Asia/Calcutta


This is the time zone configured on the server, so all date-time values default to it. We can change `DATESTYLE` and `TIME ZONE` for a session with:

In [None]:
%%sql

SET DATESTYLE = 'German, DMY';
SET TIME ZONE 'UTC';

SELECT * FROM timestamp_example;
-- # 22.03.2025 18:30:00 UTC
-- # 12.12.2025 03:00:00.123 UTC
-- # 02.04.2023 18:30:00 UTC

In [26]:
%%sql

SET DATESTYLE = 'ISO, DMY';
SET TIME ZONE 'Asia/Kolkata';

 * postgresql://postgres:***@localhost:5432/dvdrental
Done.
Done.


[]

`TIMESTAMP` and `TIMESTAMPTZ` optionally accept a parameter from 0 to 6 that specifies the number of fractional-second digits to keep. The default is 6.

**Unix Epoch:** Postgres provides a handy function to convert a Unix timestamp:

In [27]:
%%sql

SELECT TO_TIMESTAMP(123456); -- # TIMESTAMPTZ

 * postgresql://postgres:***@localhost:5432/dvdrental
1 rows affected.


to_timestamp
1970-01-02 15:47:36+05:30
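The result can be cross-checked in Python: a Unix timestamp counts seconds since 1970-01-01 00:00:00 UTC, and the displayed value is that instant shifted to the session time zone. In this sketch a fixed `+05:30` offset stands in for `Asia/Kolkata`, an assumption made only to keep the example independent of a time zone database.

```python
from datetime import datetime, timedelta, timezone

# Interpret the Unix timestamp as an instant in UTC.
utc_time = datetime.fromtimestamp(123456, tz=timezone.utc)

# Fixed offset used here as a stand-in for Asia/Kolkata (assumption).
ist = timezone(timedelta(hours=5, minutes=30))
local_time = utc_time.astimezone(ist)

print(utc_time.isoformat())    # 1970-01-02T10:17:36+00:00
print(local_time.isoformat())  # 1970-01-02T15:47:36+05:30
```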


**Timezone:** it is recommended to always use full time zone names like `America/New_York` or `Asia/Kolkata`. Abbreviated names like `CST` also work, but an abbreviation stands for a fixed offset and does not follow daylight-saving changes. Avoid using time zone offsets like `+05:30`. To retrieve a date and time converted to a given time zone:

In [30]:
%%sql

SELECT '2025-08-25T00:00:00'::TIMESTAMPTZ AT TIME ZONE 'America/New_York';

 * postgresql://postgres:***@localhost:5432/dvdrental
1 rows affected.


timezone
2025-08-24 14:30:00


To view all the built-in time zones:

In [None]:
%%sql

SELECT * FROM pg_timezone_names;

Postgres also has data types that just store either date or time:
- `TIME` and `TIMETZ`: time with time zone makes little sense, but Postgres has it
- `DATE`

In [31]:
%%sql

SELECT '12:55:00.192'::TIME(1); -- # keeps one fractional-second digit, so .192 rounds to .2

 * postgresql://postgres:***@localhost:5432/dvdrental
1 rows affected.


time
12:55:00.200000


Some special input strings resolve to a timestamp:

In [32]:
%%sql

SELECT 'today'::TIMESTAMPTZ
UNION
SELECT 'yesterday'::TIMESTAMPTZ
UNION
SELECT 'tomorrow'::TIMESTAMPTZ
UNION
SELECT 'epoch'::TIMESTAMPTZ;

 * postgresql://postgres:***@localhost:5432/dvdrental
4 rows affected.


timestamptz
2025-08-26 00:00:00+05:30
2025-08-27 00:00:00+05:30
2025-08-25 00:00:00+05:30
1970-01-01 05:30:00+05:30


Postgres provides a data type `INTERVAL` that can be used to store time duration:

In [33]:
%%sql

SELECT 'P1Y2M12DT4H12M14S'::INTERVAL;

 * postgresql://postgres:***@localhost:5432/dvdrental
1 rows affected.


interval
"437 days, 4:12:14"

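The `437 days` shown above does not mean the year and month parts were lost in Postgres, which keeps months, days, and time as separate fields of an interval. The figure is consistent with the Python driver converting the value to a `timedelta` by counting a year as 365 days and a month as 30 days. That conversion rule is an assumption about how this output was produced; the sketch below only reproduces the arithmetic.

```python
from datetime import timedelta

# Components of P1Y2M12DT4H12M14S.
years, months, days = 1, 2, 12

# Assumed approximation: 1 year = 365 days, 1 month = 30 days.
approx = timedelta(days=years * 365 + months * 30 + days, hours=4, minutes=12, seconds=14)

print(approx)  # 437 days, 4:12:14
```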

Output format can be changed:

In [None]:
%%sql

SET INTERVALSTYLE = 'ISO_8601'; -- # default is 'postgres'
SELECT '1 year 2 months 12 days 4 hours 12 minutes 14 seconds'::INTERVAL; -- # P1Y2M12DT4H12M14S

## JSON
Postgres has two data types to store JSON:
- `JSON`: stores the input as text, with validation to ensure that what is being stored is valid JSON
- `JSONB`: stores the data in a decomposed binary format.

In [None]:
%%sql

SELECT '{"a":    25}'::JSON,  -- # retains all whitespace as it is stored as text
       '{"a":    25}'::JSONB; -- # removes unnecessary spaces

In [None]:
%%sql

SELECT '{"a": 25, "a": 30}'::JSON,  -- # retains repeating property, stays {"a": 25,"a": 30}
       '{"a": 25, "a": 30}'::JSONB; -- # repeating property is overridden, converts to {"a": 30}

In [None]:
%%sql

SELECT '{"c": 3, "a": 1, "b": 2}'::JSON,  -- # retains order of property: c,a,b
       '{"c": 3, "a": 1, "b": 2}'::JSONB; -- # property order can change: a,b,c
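The three `JSONB` behaviors above are what one would expect from parsing the text into a data structure and writing it back out. Python's `json` module behaves in a comparable way, so it can serve as a rough analogy (it is not how Postgres implements `JSONB`, and `sort_keys=True` is used here only to imitate the reordering):

```python
import json

# Duplicate keys: parsing keeps only the last value.
deduped = json.loads('{"a": 25, "a": 30}')
print(deduped)  # {'a': 30}

# Whitespace: re-serializing the parsed value drops the original spacing.
compact = json.dumps(json.loads('{"a":    25}'))
print(compact)  # {"a": 25}

# Key order: the original order is not part of the parsed value's meaning.
ordered = json.dumps(json.loads('{"c": 3, "a": 1, "b": 2}'), sort_keys=True)
print(ordered)  # {"a": 1, "b": 2, "c": 3}
```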

`JSONB` is the recommended format because it is faster to process and supports indexing, even though it can take a little more space:

In [41]:
%%sql

SELECT PG_COLUMN_SIZE('{"c": 3, "a": 1, "b": 2}'::JSON) AS json_size,
       PG_COLUMN_SIZE('{"c": 3, "a": 1, "b": 2}'::JSONB) AS jsonb_size;

 * postgresql://postgres:***@localhost:5432/dvdrental
1 rows affected.


json_size,jsonb_size
28,60


## Full Text Search Types
Postgres has decent support for full text search through these types:
- `TSVECTOR`: a representation of text optimized for full text search (FTS). It is the indexed form of the text.
- `TSQUERY`: a structured search expression

More details are on the Full Text Search page.