## Overview of Data Types

Let us get an overview of the data types supported in Postgres.
* Here is a sample `CREATE TABLE` command for review.

```sql
CREATE TABLE users (
    user_id SERIAL PRIMARY KEY,
    user_first_name VARCHAR(30) NOT NULL,
    user_last_name VARCHAR(30) NOT NULL,
    user_email_id VARCHAR(50) NOT NULL,
    user_email_validated BOOLEAN DEFAULT FALSE,
    user_password VARCHAR(200),
    user_role VARCHAR(1) NOT NULL DEFAULT 'U', --U and A
    is_active BOOLEAN DEFAULT FALSE,
    created_dt DATE DEFAULT CURRENT_DATE,
    last_updated_ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
```

* While creating tables in RDBMS databases, we should specify data types for the columns.
  * `SERIAL` is an integer column whose values are populated from a special database object called a sequence. It is typically used for surrogate primary keys.
  * When `SERIAL` is specified, a sequence following the **table_name_serial_column_seq** naming convention is created. In our case it is `users_user_id_seq`.
  * `INT` or `INTEGER` is used to define columns with integer values. Most ids are defined as integers.
  * `FLOAT` or `DOUBLE PRECISION` can be used to define columns that store approximate decimal values (Postgres has no type named plain `DOUBLE`). For amounts that must be exact, such as price or salary, `NUMERIC` is usually the safer choice.
  * `VARCHAR` with a length is used to define variable length string columns such as name, email id etc. PostgreSQL also supports `TEXT`.
  * `CHAR` can be used to define fixed length string columns, for example single character fields such as gender which store M or F, or three character days or months.
  * `BOOLEAN` is used to store **true** and **false** values.
  * We can use `DATE` to store a date and `TIMESTAMP` to store a date together with the time of day.
* We can add columns, drop columns, modify columns by changing data types, and specify default values using the `ALTER TABLE` command.
* Let us perform the tasks below to understand data types. Drop and recreate the `users` table with the following details.
  * `user_id` - integer and populated using sequence
  * `user_first_name` - `not null` and alphanumeric or string up to 30 characters
  * `user_last_name` - `not null` and alphanumeric or string up to 30 characters
  * `user_email_id` - `not null` and alphanumeric or string up to 50 characters
  * `user_email_validated` - `true` or `false` (`BOOLEAN`)
  * `user_password` - alphanumeric up to 200 characters
  * `user_role` - single character with U or A (for now we will use `VARCHAR(1)`)
  * `is_active` - `true` or `false` (`BOOLEAN`)
  * `created_dt` - date without time. It should default to the system date.
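
Before building the table, a few of the type differences described above can be checked with plain `SELECT` statements. This is a minimal sketch that does not touch any table; the literal values and column aliases are illustrative assumptions, not part of the exercise.

```sql
-- Binary floating point is approximate, NUMERIC is exact
SELECT 0.1::DOUBLE PRECISION + 0.2::DOUBLE PRECISION = 0.3::DOUBLE PRECISION AS float_sum_matches,
    0.1::NUMERIC + 0.2::NUMERIC = 0.3::NUMERIC AS numeric_sum_matches;

-- DATE holds only the calendar date, TIMESTAMP holds date and time of day
SELECT CURRENT_DATE AS date_only,
    CURRENT_TIMESTAMP::TIMESTAMP AS date_and_time;

-- An explicit cast to VARCHAR(n) truncates the value to n characters
SELECT 'itversity'::VARCHAR(2) AS truncated_by_cast;
```

The first query is expected to return false for the floating point comparison and true for the `NUMERIC` one, which is why exact types tend to be preferred for money.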

In [1]:
%load_ext sql

In [2]:
%env DATABASE_URL=postgresql://itversity_sms_user:itversity@localhost:5432/itversity_sms_db

env: DATABASE_URL=postgresql://itversity_sms_user:itversity@localhost:5432/itversity_sms_db


In [3]:
%sql DROP TABLE IF EXISTS users CASCADE

Done.


[]

In [4]:
%%sql

CREATE TABLE users (
  user_id SERIAL,
  user_first_name VARCHAR(30) NOT NULL,
  user_last_name VARCHAR(30) NOT NULL,
  user_email_id VARCHAR(50) NOT NULL,
  user_email_validated BOOLEAN,
  user_password VARCHAR(200),
  user_role VARCHAR(1),
  is_active BOOLEAN,
  created_dt DATE DEFAULT CURRENT_DATE
)

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]
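
`user_id SERIAL` in the statement above is shorthand. A rough hand-written equivalent is sketched below to show where the sequence comes from. The table name `users_manual` and its sequence are hypothetical names, used only to avoid clashing with `users`.

```sql
-- Hypothetical objects used only for illustration
DROP TABLE IF EXISTS users_manual;
DROP SEQUENCE IF EXISTS users_manual_user_id_seq;

-- The sequence that SERIAL would otherwise create implicitly
CREATE SEQUENCE users_manual_user_id_seq;

CREATE TABLE users_manual (
    user_id INTEGER NOT NULL DEFAULT nextval('users_manual_user_id_seq'),
    user_first_name VARCHAR(30) NOT NULL
);

-- Tie the sequence to the column so that dropping the table drops the sequence too
ALTER SEQUENCE users_manual_user_id_seq OWNED BY users_manual.user_id;

-- user_id is omitted, so it is filled from the sequence
INSERT INTO users_manual (user_first_name) VALUES ('Scott'), ('Donald');

SELECT user_id, user_first_name FROM users_manual ORDER BY user_id;

-- Clean up
DROP TABLE users_manual;
```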

In [5]:
%%sql 

SELECT table_catalog, 
    table_name,
    column_name,
    data_type,
    character_maximum_length,
    column_default,
    is_nullable,
    ordinal_position
FROM information_schema.columns 
WHERE table_catalog = 'itversity_sms_db'
    AND table_schema = 'public'
    AND table_name = 'users'
ORDER BY ordinal_position

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
9 rows affected.


| table_catalog | table_name | column_name | data_type | character_maximum_length | column_default | is_nullable | ordinal_position |
|---|---|---|---|---|---|---|---|
| itversity_sms_db | users | user_id | integer | | nextval('users_user_id_seq'::regclass) | NO | 1 |
| itversity_sms_db | users | user_first_name | character varying | 30 | | NO | 2 |
| itversity_sms_db | users | user_last_name | character varying | 30 | | NO | 3 |
| itversity_sms_db | users | user_email_id | character varying | 50 | | NO | 4 |
| itversity_sms_db | users | user_email_validated | boolean | | | YES | 5 |
| itversity_sms_db | users | user_password | character varying | 200 | | YES | 6 |
| itversity_sms_db | users | user_role | character varying | 1 | | YES | 7 |
| itversity_sms_db | users | is_active | boolean | | | YES | 8 |
| itversity_sms_db | users | created_dt | date | | CURRENT_DATE | YES | 9 |
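
The metadata shows that `user_email_validated`, `user_role` and `is_active` have no defaults, and the table lacks the `last_updated_ts` column from the sample command at the start. The sketch below uses `ALTER TABLE` to close that gap. It assumes the `users` table created above still exists and has no rows yet (`SET NOT NULL` is rejected if existing rows contain nulls); the widened length of 100 is an arbitrary illustrative value.

```sql
-- Specify default values on existing columns
ALTER TABLE users
    ALTER COLUMN user_email_validated SET DEFAULT FALSE,
    ALTER COLUMN is_active SET DEFAULT FALSE;

ALTER TABLE users ALTER COLUMN user_role SET DEFAULT 'U';
ALTER TABLE users ALTER COLUMN user_role SET NOT NULL;

-- Add a column
ALTER TABLE users
    ADD COLUMN last_updated_ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP;

-- Add the primary key that the exercise version of the table left out
ALTER TABLE users ADD PRIMARY KEY (user_id);

-- Modify a column by changing its data type (here, a wider VARCHAR)
ALTER TABLE users ALTER COLUMN user_email_id TYPE VARCHAR(100);

-- Drop a column
ALTER TABLE users DROP COLUMN last_updated_ts;
```

Rerunning the `information_schema.columns` query afterwards is a simple way to confirm each change.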


Common best practices for creating tables and columns:
* Make sure the table names and column names are meaningful. It is a good practice to define column names using letters. In some cases we can consider 1 or 2 digits (e.g. `phone_number1`, `phone_number2`).
* Separate the words in column names using `_`. This is called snake case. Some organizations follow camel case (`user_id` is an example of snake case and `userId` is an example of camel case).
* Snake case is a popular choice when it comes to column names.
* Use the right data types based on the usage of the columns.
* The best practices are typically enforced by Data Architects and are reflected in the Data Model diagram. Engineers typically refer to the Data Model diagram to come up with the relevant `CREATE TABLE` statements.

In [6]:
%sql DROP TABLE IF EXISTS xyz

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [7]:
%%sql

-- syntactically right, but not meaningful
-- Make sure both the table names and column names are meaningful
CREATE TABLE xyz (
    i INT,
    n VARCHAR
)

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [8]:
%sql DROP TABLE IF EXISTS users CASCADE

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [9]:
%%sql

-- Syntactically right
-- But not meaningful data types
-- Table will be created successfully, but names cannot be stored in INT columns
CREATE TABLE users (
    user_id TEXT,
    user_first_name INT,
    user_last_name INT
)

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]
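
To see why these data types are a poor fit, the sketch below tries to store a name in the table just created. It assumes the three-column `users` table from the previous cell; the sample values are made up for illustration.

```sql
-- Accepted, because the values match the declared types,
-- but the row does not describe a real user
INSERT INTO users (user_id, user_first_name, user_last_name)
VALUES ('abc', 1, 2);

-- Expected to fail: a name is not valid input for an INT column
INSERT INTO users (user_id, user_first_name, user_last_name)
VALUES ('1', 'Scott', 'Tiger');
```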

Common syntax rules for column details:
* Each column should have an associated data type.
* Column names should start with `_` or a letter (a to z).
* Column names typically contain letters, numbers or underscores (`_`). We can use other characters such as white spaces, however we then need to enclose the names in double quotes. For example `"user id"`.

In [10]:
%sql DROP TABLE IF EXISTS users CASCADE

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [11]:
%%sql

-- Syntactically incorrect (as the name start with number)
-- Fail to create table
CREATE TABLE users (
    9user_id INT PRIMARY KEY,
    user_first_name VARCHAR,
    user_last_name VARCHAR
)

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
(psycopg2.errors.SyntaxError) syntax error at or near "9"
LINE 4:     9user_id INT PRIMARY KEY,
            ^

[SQL: -- Syntactically incorrect (as the name start with number)
-- Fail to create table
CREATE TABLE users (
    9user_id INT PRIMARY KEY,
    user_first_name VARCHAR,
    user_last_name VARCHAR
)]
(Background on this error at: https://sqlalche.me/e/14/f405)


In [13]:
%%sql

-- A name with a non alphanumeric character is invalid when left unquoted
-- Enclosing it in double quotes, as in "user?id" or "user id", makes it valid
-- The table is created successfully
CREATE TABLE users (
    "user?id" INT PRIMARY KEY,
    user_first_name VARCHAR,
    user_last_name VARCHAR
)

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]
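
Once a column is created with a quoted name, every later reference needs the same quotes. A short sketch, assuming the `users` table from the last cell exists; the inserted values are illustrative.

```sql
INSERT INTO users ("user?id", user_first_name, user_last_name)
VALUES (1, 'Scott', 'Tiger');

-- The quoted name must be repeated exactly, including its case
SELECT "user?id", user_first_name
FROM users
WHERE "user?id" = 1;

-- Unquoted names are folded to lower case, so both spellings refer to the same column
SELECT USER_FIRST_NAME, user_first_name
FROM users;
```

This extra quoting burden is one reason plain snake case names are usually preferred.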