# Getting Started

This section covers different ways to get started with Postgres.
* Connecting to Database
* Using psql
* Setup SQL Workbench
* SQL Workbench and Postgres
* SQL Workbench Features
* Setup Postgres using Docker
* Data Loading Utilities

## Connecting to Database

We will use a JupyterHub-based environment to master PostgreSQL. Here are the steps to get started in the JupyterHub environment.
* We will use the Python kernel with the `sql` magic command, so we first need to load the `sql` extension.
* Create the environment variable `DATABASE_URL` using the SQLAlchemy URL format.
* Write a simple query against an information schema table to validate database connectivity.

```python
%load_ext sql
```

```python
%env DATABASE_URL=postgresql://training:itversity!23@localhost:5432/training_sms
```

```python
%sql SELECT * FROM information_schema.tables LIMIT 10
```
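
The `DATABASE_URL` value follows the pattern `postgresql://user:password@host:port/database`. As a minimal sketch, the URL can also be assembled in Python, which makes it easier to percent-encode credentials that contain characters such as `@` or `/`. The helper name `build_database_url` is hypothetical and not part of any library; the connection values are the ones used above.

```python
import os
from urllib.parse import quote_plus


def build_database_url(user, password, host, port, database):
    """Return a SQLAlchemy-style PostgreSQL URL with percent-encoded credentials."""
    return (
        f"postgresql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Values taken from the environment variable set above.
url = build_database_url("training", "itversity!23", "localhost", 5432, "training_sms")

# Setting the variable from Python has the same effect as the %env magic.
os.environ["DATABASE_URL"] = url
print(url)
```

Encoding matters most when a password contains `@`, `:` or `/`, because those characters would otherwise be read as URL delimiters.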

## Using psql

Let us understand how to use the `psql` utility to perform database operations.
* At a minimum, the Postgres client must be installed on the machine from which you want to use `psql` to connect to the Postgres server.
* If you are on the server where the **Postgres Database Server** is installed, `psql` will be available automatically.
* We can run `sudo -u postgres psql -U postgres` on the server, provided we have sudo permissions there. Otherwise, we need to use `psql -U postgres -W`, which prompts for the password.
* **postgres** is the superuser of the Postgres server, so developers typically do not have access to it in non-development environments.
* As developers, we can use the following command to connect to a database on the Postgres server with our own user credentials.

```shell
psql -U sms_user -h <host_ip_or_dns_alias> -d <db_name> -W
```
* We typically use `psql` to troubleshoot issues on non-development servers. IDEs such as **SQL Workbench** might be better for regular usage as part of the development and unit testing process.
* For this course, we will primarily use the Jupyter-based environment for practice.
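
The connection command above can also be assembled from a script. The following is a minimal sketch; `build_psql_command` is a hypothetical helper, and the host `localhost` and database `sms_db` are placeholder values rather than values from this course.

```python
import shlex


def build_psql_command(user, host, db_name, port=5432):
    """Return the psql argument list for a connection that prompts for a password."""
    return ["psql", "-U", user, "-h", host, "-p", str(port), "-d", db_name, "-W"]


# Placeholder host and database name; replace them with real values.
cmd = build_psql_command("sms_user", "localhost", "sms_db")

# Print the command as it would be typed in a shell.
print(shlex.join(cmd))
```

The same list can be passed to `subprocess.run(cmd)` to launch `psql` from Python on a machine where the Postgres client is installed.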

## Setup SQL Workbench

Let us understand how to set up and use SQL Workbench.

**Why SQL Workbench**

Let us see why we might want to use SQL Workbench.
* Database CLIs such as `psql` for Postgres or `mysql` for MySQL can be cumbersome for those who are not comfortable with command line interfaces.
* Database IDEs such as SQL Workbench provide the features required to run queries against databases without worrying too much about the underlying data dictionaries.
* SQL Workbench provides the features required to review databases and objects without writing queries or running database-specific commands.
* Database IDEs also provide capabilities to preserve the scripts we develop.
> **In short, Database IDEs such as SQL Workbench improve productivity.**

**Alternative IDEs**

There are several other IDEs in the market, including the following.
* TOAD
* SQL Developer for Oracle
* MySQL Workbench
* and many others

**Install Workbench**

Here are the instructions to set up SQL Workbench.
* Download SQL Workbench (typically a zip file).
* Unzip and launch.

Once it is installed, we need to perform the steps below, which are covered in detail in the next topic.
* Download the JDBC driver for the database we would like to connect to.
* Get the database connectivity information and connect to the database.

## SQL Workbench and Postgres

Let us connect to a Postgres database using SQL Workbench.

* We are connecting to a Postgres database running in a Docker container on an Ubuntu 18.04 VM provisioned on GCP.
* We have published the Postgres database port as port 5433 on the Ubuntu 18.04 VM.
* We typically use ODBC or JDBC to connect to a database from remote machines (such as our PC).
* Here are the prerequisites to connect to a database on GCP.
  * Make sure port 5433 is open in the firewall.
  * If telnet is configured on the system where SQL Workbench is installed, validate connectivity by running the telnet command with the IP or DNS alias and port number 5433.
  * Ensure that you have downloaded the right JDBC driver for Postgres.
  * Make sure you have the right credentials (username and password).
  * Ensure that a database has been created on which the user has permissions.
* You can validate the credentials and permissions by installing the Postgres client on the Ubuntu 18.04 VM and then connecting to the database with those credentials.
* Once you have all the required information along with the JDBC jar, save it as part of the profile. You can validate the details before saving by using the **Test** option.
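
If telnet is not available, the port check can be approximated with a few lines of Python. The sketch below also shows the general shape of a PostgreSQL JDBC URL, `jdbc:postgresql://host:port/database`, which is what the connection profile expects. The helper names, the host `localhost`, and the database name `sms_db` are assumptions for illustration only.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def build_jdbc_url(host, port, database):
    """Return a JDBC URL in the form used by the PostgreSQL JDBC driver."""
    return f"jdbc:postgresql://{host}:{port}/{database}"


if __name__ == "__main__":
    host = "localhost"  # assumption: replace with the VM's IP or DNS alias
    print(is_port_open(host, 5433))
    print(build_jdbc_url(host, 5433, "sms_db"))  # "sms_db" is a placeholder
```

A result of `False` usually points to a firewall rule, an unpublished port, or a stopped container rather than a credentials problem.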

## SQL Workbench Features

Here are some of the key SQL Workbench features you should be familiar with.
* Save profiles to connect to multiple databases.
* Develop SQL files and preserve them for future usage.
* Access the data dictionary or information schema to validate tables, columns, sequences, indexes, constraints, etc.
* Generate scripts from existing data.
* Manage database objects without writing any commands. We can drop tables, indexes, sequences, etc. by right-clicking them and choosing the drop option.

Almost all leading IDEs provide these features in a similar fashion.

**Usage Scenarios**

Here are **some of the usage scenarios** for database IDEs such as SQL Workbench as part of day-to-day responsibilities.
* Developers generate and validate data as part of unit testing.
* Testers validate data for their test cases.
* Business Analysts and Data Analysts run ad hoc queries to understand the data better.
* Developers troubleshoot data-related production issues using read-only accounts.

## Setup Postgres using Docker

In some cases you might want to have Postgres set up on your own machine. Let us understand how to set up Postgres using Docker.

* If you are using Windows or Mac, ensure that you have installed Docker Desktop.
* If you are using an Ubuntu-based desktop, make sure Docker is set up.
* Here are the steps to set up a Postgres database using Docker.
  * Pull the postgres image using `docker pull`.
  * Create the container using `docker create`.
  * Start the container using `docker start`.
  * Alternatively, we can use `docker run`, which pulls the image if needed, then creates and starts the container.
  * Use `docker logs` or `docker logs -f` to review the logs and ensure the Postgres server is up and running.

```shell
docker pull postgres

docker container create \
  --name itv_pg \
  -p 5433:5432 \
  -h itv_pg \
  -e POSTGRES_PASSWORD=itversity \
  postgres

docker start itv_pg

docker logs itv_pg
```
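
As a rough sketch, the `docker run` alternative mentioned in the steps can be assembled with the same options as the `docker container create` command above, and the log check can be automated by looking for the readiness line. The helper names are hypothetical, and the exact wording of the log message is an assumption based on what the Postgres server normally prints on startup.

```python
import shlex

# Assumption: the line Postgres logs once it accepts connections.
READY_MESSAGE = "database system is ready to accept connections"


def build_docker_run_command(name, host_port, password, image="postgres"):
    """Return a docker run argument list mirroring the create command above."""
    return [
        "docker", "run", "-d",  # -d keeps the container running in the background
        "--name", name,
        "-p", f"{host_port}:5432",
        "-h", name,
        "-e", f"POSTGRES_PASSWORD={password}",
        image,
    ]


def is_postgres_ready(log_text):
    """Return True if the readiness message appears in the given log text."""
    return READY_MESSAGE in log_text


print(shlex.join(build_docker_run_command("itv_pg", 5433, "itversity")))
```

The text passed to `is_postgres_ready` would be the output of `docker logs itv_pg`, for example captured with `subprocess.run(..., capture_output=True, text=True)`.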

## Data Loading Utilities

Let us understand how to load data into databases using the utilities they provide.
* Most databases provide data loading utilities.
* One of the most common ways of getting data into database tables is to use the data loading utilities provided by the underlying database technology.
* We can load delimited files into the database using these utilities.
* Here are the steps to load delimited data into a table.
  * Make sure the files are available on the server from which we are loading.
  * Ensure the database and table are created for the data to be loaded.
  * Run the relevant command to load the data into the table.
  * Validate by running queries.
* Let us see a demo by loading a sample file into a table in a Postgres database. Performance is typically better when the files are loaded directly from the server.
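
Before running a load, it can help to confirm that the delimited file is well formed. The sketch below is one possible pre-load check, not part of any Postgres utility: the hypothetical helper `check_delimited_file` reads the header and verifies that every row has the same number of fields.

```python
import csv
import io


def check_delimited_file(text, delimiter=","):
    """Return (header, row_count); raise ValueError on a row with the wrong field count."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    row_count = 0
    # Data rows start on line 2 because line 1 is the header.
    for line_number, row in enumerate(reader, start=2):
        if len(row) != len(header):
            raise ValueError(
                f"line {line_number}: expected {len(header)} fields, got {len(row)}"
            )
        row_count += 1
    return header, row_count


sample = "id,name\n1,alpha\n2,beta\n"  # hypothetical two-row file
header, row_count = check_delimited_file(sample)
print(header, row_count)
```

The returned row count can later be compared with the result of a `SELECT count(*)` on the target table as part of validation.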

### Using COPY Command
We can run the `COPY` command from `psql` to copy the data into the table.
* Make sure the database is created along with a user that has the right permissions. In addition, a user who runs `COPY ... FROM` against a file on the server needs to be granted the `pg_read_server_files` role.

```shell
docker exec -it itv_pg psql -U postgres
```

```sql
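CREATE DATABASE itversity_sms_db;
CREATE USER itversity_sms_user WITH PASSWORD 'sms_password';
GRANT ALL ON DATABASE itversity_sms_db TO itversity_sms_user;
-- PostgreSQL 15 and later no longer let every user create objects in the
-- public schema; owning the database restores that ability for this user.
ALTER DATABASE itversity_sms_db OWNER TO itversity_sms_user;
GRANT pg_read_server_files TO itversity_sms_user;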
```

* Exit and connect as the non-system user we just created.

```shell
psql -U itversity_sms_user \
  -h localhost \
  -p 5433 \
  -d itversity_sms_db \
  -W
```

* Create the `users` table.

```sql
CREATE TABLE users (
  user_id SERIAL PRIMARY KEY,
  user_first_name VARCHAR(30) NOT NULL,
  user_last_name VARCHAR(30) NOT NULL,
  user_email_id VARCHAR(50) NOT NULL,
  user_email_validated BOOLEAN DEFAULT FALSE,
  user_password VARCHAR(200),
  user_role VARCHAR(1) NOT NULL DEFAULT 'U', --U and A
  is_active BOOLEAN DEFAULT FALSE,
  created_dt DATE DEFAULT CURRENT_DATE
);
```

* Create a file with sample data. In this case, the data is added to `users.csv` under **~/sms_db**.

```text
user_first_name,user_last_name,user_email_id,user_role,created_dt
Gordan,Bradock,gbradock0@barnesandnoble.com,A,2020-01-10
Tobe,Lyness,tlyness1@paginegialle.it,U,2020-02-10
Addie,Mesias,amesias2@twitpic.com,U,2020-03-05
Corene,Kohrsen,ckohrsen3@buzzfeed.com,U,2020-04-15
Darill,Halsall,dhalsall4@intel.com,U,2020-10-10
```

* Copy the file onto the server. In this case, the server is running in a Docker container.

```shell
docker cp users.csv itv_pg:/tmp
```

* Use the `COPY` command to load the data.

```sql
COPY users(user_first_name, user_last_name, user_email_id, user_role, created_dt)
FROM '/tmp/users.csv'
DELIMITER ','
CSV HEADER;
```

* Validate by running queries.

```sql
SELECT * FROM users;
```
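
The column list in the `COPY` statement has to match the header of the file being loaded. As a minimal sketch, the statement can be generated from the CSV header so the two stay in sync. `build_copy_statement` is a hypothetical helper; it assumes trusted input, since it does not quote identifiers or escape the path.

```python
import csv
import io


def build_copy_statement(table, csv_text, server_path):
    """Build a COPY ... FROM statement whose column list comes from the CSV header."""
    header = next(csv.reader(io.StringIO(csv_text)))
    columns = ", ".join(header)
    return (
        f"COPY {table}({columns})\n"
        f"FROM '{server_path}'\n"
        "DELIMITER ','\n"
        "CSV HEADER;"
    )


# Header and first row of the users.csv sample shown earlier.
csv_text = (
    "user_first_name,user_last_name,user_email_id,user_role,created_dt\n"
    "Gordan,Bradock,gbradock0@barnesandnoble.com,A,2020-01-10\n"
)
print(build_copy_statement("users", csv_text, "/tmp/users.csv"))
```

The printed statement can be pasted into the `psql` session connected as `itversity_sms_user`.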
