# **Data Definition Language - Practice**

We will practice using pgAdmin and PostgreSQL to create a simple relational database and its tables.

In the previous lecture, we learned about the ERD model. Here we will apply it directly to build our database.


We will use a simple case that contains only two tables. You can see the ERD model below.

<img src="https://github.com/FTDS-learning-materials/phase-0/blob/main/img/erd-w2d2am.png?raw=true">

The `students` and `campus` tables are related through `students.campus_id`, which refers to `campus.id`.

## Create the Database

The first step is to create a database named `FTDS`. To create a database in PostgreSQL, run this query in pgAdmin:

```sql
CREATE DATABASE FTDS;
```
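Note that PostgreSQL folds unquoted identifiers to lowercase, so the statement above creates a database that is stored as `ftds`. As a quick check (a sketch, assuming you are connected to the same server), you could look the database up in the `pg_database` system catalog:

```sql
-- Look up the new database in the system catalog.
-- An unquoted name such as FTDS is stored in lowercase as ftds.
SELECT datname
FROM pg_database
WHERE datname = 'ftds';
```

If you really need the uppercase name, you would have to quote it (`CREATE DATABASE "FTDS";`) and then quote it every time you refer to it.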

## Create Tables

Next, you will create two tables at once. To run more than one query at a time, end each query with `;` as shown below:

```sql
-- create students table
CREATE TABLE students (
    id SERIAL PRIMARY KEY,
    name VARCHAR(50),
    age INTEGER,
    campus_id INTEGER,
    total_grade FLOAT
);

-- create campus table
CREATE TABLE campus (
    id SERIAL PRIMARY KEY,
    campus_name VARCHAR(50),
    batch VARCHAR(10),
    start_date DATE
);

```
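To confirm that both tables were created with the expected columns, you could query the standard `information_schema.columns` view. This is only a sketch: it assumes the tables were created in the default `public` schema.

```sql
-- List the columns of the two new tables in definition order.
-- Assumption: the tables live in the default "public" schema.
SELECT table_name, column_name, data_type
FROM information_schema.columns
WHERE table_schema = 'public'
  AND table_name IN ('students', 'campus')
ORDER BY table_name, ordinal_position;
```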

# **Data Manipulation Language - Practice**

## INSERT

Now you have two tables, but they are empty. The tables are expected to look like this:


**students table:**

| id |    name        | age | campus_id | total_grade |
|----|----------------|-----|-----------|-------------|
| 1  | Rafif Iman     | 20  | 1         | 85.5        |
| 2  | Hana Arisona   | 21  | 2         | 90.2        |
| 3  | Raka Purnomo   | 19  | 1         | 78.9        |
| 4  | Danu Irfansyah | 20  | 3         | 92.7        |
| 5  | Rachman Ardhi  | 22  | 2         | 88.1        |

**campus table:**

| id | campus_name | batch | start_date |
|----|-------------|-------|------------|
| 1  | Remote      | RMT-1 | 2023-01-01 |
| 2  | Jakarta     | HCK-2 | 2023-02-01 |
| 3  | BSD         | BSD-4 | 2023-03-01 |
| 4  | Surabaya    | SUB-1 | 2023-04-01 |
| 5  | Singapore   | SIN-1 | 2023-05-01 |




To insert the data, use the `INSERT` keyword. You can run both queries at once.

```sql

-- Insert data into the students table
INSERT INTO students (name, age, campus_id, total_grade)
VALUES
    ('Rafif Iman', 20, 1, 85.5),
    ('Hana Arisona', 21, 2, 90.2),
    ('Raka Purnomo', 19, 1, 78.9),
    ('Danu Irfansyah', 20, 3, 92.7),
    ('Rachman Ardhi', 22, 2, 88.1);

-- Insert data into the campus table
INSERT INTO campus (campus_name, batch, start_date)
VALUES
    ('Remote', 'RMT-1', '2023-01-01'),
    ('Jakarta', 'HCK-2', '2023-02-01'),
    ('BSD', 'BSD-4', '2023-03-01'),
    ('Surabaya', 'SUB-1', '2023-04-01'),
    ('Singapore', 'SIN-1', '2023-05-01');

```


You can view your table by running a DQL command:

```sql
SELECT * FROM students;
```
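Since `students.campus_id` points at `campus.id`, you can also view the two tables together. The query below is a minimal sketch of such a join; it uses `LEFT JOIN` so that every student is listed even if no matching campus row exists.

```sql
-- Show each student together with the campus they belong to.
SELECT students.id,
       students.name,
       campus.campus_name,
       campus.batch
FROM students
LEFT JOIN campus ON students.campus_id = campus.id
ORDER BY students.id;
```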

## UPDATE

Oh no! You made a mistake. You recorded Hana's grade (id=2) as 90.2, but the correct value is 95.3.

To change the value in that cell, run this query:

```sql
UPDATE students
SET total_grade = 95.3
WHERE id = 2;
```
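PostgreSQL also supports a `RETURNING` clause, which lets you see the changed row without running a separate `SELECT`. A small sketch of the same update with that clause:

```sql
-- Same update, but also return the modified row for a quick check.
UPDATE students
SET total_grade = 95.3
WHERE id = 2
RETURNING id, name, total_grade;
```

Always double check the `WHERE` condition: without it, `UPDATE` changes every row in the table.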

## DELETE

In Q3 2023, the Singapore campus was closed, but its record is still in the database. To remove it, use the `DELETE` keyword.

```sql
DELETE FROM campus
WHERE id = 5;
```

or

```sql
DELETE FROM campus
WHERE campus_name = 'Singapore';
```
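A committed `DELETE` cannot be undone, so it can help to preview it inside a transaction first. The following is a sketch of that pattern: the `ROLLBACK` at the end restores the row, and you would replace it with `COMMIT` once the result looks right.

```sql
-- Preview which rows a DELETE would remove, then undo it.
BEGIN;

DELETE FROM campus
WHERE campus_name = 'Singapore'
RETURNING *;   -- shows the deleted row(s)

ROLLBACK;      -- nothing is removed; use COMMIT to keep the change
```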

## EXPLAIN

`EXPLAIN` shows the execution plan of a query, that is, how PostgreSQL plans to execute the query and produce its result.

```sql
EXPLAIN SELECT students.name, students.age, campus.campus_name
FROM students
JOIN campus ON students.campus_id = campus.id
WHERE campus.campus_name = 'Jakarta';
```

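You will get output similar to the following. The plan is illustrative: it refers to an index named `students_campus_id_idx` on `students.campus_id`, and real `EXPLAIN` output also includes cost estimates for each step.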

```sql
QUERY PLAN
----------------------------------------------------------------------------------------
Nested Loop
  Join Filter: (students.campus_id = campus.id)
  -> Seq Scan on campus
        Filter: (campus_name = 'Jakarta'::text)
        Rows Removed by Filter: XXXX
  -> Bitmap Heap Scan on students
        Recheck Cond: (campus_id = campus.id)
        -> Bitmap Index Scan on students_campus_id_idx
              Index Cond: (campus_id = campus.id)
(9 rows)
```


- The QUERY PLAN section outlines the steps involved in executing the query. In this example, it involves a nested loop join between the students and campus tables.

- The Join Filter line indicates the condition used for joining the tables, which is students.campus_id = campus.id. It specifies that the campus_id of the students table should match the id of the campus table.

- The Seq Scan on campus line represents a sequential scan on the campus table, filtering for rows where the campus_name is equal to 'Jakarta'.

- The Rows Removed by Filter line indicates the number of rows that were filtered out by the condition. PostgreSQL reports this line only when the query is actually executed with `EXPLAIN ANALYZE`; the count (XXXX) is a placeholder that depends on your data.

- The Bitmap Heap Scan on students line indicates a bitmap heap scan on the students table, rechecking the condition for each row.

- The Bitmap Index Scan on students_campus_id_idx line represents an index scan on the students_campus_id_idx index, specifically on the campus_id column.
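The plan above refers to an index named `students_campus_id_idx`, which none of the earlier statements create. If you want to experiment with index scans yourself, a minimal sketch of how such an index could be created is:

```sql
-- Create an index on the join column of the students table.
-- The index name matches the one shown in the illustrative plan.
CREATE INDEX students_campus_id_idx
ON students (campus_id);

-- Refresh the planner statistics for the table.
ANALYZE students;
```

Even with the index in place, the planner may still choose a sequential scan on a table with only five rows, because reading such a small table directly is usually cheaper than going through an index.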

### EXPLAIN ANALYZE

Add the `ANALYZE` keyword when you want to know how your query actually performs, not only how it is planned. With `EXPLAIN ANALYZE`, PostgreSQL not only provides the execution plan but also actually executes the query and reports detailed runtime statistics.

Here's an example query:

```sql
EXPLAIN ANALYZE SELECT students.name, students.age, campus.campus_name
FROM students
JOIN campus ON students.campus_id = campus.id
WHERE campus.campus_name = 'Jakarta';
```

Here is an example of the possible output for this query:

```sql
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Nested Loop  (cost=0.56..8.58 rows=1 width=47) (actual time=0.045..0.050 rows=2 loops=1)
  ->  Index Scan using campus_campus_name_idx on campus  (cost=0.28..4.30 rows=1 width=24) (actual time=0.026..0.027 rows=1 loops=1)
        Index Cond: (campus_name = 'Jakarta'::text)
  ->  Index Scan using students_campus_id_idx on students  (cost=0.28..4.28 rows=1 width=31) (actual time=0.014..0.015 rows=2 loops=1)
        Index Cond: (campus_id = campus.id)
Planning Time: 0.179 ms
Execution Time: 0.113 ms
```


- The QUERY PLAN section provides the execution plan for the query.

- The Nested Loop indicates that a nested loop join is performed.

- The Index Scan using campus_campus_name_idx shows that an index scan is used to find the campus with the name 'Jakarta'.

- The Index Scan using students_campus_id_idx line indicates that an index scan is used to find the students matching the campus_id.

- The Planning Time displays the time taken for query planning.

- The Execution Time shows the actual execution time of the query.

This output provides both the execution plan and the runtime analysis of the query, giving you insights into how the query was executed and the time taken for each step.
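Because `EXPLAIN ANALYZE` really runs the statement, using it on an `INSERT`, `UPDATE`, or `DELETE` changes your data. A common way around this, sketched below with a made-up grade value, is to wrap the statement in a transaction and roll it back:

```sql
-- Measure a data-modifying statement without keeping its effect.
BEGIN;

EXPLAIN ANALYZE
UPDATE students
SET total_grade = 100   -- hypothetical value, only for the experiment
WHERE id = 2;

ROLLBACK;               -- the grade is restored to its previous value
```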

On the line after 'Nested Loop...' in the output (the index scan on campus), there is the information `(cost=0.28..4.30 rows=1 width=24)`.

In the output, `(cost=0.28..4.30 rows=1 width=24)` represents the estimated cost, number of rows, and width of the result for a specific step in the query execution plan.

> `cost=0.28..4.30` shows the estimated startup cost and total cost of the step. The cost is measured in arbitrary units of computational effort: the first number is the estimated cost before the first row can be returned, and the second is the estimated cost of returning all rows.

> `rows=1` represents the estimated number of rows that will be returned by the step.

> `width=24` represents the estimated average width, in bytes, of the rows returned by the step.

The cost values are relative and help the query planner compare different plans to choose the one with the lowest cost. The row and width estimates feed into those cost calculations.

In this case, you could filter on the campus id in the `WHERE` condition (for example `campus.id = 2`) instead of `campus_name = 'Jakarta'`, since looking up the integer primary key is usually faster and has a lower computational cost.
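To compare the two approaches yourself, you could run the same query with the filter on the id column and look at the reported plan and timings. This is a sketch that relies on the sample data, where Jakarta has `id = 2`:

```sql
-- Same join, but filtering on the integer primary key of campus
-- instead of the campus_name text column.
EXPLAIN ANALYZE
SELECT students.name, students.age, campus.campus_name
FROM students
JOIN campus ON students.campus_id = campus.id
WHERE campus.id = 2;
```

On tables this small the difference will be tiny, so treat the numbers as a way to practice reading plans rather than as a benchmark.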


# **ALTER, TRUNCATE, DROP**

## ALTER

Once again, you made a mistake. You forgot to add an email column to the students table.

To add a new column, run this query:

```sql
ALTER TABLE students
ADD COLUMN email VARCHAR(100);
```
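A newly added column contains `NULL` for every existing row. As a sketch, you could fill it in with `UPDATE`; the email address below is made up purely for illustration.

```sql
-- Existing rows start with email = NULL; set a value for one student.
-- The address is a hypothetical example, not real data.
UPDATE students
SET email = 'rafif.iman@example.com'
WHERE id = 1;

-- Check the result.
SELECT id, name, email
FROM students
ORDER BY id;
```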

Also, suppose the `campus_name` column in the campus table should be renamed to `name` to keep the column name simple. To do that, run this query:

```sql
ALTER TABLE campus
RENAME COLUMN campus_name TO name;
```
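After the rename, every query has to use the new column name; a query that still refers to `campus_name` will fail with a "column does not exist" error. For example, a lookup by campus name would now be written like this:

```sql
-- The column is now called name instead of campus_name.
SELECT id, name, batch, start_date
FROM campus
WHERE name = 'Jakarta';
```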

## TRUNCATE

Suddenly, you want to reset the campus table and make it empty. To do that, use the `TRUNCATE` keyword:

```sql
TRUNCATE TABLE campus;
```
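By default, `TRUNCATE` removes all rows but leaves the sequence behind the `SERIAL` id column where it was, so the next inserted row would continue from the old numbering. If you also want the ids to start over, a sketch of the variant to use is:

```sql
-- Empty the table and reset the id sequence,
-- so the next inserted row gets id 1 again.
TRUNCATE TABLE campus RESTART IDENTITY;
```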

## DROP

Then, you want to remove the campus table from your database because it is no longer needed. To do that, run this query:

```sql
DROP TABLE campus;
```
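Running `DROP TABLE` on a table that no longer exists raises an error. If you want a script that can be run more than once, a hedged alternative is the `IF EXISTS` form, which only emits a notice when the table is missing:

```sql
-- Drop the table only if it is still there; no error if it is not.
DROP TABLE IF EXISTS campus;
```

Unlike `TRUNCATE`, which keeps the empty table, `DROP TABLE` removes the table definition itself, so it has to be created again before it can be used.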


## ALTER and DROP Combination

Finally, you decide that you don't need the `email` column in the students table. You can remove it by running this query:

```sql
ALTER TABLE students
DROP COLUMN email;
```