# 5. SQL: database metadata, and interacting with the data and the table structure
----
## OUTLINE:
1. sqlite_master:
- Check the name of all tables
- Check the properties of a table
2. Interact with tables within the database:
- Create a copy of a table: `CREATE TABLE new_tab AS SELECT ...`
- Add new columns to a table: `ALTER TABLE tab_name ADD col datatype`
- Update the values of rows: `UPDATE tab_name SET col = val WHERE <condition>`
- Rename a table: `ALTER TABLE tab_name RENAME TO new_name`
- Find the average value for each category: `GROUP BY`
- Filter output: `HAVING` vs `WHERE`
- Comparison operators: `BETWEEN ... AND ...`
- Truncate a table: `TRUNCATE TABLE` (not available in SQLite; use `DELETE FROM`)
- Drop a table: DROP TABLE

In [1]:
%load_ext sql
# Magic command that loads the SQL extension

%sql sqlite:///testdatabase.db
# Connect to testdatabase.db; SQLite creates the file if it does not exist yet

### a. Check name of all tables within the database:
```sql
SELECT name 
FROM sqlite_master
WHERE type = 'table'
```
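- `sqlite_master` is the built-in schema table of every SQLite database. It holds one row per table, index, view, and trigger, with the columns `type`, `name`, `tbl_name`, `rootpage`, and `sql`.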
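The same lookup can be run from plain Python with the standard-library `sqlite3` module. This is a minimal sketch, assuming an in-memory database and two hypothetical tables (`demo_a`, `demo_b`) rather than the notebook's `testdatabase.db`.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind
# (assumption: the notebook itself uses testdatabase.db).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_a (id INTEGER)")          # hypothetical table
conn.execute("CREATE TABLE demo_b (id INTEGER)")          # hypothetical table
conn.execute("CREATE INDEX idx_demo_a ON demo_a (id)")    # also recorded in sqlite_master

# type = 'table' keeps the tables and filters out the index.
table_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(table_names)
conn.close()
```

Dropping the `WHERE type = 'table'` condition would also return `idx_demo_a`, since indexes live in the same schema table.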

In [2]:
%%sql
SELECT name
FROM sqlite_master
WHERE type = 'table'

 * sqlite:///testdatabase.db
Done.


name
population
Indian_female_2021
Indian_male_2021
Indian_all_2021


### b. Check the properties of a particular table within the database:
```sql
SELECT *
FROM sqlite_master
WHERE type = 'table'
AND name = 'table_name';
```

In [3]:
%%sql
SELECT *
FROM sqlite_master
WHERE type = 'table'
AND name = 'population'

 * sqlite:///testdatabase.db
Done.


type,name,tbl_name,rootpage,sql
table,population,population,2,"CREATE TABLE population (  YEAR INT NOT NULL,  ETHNIC_GROUP VARCHAR(30) NOT NULL,  GENDER VARCHAR(30) NOT NULL,  AGE VARCHAR(30) NOT NULL,  VALUE INT,  PRIMARY KEY (YEAR, ETHNIC_GROUP, GENDER, AGE, VALUE) )"


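- The `sql` column holds the original `CREATE TABLE` statement, which shows the declared datatype and constraints of every column.
- In SQLite, it is IMPOSSIBLE to DIRECTLY change the datatype of an existing column: `ALTER TABLE` has no option for it. The usual workaround is to create a new table with the desired datatypes and copy the data across.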
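One common way around this limitation is to rebuild the table. The sketch below is an illustration under assumptions, not something the notebook runs: the table `demo` and its columns are hypothetical, and the data is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (year INT, value INT)")   # hypothetical table
conn.executemany("INSERT INTO demo VALUES (?, ?)", [(2021, 10), (2022, 20)])

# Rebuild: new table with the desired type, copy the rows, drop the old
# table, then give the new table the old name.
conn.execute("CREATE TABLE demo_new (year INT, value REAL)")
conn.execute("INSERT INTO demo_new SELECT year, value FROM demo")
conn.execute("DROP TABLE demo")
conn.execute("ALTER TABLE demo_new RENAME TO demo")

# PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk).
declared_type = [
    row[2] for row in conn.execute("PRAGMA table_info(demo)") if row[1] == "value"
][0]
rows = conn.execute("SELECT year, value FROM demo ORDER BY year").fetchall()
print(declared_type, rows)
conn.close()
```

Any indexes or constraints on the original table have to be declared again on the new one, because they are not carried over by the copy.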

### c. Create a copy of the table:

In [4]:
%%sql
CREATE TABLE copy_pop AS 
    SELECT *
    FROM population

 * sqlite:///testdatabase.db
Done.


[]
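`CREATE TABLE ... AS SELECT` copies the column names and the rows, but a table created this way has no `PRIMARY KEY` and no other constraints. The sketch below checks this on a hypothetical table `src` with made-up data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE src (id INT NOT NULL, label VARCHAR(30), PRIMARY KEY (id))"
)  # hypothetical table
conn.execute("INSERT INTO src VALUES (1, 'a')")
conn.execute("CREATE TABLE src_copy AS SELECT * FROM src")


def pk_columns(table):
    # In PRAGMA table_info, index 1 is the column name and index 5 is the
    # column's position in the primary key (0 means "not part of it").
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})") if row[5] > 0]


src_pk = pk_columns("src")
copy_pk = pk_columns("src_copy")
copied_rows = conn.execute("SELECT * FROM src_copy").fetchall()
print(src_pk, copy_pk, copied_rows)
conn.close()
```

If the copy needs the same constraints, one option is to create it with an explicit `CREATE TABLE` statement and fill it with `INSERT INTO ... SELECT`.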

### d. Add new columns:
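- Goal: create a new column that encodes the `GENDER` variable: 1 for female and 0 for male.

- Add a new column (SQLite accepts only one column per `ALTER TABLE ... ADD` statement, so repeat the statement for each additional column):
```sql
ALTER TABLE table_name
ADD COLUMN col1 datatype;
```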

In [5]:
%%sql
ALTER TABLE copy_pop
ADD GENDER_CODE INT

 * sqlite:///testdatabase.db
Done.


[]
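A freshly added column holds `NULL` in every existing row until it is updated. The sketch below adds two columns with one statement each; the table `demo` and the column `note` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (gender TEXT)")   # hypothetical table
conn.executemany("INSERT INTO demo VALUES (?)", [("Male",), ("Female",)])

# One ALTER TABLE ... ADD COLUMN statement per new column.
for column_def in ("gender_code INT", "note TEXT"):
    conn.execute(f"ALTER TABLE demo ADD COLUMN {column_def}")

columns = [row[1] for row in conn.execute("PRAGMA table_info(demo)")]
rows = conn.execute("SELECT * FROM demo ORDER BY gender").fetchall()
print(columns)
print(rows)   # the new columns are None (NULL) for the existing rows
conn.close()
```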

### e. Update the values in the rows that match a condition:
```sql
UPDATE table_name
SET col = value
WHERE <condition>
```

In [6]:
%%sql
UPDATE copy_pop
SET GENDER_CODE = 0
WHERE GENDER = 'Male'

 * sqlite:///testdatabase.db
4375 rows affected.


[]

In [7]:
%%sql
UPDATE copy_pop
SET GENDER_CODE = 1
WHERE GENDER = 'Female'

 * sqlite:///testdatabase.db
4375 rows affected.


[]
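As an alternative to one `UPDATE` per value, both codes could be set in a single statement with a `CASE` expression. This is a sketch of that alternative, not what the notebook ran; the table `demo` and its rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (gender TEXT, gender_code INT)")   # hypothetical table
conn.executemany(
    "INSERT INTO demo (gender) VALUES (?)",
    [("Male",), ("Female",), ("All genders",)],   # made-up rows
)

# CASE maps each gender to its code; values with no matching WHEN branch get NULL.
cursor = conn.execute(
    """
    UPDATE demo
    SET gender_code = CASE gender
        WHEN 'Male' THEN 0
        WHEN 'Female' THEN 1
    END
    """
)
affected = cursor.rowcount   # no WHERE clause, so every row is touched

rows = conn.execute("SELECT gender, gender_code FROM demo ORDER BY gender").fetchall()
print(affected, rows)
conn.close()
```

The trade-off is that this form touches every row, while the two separate statements with `WHERE` only touch the matching ones.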

- Delete the rows whose `GENDER` value starts with `All` (the totals across all genders):

In [8]:
%%sql
DELETE FROM copy_pop
WHERE GENDER LIKE 'All%'

 * sqlite:///testdatabase.db
4375 rows affected.


[]

- Delete the rows whose `AGE` value starts with `All` (the totals across all ages):


In [9]:
%%sql
DELETE FROM copy_pop
WHERE AGE LIKE 'All%'

 * sqlite:///testdatabase.db
350 rows affected.


[]
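Because `DELETE` cannot be undone once committed, it can help to count the matching rows first. The sketch below does that on a hypothetical table `demo` with made-up rows; it also shows that `LIKE` in SQLite is case-insensitive for ASCII letters by default, so `'All%'` matches `'ALL GENDERS'` as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (gender TEXT)")   # hypothetical table
conn.executemany(
    "INSERT INTO demo VALUES (?)",
    [("All genders",), ("ALL GENDERS",), ("Male",), ("Female",)],   # made-up rows
)

# Preview how many rows the pattern matches before deleting anything.
to_delete = conn.execute(
    "SELECT COUNT(*) FROM demo WHERE gender LIKE 'All%'"
).fetchone()[0]

cursor = conn.execute("DELETE FROM demo WHERE gender LIKE 'All%'")
deleted = cursor.rowcount   # number of rows the DELETE removed

remaining = [row[0] for row in conn.execute("SELECT gender FROM demo ORDER BY gender")]
print(to_delete, deleted, remaining)
conn.close()
```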

In [10]:
%%sql
SELECT *
FROM copy_pop

 * sqlite:///testdatabase.db
Done.


YEAR,ETHNIC_GROUP,GENDER,AGE,VALUE,GENDER_CODE
2022,All ethnicity,Male,0 - 4 Years,86509.0,0
2022,All ethnicity,Male,5 - 9 Years,93894.0,0
2022,All ethnicity,Male,10 - 14 Years,92341.0,0
2022,All ethnicity,Male,15 - 19 Years,94952.0,0
2022,All ethnicity,Male,20 - 24 Years,112471.0,0
2022,All ethnicity,Male,25 - 29 Years,125777.0,0
2022,All ethnicity,Male,30 - 34 Years,129526.0,0
2022,All ethnicity,Male,35 - 39 Years,110784.0,0
2022,All ethnicity,Male,40 - 44 Years,110672.0,0
2022,All ethnicity,Male,45 - 49 Years,116042.0,0


### g. Rename the table:
```sql
ALTER TABLE copy_pop
RENAME TO new_name
```

In [11]:
%%sql
ALTER TABLE copy_pop
RENAME TO population_official

 * sqlite:///testdatabase.db
Done.


[]

### h. Find the average value for each category:
```sql
SELECT category_col, AVG(value_col)
FROM table_name
GROUP BY category_col;
```
- Find the average across all age groups for each gender of each ethnic group in 2021.

In [12]:
%%sql
SELECT ETHNIC_GROUP, GENDER, AVG(VALUE) AS AVG_VALUE_2021
FROM population_official
WHERE YEAR = 2021
GROUP BY ETHNIC_GROUP, GENDER;

 * sqlite:///testdatabase.db
Done.


ETHNIC_GROUP,GENDER,AVG_VALUE_2021
All ethnicity,Female,106816.54166666669
All ethnicity,Male,97713.54166666669
Chinese,Female,84585.875
Chinese,Male,76186.79166666667
Indians,Female,7082.375
Indians,Male,6873.833333333333
Malays,Female,13667.708333333334
Malays,Male,13312.125
Other Ethnic Groups,Female,1480.5833333333333
Other Ethnic Groups,Male,1340.7916666666667
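To see what `GROUP BY` does with two grouping columns, the sketch below runs the same kind of query on a tiny hypothetical table `demo`. The numbers are made up so that the averages can be checked by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (grp TEXT, gender TEXT, value INT)")   # hypothetical table
conn.executemany(
    "INSERT INTO demo VALUES (?, ?, ?)",
    [
        ("A", "Female", 10),
        ("A", "Female", 20),   # A/Female averages (10 + 20) / 2
        ("A", "Male", 30),
        ("B", "Female", 40),
    ],
)

# One output row per distinct (grp, gender) pair.
averages = conn.execute(
    """
    SELECT grp, gender, AVG(value)
    FROM demo
    GROUP BY grp, gender
    ORDER BY grp, gender
    """
).fetchall()
print(averages)
conn.close()
```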


- Sort the result:
    - First by `ETHNIC_GROUP` in descending alphabetical order
    - Then by `GENDER` in ascending order: Female -> Male

In [13]:
%%sql
SELECT ETHNIC_GROUP, GENDER, AVG(VALUE) AS AVG_VALUE_2021
FROM population_official
WHERE YEAR = 2021
GROUP BY ETHNIC_GROUP, GENDER
ORDER BY ETHNIC_GROUP DESC, GENDER ASC;

 * sqlite:///testdatabase.db
Done.


ETHNIC_GROUP,GENDER,AVG_VALUE_2021
Other Ethnic Groups,Female,1480.5833333333333
Other Ethnic Groups,Male,1340.7916666666667
Malays,Female,13667.708333333334
Malays,Male,13312.125
Indians,Female,7082.375
Indians,Male,6873.833333333333
Chinese,Female,84585.875
Chinese,Male,76186.79166666667
All ethnicity,Female,106816.54166666669
All ethnicity,Male,97713.54166666669


### i. Filter output:
- Find the average across all age groups for each gender of each ethnic group in 2021. Show only the groups whose average is above 10000.
- Sort the result:
    - First by `ETHNIC_GROUP` in descending alphabetical order
    - Then by `GENDER` in ascending order: Female -> Male
- Because the condition applies to an aggregate (`AVG`), use `HAVING`.
- `HAVING` vs `WHERE`:
    - `WHERE`: filters the INPUT rows, before they are grouped
    - `HAVING`: filters the OUTPUT groups, after the aggregates are computed
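The difference between the two filters is easiest to see when removing the `WHERE` clause changes which groups pass the `HAVING` test. The sketch below uses a hypothetical table `demo` with made-up values chosen for that purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (year INT, grp TEXT, value INT)")   # hypothetical table
conn.executemany(
    "INSERT INTO demo VALUES (?, ?, ?)",
    [
        (2021, "A", 5), (2021, "A", 25),   # A averages 15 in 2021
        (2021, "B", 4), (2021, "B", 6),    # B averages 5 in 2021
        (2020, "B", 1000),                 # only counted when 2020 is not filtered out
    ],
)

# WHERE drops the 2020 row first; HAVING then drops group B (average 5).
with_where = conn.execute(
    """
    SELECT grp, AVG(value) FROM demo
    WHERE year = 2021
    GROUP BY grp
    HAVING AVG(value) > 10
    ORDER BY grp
    """
).fetchall()

# Without WHERE, the 2020 row pulls B's average above 10, so B survives.
without_where = [
    row[0]
    for row in conn.execute(
        "SELECT grp FROM demo GROUP BY grp HAVING AVG(value) > 10 ORDER BY grp"
    )
]
print(with_where, without_where)
conn.close()
```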

### j. Comparison operators:
- `>`: greater than
- `<`: less than
- `>=`: greater than or equal to
- `<=`: less than or equal to
- `BETWEEN X AND Y`: `>= X` and `<= Y` (both bounds are inclusive)
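A quick check that `BETWEEN` includes both bounds and gives the same result as combining `>=` and `<=`. The table `demo` and its values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (value INT)")   # hypothetical table
conn.executemany(
    "INSERT INTO demo VALUES (?)",
    [(9999,), (10000,), (50000,), (90000,), (90001,)],   # made-up values
)

between_rows = [
    row[0]
    for row in conn.execute(
        "SELECT value FROM demo WHERE value BETWEEN 10000 AND 90000 ORDER BY value"
    )
]
explicit_rows = [
    row[0]
    for row in conn.execute(
        "SELECT value FROM demo WHERE value >= 10000 AND value <= 90000 ORDER BY value"
    )
]
print(between_rows)   # the bounds 10000 and 90000 are both kept
conn.close()
```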

In [14]:
%%sql
SELECT ETHNIC_GROUP, GENDER, AVG(VALUE) AS AVG_VALUE_2021_above10000
FROM population_official
WHERE YEAR = 2021
GROUP BY ETHNIC_GROUP, GENDER
HAVING AVG(VALUE) >10000
ORDER BY ETHNIC_GROUP DESC, GENDER ASC;

 * sqlite:///testdatabase.db
Done.


ETHNIC_GROUP,GENDER,AVG_VALUE_2021_above10000
Malays,Female,13667.708333333334
Malays,Male,13312.125
Chinese,Female,84585.875
Chinese,Male,76186.79166666667
All ethnicity,Female,106816.54166666669
All ethnicity,Male,97713.54166666669


- Find the average across all age groups for each gender of each ethnic group in 2021. Show only the groups whose average is >= 10000 and <= 90000.
- Sort the result:
    - First by `ETHNIC_GROUP` in descending alphabetical order
    - Then by `GENDER` in ascending order: Female -> Male

In [15]:
%%sql
SELECT GENDER, ETHNIC_GROUP, AVG(VALUE) AS average_2021above
FROM population_official
WHERE YEAR = 2021
GROUP BY ETHNIC_GROUP, GENDER
HAVING AVG(VALUE) BETWEEN 10000 AND 90000
ORDER BY ETHNIC_GROUP DESC, GENDER ASC

 * sqlite:///testdatabase.db
Done.


GENDER,ETHNIC_GROUP,average_2021above
Female,Malays,13667.708333333334
Male,Malays,13312.125
Female,Chinese,84585.875
Male,Chinese,76186.79166666667


### k. Truncate the table: remove all the rows of the table
```sql
TRUNCATE TABLE table_name;
```
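- SQLite does not support `TRUNCATE TABLE`. In SQLite, run `DELETE FROM table_name;` without a `WHERE` clause to remove all rows.
- TRUNCATE vs DELETE (in database systems that provide both):
    - TRUNCATE:
        - Is a DDL statement
        - Usually faster
        - Removes all rows; it cannot take a `WHERE` condition
    - DELETE:
        - Is a DML statement
        - Usually slower
        - Can take a `WHERE` condition to remove only the matching rows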
### l. Drop the table: remove the whole table itself
```sql
DROP TABLE table_name;
```
- DROP vs TRUNCATE:
    - DROP: removes the WHOLE table, both its rows and its definition
    - TRUNCATE: removes the rows only; the empty table and its columns remain
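After a `DROP TABLE`, the table no longer appears in `sqlite_master`. The sketch below confirms this on a hypothetical table `demo`, and also shows `DROP TABLE IF EXISTS`, which does not raise an error when the table is already gone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (id INT)")   # hypothetical table

conn.execute("DROP TABLE demo")
conn.execute("DROP TABLE IF EXISTS demo")    # safe to repeat: no error if missing

# The schema table no longer lists the dropped table.
tables_left = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# Querying the dropped table now fails.
try:
    conn.execute("SELECT * FROM demo")
    still_queryable = True
except sqlite3.OperationalError:
    still_queryable = False

print(tables_left, still_queryable)
conn.close()
```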