# Final Project: Advanced SQL Techniques


## Scenario
- You have to analyse the following datasets for the city of Chicago, as available on the Chicago City data portal:
  - Socioeconomic indicators in Chicago
  - Chicago public schools
  - Chicago crime data

- Based on the information available in the different tables, you have to run specific queries using Advanced SQL techniques that generate the required result sets.

- The lab will be followed by a graded quiz that will have questions on all problems in this lab. Hence, remember to take screenshots of your SQL queries and their outputs for reference.

In [1]:
import mysql.connector as mysql 
import pandas as pd 
import os
from dotenv import load_dotenv

In [2]:
!docker stop mysql-container
!docker restart mysql-container
!docker ps 

mysql-container
mysql-container
CONTAINER ID   IMAGE     COMMAND                  CREATED      STATUS                  PORTS                                                  NAMES
50e33c579e3f   mysql     "docker-entrypoint.s…"   4 days ago   Up Less than a second   0.0.0.0:3306->3306/tcp, :::3306->3306/tcp, 33060/tcp   mysql-container


In [4]:
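# override=True lets values from .env win over variables the shell already sets
# (USER is usually predefined as the OS login name).
load_dotenv('/workspaces/IBM-DS-Course/.env', override=True)
user=os.getenv('USER')
password=os.getenv('PASSWORD')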
host = 'localhost'
port=3306

def get_db_connection(db=None):
    """Open a MySQL connection, optionally selecting a database."""
    return mysql.connect(
        host=host, user=user, password=password, port=port, database=db
    )
db = 'chicago_data'
create = f'CREATE DATABASE IF NOT EXISTS {db};'
use = f'USE {db};'


conn = get_db_connection()
cursor = conn.cursor()
cursor.execute(create)
conn.commit()
cursor.close()
conn.close()

In [23]:
file_paths = {
    'chicago_socioeconomic_data': '/workspaces/IBM-DS-Course/Course 6 Db and SQL /Module 6 /6.3 advance sql asm /6.3 .sql /chicago_socioeconomic_data.sql',
    'chicago_public_schools': '/workspaces/IBM-DS-Course/Course 6 Db and SQL /Module 6 /6.3 advance sql asm /6.3 .sql /chicago_public_schools.sql',
    'chicago_crime': '/workspaces/IBM-DS-Course/Course 6 Db and SQL /Module 6 /6.3 advance sql asm /6.3 .sql /chicago_crime.sql'
}


In [29]:
# Load each table from its .sql dump if it is missing, then record its column names.
conn = get_db_connection(db)
cursor = conn.cursor()

table_exists_query = (
    'SELECT 1 FROM INFORMATION_SCHEMA.TABLES '
    'WHERE TABLE_NAME = %s AND TABLE_SCHEMA = %s;'
)
table_info_query = (
    'SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS '
    'WHERE TABLE_NAME = %s AND TABLE_SCHEMA = %s ORDER BY ORDINAL_POSITION;'
)

try:
    for table_name, file_path in file_paths.items():
        cursor.execute(table_exists_query, (table_name, db))
        if not cursor.fetchall():
            try:
                with open(file_path, 'r') as sql_script:
                    sql_cmds = sql_script.read()
                # Naive split: assumes no ';' appears inside string literals in the dump.
                for cmd in sql_cmds.split(';'):
                    if cmd.strip():
                        cursor.execute(cmd)
                # Commit and report once per file, not once per statement.
                conn.commit()
                print(f'{file_path} was performed successfully')
            except Exception as err:
                print(f'{file_path} failed because {err}')
                continue

        # Record the column names whether the table was just loaded or already existed.
        cursor.execute(table_info_query, (table_name, db))
        rslt = cursor.fetchall()
        globals()[f'{table_name}_df'] = pd.DataFrame(rslt, columns=['COLUMN_NAME'])
finally:
    cursor.close()
    conn.close()

In [31]:
for table_name, file_path in file_paths.items():
    print(f'{table_name}_df')

chicago_socioeconomic_data_df
chicago_public_schools_df
chicago_crime_df


## Ex 1: Using Joins

You have been asked to produce some reports about the communities and crimes in the Chicago area. You will need to use SQL join queries to access the data stored across multiple tables.


### Question 1
Write and execute a SQL query to list the school names, community names and average attendance for communities with a hardship index of 98.
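
One possible shape for this query is an inner join between the schools table and the socioeconomic table on the community area number. The sketch below runs it against a tiny in-memory SQLite stand-in so that it executes without the MySQL container; the same `SELECT` should carry over to MySQL. The column names `COMMUNITY_AREA_NUMBER`, `COMMUNITY_AREA_NAME`, `AVERAGE_STUDENT_ATTENDANCE` and `HARDSHIP_INDEX`, and all sample rows, are assumptions to check against the column lists loaded earlier.

```python
import sqlite3

# Hypothetical miniature copies of the two tables; column names are assumed.
demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE chicago_socioeconomic_data (
    COMMUNITY_AREA_NUMBER INTEGER, COMMUNITY_AREA_NAME TEXT, HARDSHIP_INDEX INTEGER);
CREATE TABLE chicago_public_schools (
    NAME_OF_SCHOOL TEXT, COMMUNITY_AREA_NUMBER INTEGER,
    AVERAGE_STUDENT_ATTENDANCE TEXT);
INSERT INTO chicago_socioeconomic_data VALUES (1, 'Area A', 98), (2, 'Area B', 40);
INSERT INTO chicago_public_schools VALUES
    ('School One', 1, '95.0%'), ('School Two', 2, '91.5%');
""")

# Inner join: only schools whose community area has a hardship index of 98.
q1 = """
SELECT s.NAME_OF_SCHOOL, e.COMMUNITY_AREA_NAME, s.AVERAGE_STUDENT_ATTENDANCE
FROM chicago_public_schools AS s
JOIN chicago_socioeconomic_data AS e
  ON s.COMMUNITY_AREA_NUMBER = e.COMMUNITY_AREA_NUMBER
WHERE e.HARDSHIP_INDEX = 98
"""
q1_rows = demo.execute(q1).fetchall()
print(q1_rows)
demo.close()
```

Against the real database, the string in `q1` could be passed to `pd.read_sql` or `cursor.execute` on a connection from `get_db_connection(db)`.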


### Question 2
Write and execute a SQL query to list all crimes that took place at a school. Include case number, crime type and community name.
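
A left join from the crime table is one way to approach this, since it keeps crimes whose community area has no match in the socioeconomic table. The sketch below assumes that the crime location is stored in a `LOCATION_DESCRIPTION` column containing the word `SCHOOL`, and that the crime type is in `PRIMARY_TYPE`; those names and the sample rows are assumptions, and SQLite again stands in for MySQL so the example is runnable.

```python
import sqlite3

# Hypothetical miniature tables; column names and rows are assumed.
demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE chicago_socioeconomic_data (
    COMMUNITY_AREA_NUMBER INTEGER, COMMUNITY_AREA_NAME TEXT);
CREATE TABLE chicago_crime (
    CASE_NUMBER TEXT, PRIMARY_TYPE TEXT, LOCATION_DESCRIPTION TEXT,
    COMMUNITY_AREA_NUMBER INTEGER);
INSERT INTO chicago_socioeconomic_data VALUES (1, 'Area A');
INSERT INTO chicago_crime VALUES
    ('C001', 'BATTERY', 'SCHOOL, PUBLIC, BUILDING', 1),
    ('C002', 'THEFT', 'STREET', 1),
    ('C003', 'ASSAULT', 'SCHOOL, PRIVATE, GROUNDS', NULL);
""")

# LEFT JOIN keeps school crimes even when the community area is unknown.
q2 = """
SELECT c.CASE_NUMBER, c.PRIMARY_TYPE, e.COMMUNITY_AREA_NAME
FROM chicago_crime AS c
LEFT JOIN chicago_socioeconomic_data AS e
  ON c.COMMUNITY_AREA_NUMBER = e.COMMUNITY_AREA_NUMBER
WHERE c.LOCATION_DESCRIPTION LIKE '%SCHOOL%'
ORDER BY c.CASE_NUMBER
"""
q2_rows = demo.execute(q2).fetchall()
print(q2_rows)
demo.close()
```

The third sample crime has no community area, so it comes back with `None` in the community name column instead of being dropped.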


## Ex 2: Creating a View


For privacy reasons, you have been asked to create a view that enables users to select just the school name and the icon fields from the CHICAGO_PUBLIC_SCHOOLS table. By providing a view, you can ensure that users cannot see the actual scores given to a school, just the icon associated with their score. You should define new names for the view columns to obscure the use of scores and icons in the original table.



### Question 1
Write and execute a SQL statement to create a view showing the columns listed in the following table, with new column names as shown in the second column.

| Column name in CHICAGO_PUBLIC_SCHOOLS | Column name in view |
|-----|----|
| NAME_OF_SCHOOL | School_Name |
| Safety_Icon | Safety_Rating |
| Family_Involvement_Icon | Family_Rating |
| Environment_Icon | Environment_Rating |
| Instruction_Icon | Instruction_Rating |
| Leaders_Icon | Leaders_Rating |
| Teachers_Icon | Teachers_Rating |


Write and execute a SQL statement that returns all of the columns from the view.



Write and execute a SQL statement that returns just the school name and leaders rating from the view.
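
The three statements in this exercise can be sketched as follows. The view name `SCHOOL_RATINGS` is a hypothetical choice (the exercise does not fix one), the single sample row is made up, and SQLite stands in for MySQL so the example runs on its own; the `CREATE VIEW ... AS SELECT ... AS alias` form should be accepted by MySQL as well.

```python
import sqlite3

demo = sqlite3.connect(":memory:")
demo.executescript("""
-- Cut-down, hypothetical version of the schools table with one sample row.
CREATE TABLE CHICAGO_PUBLIC_SCHOOLS (
    NAME_OF_SCHOOL TEXT, Safety_Icon TEXT, Family_Involvement_Icon TEXT,
    Environment_Icon TEXT, Instruction_Icon TEXT, Leaders_Icon TEXT,
    Teachers_Icon TEXT, Leaders_Score INTEGER);
INSERT INTO CHICAGO_PUBLIC_SCHOOLS VALUES
    ('School One', 'Strong', 'Average', 'Weak', 'Strong', 'Weak', 'Average', 35);

-- The view exposes only the name and icon columns, under new names.
CREATE VIEW SCHOOL_RATINGS AS
SELECT NAME_OF_SCHOOL          AS School_Name,
       Safety_Icon             AS Safety_Rating,
       Family_Involvement_Icon AS Family_Rating,
       Environment_Icon        AS Environment_Rating,
       Instruction_Icon        AS Instruction_Rating,
       Leaders_Icon            AS Leaders_Rating,
       Teachers_Icon           AS Teachers_Rating
FROM CHICAGO_PUBLIC_SCHOOLS;
""")

# All columns from the view.
all_cursor = demo.execute("SELECT * FROM SCHOOL_RATINGS")
view_columns = [col[0] for col in all_cursor.description]
all_rows = all_cursor.fetchall()

# Just the school name and leaders rating.
leaders_rows = demo.execute(
    "SELECT School_Name, Leaders_Rating FROM SCHOOL_RATINGS"
).fetchall()
print(view_columns, leaders_rows)
demo.close()
```

Because `Leaders_Score` is not selected in the view definition, a user who can only query the view sees the rating text and never the underlying score.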


## Ex 3: Creating a Stored Procedure


The icon fields are calculated based on the value in the corresponding score field. You need to make sure that when a score field is updated, the icon field is updated too. To do this, you will write a stored procedure that receives the school id and a leaders score as input parameters, calculates the icon setting and updates the fields appropriately.



### Question 1
Write the structure of a query to create or replace a stored procedure called UPDATE_LEADERS_SCORE that takes an in_School_ID parameter as an integer and an in_Leader_Score parameter as an integer.
Take a screenshot showing the SQL query.



### Question 2
Inside your stored procedure, write a SQL statement to update the Leaders_Score field in the CHICAGO_PUBLIC_SCHOOLS table for the school identified by in_School_ID to the value in the in_Leader_Score parameter.
Take a screenshot showing the SQL query.
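
A minimal sketch of Questions 1 and 2 for the MySQL container used in this notebook follows. MySQL has no `CREATE OR REPLACE PROCEDURE`, so a `DROP PROCEDURE IF EXISTS` followed by `CREATE PROCEDURE` is used here as a substitute. The `School_ID` column name and the `install_procedure` helper are assumptions introduced for illustration.

```python
# Each statement is kept as a Python string and sent to cursor.execute() on its own.
# No DELIMITER lines are needed: those belong to the mysql command-line client.
drop_procedure = "DROP PROCEDURE IF EXISTS UPDATE_LEADERS_SCORE"

create_procedure = """
CREATE PROCEDURE UPDATE_LEADERS_SCORE(IN in_School_ID INT, IN in_Leader_Score INT)
BEGIN
    -- Question 2: store the new score for the requested school.
    -- School_ID is an assumed column name.
    UPDATE CHICAGO_PUBLIC_SCHOOLS
    SET Leaders_Score = in_Leader_Score
    WHERE School_ID = in_School_ID;
END
"""


def install_procedure(cursor):
    """Hypothetical helper: run both statements on an open database cursor."""
    cursor.execute(drop_procedure)
    cursor.execute(create_procedure)
```

With a live connection this could be used as `install_procedure(get_db_connection(db).cursor())`.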



### Question 3
Inside your stored procedure, write a SQL IF statement to update the Leaders_Icon field in the CHICAGO_PUBLIC_SCHOOLS table for the school identified by in_School_ID using the following information.

| Score lower limit | Score upper limit | Icon |
|----|----|----|
| 80 | 99 | Very strong |
| 60 | 79 | Strong |
| 40 | 59 | Average |
| 20 | 39 | Weak |
| 0 | 19 | Very weak |
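
The score bands translate into an `IF ... ELSEIF ... END IF` ladder. As a rough sketch, the Python below mirrors the table in a small lookup function and also generates the matching SQL text for the procedure body. The helper names `leaders_icon` and `if_ladder_sql` are hypothetical, and `School_ID` is an assumed column name.

```python
# Score bands copied from the table: (lower limit, upper limit, icon).
BANDS = [
    (80, 99, "Very strong"),
    (60, 79, "Strong"),
    (40, 59, "Average"),
    (20, 39, "Weak"),
    (0, 19, "Very weak"),
]


def leaders_icon(score):
    """Return the icon for a score, or None when the score is outside 0-99."""
    for low, high, icon in BANDS:
        if low <= score <= high:
            return icon
    return None


def if_ladder_sql():
    """Build the IF / ELSEIF block for the procedure body as SQL text."""
    lines = []
    for i, (low, high, icon) in enumerate(BANDS):
        keyword = "IF" if i == 0 else "ELSEIF"
        lines.append(f"{keyword} in_Leader_Score BETWEEN {low} AND {high} THEN")
        lines.append(
            f"    UPDATE CHICAGO_PUBLIC_SCHOOLS SET Leaders_Icon = '{icon}' "
            "WHERE School_ID = in_School_ID;"
        )
    lines.append("END IF;")
    return "\n".join(lines)


print(if_ladder_sql())
```

Generating the ladder from one list keeps the limits and icon names in a single place, though writing the five branches out by hand in the procedure is equally valid for the assignment.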



### Question 4
Run your code to create the stored procedure.

Write a query to call the stored procedure, passing a valid school ID and a leader score of 50, to check that the procedure works as expected.
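
From Python, the call could be wrapped as in the sketch below, which uses the connector's `cursor.callproc` and then reads the row back to confirm the change. The helper name `call_update_leaders_score` and the `School_ID` column are assumptions; the school ID itself must be one that exists in the table.

```python
def call_update_leaders_score(conn, school_id, leader_score):
    """Call UPDATE_LEADERS_SCORE, then return (Leaders_Score, Leaders_Icon)."""
    cursor = conn.cursor()
    try:
        # Equivalent to: CALL UPDATE_LEADERS_SCORE(school_id, leader_score)
        cursor.callproc("UPDATE_LEADERS_SCORE", (school_id, leader_score))
        conn.commit()
        # Read the row back to check the score and icon (School_ID is assumed).
        cursor.execute(
            "SELECT Leaders_Score, Leaders_Icon FROM CHICAGO_PUBLIC_SCHOOLS "
            "WHERE School_ID = %s",
            (school_id,),
        )
        return cursor.fetchone()
    finally:
        cursor.close()
```

With a connection from `get_db_connection(db)`, a score of 50 falls in the 40 to 59 band, so the returned icon should be `Average` if the procedure works as intended.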


## Ex 4: Using Transactions


You realise that if someone calls your code with a score outside of the allowed range (0-99), then the score will be updated with the invalid data and the icon will remain at its previous value. There are various ways to avoid this problem, one of which is using a transaction.



### Question 1
Update your stored procedure definition. Add a generic ELSE clause to the IF statement that rolls back the current work if the score did not fit any of the preceding categories.


### Question 2
Update your stored procedure definition again. Add a statement to commit the current unit of work at the end of the procedure.

Run your code to replace the stored procedure.
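
Putting both questions together, the revised procedure might look like the sketch below. The `START TRANSACTION` line is an addition not required by the exercise, included on the assumption that autocommit could be enabled for the session; `School_ID` is an assumed column name.

```python
# DROP + CREATE stands in for "create or replace", which MySQL does not offer
# for procedures. Each string would go to cursor.execute() as a single statement.
drop_procedure = "DROP PROCEDURE IF EXISTS UPDATE_LEADERS_SCORE"

create_procedure_tx = """
CREATE PROCEDURE UPDATE_LEADERS_SCORE(IN in_School_ID INT, IN in_Leader_Score INT)
BEGIN
    -- Assumption: start the transaction explicitly in case autocommit is on.
    START TRANSACTION;

    -- School_ID is an assumed column name.
    UPDATE CHICAGO_PUBLIC_SCHOOLS
    SET Leaders_Score = in_Leader_Score
    WHERE School_ID = in_School_ID;

    IF in_Leader_Score BETWEEN 80 AND 99 THEN
        UPDATE CHICAGO_PUBLIC_SCHOOLS SET Leaders_Icon = 'Very strong'
        WHERE School_ID = in_School_ID;
    ELSEIF in_Leader_Score BETWEEN 60 AND 79 THEN
        UPDATE CHICAGO_PUBLIC_SCHOOLS SET Leaders_Icon = 'Strong'
        WHERE School_ID = in_School_ID;
    ELSEIF in_Leader_Score BETWEEN 40 AND 59 THEN
        UPDATE CHICAGO_PUBLIC_SCHOOLS SET Leaders_Icon = 'Average'
        WHERE School_ID = in_School_ID;
    ELSEIF in_Leader_Score BETWEEN 20 AND 39 THEN
        UPDATE CHICAGO_PUBLIC_SCHOOLS SET Leaders_Icon = 'Weak'
        WHERE School_ID = in_School_ID;
    ELSEIF in_Leader_Score BETWEEN 0 AND 19 THEN
        UPDATE CHICAGO_PUBLIC_SCHOOLS SET Leaders_Icon = 'Very weak'
        WHERE School_ID = in_School_ID;
    ELSE
        -- Question 1: score outside 0-99, so undo the score update.
        ROLLBACK;
    END IF;

    -- Question 2: make the changes permanent (nothing is pending after a rollback).
    COMMIT;
END
"""
```

Running `cursor.execute(drop_procedure)` and then `cursor.execute(create_procedure_tx)` on an open connection would replace the earlier definition.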



Write and run one query to check that the updated stored procedure works as expected when you use a valid score of 38.



Write and run another query to check that the updated stored procedure works as expected when you use an invalid score of 101.
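
To see what the two checks should show, the sketch below reproduces the procedure's transaction logic in plain Python on an in-memory SQLite table. SQLite has no stored procedures, so the function `update_leaders_score`, the `schools` table and its sample row are hypothetical stand-ins. Against MySQL, the checks would instead be a `CALL UPDATE_LEADERS_SCORE(...)` followed by a `SELECT` of the score and icon.

```python
import sqlite3


def update_leaders_score(conn, school_id, score):
    """Python stand-in for the procedure: update score and icon, or roll back."""
    bands = [(80, 99, "Very strong"), (60, 79, "Strong"), (40, 59, "Average"),
             (20, 39, "Weak"), (0, 19, "Very weak")]
    # This UPDATE opens a transaction implicitly.
    conn.execute("UPDATE schools SET Leaders_Score = ? WHERE School_ID = ?",
                 (score, school_id))
    for low, high, icon in bands:
        if low <= score <= high:
            conn.execute("UPDATE schools SET Leaders_Icon = ? WHERE School_ID = ?",
                         (icon, school_id))
            conn.commit()
            return True
    conn.rollback()  # the ELSE branch: undo the score update
    return False


demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE schools (School_ID INTEGER, Leaders_Score INTEGER, Leaders_Icon TEXT)"
)
demo.execute("INSERT INTO schools VALUES (1, 65, 'Strong')")
demo.commit()

check = "SELECT Leaders_Score, Leaders_Icon FROM schools WHERE School_ID = 1"
update_leaders_score(demo, 1, 38)
after_valid = demo.execute(check).fetchone()
update_leaders_score(demo, 1, 101)
after_invalid = demo.execute(check).fetchone()
print(after_valid, after_invalid)
demo.close()
```

The valid score of 38 updates both fields, while the invalid score of 101 is rolled back, so the row still holds the values from the previous call.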

