# HW1. SQL Basics

## Objectives

In this homework assignment, you will use SQL to store and query a database. You will learn the following:

 - How to create a database
 - How to load an available database
 - How to create a table 
 - How to change a table after creation
 - How to insert data into a table
 - How to select certain rows or columns from a table
 - How to join two tables together
 - How to use expressions
 

You will also use [SQLite](https://en.wikipedia.org/wiki/SQLite) as the DBMS. In contrast to many other database management systems (e.g., Oracle, DB2, and SQL Server), SQLite is not a client–server database engine. Rather, it is embedded into the end program. This unique feature has led it to be adopted by [billions of applications](https://www.sqlite.org/mostdeployed.html). 
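
To make the "embedded" point concrete, here is a minimal sketch using Python's standard `sqlite3` module. It is only an illustration and not part of the graded notebook; the in-memory database and the version query are assumptions chosen for the example.

```python
import sqlite3

# ":memory:" creates a temporary database inside this Python process;
# no server is started and nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Ask the embedded engine for its version string.
version = conn.execute("SELECT sqlite_version()").fetchone()[0]
print("SQLite engine version:", version)

conn.close()
```

The whole database engine runs inside the calling program, which is why no connection settings such as a host or port appear anywhere.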

This assignment has a total of five questions and 25 points. 

## Setup 

1. Install Jupyter Notebook 

    [Jupyter](https://jupyter.org/) is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations, and narrative text. Please install it using [Anaconda](https://www.anaconda.com/distribution/) for Python 3.7.
    

2. Install SQLite

    If you are using Mac OS X or Linux, SQLite should be pre-installed. Open a terminal and type `sqlite3` to start it. To exit, type `.exit`.
    
    If you are using Windows, please follow the instructions [here](http://www.sqlitetutorial.net/download-install-sqlite/) to install SQLite.
    
    

3. Install ipython-sql

    [ipython-sql](https://github.com/catherinedevlin/ipython-sql) is a Jupyter Notebook extension. It lets you run SQL queries inside Jupyter notebooks.
    
    Please install it using Anaconda. Open a terminal and type:


    conda install -c conda-forge ipython-sql

## Initial Test

Please download [test.ipynb](test.ipynb) and follow the steps in it to test your environment before you start your homework assignment.

# Homework

Please download [hw1.ipynb](hw1.ipynb) and complete the questions in it for your homework.

## Q1: Create a database (4 points)

The goal is to create a database (named `cssys`) to manage student and course information in the computing science department, and then create three tables in this database. The first table is named `students`, the second `courses`, and the third `transcript`.

To start, please execute the following cell to load the [ipython-sql](https://github.com/catherinedevlin/ipython-sql) extension.

In [1]:
%load_ext sql

1.1 (1 point) Create an empty database named `cssys`

In [2]:
%sql sqlite:///cssys.db

'Connected: @cssys.db'

1.2 (1 point) Create a table named `students`

Please create a table named `students`.
The `students` table has six attributes: student id, first name, last name, age, gender, gpa:
* `id` - integer
* `firstname` - char(15)
* `lastname` - char(15)
* `age` - integer
* `gender` - char(1)
* `gpa` - double
* `id` is Primary Key

In [3]:
%%sql

CREATE TABLE students (
    id integer, 
    firstname char(15), 
    lastname char(15), 
    age integer, 
    gender char(1), 
    gpa double, 
    
    PRIMARY KEY(id)
)

 * sqlite:///cssys.db
Done.


[]

1.3 (1 point) Create a table named `courses`

Please create a table named `courses`.
The `courses` table has four attributes: id, name, credit, pre-requisites:
* `id`- integer
* `name` - varchar(30)
* `credit` - integer
* `prereq` - integer
* (`id`, `prereq`) is Primary Key

In [4]:
%%sql

CREATE TABLE courses (
    id integer, 
    name varchar(30), 
    credit integer, 
    prereq integer, 
    
    PRIMARY KEY(id, prereq)
)

 * sqlite:///cssys.db
Done.


[]

1.4 (1 point) Create a table named `transcript`

Please create a table named `transcript`.
The `transcript` table has five columns: studentid, courseid, mark, semester, credit:
* `studentid`- integer
* `courseid` - integer
* `mark` - double
* `semester` - integer (represented as `year + 01:fall, 02:spring, 03:summer`)
* `credit` - integer
* (`studentid`,`courseid`) is Primary Key

In [5]:
%%sql

CREATE TABLE transcript (
    studentid integer, 
    courseid integer, 
    mark double,  
    semester integer, -- year + 01:fall, 02:spring, 03:summer
    credit integer,
                         
    PRIMARY KEY (studentid, courseid)
)

 * sqlite:///cssys.db
Done.


[]
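
One way to double-check a table definition after creating it is SQLite's `PRAGMA table_info`. The sketch below is an illustration outside the notebook: it uses Python's standard `sqlite3` module and an in-memory database as a stand-in for `cssys.db`, and recreates only the `courses` table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for cssys.db
conn.execute("""
    CREATE TABLE courses (
        id integer,
        name varchar(30),
        credit integer,
        prereq integer,
        PRIMARY KEY (id, prereq)
    )
""")

# Each row of PRAGMA table_info is (cid, name, type, notnull, dflt_value, pk).
# pk is 0 for non-key columns, otherwise the 1-based position within the key.
columns = conn.execute("PRAGMA table_info(courses)").fetchall()
key_columns = [col[1] for col in sorted(columns, key=lambda col: col[5]) if col[5] > 0]
print(key_columns)

conn.close()
```

The same pragma can be run against `students` and `transcript` to confirm their columns and keys.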

## Q2: Modify a database (1 point)

Please write SQL queries to replace the `age` attribute of the `students` table with a date of birth (`dob`) attribute. Decide on the type and the default value of this attribute and include them in your response. You should also decide how to perform this step (delete/recreate/modify).

In [6]:
%%sql


ALTER TABLE students 
    RENAME TO students_backup;

CREATE TABLE students (
    id integer PRIMARY KEY, 
    firstname char(15), 
    lastname char(15), 
    gender char(1), 
    gpa double);

INSERT INTO students (id, firstname, lastname, gender, gpa)
    SELECT id, firstname, lastname, gender, gpa FROM students_backup;

ALTER TABLE students
    ADD COLUMN dob date;

 * sqlite:///cssys.db
Done.
Done.
0 rows affected.
Done.


[]
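
The rename, recreate, and copy approach can also be tried in isolation. The following is a sketch under stated assumptions, not the required answer: it runs against an in-memory database through Python's `sqlite3` module, inserts one hypothetical row so the copy step has something to move, declares `dob` with `DEFAULT NULL` (an assumed choice meaning "birth date unknown"), and drops the backup table at the end.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for cssys.db
conn.executescript("""
    CREATE TABLE students (
        id integer PRIMARY KEY, firstname char(15), lastname char(15),
        age integer, gender char(1), gpa double
    );
    INSERT INTO students VALUES (1, 'test', 'user', 20, 'f', 3.0);  -- hypothetical row

    -- Rebuild the table without age and with dob.
    ALTER TABLE students RENAME TO students_backup;
    CREATE TABLE students (
        id integer PRIMARY KEY, firstname char(15), lastname char(15),
        gender char(1), gpa double,
        dob date DEFAULT NULL  -- assumed default: unknown birth date
    );
    INSERT INTO students (id, firstname, lastname, gender, gpa)
        SELECT id, firstname, lastname, gender, gpa FROM students_backup;
    DROP TABLE students_backup;
""")

rows = conn.execute("SELECT id, firstname, dob FROM students").fetchall()
new_columns = [col[1] for col in conn.execute("PRAGMA table_info(students)")]
print(rows, new_columns)

conn.close()
```

SQLite has no dedicated date storage class, so a value such as `'2000-01-03'` in a column declared `date` is stored as text. ISO 8601 strings (`YYYY-MM-DD`) keep such values sortable.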

## Q3: Add data to a database (3 points)

3.1 (1 point) Add rows to `students`.

Please write SQL queries to insert the following rows into the `students` table. Adjust the format of the date of birth values to match the type you chose for that attribute.
```
1001, adam, smith, 2000-01-03, m, 3.1
1002, alice, frank, 1999-03-11, f, 3.4
1003, bob, hal, 1999-09-01, m, 2
```

In [7]:
%%sql

INSERT INTO students (id, firstname, lastname, dob, gender, gpa)
    VALUES (1001, 'adam', 'smith', '2000-01-03', 'm', 3.1),
                    (1002, 'alice', 'frank', '1999-03-11', 'f', 3.4),
                    (1003, 'bob', 'hal', '1999-09-01', 'm', 2)

 * sqlite:///cssys.db
3 rows affected.


[]

3.2 (1 point) Add rows to `courses`.

Please write SQL queries to insert the following rows into the `courses` table.
```
100, programming, 3, NULL
110, math, 3, NULL
120, web, 4, NULL
301, networking, 4, 200
301, networking, 4, 150
301, networking, 4, 210
354, database, 3, 120
354, database, 3, 110
360, os, 3, 150
360, os, 3, 210
```

In [8]:
%%sql

INSERT INTO courses (id, name, credit, prereq)
    VALUES (100, 'programming', 3, NULL),
                    (110, 'math', 3, NULL),
                    (120, 'web', 4, NULL),
                    (301, 'networking', 4, 200),
                    (301, 'networking', 4, 150),
                    (301, 'networking', 4, 210),
                    (354, 'database', 3, 120),
                    (354, 'database', 3, 110),
                    (360, 'os', 3, 150),
                    (360, 'os', 3, 210)

 * sqlite:///cssys.db
10 rows affected.


[]

3.3 (1 point) Add rows to `transcript`.

Please write SQL queries to insert the following rows into the `transcript` table.
```
1001, 100, 3, 201801, 3
1001, 110, 3.5, 201801, 3
1001, 120, 2.7, 201801, 4
1001, 301, 3.4, 201802, 4
1002, 100, 3, 201801, 3
1002, 110, 3.2, 201901, 3
1002, 301, 3.1, 201902, 4
1003, 100, 2.5, 201801, 3
1003, 120, 3.5, 201901, 4
1003, 301, 2.8, 201902, 4
1003, 354, 4, 201903, 3
1003, 360, 3.5, 201802, 3

```

In [10]:
%%sql

INSERT INTO transcript (studentid, courseid, mark, semester, credit)
    VALUES (1001, 100, 3, 201801, 3), 
                    (1001, 110, 3.5, 201801, 3),
                    (1001, 120, 2.7, 201801, 4),
                    (1001, 301, 3.4, 201802, 4),
                    (1002, 100, 3, 201801, 3),
                    (1002, 110, 3.2, 201901, 3),
                    (1002, 301, 3.1, 201902, 4),
                    (1003, 100, 2.5, 201801, 3),
                    (1003, 120, 3.5, 201901, 4),
                    (1003, 301, 2.8, 201902, 4),
                    (1003, 354, 4, 201903, 3),
                    (1003, 360, 3.5, 201802, 3)

 * sqlite:///cssys.db
12 rows affected.


[]
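
The `semester` values packed into `transcript` (year followed by a two-digit term code) can be split back apart with integer arithmetic. The sketch below is an illustration only: it uses Python's `sqlite3` module with an in-memory table that keeps just three of the columns and three of the rows listed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transcript (studentid integer, courseid integer, semester integer)")
conn.executemany(
    "INSERT INTO transcript VALUES (?, ?, ?)",
    [(1001, 100, 201801), (1001, 301, 201802), (1003, 354, 201903)],
)

# Integer division by 100 drops the two-digit term code; the remainder keeps it.
decoded = conn.execute("""
    SELECT courseid,
           semester / 100 AS year,
           CASE semester % 100
               WHEN 1 THEN 'fall'
               WHEN 2 THEN 'spring'
               WHEN 3 THEN 'summer'
           END AS term
    FROM transcript
    ORDER BY semester, courseid
""").fetchall()
print(decoded)

conn.close()
```

Because both operands are integers, SQLite performs integer division here, so `201801 / 100` yields `2018` rather than `2018.01`.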

## Q4: Query a database (10 points)

Please write the SQL query for each of the requests below.

4.1 (1 point) Please write a SQL query to show all rows in the `students` table.

In [11]:
%%sql

SELECT * FROM students

 * sqlite:///cssys.db
Done.


id,firstname,lastname,gender,gpa,dob
1001,adam,smith,m,3.1,2000-01-03
1002,alice,frank,f,3.4,1999-03-11
1003,bob,hal,m,2.0,1999-09-01


4.2 (1 point) Please write a SQL query to show the rows whose `credit` is `3` in the `courses` table.

In [12]:
%%sql

SELECT * FROM courses WHERE credit == 3

 * sqlite:///cssys.db
Done.


id,name,credit,prereq
100,programming,3,
110,math,3,
354,database,3,120.0
354,database,3,110.0
360,os,3,150.0
360,os,3,210.0


4.3 (1 point) Please write a SQL query to show the rows whose `mark` is larger than 3 and `credit` is no smaller than 3 in the `transcript` table. 

In [13]:
%%sql

SELECT * FROM transcript
WHERE mark > 3 AND credit >= 3

 * sqlite:///cssys.db
Done.


studentid,courseid,mark,semester,credit
1001,110,3.5,201801,3
1001,301,3.4,201802,4
1002,110,3.2,201901,3
1002,301,3.1,201902,4
1003,120,3.5,201901,4
1003,354,4.0,201903,3
1003,360,3.5,201802,3


4.4 (1 point) Please write a SQL query to show `studentid`, `courseid` and `mark` of all rows in the `transcript` table. 

In [14]:

%%sql

SELECT studentid, courseid, mark
FROM transcript

 * sqlite:///cssys.db
Done.


studentid,courseid,mark
1001,100,3.0
1001,110,3.5
1001,120,2.7
1001,301,3.4
1002,100,3.0
1002,110,3.2
1002,301,3.1
1003,100,2.5
1003,120,3.5
1003,301,2.8


4.5 (1 point) Please write a SQL query to show `studentid`, `courseid` and `mark` of all rows in the `transcript` table whose semester value is `201902`.

In [15]:
%%sql

SELECT studentid, courseid, mark
FROM transcript
WHERE semester == 201902

 * sqlite:///cssys.db
Done.


studentid,courseid,mark
1002,301,3.1
1003,301,2.8


4.6 (1 point) Please write a SQL query to show _distinct_ `courseid` of all rows in the `transcript` table.

In [16]:
%%sql

SELECT DISTINCT courseid
FROM transcript

 * sqlite:///cssys.db
Done.


courseid
100
110
120
301
354
360


4.7 (1 point) Please write a SQL query to show the `firstname`, `lastname`, and `gpa` from the `students` table, sorted by `gpa`.

In [17]:
%%sql

SELECT firstname, lastname, gpa
FROM Students
ORDER BY gpa DESC

 * sqlite:///cssys.db
Done.


firstname,lastname,gpa
alice,frank,3.4
adam,smith,3.1
bob,hal,2.0


4.8 (3 points) Please write a SQL query to compute `lettergrade` of each row in the `transcript` table, and show `studentid`, `courseid` and `lettergrade` of all rows in the `transcript` table. `lettergrade` is computed as follows: 

* If `mark` >= 3.5, then `lettergrade` = "A"
* If 3 <= `mark` < 3.5, then `lettergrade` = "B"
* If 2.5 <= `mark` < 3, then `lettergrade` = "C"
* If 2 <= `mark` < 2.5, then `lettergrade` = "D"
* If `mark` < 2, then `lettergrade` = "F"

In [18]:
%%sql

SELECT studentid, courseid,
    (CASE 
        WHEN mark >= 3.5 THEN 'A'
        WHEN mark >=3 AND mark < 3.5 THEN 'B'
        WHEN mark >= 2.5 AND mark < 3 THEN 'C'
        WHEN mark >= 2 AND mark < 2.5 THEN 'D'
        WHEN mark < 2 THEN 'F'
     END) 
     as lettergrade
FROM transcript

 * sqlite:///cssys.db
Done.


studentid,courseid,lettergrade
1001,100,B
1001,110,A
1001,120,C
1001,301,B
1002,100,B
1002,110,B
1002,301,B
1003,100,C
1003,120,A
1003,301,C


## Q5: Load & Query a database (7 points)

Suppose you work at a bank as a data analyst. Your main job is to analyze the data stored in their database to find out information that can help the business. Please download the database at this [link](bank.db).

The database has six tables. The following shows their schemas. Primary key attributes are underlined and foreign keys are noted in superscript.
 - Customer = {<span style="text-decoration:underline">customerID</span>, firstName, lastName, income, birthDate}
 - Account = {<span style="text-decoration:underline">accNumber</span>, type, balance, branchNumber<sup>FK-Branch</sup>}
 - Owns = {<span style="text-decoration:underline">customerID</span><sup>FK-Customer</sup>, <span style="text-decoration:underline">accNumber</span><sup>FK-Account</sup>}
 - Transactions = {<span style="text-decoration:underline">transNumber</span>, <span style="text-decoration:underline">accNumber</span><sup>FK-Account</sup>, amount}
 - Employee = {<span style="text-decoration:underline">sin</span>, firstName, lastName, salary, branchNumber<sup>FK-Branch</sup>}
 - Branch = {<span style="text-decoration:underline">branchNumber</span>, branchName, managerSIN<sup>FK-Employee</sup>, budget}

Please run the next cell after downloading the database, before you start.

In [19]:
%sql sqlite:///bank.db

'Connected: @bank.db'

5.1 (1 point) Suppose you talked with a customer and remember that their first name started with 'M', included an 'r', and ended with an 'a', but you are not sure of the complete spelling. Please write a SQL query to show the first name and last name of the customers whose first name matches this description.

In [20]:
%%sql

SELECT firstName, lastName 
FROM Customer 
WHERE firstName LIKE 'M%r%a'

 * sqlite:///bank.db
   sqlite:///cssys.db
Done.


firstName,lastName
Martha,Young
Martha,Butler
Maria,Morgan
Maria,Young


5.2 (1 point) Please write a SQL query to show the names of the branches and the first and last names of their managers.

In [21]:
%%sql

SELECT  branchName, firstName, lastName
FROM Branch, Employee
WHERE managerSIN = sin 



 * sqlite:///bank.db
   sqlite:///cssys.db
Done.


branchName,firstName,lastName
London,Phillip,Edwards
Latveria,Victor,Doom
New York,Victor,Doom
Berlin,Deborah,Hernandez
Moscow,Cheryl,Thompson


5.3 (1 point) Please write a SQL query to find out employees who are also customers. 

In [22]:
%%sql

SELECT  Customer.firstName, Customer.lastName
FROM Customer, Employee
WHERE Customer.firstName == Employee.firstName AND Customer.lastName == Employee.lastName
ORDER BY Customer.firstName ASC


 * sqlite:///bank.db
   sqlite:///cssys.db
Done.


firstName,lastName
Amanda,White
Amy,Ross
Anna,Cooper
Anne,Ramirez
Arthur,Jones
Carl,Murphy
Charles,Smith
Deborah,Hernandez
Dennis,Collins
Douglas,Wright


5.4 (2 points) Please write a SQL query to show the account number, account type, account balance, and transaction amount of the accounts with a balance higher than 100,000 and transaction amounts higher than 15,000, starting with the accounts with the highest transaction amount and highest account balance.

In [23]:
%%sql

SELECT   Account.accNumber, Account.type, balance, amount 
FROM Account
JOIN Transactions on Account.accNumber = Transactions.accNumber 
WHERE balance > 100000 AND amount >15000
ORDER BY amount DESC, balance DESC

 * sqlite:///bank.db
   sqlite:///cssys.db
Done.


accNumber,type,balance,amount
9,SAV,132271.23,114869.79
8,BUS,121267.54,114680.63
31,CHQ,111209.89,110249.28
1,SAV,118231.13,109587.15
25,SAV,105997.07,109068.54
13,CHQ,112505.84,108440.2
20,CHQ,107270.59,108278.46
4,BUS,106503.6,104550.76
26,SAV,112046.36,104346.46
6,CHQ,107309.23,104247.4


5.5 (2 points) Please write a SQL query to find the customer ID, first name, and last name of customers who own accounts at both the London and Berlin branches, ordered by last name and first name.

In [24]:
%%sql

SELECT  DISTINCT Customer.customerID, firstName, lastName
FROM Customer
JOIN Owns on Customer.customerID = Owns.customerID
JOIN Account on Owns.accNumber = Account.accNumber 
JOIN Branch on Account.branchNumber = Branch.branchNumber 
WHERE branchName == 'London' 

INTERSECT

SELECT  DISTINCT Customer.customerID, firstName, lastName
FROM Customer
JOIN Owns on Customer.customerID = Owns.customerID
JOIN Account on Owns.accNumber = Account.accNumber 
JOIN Branch on Account.branchNumber = Branch.branchNumber 
WHERE branchName == 'Berlin' 

ORDER BY lastName ASC, firstName ASC

 * sqlite:///bank.db
   sqlite:///cssys.db
Done.


customerID,firstName,lastName
66418,Stephanie,Adams
89197,Lawrence,Anderson
41545,Terry,Bailey
33726,Jerry,Cook
86357,Andrew,Evans
44922,Dennis,Flores
87978,Christopher,Gonzalez
10839,Amy,Hayes
99537,Deborah,Hernandez
13697,Charles,Hill


## Submission

Complete the code in the notebook [hw1.ipynb](hw1.ipynb) and submit it through Canvas to your Homework (1) activity.