# Interactive Session 2

This Jupyter notebook is a companion to Interactive Session 2. It lets you practice what you have learned and introduces some new class material. It will help you answer the quiz questions in this Interactive Session and gives you a chance to experiment before answering them.

Please note that not every query in this notebook is meant to run successfully. Some will fail, and you are expected to work out why.

## Initial Steps

Please run the following few cells before we start. They create the required tables and insert some tuples so that we can start experimenting with them.

In [1]:
%load_ext sql

In [2]:
%sql sqlite:///is2.db

In [4]:
%%sql
CREATE TABLE Students (
    sid CHAR(11),
    name CHAR(20) NOT NULL,
    school CHAR(10),
    age INTEGER,
    gpa REAL,
    PRIMARY KEY (sid)
);

RuntimeError: (sqlite3.OperationalError) table Students already exists
[SQL: CREATE TABLE Students (
    sid CHAR(11),
    name CHAR(20) NOT NULL,
    school CHAR(10),
    age INTEGER,
    gpa REAL,
    PRIMARY KEY (sid)
);]
(Background on this error at: https://sqlalche.me/e/20/e3q8)
If you need help solving this issue, send us a message: https://ploomber.io/community
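
The "table Students already exists" error above typically appears when the cell is run a second time against the same `is2.db` file. One possible way to make the setup cell safe to re-run is `CREATE TABLE IF NOT EXISTS`. The sketch below is only an illustration: it uses Python's built-in `sqlite3` module and an in-memory database instead of the notebook's `%sql` connection.

```python
import sqlite3

# In-memory database used only for this illustration (not is2.db).
conn = sqlite3.connect(":memory:")

ddl = """
CREATE TABLE IF NOT EXISTS Students (
    sid CHAR(11),
    name CHAR(20) NOT NULL,
    school CHAR(10),
    age INTEGER,
    gpa REAL,
    PRIMARY KEY (sid)
);
"""
conn.execute(ddl)
conn.execute(ddl)  # the second run is a no-op instead of an error

# Without IF NOT EXISTS, re-creating the table fails as in the output above.
try:
    conn.execute("CREATE TABLE Students (sid CHAR(11))")
    recreate_error = None
except sqlite3.OperationalError as exc:
    recreate_error = str(exc)

print(recreate_error)
```

Note that `IF NOT EXISTS` keeps whatever table is already there, so it does not pick up changes made to the column definitions.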


In [5]:
%%sql
CREATE TABLE Enrolled(
    stid CHAR(20),
    cid CHAR(20),
    grade CHAR(5),
    PRIMARY KEY (stid, cid),
    FOREIGN KEY (stid) REFERENCES Students(sid)
);
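
The `FOREIGN KEY` clause above declares that every `stid` should match a `sid` in Students. In SQLite this constraint is only checked when foreign-key enforcement is switched on for the connection with `PRAGMA foreign_keys = ON`. The following sketch, which assumes Python's built-in `sqlite3` module and an in-memory database, tries to insert an enrollment for a made-up student id `'9999'` with enforcement off and then on.

```python
import sqlite3

SCHEMA = """
CREATE TABLE Students (
    sid CHAR(11), name CHAR(20) NOT NULL, school CHAR(10),
    age INTEGER, gpa REAL, PRIMARY KEY (sid)
);
CREATE TABLE Enrolled (
    stid CHAR(20), cid CHAR(20), grade CHAR(5),
    PRIMARY KEY (stid, cid),
    FOREIGN KEY (stid) REFERENCES Students(sid)
);
"""


def orphan_insert_accepted(enforce: bool) -> bool:
    """Return True if an Enrolled row with no matching student is accepted."""
    conn = sqlite3.connect(":memory:")
    # Foreign-key checking is a per-connection setting in SQLite.
    conn.execute(f"PRAGMA foreign_keys = {'ON' if enforce else 'OFF'}")
    conn.executescript(SCHEMA)
    try:
        # '9999' is a hypothetical sid that does not exist in Students.
        conn.execute("INSERT INTO Enrolled VALUES ('9999', '120', 'A')")
        return True
    except sqlite3.IntegrityError:
        return False
    finally:
        conn.close()


print(orphan_insert_accepted(enforce=False))
print(orphan_insert_accepted(enforce=True))
```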

### Inner Join

In [6]:
%%sql 
INSERT INTO Students(sid, name, school, age, gpa)
VALUES  ('1002', 'Aiden', 'UBC', 19, 3.5),
        ('1003', 'Alice', 'SFU', 18, 3.7),
        ('1004', 'Bob', 'UBC', 22, 3.1),
        ('1005', 'David', 'SFU', 20, 3.2),
        ('1006', 'John', 'SFU', 21, 3.1),
        ('1007', 'Mary', 'UBC', 21, 3.4),
        ('1008', 'Mike', 'SFU', 24, 3.1),
        ('1009', 'Sarah', 'UBC', 18, 3.0);
        
SELECT * FROM Students

sid,name,school,age,gpa
1002,Aiden,UBC,19,3.5
1003,Alice,SFU,18,3.7
1004,Bob,UBC,22,3.1
1005,David,SFU,20,3.2
1006,John,SFU,21,3.1
1007,Mary,UBC,21,3.4
1008,Mike,SFU,24,3.1
1009,Sarah,UBC,18,3.0


In [7]:
%%sql
INSERT INTO Enrolled(stid, cid, grade)
VALUES  ('1003', '120','A'),
        ('1003', '125','B'),
        ('1003', '150','A');
SELECT * FROM Enrolled;

stid,cid,grade
1003,120,A
1003,125,B
1003,150,A


In [8]:
%%sql
SELECT name, cid FROM Students INNER JOIN Enrolled ON sid = stid;

name,cid
Alice,120
Alice,125
Alice,150


Why do you see each name multiple times?

### Left Outer Join



In [9]:
%%sql
SELECT name, cid FROM Students LEFT OUTER JOIN Enrolled ON sid = stid;

name,cid
Aiden,
Alice,120.0
Alice,125.0
Alice,150.0
Bob,
David,
John,
Mary,
Mike,
Sarah,


SQLite supports the 'RIGHT JOIN' and 'FULL OUTER JOIN' clauses as of version 3.39.0, but both can also be expressed using 'LEFT OUTER JOIN', which is useful on older versions.

You can implement 'RIGHT OUTER JOIN' by swapping the order of the two tables in a 'LEFT OUTER JOIN'.

In [10]:
%%sql
SELECT name, cid FROM Enrolled LEFT OUTER JOIN Students ON sid = stid;

name,cid
Alice,120
Alice,125
Alice,150
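
With the data above, every enrollment has a matching student, so the swapped query looks the same as the inner join. A small sketch can make the difference visible. It assumes Python's built-in `sqlite3` module, a trimmed-down schema, and one hypothetical enrollment row (`'9999'`) that has no matching student. The native `RIGHT OUTER JOIN` is only attempted when the linked SQLite library is 3.39.0 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (sid CHAR(11) PRIMARY KEY, name CHAR(20) NOT NULL);
CREATE TABLE Enrolled (stid CHAR(20), cid CHAR(20), grade CHAR(5),
                       PRIMARY KEY (stid, cid));
INSERT INTO Students VALUES ('1002', 'Aiden'), ('1003', 'Alice');
INSERT INTO Enrolled VALUES ('1003', '120', 'A'), ('1003', '125', 'B'),
                            ('9999', '200', 'C');  -- hypothetical orphan row
""")

# Enrolled is on the left, so all of its rows are kept.
swapped = conn.execute(
    "SELECT name, cid FROM Enrolled LEFT OUTER JOIN Students "
    "ON sid = stid ORDER BY cid"
).fetchall()

# Compare with the native clause where the SQLite library supports it.
native = None
if sqlite3.sqlite_version_info >= (3, 39, 0):
    native = conn.execute(
        "SELECT name, cid FROM Students RIGHT OUTER JOIN Enrolled "
        "ON sid = stid ORDER BY cid"
    ).fetchall()

print(swapped)
print(native)
```

The orphan enrollment shows up with a NULL name (`None` in Python), which an inner join would have dropped.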


You can implement 'FULL OUTER JOIN' by taking the 'UNION' of the two queries above:

In [11]:
%%sql
SELECT name, cid FROM Students LEFT OUTER JOIN Enrolled ON sid = stid
UNION
SELECT name, cid FROM Enrolled LEFT OUTER JOIN Students ON sid = stid;

name,cid
Aiden,
Alice,120.0
Alice,125.0
Alice,150.0
Bob,
David,
John,
Mary,
Mike,
Sarah,
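
One caveat with the 'UNION' approach is that 'UNION' removes duplicate rows, whereas a true full outer join would keep them. A common alternative is 'UNION ALL' with a filter that keeps only the unmatched rows from the second query. The sketch below compares both on a trimmed-down schema; the Python `sqlite3` setup and the orphan enrollment `'9999'` are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (sid CHAR(11) PRIMARY KEY, name CHAR(20) NOT NULL);
CREATE TABLE Enrolled (stid CHAR(20), cid CHAR(20), grade CHAR(5),
                       PRIMARY KEY (stid, cid));
INSERT INTO Students VALUES ('1002', 'Aiden'), ('1003', 'Alice');
INSERT INTO Enrolled VALUES ('1003', '120', 'A'), ('1003', '125', 'B'),
                            ('9999', '200', 'C');  -- hypothetical orphan row
""")

# Variant 1: UNION of the two left joins (duplicates are removed).
union_rows = conn.execute("""
SELECT name, cid FROM Students LEFT OUTER JOIN Enrolled ON sid = stid
UNION
SELECT name, cid FROM Enrolled LEFT OUTER JOIN Students ON sid = stid
""").fetchall()

# Variant 2: UNION ALL, adding only enrollments without a matching student.
union_all_rows = conn.execute("""
SELECT name, cid FROM Students LEFT OUTER JOIN Enrolled ON sid = stid
UNION ALL
SELECT name, cid FROM Enrolled LEFT OUTER JOIN Students ON sid = stid
WHERE sid IS NULL
""").fetchall()

# Native clause, only where the SQLite library is new enough.
native_rows = None
if sqlite3.sqlite_version_info >= (3, 39, 0):
    native_rows = conn.execute(
        "SELECT name, cid FROM Students FULL OUTER JOIN Enrolled ON sid = stid"
    ).fetchall()

print(set(union_rows) == set(union_all_rows))
```

On this data both variants return the same four rows: the unmatched student, Alice's two enrollments, and the unmatched enrollment.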


### Distinct

In [12]:
%%sql
SELECT DISTINCT gpa FROM Students;

gpa
3.5
3.7
3.1
3.2
3.4
3.0


What if you also ask for the attribute name? Will the query return every distinct combination of gpa and name, or will it return only the first student found for each distinct gpa value?

In [13]:
%%sql
SELECT DISTINCT gpa, name FROM Students;

gpa,name
3.5,Aiden
3.7,Alice
3.1,Bob
3.2,David
3.1,John
3.4,Mary
3.1,Mike
3.0,Sarah
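
As the output above suggests, 'DISTINCT' applies to the whole selected row rather than to the first column only. The sketch below counts the distinct rows for a few column lists. It assumes Python's built-in `sqlite3` module, an in-memory copy of the Students data, and a hypothetical helper named `count_distinct`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (sid CHAR(11) PRIMARY KEY, name CHAR(20) NOT NULL,
                       school CHAR(10), age INTEGER, gpa REAL);
INSERT INTO Students VALUES
  ('1002', 'Aiden', 'UBC', 19, 3.5), ('1003', 'Alice', 'SFU', 18, 3.7),
  ('1004', 'Bob',   'UBC', 22, 3.1), ('1005', 'David', 'SFU', 20, 3.2),
  ('1006', 'John',  'SFU', 21, 3.1), ('1007', 'Mary',  'UBC', 21, 3.4),
  ('1008', 'Mike',  'SFU', 24, 3.1), ('1009', 'Sarah', 'UBC', 18, 3.0);
""")


def count_distinct(columns: str) -> int:
    """Number of rows returned by SELECT DISTINCT <columns> FROM Students."""
    query = f"SELECT DISTINCT {columns} FROM Students"
    return len(conn.execute(query).fetchall())


print(count_distinct("gpa"))          # distinct gpa values
print(count_distinct("gpa, name"))    # distinct (gpa, name) pairs
print(count_distinct("gpa, school"))  # distinct (gpa, school) pairs
```

John and Mike share both gpa 3.1 and school SFU, so the pair (3.1, SFU) is counted once in the last query.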


### Aggregation

In [14]:
%%sql
SELECT count(*) FROM Students;

count(*)
8


In [15]:
%%sql
SELECT count(DISTINCT gpa) FROM Students;

count(DISTINCT gpa)
6


In [16]:
%%sql
SELECT SUM(gpa) FROM Students;

SUM(gpa)
26.1


In [17]:
%%sql
SELECT AVG(gpa) FROM Students;

AVG(gpa)
3.2625


In [18]:
%%sql
SELECT MIN(gpa) FROM Students;

MIN(gpa)
3.0


In [19]:
%%sql
SELECT MAX(gpa) FROM Students

MAX(gpa)
3.7


In [20]:
%%sql
SELECT COUNT(DISTINCT gpa) FROM Students;

COUNT(DISTINCT gpa)
6


In [21]:
%%sql
SELECT SUM(DISTINCT gpa) FROM Students;

SUM(DISTINCT gpa)
19.9


In [22]:
%%sql
SELECT AVG(gpa) FROM Students WHERE age = 19;

AVG(gpa)
3.5


Note that in some RDBMSs, incorrect use of aggregation in a query might not raise an error, but it will not return a meaningful or useful answer:

In [23]:
%%sql
SELECT name, AVG(gpa) FROM Students;

name,AVG(gpa)
Aiden,3.2625


In [24]:
%%sql
SELECT name, AVG(DISTINCT gpa) FROM Students;

name,AVG(DISTINCT gpa)
Aiden,3.3166666666666664


In [25]:
%%sql
SELECT sid, count(DISTINCT gpa) FROM Students;

sid,count(DISTINCT gpa)
1002,6


### Grouping

In [26]:
%%sql
SELECT AVG(gpa) FROM Students GROUP BY age;

AVG(gpa)
3.35
3.5
3.2
3.25
3.1
3.1


In [27]:
%%sql
SELECT age, avg(gpa) FROM Students WHERE gpa > 2.5 GROUP BY age;

age,avg(gpa)
18,3.35
19,3.5
20,3.2
21,3.25
22,3.1
24,3.1
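
Every gpa in the table is at least 3.0, so the condition `gpa > 2.5` above does not remove any row before grouping. To see that 'WHERE' really filters rows before the groups are formed, the sketch below compares group sizes under two thresholds. The Python `sqlite3` setup, the helper `group_sizes`, and the second threshold of 3.1 are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (sid CHAR(11) PRIMARY KEY, name CHAR(20) NOT NULL,
                       school CHAR(10), age INTEGER, gpa REAL);
INSERT INTO Students VALUES
  ('1002', 'Aiden', 'UBC', 19, 3.5), ('1003', 'Alice', 'SFU', 18, 3.7),
  ('1004', 'Bob',   'UBC', 22, 3.1), ('1005', 'David', 'SFU', 20, 3.2),
  ('1006', 'John',  'SFU', 21, 3.1), ('1007', 'Mary',  'UBC', 21, 3.4),
  ('1008', 'Mike',  'SFU', 24, 3.1), ('1009', 'Sarah', 'UBC', 18, 3.0);
""")


def group_sizes(min_gpa: float) -> dict:
    """Map each age to the number of students with gpa above min_gpa."""
    rows = conn.execute(
        "SELECT age, COUNT(*) FROM Students WHERE gpa > ? GROUP BY age",
        (min_gpa,),
    ).fetchall()
    return dict(rows)


loose = group_sizes(2.5)   # no row is filtered out
strict = group_sizes(3.1)  # students with gpa 3.0 or 3.1 are removed first

print(loose)
print(strict)
```

With the stricter filter, the groups for ages 22 and 24 disappear entirely because none of their rows survive the 'WHERE' clause.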


### Having

In [28]:
%%sql
SELECT AVG(gpa), age
FROM Students 
WHERE gpa > 2.5
GROUP BY age
HAVING COUNT(*) >= 2;

AVG(gpa),age
3.35,18
3.25,21


In [29]:
%%sql
SELECT AVG(gpa), age
FROM Students 
WHERE gpa > 2.5
GROUP BY age
HAVING age>= 24;

AVG(gpa),age
3.1,24


In [30]:
%%sql
SELECT AVG(gpa), age
FROM Students 
WHERE gpa > 2.5
GROUP BY age
HAVING COUNT(DISTINCT gpa)>= 2;

AVG(gpa),age
3.35,18
3.25,21
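
The three queries above use 'HAVING' to filter groups after aggregation, while 'WHERE' filters individual rows before grouping. A condition on the grouping column alone, such as `age >= 24`, can be written either way, but a condition on an aggregate such as `COUNT(*)` is only valid in 'HAVING'. The sketch below checks both points, assuming Python's built-in `sqlite3` module and an in-memory copy of the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (sid CHAR(11) PRIMARY KEY, name CHAR(20) NOT NULL,
                       school CHAR(10), age INTEGER, gpa REAL);
INSERT INTO Students VALUES
  ('1002', 'Aiden', 'UBC', 19, 3.5), ('1003', 'Alice', 'SFU', 18, 3.7),
  ('1004', 'Bob',   'UBC', 22, 3.1), ('1005', 'David', 'SFU', 20, 3.2),
  ('1006', 'John',  'SFU', 21, 3.1), ('1007', 'Mary',  'UBC', 21, 3.4),
  ('1008', 'Mike',  'SFU', 24, 3.1), ('1009', 'Sarah', 'UBC', 18, 3.0);
""")

# Condition on the grouping column, applied to groups after aggregation.
having_rows = conn.execute(
    "SELECT age, AVG(gpa) FROM Students WHERE gpa > 2.5 "
    "GROUP BY age HAVING age >= 24"
).fetchall()

# The same condition applied to rows before grouping.
where_rows = conn.execute(
    "SELECT age, AVG(gpa) FROM Students WHERE gpa > 2.5 AND age >= 24 "
    "GROUP BY age"
).fetchall()

# An aggregate inside WHERE is rejected by SQLite.
try:
    conn.execute("SELECT age FROM Students WHERE COUNT(*) >= 2 GROUP BY age")
    where_error = None
except sqlite3.OperationalError as exc:
    where_error = str(exc)

print(having_rows, where_rows)
print(where_error)
```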


### Additional Practice: SWITCH CASE
Using the queries below, try to figure out the format and usage of the SQL 'CASE' expression, which plays the role of a switch-case statement.

In [31]:
%%sql
SELECT sid, name,
CASE
    WHEN gpa >= 3.5 THEN 'A'
    WHEN gpa < 3.5 AND gpa >= 3 THEN 'B'
    WHEN gpa < 3 AND gpa >= 2.5 THEN 'C'
    WHEN gpa < 2.5 AND gpa >= 2 THEN 'D'
    WHEN gpa < 2 THEN 'F'
END AS lettergrade
FROM Students;

sid,name,lettergrade
1002,Aiden,A
1003,Alice,A
1004,Bob,B
1005,David,B
1006,John,B
1007,Mary,B
1008,Mike,B
1009,Sarah,B


In [32]:
%%sql
SELECT sid, name, 'A' AS lettergrade FROM Students WHERE gpa >=3.5
UNION
SELECT sid, name, 'B' AS lettergrade FROM Students WHERE gpa < 3.5 AND gpa >= 3
UNION
SELECT sid, name, 'C' AS lettergrade FROM Students WHERE gpa < 3 AND gpa >= 2.5
UNION
SELECT sid, name, 'D' AS lettergrade FROM Students WHERE gpa < 2.5 AND gpa >= 2
UNION
SELECT sid, name, 'F' AS lettergrade FROM Students WHERE gpa < 2

sid,name,lettergrade
1002,Aiden,A
1003,Alice,A
1004,Bob,B
1005,David,B
1006,John,B
1007,Mary,B
1008,Mike,B
1009,Sarah,B
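
Because the 'WHEN' branches of a 'CASE' expression are tested in order and the first match wins, the upper bounds in the first query are not strictly needed, and an 'ELSE' branch can catch everything that is left. The sketch below is one possible shorter form, run through Python's built-in `sqlite3` module on an in-memory copy of the data. Note one behavioral difference: a NULL gpa would fall into 'ELSE' and be graded 'F' here, whereas the original query would return NULL for it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (sid CHAR(11) PRIMARY KEY, name CHAR(20) NOT NULL,
                       school CHAR(10), age INTEGER, gpa REAL);
INSERT INTO Students VALUES
  ('1002', 'Aiden', 'UBC', 19, 3.5), ('1003', 'Alice', 'SFU', 18, 3.7),
  ('1004', 'Bob',   'UBC', 22, 3.1), ('1005', 'David', 'SFU', 20, 3.2),
  ('1006', 'John',  'SFU', 21, 3.1), ('1007', 'Mary',  'UBC', 21, 3.4),
  ('1008', 'Mike',  'SFU', 24, 3.1), ('1009', 'Sarah', 'UBC', 18, 3.0);
""")

# Branches are checked top to bottom; the first true condition is used.
letter_grades = conn.execute("""
SELECT name,
       CASE
           WHEN gpa >= 3.5 THEN 'A'
           WHEN gpa >= 3   THEN 'B'
           WHEN gpa >= 2.5 THEN 'C'
           WHEN gpa >= 2   THEN 'D'
           ELSE 'F'  -- also reached when gpa is NULL
       END AS lettergrade
FROM Students
ORDER BY sid
""").fetchall()

print(letter_grades)
```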


### Clean up Steps

In [None]:
%%sql
DROP TABLE Enrolled;
DROP TABLE Students;
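
If the clean-up cell is run twice, the plain 'DROP TABLE' statements fail the second time because the tables no longer exist. A possible variant uses 'DROP TABLE IF EXISTS', sketched below with Python's built-in `sqlite3` module on an in-memory database with a trimmed-down schema (both are assumptions for illustration). Enrolled is dropped first because it references Students.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (sid CHAR(11) PRIMARY KEY, name CHAR(20) NOT NULL);
CREATE TABLE Enrolled (stid CHAR(20), cid CHAR(20), grade CHAR(5),
                       PRIMARY KEY (stid, cid),
                       FOREIGN KEY (stid) REFERENCES Students(sid));
""")

CLEANUP = """
DROP TABLE IF EXISTS Enrolled;
DROP TABLE IF EXISTS Students;
"""
conn.executescript(CLEANUP)
conn.executescript(CLEANUP)  # running it again does not raise an error

remaining = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
print(remaining)
```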