# Lab 1 - SQL

*Objective:* to practice writing SQL queries.

To run this lab as a `jupyter` notebook, you can download it
[here](lab1.zip) (the zip file contains the notebook and the
database).

## Background

We have a database that handles the academic achievements of
students at LTH. It contains three tables:

* `students` -- contains student data:
   - `ssn` -- social security number ('personnummer')
   - `first_name`
   - `last_name`


* `courses` -- describes the courses:
   - `course_code`
   - `course_name`
   - `level` ("G1", "G2", or "A")
   - `credits`

  
* `taken_courses` -- keeps track of which courses the
   students have taken; once a student has passed a course,
   we add a row to this table:
   - `ssn` -- the social security number of the student
   - `course_code` -- what course has been taken
   - `grade`

![There should be an image here](lab1.png)

Some sample data:

~~~ {.text}
ssn           first_name   last_name
---           ----------   ---------
861103-2438   Bo           Ek
911212-1746   Eva          Alm
950829-1848   Anna         Nyström
...           ...          ...

course_code   course_name                   level    credits
-----------   -----------                   -----    -------
EDA016        Programmeringsteknik          G1       7.5
EDAA01        Programmeringsteknik - FK     G1       7.5
EDA230        Optimerande kompilatorer      A        7.5
...           ...                           ...      ...

ssn           course_code   grade
---           -----------   -----
861103-2438   EDA016        4
861103-2438   EDAA01        3
911212-1746   EDA016        3
...           ...           ...
~~~


The tables have been created with the following SQL
statements:

~~~ {.sql}
CREATE TABLE students (
  ssn          CHAR(11),
  first_name   TEXT NOT NULL,
  last_name    TEXT NOT NULL,
  PRIMARY KEY  (ssn)
);

CREATE TABLE courses (
  course_code   CHAR(6),
  course_name   TEXT NOT NULL,
  level         CHAR(2),
  credits       DOUBLE NOT NULL CHECK (credits > 0),
  PRIMARY KEY   (course_code)
);

CREATE TABLE taken_courses (
  ssn           CHAR(11),
  course_code   CHAR(6),
  grade         INTEGER NOT NULL CHECK (grade >= 3 AND grade <= 5),
  PRIMARY KEY   (ssn, course_code),
  FOREIGN KEY   (ssn) REFERENCES students(ssn),
  FOREIGN KEY   (course_code) REFERENCES courses(course_code)
);
~~~
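
To see what the constraints in these statements do, the following sketch creates the three tables in a throwaway in-memory SQLite database with Python's `sqlite3` module and tries a few inserts. The helper `accepted` and the course code `XXX000` are made up for illustration. Note that SQLite only enforces `FOREIGN KEY` constraints once `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import sqlite3

# Throwaway in-memory database; the lab itself uses a database file.
conn = sqlite3.connect(":memory:")
# SQLite ignores FOREIGN KEY constraints unless this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE students (
  ssn          CHAR(11),
  first_name   TEXT NOT NULL,
  last_name    TEXT NOT NULL,
  PRIMARY KEY  (ssn)
);
CREATE TABLE courses (
  course_code   CHAR(6),
  course_name   TEXT NOT NULL,
  level         CHAR(2),
  credits       DOUBLE NOT NULL CHECK (credits > 0),
  PRIMARY KEY   (course_code)
);
CREATE TABLE taken_courses (
  ssn           CHAR(11),
  course_code   CHAR(6),
  grade         INTEGER NOT NULL CHECK (grade >= 3 AND grade <= 5),
  PRIMARY KEY   (ssn, course_code),
  FOREIGN KEY   (ssn) REFERENCES students(ssn),
  FOREIGN KEY   (course_code) REFERENCES courses(course_code)
);
INSERT INTO students VALUES ('861103-2438', 'Bo', 'Ek');
INSERT INTO courses VALUES
  ('EDA016', 'Programmeringsteknik', 'G1', 7.5),
  ('EDAA01', 'Programmeringsteknik - FK', 'G1', 7.5);
""")


def accepted(row):
    """Return True if the row can be inserted into taken_courses."""
    try:
        conn.execute("INSERT INTO taken_courses VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


attempts = [
    ("861103-2438", "EDA016", 4),  # valid row
    ("861103-2438", "EDA016", 5),  # same (ssn, course_code) again
    ("861103-2438", "EDAA01", 2),  # grade outside 3..5
    ("861103-2438", "XXX000", 3),  # course code not in courses (hypothetical)
]
results = [accepted(row) for row in attempts]
print(results)  # [True, False, False, False]
```

Only the first row is stored; the other three are rejected by the primary key, the `CHECK` on `grade`, and the foreign key to `courses`, respectively.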


All courses offered in the "Computer Science and
Engineering" program at LTH during the academic year 2013/14
are in the table `courses`. The database has also been
filled with made-up data. SQL statements like the following
were used to insert the data:

~~~ {.sql}
INSERT INTO students (ssn, first_name, last_name)
VALUES ('950705-2308', 'Anna', 'Johansson'),
       ('930702-3582', 'Anna', 'Johansson'),
       ('911212-1746', 'Eva', 'Alm'),
       ('910707-3787', 'Eva', 'Nilsson'),
       ...
~~~


## Assignments

In [2]:
%load_ext sql

In [3]:
%sql sqlite:///lab1.sqlite

'Connected: None@lab1.sqlite'

The tables `students`, `courses` and `taken_courses` already
exist in your database. If you change the contents of the
tables, you can always recreate them with the following
command (at the shell prompt):

~~~ {.sh}
sqlite3 lab1.sqlite < setup-lab1-db.sql
~~~


After some of the questions there is a number in brackets.
This is the number of rows the query should return. For
instance, [72] after question a) means that there are 72
students in the database.
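
One way to compare a result with the bracketed number is to count the rows a query returns. Here is a minimal sketch using Python's `sqlite3` module and a three-row toy `students` table built from the sample data above, so the count is 3 rather than 72. `row_count` is a hypothetical helper, not part of the lab material.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE students (
      ssn          CHAR(11),
      first_name   TEXT NOT NULL,
      last_name    TEXT NOT NULL,
      PRIMARY KEY  (ssn)
    )
""")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [
        ("861103-2438", "Bo", "Ek"),
        ("911212-1746", "Eva", "Alm"),
        ("950829-1848", "Anna", "Nyström"),
    ],
)


def row_count(conn, query):
    """Count the rows produced by `query` by wrapping it in a subquery.

    The query must be a single SELECT without a trailing semicolon.
    """
    return conn.execute(f"SELECT COUNT(*) FROM ({query})").fetchone()[0]


n = row_count(conn, "SELECT first_name, last_name FROM students")
print(n)  # 3
```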

a) What are the names (first name, last name) of all the
   students? [72]

In [3]:
%%sql
SELECT first_name, last_name
FROM students

Done.


first_name,last_name
Anna,Johansson
Anna,Johansson
Eva,Alm
Eva,Nilsson
Elaine,Robertson
Maria,Nordman
Helena,Troberg
Lotta,Emanuelsson
Anna,Nyström
Maria,Andersson


b) Same as question a) but produce a sorted listing. Sort
   first by last name and then by first name.

In [4]:
%%sql
SELECT first_name, last_name
FROM students
ORDER BY last_name, first_name

Done.


first_name,last_name
Daniel,Ahlman
Eva,Alm
Martin,Alm
Erik,Andersson
Erik,Andersson
Maria,Andersson
Niklas,Andersson
Märit,Aspegren
Daniel,Axelsson
Henrik,Berg


c) What are the names of the students who were born in 1985?
   [4]

In [6]:
%%sql
SELECT first_name, last_name, ssn
FROM students
WHERE SUBSTR(ssn, 1, 2) = '85'

Done.


first_name,last_name,ssn
Ulrika,Jonsson,850706-2762
Bo,Ek,850819-2139
Filip,Persson,850517-2597
Henrik,Berg,850208-1213


d) The next-to-last digit in the social security number is
   even for females, and odd for males. List the names of
   all female students in our database. Hint: the `SUBSTR`
   function can be useful. [26]

In [24]:
%%sql
SELECT first_name, last_name, ssn
FROM students
WHERE SUBSTR(ssn, 10, 1) IN ('0', '2', '4', '6', '8')


Done.


first_name,last_name,ssn
Anna,Johansson,950705-2308
Anna,Johansson,930702-3582
Eva,Alm,911212-1746
Eva,Nilsson,910707-3787
Elaine,Robertson,931213-2824
Maria,Nordman,951122-1048
Helena,Troberg,910308-1826
Lotta,Emanuelsson,941003-1225
Anna,Nyström,950829-1848
Maria,Andersson,860819-2864


e) How many students are registered in the database?

In [25]:
%%sql
SELECT COUNT(*) AS Total
FROM students

Done.


Total
72


f) Which courses are offered by the department of
   Mathematics (their course codes have the form `FMAxxx`)?
   [22]

In [47]:
%%sql
SELECT *
FROM courses
WHERE course_code LIKE 'FMA%'

Done.


course_code,course_name,level,credits
FMA021,Kontinuerliga system,A,7.5
FMA051,Optimering,A,6.0
FMA091,Diskret matematik,G1,6.0
FMA111,Matematiska strukturer,A,6.0
FMA120,Matristeori,A,6.0
FMA125,"Matristeori, projektdel",A,3.0
FMA135,Geometri,G1,6.0
FMA140,Olinjära dynamiska system,A,6.0
FMA145,"Olinjära dynamiska system, projektdel",A,3.0
FMA170,Bildanalys,A,6.0


g) Which courses give more than 7.5 credits? [16]

In [28]:
%%sql
SELECT * 
FROM courses
WHERE credits > 7.5

Done.


course_code,course_name,level,credits
EDA270,Coachning av programvaruteam,A,9.0
EDAA05,Datorer i system,G1,8.0
EIEF01,Tillämpad mekatronik,G2,10.0
EIEN01,"Mekatronik, industriell produktframtagning",A,10.0
EIT020,Digitalteknik,G2,9.0
EITF01,Digitala bilder – kompression,G2,9.0
ESS050,Elektromagnetisk fältteori,G2,9.0
ETIA01,Elektronik,G1,8.0
EXTA35,Introduktionskurs i kinesiska för civilingenjörer,G1,15.0
EXTF60,"Introduktionskurs i kinesiska för civilingenjörer, del 2",G2,15.0


h) How many courses are there at each level (`G1`, `G2`, and
   `A`)?

In [30]:
%%sql
SELECT level, COUNT()
FROM courses
GROUP BY level

Done.


level,COUNT()
A,87
G1,31
G2,60


i) Which courses (course codes only) have been taken by the
   student with social security number 910101-1234? [35]

In [58]:
%%sql
SELECT course_code
FROM taken_courses
WHERE ssn = '910101-1234'

Done.


course_code
EDA070
EDA385
EDAA25
EDAF05
EEMN10
EIT020
EIT060
EITF40
EITN40
EITN50


j) What are the names of these courses, and how many credits
   do they give?

In [49]:
%%sql
SELECT *
FROM courses
WHERE course_code IN (
    SELECT course_code
    FROM taken_courses
    WHERE ssn = '910101-1234')

Done.


course_code,course_name,level,credits
EDA070,Datorer och datoranvändning,G1,3.0
EDA385,"Konstruktion av inbyggda system, fördjupningskurs",A,7.5
EDAA25,C-programmering,G1,3.0
EDAF05,"Algoritmer, datastrukturer och komplexitet",G2,5.0
EEMN10,Datorbaserade mätsystem,A,7.5
EIT020,Digitalteknik,G2,9.0
EIT060,Datasäkerhet,G1,7.5
EITF40,Digitala och analoga projekt,G2,7.5
EITN40,Avancerad webbsäkerhet,A,4.0
EITN50,Avancerad datasäkerhet,A,7.5


k) How many credits has the student taken?

In [62]:
%%sql
SELECT COUNT(), SUM(credits)
FROM courses
WHERE course_code IN (
    SELECT course_code
    FROM taken_courses
    WHERE ssn = '910101-1234'
)

Done.


COUNT(),SUM(credits)
35,249.5


l) What is the student’s grade average?

In [64]:
%%sql
SELECT AVG(grade), count()
FROM taken_courses
WHERE ssn = '910101-1234'

Done.


AVG(grade),count()
4.0285714285714285,35


m) Which students have taken 0 credits? [11]

In [26]:
%%sql
SELECT first_name, last_name
FROM students
WHERE ssn NOT IN (
    SELECT ssn
    FROM taken_courses
)

Done.


first_name,last_name
Anna,Nyström
Caroline,Olsson
Bo,Ek
Erik,Andersson
Erik,Andersson
Johan,Lind
Filip,Persson
Jonathan,Jönsson
Magnus,Hultgren
Joakim,Hall


n) List the names and average grades of the 10 students with
   the highest grade average.

In [38]:
%%sql

SELECT first_name, last_name, AVG(grade)
FROM taken_courses
JOIN students
USING (ssn)
GROUP BY ssn
ORDER BY AVG(grade) DESC
LIMIT 10

Done.


first_name,last_name,AVG(grade)
Bo,Ek,4.35
Helena,Troberg,4.307692307692308
Elaine,Robertson,4.235294117647059
Anna,Johansson,4.230769230769231
Ylva,Jacobsson,4.21875
Anna,Johansson,4.2
Mikael,Nilsson,4.173913043478261
Jakob,Malmberg,4.166666666666667
Maria,Andersson,4.157894736842105
Per-Erik,Pettersson,4.153846153846154


o) List the social security number and total number of
   credits for all students. Students with no credits should
   be included with 0 credits, not `NULL`. If you do this with
   an outer join you might want to use the function
   `COALESCE(v1, v2, ...)`; it returns the first value which
   is not `NULL`. (It is a little bit tricky to get this
   query right; if you're missing the students with 0
   credits, don't worry, your TA will help you get it
   right.) [72]

In [50]:
%%sql
SELECT first_name, last_name, SUM(credits)
FROM students
JOIN taken_courses
USING (ssn)
JOIN courses
USING(course_code)
GROUP BY ssn

Done.


first_name,last_name,SUM(credits)
Henrik,Berg,166.5
Ulrika,Jonsson,30.0
Bo,Ek,76.5
Eva,Hjort,151.0
Niklas,Andersson,70.5
Maria,Andersson,140.5
Bo,Ek,153.0
Marie,Persson,254.0
Martin,Alm,250.5
Susanne,Dahl,348.5
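
An inner join only returns students that have at least one row in `taken_courses`, so students without credits disappear from the result. A possible fix is to use `LEFT JOIN` and wrap the sum in `COALESCE`. It is sketched here with Python's `sqlite3` module on a toy database built from the sample data above, with a simplified schema that leaves out most constraints.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (ssn CHAR(11) PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE courses (course_code CHAR(6) PRIMARY KEY, course_name TEXT,
                      level CHAR(2), credits DOUBLE);
CREATE TABLE taken_courses (ssn CHAR(11), course_code CHAR(6), grade INTEGER,
                            PRIMARY KEY (ssn, course_code));
INSERT INTO students VALUES
  ('861103-2438', 'Bo', 'Ek'),
  ('911212-1746', 'Eva', 'Alm'),
  ('950829-1848', 'Anna', 'Nyström');
INSERT INTO courses VALUES
  ('EDA016', 'Programmeringsteknik', 'G1', 7.5),
  ('EDAA01', 'Programmeringsteknik - FK', 'G1', 7.5);
INSERT INTO taken_courses VALUES
  ('861103-2438', 'EDA016', 4),
  ('861103-2438', 'EDAA01', 3),
  ('911212-1746', 'EDA016', 3);
""")

# Inner joins: a student without passed courses matches nothing and is dropped.
inner = conn.execute("""
    SELECT ssn, SUM(credits)
    FROM students
    JOIN taken_courses USING (ssn)
    JOIN courses USING (course_code)
    GROUP BY ssn
    ORDER BY ssn
""").fetchall()

# Left joins keep every student; SUM over no credits is NULL, which
# COALESCE replaces with 0.
outer = conn.execute("""
    SELECT ssn, COALESCE(SUM(credits), 0) AS total_credits
    FROM students
    LEFT JOIN taken_courses USING (ssn)
    LEFT JOIN courses USING (course_code)
    GROUP BY ssn
    ORDER BY ssn
""").fetchall()

print(len(inner), len(outer))  # 2 3
print(outer[-1])               # ('950829-1848', 0)
```

Both joins have to be outer joins: if the second one were an inner join, the rows with a `NULL` course code would be filtered out again.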


p) Is there more than one student with the same name? If so,
   who are these students and what are their social security
   numbers? [7]

In [64]:
%%sql
SELECT *
FROM students AS s1
JOIN students AS s2
  ON s1.first_name = s2.first_name
 AND s1.last_name = s2.last_name
 AND s1.ssn <> s2.ssn


Done.


ssn,first_name,last_name,ssn_1,first_name_1,last_name_1
950705-2308,Anna,Johansson,930702-3582,Anna,Johansson
930702-3582,Anna,Johansson,950705-2308,Anna,Johansson
861103-2438,Bo,Ek,850819-2139,Bo,Ek
861103-2438,Bo,Ek,931225-3158,Bo,Ek
931225-3158,Bo,Ek,850819-2139,Bo,Ek
931225-3158,Bo,Ek,861103-2438,Bo,Ek
850819-2139,Bo,Ek,861103-2438,Bo,Ek
850819-2139,Bo,Ek,931225-3158,Bo,Ek
891220-1393,Erik,Andersson,900313-2257,Erik,Andersson
900313-2257,Erik,Andersson,891220-1393,Erik,Andersson
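
A self-join on equal names returns every ordered pair of namesakes, so each student appears once per namesake. If one row per student is wanted, a correlated `EXISTS` subquery is one option. The sketch below uses Python's `sqlite3` module and a toy table with six students taken from the listings above. It is an illustration, not the lab's reference solution.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (ssn CHAR(11) PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [
        ("950705-2308", "Anna", "Johansson"),
        ("930702-3582", "Anna", "Johansson"),
        ("911212-1746", "Eva", "Alm"),
        ("861103-2438", "Bo", "Ek"),
        ("931225-3158", "Bo", "Ek"),
        ("850819-2139", "Bo", "Ek"),
    ],
)

# Self-join: one row per ordered pair of distinct students sharing a name.
pairs = conn.execute("""
    SELECT s1.ssn, s2.ssn
    FROM students AS s1
    JOIN students AS s2
      ON s1.first_name = s2.first_name
     AND s1.last_name = s2.last_name
     AND s1.ssn <> s2.ssn
""").fetchall()

# EXISTS: one row per student who has at least one namesake.
namesakes = conn.execute("""
    SELECT ssn, first_name, last_name
    FROM students AS s1
    WHERE EXISTS (
        SELECT 1
        FROM students AS s2
        WHERE s2.first_name = s1.first_name
          AND s2.last_name = s1.last_name
          AND s2.ssn <> s1.ssn
    )
    ORDER BY last_name, first_name, ssn
""").fetchall()

print(len(pairs), len(namesakes))  # 8 5
```

In the toy data the two students named Anna Johansson give 2 ordered pairs and the three named Bo Ek give 6, which is why the self-join returns 8 rows for 5 students.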


q) What 5 courses have the highest grade average?

In [None]:
%%sql


r) (Not required) What are the 'best' three first initial
   letters of the last names, i.e., if you take the average
   grades for each first letter of the last name, which
   three initials have the highest averages?

In [None]:
%%sql
