# HW3. Relational Algebra and Database Design

## Objectives

In this assignment, you will review relational algebra and design theory, and also write more SQL queries. You will practice: 
 - How to use `Relational Algebra` to describe the SQL queries you have previously written
 - How to use the `Entity Relationship Model` to design a database and translate it into SQL statements for creating tables
 - How to use `Design Theory` to refine a database you have designed
 - How to use `Constraints & Triggers` to ensure the consistency of your data

## Q1 (5 points): Relational Algebra

### Preparation 

To write a relational algebra query in a cell, the cell should be a [Markdown cell](https://jupyter-notebook.readthedocs.io/en/stable/examples/Notebook/Working%20With%20Markdown%20Cells.html). You can use [LaTeX equations](https://jupyter-notebook.readthedocs.io/en/stable/examples/Notebook/Working%20With%20Markdown%20Cells.html#LaTeX-equations) in a markdown cell for required algebraic notation. Here is a list of the main operators:

* Selection ($\sigma$)
* Projection ($\pi$)
* Union ($\cup$)
* Intersect ($\cap$)
* Set Difference ($-$) 
* Cross Product ($\times$)
* Rename ($\rho$)
* Join ($\bowtie$)
* Conjunction ($\wedge$)
* Disjunction ($\vee$)
* Greater Than or Equal To ($\geq$)
* Less Than or Equal To ($\leq$)

You may also need $_{Subscript}$ and $^{Superscript}$ in the notations you use.

Consider the same bank database you have used in previous homework assignments.
 - Customer = {<span style="text-decoration:underline">customerID</span>, firstName, lastName, income, birthDate}
 - Account = {<span style="text-decoration:underline">accNumber</span>, type, balance, branchNumber<sup>FK-Branch</sup>}
 - Owns = {<span style="text-decoration:underline">customerID</span><sup>FK-Customer</sup>, <span style="text-decoration:underline">accNumber</span><sup>FK-Account</sup>}
 - Transactions = {<span style="text-decoration:underline">transNumber</span>, <span style="text-decoration:underline">accNumber</span><sup>FK-Account</sup>, amount}
 - Employee = {<span style="text-decoration:underline">sin</span>, firstName, lastName, salary, branchNumber<sup>FK-Branch</sup>}
 - Branch = {<span style="text-decoration:underline">branchNumber</span>, branchName, managerSIN<sup>FK-Employee</sup>, budget}
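
To build intuition for the operators listed above, the sketch below models a relation as a list of Python dictionaries and implements selection, projection, and a theta join as small functions. The helper names (`select`, `project`, `theta_join`) and the sample rows are assumptions made for illustration; they are not part of the assignment or of the bank database file.

```python
# Minimal sketch: relations as lists of dicts (hypothetical helpers and data).

def select(relation, predicate):
    """Selection (sigma): keep the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def project(relation, attributes):
    """Projection (pi): keep only the named attributes and drop duplicates."""
    seen = []
    for row in relation:
        reduced = {name: row[name] for name in attributes}
        if reduced not in seen:
            seen.append(reduced)
    return seen

def theta_join(left, right, condition):
    """Theta join: cross product followed by a selection on the condition."""
    return [{**l, **r} for l in left for r in right if condition(l, r)]

# Made-up sample rows that follow the Account and Owns schemas.
account = [
    {"accNumber": 1, "type": "CHQ", "balance": 500, "branchNumber": 1},
    {"accNumber": 2, "type": "SAV", "balance": 150000, "branchNumber": 2},
]
owns = [
    {"customerID": 10, "accNumber": 1},
    {"customerID": 10, "accNumber": 2},
    {"customerID": 11, "accNumber": 2},
]

# pi_{customerID, accNumber}(Owns join_{accNumber} sigma_{balance > 100000}(Account))
rich = select(account, lambda row: row["balance"] > 100000)
joined = theta_join(owns, rich, lambda o, a: o["accNumber"] == a["accNumber"])
owners = project(joined, ["customerID", "accNumber"])
print(owners)
```

Because `project` removes duplicates, the result behaves like a set, which matches relational algebra rather than SQL's default bag semantics.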
 

In each question below, please write the relational algebra expression for the described query.

In [0]:
%load_ext sql

The sql extension is already loaded. To reload it, use:
  %reload_ext sql


In [0]:
%sql sqlite:///bank1.db

'Connected: @bank1.db'

1.1 (1 point) Find out names of the bank branches and first name and last name of their managers.

$ \pi_{branchName, firstName, lastName} ( Branch \bowtie_{managerSIN=sin} Employee) $
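
As a sanity check, the expression for 1.1 can be compared with an equivalent SQL query. The sketch below uses an in-memory SQLite database with a handful of made-up rows, so the table contents are assumptions and not the data in `bank1.db`.

```python
import sqlite3

# In-memory database with hypothetical rows that follow the bank schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (sin INTEGER PRIMARY KEY, firstName TEXT, lastName TEXT,
                       salary INTEGER, branchNumber INTEGER);
CREATE TABLE Branch (branchNumber INTEGER PRIMARY KEY, branchName TEXT,
                     managerSIN INTEGER REFERENCES Employee(sin), budget INTEGER);
INSERT INTO Employee VALUES (100, 'Ada', 'Stone', 90000, 1),
                            (200, 'Ben', 'Reed', 85000, 2),
                            (300, 'Cy', 'Voss', 50000, 1);
INSERT INTO Branch VALUES (1, 'London', 100, 1000000),
                          (2, 'Latveria', 200, 500000);
""")

# The join condition managerSIN = sin becomes the ON clause,
# and the projection list becomes the SELECT list.
rows_1_1 = conn.execute("""
    SELECT b.branchName, e.firstName, e.lastName
    FROM Branch b JOIN Employee e ON b.managerSIN = e.sin
    ORDER BY b.branchName
""").fetchall()
print(rows_1_1)
```

The employee who manages no branch does not appear in the result, which is what the theta join on `managerSIN = sin` implies.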



1.2 (1 point) Show account number, account type, account balance, and transaction amount of the accounts with balance higher than 100,000 and transaction amounts higher than 15,000.

$ \pi_{a.accNumber, a.type, a.balance, t.amount} (\sigma_{t.amount > 15000 \: \wedge \: a.balance > 100000}  \:(Transactions \: t \bowtie_{t.accNumber = a.accNumber} Account \: a)) $
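
A useful property to keep in mind for queries like 1.2 is that a selection over a join can be pushed down to the individual relations without changing the result. The sketch below checks this on a few made-up rows in an in-memory SQLite database; the data is an assumption, and the strict comparisons follow the wording "higher than" in the question.

```python
import sqlite3

# In-memory database with hypothetical rows that follow the bank schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Account (accNumber INTEGER PRIMARY KEY, type TEXT,
                      balance INTEGER, branchNumber INTEGER);
CREATE TABLE Transactions (transNumber INTEGER,
                           accNumber INTEGER REFERENCES Account(accNumber),
                           amount INTEGER,
                           PRIMARY KEY (transNumber, accNumber));
INSERT INTO Account VALUES (1, 'SAV', 250000, 1),
                           (2, 'CHQ', 90000, 1),
                           (3, 'BUS', 400000, 2);
INSERT INTO Transactions VALUES (1, 1, 20000), (2, 1, 500),
                                (1, 2, 30000), (1, 3, 16000);
""")

# Selection applied after the join.
select_after_join = conn.execute("""
    SELECT a.accNumber, a.type, a.balance, t.amount
    FROM Transactions t JOIN Account a ON t.accNumber = a.accNumber
    WHERE t.amount > 15000 AND a.balance > 100000
    ORDER BY a.accNumber, t.amount
""").fetchall()

# Equivalent form: each selection applied to its own relation before the join.
select_before_join = conn.execute("""
    SELECT a.accNumber, a.type, a.balance, t.amount
    FROM (SELECT * FROM Transactions WHERE amount > 15000) t
    JOIN (SELECT * FROM Account WHERE balance > 100000) a
      ON t.accNumber = a.accNumber
    ORDER BY a.accNumber, t.amount
""").fetchall()

print(select_after_join)
print(select_after_join == select_before_join)
```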


1.3 (1 point) Show first name, last name, and income of customers whose income is at least twice the income of any customer whose lastName is Butler. 

$temp1 \; t1 \leftarrow \rho_{maxIncome} (\pi_{MAX(income *2)}(\sigma_{lastName = 'Butler'}(Customer)))$
\
$ \pi_{c.firstName, c.lastName, c.income} (Customer \; c \bowtie_{c.income \geq t1.maxIncome} (temp1 \; t1) ) $
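
The word "any" in 1.3 can be read in two ways: at least twice the income of every Butler (which is what a `MAX` threshold expresses) or of at least one Butler (a `MIN` threshold). The sketch below runs both readings in SQL on made-up rows in an in-memory SQLite database, so the customers shown are assumptions for illustration only.

```python
import sqlite3

# In-memory database with hypothetical customers.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (customerID INTEGER PRIMARY KEY, firstName TEXT,
                       lastName TEXT, income INTEGER, birthDate TEXT);
INSERT INTO Customer (customerID, firstName, lastName, income) VALUES
    (1, 'Amy', 'Butler', 20000),
    (2, 'Bob', 'Butler', 30000),
    (3, 'Cat', 'Lee', 60000),
    (4, 'Dan', 'Ng', 50000),
    (5, 'Eve', 'Kim', 75000);
""")

QUERY = """
    SELECT c.firstName, c.lastName, c.income
    FROM Customer c
    WHERE c.income >= (SELECT {agg}(income * 2)
                       FROM Customer
                       WHERE lastName = 'Butler')
    ORDER BY c.customerID
"""

# "Every Butler" reading: the threshold is twice the highest Butler income.
rows_every = conn.execute(QUERY.format(agg="MAX")).fetchall()
# "Some Butler" reading: the threshold is twice the lowest Butler income.
rows_some = conn.execute(QUERY.format(agg="MIN")).fetchall()

print(rows_every)
print(rows_some)
```

With these rows the two readings differ by one customer, so it is worth stating which interpretation an answer assumes.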



1.4 (2 points) Show Customer ID, income, account numbers and branch numbers of customers with income greater than 90,000 who own an account at both London and Latveria branches.

$temp1 \leftarrow \pi_{o.customerID}(\sigma_{b.branchName = 'London'}(Owns\: o \bowtie_{o.accNumber = a.accNumber}(Account \: a \bowtie_{a.branchNumber = b.branchNumber}(Branch \: b))))\cap \pi_{o.customerID}(\sigma_{b.branchName = 'Latveria'}(Owns\: o \bowtie_{o.accNumber = a.accNumber}(Account \: a \bowtie_{a.branchNumber = b.branchNumber}(Branch \: b))))$
\
\
$ \pi_{c.customerID, c.income, a.accNumber, a.branchNumber}((\sigma_{c.income > 90000} (Customer \; c)) \bowtie_{c.customerID = t1.customerID} (temp1 \; t1) \bowtie_{t1.customerID = o.customerID} (Owns \: o) \bowtie_{o.accNumber = a.accNumber} (Account \: a)) $
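
One subtlety in queries like 1.4 is that the intersection has to be taken over customer IDs only. Intersecting full rows that also carry an account or branch attribute always yields the empty set, because no single row belongs to both branches. The sketch below illustrates this with SQL's `INTERSECT` on made-up rows in an in-memory SQLite database; the data and the CTE name `both_branches` are assumptions.

```python
import sqlite3

# In-memory database with hypothetical rows that follow the bank schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (customerID INTEGER PRIMARY KEY, firstName TEXT,
                       lastName TEXT, income INTEGER, birthDate TEXT);
CREATE TABLE Branch (branchNumber INTEGER PRIMARY KEY, branchName TEXT,
                     managerSIN INTEGER, budget INTEGER);
CREATE TABLE Account (accNumber INTEGER PRIMARY KEY, type TEXT, balance INTEGER,
                      branchNumber INTEGER REFERENCES Branch(branchNumber));
CREATE TABLE Owns (customerID INTEGER REFERENCES Customer(customerID),
                   accNumber INTEGER REFERENCES Account(accNumber),
                   PRIMARY KEY (customerID, accNumber));
INSERT INTO Customer (customerID, firstName, lastName, income) VALUES
    (1, 'Amy', 'Butler', 95000), (2, 'Bob', 'Lee', 120000), (3, 'Cat', 'Ng', 80000);
INSERT INTO Branch (branchNumber, branchName) VALUES
    (1, 'London'), (2, 'Latveria'), (3, 'Berlin');
INSERT INTO Account VALUES (10, 'SAV', 100, 1), (11, 'CHQ', 200, 2),
                           (12, 'SAV', 300, 1), (13, 'CHQ', 400, 3),
                           (14, 'SAV', 500, 1), (15, 'CHQ', 600, 2);
INSERT INTO Owns VALUES (1, 10), (1, 11), (2, 12), (2, 13), (3, 14), (3, 15);
""")

# Intersecting whole rows: no row can be at both branches, so this is empty.
naive = conn.execute("""
    SELECT o.customerID, a.accNumber, a.branchNumber
    FROM Owns o JOIN Account a ON o.accNumber = a.accNumber
                JOIN Branch b ON a.branchNumber = b.branchNumber
    WHERE b.branchName = 'London'
    INTERSECT
    SELECT o.customerID, a.accNumber, a.branchNumber
    FROM Owns o JOIN Account a ON o.accNumber = a.accNumber
                JOIN Branch b ON a.branchNumber = b.branchNumber
    WHERE b.branchName = 'Latveria'
""").fetchall()

# Intersecting customer IDs first, then joining back for the account details.
rows_1_4 = conn.execute("""
    WITH both_branches AS (
        SELECT o.customerID AS customerID
        FROM Owns o JOIN Account a ON o.accNumber = a.accNumber
                    JOIN Branch b ON a.branchNumber = b.branchNumber
        WHERE b.branchName = 'London'
        INTERSECT
        SELECT o.customerID AS customerID
        FROM Owns o JOIN Account a ON o.accNumber = a.accNumber
                    JOIN Branch b ON a.branchNumber = b.branchNumber
        WHERE b.branchName = 'Latveria'
    )
    SELECT c.customerID, c.income, a.accNumber, a.branchNumber
    FROM Customer c
    JOIN both_branches t ON t.customerID = c.customerID
    JOIN Owns o ON o.customerID = c.customerID
    JOIN Account a ON a.accNumber = o.accNumber
    WHERE c.income > 90000
    ORDER BY a.accNumber
""").fetchall()

print(naive)
print(rows_1_4)
```

In this sample, customer 3 owns accounts at both branches but is filtered out by the income condition, and customer 2 passes the income condition but has no Latveria account.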



## Q2 (5 points): DB Design

Imagine we want to build a music database with the following characteristics:
 - An **artist** is known by their name. We also keep an artist's *genre*, *hometown*, *bio*, and *homepage* in the database.
 - An **album** has an artist. It is produced by a *recording company*.
 - An album is known by *name* of the album and the *name* of its artist. We also keep *year*, *number of tracks* (at least one), and the *recording studio* for an album.
 - An album has songs on the album. 
 - A **recording company** is known by its name. We also keep *address*, *homepage*, and *telephone number* for a *recording company*.
 - A **song** is known by its *name*, *name* of its artist, and the *album* it is part of. We also keep *length* and *track number* for the song. A song might have *guest musicians*. A song may have a *tablature*.
 - A **tablature** is known by the *URL*. We keep *date*, *transcriber*, and transcriber *email* for a tablature.
 - A **musician** is known by their *name*. A musician should have an *instrument*. We also keep *hometown* for a musician.
 - Musicians and an artist can be in a group.
 - Artists might influence a musician.
 
 Design the ERM to capture this database. Please note that you do not need to submit your ERM design, but you need to use it to answer questions **Q2**, **Q3**, and **Q4**.

Please list your schemas (not create table statements).

$Artists(\underline{artistName}, genre, hometown, bio, homepage)$

$Albums(\underline{albumName}, \underline{artistName^{Artists}}, year, numTracks, recName^{RecordingCompany})$

$RecordingCompany(\underline{recName}, addr, homepage, tel)$

$Songs(\underline{songName}, \underline{artistName^{Artists}}, \underline{albumName^{Albums}}, length, trackNum, url^{Tablature})$

$GuestMusicians(\underline{musicianName^{Musicians}}, songName^{Songs}, artistName^{Artists}, albumName^{Albums})$

$Tablature(\underline{url}, date, transcriber, email)$

$Musicians(\underline{musicianName}, instrument, hometown)$

$Groups(\underline{artistName^{Artists}}, musicianName^{Musicians})$

$Influence(\underline{musicianName^{Musicians}}, artistName^{Artists})$


## Q3 (5 points): Functional Dependencies & BCNF

Please list the functional dependencies in your relations, ensuring your relations are in BCNF.


$artistName \rightarrow genre, hometown, bio, homepage$

$albumName, artistName \rightarrow year, numTracks, recName$

$recName \rightarrow addr, homepage, tel $

$songName, artistName, albumName \rightarrow length, trackNum, url$

$musicianName \rightarrow songName, artistName, albumName$

$url \rightarrow date, transcriber, email$

$musicianName \rightarrow instrument, hometown$

$artistName \rightarrow musicianName$

$musicianName \rightarrow artistName$
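
A relation is in BCNF when the left-hand side of every nontrivial functional dependency is a superkey, which can be checked mechanically by computing attribute closures. The sketch below is one minimal way to do that; the function names `closure` and `is_bcnf` are hypothetical, and the check only considers the dependencies passed in. The second example relation is a deliberately denormalized variant invented to show a violation.

```python
def closure(attributes, fds):
    """Return the closure of a set of attributes under the given FDs."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire an FD when its whole left side is already derivable.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_bcnf(relation_attrs, fds):
    """Check that every nontrivial FD in fds has a superkey on its left side."""
    attrs = set(relation_attrs)
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue  # trivial dependency, always allowed
        if not attrs <= closure(lhs, fds):
            return False  # left side does not determine the whole relation
    return True

# Albums with the dependency listed in this section.
albums = ["albumName", "artistName", "year", "numTracks", "recName"]
albums_fds = [(("albumName", "artistName"), ("year", "numTracks", "recName"))]

# Hypothetical variant that also stores the company address in Albums.
wide_albums = albums + ["addr"]
wide_fds = albums_fds + [(("recName",), ("addr",))]

print(is_bcnf(albums, albums_fds))
print(is_bcnf(wide_albums, wide_fds))
```

In the variant, `recName` determines `addr` without being a superkey, which is the kind of dependency that a separate `RecordingCompany` relation removes.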








## Q4 (10 points): Constraints & Triggers

Please use proper `CREATE TABLE` statements required to implement the database described above. Ensure your create statements include all required **constraints and triggers**.

In [0]:
%load_ext sql

In [3]:
%sql sqlite:///music.db

'Connected: @music.db'

In [19]:
%%sql

DROP TABLE IF EXISTS GuestMusicians;
DROP TABLE IF EXISTS Groups;
DROP TABLE IF EXISTS Influence;
DROP TABLE IF EXISTS Songs;
DROP TABLE IF EXISTS Albums;
DROP TABLE IF EXISTS Tablature;
DROP TABLE IF EXISTS Musicians;
DROP TABLE IF EXISTS RecordingCompany;
DROP TABLE IF EXISTS Artists;


 * sqlite:///music.db
Done.
Done.
Done.
Done.
Done.
Done.
Done.
Done.
Done.


[]

In [20]:
%%sql
create table Artists(
    artistName TEXT, 
    genre TEXT,
    hometown TEXT,
    bio TEXT,
    homepage TEXT,
    primary key(artistName)
);

create table Albums(
    albumName TEXT, 
    artistName TEXT,
    year YEAR,
    numTracks int,
    recName TEXT,
    primary key(albumName, artistName),
    foreign key(artistName) references Artists(artistName),
    foreign key(recName) references RecordingCompany(recName),
    -- an album has at least one track
    CHECK (numTracks >= 1)
);

create table RecordingCompany(
    recName TEXT, 
    addr TEXT,
    homepage TEXT,
    tel TEXT,
    primary key(recName)
);

create table Songs(
    songName TEXT,
    albumName TEXT, 
    artistName TEXT,
    length TIMESTAMP,
    trackNum int,
    url TEXT,
    primary key(songName, albumName, artistName),
    foreign key(artistName) references Artists(artistName),
    -- a composite foreign key must reference the full primary key of Albums
    foreign key(albumName, artistName) references Albums(albumName, artistName),
    foreign key(url) references Tablature(url)
);

create table GuestMusicians(
    musicianName TEXT,
    songName TEXT,
    albumName TEXT, 
    artistName TEXT,
    primary key(musicianName),
    foreign key(musicianName) references Musicians(musicianName),
    -- the song is identified by the full primary key of Songs
    foreign key(songName, albumName, artistName) references Songs(songName, albumName, artistName)
);

create table Tablature(
    url TEXT, 
    date DATE,
    transcriber TEXT,
    email TEXT,
    primary key(url)
);

create table Musicians(
    musicianName TEXT, 
    -- a musician should have an instrument
    instrument TEXT NOT NULL,
    hometown TEXT,
    primary key(musicianName)
);

create table Groups(
    musicianName TEXT, 
    artistName TEXT,
    primary key(artistName),
    foreign key(artistName) references Artists(artistName),
    foreign key(musicianName) references Musicians(musicianName)

);

create table Influence(
    musicianName TEXT, 
    artistName TEXT,
    primary key(musicianName),
    foreign key(artistName) references Artists(artistName),
    foreign key(musicianName) references Musicians(musicianName)
);



 * sqlite:///music.db
Done.
Done.
Done.
Done.
Done.
Done.
Done.
Done.
Done.


[]
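
SQLite only enforces foreign keys on connections where `PRAGMA foreign_keys = ON` has been issued, and rules that span two tables need a trigger rather than a `CHECK`. The sketch below shows both on a trimmed-down version of `Albums` and `Songs` in an in-memory database. The rule that a song's track number cannot exceed its album's track count, the trigger name `check_track_num`, the helper `try_insert`, and the sample rows are all assumptions chosen for illustration; the artist foreign key is omitted to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
CREATE TABLE Albums (
    albumName TEXT,
    artistName TEXT,
    numTracks INTEGER CHECK (numTracks >= 1),
    PRIMARY KEY (albumName, artistName)
);
CREATE TABLE Songs (
    songName TEXT,
    albumName TEXT,
    artistName TEXT,
    trackNum INTEGER,
    PRIMARY KEY (songName, albumName, artistName),
    FOREIGN KEY (albumName, artistName) REFERENCES Albums(albumName, artistName)
);
-- Hypothetical rule: a track number must lie between 1 and the album's track count.
CREATE TRIGGER check_track_num
BEFORE INSERT ON Songs
WHEN NEW.trackNum < 1 OR NEW.trackNum > (
    SELECT numTracks FROM Albums
    WHERE albumName = NEW.albumName AND artistName = NEW.artistName
)
BEGIN
    SELECT RAISE(ABORT, 'trackNum out of range for this album');
END;
""")

def try_insert(sql, params):
    """Run one INSERT and report whether the database accepted it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

r_album = try_insert("INSERT INTO Albums VALUES (?, ?, ?)", ("Blue", "Ana", 2))
r_good = try_insert("INSERT INTO Songs VALUES (?, ?, ?, ?)", ("Intro", "Blue", "Ana", 1))
# Track 3 on a two-track album is stopped by the trigger.
r_track = try_insert("INSERT INTO Songs VALUES (?, ?, ?, ?)", ("Extra", "Blue", "Ana", 3))
# A song on an album that does not exist is stopped by the foreign key.
r_fk = try_insert("INSERT INTO Songs VALUES (?, ?, ?, ?)", ("Lost", "Nope", "Ana", 1))
# An album with zero tracks is stopped by the CHECK constraint.
r_check = try_insert("INSERT INTO Albums VALUES (?, ?, ?)", ("Empty", "Ana", 0))

for result in (r_album, r_good, r_track, r_fk, r_check):
    print(result)
```

A complete solution would likely also need a matching `BEFORE UPDATE` trigger, since this sketch only guards inserts.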

## Submission

**Complete** the code in this notebook [hw3.ipynb](hw3.ipynb) and submit it to the Canvas activity Homework(3). Please note that you can insert additional cells if required for your answers.