# TP3 - Views, Updates and Design

The objectives for this TP are:

1. Create and use SQL Views
2. Update database content
3. Design the database schema for a Social Network

___

For the first 2 parts we will use the **`wine.db`** database and the Tables created in TP2.

A reminder of the wine database schema:

<center>**Master1**</center>

|*Attribute*|         *Description*          |
| -------   |--------------------------------|
| NV        | Wine number                    |
| CRU       | Vineyard or group of vineyards |
| DEGRE     | Alcohol content                |
| MILL      | Vintage year                   |
| QTE       | Number of bottles harvested    |
| NP        | Producer number                |
| NOM       | Producer's last name           |
| PRENOM    | Producer's first name          |
| REGION    | Production region              |

<center>**Master2**</center>

|*Attribute*|                         *Description*                  |
| -------   |--------------------------------------------------------|
| NV        | Wine number                                            |
| CRU       | Vineyard or group of vineyards                         |
| DEGRE     | Alcohol content                                        |
| MILL      | Vintage year                                           |
| DATES     | Buying date                                            |
| LIEU      | Place where the wine was sold                          |
| QTE       | Number of bottles bought                               |
| NB        | Client (buveur) number                                 |
| NOM       | Client's last name                                     |
| PRENOM    | Client's first name                                    |
| TYPE      | Type of client by volume of purchases                  |
| REGION    | Administrative Region (different to production region) |


In [6]:
import sqlite3

In [7]:
def printSchema(connection):
    ### Source: http://stackoverflow.com/a/35092773/4765776
    for (tableName,) in connection.execute(
        """
        select NAME from SQLITE_MASTER where TYPE='table' order by NAME;
        """
    ):
        print("{}:".format(tableName))
        for (
            columnID, columnName, columnType,
            columnNotNull, columnDefault, columnPK,
        ) in connection.execute("pragma table_info('{}');".format(tableName)):
            print("  {id}: {name}({type}){null}{default}{pk}".format(
                id=columnID,
                name=columnName,
                type=columnType,
                null=" not null" if columnNotNull else "",
                default=" [{}]".format(columnDefault) if columnDefault else "",
                pk=" *{}".format(columnPK) if columnPK else "",
            ))

In [8]:
conn = sqlite3.connect('wine.db')
c = conn.cursor()
print("Database schema:")
printSchema(conn)

Database schema:
MASTER1:
  0: NV(NUM)
  1: CRU(TEXT)
  2: DEGRE(NUM)
  3: MILL(NUM)
  4: QTE(NUM)
  5: NP(NUM)
  6: NOM(TEXT)
  7: PRENOM(TEXT)
  8: REGION(TEXT)
MASTER2:
  0: NV(NUM)
  1: CRU(TEXT)
  2: DEGRE(NUM)
  3: MILL(NUM)
  4: DATES(DATE)
  5: LIEU(TEXT)
  6: QTE(NUM)
  7: NB(NUM)
  8: NOM(TEXT)
  9: PRENOM(TEXT)
  10: TYPE(TEXT)
  11: REGION(TEXT)
client:
  0: NB(NUM)
  1: NOM(TEXT)
  2: PRENOM(TEXT)
  3: TYPE(TEXT)
producteur:
  0: NP(NUM)
  1: NOM(TEXT)
  2: PRENOM(TEXT)
  3: REGION(TEXT)
production:
  0: NP(NUM)
  1: NV(NUM)
  2: QTE(NUM)
region:
  0: REGION(TEXT)
  1: LIEU(TEXT)
vente:
  0: NV(NUM)
  1: NB(NUM)
  2: DATES(NUM)
  3: LIEU(TEXT)
  4: QTE(NUM)
vin:
  0: NV(NUM)
  1: CRU(TEXT)
  2: DEGRE(NUM)
  3: MILL(NUM)


Again, we will use **`%%sql`** magic for our queries

In [9]:
%load_ext sql
%sql sqlite:///wine.db

'Connected: @wine.db'

Recreate the normalized tables from **Master1** and **Master2** as you did in TP2.

In [10]:
%%sql 

DROP TABLE IF EXISTS vin;

-- Create vin table
CREATE TABLE vin AS
SELECT DISTINCT NV, CRU, DEGRE, MILL
FROM MASTER1
WHERE NV IS NOT NULL;





 * sqlite:///wine.db
Done.
Done.


[]

In [11]:
%%sql 

DROP TABLE IF EXISTS producteur;
-- Create producteur table
CREATE TABLE producteur AS
SELECT DISTINCT NP, NOM, PRENOM, REGION
FROM MASTER1
WHERE NP IS NOT NULL;

DROP TABLE IF EXISTS production;

-- Create production table
CREATE TABLE production AS
SELECT DISTINCT NP, NV, QTE
FROM MASTER1
WHERE NP IS NOT NULL AND NV IS NOT NULL;

DROP TABLE IF EXISTS client;

-- Create client table
CREATE TABLE client AS
SELECT DISTINCT NB, NOM, PRENOM, TYPE
FROM MASTER2
WHERE NB IS NOT NULL;

DROP TABLE IF EXISTS vente;

-- Create vente table
CREATE TABLE vente AS
SELECT DISTINCT NV, NB, DATES, LIEU, QTE 
FROM MASTER2
WHERE NV IS NOT NULL
    AND NB IS NOT NULL
    AND DATES IS NOT NULL
    AND LIEU IS NOT NULL;
    
DROP TABLE IF EXISTS region;

-- Create region table
CREATE TABLE region AS
SELECT DISTINCT REGION, LIEU
FROM MASTER2
WHERE LIEU IS NOT NULL;


 * sqlite:///wine.db
Done.
Done.
Done.
Done.
Done.
Done.
Done.
Done.
Done.
Done.


[]

___
# PART I: CREATE AND USE VIEWS

A view is a virtual table based on the result-set of an SQL statement. Views are stored in the database with an associated name.

Views are created following the syntax:

```mysql
CREATE VIEW view_name AS
SELECT column1, column2.....
FROM table_name
WHERE [condition];
```

A useful command:

```mysql
DROP VIEW IF EXISTS view_name;
```
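
The two statements above can be tried outside `wine.db` as well. The following is a minimal sketch using Python's `sqlite3` module and an in-memory database; the table `demo_client` and the view `demo_gros` are hypothetical names chosen for illustration, not part of the wine schema.

```python
import sqlite3

# In-memory database with a hypothetical table, so wine.db is not touched
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE demo_client (NB INTEGER, NOM TEXT, TYPE TEXT);
    INSERT INTO demo_client VALUES (1, 'Alice', 'gros'), (2, 'Bob', 'petit');

    DROP VIEW IF EXISTS demo_gros;   -- no error even if the view is absent
    CREATE VIEW demo_gros AS
    SELECT NB, NOM FROM demo_client WHERE TYPE = 'gros';
""")

# The view is stored in the catalog under its name, with type 'view'
catalog = conn.execute(
    "SELECT name, type FROM sqlite_master WHERE name = 'demo_gros'"
).fetchall()

# Querying the view runs the stored SELECT against the base table
rows = conn.execute("SELECT * FROM demo_gros").fetchall()
print(catalog)
print(rows)
```

Only the definition of the view is stored; its rows are computed from `demo_client` each time the view is queried.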


__Warning:__ Use `DROP` with caution: only drop an object if you are sure you no longer need it.

__Note:__ You will find some cells marked as "Test" that will help you check your work. Do NOT modify them. 

#### Exercise 1.1

Create a view "**bons_buveurs**" with the clients (buveurs) of type 'gros' or 'moyen'.

In [12]:
%%sql
DROP VIEW IF EXISTS bons_buveurs;

CREATE VIEW bons_buveurs AS
SELECT *
FROM client
WHERE client.TYPE IN ('gros', 'moyen');

 * sqlite:///wine.db
Done.
Done.


[]

In [13]:
# Test
%sql SELECT * FROM bons_buveurs ORDER BY nb;

 * sqlite:///wine.db
Done.


NB,NOM,PRENOM,TYPE
2,Artaud,Antonin,moyen
3,Aron,Raymond,gros
4,Apollinaire,Guillaume,moyen
6,Arrabal,Fernando,gros
7,Anouilh,Jean,moyen
8,Aragon,Louis,gros
10,Andersen,Yann,gros
12,Bataille,Georges,moyen
13,Barthes,Roland,moyen
14,Bory,Jean Louis,gros


#### Exercise 1.2

Create the view "**buveurs_asec**" with clients (buveurs) who have not bought any wine.

In [14]:
%%sql sqlite:///wine.db


DROP VIEW IF EXISTS buveurs_asec;

CREATE VIEW buveurs_asec AS
SELECT NOM, PRENOM, client.NB
FROM client
WHERE client.NB NOT IN (SELECT vente.NB 
                         FROM vente
                         GROUP BY vente.NB);
 

Done.
Done.


[]

In [15]:
# Test
%sql SELECT * FROM buveurs_asec ORDER BY nb;

 * sqlite:///wine.db
Done.


NOM,PRENOM,NB
Breton,Andre,11
Barthes,Roland,13
Balzac,Honore de,16
Celine,Louis Ferdinand,18
Chateaubriand,Francois-Rene de,20
Corbiere,Tristan,21
Corneille,Pierre,23
Char,Rene,25
Dumas,Alexandre,27
Fournier,Alain,29
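
A side remark on the `NOT IN` subquery used for "**buveurs_asec**": it behaves as intended here because `vente` was built with `NB IS NOT NULL`. If the subquery could return a `NULL`, `NOT IN` would select no rows at all. The sketch below illustrates this on hypothetical in-memory tables (`demo_client`, `demo_vente`) and shows `NOT EXISTS` as one possible alternative; it is not a required change to the exercise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE demo_client (NB INTEGER, NOM TEXT);
    CREATE TABLE demo_vente (NB INTEGER, QTE INTEGER);
    INSERT INTO demo_client VALUES (1, 'Alice'), (2, 'Bob');
    -- One sale for client 1 and one sale whose buyer is unknown (NULL)
    INSERT INTO demo_vente VALUES (1, 10), (NULL, 5);
""")

# NOT IN compares against NULL, so the test is never true for any client
not_in = conn.execute("""
    SELECT NOM FROM demo_client
    WHERE NB NOT IN (SELECT NB FROM demo_vente)
""").fetchall()

# NOT EXISTS only checks whether a matching sale row exists
not_exists = conn.execute("""
    SELECT NOM FROM demo_client AS c
    WHERE NOT EXISTS (SELECT 1 FROM demo_vente AS v WHERE v.NB = c.NB)
""").fetchall()

print(not_in)
print(not_exists)
```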


#### Exercise 1.3

Create the view "**buveurs_achats**" complementary to the previous one.

In [16]:
%%sql
DROP VIEW IF EXISTS buveurs_achats;

CREATE VIEW buveurs_achats AS
SELECT NOM, PRENOM, client.NB
FROM client
WHERE client.NB IN (SELECT vente.NB 
                         FROM vente
                         GROUP BY vente.NB);

 * sqlite:///wine.db
Done.
Done.


[]

In [17]:
# Test
%sql SELECT * FROM buveurs_achats ORDER BY nb;

 * sqlite:///wine.db
Done.


NOM,PRENOM,NB
Aristote,,1
Artaud,Antonin,2
Aron,Raymond,3
Apollinaire,Guillaume,4
Audiberti,Jacques,5
Arrabal,Fernando,6
Anouilh,Jean,7
Aragon,Louis,8
Ajar,Emile,9
Andersen,Yann,10


#### Exercise 1.4

Create the view "**q83pl**" (LIEU, CRU, QTE_BUE) that provides by LIEU and CRU the total quantities bought in 1983 by all the clients (buveurs).

In [20]:
%%sql
DROP VIEW IF EXISTS q83pl;

CREATE VIEW q83pl AS
SELECT vente.LIEU, vin.CRU, SUM(vente.QTE) AS QTE_BUE
FROM vente
JOIN vin ON vin.NV = vente.NV
WHERE vente.DATES LIKE '1983%'
GROUP BY vente.LIEU, vin.CRU;

 * sqlite:///wine.db
Done.
Done.


[]

In [21]:
# Test
%sql SELECT * FROM q83pl;

 * sqlite:///wine.db
Done.


LIEU,CRU,QTE_BUE
CAEN,Seyssel,3
LILLE,Pommard,5
LYON,Beaujolais Villages,10
LYON,Julienas,2
PARIS,Beaujolais Primeur,4
PARIS,Coteaux du Tricastin,1
PARIS,Pouilly Vinzelles,3
RENNES,Mercurey,1
ROCQUENCOURT,Beaujolais Villages,260
ROCQUENCOURT,Saint Amour,80


#### Exercise 1.5

Can we define the same view with ascending order over the attribute "QTE"? Provide an explanation for your answer.
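
One way to explore this question is to try it on a small example. The sketch below assumes a hypothetical in-memory table `demo_q` standing in for the result of "**q83pl**". It shows that SQLite accepts an `ORDER BY` inside a view definition. Standard SQL, however, only promises a row order when the outermost query has its own `ORDER BY`, so the dependable approach is to sort when selecting from the view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE demo_q (CRU TEXT, QTE_BUE INTEGER);
    INSERT INTO demo_q VALUES ('Pouilly Vinzelles', 3),
                              ('Julienas', 2),
                              ('Beaujolais Primeur', 4);

    -- SQLite accepts ORDER BY in the SELECT of a view
    CREATE VIEW demo_q_sorted AS
    SELECT CRU, QTE_BUE FROM demo_q ORDER BY QTE_BUE ASC;
""")

accepted = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'view'"
).fetchall()

# Ordering requested by the query itself does not rely on the view's ORDER BY
ordered = conn.execute(
    "SELECT CRU, QTE_BUE FROM demo_q_sorted ORDER BY QTE_BUE ASC"
).fetchall()
print(accepted)
print(ordered)
```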

___
# PART II: UPDATE DATABASE CONTENT

The syntax for the `UPDATE` operation is:

```sql
UPDATE table_name
SET column1 = value1, column2 = value2...., columnN = valueN
WHERE [condition];
```

The syntax for the `INSERT` operation is:

```sql
INSERT INTO table_name [(column1, column2, column3,...columnN)]  
VALUES (value1, value2, value3,...valueN);
```

Database updates are committed automatically in Jupyter/Python. _Transactions_ are an option to control and reverse changes. Additionally, we can simply reload a backup of the database (NOT an option in deployed systems).
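
To illustrate how a transaction can reverse a change, here is a minimal sketch. It assumes the standard-library `sqlite3` module used directly with its default settings (not the `%%sql` magic), where data changes stay in an open transaction until `commit()` is called. The table `demo_client` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_client (NB INTEGER, TYPE TEXT)")
conn.execute("INSERT INTO demo_client VALUES (1, 'petit')")
conn.commit()  # make the initial row permanent

# The UPDATE implicitly opens a transaction that is still pending
conn.execute("UPDATE demo_client SET TYPE = 'gros' WHERE NB = 1")
inside = conn.execute("SELECT TYPE FROM demo_client").fetchone()[0]

# Rolling back discards every change made since the last commit
conn.rollback()
after = conn.execute("SELECT TYPE FROM demo_client").fetchone()[0]

print(inside, after)
```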

__Note:__ Unlike in some other database management systems, SQLite views are read-only, so you cannot execute a `DELETE`, `INSERT` or `UPDATE` statement on a view.
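
The read-only behaviour can be observed directly. In this sketch (hypothetical names `demo_client` and `demo_view`, in-memory database), an `INSERT` on the view is rejected and the base table stays empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE demo_client (NB INTEGER, NOM TEXT);
    CREATE VIEW demo_view AS SELECT NB, NOM FROM demo_client;
""")

try:
    # Writing through the view is refused by SQLite
    conn.execute("INSERT INTO demo_view VALUES (1, 'Alice')")
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)

remaining = conn.execute("SELECT COUNT(*) FROM demo_client").fetchone()[0]
print(error)
print(remaining)
```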

#### Exercise 2.1

Create a table "**RBB**" with the same schema as "**bons_buveurs**" which contains the tuples selected from "**bons_buveurs**"

In [24]:
%%sql 

DROP TABLE IF EXISTS RBB;

-- Create RBB table

CREATE TABLE RBB AS
SELECT *
FROM bons_buveurs;

 * sqlite:///wine.db
Done.
Done.


[]

In [25]:
# Test
%sql SELECT * FROM RBB;

 * sqlite:///wine.db
Done.


NB,NOM,PRENOM,TYPE
13,Barthes,Roland,moyen
16,Balzac,Honore de,moyen
18,Celine,Louis Ferdinand,gros
20,Chateaubriand,Francois-Rene de,moyen
27,Dumas,Alexandre,gros
32,Eluard,Paul,moyen
35,Fromentin,Eugene,gros
39,Montesquieu,,gros
42,Goethe,Johann Wolfgang von,moyen
43,Musset,Alfred de,gros


#### Exercise 2.2

Update the table you used to create "**bons_buveurs**": Change the "type" to 'gros' if the total of quantities bought is over 100.

Find the instances to update (schema may differ from the one in your table)

In [31]:
%%sql 

SELECT client.NB, client.NOM, client.TYPE, SUM(vente.QTE)
FROM client
    JOIN vente ON client.NB = vente.NB
WHERE client.TYPE IN ('petit', 'moyen')
GROUP BY (client.NB)
HAVING SUM(vente.QTE) > 100;

 * sqlite:///wine.db
Done.


NB,NOM,TYPE,SUM(vente.QTE)
2,Artaud,moyen,583
5,Audiberti,petit,113
9,Ajar,petit,140
44,Gide,petit,171


Update instances

In [36]:
%%sql
UPDATE client
SET TYPE = 'gros'
WHERE client.NB IN (
   SELECT client.NB
    FROM client
    JOIN vente ON client.NB = vente.NB
    WHERE client.TYPE IN ('petit', 'moyen')
    GROUP BY (client.NB)
    HAVING SUM(vente.QTE) > 100
);

 * sqlite:///wine.db
4 rows affected.


[]

#### Exercise 2.3

Compare the content of _table_ "**RBB**" and the _view_ "**bons_buveurs**" after the update. What differences do you see? Explain

Let's focus on the tuples selected for update, i.e. those with NB values 2, 5, 9 and 44. The queries below show that:
1. RBB is not affected by the update: client NB=2 is still listed as 'moyen'. This is expected because RBB is a table created from the view; once created, it is an independent copy with no connection to the view. Clients 5, 9 and 44 were of type 'petit' when RBB was created, so they never entered it.
2. The bons_buveurs view, however, is affected: all four clients are now listed as 'gros'. This is also expected, since a view is a stored query that is evaluated on demand. We query the view after the underlying table has been updated, so the result reflects the update.
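
The same effect can be reproduced in isolation. The sketch below uses hypothetical in-memory objects (`demo_client`, `demo_bons`, `demo_rbb`) that mirror the roles of `client`, `bons_buveurs` and `RBB`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE demo_client (NB INTEGER, NOM TEXT, TYPE TEXT);
    INSERT INTO demo_client VALUES (2, 'Artaud', 'moyen'),
                                   (5, 'Audiberti', 'petit');

    CREATE VIEW demo_bons AS
    SELECT * FROM demo_client WHERE TYPE IN ('gros', 'moyen');

    -- Snapshot: rows are copied once, at creation time
    CREATE TABLE demo_rbb AS SELECT * FROM demo_bons;

    -- Later change to the base table
    UPDATE demo_client SET TYPE = 'gros';
""")

snapshot = conn.execute("SELECT NB, TYPE FROM demo_rbb ORDER BY NB").fetchall()
live = conn.execute("SELECT NB, TYPE FROM demo_bons ORDER BY NB").fetchall()
print(snapshot)  # copy taken before the update
print(live)      # view re-evaluated after the update
```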

In [39]:
%%sql 

SELECT *
FROM RBB
WHERE RBB.NB IN (2, 5, 9, 44);



Done.


NB,NOM,PRENOM,TYPE
2,Artaud,Antonin,moyen


In [40]:
%%sql

SELECT *
FROM bons_buveurs
WHERE bons_buveurs.NB IN (2, 5, 9, 44);

 * sqlite:///wine.db
Done.


NB,NOM,PRENOM,TYPE
2,Artaud,Antonin,gros
44,Gide,Andre,gros
9,Ajar,Emile,gros
5,Audiberti,Jacques,gros


#### Exercise 2.4

Create a table "**RBA**" with the same schema as "**buveurs_asec**" which contains the tuples selected from "**buveurs_asec**"

In [41]:
%%sql
DROP TABLE IF EXISTS RBA;

-- Create RBA table
CREATE TABLE RBA AS
SELECT *
FROM buveurs_asec;

 * sqlite:///wine.db
Done.
Done.


[]

In [42]:
# Test
%sql SELECT * FROM RBA

 * sqlite:///wine.db
Done.


NB,NOM,PRENOM,TYPE
11,Breton,Andre,petit
13,Barthes,Roland,moyen
16,Balzac,Honore de,moyen
18,Celine,Louis Ferdinand,gros
20,Chateaubriand,Francois-Rene de,moyen
21,Corbiere,Tristan,petit
23,Corneille,Pierre,petit
25,Char,Rene,petit
27,Dumas,Alexandre,gros
29,Fournier,Alain,petit


#### Exercise 2.5

Insert a tuple (101, 'your last name', 'your first name', 'your type of purchases(petit, moyen, or gros)') to "**RBA**".

In [43]:
%%sql sqlite:///wine.db

INSERT INTO RBA VALUES (101, 'Younes', 'Chloe', 'petit');

1 rows affected.


[]

In [44]:
# Test
%sql SELECT * FROM RBA

 * sqlite:///wine.db
Done.


NB,NOM,PRENOM,TYPE
11,Breton,Andre,petit
13,Barthes,Roland,moyen
16,Balzac,Honore de,moyen
18,Celine,Louis Ferdinand,gros
20,Chateaubriand,Francois-Rene de,moyen
21,Corbiere,Tristan,petit
23,Corneille,Pierre,petit
25,Char,Rene,petit
27,Dumas,Alexandre,gros
29,Fournier,Alain,petit


#### Exercise 2.6

Compare the content of _table_ "**RBA**" and the _view_ "**buveurs_asec**". What differences do you see? Explain

In [46]:
%%sql 

SELECT * FROM buveurs_asec;

 * sqlite:///wine.db
Done.


NB,NOM,PRENOM,TYPE
11,Breton,Andre,petit
13,Barthes,Roland,moyen
16,Balzac,Honore de,moyen
18,Celine,Louis Ferdinand,gros
20,Chateaubriand,Francois-Rene de,moyen
21,Corbiere,Tristan,petit
23,Corneille,Pierre,petit
25,Char,Rene,petit
27,Dumas,Alexandre,gros
29,Fournier,Alain,petit


___
# PART III: Design the database schema for posts in a Social Network

In this section your task is to design the database schema for a social network app of a new startup:

The new social network will contain users, where each user will have a name, a nickname, an email, date of birth, and an address (Street, City, State, Country, Postal Code). Users can be friends of other users, and can publish posts. Each post can contain a text, date and attachment. Posts can be either original posts or replies so the app needs to handle both scenarios. When users log in, the app needs to display the posts of their friends.

**Note:** You can create diagrams of your proposal and insert them as images into this notebook.

#### Exercise 3.1

Write and explain the design of the relations of your database

**Social network design**

We have 2 entity sets:
1. Users
2. Posts

We have 2 binary relationships:
1. Publish (between users and posts)
2. Friendship (between users and users)

**Publish** is a one-to-many relationship between users and posts:
 - a user is associated with any number of posts (possibly 0) via **Publish**
 - a post is associated with exactly one user via **Publish**.

**Friendship** is a many-to-many relationship between users, with optional participation on both sides:
 - a user has zero to many friends, and a user is a friend of zero to many users.

All these relations can be summarized in the figure below.
Rectangle boxes denote entities, diamond boxes are relationships.
The double line indicates the total participation of posts in **Publish** (i.e. every post must have a publisher, which is a user).

In [49]:
from IPython.display import HTML

# Lucidchart invitation link for the diagram (kept for reference)
url = "https://www.lucidchart.com/invitations/accept/6b0581a1-71ff-406e-bb3c-9c9dab49ebb7"

# Embed the diagram: the markup must be passed to HTML() as a string
HTML("""<div style="width: 640px; height: 480px; margin: 10px; position: relative;"><iframe allowfullscreen frameborder="0" style="width:640px; height:480px" src="https://www.lucidchart.com/documents/embeddedchart/e78c2631-7bcf-4f91-9cfb-01bd5e1d0ed2" id="n.i9r7eTxr9-"></iframe></div>""")

##### Reduction to Relation Schemas

From the diagram above, the entity sets and relationship sets can be expressed as
relation schemas that represent the contents of the database.
We can create 4 tables: **Users**, **Posts**, **Publish** and **Friendship**.


**Users**

The table contains 10 fields with **ID** as the *PRIMARY KEY*.

| *Attribute*        |
|:-------------------|
| **ID** [PK]        |
| NAME               |
| NICKNAME           |
| EMAIL              |
| DATE_OF_BIRTH      |
| STREET             |
| CITY               |
| STATE              |
| COUNTRY            |
| POSTAL_CODE        |


**Posts**

The table contains 5 fields with **ID** as the *PRIMARY KEY*.

| *Attribute*        |
|:-------------------|
| **ID** [PK]        |
| TEXT               |
| DATE               |
| ATTACHMENT         |
| TYPE_OF_POST       |


**Publish**

The table contains 2 fields, with **USER_ID** and **POST_ID** together forming the *PRIMARY KEY*.

| *Attribute*        |
|:-------------------|
| **USER_ID** [PK]   | 
| **POST_ID** [PK]   | 


**Friendship**

The table contains 2 fields, with **USER_ID** and **FRIEND_ID** together forming the *PRIMARY KEY*.

| *Attribute*        |
|:-------------------|
| **USER_ID** [PK]   | 
| **FRIEND_ID** [PK] | 
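
The four relations above could be declared as follows. This is only a sketch: the column types, the `NOT NULL` constraints and the foreign-key clauses are assumptions that the tables above do not specify, and the script runs against an in-memory database rather than a real deployment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Users (
        ID INTEGER PRIMARY KEY,
        NAME TEXT NOT NULL,
        NICKNAME TEXT,
        EMAIL TEXT,
        DATE_OF_BIRTH TEXT,
        STREET TEXT, CITY TEXT, STATE TEXT, COUNTRY TEXT, POSTAL_CODE TEXT
    );
    CREATE TABLE Posts (
        ID INTEGER PRIMARY KEY,
        "TEXT" TEXT,
        "DATE" TEXT,
        ATTACHMENT BLOB,
        TYPE_OF_POST TEXT
    );
    CREATE TABLE Publish (
        USER_ID INTEGER NOT NULL REFERENCES Users(ID),
        POST_ID INTEGER NOT NULL REFERENCES Posts(ID),
        PRIMARY KEY (USER_ID, POST_ID)
    );
    CREATE TABLE Friendship (
        USER_ID INTEGER NOT NULL REFERENCES Users(ID),
        FRIEND_ID INTEGER NOT NULL REFERENCES Users(ID),
        PRIMARY KEY (USER_ID, FRIEND_ID)
    );
""")

conn.execute("INSERT INTO Users (ID, NAME) VALUES (1, 'Ada')")
conn.execute("INSERT INTO Posts (ID, \"TEXT\") VALUES (10, 'hello')")
conn.execute("INSERT INTO Publish VALUES (1, 10)")

# A post attributed to an unknown user violates the foreign key
try:
    conn.execute("INSERT INTO Publish VALUES (99, 10)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

The foreign keys make the relationship tables depend on existing users and posts, which matches the total participation of posts in **Publish**.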



#### Exercise 3.2

Write a view to retrieve the posts to display when a user logs in. Consider that some users may have a lot of friends, so you need to limit the number of posts to display. How would you select relevant posts to display first? What kind of information would you use/add in the database for this purpose? Explain your answer.

__Note:__ Limiting the number of posts just by count is too simplistic: the user could miss something interesting to him/her.
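
One possible direction, given here as a sketch and not as the expected answer: expose the posts of every user's friends in a view, together with a relevance signal, and let the application filter by the logged-in user. The code below assumes a hypothetical `PARENT_ID` column added to `Posts` (pointing a reply to its original post) and uses the number of replies as the relevance signal. The view name `friend_feed`, the sample rows and the limit of 20 are all illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- PARENT_ID is a hypothetical addition: NULL for original posts
    CREATE TABLE Posts (ID INTEGER PRIMARY KEY, "TEXT" TEXT, "DATE" TEXT,
                        PARENT_ID INTEGER);
    CREATE TABLE Publish (USER_ID INTEGER, POST_ID INTEGER);
    CREATE TABLE Friendship (USER_ID INTEGER, FRIEND_ID INTEGER);

    -- User 1 is friends with users 2 and 3
    INSERT INTO Friendship VALUES (1, 2), (1, 3);
    INSERT INTO Posts VALUES
        (10, 'old but discussed', '2024-01-01', NULL),
        (11, 'recent and quiet',  '2024-01-05', NULL),
        (12, 'reply a',           '2024-01-02', 10),
        (13, 'reply b',           '2024-01-03', 10);
    INSERT INTO Publish VALUES (2, 10), (3, 11), (3, 12), (2, 13);

    -- One row per (viewer, original post published by a friend)
    CREATE VIEW friend_feed AS
    SELECT f.USER_ID AS VIEWER_ID,
           p.ID AS POST_ID,
           p."DATE" AS POST_DATE,
           (SELECT COUNT(*) FROM Posts AS r
            WHERE r.PARENT_ID = p.ID) AS REPLY_COUNT
    FROM Friendship AS f
    JOIN Publish AS pub ON pub.USER_ID = f.FRIEND_ID
    JOIN Posts AS p ON p.ID = pub.POST_ID
    WHERE p.PARENT_ID IS NULL;
""")

# The application supplies the logged-in user and the ranking
feed = conn.execute("""
    SELECT POST_ID, REPLY_COUNT FROM friend_feed
    WHERE VIEWER_ID = ?
    ORDER BY REPLY_COUNT DESC, POST_DATE DESC
    LIMIT 20
""", (1,)).fetchall()
print(feed)
```

Because a view cannot take parameters, the viewer is exposed as a column and filtered in the query. Other signals (recency alone, interactions between the two users, shared interests) could replace or complement the reply count.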