# SD202 TP3 - Views, Updates and Database Design

The objectives for this TP are:

1. Create and use SQL Views
2. Update database content
3. Design the database schema for a Social Network


For the first 2 parts we will use the _wine_ database and the Tables created in TP2.

A reminder of the wine database schema:

In [1]:
import sqlite3

In [2]:
def printSchema(connection):
    ### Source: http://stackoverflow.com/a/35092773/4765776
    for (tableName,) in connection.execute(
        """
        select NAME from SQLITE_MASTER where TYPE='table' order by NAME;
        """
    ):
        print("{}:".format(tableName))
        for (
            columnID, columnName, columnType,
            columnNotNull, columnDefault, columnPK,
        ) in connection.execute("pragma table_info('{}');".format(tableName)):
            print("  {id}: {name}({type}){null}{default}{pk}".format(
                id=columnID,
                name=columnName,
                type=columnType,
                null=" not null" if columnNotNull else "",
                default=" [{}]".format(columnDefault) if columnDefault else "",
                pk=" *{}".format(columnPK) if columnPK else "",
            ))

In [3]:
conn = sqlite3.connect('wine.db')
c = conn.cursor()
print("Database schema:")
printSchema(conn)

Database schema:
Harvest:
  0: NP(INTEGER)
  1: NV(INTEGER)
  2: QTE(INTEGER)
Location:
  0: LIEU(TEXT) *1
  1: REGION(TEXT)
MASTER1:
  0: NV(NUM)
  1: CRU(TEXT)
  2: DEGRE(NUM)
  3: MILL(NUM)
  4: QTE(NUM)
  5: NP(NUM)
  6: NOM(TEXT)
  7: PRENOM(TEXT)
  8: REGION(TEXT)
MASTER2:
  0: NV(NUM)
  1: CRU(TEXT)
  2: DEGRE(NUM)
  3: MILL(NUM)
  4: DATES(NUM)
  5: LIEU(TEXT)
  6: QTE(NUM)
  7: NB(NUM)
  8: NOM(TEXT)
  9: PRENOM(TEXT)
  10: TYPE(TEXT)
  11: REGION(TEXT)
Producer:
  0: NP(INTEGER) *1
  1: NOM(TEXT)
  2: PRENOM(TEXT)
  3: REGION(TEXT)
Wine:
  0: NV(INTEGER) *1
  1: CRU(TEXT)
  2: DEGRE(FLOAT)
  3: MILL(INTEGER)


Again, we recommend inline %sql as an alternative to the sqlite3 package

In [4]:
%load_ext sql
%sql sqlite:///wine.db

u'Connected: None@wine.db'

Recreate the tables in 3NF/BCNF from Master1 and Master2, as you did in TP2.

In [5]:
# Write corresponding code here
%sql DROP TABLE IF EXISTS Producer;
%sql CREATE TABLE Producer( \
    NP INTEGER PRIMARY KEY, \
    NOM TEXT, \
    PRENOM TEXT, \
    REGION TEXT\
);
%sql INSERT INTO Producer SELECT DISTINCT NP, NOM, PRENOM, REGION FROM MASTER1 WHERE NP NOT NULL;
%sql SELECT * FROM Producer LIMIT 20;

# Write corresponding code here
%sql DROP TABLE IF EXISTS Wine;
%sql CREATE TABLE Wine( \
    NV INTEGER PRIMARY KEY, \
    CRU TEXT, \
    DEGRE FLOAT, \
    MILL INTEGER \
);
%sql INSERT INTO Wine SELECT DISTINCT NV, CRU, DEGRE, MILL FROM Master1 WHERE NV NOT NULL;
%sql SELECT * FROM Wine LIMIT 20;

# Write corresponding code here

%sql DROP TABLE IF EXISTS Harvest;
%sql CREATE TABLE Harvest( \
    NP INTEGER, \
    NV INTEGER, \
    QTE INTEGER, \
    FOREIGN KEY(NP) REFERENCES Producer(NP), \
    FOREIGN KEY(NV) REFERENCES Wine(NV) \
);
%sql INSERT INTO Harvest SELECT DISTINCT NP, NV, QTE FROM Master1 WHERE NV IS NOT NULL AND NP IS NOT NULL;
%sql SELECT * FROM Harvest LIMIT 20;

Done.
Done.
124 rows affected.
Done.
Done.
Done.
102 rows affected.
Done.
Done.
Done.
140 rows affected.
Done.


NP,NV,QTE
1,1,300
73,1,1
5,2,100
1,3,400
10,4,35
30,5,46
42,6,300
98,7,60
90,8,12
98,10,100


In [10]:
%%sql
DROP TABLE IF EXISTS Location;
CREATE TABLE Location(
    LIEU TEXT PRIMARY KEY,
    REGION TEXT
);
INSERT INTO Location SELECT DISTINCT LIEU, REGION FROM Master2 WHERE LIEU NOT NULL;
SELECT * FROM Location LIMIT 20;



DROP TABLE IF EXISTS Customer;
CREATE TABLE Customer(
    ID_CUSTOMER INTEGER PRIMARY KEY AUTOINCREMENT,
    NB INTEGER,
    NOM TEXT,
    PRENOM TEXT,
    TYPE TEXT
);
INSERT INTO Customer (NB, NOM, PRENOM, TYPE) SELECT DISTINCT NB, NOM, PRENOM, TYPE FROM Master2;
SELECT * FROM Customer ORDER BY NB;

DROP TABLE IF EXISTS Buy;
CREATE TABLE Buy(
    ID_BUY INTEGER PRIMARY KEY AUTOINCREMENT,
    NB INTEGER,
    NV INTEGER,
    QTE INTEGER,
    LIEU TEXT,
    DATES DATETIME,
    FOREIGN KEY(NB) REFERENCES Customer(NB),
    FOREIGN KEY(LIEU) REFERENCES Location(LIEU),
    FOREIGN KEY(NV) REFERENCES Wine(NV)
);
INSERT INTO Buy (NB, NV, QTE, LIEU, DATES) SELECT DISTINCT NB, NV, QTE, LIEU, DATES FROM Master2;
SELECT * FROM Buy ORDER BY NB LIMIT 20;

Done.
Done.
18 rows affected.
Done.
Done.
Done.
101 rows affected.
Done.
Done.
Done.
185 rows affected.
Done.


ID_BUY,NB,NV,QTE,LIEU,DATES
81,,11,,,
83,,13,,,
84,,14,,,
85,,15,,,
87,,17,,,
88,,18,,,
89,,19,,,
97,,25,,,
99,,27,,,
101,,29,,,


___
# PART I: CREATE AND USE VIEWS

A view is a virtual table based on the result-set of an SQL statement. Views are stored in the database with an associated name.

Views are created following the syntax:

```sql
CREATE VIEW view_name AS
SELECT column1, column2.....
FROM table_name
WHERE [condition];
```

A useful command is:

```sql
DROP VIEW IF EXISTS view_name;
```


__Note:__ Use it with caution (only drop something if you are sure)
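
As a minimal sketch of the syntax above, the following uses Python's `sqlite3` module on an in-memory database. The table contents and the view name `moyen_customers` are made up for illustration and are not part of the wine database.

```python
import sqlite3

# In-memory toy database; names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (NB INTEGER PRIMARY KEY, NOM TEXT, TYPE TEXT);
    INSERT INTO Customer VALUES (1, 'Aristote', 'petit'),
                                (2, 'Artaud', 'moyen'),
                                (3, 'Aron', 'gros');
    DROP VIEW IF EXISTS moyen_customers;
    CREATE VIEW moyen_customers AS
        SELECT NB, NOM FROM Customer WHERE TYPE = 'moyen';
""")

# A view is queried exactly like a table.
view_rows = conn.execute("SELECT * FROM moyen_customers ORDER BY NB").fetchall()
print(view_rows)

# Views are stored in the catalog by name, next to tables.
view_names = [name for (name,) in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'view'")]
print(view_names)
```

Because `DROP VIEW IF EXISTS` runs before `CREATE VIEW`, the script can be re-run without failing on an existing view.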

__1.1__ Create a view 'bons_buveurs' with the clients (buveurs) of type 'gros' or 'moyen'.

In [11]:
%%sql
SELECT * FROM Customer

Done.


ID_CUSTOMER,NB,NOM,PRENOM,TYPE
1,11.0,Breton,Andre,petit
2,13.0,Barthes,Roland,moyen
3,16.0,Balzac,Honore de,moyen
4,18.0,Celine,Louis Ferdinand,gros
5,20.0,Chateaubriand,Francois-Rene de,moyen
6,21.0,Corbiere,Tristan,petit
7,23.0,Corneille,Pierre,petit
8,25.0,Char,Rene,petit
9,27.0,Dumas,Alexandre,gros
10,29.0,Fournier,Alain,petit


In [12]:
%%sql
DROP VIEW IF EXISTS bons_buveurs;
CREATE VIEW bons_buveurs AS
SELECT * FROM Customer
WHERE TYPE='gros' OR TYPE='moyen'

Done.
Done.


[]

In [13]:
# Test
%sql SELECT * FROM bons_buveurs ORDER BY nb;

Done.


ID_CUSTOMER,NB,NOM,PRENOM,TYPE
58,2,Artaud,Antonin,moyen
67,3,Aron,Raymond,gros
68,4,Apollinaire,Guillaume,moyen
77,6,Arrabal,Fernando,gros
62,7,Anouilh,Jean,moyen
64,8,Aragon,Louis,gros
74,10,Andersen,Yann,gros
88,12,Bataille,Georges,moyen
2,13,Barthes,Roland,moyen
89,14,Bory,Jean Louis,gros


__1.2__ Create the view 'buveurs_asec' with clients (buveurs) who have not bought any wine.

In [36]:
%%sql
DROP VIEW IF EXISTS buveurs_asec;
CREATE VIEW buveurs_asec AS
SELECT c.NOM, c.PRENOM, c.NB, c.TYPE FROM Customer c
INNER JOIN Buy b ON b.NB=c.NB
GROUP BY c.NB
HAVING SUM(b.QTE) IS NULL

Done.
Done.


[]

In [37]:
# Test
%sql SELECT * FROM buveurs_asec ORDER BY nb;

Done.


NOM,PRENOM,NB,TYPE
Breton,Andre,11,petit
Barthes,Roland,13,moyen
Balzac,Honore de,16,moyen
Celine,Louis Ferdinand,18,gros
Chateaubriand,Francois-Rene de,20,moyen
Corbiere,Tristan,21,petit
Corneille,Pierre,23,petit
Char,Rene,25,petit
Dumas,Alexandre,27,gros
Fournier,Alain,29,petit
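
The view above appears to work because `Buy` was filled from Master2 and so contains a row with a NULL quantity for each customer without purchases. A formulation that does not depend on such rows is `NOT EXISTS`. The sketch below shows it on toy data; the view name `no_purchase` and the rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (NB INTEGER PRIMARY KEY, NOM TEXT);
    CREATE TABLE Buy (NB INTEGER, NV INTEGER, QTE INTEGER);
    INSERT INTO Customer VALUES (1, 'Aristote'), (11, 'Breton'), (13, 'Barthes');
    -- Customer 1 has a real purchase; customer 11 only has a placeholder row.
    INSERT INTO Buy VALUES (1, 5, 78), (11, NULL, NULL);
    CREATE VIEW no_purchase AS
        SELECT c.NB, c.NOM FROM Customer c
        WHERE NOT EXISTS (
            SELECT 1 FROM Buy b WHERE b.NB = c.NB AND b.QTE IS NOT NULL
        );
""")

# Customers 11 (placeholder row only) and 13 (no row at all) are both returned.
no_purchase = conn.execute("SELECT * FROM no_purchase ORDER BY NB").fetchall()
print(no_purchase)
```

An `INNER JOIN` would drop customer 13 here, since that customer has no row in `Buy` at all.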


__1.3__ Create the view 'buveurs_achats' complementary to the previous one.

In [16]:
%%sql
DROP VIEW IF EXISTS buveurs_achats;
CREATE VIEW buveurs_achats AS
SELECT c.NOM, c.PRENOM, c.NB, SUM(b.QTE) as Bottles_bought FROM Customer c
INNER JOIN Buy b ON b.NB=c.NB
GROUP BY c.NB
HAVING SUM(b.QTE) IS NOT NULL

Done.
Done.


[]

In [17]:
# Test
%sql SELECT * FROM buveurs_achats ORDER BY nb;

Done.


NOM,PRENOM,NB,Bottles_bought
Aristote,,1,78
Artaud,Antonin,2,583
Aron,Raymond,3,58
Apollinaire,Guillaume,4,24
Audiberti,Jacques,5,113
Arrabal,Fernando,6,36
Anouilh,Jean,7,6
Aragon,Louis,8,132
Ajar,Emile,9,140
Andersen,Yann,10,1


__1.4__ Create the view 'q83pl' (LIEU, CRU, QTE_BUE) that provides by LIEU and CRU the total quantities bought in 1983 by all the clients (buveurs).

In [20]:
%%sql
DROP VIEW IF EXISTS q83pl;
CREATE VIEW q83pl AS
SELECT l.LIEU, w.CRU, SUM(b.QTE) as QTE_BUE FROM Customer c
INNER JOIN Buy b on c.NB = b.NB
INNER JOIN Location l on b.LIEU = l.LIEU
INNER JOIN Wine w on b.NV = w.NV
WHERE b.DATES BETWEEN '1983-01-01' AND '1983-12-31'
GROUP BY l.LIEU, w.CRU

Done.
Done.


[]

In [21]:
# Test
%sql SELECT * FROM q83pl;

Done.


LIEU,CRU,QTE_BUE
CAEN,Seyssel,3
LILLE,Pommard,5
LYON,Beaujolais Villages,10
LYON,Julienas,2
PARIS,Beaujolais Primeur,4
PARIS,Coteaux du Tricastin,1
PARIS,Pouilly Vinzelles,3
RENNES,Mercurey,1
ROCQUENCOURT,Beaujolais Villages,260
ROCQUENCOURT,Saint Amour,80


__1.5__ Can we define the same view with ascending order over the attribute QTE? Provide an explanation for your answer.

In [22]:
%sql SELECT * FROM q83pl ORDER BY QTE_BUE asc;

Done.


LIEU,CRU,QTE_BUE
PARIS,Coteaux du Tricastin,1
RENNES,Mercurey,1
LYON,Julienas,2
CAEN,Seyssel,3
PARIS,Pouilly Vinzelles,3
PARIS,Beaujolais Primeur,4
LILLE,Pommard,5
LYON,Beaujolais Villages,10
ROCQUENCOURT,Saint Amour,80
ROCQUENCOURT,Beaujolais Villages,260
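
One possible way to explore this question: SQLite accepts an `ORDER BY` clause inside a view definition, but a query result only has a guaranteed order when the query that reads the view asks for one. The sketch below, on toy data with a hypothetical view `qty_by_lieu`, defines a sorted view and still sorts in the outer query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Buy (LIEU TEXT, QTE INTEGER);
    INSERT INTO Buy VALUES ('LYON', 10), ('PARIS', 4), ('CAEN', 3), ('PARIS', 1);
    -- SQLite allows ORDER BY in the view's SELECT.
    CREATE VIEW qty_by_lieu AS
        SELECT LIEU, SUM(QTE) AS QTE_BUE FROM Buy
        GROUP BY LIEU
        ORDER BY QTE_BUE ASC;
""")

# Sorting in the outer query is what guarantees the order of the result.
rows = conn.execute(
    "SELECT * FROM qty_by_lieu ORDER BY QTE_BUE ASC").fetchall()
print(rows)
```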


___
# PART II: UPDATE DATABASE CONTENT

The syntax for the Update statement is:

```sql
UPDATE table_name
SET column1 = value1, column2 = value2...., columnN = valueN
WHERE [condition];
```

The syntax for the Insert statement is:

```sql
INSERT INTO TABLE_NAME [(column1, column2, column3,...columnN)]  
VALUES (value1, value2, value3,...valueN);
```


Database updates are committed automatically in Jupyter/Python. _Transactions_ are an option to control and reverse changes. Additionally, we can just reload a backup of the database (NOT an option in deployed systems).
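
A minimal sketch of reversing a change with a transaction, assuming plain `sqlite3` rather than the `%sql` magic. The connection is opened with `isolation_level=None` so that the `BEGIN` and `ROLLBACK` statements are issued explicitly; the table and row are illustrative.

```python
import sqlite3

# isolation_level=None: the module does not open transactions on its own,
# so each statement is committed unless we issue BEGIN ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Customer (NB INTEGER PRIMARY KEY, TYPE TEXT)")
conn.execute("INSERT INTO Customer VALUES (1, 'petit')")

conn.execute("BEGIN")
conn.execute("UPDATE Customer SET TYPE = 'gros' WHERE NB = 1")
# Inside the transaction the change is visible on this connection.
type_inside = conn.execute(
    "SELECT TYPE FROM Customer WHERE NB = 1").fetchone()[0]

conn.execute("ROLLBACK")
# After the rollback the original value is back.
type_after = conn.execute(
    "SELECT TYPE FROM Customer WHERE NB = 1").fetchone()[0]
print(type_inside, type_after)
```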

__Note:__ Unlike some other Database Management Systems, SQLite views are read-only, so you may not execute a DELETE, INSERT or UPDATE statement on a view.
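
The read-only behaviour can be checked with a short sketch on toy data (the rows are illustrative): the update on the view is rejected, while the same update on the base table goes through and is reflected in the view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (NB INTEGER PRIMARY KEY, NOM TEXT, TYPE TEXT);
    INSERT INTO Customer VALUES (2, 'Artaud', 'moyen');
    CREATE VIEW bons_buveurs AS
        SELECT * FROM Customer WHERE TYPE IN ('gros', 'moyen');
""")

# Updating the view itself fails.
error_message = None
try:
    conn.execute("UPDATE bons_buveurs SET TYPE = 'gros' WHERE NB = 2")
except sqlite3.Error as exc:
    error_message = str(exc)
print(error_message)

# Updating the underlying table works, and the view shows the new value.
conn.execute("UPDATE Customer SET TYPE = 'gros' WHERE NB = 2")
type_in_view = conn.execute(
    "SELECT TYPE FROM bons_buveurs WHERE NB = 2").fetchone()[0]
print(type_in_view)
```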

__2.1__ Create a table 'RBB' with the same schema as 'bons_buveurs' which contains the tuples selected from 'bons_buveurs'

In [23]:
%%sql
DROP TABLE IF EXISTS RBB;
CREATE TABLE RBB AS 
SELECT * FROM bons_buveurs

Done.


[]

In [24]:
# Test
%sql SELECT * FROM RBB;

Done.


ID_CUSTOMER,NB,NOM,PRENOM,TYPE
2,13,Barthes,Roland,moyen
3,16,Balzac,Honore de,moyen
4,18,Celine,Louis Ferdinand,gros
5,20,Chateaubriand,Francois-Rene de,moyen
9,27,Dumas,Alexandre,gros
11,32,Eluard,Paul,moyen
13,35,Fromentin,Eugene,gros
16,39,Montesquieu,,gros
17,42,Goethe,Johann Wolfgang von,moyen
18,43,Musset,Alfred de,gros


__2.2__ Update the table you used to create 'bons_buveurs': Change the 'type' to 'gros' if the total of quantities bought is over 100.

Find the instances to update (schema may be different from the one in your table)

In [27]:
%%sql
SELECT c.ID_CUSTOMER, c.NB, c.NOM, c.PRENOM, c.TYPE FROM Customer c
INNER JOIN Buy b ON b.NB=c.NB
GROUP BY c.NB
HAVING SUM(b.QTE) > 100

Done.


ID_CUSTOMER,NB,NOM,PRENOM,TYPE
58,2,Artaud,Antonin,moyen
73,5,Audiberti,Jacques,petit
64,8,Aragon,Louis,gros
71,9,Ajar,Emile,petit
59,44,Gide,Andre,petit


Update instances

In [30]:
%%sql
UPDATE Customer
SET TYPE = 'gros'
WHERE ID_CUSTOMER IN (
    SELECT c.ID_CUSTOMER FROM Customer c
    INNER JOIN Buy b ON b.NB=c.NB
    GROUP BY c.NB
    HAVING SUM(b.QTE) > 100
);

5 rows affected.


[]

__2.3__ Compare the content of _table_ 'RBB' and the _view_ 'bons_buveurs' after the update. What differences do you see? Explain

In [33]:
%%sql 
SELECT * FROM bons_buveurs
EXCEPT
SELECT * FROM RBB

Done.


ID_CUSTOMER,NB,NOM,PRENOM,TYPE
58,2,Artaud,Antonin,gros
59,44,Gide,Andre,gros
71,9,Ajar,Emile,gros
73,5,Audiberti,Jacques,gros
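
The difference can be reproduced on a small scale. In the sketch below (toy rows, same table and view names as above), `RBB` is a copy of the rows taken when the table was created, while the view is re-evaluated against `Customer` each time it is queried.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (NB INTEGER PRIMARY KEY, NOM TEXT, TYPE TEXT);
    INSERT INTO Customer VALUES (2, 'Artaud', 'moyen'), (5, 'Audiberti', 'petit');
    CREATE VIEW bons_buveurs AS
        SELECT * FROM Customer WHERE TYPE IN ('gros', 'moyen');
    -- Snapshot: copies the rows the view returns right now.
    CREATE TABLE RBB AS SELECT * FROM bons_buveurs;
    -- Later change to the base table.
    UPDATE Customer SET TYPE = 'gros';
""")

# Rows the view returns now that the snapshot does not contain:
# Artaud changed from 'moyen' to 'gros', Audiberti entered the view.
diff = conn.execute("""
    SELECT * FROM bons_buveurs
    EXCEPT
    SELECT * FROM RBB
    ORDER BY NB
""").fetchall()
print(diff)
```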


__2.4__ Create a table 'RBA' with the same schema as 'buveurs_asec' which contains the tuples selected from 'buveurs_asec'

In [38]:
%%sql
DROP TABLE IF EXISTS RBA;
CREATE TABLE RBA AS 
SELECT * FROM buveurs_asec;

Done.
Done.


[]

In [39]:
# Test
%sql SELECT * FROM RBA

Done.


NOM,PRENOM,NB,TYPE
Breton,Andre,11,petit
Barthes,Roland,13,moyen
Balzac,Honore de,16,moyen
Celine,Louis Ferdinand,18,gros
Chateaubriand,Francois-Rene de,20,moyen
Corbiere,Tristan,21,petit
Corneille,Pierre,23,petit
Char,Rene,25,petit
Dumas,Alexandre,27,gros
Fournier,Alain,29,petit


__2.5__ Insert a tuple (101, 'your last name', 'your first name', 'your type of purchases(petit, moyen, or gros)') to 'RBA'

In [41]:
%%sql
INSERT INTO RBA (NB, NOM, PRENOM, TYPE)
VALUES (101, 'Ferreira', 'Remi', 'moyen');

1 rows affected.


[]

In [42]:
# Test
%sql SELECT * FROM RBA

Done.


NOM,PRENOM,NB,TYPE
Breton,Andre,11,petit
Barthes,Roland,13,moyen
Balzac,Honore de,16,moyen
Celine,Louis Ferdinand,18,gros
Chateaubriand,Francois-Rene de,20,moyen
Corbiere,Tristan,21,petit
Corneille,Pierre,23,petit
Char,Rene,25,petit
Dumas,Alexandre,27,gros
Fournier,Alain,29,petit


__2.6__ Compare the content of _table_ 'RBA' and the _view_ 'buveurs_asec'. What differences do you see? Explain

In [47]:
%%sql 
SELECT * FROM RBA
EXCEPT
SELECT * FROM buveurs_asec

Done.


NOM,PRENOM,NB,TYPE
Ferreira,Remi,101,moyen


___
# PART III: Design the database schema for posts in a Social Network

In this section you need to design the database schema for a social network app of a new startup:

The new social network will contain users, where each user will have a name, a nickname, an email, date of birth, and an address (Street, City, State, Country, Postal Code). Users can be friends of other users, and can publish posts. Each post can contain a text, date and attachment. Posts can be either original posts or replies so the app needs to handle both scenarios. When users log in, the app needs to display the posts of their friends.


__3.1__ Write and explain the design of the relations of your database

__3.2__ Write a view to retrieve the posts to display when a user logs in. Consider that some users may have a lot of friends and you need to limit the number of posts to display. How would you select relevant posts to display first? What kind of information would you use/add in the database for this purpose? Explain your answer.

__Note:__ Limiting the number of posts just by count is too simplistic, the user could be missing something interesting to him/her.

Answer:
A view to show the top 50 posts for a user could be:
```sql
CREATE VIEW myview AS
SELECT post.TEXT, post.Date, post.attachment FROM POST post
INNER JOIN Publish publish ON publish.ID_POST=post.ID_POST
INNER JOIN Friends friends ON friends.ID_friends=publish.ID_user
-- ID_myuser is a placeholder for the id of the logged-in user
WHERE friends.ID_user=ID_myuser
ORDER BY post.Date DESC, publish.isOriginal DESC
LIMIT 50
```

With this view, "myuser" gets the 50 most recent posts from friends, with original posts ranked ahead of replies published on the same date. These two variables are our criteria for ordering the posts.

Other information that could be added to the database for this purpose includes the number of likes/dislikes, or the number of times a post has been shown to users, so that these posts can be shown more often than the others.

Another good practice would be to count how many replies each friend's post has received (even when the friend is the one replying) and to give higher priority to posts with a large number of replies.
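
Since a view cannot take a parameter, one option is to keep the viewer's id as a column of the view and filter on it at query time. The following is only a sketch under assumptions: the table names `POST`, `Publish` and `Friends` follow the answer above, while the `Users` table, the column names `body` and `posted_on`, the view name `friend_posts`, the helper `feed` and all rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified schema for illustration.
    CREATE TABLE Users (ID_user INTEGER PRIMARY KEY, nickname TEXT);
    CREATE TABLE Friends (ID_user INTEGER, ID_friends INTEGER,
                          PRIMARY KEY (ID_user, ID_friends));
    CREATE TABLE POST (ID_POST INTEGER PRIMARY KEY, body TEXT,
                       posted_on TEXT, attachment TEXT);
    CREATE TABLE Publish (ID_POST INTEGER, ID_user INTEGER, isOriginal INTEGER);

    INSERT INTO Users VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO Friends VALUES (1, 2);          -- ann follows bob only
    INSERT INTO POST VALUES (10, 'old post', '2020-01-01', NULL),
                            (11, 'new post', '2020-02-01', NULL),
                            (12, 'not a friend', '2020-03-01', NULL);
    INSERT INTO Publish VALUES (10, 2, 1), (11, 2, 1), (12, 3, 1);

    -- The viewer's id is exposed as a column instead of being hard-coded.
    CREATE VIEW friend_posts AS
        SELECT f.ID_user AS viewer, p.body, p.posted_on, p.attachment,
               pub.isOriginal
        FROM POST p
        JOIN Publish pub ON pub.ID_POST = p.ID_POST
        JOIN Friends f ON f.ID_friends = pub.ID_user;
""")

def feed(conn, user_id, limit=50):
    """Return the newest posts by friends of user_id, originals first on ties."""
    return conn.execute(
        "SELECT body, posted_on FROM friend_posts WHERE viewer = ? "
        "ORDER BY posted_on DESC, isOriginal DESC LIMIT ?",
        (user_id, limit),
    ).fetchall()

feed_for_ann = feed(conn, 1)
print(feed_for_ann)
```

Counters such as likes or reply counts could be added as further columns of the view and used in the `ORDER BY` in the same way.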
