# Reboot - SQL Advanced

Tonight we will work with the `blog.sqlite` database, which is available at this URL:  
`https://wagon-public-datasets.s3.amazonaws.com/sql_databases/blog.sqlite`

Let's have a look at our directory structure:

In [19]:
!tree

.
├── data
│   └── blog.sqlite
├── README.md
└── recap.ipynb

1 directory, 3 files


## 1. Schema ERD

❓ Open `data/blog.sqlite` in DBeaver, explore the schema, and draw it on [kitt.lewagon.com/db](https://kitt.lewagon.com/db).

_TODO: Double click this cell and **paste** a screenshot of the schema for future reference_.

---
## 2. Most liked posts

Complete the code to get **the 3 most liked posts**:

In [20]:
import sqlite3

conn = sqlite3.connect("data/blog.sqlite")
c = conn.cursor()

# Count the likes per post and keep the three posts with the most likes
query = """
SELECT posts.title,
       posts.id,
       COUNT(posts.id) AS TotalLikes
FROM likes
JOIN posts ON likes.post_id = posts.id
GROUP BY post_id
ORDER BY TotalLikes DESC
LIMIT 3
"""
c.execute(query)
result = c.fetchall()

print(result)


[('Half imagine another.', 143, 84), ('Side foot leader popular.', 83, 82), ('Area paper whatever mean.', 99, 81)]


---

### Pretty Print using _pandas_

The readability of our `print()` output is not great.

Next week, we will introduce [pandas](https://pandas.pydata.org/), which will vastly improve the UX of our data exploration in notebooks.

Execute the following cell to load the `pandas` library:

In [21]:
import pandas as pd

Then try the previous `query` again, delegating the job of fetching results + displaying them to the `read_sql_query` function of `pandas`:

In [22]:
pd.read_sql_query(query, conn)


| | title | id | TotalLikes |
|---|---|---|---|
| 0 | Half imagine another. | 143 | 84 |
| 1 | Side foot leader popular. | 83 | 82 |
| 2 | Area paper whatever mean. | 99 | 81 |


---
## 3. Find the three users who 'liked' the most

In [25]:
query = """
SELECT users.first_name,
       users.last_name,
       COUNT(likes.id) AS TotalLikes
FROM likes
JOIN users ON likes.user_id = users.id
GROUP BY users.id
ORDER BY TotalLikes DESC
LIMIT 3
"""

c.execute(query)
result = c.fetchall()
print(result)
pd.read_sql_query(query, conn)


[('Michael', 'Allen', 236), ('Donna', 'Ramirez', 233), ('Hayley', 'Williams', 227)]


| | first_name | last_name | TotalLikes |
|---|---|---|---|
| 0 | Michael | Allen | 236 |
| 1 | Donna | Ramirez | 233 |
| 2 | Hayley | Williams | 227 |


---
## 4. Find the most liked author

In [26]:
query = """
SELECT users.first_name,
       users.last_name,
       COUNT(likes.id) AS TotalLikes
FROM users
JOIN posts ON posts.user_id = users.id
JOIN likes ON likes.post_id = posts.id
GROUP BY users.id
ORDER BY TotalLikes DESC
LIMIT 1
"""
c.execute(query)
result = c.fetchall()
print(result)
pd.read_sql_query(query, conn)

[('Teresa', 'Moore', 647)]


| | first_name | last_name | TotalLikes |
|---|---|---|---|
| 0 | Teresa | Moore | 647 |


---
## 5. Who are the authors of the 3 most liked posts?

In [27]:
query = """
SELECT posts.title,
       users.first_name,
       users.last_name,
       COUNT(likes.id) AS TotalLikes
FROM posts
JOIN users ON users.id = posts.user_id
JOIN likes ON likes.post_id = posts.id
GROUP BY likes.post_id
ORDER BY TotalLikes DESC
LIMIT 3
"""
c.execute(query)
result = c.fetchall()
print(result)
pd.read_sql_query(query, conn)

[('Half imagine another.', 'Melissa', 'Henry', 84), ('Side foot leader popular.', 'Cynthia', 'Raymond', 82), ('Area paper whatever mean.', 'Alexander', 'Cook', 81)]


| | title | first_name | last_name | TotalLikes |
|---|---|---|---|---|
| 0 | Half imagine another. | Melissa | Henry | 84 |
| 1 | Side foot leader popular. | Cynthia | Raymond | 82 |
| 2 | Area paper whatever mean. | Alexander | Cook | 81 |


---
## 6. How many people liked at least one post?

In [None]:
query = """

"""

pd.read_sql_query(query, conn)
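
One possible approach, sketched on a toy in-memory database rather than on `blog.sqlite` itself: count the distinct `user_id` values in `likes`. The toy table only mirrors the `likes.user_id` and `likes.post_id` columns used in the queries above, the rows are made up for illustration, and the connection is named `toy` (a hypothetical name) so it does not overwrite the notebook's `conn`.

```python
import sqlite3

# Toy in-memory database (made-up rows); only the column names are taken
# from the queries earlier in this notebook.
toy = sqlite3.connect(":memory:")
toy.executescript("""
CREATE TABLE likes (id INTEGER PRIMARY KEY, user_id INTEGER, post_id INTEGER);
INSERT INTO likes (user_id, post_id) VALUES
    (1, 10), (1, 11), (2, 10), (3, 12), (3, 13);
""")

# COUNT(DISTINCT ...) counts each user once, however many likes they gave.
distinct_likers = toy.execute(
    "SELECT COUNT(DISTINCT user_id) AS likers FROM likes"
).fetchone()[0]

print(distinct_likers)
toy.close()
```

If this matches your schema, the same `SELECT` can be pasted into the `query` string above and run against `conn`.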

---
## 7. Compute the cumulative number of likes per day

In [None]:
query = """

"""

pd.read_sql_query(query, conn)
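
A running total is a natural fit for a window function: aggregate the likes per day first, then apply `SUM(...) OVER (ORDER BY day)`. The sketch below runs on a toy in-memory database with made-up rows. It assumes the `likes` table stores a timestamp in a column called `created_at`; that name is a guess, so check the schema in DBeaver and adapt it. Window functions also require SQLite 3.25 or newer.

```python
import sqlite3

# Toy in-memory database. ASSUMPTION: the timestamp column is called
# `created_at`; the real column name in blog.sqlite may differ.
toy = sqlite3.connect(":memory:")
toy.executescript("""
CREATE TABLE likes (
    id INTEGER PRIMARY KEY,
    user_id INTEGER,
    post_id INTEGER,
    created_at TEXT
);
INSERT INTO likes (user_id, post_id, created_at) VALUES
    (1, 10, '2023-01-01 09:00:00'),
    (2, 10, '2023-01-01 18:30:00'),
    (3, 11, '2023-01-02 08:15:00'),
    (1, 12, '2023-01-04 12:00:00'),
    (2, 12, '2023-01-04 12:05:00'),
    (3, 12, '2023-01-04 23:59:00');
""")

cumulative_query = """
WITH daily AS (
    -- Step 1: number of likes on each calendar day
    SELECT DATE(created_at) AS day,
           COUNT(*) AS daily_likes
    FROM likes
    GROUP BY DATE(created_at)
)
-- Step 2: running total of the daily counts, in date order
SELECT day,
       daily_likes,
       SUM(daily_likes) OVER (ORDER BY day) AS cumulative_likes
FROM daily
ORDER BY day
"""

cumulative_rows = toy.execute(cumulative_query).fetchall()
for row in cumulative_rows:
    print(row)
toy.close()
```

Days without any like (here `2023-01-03`) simply do not appear in the result; filling them in would need a separate calendar table or a recursive CTE.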

---
## 8. (Optional) Who's the biggest fan (or fans) of each author?

The biggest fan (or fans) of an author is the user (or users) who liked that author's posts the most. For example, if two fans are tied at 20 likes each for the same author, both should be returned, alongside their like count and the author in question.
<br><br>
<details>
    <summary>💡 Click for Hint</summary>
    You might need to use <code>WITH</code>
</details>


In [None]:
query = """

"""

pd.read_sql_query(query, conn)
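
Following the `WITH` hint, one way to structure this is in two steps: a first CTE counts likes per (author, fan) pair, and a second one ranks the fans within each author using `RANK()`, which gives tied fans the same rank. The sketch below uses a toy in-memory database with made-up users and likes. The table and column names follow the queries earlier in this notebook, while the CTE names `fan_counts` and `ranked` are illustrative. It requires SQLite 3.25 or newer for window functions.

```python
import sqlite3

# Toy in-memory database with made-up rows; column names follow the
# queries used earlier in this notebook.
toy = sqlite3.connect(":memory:")
toy.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT);
CREATE TABLE likes (id INTEGER PRIMARY KEY, user_id INTEGER, post_id INTEGER);

INSERT INTO users (id, first_name, last_name) VALUES
    (1, 'Ada', 'A'), (2, 'Bob', 'B'), (3, 'Cy', 'C');
-- Ada wrote posts 10 and 11, Bob wrote posts 12 and 13
INSERT INTO posts (id, user_id, title) VALUES
    (10, 1, 'p10'), (11, 1, 'p11'), (12, 2, 'p12'), (13, 2, 'p13');
INSERT INTO likes (user_id, post_id) VALUES
    (2, 10), (2, 11),   -- Bob liked Ada's posts twice
    (3, 10), (3, 11),   -- Cy liked Ada's posts twice (tie with Bob)
    (1, 12),            -- Ada liked Bob's posts once
    (3, 12), (3, 13);   -- Cy liked Bob's posts twice
""")

fans_query = """
WITH fan_counts AS (
    -- Likes given by each fan to each author
    SELECT posts.user_id AS author_id,
           likes.user_id AS fan_id,
           COUNT(*) AS like_count
    FROM likes
    JOIN posts ON likes.post_id = posts.id
    GROUP BY posts.user_id, likes.user_id
),
ranked AS (
    -- RANK() assigns the same rank to tied fans of the same author
    SELECT author_id,
           fan_id,
           like_count,
           RANK() OVER (
               PARTITION BY author_id
               ORDER BY like_count DESC
           ) AS fan_rank
    FROM fan_counts
)
SELECT authors.first_name AS author,
       fans.first_name AS fan,
       ranked.like_count
FROM ranked
JOIN users AS authors ON authors.id = ranked.author_id
JOIN users AS fans ON fans.id = ranked.fan_id
WHERE ranked.fan_rank = 1
ORDER BY author, fan
"""

biggest_fans = toy.execute(fans_query).fetchall()
for row in biggest_fans:
    print(row)
toy.close()
```

In the toy data, Bob and Cy are tied as Ada's biggest fans, so both are returned. Using `ROW_NUMBER()` instead of `RANK()` would keep only one of them, which is not what the exercise asks for.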