# Discussion 01

Running the cells below creates the SQLite database `sqlite:///books.db`.

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import make_books

## Discussion Questions

Database schema:

```
books (book_id, book_name, year, genre, author_id, times_trending)
authors (author_id, author_name, publishing_company, debut_year)
awards (award_id, book_id, award_name)
```

Run the cells below to start a SQL connection.

In [3]:
%load_ext sql

In [4]:
%sql sqlite:///books.db

## 1) SQL Review
Suppose the full database schema is as follows (same as above):
- books (book_id, book_name, year, genre, author_id, times_trending)
- authors (author_id, author_name, publishing_company, debut_year)
- awards (award_id, book_id, award_name)

Below is how you will be writing your SQL queries with ["SQL magic"](https://www.datacamp.com/tutorial/sql-interface-within-jupyterlab).

In [5]:
%%sql
-- your code here --
;

### SQL Styling Convention
For consistency (and as best practice), here are a few guidelines to follow when writing SQL:
- SQL keywords should be in all caps
- Table names should preferably be snake_case and plural (e.g., student_records), or at least consistent across the entire system
    - In general, prefer all-lowercase identifiers (column names, table names, aliases) unless capitalization is needed for clarity
- Whitespace: break lines wherever reasonable and use two-space indentation
- **Always** end queries with a semicolon
- Similar attributes should follow similar naming conventions, e.g., created_at, updated_at, etc. For now, default attributes to snake_case
- All timestamps should have timezones
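
As a rough illustration of these guidelines, the sketch below formats one query with capitalized keywords, lowercase snake_case identifiers, two-space indentation, and a closing semicolon. It uses Python's built-in `sqlite3` module with a throwaway in-memory table instead of the SQL magic so that it runs on its own; the single row is made up for the example and is not taken from `books.db`.

```python
import sqlite3

# Throwaway in-memory database; the row below is illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE books (book_id INTEGER, book_name TEXT, year INTEGER, genre TEXT)"
)
conn.execute("INSERT INTO books VALUES (1, 'Example Book', 1999, 'mystery')")

# Keywords in caps, lowercase identifiers, two-space indentation for the
# continuation line, and a terminating semicolon.
styled_query = """
SELECT b.book_name, b.year
FROM books AS b
WHERE b.genre = 'mystery'
  AND b.year < 2000
ORDER BY b.year;
"""
styled_rows = conn.execute(styled_query).fetchall()
print(styled_rows)
```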

Write a SQL query to accomplish each task below:


**1. Select only the book names and years from the books table.**

In [6]:
%%sql
  SELECT book_name, year FROM books
;

book_name,year
And Then There Were None,1939
Endless Night,1967
Teaching to Transgress,1994
Chanwoo's Photobook,2017
The Dispossessed,1974
The Ministry for the Future,2020


**2. Find the names of all authors whose debut year is in the 21st century (defined as after 2000).**

In [7]:
%%sql
  SELECT author_name
  FROM authors
  WHERE debut_year >= 2001
;

author_name
Chanwoo


**3. Find the names of all authors who released a sci-fi genre book in 1974 that won an award.**

My Solution:

In [20]:
%%sql
  SELECT authors.author_name
  FROM authors
    INNER JOIN books ON authors.author_id = books.author_id
    INNER JOIN awards ON books.book_id = awards.book_id
  WHERE genre = 'sci-fi' AND year = 1974
;

author_name
Ursula K. Le Guin
Ursula K. Le Guin


Staff solution:

In [22]:
%%sql
  SELECT au.author_name
  FROM books AS b
    INNER JOIN awards AS aw ON b.book_id = aw.book_id
    INNER JOIN authors AS au ON b.author_id = au.author_id
  WHERE b.genre = 'sci-fi' AND b.year = 1974
;

author_name
Ursula K. Le Guin
Ursula K. Le Guin
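
Both queries list the author twice, most likely because the join produces one output row per matching award. The sketch below reproduces that effect on hypothetical rows (the author, book, and award names are invented) using Python's built-in `sqlite3` module, and shows that `SELECT DISTINCT` is one way to collapse the duplicates if each author should appear only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: one sci-fi book from 1974 that has two awards.
conn.executescript("""
CREATE TABLE authors (author_id INTEGER, author_name TEXT);
CREATE TABLE books (book_id INTEGER, book_name TEXT, year INTEGER,
                    genre TEXT, author_id INTEGER);
CREATE TABLE awards (award_id INTEGER, book_id INTEGER, award_name TEXT);
INSERT INTO authors VALUES (1, 'Author A');
INSERT INTO books VALUES (10, 'Book X', 1974, 'sci-fi', 1);
INSERT INTO awards VALUES (100, 10, 'Award One'), (101, 10, 'Award Two');
""")

# Shared FROM/WHERE part of the query, same shape as the staff solution.
join_sql = """
FROM books AS b
  INNER JOIN awards AS aw ON b.book_id = aw.book_id
  INNER JOIN authors AS au ON b.author_id = au.author_id
WHERE b.genre = 'sci-fi' AND b.year = 1974;
"""

# One row per (book, award) pair, so the author repeats.
with_duplicates = conn.execute("SELECT au.author_name" + join_sql).fetchall()
# DISTINCT keeps each author name once.
deduplicated = conn.execute("SELECT DISTINCT au.author_name" + join_sql).fetchall()
print(with_duplicates, deduplicated)
```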


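Something to note: this also works in SQLite. The first ON clause references an attribute of books before the books table has been joined, which SQLite accepts for inner joins as long as books is joined later in the same FROM clause. This is a SQLite leniency rather than a general consequence of SQL being declarative; most other database engines raise an error here, so it is safer to join a table before referencing it.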

In [26]:
%%sql
  SELECT authors.author_name
  FROM authors
    INNER JOIN awards ON books.book_id = awards.book_id
    INNER JOIN books ON authors.author_id = books.author_id
  WHERE genre = 'sci-fi' AND year = 1974
;

author_name
Ursula K. Le Guin
Ursula K. Le Guin
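
A small sketch of this behavior, using Python's built-in `sqlite3` module and hypothetical rows: in SQLite, an inner-join ON clause that mentions a table joined later in the same FROM clause returns the same rows as the conventional ordering. This is specific to SQLite's handling of inner joins and should not be assumed for other databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: only Author A's book has an award.
conn.executescript("""
CREATE TABLE authors (author_id INTEGER, author_name TEXT);
CREATE TABLE books (book_id INTEGER, author_id INTEGER);
CREATE TABLE awards (award_id INTEGER, book_id INTEGER);
INSERT INTO authors VALUES (1, 'Author A'), (2, 'Author B');
INSERT INTO books VALUES (10, 1), (20, 2);
INSERT INTO awards VALUES (100, 10);
""")

# Conventional order: books is joined before it is referenced.
usual_order = conn.execute("""
SELECT authors.author_name
FROM authors
  INNER JOIN books ON authors.author_id = books.author_id
  INNER JOIN awards ON books.book_id = awards.book_id;
""").fetchall()

# Forward reference: the first ON clause mentions books before it is joined.
forward_reference = conn.execute("""
SELECT authors.author_name
FROM authors
  INNER JOIN awards ON books.book_id = awards.book_id
  INNER JOIN books ON authors.author_id = books.author_id;
""").fetchall()
print(usual_order, forward_reference)
```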


## 2) SQL Aggregation Review
In some cases, we need to combine multiple rows into a single result via aggregation. We use the same schema as in Section 1.

**4. Write a query that calculates the total number of books written by author Agatha Christie.**

In [29]:
%%sql
  SELECT COUNT(*) AS total_books
  FROM books AS b
    INNER JOIN authors AS a ON a.author_id = b.author_id
  GROUP BY author_name
  HAVING author_name = 'Agatha Christie'
;

total_books
2
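
An alternative worth comparing is to filter with WHERE before aggregating instead of grouping and then filtering with HAVING. The sketch below runs both forms on hypothetical rows (author and book names are invented) with Python's built-in `sqlite3` module; it is an illustration, not the staff solution.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: Author A has two books, Author B has one.
conn.executescript("""
CREATE TABLE authors (author_id INTEGER, author_name TEXT);
CREATE TABLE books (book_id INTEGER, book_name TEXT, author_id INTEGER);
INSERT INTO authors VALUES (1, 'Author A'), (2, 'Author B');
INSERT INTO books VALUES (10, 'Book X', 1), (11, 'Book Y', 1), (20, 'Book Z', 2);
""")

# Filter rows first with WHERE, then count what is left; no GROUP BY needed.
where_count = conn.execute("""
SELECT COUNT(*) AS total_books
FROM books AS b
  INNER JOIN authors AS a ON a.author_id = b.author_id
WHERE a.author_name = 'Author A';
""").fetchone()[0]

# Group every author, then keep a single group with HAVING.
having_count = conn.execute("""
SELECT COUNT(*) AS total_books
FROM books AS b
  INNER JOIN authors AS a ON a.author_id = b.author_id
GROUP BY a.author_name
HAVING a.author_name = 'Author A';
""").fetchone()[0]
print(where_count, having_count)
```

One difference to keep in mind: without GROUP BY, `COUNT(*)` always returns exactly one row (0 when nothing matches), whereas the GROUP BY/HAVING form returns no rows at all for an author with no books.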


**5. The group-by construct lets us compute aggregates for different chunks of a relation. Find the total number of books released per genre. Don’t include genres with a count less than 2. (Hint: HAVING)**

My Solution:

In [32]:
%%sql
  SELECT genre, COUNT(*) AS total_count
  FROM books
  GROUP BY genre
  HAVING total_count >= 2
;

genre,total_count
mystery,2
nonfiction,2
sci-fi,2


Staff solution:

In [34]:
%%sql
  SELECT genre, COUNT(*) AS total_count
  FROM books
  GROUP BY genre
  HAVING COUNT(*) >= 2
;

genre,total_count
mystery,2
nonfiction,2
sci-fi,2
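
In `books.db` every genre happens to have exactly two books, so the HAVING filter removes nothing. The sketch below uses hypothetical rows, including a genre with a single book, to show the filter acting on the aggregated count. It relies on Python's built-in `sqlite3` module so that it is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: mystery x2, sci-fi x3, poetry x1.
conn.executescript("""
CREATE TABLE books (book_id INTEGER, genre TEXT);
INSERT INTO books VALUES
  (1, 'mystery'), (2, 'mystery'),
  (3, 'sci-fi'), (4, 'sci-fi'), (5, 'sci-fi'),
  (6, 'poetry');
""")

# Counts for every genre, with no group-level filter.
all_counts = conn.execute("""
SELECT genre, COUNT(*) AS total_count
FROM books
GROUP BY genre
ORDER BY genre;
""").fetchall()

# HAVING is evaluated after grouping, so it can test the aggregate itself.
filtered_counts = conn.execute("""
SELECT genre, COUNT(*) AS total_count
FROM books
GROUP BY genre
HAVING COUNT(*) >= 2
ORDER BY genre;
""").fetchall()
print(all_counts, filtered_counts)
```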


## 3) Optional Questions

**6. Find the names of the 5 books that trended the least, ordered from least to most. Break ties by book name in alphabetical order.**

In [None]:
%%sql
-- your code here --
;
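
One possible approach, sketched on hypothetical rows with Python's built-in `sqlite3` module: sort by `times_trending` ascending, use `book_name` as a secondary sort key to break ties, and keep the first five rows with LIMIT. The book names and counts below are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: Book B and Book C tie on times_trending.
conn.executescript("""
CREATE TABLE books (book_name TEXT, times_trending INTEGER);
INSERT INTO books VALUES
  ('Book A', 3), ('Book B', 1), ('Book C', 1), ('Book D', 0),
  ('Book E', 5), ('Book F', 2), ('Book G', 7);
""")

# Primary sort on the trending count, secondary sort on the name for ties.
least_trending = conn.execute("""
SELECT book_name
FROM books
ORDER BY times_trending ASC, book_name ASC
LIMIT 5;
""").fetchall()
print(least_trending)
```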

**7. *Challenge:* Write a query that computes the total times each author has had a trending book. (Hint: you will need to use a join for this as well.)**

In [None]:
%%sql
-- your code here --
;
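
A hedged sketch of one way to approach this, again on hypothetical rows with Python's built-in `sqlite3` module: join authors to their books, group by author, and sum `times_trending` within each group. This assumes "total times" means the sum of `times_trending` over an author's books.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: Author A's books trended 3 and 4 times, Author B's once.
conn.executescript("""
CREATE TABLE authors (author_id INTEGER, author_name TEXT);
CREATE TABLE books (book_id INTEGER, author_id INTEGER, times_trending INTEGER);
INSERT INTO authors VALUES (1, 'Author A'), (2, 'Author B');
INSERT INTO books VALUES (10, 1, 3), (11, 1, 4), (20, 2, 1);
""")

# Grouping by author_id as well keeps two authors with the same name apart.
trending_totals = conn.execute("""
SELECT a.author_name, SUM(b.times_trending) AS total_times_trending
FROM authors AS a
  INNER JOIN books AS b ON a.author_id = b.author_id
GROUP BY a.author_id, a.author_name
ORDER BY a.author_name;
""").fetchall()
print(trending_totals)
```

With an INNER JOIN, authors who have no books are left out of the result; a LEFT JOIN from authors would keep them, with a NULL sum.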