# SQL Part 3 Studio

Let's practice your SQL querying skills! For each question, work along in the notebook, adding your query and answering the questions.

## The Dataset

We will be using the Goodbooks-10k dataset from the exercises in the prep work. Feel free to reference your previous notebooks.

1. The dataset can be found here: [goodbooks-10k](https://www.kaggle.com/zygmunt/goodbooks-10k)
1. You can access `BooksDB` in the LaunchCode server.

## Business Issue

You are an employee at a small independent bookstore and you have been tasked with increasing sales.  You decide to use `BooksDB` to find books and themes to highlight in fun promotions throughout each month and/or season.  We will be starting with March and then you will have a chance to explore another month of your choosing.  We want to come up with a list of promotions to run each month.  If you are looking for ideas, here are some resources on different holidays:

- [https://www.calendarr.com/united-states/calendar-2022/](https://www.calendarr.com/united-states/calendar-2022/)
- [https://www.holidayinsights.com/moreholidays/](https://www.holidayinsights.com/moreholidays/)
    - Click on a month and it will take you to a more detailed page

## Part 1:  March - Women's History Month, National Pie Day (3/14), St. Patrick's Day (3/17), Season - Spring

### Event 1: Women's History Month

Highlight popular women writers based on ratings from `BooksDB` by writing a query that returns `tag_id`, the number of times each `tag_id` is used, and the `tag_name`. Use the `GROUP BY` and `HAVING` clauses to narrow your focus, and try multiple keywords, such as "woman" and "female".

In [None]:
-- Solution

SELECT TOP 10 t.tag_name, bt.tag_id, SUM(bt.count) AS total, b.authors
FROM BooksDB.dbo.book_tags AS bt
INNER JOIN BooksDB.dbo.tags AS t 
ON bt.tag_id = t.tag_id
LEFT JOIN BooksDB.dbo.books AS b  
ON bt.goodreads_book_id = b.book_id 
GROUP BY t.tag_name, bt.tag_id, b.authors
HAVING t.tag_name LIKE '%female%'
ORDER BY total DESC
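
The prompt suggests trying more than one keyword. A minimal sketch of how that could look, assuming the same `book_tags` and `tags` tables as the solution above; the extra `LIKE` patterns are illustrative choices, not part of the original solution, and the `books` join is dropped because the prompt only asks for tag-level columns:

```sql
-- Sketch: total tag usage for several keywords in one pass.
-- The patterns below are examples only; swap in other keywords as needed.
SELECT TOP 10 t.tag_name, bt.tag_id, SUM(bt.count) AS total
FROM BooksDB.dbo.book_tags AS bt
INNER JOIN BooksDB.dbo.tags AS t
    ON bt.tag_id = t.tag_id
GROUP BY t.tag_name, bt.tag_id
HAVING t.tag_name LIKE '%female%'
    OR t.tag_name LIKE '%woman%'
    OR t.tag_name LIKE '%women%'
ORDER BY total DESC;
```

Because `tag_name` is one of the `GROUP BY` columns, it can be filtered in `HAVING` without an aggregate.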


### Event 2: Choose another event from March/Spring

Write a query to return authors, titles, ratings, and `tag_id` that you would want to promote during your chosen event.

In [None]:
-- Solution

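-- Return authors, titles, ratings, and tag_id, as the prompt asks.
-- Note: '%spring' only matches tag names that end in "spring".
SELECT TOP 10 b.authors, b.title, b.average_rating, bt.tag_id, t.tag_name, SUM(bt.count) AS total
FROM BooksDB.dbo.book_tags AS bt
INNER JOIN BooksDB.dbo.tags AS t
ON bt.tag_id = t.tag_id
LEFT JOIN BooksDB.dbo.books AS b
ON bt.goodreads_book_id = b.book_id
GROUP BY b.authors, b.title, b.average_rating, bt.tag_id, t.tag_name
HAVING t.tag_name LIKE '%spring'
ORDER BY total DESC;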

Record your thoughts about why you wrote the query the way you did.

## Part 2: Choose Another Month

Choose another month and plan at least 2 events / promotions and answer the following questions:
1. Which month did you choose?
1. What 2 events / promotions are you highlighting?

1. I chose the month of October.

Event 1: Cookbook Launch Day

A "Fall into a Good Book" theme, with a promotion built around all things culinary.

Promotion: "Fall Flavors: Celebrate Cookbook Launch Day with a new recipe book!"

Event 2: Frankenstein Friday & Spooky Reads

Instead of just a general Halloween theme, we can have a special focus on Frankenstein Friday! We can build a display around Mary Shelley's classic and other gothic horror novels.

Promotion: "Get Your Frankenstein On! 20% off all classic horror novels."


## Part 3: Summarize your Work

For each event, write at least one query that joins any two tables in `BooksDB` to support your choice, and record your thoughts as to why you used that particular query. At least one of your queries needs to include a `HAVING` clause.

In [None]:
-- Event 1 Query

SELECT TOP 10
    b.title,
    b.authors,
    SUM(bt.count) AS popularity_score
FROM  BooksDB.dbo.books AS b
JOIN  BooksDB.dbo.book_tags AS bt ON b.book_id = bt.goodreads_book_id
WHERE  bt.tag_id IN (SELECT tag_id FROM BooksDB.dbo.tags WHERE tag_name IN ('cookbooks', 'food', 'cooking'))
GROUP BY  b.title, b.authors
ORDER BY  popularity_score DESC;

### Summarize Event 1

I joined books and book_tags to link book titles to their tag usage data.

Instead of joining the tags table directly, I used a subquery (SELECT tag_id FROM ...) in the WHERE clause. This is one way to filter the book_tags table down to only the tags we care about ('cookbooks', 'food', 'cooking').

I used SUM(bt.count) to create a popularity_score. The count column in book_tags indicates how many users applied a specific tag to a book. Summing this count across relevant tags gives a strong indicator of a book's popularity within the culinary genre.

ORDER BY popularity_score DESC sorts the list to show the most popular books at the top, which are perfect candidates for a promotional display.
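
For comparison, the subquery can also be written as a direct join to `tags`. The sketch below assumes each `tag_id` appears only once in `tags`; if that holds, it should return the same rows as the Event 1 query.

```sql
-- Sketch: the Event 1 query with the subquery replaced by a join to tags.
-- Assumes tags has one row per tag_id, so the join adds no duplicate rows.
SELECT TOP 10
    b.title,
    b.authors,
    SUM(bt.count) AS popularity_score
FROM BooksDB.dbo.books AS b
JOIN BooksDB.dbo.book_tags AS bt ON b.book_id = bt.goodreads_book_id
JOIN BooksDB.dbo.tags AS t ON bt.tag_id = t.tag_id
WHERE t.tag_name IN ('cookbooks', 'food', 'cooking')
GROUP BY b.title, b.authors
ORDER BY popularity_score DESC;
```

The join form also makes `t.tag_name` available in the `SELECT` list if you want to see which tags contributed to each score.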

In [None]:
-- Event 2 Query

SELECT TOP 10
    b.title,
    b.authors,
    b.average_rating
FROM  BooksDB.dbo.books AS b
JOIN  BooksDB.dbo.book_tags AS bt ON b.book_id = bt.goodreads_book_id
JOIN  BooksDB.dbo.tags AS t ON bt.tag_id = t.tag_id
WHERE  t.tag_name IN ('classics', 'horror', 'gothic', 'frankenstein')
GROUP BY  b.title, b.authors, b.average_rating
HAVING  COUNT(t.tag_name) > 1
ORDER BY  b.average_rating DESC;

### Summarize Event 2

I joined books with book_tags and tags to connect book details (like title and rating) with their descriptive tags.

The WHERE clause first filters for any book tagged with at least one of the relevant keywords ('classics', 'horror', 'gothic', 'frankenstein').

The HAVING COUNT(t.tag_name) > 1 clause is the key part. It narrows the results to books that carry more than one of these tags. A book like Frankenstein, if it is tagged as 'classics', 'horror', and 'gothic', passes the filter, while a book that is tagged only as 'horror' is excluded. This gives us a curated list for the promotion.

Finally, ORDER BY average_rating DESC ensures we highlight the best-rated books first.
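
One caveat with `COUNT(t.tag_name)`: it counts rows, not distinct tags. If `book_tags` ever holds repeated rows for the same book and tag (an assumption, not something checked here), a book with a single matching tag could still pass the `> 1` test. A hedged variant that counts distinct tag names instead:

```sql
-- Sketch: the Event 2 query with COUNT(DISTINCT ...) so that repeated
-- (book, tag) rows, if any exist, cannot satisfy the "> 1" condition.
SELECT TOP 10
    b.title,
    b.authors,
    b.average_rating
FROM BooksDB.dbo.books AS b
JOIN BooksDB.dbo.book_tags AS bt ON b.book_id = bt.goodreads_book_id
JOIN BooksDB.dbo.tags AS t ON bt.tag_id = t.tag_id
WHERE t.tag_name IN ('classics', 'horror', 'gothic', 'frankenstein')
GROUP BY b.title, b.authors, b.average_rating
HAVING COUNT(DISTINCT t.tag_name) > 1
ORDER BY b.average_rating DESC;
```

If the table has no repeated rows, both versions return the same list.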