### FIRST STEPS WITH DATABRICKS: Lakehouse SQL Exercises
#### This is an exercise for you to try your hand at. It is simple and straightforward: fill in the gaps (_____), then run your solution to see if you are right.


Author: TheDataLead Databricks Workshop
Description: Medallion architecture exercises using SQL (Bronze → Silver → Gold)

#### Run this to list the files in the S3 bucket

In [0]:
%sql

List 's3://thedatalead-data-engineering-projects-ingestion/workshop-demo/'

#### Run this to set the active catalog and schema

In [0]:
%sql

use catalog demo_catalog;
use library_schema
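
A quick way to confirm that both statements took effect is to ask the session which catalog and schema it is currently using. This is a small sketch that relies on the built-in `current_catalog()` and `current_schema()` functions; run it in a `%sql` cell.

```sql
-- Should return demo_catalog and library_schema if the USE statements above succeeded.
SELECT
  current_catalog() AS active_catalog,
  current_schema()  AS active_schema;
```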

### Bronze Layer: Ingest raw CSV files. Use the correct file path so that each file loads into the right table

#### Load books.csv

In [0]:
%sql

DROP TABLE IF EXISTS books_bronze;
CREATE ______
USING CSV
OPTIONS (
  path = '--------',
  header = 'true',
  inferSchema = 'true'
);
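
Once the blanks are filled in and the cell has run, it can help to check what was actually loaded before moving on to the silver layer. A minimal sketch, assuming the table was created under the name `books_bronze` used in the `DROP TABLE` statement above:

```sql
-- Show the column names and the types chosen by inferSchema.
DESCRIBE TABLE books_bronze;

-- Preview a few raw rows to confirm the header row was parsed as column names.
SELECT * FROM books_bronze LIMIT 5;
```

If the first data row shows up as column names, or every column is a string, the `header` or `inferSchema` option is the first thing to re-check.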

### Silver Layer: Clean and enrich the data
#### Your task is to clean the books table: handle NULLs and cast publish_date to DATE. Fill in the missing COALESCE defaults as needed.
#### Format: COALESCE( ________, 'unknown') AS ______,

In [0]:
%sql

CREATE OR REPLACE TABLE books_silver AS
SELECT
  isbn,
  title,
  author,
  genre,
  CAST(publish_date AS DATE) AS publish_date,
  CAST(pages AS INT) AS pages
FROM books_bronze;
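
To see what the COALESCE pattern does before applying it to the real table, here is a self-contained sketch on two toy rows. The values and the `sample_books` alias are made up for illustration and are not taken from books.csv.

```sql
-- Toy rows only: COALESCE returns its first non-NULL argument.
SELECT
  COALESCE(genre, 'unknown') AS genre,        -- NULL becomes 'unknown'
  CAST(publish_date AS DATE) AS publish_date  -- string becomes a DATE
FROM VALUES
  ('Fiction',            '2020-01-15'),
  (CAST(NULL AS STRING), '2019-07-01')
  AS sample_books(genre, publish_date);
```

The second row comes back with `genre = 'unknown'`, which is the behaviour the exercise asks you to reproduce for the nullable columns of `books_bronze`.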

#### Clean staff table and cast hire_date to DATE

In [0]:
%sql

CREATE OR REPLACE TABLE staff_silver AS
SELECT
  staff_id,
  name,
  role,
  CAST(hire_date AS DATE) AS hire_date
FROM staff_bronze;
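
The gold-layer queries read a `borrowers_silver` table with `book_isbn` and `return_delay_days` columns, so the borrowers data needs a computed delay. The sketch below shows one way to derive it with `DATEDIFF(endDate, startDate)`. It assumes the delay is the number of days between a due date and a return date; the `due_date` and `return_date` column names, the `sample_loans` alias, and the rows are hypothetical, so adapt them to the actual columns in your borrowers file.

```sql
-- Toy rows only: DATEDIFF returns the number of days from the second argument to the first.
SELECT
  book_isbn,
  DATEDIFF(CAST(return_date AS DATE), CAST(due_date AS DATE)) AS return_delay_days
FROM VALUES
  ('isbn-001', '2024-01-10', '2024-01-13'),  -- returned 3 days late
  ('isbn-002', '2024-02-01', '2024-02-01')   -- returned on time, delay is 0
  AS sample_loans(book_isbn, due_date, return_date);
```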

### Gold Layer: Business Metrics
#### Most Borrowed Books
#### This SQL statement creates most_borrowed_books_gold by joining the borrowers_silver and books_silver tables

In [0]:
%sql

CREATE OR REPLACE TABLE most_borrowed_books_gold AS
SELECT
  b.title,
  COUNT(*) AS borrow_count
FROM borrowers_silver br
____ books_silver b ON br.book_isbn = b.isbn
GROUP BY b.title
ORDER BY borrow_count DESC;
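
Row order is not guaranteed when a table is read back, so it is safer not to rely on the `ORDER BY` stored with the table. A minimal sketch of reading the ten most borrowed titles, assuming the gold table above was created successfully:

```sql
-- Re-apply the ordering at query time and keep only the top 10 titles.
SELECT title, borrow_count
FROM most_borrowed_books_gold
ORDER BY borrow_count DESC
LIMIT 10;
```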

#### Average Delay by Genre
#### This SQL statement creates delay_by_genre_gold by joining the borrowers_silver and books_silver tables

In [0]:
%sql

CREATE OR REPLACE TABLE delay_by_genre_gold AS
SELECT
  b.genre,
  ROUND(AVG(br.return_delay_days), 2) AS avg_return_delay_days
FROM borrowers_silver br
____ books_silver b ON br.book_isbn = b.isbn
GROUP BY b.genre
ORDER BY avg_return_delay_days DESC;
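
When reading the averages, keep in mind that `AVG` skips NULL values rather than treating them as zero, so loans without a recorded delay do not pull the average down. The self-contained sketch below illustrates this on toy rows; the `sample_delays` alias and the values are made up for illustration.

```sql
-- Toy rows only: the NULL delay is ignored, so Fiction averages (3 + 4) / 2 = 3.5.
SELECT
  genre,
  ROUND(AVG(return_delay_days), 2) AS avg_return_delay_days
FROM VALUES
  ('Fiction', 3),
  ('Fiction', CAST(NULL AS INT)),  -- skipped by AVG, not counted as 0
  ('Fiction', 4),
  ('History', 0)
  AS sample_delays(genre, return_delay_days)
GROUP BY genre
ORDER BY avg_return_delay_days DESC;
```

If missing delays should count as zero instead, wrapping the column in `COALESCE(return_delay_days, 0)` before averaging is one option.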