# FIRST STEPS WITH DATABRICKS: FROM ZERO TO LAKEHOUSE

Welcome to the hands-on workshop for learning **Databricks Lakehouse Architecture** using the analogy of a **Library System**. This end-to-end demo showcases how to build scalable data pipelines using Bronze, Silver, and Gold layers.

## Project Summary

This project simulates a public library managing:
- Books
- Borrowers
- Library staff

It walks participants through:
1. **Raw data ingestion (Bronze)**
2. **Data cleaning & enrichment (Silver)**
3. **Analytics (Gold)**
4. **Insights (Visualization Dashboard)**


![image info](/Volumes/demo_catalog/library_schema/library_volume/workshop.png)

## The Data Model 
#### A data model defines the structure and relationships within your data. It serves as a blueprint for how data is stored, understood, and used across the organization. The diagram below gives a general view of our datasets.
![image info](/Volumes/demo_catalog/library_schema/library_volume/datasets/Screenshot 2025-06-24 113300.png)


## Dashboard visualization
#### A dashboard translates data into actionable insights through visual storytelling. It allows decision-makers to monitor, explore, and act on key metrics in real time.
![image info](/Volumes/demo_catalog/library_schema/library_volume/Screenshot 2025-06-24 120009.png)




#### Bronze Layer - Raw Ingestion (SQL Version)
#### Dataset: Library Borrowing System

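**Explanation**:
This SQL script registers the raw CSV files as bronze tables. Think of it as taking messy papers and filing them in the library basement. The data is not cleaned or transformed; it is ingested as-is. Note that tables declared with `USING CSV` read the CSV files in place; they are not converted to Delta format.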

**Key points**:
- Preserves all original data, even if messy
- Perfect for auditing (you keep the originals)
- Fastest way to get data into the system
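
Keeping the originals does not rule out a Delta copy. The `USING CSV` tables defined later in this notebook read the CSV files in place, so if Delta-format bronze tables are wanted, one option is to copy each table with `CREATE OR REPLACE TABLE ... AS SELECT`. The sketch below is written in Python on the assumption that Python is the notebook's default language (the `%sql` magic suggests the default is not SQL). It only builds and prints the statements. The `_bronze_delta` table names and the helper `delta_copy_statement` are hypothetical and not part of the workshop.

```python
# Dataset names used by the ingestion cells in this notebook.
DATASETS = ["books", "borrowers", "staff"]


def delta_copy_statement(dataset, suffix="_bronze_delta"):
    """Build a CTAS statement that copies a CSV-backed bronze table into Delta.

    The target table name (dataset + suffix) is a hypothetical naming choice.
    """
    source = f"{dataset}_bronze"
    target = f"{dataset}{suffix}"
    return f"CREATE OR REPLACE TABLE {target} USING DELTA AS SELECT * FROM {source}"


for dataset in DATASETS:
    # In a Databricks notebook this could be: spark.sql(delta_copy_statement(dataset))
    print(delta_copy_statement(dataset) + ";")
```

Each statement can only succeed after the corresponding bronze table has been created, so a cell like this would belong at the end of the ingestion steps.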

##### We start by listing the datasets at their storage location. The listing shows the path of each file, which we need in order to ingest the data.


#### We are listing the contents of the S3 bucket

In [0]:
%sql

List 's3://thedatalead-data-engineering-projects-ingestion/workshop-demo/'

##### Next, we set the catalog and schema so that the tables we create land in the selected schema. Running the cell below does that.

In [0]:
%sql
use catalog demo_catalog;
use library_schema
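
Once the catalog and schema are set, an unqualified table name such as `books_bronze` resolves to the three-level name `demo_catalog.library_schema.books_bronze`. A small Python sketch of that mapping follows. The helper name `qualified_name` is hypothetical, and Python is assumed to be the notebook's default language.

```python
# Catalog and schema selected by the USE statements in this notebook.
CATALOG = "demo_catalog"
SCHEMA = "library_schema"


def qualified_name(table, catalog=CATALOG, schema=SCHEMA):
    """Return the catalog.schema.table name that an unqualified table name refers to."""
    return f"{catalog}.{schema}.{table}"


print(qualified_name("books_bronze"))  # demo_catalog.library_schema.books_bronze
```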

##### Based on the datasets found at the path listed above, we are going to ingest three datasets and create three separate tables. The tables are:
  - books_bronze
  - borrowers_bronze
  - staff_bronze
##### After this, we use a SQL SELECT statement to query each table and get a view of the dataset.
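
The three ingestion cells in this notebook follow one pattern and differ only in the dataset name. As a minimal sketch, assuming Python is the notebook's default language, the same statements can be generated in a loop. The helper name `bronze_statements` is hypothetical; the bucket path and the CSV options are copied from the ingestion cells. The sketch only prints the statements.

```python
# Base path of the raw CSV files, as used in the ingestion cells of this notebook.
BASE_PATH = "s3://thedatalead-data-engineering-projects-ingestion/workshop-demo/"

# Each dataset name maps to <name>.csv and to the table <name>_bronze.
DATASETS = ["books", "borrowers", "staff"]


def bronze_statements(dataset, base_path=BASE_PATH):
    """Return the DROP and CREATE statements for one bronze table (hypothetical helper)."""
    table = f"{dataset}_bronze"
    drop_stmt = f"DROP TABLE IF EXISTS {table}"
    create_stmt = (
        f"CREATE TABLE {table}\n"
        "USING CSV\n"
        "OPTIONS (\n"
        f"  path '{base_path}{dataset}.csv',\n"
        "  header 'true',\n"
        "  inferSchema 'true'\n"
        ")"
    )
    return [drop_stmt, create_stmt]


for dataset in DATASETS:
    for statement in bronze_statements(dataset):
        # In a Databricks notebook this could be: spark.sql(statement)
        print(statement + ";")
```

Writing the cells out one by one, as the workshop does, is easier to follow for a first pass; a loop like this becomes more attractive as the number of datasets grows.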

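#### We are ingesting books.csv into books_bronze. We drop the table if it exists and create a new one. The option `header 'true'` treats the first row as column names, and `inferSchema 'true'` asks Spark to detect the column types from the data.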

In [0]:
%sql

DROP TABLE IF EXISTS books_bronze;
CREATE TABLE books_bronze
USING CSV
OPTIONS (
  path 's3://thedatalead-data-engineering-projects-ingestion/workshop-demo/books.csv',
  header 'true',
  inferSchema 'true'
);


##### We view the table to have a look at our data, limiting the result to just 10 rows.

In [0]:
%sql
Select * from books_bronze limit 10

#### We are ingesting borrowers.csv into borrowers_bronze. We drop the table if it exists and create a new one.

In [0]:
%sql

DROP TABLE IF EXISTS borrowers_bronze;
CREATE TABLE borrowers_bronze
USING CSV
OPTIONS (
  path 's3://thedatalead-data-engineering-projects-ingestion/workshop-demo/borrowers.csv',
  header 'true',
  inferSchema 'true'
);

#### We view the table to have a look at our data, limiting the result to just 10 rows.

In [0]:
%sql

select * from borrowers_bronze limit 10

#### We are ingesting staff.csv into staff_bronze. We drop the table if it exists and create a new one.

In [0]:
%sql

DROP TABLE IF EXISTS staff_bronze;
CREATE TABLE staff_bronze
USING CSV
OPTIONS (
  path 's3://thedatalead-data-engineering-projects-ingestion/workshop-demo/staff.csv',
  header 'true',
  inferSchema 'true'
);


#### We view the table to have a look at our data, limiting the result to just 10 rows.

In [0]:
%sql

select * from staff_bronze limit 10

### We are done with ingestion. We should now see three bronze tables in our library_schema.
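
One way to confirm the result is to compare the tables found in the schema against the three expected names. The sketch below assumes Python is the notebook's default language. The helper `missing_bronze_tables` and the example listing are hypothetical; in a Databricks notebook the listing would come from a `SHOW TABLES` query on `library_schema` rather than a hard-coded list.

```python
# Tables the ingestion cells in this notebook are expected to create.
EXPECTED_TABLES = {"books_bronze", "borrowers_bronze", "staff_bronze"}


def missing_bronze_tables(found_tables, expected=EXPECTED_TABLES):
    """Return the expected bronze tables that are absent from found_tables, sorted by name."""
    found = {name.lower() for name in found_tables}
    return sorted(expected - found)


# Hypothetical listing standing in for the table names returned by SHOW TABLES.
example_listing = ["books_bronze", "borrowers_bronze"]

print(missing_bronze_tables(example_listing))  # ['staff_bronze']
```

An empty list means all three bronze tables are present.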