# FIRST STEPS WITH DATABRICKS: FROM ZERO TO LAKEHOUSE


#### Bronze Layer - Raw Ingestion (SQL Version)
#### Dataset: Library Borrowing System

**Explanation**:

In this SQL notebook, we load raw CSV data into Delta tables. You can think of it as taking messy papers and filing them in the library basement. The data is not cleaned or transformed; it is ingested as-is.

**Key points**:
- Preserves all original data, even if messy
- Perfect for auditing (you keep the originals)
- Fastest way to get data into the system

##### We will start by listing the datasets at the location where they are stored. This gives you the individual path to each dataset, which we need in order to load the data. Run the cell below to do this.

In [0]:
%run "./setup"

##### We have listed the contents of the path where the datasets are located. We save the paths in variables so we can access them _easily_.

In [0]:
books_path = "dbfs:/Volumes/demo_catalog/library_schema/library_volume/books.csv"
borrowers_path = "dbfs:/Volumes/demo_catalog/library_schema/library_volume/borrowers.csv"
staff_path = "dbfs:/Volumes/demo_catalog/library_schema/library_volume/staff.csv"
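
The three paths share the same volume directory, so they can also be derived from one base path. The following is a minimal sketch of that idea; `VOLUME_DIR` and `dataset_path` are hypothetical names introduced here for illustration and are not part of the original notebook.

```python
# Shared volume directory, copied from the paths used in this notebook.
VOLUME_DIR = "dbfs:/Volumes/demo_catalog/library_schema/library_volume"


def dataset_path(name: str, base: str = VOLUME_DIR) -> str:
    """Return the full path of `<name>.csv` inside the volume directory."""
    return f"{base.rstrip('/')}/{name}.csv"


# Same three variables as above, built from the shared base path.
books_path = dataset_path("books")
borrowers_path = dataset_path("borrowers")
staff_path = dataset_path("staff")
```

If the volume is ever moved, only `VOLUME_DIR` needs to change.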



##### Based on the datasets in the path listed, we are going to ingest three datasets and create three separate tables. The tables are:
  - books_bronze
  - borrowers_bronze
  - staff_bronze
##### After this, we use a SQL SELECT statement to query each table and have a look at its data.
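
Because the read and write steps are identical for all three datasets, they can be wrapped in one helper and run in a loop. This is only a sketch under stated assumptions: `ingest_csv_to_bronze` and `BRONZE_SOURCES` are hypothetical names, and `spark` is assumed to be the session that Databricks predefines in every notebook. The cells that follow do the same work one table at a time.

```python
def ingest_csv_to_bronze(spark, csv_path: str, table_name: str):
    """Read a CSV with a header row as-is and overwrite a Delta table with it."""
    df = (spark.read
        .option("header", True)        # first line holds the column names
        .option("inferSchema", True)   # let Spark guess the column types
        .csv(csv_path)
    )
    df.write.mode("overwrite").format("delta").saveAsTable(table_name)
    return df


# Table name -> source file, mirroring the three datasets in this notebook.
BRONZE_SOURCES = {
    "books_bronze": "dbfs:/Volumes/demo_catalog/library_schema/library_volume/books.csv",
    "borrowers_bronze": "dbfs:/Volumes/demo_catalog/library_schema/library_volume/borrowers.csv",
    "staff_bronze": "dbfs:/Volumes/demo_catalog/library_schema/library_volume/staff.csv",
}

# Only run the loop where a Spark session exists (for example, in a notebook).
if "spark" in globals():
    for table_name, csv_path in BRONZE_SOURCES.items():
        ingest_csv_to_bronze(spark, csv_path, table_name)
```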

##### We read books.csv into a DataFrame named books_bronze, then use that DataFrame to create a Delta table that is also called books_bronze.

In [0]:
books_bronze = (spark.read
    .option("header", True)
    .option("inferSchema", True)
    .csv(books_path)
)

books_bronze.write.mode("overwrite").format("delta").saveAsTable("books_bronze")


##### We view the table to have a look at our data, limiting the output to just 10 rows.

In [0]:
%sql
SELECT * FROM books_bronze LIMIT 10
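
The same preview can also be produced from a Python cell instead of `%sql`. A minimal sketch, assuming the notebook's predefined `spark` session; `preview_table` is a hypothetical helper name used only for illustration.

```python
def preview_table(spark, table_name: str, n: int = 10):
    """Return a DataFrame holding at most `n` rows of the given table."""
    return spark.table(table_name).limit(n)


# Only run where a Spark session exists (for example, in a notebook).
if "spark" in globals():
    preview_table(spark, "books_bronze").show()
```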

##### We read borrowers.csv into a DataFrame named borrowers_bronze, then use that DataFrame to create a Delta table that is also called borrowers_bronze.

In [0]:
borrowers_bronze = (spark.read
    .option("header", True)
    .option("inferSchema", True)
    .csv(borrowers_path)
)
borrowers_bronze.write.mode("overwrite").format("delta").saveAsTable("borrowers_bronze")


##### We view the table to have a look at our data, limiting the output to just 10 rows.

In [0]:
%sql
SELECT * FROM borrowers_bronze LIMIT 10

##### We read staff.csv into a DataFrame named staff_bronze, then use that DataFrame to create a Delta table that is also called staff_bronze.

In [0]:
staff_bronze = (spark.read
    .option("header", True)
    .option("inferSchema", True)
    .csv(staff_path)
)
staff_bronze.write.mode("overwrite").format("delta").saveAsTable("staff_bronze")



##### We view the table to have a look at our data, limiting the output to just 10 rows.

In [0]:
%sql
SELECT * FROM staff_bronze LIMIT 10
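
As a quick sanity check after ingestion, the row count of each bronze table can be printed in one go. This is a sketch rather than part of the original notebook: `bronze_row_counts` and `BRONZE_TABLES` are hypothetical names, and `spark` is assumed to be the notebook's predefined session.

```python
def bronze_row_counts(spark, table_names):
    """Map each table name to the number of rows it currently holds."""
    return {name: spark.table(name).count() for name in table_names}


# The three tables created in this notebook.
BRONZE_TABLES = ["books_bronze", "borrowers_bronze", "staff_bronze"]

# Only run where a Spark session exists (for example, in a notebook).
if "spark" in globals():
    for name, rows in bronze_row_counts(spark, BRONZE_TABLES).items():
        print(f"{name}: {rows} rows")
```

A count of zero for any table would suggest the CSV path was wrong or the file was empty.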