<div style="text-align: center; line-height: 0; padding-top: 9px;">
  <img src="https://blog.scholarnest.com/wp-content/uploads/2023/03/scholarnest-academy-scaled.jpg" alt="ScholarNest Academy" style="width: 1400px">
</div>

#####Clean up previous runs

In [0]:
%run ../utils/cleanup

#####Setup

In [0]:
%sql
CREATE CATALOG IF NOT EXISTS dev;
CREATE DATABASE IF NOT EXISTS dev.demo_db;

#####1. Verify you can access the invoices directory

In [0]:
%fs ls /mnt/files/dataset_ch8/invoices

#####2. Create a Delta table to ingest the invoices data

In [0]:
%sql
CREATE TABLE IF NOT EXISTS dev.demo_db.invoices_raw(
  InvoiceNo string,
  StockCode string,
  Description string,
  Quantity int,
  InvoiceDate timestamp,
  UnitPrice double,
  CustomerID string)

#####3. Ingest data into the invoices_raw table using the COPY INTO command

######3.1 Ingest using the COPY INTO command

In [0]:
%sql
COPY INTO dev.demo_db.invoices_raw
FROM (SELECT InvoiceNo::string, StockCode::string, Description::string, Quantity::int,
        to_timestamp(InvoiceDate,'d-M-y H.m') InvoiceDate, UnitPrice::double, CustomerID::string
      FROM "/mnt/files/dataset_ch8/invoices")
FILEFORMAT = CSV
FORMAT_OPTIONS ('header' = 'true')
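
The `to_timestamp` pattern `'d-M-y H.m'` reads day, month, and year separated by hyphens, then hour and minute separated by a period. As a rough illustration outside Spark, the sketch below parses a value of that shape with Python's `datetime.strptime`. The sample string is hypothetical (it is not taken from the dataset), and the `strptime` directives are only an approximate equivalent of the Spark pattern.

```python
from datetime import datetime

# Hypothetical sample value shaped like 'd-M-y H.m' (not taken from the dataset).
sample_invoice_date = "1-12-2010 8.26"

# Approximate strptime equivalent of the Spark pattern 'd-M-y H.m'.
PYTHON_FORMAT = "%d-%m-%Y %H.%M"

parsed = datetime.strptime(sample_invoice_date, PYTHON_FORMAT)
print(parsed.isoformat())  # 2010-12-01T08:26:00
```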

######3.2 Check the records after ingestion

In [0]:
%sql
SELECT * FROM dev.demo_db.invoices_raw

######3.3 COPY INTO is idempotent

In [0]:
%sql
COPY INTO dev.demo_db.invoices_raw
FROM (SELECT InvoiceNo::string, StockCode::string, Description::string, Quantity::int,
        to_timestamp(InvoiceDate,'d-M-y H.m') InvoiceDate, UnitPrice::double, CustomerID::string
      FROM "/mnt/files/dataset_ch8/invoices")
FILEFORMAT = CSV
FORMAT_OPTIONS ('header' = 'true')
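
COPY INTO skips source files that it has already loaded into the target table, so rerunning the same command adds no duplicate rows. The toy model below sketches that idea in plain Python by remembering which file paths were ingested. It is an illustration only, not how Databricks implements the command; the class name, file names, and rows are all hypothetical.

```python
class CopyIntoSimulator:
    """Toy model of an idempotent file loader; not the Databricks implementation."""

    def __init__(self):
        self.loaded_files = set()  # paths already ingested
        self.rows = []             # stand-in for the target table

    def copy_into(self, files):
        """Load only files not seen before; return the number of rows added."""
        added = 0
        for path, file_rows in files.items():
            if path in self.loaded_files:
                continue  # skip files loaded by an earlier run
            self.rows.extend(file_rows)
            self.loaded_files.add(path)
            added += len(file_rows)
        return added


# Hypothetical file names and rows, for illustration only.
source = {
    "invoices_a.csv": [("536365", 6)],
    "invoices_b.csv": [("536366", 2), ("536367", 3)],
}

table = CopyIntoSimulator()
first_run = table.copy_into(source)   # loads both files
second_run = table.copy_into(source)  # finds nothing new
print(first_run, second_run, len(table.rows))  # 3 0 3
```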

######3.4 Check the records after ingestion

In [0]:
%sql
SELECT * FROM dev.demo_db.invoices_raw

#####4. Add a new file with an additional column to the invoices directory

In [0]:
%fs cp /mnt/files/dataset_ch8/invoices_2021.csv /mnt/files/dataset_ch8/invoices

#####5. Your ingestion code will not break, but it will silently ignore the additional column
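
The COPY INTO statement in step 3 names each column in its SELECT list, so a column that appears only in the new file is never referenced and never loaded. The sketch below mimics that behavior in plain Python with `csv.DictReader`. The CSV content is hypothetical, and only a subset of the invoice columns is used for brevity.

```python
import csv
import io

# Hypothetical CSV content with an extra Country column.
new_file = io.StringIO(
    "InvoiceNo,StockCode,Quantity,Country\n"
    "536365,85123A,6,United Kingdom\n"
)

# Columns the ingestion code selects explicitly (subset used for brevity).
selected_columns = ["InvoiceNo", "StockCode", "Quantity"]

# Keep only the selected columns; Country is dropped without any error.
rows = [
    {name: record[name] for name in selected_columns}
    for record in csv.DictReader(new_file)
]
print(rows)
```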

######5.1 Alter the table to manually accommodate the additional field

In [0]:
%sql
ALTER TABLE dev.demo_db.invoices_raw ADD COLUMNS (Country string)
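
Rows that were loaded before the column was added read back with NULL in `Country`. The sketch below models that outcome with plain Python dictionaries, using `None` for NULL. The rows and the `add_column` helper are hypothetical, and the real table change is a metadata update rather than a rewrite of the rows.

```python
# Hypothetical rows already in the table before the ALTER TABLE.
existing_rows = [
    {"InvoiceNo": "536365", "Quantity": 6},
    {"InvoiceNo": "536366", "Quantity": 2},
]


def add_column(rows, column_name):
    """Return copies of the rows with the new column set to None (SQL NULL)."""
    return [{**row, column_name: None} for row in rows]


rows_after_alter = add_column(existing_rows, "Country")
print(rows_after_alter[0])
```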

######5.2 Modify your ingestion code to manually accommodate the additional field

In [0]:
%sql
COPY INTO dev.demo_db.invoices_raw
FROM (SELECT InvoiceNo::string, StockCode::string, Description::string, Quantity::int,
        to_timestamp(InvoiceDate,'d-M-y H.m') InvoiceDate, UnitPrice::double, CustomerID::string, Country::string
      FROM "/mnt/files/dataset_ch8/invoices")
FILEFORMAT = CSV
FORMAT_OPTIONS ('header' = 'true', 'mergeSchema' = 'true')

######5.3 Check the records after ingestion

In [0]:
%sql
SELECT * FROM dev.demo_db.invoices_raw

&copy; 2021-2023 ScholarNest Technologies Pvt. Ltd. All rights reserved.<br/>
Apache, Apache Spark, Spark and the Spark logo are trademarks of the <a href="https://www.apache.org/">Apache Software Foundation</a>.<br/>
Databricks, Databricks Cloud and the Databricks logo are trademarks of the <a href="https://www.databricks.com/">Databricks Inc</a>.<br/>
<br/>
<a href="https://www.scholarnest.com/privacy/">Privacy Policy</a> | 
<a href="https://www.scholarnest.com/terms/">Terms of Use</a> | <a href="https://www.scholarnest.com/contact/">Contact Us</a>