### Learning Objectives
1. Use `MERGE INTO` to perform upserts, inserts, and deletes
2. Apply `MERGE INTO` with schema enforcement to manage data integrity
3. Apply `MERGE INTO` with schema evolution to evolve target tables


### Set Up
- Create a table `main_users_target`. This is the table we will update with incoming data, so every row starts with status `current`.
- Create a table `main_users_source`. This is the incoming data.
  - We will delete Pana, update Samarth's email, and add Owen and Eva.
- Create a table `new_users_source`. We will use this to test schema evolution.


In [0]:
CREATE OR REPLACE TABLE workspace.data_engineering_labs_00.main_users_target AS
SELECT *
FROM VALUES
  (1, 'Pana', 'pana@example.com', DATE '2024-01-10', 'current'),
  (2, 'Samarth', 'samarth@example.com', DATE '2024-02-15', 'current'),
  (3, 'Zebi', 'zebi@example.com', DATE '2024-03-01', 'current'),
  (4, 'Mark', 'mark@example.com', DATE '2024-03-20', 'current')
AS t (
  id,
  first_name,
  email,
  sign_up_date,
  status
);

--- DISPLAY
SELECT * FROM workspace.data_engineering_labs_00.main_users_target;


In [0]:
CREATE OR REPLACE TABLE workspace.data_engineering_labs_00.main_users_source AS
SELECT *
FROM VALUES
  (1, 'Pana', 'pana@example.com', DATE '2024-01-10', 'delete'),
  (2, 'Samarth', 'samarth_gomez@example.com', DATE '2024-02-15', 'update'),
  (5, 'Owen', 'owen@example.com', DATE '2024-03-01', 'new'),
  (6, 'Eva', 'eva@example.com', DATE '2024-03-20', 'new')
AS t (
  id,
  first_name,
  email,
  sign_up_date,
  status
);

--- DISPLAY
SELECT * FROM workspace.data_engineering_labs_00.main_users_source;

In [0]:
CREATE OR REPLACE TABLE workspace.data_engineering_labs_00.new_users_source AS
SELECT *
FROM VALUES
  (7, 'Kris', 'kris@example.com', DATE '2024-01-10', 'new', 'USA'),
  (8, 'Mod', 'mod@example.com', DATE '2024-02-15', 'new', 'Singapore'),
  (9, 'Cri', 'cri@example.com', DATE '2024-03-01', 'new', 'Australia')
AS t (
  id,
  first_name,
  email,
  sign_up_date,
  status,
  country
);

--- DISPLAY
SELECT * FROM workspace.data_engineering_labs_00.new_users_source;

### 1. MERGE INTO
  - We will delete Pana, update Samarth's email, and add Owen and Eva.
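
Before running the merge, it can help to preview which action each source row would trigger. The query below is a minimal sketch, not part of the original lab: it mirrors the `WHEN` conditions of the merge with a `LEFT JOIN`, and the column name `planned_action` is made up for illustration. Run it before the `MERGE INTO` cell, because the result changes once the target has been updated.

```sql
-- Preview only: nothing is written to either table.
-- planned_action is a hypothetical label that mirrors the WHEN clauses of the MERGE.
SELECT
  source.id,
  source.first_name,
  source.status,
  CASE
    WHEN target.id IS NOT NULL AND source.status = 'update' THEN 'UPDATE'
    WHEN target.id IS NOT NULL AND source.status = 'delete' THEN 'DELETE'
    WHEN target.id IS NULL AND source.status = 'new' THEN 'INSERT'
    ELSE 'NO ACTION'
  END AS planned_action
FROM workspace.data_engineering_labs_00.main_users_source AS source
LEFT JOIN workspace.data_engineering_labs_00.main_users_target AS target
  ON target.id = source.id
ORDER BY source.id;
```

With the set-up data above, this should list Pana as `DELETE`, Samarth as `UPDATE`, and Owen and Eva as `INSERT`.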


In [0]:
MERGE INTO workspace.data_engineering_labs_00.main_users_target target -- alias as target
USING workspace.data_engineering_labs_00.main_users_source source -- alias as source
ON target.id = source.id

WHEN MATCHED AND source.status = 'update' THEN  
	UPDATE SET
		target.email = source.email,
		target.status = source.status
		
WHEN MATCHED AND source.status = 'delete' THEN  
	DELETE
	
WHEN NOT MATCHED AND source.status = 'new' THEN 
	INSERT( id, first_name, email, sign_up_date, status) 
	VALUES (source.id, source.first_name, source.email, source.sign_up_date, source.status);

In [0]:
SELECT * FROM workspace.data_engineering_labs_00.main_users_target target 
ORDER BY id

In [0]:
-- We can see history
DESCRIBE HISTORY workspace.data_engineering_labs_00.main_users_target

In [0]:
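-- We can query previous versions using time travel. To see the original data, use the version just before the MERGE in DESCRIBE HISTORY
-- (version 1 in this run; on a freshly created table the CREATE is version 0 and the MERGE is version 1)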
SELECT * FROM workspace.data_engineering_labs_00.main_users_target VERSION AS OF 1
ORDER BY id
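
Time travel also makes it possible to see what a merge removed or changed. The query below is a rough sketch that compares an earlier version with the current table using `EXCEPT`. It assumes, as in the cell above, that version 1 holds the pre-merge data; check `DESCRIBE HISTORY` and adjust the number if your history differs. Both sides must have the same columns, so run it before the schema evolution step later in this notebook.

```sql
-- Rows that existed in the earlier version but are deleted or changed in the current table.
-- Assumption: version 1 is the state just before the MERGE.
SELECT * FROM workspace.data_engineering_labs_00.main_users_target VERSION AS OF 1
EXCEPT
SELECT * FROM workspace.data_engineering_labs_00.main_users_target
ORDER BY id;
```

If the version assumption holds, the result should contain Pana's deleted row and Samarth's row with the old email.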

### 2. Schema Enforcement
- If the source data evolves and adds new columns, we can use `MERGE WITH SCHEMA EVOLUTION INTO` to update the schema of the target table.
- Without it, the merge errors out and the target schema is enforced.

In [0]:
--- View the updated target table. Verify that it holds the latest values
SELECT * FROM workspace.data_engineering_labs_00.main_users_target
ORDER BY id

In [0]:
--- View the new users source. Verify that it has an extra column: country
SELECT * FROM workspace.data_engineering_labs_00.new_users_source
ORDER BY id
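
Instead of comparing the two result sets by eye, the schemas can be compared directly. This is a sketch that assumes the `workspace` catalog is a Unity Catalog catalog and therefore exposes `information_schema.columns`; if it is not, running `DESCRIBE TABLE` on each table separately gives the same information.

```sql
-- List the columns of both tables side by side.
-- Assumption: workspace is a Unity Catalog catalog with an information_schema.
SELECT table_name, ordinal_position, column_name, data_type
FROM workspace.information_schema.columns
WHERE table_schema = 'data_engineering_labs_00'
  AND table_name IN ('main_users_target', 'new_users_source')
ORDER BY table_name, ordinal_position;
```

At this point `new_users_source` should show one more column, `country`, than `main_users_target`.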

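- Running a plain `MERGE INTO` errors out because the schemas do not match: the `INSERT` references `country`, which does not exist in the target.
- We must use `MERGE WITH SCHEMA EVOLUTION INTO` instead.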

In [0]:
MERGE INTO workspace.data_engineering_labs_00.main_users_target target -- alias as target
USING workspace.data_engineering_labs_00.new_users_source source -- alias as source
ON target.id = source.id

WHEN MATCHED AND source.status = 'update' THEN  
	UPDATE SET
		target.email = source.email,
		target.status = source.status
		
WHEN MATCHED AND source.status = 'delete' THEN  
	DELETE
	
WHEN NOT MATCHED AND source.status = 'new' THEN 
	INSERT( id, first_name, email, sign_up_date, status, country)
	VALUES (source.id, source.first_name, source.email, source.sign_up_date, source.status, source.country);

In [0]:
-- Use schema evolution
MERGE WITH SCHEMA EVOLUTION INTO workspace.data_engineering_labs_00.main_users_target target -- alias as target
USING workspace.data_engineering_labs_00.new_users_source source -- alias as source
ON target.id = source.id

WHEN MATCHED AND source.status = 'update' THEN  
	UPDATE SET
		target.email = source.email,
		target.status = source.status
		
WHEN MATCHED AND source.status = 'delete' THEN  
	DELETE
	
WHEN NOT MATCHED AND source.status = 'new' THEN 
	INSERT( id, first_name, email, sign_up_date, status, country)
	VALUES (source.id, source.first_name, source.email, source.sign_up_date, source.status, source.country);
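
To confirm that the schema evolved, query the target again. This is a small verification sketch that assumes the schema evolution merge above completed successfully.

```sql
-- The target should now have a country column.
-- Rows that existed before the merge have NULL in it; the newly inserted rows carry a value.
SELECT id, first_name, status, country
FROM workspace.data_engineering_labs_00.main_users_target
ORDER BY id;
```

Kris, Mod, and Cri (ids 7 to 9) should appear with their countries, while the earlier rows show `NULL` for `country`.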