# Constraints

Delta table constraints are rules that control the values written to a Delta table by inserts and updates. They help ensure data integrity and consistency by enforcing data quality rules.

Enforced constraints ensure that the quality and integrity of data added to a table is automatically verified.

## Check Constraint

A CHECK constraint requires that a specified boolean expression be true for each input row.

You manage CHECK constraints using the **ALTER TABLE ADD CONSTRAINT** and **ALTER TABLE DROP CONSTRAINT** commands. 

_ALTER TABLE ADD CONSTRAINT_ verifies that all existing rows satisfy the constraint before adding it to the table.
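
Because existing rows are verified when the constraint is added, it can help to count offending rows beforehand. The following is a minimal sketch, not part of the Delta API: `violation_query` and `count_violations` are hypothetical helper names, and `spark` is assumed to be an active SparkSession. It counts rows where the expression is false; rows where the expression evaluates to NULL are not counted and may need a separate check.

```python
def violation_query(table_name: str, condition: str) -> str:
    """Build a query that counts rows for which `condition` is false."""
    return f"SELECT COUNT(*) AS violations FROM {table_name} WHERE NOT ({condition})"


def count_violations(spark, table_name: str, condition: str) -> int:
    """Run the query on a SparkSession passed in by the caller (assumption)."""
    return spark.sql(violation_query(table_name, condition)).first()["violations"]


# Example (requires an active SparkSession named `spark`):
# count_violations(spark, "demo.device", "action IN ('Open', 'Close')")
```

A result of zero suggests the constraint can be added without being rejected by rows where the expression is false.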

In [None]:
# Generate dummy data

from pyspark.sql.functions import expr, lit, col
from pyspark.sql.types import *
from datetime import date


df = spark.range(5) \
  .selectExpr("if(id % 2 = 0, 'Open', 'Close') as action") \
  .withColumn("date", expr("cast(concat('2023-06-', cast(rand(5) * 30 as int) + 1) as date)")) \
  .withColumn("device_id", expr("cast(rand(5) * 100 as int)"))


delta_table_name = 'demo.device'  # schema-qualified, matching the SQL cells

spark.sql("DROP TABLE IF EXISTS " + delta_table_name)
df.write.format("delta").mode("overwrite").saveAsTable(delta_table_name)

In [None]:
%%sql   
SHOW TBLPROPERTIES demo.device

In [None]:
%%sql
ALTER TABLE demo.device ADD CONSTRAINT ck_action CHECK (action IN ('Open', 'Close'))

In [None]:
%%sql   
SHOW TBLPROPERTIES demo.device 
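
Once added, a CHECK constraint is listed among the table properties under a key of the form `delta.constraints.<name>`. The sketch below filters those entries out of a plain dictionary of properties. `check_constraints` is a hypothetical helper name, and the commented usage assumes an active SparkSession named `spark`.

```python
CONSTRAINT_PREFIX = "delta.constraints."


def check_constraints(properties: dict) -> dict:
    """Return {constraint_name: expression} for the CHECK constraints in `properties`."""
    return {
        key[len(CONSTRAINT_PREFIX):]: value
        for key, value in properties.items()
        if key.startswith(CONSTRAINT_PREFIX)
    }


# Example (requires an active SparkSession named `spark`):
# rows = spark.sql("SHOW TBLPROPERTIES demo.device").collect()
# check_constraints({row["key"]: row["value"] for row in rows})
```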

In [None]:
# This will throw an error because of the value "In progress" in column "action"

df = spark.range(5) \
  .selectExpr("if(id % 2 = 0, 'In progress', 'Close') as action") \
  .withColumn("date", expr("cast(concat('2023-06-', cast(rand(5) * 30 as int) + 1) as date)")) \
  .withColumn("device_id", expr("cast(rand(5) * 100 as int)"))


df.write.format("delta").mode("append").saveAsTable(delta_table_name)
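
If the notebook should keep running after a rejected write, the append can be wrapped in a small helper. This is a sketch under assumptions: `try_append` is a hypothetical name, and because the exact exception class raised for a constraint violation depends on the runtime, it catches a generic `Exception` and returns the message for inspection.

```python
def try_append(df, table_name: str):
    """Append `df` to a Delta table; return (True, None) or (False, error text)."""
    try:
        df.write.format("delta").mode("append").saveAsTable(table_name)
        return True, None
    except Exception as err:  # exception type varies by runtime (assumption)
        return False, str(err)


# Example (requires the DataFrame `df` built above):
# ok, message = try_append(df, "demo.device")
# if not ok:
#     print(message)
```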

In [None]:
# This append succeeds: every "action" value is either "Open" or "Close"

df = spark.range(5) \
  .selectExpr("if(id % 2 = 0, 'Open', 'Close') as action") \
  .withColumn("date", expr("cast(concat('2023-06-', cast(rand(5) * 30 as int) + 1) as date)")) \
  .withColumn("device_id", expr("cast(rand(5) * 100 as int)"))

delta_table_name = 'demo.device'  # schema-qualified, matching the SQL cells

df.write.format("delta").mode("append").saveAsTable(delta_table_name)

In [None]:
%%sql
SELECT * FROM demo.device

In [None]:
%%sql
ALTER TABLE demo.device DROP CONSTRAINT ck_action;

## Not Null Constraint

A NOT NULL constraint requires that values in the specified columns are never null.

In [None]:
%%sql
DROP TABLE IF EXISTS demo.device

In [None]:
%%sql
CREATE TABLE demo.device (
  action STRING,
  date DATE NOT NULL,
  device_id INT NOT NULL
) USING DELTA;


In [None]:
spark.read.table("demo.device").printSchema()

In [None]:
%%sql
ALTER TABLE demo.device CHANGE COLUMN device_id DROP NOT NULL;

In [None]:
spark.read.table("demo.device").printSchema()
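
The `nullable` flag printed by `printSchema()` can also be read programmatically. The sketch below lists the columns that still carry a NOT NULL constraint. `non_nullable_columns` is a hypothetical helper name, and it only assumes that the schema object exposes `fields` with `name` and `nullable` attributes, as a PySpark `StructType` does.

```python
def non_nullable_columns(schema) -> list:
    """Return the names of fields whose `nullable` flag is False.

    `schema` is expected to look like a pyspark StructType: an object with a
    `fields` list whose items expose `name` and `nullable`.
    """
    return [field.name for field in schema.fields if not field.nullable]


# Example (requires an active SparkSession named `spark`):
# non_nullable_columns(spark.read.table("demo.device").schema)
```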

## Clean up

In [None]:
%%sql
DROP TABLE demo.device 