# Column Mapping

Column mapping lets the column names of a Delta table differ from the column names stored in the underlying Parquet files.

This enables Delta schema evolution operations such as **RENAME COLUMN** and **DROP COLUMN** on a Delta table **_without the need to rewrite_** the underlying Parquet files.

It also lets users name Delta table columns _**using characters that Parquet does not allow**_, such as spaces, so CSV or JSON data can be ingested into Delta directly, without first renaming columns to work around those character constraints.
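To make the idea concrete, here is a toy model in plain Python. It is only a sketch of the concept, not Delta Lake's implementation: the class `ToyMappedSchema` and its random `col-...` physical names are made up for illustration. The point it shows is that renaming or dropping a column only touches the logical-to-physical mapping, so the data files never need to change.

```python
import uuid


class ToyMappedSchema:
    """Toy model of name-based column mapping (hypothetical, not Delta's code)."""

    def __init__(self, columns):
        # Each logical column gets a physical name that never changes afterwards.
        self.logical_to_physical = {c: f"col-{uuid.uuid4()}" for c in columns}

    def rename(self, old, new):
        # Metadata-only: swap the logical name, keep order and physical name.
        if old not in self.logical_to_physical:
            raise KeyError(old)
        if new in self.logical_to_physical:
            raise ValueError(f"column {new!r} already exists")
        self.logical_to_physical = {
            (new if name == old else name): physical
            for name, physical in self.logical_to_physical.items()
        }

    def drop(self, name):
        # Metadata-only: the physical column stays in the files, just unmapped.
        del self.logical_to_physical[name]


schema = ToyMappedSchema(["action", "date", "device_id"])
physical_before = schema.logical_to_physical["device_id"]

schema.rename("device_id", "device id")
physical_after = schema.logical_to_physical["device id"]

schema.drop("device id")
remaining = list(schema.logical_to_physical)
print(physical_before == physical_after, remaining)
```

The rest of this notebook performs the same rename and drop on a real Delta table.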



In [None]:
# Generate dummy data

from pyspark.sql.functions import expr

# Five rows: an alternating action, a random date in June 2023, and a random device id.
df = (
    spark.range(5)
    .selectExpr("if(id % 2 = 0, 'Open', 'Close') as action")
    .withColumn("date", expr("cast(concat('2023-06-', cast(rand(5) * 30 as int) + 1) as date)"))
    .withColumn("device_id", expr("cast(rand(5) * 100 as int)"))
)

# Drop any previous copy of the table so the demo starts from a fresh Delta log.
delta_table_name = 'demo.column_mapping_demo'
spark.sql(f"DROP TABLE IF EXISTS {delta_table_name}")

df.write.format("delta").mode("overwrite").saveAsTable(delta_table_name)

In [None]:
%%sql
DESCRIBE demo.column_mapping_demo

## Enable column mapping

In [None]:
%%sql
  ALTER TABLE demo.column_mapping_demo SET TBLPROPERTIES (
    'delta.minReaderVersion' = '2',
    'delta.minWriterVersion' = '5',
    'delta.columnMapping.mode' = 'name'
  )
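As a quick sanity check, the three properties set above can be verified with a small helper. This is a minimal sketch: `column_mapping_ready` is a hypothetical name, and the check is a simplification that only looks at these three properties (it does not handle tables that use the newer table-features protocol).

```python
def column_mapping_ready(props):
    """Return True if the properties match what this notebook sets.

    Hypothetical helper: `props` is a plain dict of table properties.
    """
    reader = int(props.get("delta.minReaderVersion", 0))
    writer = int(props.get("delta.minWriterVersion", 0))
    mode = props.get("delta.columnMapping.mode", "none")
    # Simplified rule: reader >= 2, writer >= 5, and a mapping mode is set.
    return reader >= 2 and writer >= 5 and mode in ("name", "id")


# Example with the values used in the ALTER TABLE statement above.
example_props = {
    "delta.minReaderVersion": "2",
    "delta.minWriterVersion": "5",
    "delta.columnMapping.mode": "name",
}
print(column_mapping_ready(example_props))
```

With a live `spark` session, the dict could be built from the table itself, for example `{r.key: r.value for r in spark.sql("SHOW TBLPROPERTIES demo.column_mapping_demo").collect()}`.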

## Change column name
Let's rename a column to a name that contains a space. Parquet does not allow spaces in column names, but Delta Lake does once column mapping is enabled.

**Because the new name contains a space, wrap it in backticks (`` ` ``).**

In [None]:
%%sql
ALTER TABLE demo.column_mapping_demo RENAME COLUMN device_id TO `device id`
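When the rename is issued from Python instead of a `%%sql` cell, the backtick quoting has to be added to the statement string. The sketch below uses a hypothetical helper, `quote_identifier`, that wraps a name in backticks and doubles any backtick inside it, which is how Spark SQL escapes a backtick within a quoted identifier.

```python
def quote_identifier(name: str) -> str:
    """Wrap a column name in backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


new_name = "device id"
statement = (
    "ALTER TABLE demo.column_mapping_demo "
    f"RENAME COLUMN device_id TO {quote_identifier(new_name)}"
)
print(statement)
```

The resulting string could then be passed to `spark.sql(statement)` in place of the SQL cell.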

In [None]:
%%sql
DESCRIBE demo.column_mapping_demo

> Check the lakehouse explorer to confirm that no new files were created.

## Check the metadata

In [None]:
import delta

delta_info = delta.DeltaTable.forName(spark, "demo.column_mapping_demo")

display(delta_info.history())

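Look at the `metaData` action in the commit file. Its `schemaString` field holds the table schema as JSON, and with column mapping enabled each field's `metadata` carries a `delta.columnMapping.id` and a `delta.columnMapping.physicalName`.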

In [None]:
# Version 2 is expected to be the RENAME COLUMN commit (0 = create table, 1 = set properties).
deltalog = spark.read.json("Tables/column_mapping_demo/_delta_log/00000000000000000002.json")
display(deltalog)
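The `schemaString` is plain JSON, so the logical-to-physical mapping can be pulled out with the standard library. The sketch below runs on a hand-written sample: the `sample_schema_string` value and the `physical_names` helper are illustrative assumptions, not output copied from this table. Only top-level columns are handled.

```python
import json

# Hypothetical schemaString, shaped like the one in a Delta metaData action.
sample_schema_string = json.dumps({
    "type": "struct",
    "fields": [
        {"name": "action", "type": "string", "nullable": True,
         "metadata": {"delta.columnMapping.id": 1,
                      "delta.columnMapping.physicalName": "action"}},
        {"name": "date", "type": "date", "nullable": True,
         "metadata": {"delta.columnMapping.id": 2,
                      "delta.columnMapping.physicalName": "date"}},
        {"name": "device id", "type": "integer", "nullable": True,
         "metadata": {"delta.columnMapping.id": 3,
                      "delta.columnMapping.physicalName": "device_id"}},
    ],
})


def physical_names(schema_string):
    """Map each top-level logical column name to its physical name."""
    schema = json.loads(schema_string)
    return {
        field["name"]: field.get("metadata", {}).get(
            "delta.columnMapping.physicalName", field["name"]
        )
        for field in schema["fields"]
    }


mapping = physical_names(sample_schema_string)
print(mapping)
```

Assuming the commit file read above contains a `metaData` action, the real string could be fetched with something like `deltalog.where("metaData is not null").select("metaData.schemaString").first()[0]` and passed to the same helper.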

## Write data

In [None]:
%%sql
INSERT INTO demo.column_mapping_demo VALUES('Open', CURRENT_DATE(), 1010)

In [None]:
%%sql
SELECT * FROM demo.column_mapping_demo

## Drop column

In [None]:
%%sql
ALTER TABLE demo.column_mapping_demo DROP COLUMN `device id` 


In [None]:
display(delta_info.history())

> Check the lakehouse explorer to confirm that no new files were created.

In [None]:
%%sql
SELECT * FROM demo.column_mapping_demo

## Clean up

In [None]:
spark.sql(f"DROP TABLE IF EXISTS {delta_table_name}")