# <u>**Creation of Delta Tables**</u>

## **Creation of a Delta Table from a DataFrame**

In [1]:
# Load a CSV file into a DataFrame, using the first row as column names
df = spark.read.load('Files/mydata/mydata.csv', format='csv', header=True)


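With only `header=True`, Spark reads every CSV column as a string, so a Delta table saved from that DataFrame would also have string columns. The following is a minimal sketch of one way to get typed columns, assuming the extra pass over the file that `inferSchema` triggers is acceptable. The helper name `load_csv` is hypothetical and not part of the original notebook.

```python
def load_csv(spark, path, infer_schema=True):
    """Read a CSV file with a header row; optionally let Spark infer column types."""
    return spark.read.load(
        path,
        format="csv",
        header=True,
        inferSchema=infer_schema,
    )

# Hypothetical usage in the notebook session:
# typed_df = load_csv(spark, "Files/mydata/mydata.csv")
# typed_df.printSchema()
```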
In [2]:
display(df)


In [4]:
# Save the DataFrame as a Delta table
df.write.format("delta").saveAsTable("mytable")


## **Managed Tables vs External Tables**

In [5]:
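# Supplying a path makes the table external: the catalog holds only the metadata, the data stays at the path
df.write.format("delta").saveAsTable("myexternaltable", path="Files/myexternaltable")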


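Calling `saveAsTable` without a path creates a managed table, while passing `path` creates an external one. Dropping a managed table also deletes its data files, whereas dropping an external table removes only the catalog entry. The sketch below shows one way to check which kind each table is, assuming both tables live in the session's current database. The helper name `table_types` is hypothetical.

```python
def table_types(spark, table_names):
    """Map each requested table name to its catalog type, e.g. MANAGED or EXTERNAL."""
    wanted = {name.lower() for name in table_names}
    return {
        table.name: table.tableType
        for table in spark.catalog.listTables()
        if table.name.lower() in wanted
    }

# Hypothetical usage in the notebook session:
# table_types(spark, ["mytable", "myexternaltable"])
```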
## **Creation of Metadata**

### *DeltaTableBuilder*

In [7]:
from delta.tables import DeltaTable

DeltaTable.create(spark) \
  .tableName("deltabuildproducts") \
  .addColumn("productid", "INT") \
  .addColumn("ProductName", "STRING") \
  .addColumn("Category", "STRING") \
  .addColumn("Price", "FLOAT") \
  .execute()


<delta.tables.DeltaTable at 0x7ab62424fbb0>

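Re-running a `DeltaTable.create` cell fails once the table exists. As a hedged variation, the builder can be started with `DeltaTable.createIfNotExists(spark)` instead, and the column list can be kept as data rather than as chained calls. The names `PRODUCT_COLUMNS` and `add_columns` below are hypothetical helpers for illustration, not part of the Delta API.

```python
# Column name -> Spark SQL type, mirroring the table defined above.
PRODUCT_COLUMNS = {
    "productid": "INT",
    "ProductName": "STRING",
    "Category": "STRING",
    "Price": "FLOAT",
}

def add_columns(builder, columns):
    """Apply addColumn once per (name, type) pair and return the resulting builder."""
    for name, data_type in columns.items():
        builder = builder.addColumn(name, data_type)
    return builder

# Hypothetical usage in the notebook session (createIfNotExists skips creation
# when the table already exists, so the cell can be re-run):
# from delta.tables import DeltaTable
# builder = DeltaTable.createIfNotExists(spark).tableName("deltabuildproducts")
# add_columns(builder, PRODUCT_COLUMNS).execute()
```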
### *Using Spark SQL*

In [8]:
%%sql
CREATE TABLE salesordersSQL
(
    Orderid INT NOT NULL,
    OrderDate TIMESTAMP NOT NULL,
    CustomerName STRING,
    SalesTotal FLOAT NOT NULL
)
USING DELTA


<Spark SQL result set with 0 rows and 0 fields>

Creating an External Table:

In [9]:
%%sql
CREATE TABLE externaltableSQL
USING DELTA
LOCATION 'Files/externaltableSQL'


<Spark SQL result set with 0 rows and 0 fields>

## **Saving Only Data in Delta Format**

Delta data is stored as Parquet files alongside a `_delta_log` folder that records the table's transactions.

In [10]:
delta_path = "Files/onlydatatable"
df.write.format("delta").save(delta_path)


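To see the transaction log that the save produced, the commit files in `_delta_log` can be listed with plain Python. This is a sketch under two assumptions: the folder is reachable through the local filesystem (the mount path in the usage comment is a guess, not taken from the notebook), and each commit file carries a `commitInfo` action, which writers are not required to include. The helper name `list_commits` is hypothetical.

```python
import json
import re
from pathlib import Path

# Commit files are named with a 20-digit, zero-padded version number.
COMMIT_FILE = re.compile(r"^(\d{20})\.json$")

def list_commits(table_dir):
    """Return (version, operation) pairs found in <table_dir>/_delta_log, oldest first."""
    commits = []
    for entry in (Path(table_dir) / "_delta_log").iterdir():
        match = COMMIT_FILE.match(entry.name)
        if not match:
            continue  # skip checkpoints, checksums, and other files
        operation = None
        # Each commit file holds one JSON action per line.
        for line in entry.read_text(encoding="utf-8").splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "commitInfo" in action:
                operation = action["commitInfo"].get("operation")
        commits.append((int(match.group(1)), operation))
    return sorted(commits, key=lambda commit: commit[0])

# Hypothetical usage; the local mount path is an assumption:
# list_commits("/lakehouse/default/Files/onlydatatable")
```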
It is possible to replace the contents of an existing folder with data from a different DataFrame by writing in overwrite mode:

In [11]:
# new_df stands in for a DataFrame with different data; the notebook reuses df
new_df = df
new_df.write.format("delta").mode("overwrite").save(delta_path)


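Overwrite mode replaces the data, but Delta still enforces the existing schema by default, so writing a DataFrame with different columns fails. A minimal sketch of how the schema could be replaced as well, using Delta's `overwriteSchema` write option; the wrapper name `overwrite_delta` is hypothetical.

```python
def overwrite_delta(df, path, replace_schema=False):
    """Overwrite the Delta folder at path; optionally replace its schema too."""
    writer = df.write.format("delta").mode("overwrite")
    if replace_schema:
        # Without this option, a schema mismatch makes the write fail.
        writer = writer.option("overwriteSchema", "true")
    writer.save(path)

# Hypothetical usage in the notebook session:
# overwrite_delta(new_df, delta_path, replace_schema=True)
```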
Rows from a DataFrame can likewise be added to an existing folder in append mode:

In [12]:
# new_rows_df stands in for a DataFrame holding only the new rows; the notebook reuses new_df
new_rows_df = new_df
new_rows_df.write.format("delta").mode("append").save(delta_path)

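Each write to the folder (the initial save, the overwrite, and the append) is recorded as a new table version, so earlier snapshots remain readable. Because `new_rows_df` is the same DataFrame here, the append duplicates every row, which a comparison of row counts across versions would show. The sketch below reads either the current data or a given version through Delta's `versionAsOf` read option; the helper name `read_delta` is hypothetical.

```python
def read_delta(spark, path, version=None):
    """Load a Delta folder; when version is given, read that historical snapshot."""
    reader = spark.read.format("delta")
    if version is not None:
        reader = reader.option("versionAsOf", version)
    return reader.load(path)

# Hypothetical usage in the notebook session:
# current = read_delta(spark, delta_path)
# first_write = read_delta(spark, delta_path, version=0)
# print(current.count(), first_write.count())
```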