
### Casper's Ghost Kitchen Initializer

Select `Run All` to initialize Casper's Databricks environment.

#### What this notebook does

1. Creates a catalog (`caspers` by default) where all of Casper's assets live.
2. Creates a schema, `simulator`, which houses Casper's dimensional data tables (brands, menus, categories, and items) and a Volume (`events`) for raw JSON events.
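
To make the resulting layout concrete, here is a minimal sketch of the fully qualified names the steps above should produce. The helper `caspers_layout` is hypothetical and not part of the notebook. Its defaults are the names mentioned in the list, and the `/Volumes/<catalog>/<schema>/<volume>` path assumes a Unity Catalog workspace.

```python
def caspers_layout(catalog="caspers", schema="simulator", volume="events"):
    """Return the fully qualified names the initializer is expected to create.

    Hypothetical helper for illustration only; the defaults mirror the names
    used in the description above.
    """
    tables = ["brands", "menus", "categories", "items"]
    return {
        "schema": f"{catalog}.{schema}",
        "tables": [f"{catalog}.{schema}.{t}" for t in tables],
        "volume": f"{catalog}.{schema}.{volume}",
        # Unity Catalog exposes volumes as paths under /Volumes
        "volume_path": f"/Volumes/{catalog}/{schema}/{volume}",
    }


layout = caspers_layout()
print(layout["tables"])
print(layout["volume_path"])
```

If the widgets are set to other values, the same three-level pattern applies with those names substituted.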

In [0]:
# Read the target catalog, volume, and schema names from the notebook widgets.
CATALOG = dbutils.widgets.get("CATALOG")
EVENTS_VOLUME = dbutils.widgets.get("EVENTS_VOLUME")
SIMULATOR_SCHEMA = dbutils.widgets.get("SIMULATOR_SCHEMA")


#### Create the main catalog and the simulator schema and volume

In [0]:
%sql
CREATE CATALOG IF NOT EXISTS ${CATALOG};
CREATE SCHEMA IF NOT EXISTS ${CATALOG}.${SIMULATOR_SCHEMA};
CREATE VOLUME IF NOT EXISTS ${CATALOG}.${SIMULATOR_SCHEMA}.${EVENTS_VOLUME};


#### Create tables from Parquet data

In [0]:
import pandas as pd

# Load each dimensional Parquet file with pandas and save it as a table,
# overwriting any existing copy so the notebook can be re-run safely.
for table in ["brands", "menus", "categories", "items"]:
    pdf = pd.read_parquet(f"./data/dimensional/{table}.parquet")
    spark.createDataFrame(pdf).write.mode("overwrite") \
        .saveAsTable(f"{CATALOG}.{SIMULATOR_SCHEMA}.{table}")
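
After the tables are written, a quick sanity check can confirm that all four exist. The sketch below is one possible approach, not part of the original notebook. The `find_missing_tables` helper is hypothetical, and it takes the existence check as a callable so it can be tried without a Spark session.

```python
def find_missing_tables(table_exists, catalog, schema,
                        names=("brands", "menus", "categories", "items")):
    """Return the fully qualified names for which table_exists(name) is False.

    Hypothetical helper; table_exists is any callable that accepts a
    three-level table name and returns a bool.
    """
    missing = []
    for name in names:
        full_name = f"{catalog}.{schema}.{name}"
        if not table_exists(full_name):
            missing.append(full_name)
    return missing


# Stand-in for a real catalog lookup: pretend only two tables were created.
created = {"caspers.simulator.brands", "caspers.simulator.menus"}
missing = find_missing_tables(created.__contains__, "caspers", "simulator")
print(missing)
```

In the notebook itself, `spark.catalog.tableExists` could be passed as `table_exists` along with `CATALOG` and `SIMULATOR_SCHEMA`, assuming the runtime's PySpark version provides that method and accepts three-level names. An empty result would mean every dimensional table is present.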