# Data Setup



Run the cell below to initialize resource names. To change them, edit the config file before running.

In [0]:
%run ./config

## Write data from CSV to a schema for use with AI functions

In [0]:
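import pandas as pd

# Load the CSV with pandas, then convert it to a Spark DataFrame.
df = pd.read_csv('./data/synthetic_car_data.csv')
spark_df = spark.createDataFrame(df)

# catalog_name and schema_name are expected to come from the ./config run above.
spark_df.write.format("delta").mode("overwrite").saveAsTable(f"{catalog_name}.{schema_name}.synthetic_car_data")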

In [0]:
import os

# Resolve this notebook's workspace folder, then point at its data directory.
current_dir = os.path.dirname(dbutils.notebook.entry_point.getDbutils().notebook().getContext().notebookPath().get())
data_path = f"file:/Workspace{current_dir}/data"
display(data_path)

# Create the target volume, then copy the image file into it.
spark.sql(f"CREATE VOLUME IF NOT EXISTS {catalog_name}.{schema_name}.`dip`")
dbutils.fs.cp(f"{data_path}/dip.png", f"/Volumes/{catalog_name}/{schema_name}/dip/dip.png")
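
The cells in this notebook build Unity Catalog volume paths by hand with f-strings. As a minimal sketch, a small helper can keep those paths consistent. The name `volume_path` and the placeholder catalog and schema values are hypothetical and are not part of `dbutils` or the workshop config.

```python
import posixpath


def volume_path(catalog: str, schema: str, volume: str, *parts: str) -> str:
    """Build a /Volumes/<catalog>/<schema>/<volume>/... path (hypothetical helper)."""
    for name in (catalog, schema, volume):
        # Reject empty names and names containing a path separator.
        if not name or "/" in name:
            raise ValueError(f"invalid name component: {name!r}")
    return posixpath.join("/Volumes", catalog, schema, volume, *parts)


# Placeholder names; in this notebook the real values come from ./config.
print(volume_path("my_catalog", "my_schema", "dip", "dip.png"))
# prints /Volumes/my_catalog/my_schema/dip/dip.png
```

In the notebook, the result could be passed as the destination of `dbutils.fs.cp`, for example `volume_path(catalog_name, schema_name, "dip", "dip.png")`.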

## Write data from a ZIP file for use with Agent Bricks

### Approach #1: Clone and Move Data from GitHub

This approach requires a classic compute cluster (not serverless) to copy files from the workspace to a Unity Catalog (UC) volume.


In [0]:
display(f"/Workspace{current_dir}/data/tech_support.zip")

In [0]:
import zipfile

# unzip file
with zipfile.ZipFile(f"/Workspace{current_dir}/data/tech_support.zip", 'r') as zip_ref:
    zip_ref.extractall(f"/Workspace{current_dir}/data/")

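# Create the target volume if it does not exist yet.
spark.sql(f"CREATE VOLUME IF NOT EXISTS {catalog_name}.{schema_name}.`tech_support`")

# Copy the extracted folder and everything under it into the volume.
dbutils.fs.cp(f"{data_path}/tech_support/", f"/Volumes/{catalog_name}/{schema_name}/tech_support", recurse=True)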
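
After the copy, it can be useful to confirm that every file in the archive actually arrived. The sketch below compares the archive's entries against a directory on disk. `missing_entries` is a hypothetical helper, and it assumes the entries are laid out under `root_dir` exactly as they are named inside the ZIP (for example, a top-level `tech_support/` folder).

```python
import os
import zipfile


def missing_entries(zip_path: str, root_dir: str) -> list[str]:
    """Return archive file entries that do not exist under root_dir."""
    with zipfile.ZipFile(zip_path, "r") as zf:
        # Skip directory entries; only files need to exist on disk.
        names = [info.filename for info in zf.infolist() if not info.is_dir()]
    return [n for n in names if not os.path.isfile(os.path.join(root_dir, n))]


# Hypothetical usage in this notebook (assumes the archive has a top-level
# tech_support/ folder, so the volume's parent directory is the root):
# missing = missing_entries(f"/Workspace{current_dir}/data/tech_support.zip",
#                           f"/Volumes/{catalog_name}/{schema_name}")
# display(missing)
```

An empty list means every file entry was found. Any names returned are the files to re-copy.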

### Approach #2: Uploading Data Directly to a Unity Catalog Volume

1. Download the Data <br>
Download the tech_support.zip archive from the provided GitHub [link](https://github.com/chen-data-ai/Agent-Bricks-Workshop/blob/main/data/tech_support.zip).

2. Upload the ZIP File <br>
In your Databricks workspace, open the upload dialog for your schema’s Unity Catalog volume using one of these methods:

    - Sidebar: Select Add data > Upload files to volume
    - Catalog Explorer: Click Add > Upload to volume
    - Notebook: Select File > Upload files to volume

    Then choose the downloaded tech_support.zip file and select the destination catalog, schema, and volume.

3. Copy the File Path <br>
After uploading, locate the file path (e.g., `/Volumes/catalog/schema/volume/tech_support.zip`). You’ll need this path for the next step.

4. Unzip the File in Databricks <br>
Use the following command in a notebook cell to unzip the file to your desired directory, replacing the placeholders with your catalog and schema names:

In [0]:
%sh unzip '/Volumes/<ADD YOUR CATALOG>/<ADD YOUR SCHEMA>/tech_support/tech_support.zip' -d '/Volumes/<ADD YOUR CATALOG>/<ADD YOUR SCHEMA>/tech_support'
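
If you prefer to stay in Python rather than use a shell cell, the same extraction can be sketched with the standard-library `zipfile` module. `unzip_to_dir` is a hypothetical helper. The sketch assumes your compute can read and write the `/Volumes/...` path as a regular file path, which may depend on your cluster type.

```python
import os
import zipfile


def unzip_to_dir(zip_path: str, dest_dir: str) -> list[str]:
    """Extract every entry of zip_path into dest_dir and return the entry names."""
    os.makedirs(dest_dir, exist_ok=True)
    with zipfile.ZipFile(zip_path, "r") as zf:
        zf.extractall(dest_dir)
        return zf.namelist()


# Hypothetical usage; replace the placeholders with your own names:
# unzip_to_dir("/Volumes/<ADD YOUR CATALOG>/<ADD YOUR SCHEMA>/tech_support/tech_support.zip",
#              "/Volumes/<ADD YOUR CATALOG>/<ADD YOUR SCHEMA>/tech_support")
```

Returning the entry names makes it easy to display what was extracted without listing the volume again.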