
<div style="text-align: center; line-height: 0; padding-top: 9px;">
  <img src="https://databricks.com/wp-content/uploads/2018/03/db-academy-rgb-1200px.png" alt="Databricks Learning">
</div>


# 1 - Exploring the Lab Environment

This demonstration reviews the data objects registered to Unity Catalog (UC) on Databricks. UC is a unified data governance solution that centralizes and streamlines the management of data, metadata, and access control across multiple Databricks workspaces. It provides interoperability across lakehouse formats such as Delta Lake and Apache Iceberg, along with open APIs and built-in governance for data and AI applications.

### Learning Objectives
By the end of this lesson, you should be able to:
- Identify and display available Unity Catalog objects, including catalogs, schemas, volumes, and tables within a Databricks workspace.
- Execute SQL queries to display data directly from files in cloud storage.

**References** For additional reading, see the [official UC GitHub repository](https://github.com/unitycatalog/unitycatalog) and [this video on UC on Databricks](https://www.databricks.com/resources/demos/videos/data-governance/unity-catalog-overview).

## REQUIRED - SELECT CLASSIC COMPUTE

Before executing cells in this notebook, please select your classic compute cluster in the lab. Be aware that **Serverless** is enabled by default and you have a Serverless SQL warehouse with a similar name.

![Select Cluster](./Includes/images/selecting_cluster_info.png)

Follow these steps to select the classic compute cluster:


1. Navigate to the top-right of this notebook and click the drop-down menu to select your cluster. By default, the notebook will use **Serverless**.

2. If your cluster is available, select it and continue to the next cell. If the cluster is not shown:

   - Click **More** in the drop-down.

   - In the **Attach to an existing compute resource** window, use the first drop-down to select your unique cluster.

**NOTE:** If your cluster has terminated, you might need to restart it in order to select it. To do this:

1. Right-click on **Compute** in the left navigation pane and select *Open in new tab*.

2. Find the triangle icon to the right of your compute cluster name and click it.

3. Wait a few minutes for the cluster to start.

4. Once the cluster is running, complete the steps above to select your cluster.


## A. Classroom Setup

1. Run the following cell to configure your working environment for this notebook.

**NOTE:** The `DA` object is only used in Databricks Academy courses and is not available outside of these courses. It will dynamically reference the information needed to run the course in the lab environment.

In [0]:
%run ./Includes/Classroom-Setup-01

Note: you may need to restart the kernel using %restart_python or dbutils.library.restartPython() to use updated packages.


----------------------------------------------------------------------------------------
Creating folder: /Volumes/dbacademy/ops/labuser11045124_1753761040@vocareum_com/csv_demo_files
Creating folder: /Volumes/dbacademy/ops/labuser11045124_1753761040@vocareum_com/json_demo_files
Creating folder: /Volumes/dbacademy/ops/labuser11045124_1753761040@vocareum_com/xml_demo_files
----------------------------------------------------------------------------------------



Created Delta table mydeltatable for the demonstration.


2. Complete the following to explore your **labuser** schema using the Catalog UI on the left.

   a. In the left navigation bar, select the catalog icon:  ![Catalog Icon](./Includes/images/catalog_icon.png)

   b. Locate the catalog called **dbacademy** and expand the catalog. 

   c. Expand the **labuser** schema (database). This is your schema for the course. It is named after your lab username (for example, **labuser1234_5678**).

3. We want to modify our default catalog and default schema to use **dbacademy** and our **labuser** schema to avoid writing the three-level namespace every time we query and create tables in this course.

    However, before we proceed, note that each of us has a different schema name. Your specific schema name has been stored dynamically in the SQL variable `DA.schema_name` during the classroom setup script.

    Run the code below and confirm that the value of the `DA.schema_name` SQL variable matches your specific schema name (e.g., **labuser1234_5678**).

In [0]:
values(DA.schema_name)

col1
labuser11045124_1753761040


4. Let's modify our default catalog and schema using the `USE CATALOG` and `USE SCHEMA` statements. This eliminates the need to specify the three-level name for objects in your **labuser** schema (i.e., catalog.schema.object).

    - `USE CATALOG` – Sets the current catalog.

    - `USE SCHEMA` – Sets the current schema.

    **NOTE:** Since our dynamic schema name is stored in the SQL variable `DA.schema_name` as a string, we will need to use the `IDENTIFIER` clause to interpret the constant string in our variable as a schema name. The `IDENTIFIER` clause can interpret a constant string as any of the following:
    - Relation (table or view) name
    - Function name
    - Column name
    - Field name
    - Schema name
    - Catalog name

    [IDENTIFIER clause documentation](https://docs.databricks.com/aws/en/sql/language-manual/sql-ref-names-identifier-clause?language=SQL)

    Run the following cell to set and view your default catalog and schema. Confirm that your default catalog is **dbacademy** and your schema is **labuser** (this uses the `DA.schema_name` variable created in the classroom setup script).

**NOTE:** Alternatively, you can type your schema name directly in the `USE SCHEMA` statement instead of using the `IDENTIFIER` clause.


In [0]:
-- Change the default catalog/schema
USE CATALOG dbacademy;
USE SCHEMA IDENTIFIER(DA.schema_name);


-- View current catalog and schema
SELECT 
  current_catalog(), 
  current_schema()

current_catalog(),current_schema()
dbacademy,labuser11045124_1753761040
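
With a default catalog and schema set, shorter names are resolved against those defaults. The rule can be sketched in plain Python. This is a simplified illustration, not how Databricks implements name resolution: `qualify_name` is a hypothetical helper, and it assumes identifiers contain no dots or backtick quoting.

```python
def qualify_name(name: str, current_catalog: str, current_schema: str) -> str:
    """Expand a one-, two-, or three-part name to catalog.schema.object.

    Hypothetical helper for illustration only; assumes no part of the
    name contains a dot or needs backtick quoting.
    """
    parts = name.split(".")
    if len(parts) == 1:
        # Bare object name: use the default catalog and default schema.
        return f"{current_catalog}.{current_schema}.{parts[0]}"
    if len(parts) == 2:
        # schema.object: use the default catalog only.
        return f"{current_catalog}.{parts[0]}.{parts[1]}"
    if len(parts) == 3:
        # Already fully qualified: the defaults are ignored.
        return name
    raise ValueError(f"Expected at most three name parts, got {len(parts)}: {name!r}")


# The schema name below is the placeholder used in this lesson, not a real schema.
print(qualify_name("mydeltatable", "dbacademy", "labuser1234_5678"))
```

A fully qualified name such as `dbacademy_ecommerce.v01.raw` is returned unchanged, which is why objects in other catalogs can still be referenced after the defaults are set.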


## B. Inspecting and Referencing Unity Catalog Objects

### Catalogs, Schemas, Volumes, and Tables
In Unity Catalog, all metadata is registered in a metastore. The hierarchy of database objects in any Unity Catalog metastore is divided into three levels, represented as a three-level namespace (for example, `<catalog>.<schema>.<object>`) when you reference tables, views, volumes, models, and functions.


### B1. Catalogs

Use the `SHOW SCHEMAS IN` statement to view available schemas in the **dbacademy** catalog. Run the cell and view the results. Notice that your **labuser** schema is within the **dbacademy** catalog.

In [0]:
SHOW SCHEMAS IN dbacademy;

databaseName
information_schema
labuser11045124_1753761040
ops


### B2. Schemas
Run the `DESCRIBE SCHEMA EXTENDED` statement to see information about your **labuser** schema (database) that was created for you within the **dbacademy** catalog. In the output below, your schema name is in the row called *Namespace Name*.  

**NOTE:** Remember, we are using the `IDENTIFIER` clause to dynamically reference your specific schema name in the lab, since each user will have a different schema name. Alternatively, you can type in the schema name.

In [0]:
DESCRIBE SCHEMA EXTENDED IDENTIFIER(DA.schema_name);

database_description_item,database_description_value
Catalog Name,dbacademy
Namespace Name,labuser11045124_1753761040
Comment,
Location,
Owner,9556a37f-7dc0-4b5f-849c-babbde9b34af
Properties,
Predictive Optimization,ENABLE (inherited from METASTORE 4307004-us-west-2)


### B3. Tables
Use the `DESCRIBE TABLE EXTENDED` statement to describe the table `mydeltatable`.

Run the cell and view the results. Notice the following:
- In the first few rows, you can see column information.
- Starting at row 4, you can see additional **Delta Statistics Columns**.
- Starting at row 8, you can see additional **Detailed Table Information**.

**NOTE:** Remember, we do not need to reference the three-level namespace (`catalog.schema.table`) because we set our default catalog and schema earlier.


In [0]:
DESCRIBE TABLE EXTENDED mydeltatable

col_name,data_type,comment
id,int,
name,string,
,,
# Delta Statistics Columns,,
Column Names,"id, name",
Column Selection Method,first-32,
,,
# Detailed Table Information,,
Catalog,dbacademy,
Database,labuser11045124_1753761040,
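
The output above mixes three kinds of rows in one result: column definitions, blank separator rows, and section headers that start with `#`. As a rough sketch of how that layout can be read programmatically, the Python below groups the rows shown above into sections. The `split_sections` helper and the `"Columns"` label for the first, untitled section are assumptions made for illustration.

```python
# Rows copied from the output above as (col_name, data_type, comment) tuples.
rows = [
    ("id", "int", ""),
    ("name", "string", ""),
    ("", "", ""),
    ("# Delta Statistics Columns", "", ""),
    ("Column Names", "id, name", ""),
    ("Column Selection Method", "first-32", ""),
    ("", "", ""),
    ("# Detailed Table Information", "", ""),
    ("Catalog", "dbacademy", ""),
    ("Database", "labuser11045124_1753761040", ""),
]


def split_sections(rows):
    """Group DESCRIBE TABLE EXTENDED rows by their '# ...' section headers."""
    current = "Columns"  # assumed label for the leading, untitled section
    sections = {current: {}}
    for col_name, data_type, _comment in rows:
        if not col_name:
            continue  # blank separator row
        if col_name.startswith("# "):
            current = col_name[2:]  # start a new section named after the header
            sections[current] = {}
        else:
            sections[current][col_name] = data_type
    return sections


sections = split_sections(rows)
for title, entries in sections.items():
    print(title, entries)
```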


### B4. Volumes

Volumes are Unity Catalog objects that enable governance over non-tabular datasets. Volumes represent a logical volume of storage in a cloud object storage location. Volumes provide capabilities for accessing, storing, governing, and organizing files.

While tables provide governance over tabular datasets, volumes add governance over non-tabular datasets. You can use volumes to store and access files in **_any_** format, including structured, semi-structured, and unstructured data.

Databricks recommends using volumes to govern access to all non-tabular data. Like tables, volumes can be managed or external.

#### B4.1 UI Exploration

Complete the following to explore the **dbacademy_ecommerce** catalog:

1. In the left navigation bar, select the catalog icon:  ![Catalog Icon](./Includes/images/catalog_icon.png)

2. Locate the catalog called **dbacademy_ecommerce** and expand the catalog.

3. Expand the **v01** schema. Notice that this schema contains two volumes, **delta** and **raw**.

4. Expand the **raw** volume. Notice that the volume contains a series of folders.

5. Expand the **users-historical** folder. Notice that the folder contains a series of files.


#### B4.2 Volume Exploration with SQL

Run the `DESCRIBE VOLUME` statement to return the metadata for the **dbacademy_ecommerce.v01.raw** volume. The metadata includes the volume name, schema, catalog, type, comment, owner, and more.

Notice the following:
- Under the **storage_location** column, you can see the cloud storage location for this volume.

- The **volume_type** column indicates that this is a *MANAGED* volume.


In [0]:
DESCRIBE VOLUME dbacademy_ecommerce.v01.raw;

name,catalog,database,owner,storage_location,volume_type,comment,securable_type,securable_kind
raw,dbacademy_ecommerce,v01,metastore_admins,s3://marketplace-sandbox-uc-databricks/UC/8f245c11-89d6-49ba-b4cb-8698c92dc4fe/volumes/01312a33-a599-4c00-ab89-9702aff680e1,MANAGED,,VOLUME,VOLUME_DELTASHARING
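
To make the **storage_location** value easier to read, the sketch below splits it into its scheme, bucket, and object prefix using Python's standard `urllib.parse` module. The variable names are illustrative only. In practice you would keep using the `/Volumes/...` path rather than this cloud URI.

```python
from urllib.parse import urlparse

# storage_location value copied from the DESCRIBE VOLUME output above.
storage_location = (
    "s3://marketplace-sandbox-uc-databricks/UC/"
    "8f245c11-89d6-49ba-b4cb-8698c92dc4fe/volumes/"
    "01312a33-a599-4c00-ab89-9702aff680e1"
)

parsed = urlparse(storage_location)
scheme = parsed.scheme           # cloud storage scheme
bucket = parsed.netloc           # bucket name
prefix = parsed.path.lstrip("/")  # object key prefix inside the bucket

print(scheme)
print(bucket)
print(prefix)
```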


#### B4.3 List Files in a Volume


Use the `LIST` statement to list the available files in the **raw** volume's **users-historical** directory (`/Volumes/dbacademy_ecommerce/v01/raw/users-historical`) and view the results.

Notice the following:
- Ignore any file names that begin with an underscore (_). These are marker and commit-metadata files created while data is written to a location; they do not contain the dataset's records.
- Scroll down in the results and expand one of the files where the **name** column begins with **part**. Confirm that this directory contains a series of Parquet files.


**NOTE:**  When interacting with data in volumes, use the path provided by Unity Catalog, which always follows this format: */Volumes/catalog_name/schema_name/volume_name/*.

For more information on exploring directories and data files managed with Unity Catalog volumes, check out the [Explore storage and find data files](https://docs.databricks.com/en/discover/files.html) documentation.
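
The path format in the note above can be assembled from its parts. Below is a minimal Python sketch, where `volume_path` is a hypothetical helper (not a Databricks API) that joins the catalog, schema, volume, and any sub-directories with forward slashes.

```python
import posixpath


def volume_path(catalog: str, schema: str, volume: str, *parts: str) -> str:
    """Build a /Volumes/<catalog>/<schema>/<volume>/... path.

    Hypothetical helper for illustration; it performs no validation and
    does not check that the volume exists.
    """
    return posixpath.join("/Volumes", catalog, schema, volume, *parts)


print(volume_path("dbacademy_ecommerce", "v01", "raw", "users-historical"))
```

`posixpath` is used instead of `os.path` so that the separator is always `/`, regardless of the operating system running the code.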


In [0]:
LIST '/Volumes/dbacademy_ecommerce/v01/raw/users-historical'

path,name,size,modification_time
/Volumes/dbacademy_ecommerce/v01/raw/users-historical/_SUCCESS,_SUCCESS,0,1726173048000
/Volumes/dbacademy_ecommerce/v01/raw/users-historical/_committed_531959640415905750,_committed_531959640415905750,424,1726173048000
/Volumes/dbacademy_ecommerce/v01/raw/users-historical/_started_531959640415905750,_started_531959640415905750,0,1726173049000
/Volumes/dbacademy_ecommerce/v01/raw/users-historical/part-00000-tid-531959640415905750-948b4f2d-2d35-46e3-97eb-e6d85d2bf872-7571-1-c000.snappy.parquet,part-00000-tid-531959640415905750-948b4f2d-2d35-46e3-97eb-e6d85d2bf872-7571-1-c000.snappy.parquet,974753,1726173049000
/Volumes/dbacademy_ecommerce/v01/raw/users-historical/part-00001-tid-531959640415905750-948b4f2d-2d35-46e3-97eb-e6d85d2bf872-7572-1-c000.snappy.parquet,part-00001-tid-531959640415905750-948b4f2d-2d35-46e3-97eb-e6d85d2bf872-7572-1-c000.snappy.parquet,976470,1726173049000
/Volumes/dbacademy_ecommerce/v01/raw/users-historical/part-00002-tid-531959640415905750-948b4f2d-2d35-46e3-97eb-e6d85d2bf872-7573-1-c000.snappy.parquet,part-00002-tid-531959640415905750-948b4f2d-2d35-46e3-97eb-e6d85d2bf872-7573-1-c000.snappy.parquet,980390,1726173049000
/Volumes/dbacademy_ecommerce/v01/raw/users-historical/part-00003-tid-531959640415905750-948b4f2d-2d35-46e3-97eb-e6d85d2bf872-7574-1-c000.snappy.parquet,part-00003-tid-531959640415905750-948b4f2d-2d35-46e3-97eb-e6d85d2bf872-7574-1-c000.snappy.parquet,979632,1726173050000
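
The listing above can be summarized with a few lines of Python. This sketch filters out the underscore-prefixed files, totals the size of the remaining Parquet files, and builds (but does not run) a query string of the form Databricks SQL accepts for reading files directly by path. All variable names are illustrative, and it assumes **modification_time** is a Unix timestamp in milliseconds.

```python
from datetime import datetime, timezone

directory = "/Volumes/dbacademy_ecommerce/v01/raw/users-historical"

# (name, size, modification_time) copied from the LIST output above.
listing = [
    ("_SUCCESS", 0, 1726173048000),
    ("_committed_531959640415905750", 424, 1726173048000),
    ("_started_531959640415905750", 0, 1726173049000),
    ("part-00000-tid-531959640415905750-948b4f2d-2d35-46e3-97eb-e6d85d2bf872-7571-1-c000.snappy.parquet", 974753, 1726173049000),
    ("part-00001-tid-531959640415905750-948b4f2d-2d35-46e3-97eb-e6d85d2bf872-7572-1-c000.snappy.parquet", 976470, 1726173049000),
    ("part-00002-tid-531959640415905750-948b4f2d-2d35-46e3-97eb-e6d85d2bf872-7573-1-c000.snappy.parquet", 980390, 1726173049000),
    ("part-00003-tid-531959640415905750-948b4f2d-2d35-46e3-97eb-e6d85d2bf872-7574-1-c000.snappy.parquet", 979632, 1726173050000),
]

# Keep only data files: skip names that begin with an underscore.
data_files = [row for row in listing if not row[0].startswith("_")]
total_bytes = sum(size for _name, size, _mtime in data_files)

# Assumption: modification_time is epoch milliseconds, so divide by 1000.
newest_ms = max(mtime for _name, _size, mtime in data_files)
newest = datetime.fromtimestamp(newest_ms / 1000, tz=timezone.utc)

# Query text only; in a notebook it could be run in a SQL cell.
query = f"SELECT * FROM parquet.`{directory}` LIMIT 10"

print(len(data_files), total_bytes)  # prints: 4 3911245
print(newest.isoformat())
print(query)
```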



&copy; 2025 Databricks, Inc. All rights reserved. Apache, Apache Spark, Spark, the Spark Logo, Apache Iceberg, Iceberg, and the Apache Iceberg logo are trademarks of the <a href="https://www.apache.org/" target="blank">Apache Software Foundation</a>.<br/>
<br/><a href="https://databricks.com/privacy-policy" target="blank">Privacy Policy</a> | 
<a href="https://databricks.com/terms-of-use" target="blank">Terms of Use</a> | 
<a href="https://help.databricks.com/" target="blank">Support</a>