
<div style="text-align: center; line-height: 0; padding-top: 9px;">
  <img src="https://databricks.com/wp-content/uploads/2018/03/db-academy-rgb-1200px.png" alt="Databricks Learning">
</div>


# Databricks Workspace Walkthrough

In this walkthrough, we will cover several key components of the Databricks workspace:

1. **Navigating to the Databricks homepage** and understanding its usefulness for high-level navigation.

1. **Using the sidebar to navigate** between various Databricks features.

1. **Inspecting and changing user settings**.

1. **Understanding Classroom Setup** for all the demos and labs.

1. **Exploring the Workspace and Navigation**, where we will look at how to move efficiently through different areas of the Databricks interface.

1. **Working with Notebooks**, where we will go over how to create, edit, and run code in notebooks.

1. **Unity Catalog and Catalog Explorer**, covering how to assign and manage permissions for users and groups to ensure secure access to data, as well as how to locate assets registered to Unity Catalog.

1. **(Optional) Understanding Compute**, going over how to use the UI to create a new cluster.

*Note: There will be very little coding in this demonstration.*

## Workspace Homepage

1. If you click on the **Databricks logo** at the top left, you'll be taken to a screen that says **Welcome to Databricks**.

1. You will see five different tabs labeled:
   - **Recents**
   - **Favorites**
   - **Popular**
   - **Mosaic AI**
   - **What's New**

1. Under **Recents**, you'll find all of your notebooks that have been recently opened or worked on.

1. Under **Favorites**, you'll find any notebook, table, or other asset that you have marked as a favorite.

1. Under **Popular**, you can discover popular tables, notebooks, and other assets within the workspace.

1. Under **Mosaic AI**, you will see newly added and featured models registered with Mosaic AI Model Serving. 

1. Under **What's New**, you can see recent announcements and updates about the platform.


## Sidebar Navigation


1. Click on the **hamburger icon** (three horizontal lines stacked on top of each other) at the top left. Clicking on it will reveal or hide the left sidebar navigation.



1. In the left sidebar, you will find the following items at the top:
   - **+ New**
   - **Workspace**
   - **Recents**
   - **Catalog**
   - **Workflows**
   - **Compute**

1. If you click on **+ New** at the very top of the sidebar navigation menu, you will have options to create a new:
    - **Notebook**
    - **Query** 
    - **Dashboard**
    - **Job**
    - **DLT Pipeline**
    - **Alert**
    - **Experiment**
    - **AutoML experiment**
    - **Model**
    - **App**
    - At the very bottom, you will see **More**, which allows you to interact with:
      - Your **Git folder**
      - **Cluster**
      - **SQL Warehouse**
      - **Serving Endpoint**

1. You will also find a grouping of **SQL** menus:
   - **SQL Editor**
   - **Queries**
   - **Dashboards**
   - **Alerts**
   - **Query History**
   - **SQL Warehouses**

1. Under the **Data Engineering** pane, you will find:
   - **Job Runs**
   - **Data Ingestion**
   - **Delta Live Tables**

1. Under **Machine Learning**, you will find:
   - **Playground**
   - **Experiments**
   - **Features**
   - **Models**
   - **Serving**

1. At the very bottom, you will find **Marketplace** and **Partner Connect**.

## Navigating Databricks User Settings

1. Go to the top right, click on the user icon, and select **Settings**.

2. In **Settings**, you will see **User Settings** with five different options:
   - **Profile**
   - **Preferences**
   - **Developer**
   - **Linked Accounts**
   - **Notifications**

3. If you click on **Profile**, you will see:
   - Your display name
   - The group you belong to
   - Your membership details
   - An option to change your password

4. Under **Preferences**, you will see two options to manage:
   - **Language**
   - **Interface Theme**

   You can click on these options if you wish to change them.

5. Under **Developer**, you will see various options as you scroll down, including:
   - Getting an **Access Token**
   - **SQL Query Snippets**
   - **Editor Settings**
   - **Code Editor**
   - **Experimental Features**

   You'll see various options toggled on or off as you scroll down.

6. Under **Linked Accounts**, you will find information about Git integration. Here, you can:
   - Link a Git account
   - Select a **Personal Access Token** for a given Git provider, such as GitHub.

7. Finally, under **Notifications**, you can manage when you'll be notified for:
   - **Model Registry** events
   - **Account-level Email Communications**
   - **Promotional Email Communications**


## Classroom Setup

Before continuing the demo, run the provided classroom setup script. This script will define configuration variables necessary for the demo.

In [0]:
%run ../Includes/Classroom-Setup-01

Note: you may need to restart the kernel using %restart_python or dbutils.library.restartPython() to use updated packages.


Resetting the learning environment:
| dropping the catalog "labuser8027617_1732830335_zsw6_da"...(1 seconds)

Skipping install of existing datasets to "dbfs:/mnt/dbacademy-datasets/get-started-with-databricks-for-machine-learning/v01"

Validating the locally installed datasets:
| listing local files...(0 seconds)
| validation completed...(0 seconds total)
Creating & using the catalog "labuser8027617_1732830335_zsw6_da"...(2 seconds)

Predefined tables in "labuser8027617_1732830335_zsw6_da.default":
| -none-

Predefined paths variables:
| DA.paths.working_dir: dbfs:/mnt/dbacademy-users/labuser8027617_1732830335@vocareum.com/get-started-with-databricks-for-machine-learning
| DA.paths.datasets:    dbfs:/mnt/dbacademy-datasets/get-started-with-databricks-for-machine-learning/v01

Setup completed (9 seconds)


**Other Conventions:**

Throughout this demo, we'll refer to the object `DA`. This object, provided by Databricks Academy, contains variables such as your username, catalog name, schema name, working directory, and dataset locations. Run the code block below to view these details:

In [0]:
print(f"Username:          {DA.username}")
print(f"Catalog Name:      {DA.catalog_name}")
print(f"Schema Name:       {DA.schema_name}")
print(f"Working Directory: {DA.paths.working_dir}")
print(f"Dataset Location:  {DA.paths.datasets}")

Username:          labuser8027617_1732830335@vocareum.com
Catalog Name:      labuser8027617_1732830335_zsw6_da
Schema Name:       default
Working Directory: dbfs:/mnt/dbacademy-users/labuser8027617_1732830335@vocareum.com/get-started-with-databricks-for-machine-learning
Dataset Location:  dbfs:/mnt/dbacademy-datasets/get-started-with-databricks-for-machine-learning/v01


### Create a sample table

Let's use our `DA` object to create a sample table called `retail-customers`. We will use this table throughout our demo.

In [0]:
DA.create_customers_table()

Customers table created successfully!
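
Because the table name contains a hyphen, it has to be wrapped in backticks whenever it appears in SQL. The sketch below shows one way to assemble a fully qualified `catalog.schema.table` name in Python. The helpers `quote_identifier` and `qualified_name` are hypothetical (they are not part of the `DA` object), and the catalog and schema values are copied from the output above.

```python
def quote_identifier(name: str) -> str:
    """Wrap an identifier in backticks, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def qualified_name(catalog: str, schema: str, table: str) -> str:
    """Build a three-level Unity Catalog name: catalog.schema.table."""
    return ".".join(quote_identifier(part) for part in (catalog, schema, table))


# Values copied from the DA output above; in the notebook you would pass
# DA.catalog_name and DA.schema_name instead of hard-coded strings.
full_name = qualified_name(
    "labuser8027617_1732830335_zsw6_da", "default", "retail-customers"
)
print(full_name)
```

Quoting every part is a conservative choice: it is harmless for plain names and required for names such as `retail-customers`.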



## Exploring the Workspace and Navigation

The Workspace acts as a central hub for organizing and accessing various assets, such as notebooks and files. Navigate to **Workspace** using the left sidebar menu.


### Workspace

The Workspace is where you'll manage your assets and perform various tasks. Here's how to navigate and perform common actions within the Workspace:

* **Create a Folder**:
   - Click on the **Workspace** button in the left navigation bar.
   - Select the folder where you want to create a new folder.
   - Right-click and choose **Create > Folder**.
   - Provide a name for the folder and click **Create**.
   - Alternatively, click the **Create** button at the top-right corner.
   - Select the **Folder** option, name the folder, and click the **Create** button.

* **Import a File**:
   - Navigate to the desired folder.
   - Right-click and choose **Import**.
   - Select the file you want to import and click **Import**.
   - Alternatively, navigate to the desired folder and click the **( ⋮ )** kebab icon at the top-right corner.
   - Select the **Import** option from the dropdown.
   - Select the file you want to import and click **Import**.

* **Export a File**:
   - Right-click on a file in the Workspace.
   - Choose **Export** and select the desired export format.
   - Alternatively, navigate to the folder that contains the file.
   - Click the **( ⋮ )** kebab icon and select **Export**.
   - Select the desired file type.

* **Navigating Folders**:
   - Double-click on a folder to enter it.
   - Use the breadcrumb trail at the top to navigate back.
   - Alternatively, click **Workspace** in the left sidebar to return to the top of the folder structure.

* **Create a notebook**:
   - Navigate to the desired folder
   - Right-click and hover over **Create**.
   - Select the **Notebook** option from the dropdown.
   - Alternatively, click the **Add** button inside the desired folder.
   - Select the **Notebook** option to create a new notebook.

* **Rename Folder**:
   - Hover over the name of the folder you want to rename.
   - Right-click on the folder and select **Rename**.
   - Enter the new name and select **OK**.
   - Alternatively, navigate to the desired folder.
   - Click the kebab icon **( ⋮ )** next to the name of the folder you want to rename.
   - Rename the folder and click the **OK** button.
 
* **Share Folder**:
   - Hover over the folder you want to share.
   - Click the kebab icon **( ⋮ )** at the right corner.
   - Click the **Share (Permissions)** option in the dropdown.
   - From the dropdown menu, select the users you would like to grant permissions to.
   - After you make a selection, an additional dropdown menu will appear to the right.
   - Choose the permission level (edit, view, manage, or run) you want to give the user and click **Add**. A message in the top-right corner will confirm the permission change.

* **Moving Files**:
   - Databricks supports drag and drop for organizing your notebooks and files within **Workspace**.
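
The folder operations above can also be scripted. The sketch below assumes the Databricks SDK for Python (`databricks-sdk`) is installed and authenticated. `ensure_folder` and `list_folder` are hypothetical helpers that wrap the SDK's `workspace.mkdirs` and `workspace.list` calls, and the folder path in the usage comment is a placeholder.

```python
def ensure_folder(client, path: str) -> None:
    """Create a workspace folder (and any missing parents) at `path`."""
    client.workspace.mkdirs(path)


def list_folder(client, path: str) -> list[str]:
    """Return the paths of the objects directly inside a workspace folder."""
    return [item.path for item in client.workspace.list(path)]


# Usage inside an authenticated environment (placeholder path):
#
#   from databricks.sdk import WorkspaceClient
#   w = WorkspaceClient()
#   ensure_folder(w, "/Workspace/Users/<your-username>/demo-folder")
#   print(list_folder(w, "/Workspace/Users/<your-username>"))
```

Passing the client in as an argument keeps the helpers easy to try out with a stand-in object before pointing them at a real workspace.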



### Git Functionality

The Databricks Workspace allows you to connect your projects with Git repositories. This enables you to collaborate on code, track changes, and easily sync your work between Databricks and Git. This demo will not provide any hands-on exercises with repos other than creating a Git Folder. However, we will go through the motions of using the UI together. 

Here's how to work with repos:

**📌 Note:** Before working with Repos, you need Git credentials configured at the Databricks workspace level. Follow this <a href="https://docs.databricks.com/en/repos/repos-setup.html#set-up-databricks-repos" target="_blank">documentation</a> to set Git credentials from **User Settings**. [Databricks recommends Git folders over legacy Repos](https://docs.databricks.com/en/repos/what-happened-repos.html).

* **Add a Repo**:
   - Click on the **Workspace** button in the left navigation bar.
   - Click on **Repos**
   - Click on **Create Git Folder** in the message box at the top of the screen.
   - Provide the Git repository <a href="https://github.com/databricks/databricks-ml-examples.git" target="_blank">URL</a> and click **Create**.
   - Navigate back to **Workspace** and click on Repos and find that a folder with your username has been created. Click on it. 
   - You will find that the Git folder titled **databricks-ml-examples** has been created, along with the folder for this course, which starts with **get-started-with-databricks**.

* **Pull Changes**:
   - Inside a cloned Git folder, right-click on the folder name
   - Select the **Git** option from the dropdown.
   - Click on the **Pull** button at the top-right corner to update the repo with the latest changes.
   
* **Push Changes**:
   - Inside a cloned repo folder, click on the **Git** button.
   - Select the **branch** in which you want to push the changes.
   - Choose **Push** to send your local changes to the remote repository. Note that we haven't made any changes, so there is nothing to commit or push.

* **Commit Changes**:
   - Inside the cloned Git folder, click the Git button.
   - Select the **branch** where you want to make your changes
   - Enter the commit message
   - Choose **Commit** to save your changes along with the commit message. Note we will not follow through with this commit since we are only demonstrating the process.


### Finding Assets

In the Workspace, you can quickly find assets using the search bar at the top. The search capabilities within Databricks are built on top of DatabricksIQ. Type keywords related to the asset you're looking for, and the search will provide suggestions as you type:

* **Finding assets through searchbox**:
   - Click on the search bar at the top-left
   - Enter the desired keyword.
   - Select the matching filename from the dropdown and press Enter.
   - For advanced search options, filter the results by asset type (notebooks, dashboards, and so on) while the search box is active.
   - Try searching for the table we made earlier. 

* **Find recent files/folders**:
   - Navigate to the sidebar
   - Click on **Recents**
   - Select the file you want to access


## Working with Notebooks

Let's now explore the power of Databricks notebooks:

* Attach a Notebook to a Cluster:
   - Click on **Workspace** in the left navigation bar.
   - Select the desired folder or create a new one.
   - Right-click and choose **Create > Notebook**.
   - Name your notebook and select the previously created cluster from the dropdown.
   - Click **Confirm**.

* Creating a Cell
   - Navigate to the bottom of an existing cell.
   - Click on the **(+) Code** icon to add new cell.   

* Running a Cell:
   - Cells in a notebook can be executed using the **Run** button at the top-left corner of the cell or by pressing **Shift + Enter**.

* Run all cells:
   - Click on the **Run all** button to run all cells in the notebook at once.

* Create Python, SQL cells:
   - Navigate to the language switcher at the top-right of the cell.
   - Select the desired language for your cell.
   - Alternatively, type **%py** or **%sql** at the top of the cell.

* View cell outputs:
   - Notebooks support creating interactive charts to visualize data. 
   - You can view the schema of a Spark DataFrame as a part of the output as well. 

### Left sidebar actions
   - Click on the **Table of contents** icon between the left sidebar and the topmost cell to access the notebook's table of contents.
   - Click on the folder icon to access **folder** structure of the workspace
   - Navigate to the **Catalog** icon to get a list of available catalogs, schemas, and other assets. 
   - Navigate to **Assistant** (generally available) for viewing code suggestions, diagnosing errors, etc.

### Right sidebar actions 
   - Click on the message icon to add **Comments** on existing code
   - Use the **MLflow experiments** icon to create a workspace experiment.
   - Access code versioning history through the **Version history** icon.
   - Get a list of variables used in a notebook by navigating to **Variable explorer** icon.
   - Discover Python libraries used in the notebook by navigating to the **Python libraries** icon.
   - View your **Environment** as well, which shows your Python environment configuration. 


### Working with Markdown

Working with Markdown Cells:
- Markdown cells allow you to add formatted text and documentation to your notebook.
- Create a new cell, change its type to Markdown, and enter some text using Markdown syntax.
- Alternatively, type **%md** at the top of the cell.

Editing a Markdown cell:
- Double click this cell to begin editing it
- Then hit **`Esc`** to stop editing


### Markdown Example

# Title One
## Title Two
### Title Three

This is a test of the emergency broadcast system. This is only a test.

This is text with a **bold** word in it.

This is text with an *italicized* word in it.

This is an ordered list
1. one
1. two
1. three

This is an unordered list
* apples
* peaches
* bananas

Links/Embedded HTML: <a href="https://en.wikipedia.org/wiki/Markdown" target="_blank">Markdown - Wikipedia</a>

Images:
![Spark Engines](https://files.training.databricks.com/images/Apache-Spark-Logo_TM_200px.png)

And of course, tables:

| name   | value |
|--------|-------|
| Yi     | 1     |
| Ali    | 2     |
| Selina | 3     |


### Example 1: Executing SQL Code

In [0]:
%sql
SELECT * FROM `retail-customers`;

customer_id,tax_id,tax_code,customer_name,state,city,postcode,street,number,unit,region,district,lon,lat,ship_to_address,valid_from,valid_to,units_purchased,loyalty_segment
11123757,,,"SMITH, SHIRLEY",IN,BREMEN,46506.0,N CENTER ST,521.0,,Indiana,50.0,-86.1465825,41.4507625,"IN, 46506.0, N CENTER ST, 521.0",1532824233,1548137353.0,34.0,3
30585978,,,"STEPHENS, GERALDINE M",OR,ADDRESS,0,NO SITUS,,,,,-122.1055158,45.374317,"OR, 0, NO SITUS, nan",1523100473,,18.0,3
349822,,,"GUZMAN, CARMEN",VA,VIENNA,22181,HILL RD,2860,,VA,,-77.2941261,38.88303270000001,"VA, 22181, HILL RD, 2860",1522922493,,5.0,0
27652636,,,"HASSETT, PATRICK J",WI,VILLAGE OF NASHOTAH,53058.0,IVY LANE,W333N 5591,,,,-88.40951700000002,43.1213789,"WI, 53058.0, IVY LANE, W333N 5591",1531834357,1558052195.0,7.0,1
14437343,,,"HENTZ, DIANA L",OH,COLUMBUS,43228.0,ALLIANCE WAY,5706,,OH,FRA,-83.158438,39.97821810000001,"OH, 43228.0, ALLIANCE WAY, 5706",1517227530,,0.0,0
20441596,,,"TIRADO, MARCO A",NY,Otselic,13072,County Road 16,2792,,NY,Chenango,-75.7505808,42.7172722,"NY, 13072, County Road 16, 2792",1519335250,,24.0,3
5945686,,,"SKORA, BRIAN S",MI,,48205.0,E 8 MILE RD,16414.0,,,,-82.950874,42.4499233,"MI, 48205.0, E 8 MILE RD, 16414.0",1518988242,,7.0,1
5385771,,,"SLAWEK, DEAN J",PA,,19147-3204,FITZWATER ST,328,,,,-75.14920550000002,39.9389473,"PA, 19147-3204, FITZWATER ST, 328",1518239268,,18.0,3
1427940,,,"REAVES, LIONEL C",VA,HOT SPRINGS,24445.0,HOT SPRINGS RD,6419.0,,,,-79.90497859999998,37.8949737,"VA, 24445.0, HOT SPRINGS RD, 6419.0",1529087690,,10.0,2
10457387,,,"BONGIOVANNI, KELLY M",IN,VINCENNES,47591,JERRY ST,2006.0,,Indiana,42.0,-87.519002,38.662178,"IN, 47591, JERRY ST, 2006.0",1535887733,,9.0,2
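
The same table can also be queried from a Python cell. The sketch below assumes it runs inside a Databricks notebook, where `spark` and `display` are predefined; anywhere else it only builds and prints the query string. The column selection and the `LIMIT` are illustrative choices, not part of the original demo.

```python
# Column names taken from the query output above.
columns = ["customer_name", "state", "units_purchased", "loyalty_segment"]
query = f"SELECT {', '.join(columns)} FROM `retail-customers` LIMIT 5"

# `spark` and `display` exist only inside a Databricks notebook session.
if "spark" in globals():
    display(spark.sql(query))
else:
    print(query)
```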


### Example 2: Executing Python Code

In [0]:
print("This is a Python cell!!")

This is a Python cell!!


### Example 3: Using the Databricks Assistant

Here, we show how to use the Databricks Assistant to write Python code that sums the integers 1 through 10.

1. **Copy the following prompt:** Use Python to compute the sum of integers 1 through 10. 
1. Create a new code cell below this one and click the Assistant icon in the top right. Alternatively, you can press **Command/Alt** + **I** on your keyboard. 
1. Paste the prompt and click **Generate**. 
1. Click **Run suggested**. The output will be `55`. 
1. Click the blue **Accept** button to the right.


In [0]:
total_sum = sum(range(1, 11))
display(total_sum)

55
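
As a quick sanity check on the generated code, the sum of the integers 1 through n also has the closed form n(n + 1) / 2. The short sketch below compares the two results; the variable names are illustrative.

```python
n = 10
loop_sum = sum(range(1, n + 1))   # same computation the Assistant generated
closed_form = n * (n + 1) // 2    # closed form: 10 * 11 / 2

# Both approaches should agree.
assert loop_sum == closed_form
print(loop_sum, closed_form)
```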


## Unity Catalog and Catalog Explorer

Managing permissions is essential for controlling who can access and perform actions on your data and resources. Unity Catalog allows data asset owners to manage permissions using the Catalog Explorer UI or using SQL commands.

### Granting Table Permissions to Users with the UI

1. Navigate to **Catalog** in the sidebar and search for the catalog we made earlier (see **Create a sample table** within this notebook).
2. Navigate to the **Schema** within the catalog.
3. Click the table we created earlier in **Catalog Explorer** to open the table details page, and go to the **Permissions** tab. 
4. Click **Grant**. 
    - Select the users and groups you want to give permission to. 
    - Select the privileges you want to grant. For example, assign `SELECT` (read) privilege. 
    - Under **Privilege presets**, you can grant broad privileges as a Data Reader or Data Editor:
      - **Data Reader**: can read from any object in the catalog. 
      - **Data Editor**: can read and modify any object in the catalog, as well as create new objects. 
5. Click **Grant** once you have made your selections. 

Inside **Catalog Explorer**, we can also see options for creating and uploading other assets such as **volumes** and **models**.

### Granting Table Permissions to Users Using SQL Statements

We can also grant permissions via SQL. Run the following cell. You can view the output after running the cell and also verify the result within the Catalog Explorer using the previous instructions.

In [0]:
%sql
GRANT SELECT ON TABLE `retail-customers` TO `account users`;
SHOW GRANTS ON TABLE `retail-customers`

Principal,ActionType,ObjectType,ObjectKey
account users,SELECT,TABLE,labuser8027617_1732830335_zsw6_da.default.retail-customers
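
When several principals or privileges are involved, the same statement can be generated from Python. This is a minimal sketch: `grant_statement` is a hypothetical helper, not a Databricks API, and it only builds the SQL text. Inside a notebook, where `spark` is predefined, each statement is then passed to `spark.sql`.

```python
def grant_statement(privilege: str, table: str, principal: str) -> str:
    """Return a GRANT statement with the table and principal backtick-quoted."""
    return f"GRANT {privilege} ON TABLE `{table}` TO `{principal}`"


# `account users` is the group used in the cell above; add more users or
# groups to this list as needed.
principals = ["account users"]
statements = [
    grant_statement("SELECT", "retail-customers", principal)
    for principal in principals
]

for statement in statements:
    print(statement)
    # `spark` exists only inside a Databricks notebook session.
    if "spark" in globals():
        spark.sql(statement)
```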


## (Optional) Understanding Compute


1. On the left sidebar, you'll see **Compute**. Click on it.

2. You will be taken to a screen where, at the top, you will see six different tabs:
   - **All-Purpose Compute**
   - **Job Compute**
   - **SQL Warehouses**
   - **Vector Search**
   - **Pools**
   - **Apps**

3. Inside **All-Purpose Compute**, click on **Create with DBAcademy** at the top right.

4. You will be taken to a screen where you can view how to create a new compute instance. For example, you will be able to set compute policy options such as:
   - **Multi-node** or **Single-node**.

5. You will see **Access Mode**, which contains:
   - **Single User**
   - **Shared**
   - **No Isolation Shared**

6. Additionally, you will see options for:
   - **Performance**
   - **Node Type**
   - **Instance Profile**
   - **Tags**
   - Other advanced options.

7. In the **Summary** section, you will see a couple of tags, such as:
   - **Unity Catalog** being enabled for this compute
   - The **Runtime** details for the compute.

8. Since we are not creating a new compute, go ahead and click **Cancel**.
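
The options shown in the compute UI correspond to fields in the JSON specification accepted by the Databricks Clusters API. The sketch below builds such a specification as a plain Python dictionary; it does not create anything. The helper `build_cluster_spec`, the cluster name, and the runtime and node type strings are placeholders chosen for illustration. The mapping of access-mode labels to `data_security_mode` values is an assumption that should be checked against the current API documentation, and a real request may need additional fields.

```python
import json

# Assumed mapping from the UI access-mode labels to API values.
ACCESS_MODES = {
    "Single User": "SINGLE_USER",
    "Shared": "USER_ISOLATION",
    "No Isolation Shared": "NONE",
}


def build_cluster_spec(name, runtime, node_type, workers, access_mode, photon=False):
    """Return a dictionary shaped like a Clusters API create request."""
    spec = {
        "cluster_name": name,
        "spark_version": runtime,      # Databricks Runtime version string
        "node_type_id": node_type,     # cloud-specific instance type
        "num_workers": workers,        # number of worker nodes
        "data_security_mode": ACCESS_MODES[access_mode],
    }
    if photon:
        spec["runtime_engine"] = "PHOTON"
    return spec


# Placeholder values; pick a runtime and node type available in your workspace.
spec = build_cluster_spec(
    name="demo-cluster",
    runtime="<runtime-version>",
    node_type="<node-type>",
    workers=2,
    access_mode="Single User",
    photon=True,
)
print(json.dumps(spec, indent=2))
```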


### Understanding Compute and Runtimes in Databricks

Databricks offers various types of compute for different tasks, including:

1. **Serverless Compute for a Notebook**
2. **Serverless Compute for Jobs**
3. **All-Purpose Compute**
4. **Jobs Compute**
5. **Instance Pools**
6. **Serverless SQL Warehouses**
7. **Classic SQL Warehouses**

We also have the choice of using a CPU or a GPU for our compute type. 


### Photon

In addition to these compute options, Databricks provides **Photon**, a high-performance native vectorized query engine. Photon runs SQL workloads and DataFrame API calls faster, reducing the total cost per workload.

### Serverless Compute

**Serverless compute** enhances productivity, efficiency, and reliability by automatically managing infrastructure needs for your workloads.

In addition to the different types of compute, there are also various **Databricks Runtimes**. Each Databricks runtime version includes updates that improve usability, performance, and security. All of this is managed infrastructure provided by Databricks.

The **Databricks Runtime** on your compute adds many features, such as **Delta Lake** and pre-installed **Java**, **Scala**, **Python**, and **R** libraries.

### Databricks Machine Learning Runtime

The **Databricks Machine Learning Runtime** provides scalable clusters that include:
- Popular machine learning frameworks
- Built-in **AutoML**
- Optimizations for performance enhancements in data science and machine learning tasks


## Clean up Classroom

After completing the demo, clean up any resources created.

Run the following cell to remove lesson-specific assets created during this lesson.

In [0]:
DA.cleanup()

Resetting the learning environment:
| dropping the catalog "labuser8027617_1732830335_zsw6_da"...(0 seconds)

Validating the locally installed datasets:
| listing local files...(0 seconds)
| validation completed...(0 seconds total)



## Conclusion

In this demo, we explored essential aspects of working with the Databricks workspace. We went through the workspace homepage, the sidebar, user settings, and how demos and labs are set up using the classroom setup script. We explored the workspace and its navigation functionality, including how Git is integrated with Databricks. We also looked at how to work with notebooks and how to manage permissions with Unity Catalog. There is much more to discuss, but these are the highlights of the Data Intelligence Platform.


&copy; 2024 Databricks, Inc. All rights reserved.<br/>
Apache, Apache Spark, Spark and the Spark logo are trademarks of the 
<a href="https://www.apache.org/">Apache Software Foundation</a>.<br/>
<br/><a href="https://databricks.com/privacy-policy">Privacy Policy</a> | 
<a href="https://databricks.com/terms-of-use">Terms of Use</a> | 
<a href="https://help.databricks.com/">Support</a>