<div style="text-align: center; line-height: 0; padding-top: 9px;">
  <img src="https://databricks.com/wp-content/uploads/2018/03/db-academy-rgb-1200px.png" alt="Databricks Learning" style="width: 400px">
</div>

# Databricks Platform
1. Execute code in multiple languages
1. Create documentation cells
1. Access DBFS (Databricks File System)
1. Create database and table
1. Query table and plot results
1. Add notebook parameters with widgets

##### Databricks Notebook Utilities
- <a href="https://docs.databricks.com/notebooks/notebooks-use.html#language-magic" target="_blank">Magic commands</a>: `%python`, `%scala`, `%sql`, `%r`, `%sh`, `%md`
- <a href="https://docs.databricks.com/dev-tools/databricks-utils.html" target="_blank">Databricks utilities</a>: `dbutils.fs` (`%fs`), `dbutils.notebook` (`%run`), `dbutils.widgets`
- <a href="https://docs.databricks.com/notebooks/visualizations/index.html" target="_blank">Visualization</a>: `display`, `displayHTML`

### Setup
Run classroom setup to mount Databricks training datasets and create your own database for BedBricks.

Use the `%run` magic command to run another notebook from within this notebook.

In [0]:
%run ./Includes/Classroom-Setup

### ![Spark Logo Tiny](https://files.training.databricks.com/images/105/logo_spark_tiny.png) Execute code in multiple languages
Run code in the notebook's default language

In [0]:
print("Run default language")

Run default language


Run language specified by language magic command: `%python`, `%scala`, `%sql`, `%r`

In [0]:
%python
print("Run python")

Run python


In [0]:
%scala
println("Run scala")

In [0]:
%sql
select "Run SQL"

Run SQL
Run SQL


In [0]:
%r
print("Run R", quote=FALSE)

Run shell commands using magic command: `%sh`

In [0]:
%sh ps | grep 'java'

  286 ?        00:03:35 java
  487 ?        00:10:54 java


Render HTML using the function: `displayHTML` (available in Python, Scala, and R)

In [0]:
html = """<h1 style="color:orange;text-align:center;font-family:Courier">Render HTML</h1>"""
displayHTML(html)

### ![Spark Logo Tiny](https://files.training.databricks.com/images/105/logo_spark_tiny.png) Create documentation cells
Render cell as <a href="https://www.markdownguide.org/cheat-sheet/" target="_blank">Markdown</a> using magic command: `%md`

# Heading 1
### Heading 3
> block quote

1. **bold**
2. *italicized*
3. ~~strikethrough~~
---
- [link](https://www.markdownguide.org/cheat-sheet/)
- `code`

```
{
  "message": "This is a code block",
  "method": "https://www.markdownguide.org/extended-syntax/#fenced-code-blocks",
  "alternative": "https://www.markdownguide.org/basic-syntax/#code-blocks"
}
```

![Spark Logo](https://files.training.databricks.com/images/Apache-Spark-Logo_TM_200px.png)

| Element         | Markdown Syntax |
|-----------------|-----------------|
| Heading         | `# H1` `## H2` `### H3` `#### H4` `##### H5` `###### H6` |
| Block quote     | `> blockquote` |
| Bold            | `**bold**` |
| Italic          | `*italicized*` |
| Strikethrough   | `~~strikethrough~~` |
| Horizontal Rule | `---` |
| Code            | ``` `code` ``` |
| Link            | `[text](https://www.example.com)` |
| Image           | `![alt text](image.jpg)`|
| Ordered List    | `1. First Item` <br> `2. Second Item` <br> `3. Third Item` |
| Unordered List  | `- First Item` <br> `- Second Item` <br> `- Third Item` |
| Code Block      | ```` ``` ```` <br> `code block` <br> ```` ``` ````|
| Table           |<code> &#124; col &#124; col &#124; col &#124; </code> <br> <code> &#124;---&#124;---&#124;---&#124; </code> <br> <code> &#124; val &#124; val &#124; val &#124; </code> <br> <code> &#124; val &#124; val &#124; val &#124; </code> <br>|

### ![Spark Logo Tiny](https://files.training.databricks.com/images/105/logo_spark_tiny.png) Access DBFS (Databricks File System)
Run file system commands on DBFS using magic command: `%fs`

In [0]:
%fs ls

path,name,size
dbfs:/FileStore/,FileStore/,0
dbfs:/databricks-datasets/,databricks-datasets/,0
dbfs:/databricks-results/,databricks-results/,0
dbfs:/mnt/,mnt/,0
dbfs:/user/,user/,0


In [0]:
%fs ls /databricks-datasets

path,name,size
dbfs:/databricks-datasets/,databricks-datasets/,0
dbfs:/databricks-datasets/COVID/,COVID/,0
dbfs:/databricks-datasets/README.md,README.md,976
dbfs:/databricks-datasets/Rdatasets/,Rdatasets/,0
dbfs:/databricks-datasets/SPARK_README.md,SPARK_README.md,3359
dbfs:/databricks-datasets/adult/,adult/,0
dbfs:/databricks-datasets/airlines/,airlines/,0
dbfs:/databricks-datasets/amazon/,amazon/,0
dbfs:/databricks-datasets/asa/,asa/,0
dbfs:/databricks-datasets/atlas_higgs/,atlas_higgs/,0


In [0]:
%fs head /databricks-datasets/README.md

In [0]:
%fs mounts

mountPoint,source,encryptionType
/mnt/training,s3a://databricks-corp-training/common,
/databricks-datasets,databricks-datasets,sse-s3
/databricks/mlflow-tracking,databricks/mlflow-tracking,sse-s3
/databricks-results,databricks-results,sse-s3
/databricks/mlflow-registry,databricks/mlflow-registry,sse-s3
/,DatabricksRoot,sse-s3


`%fs` is shorthand for the <a href="https://docs.databricks.com/dev-tools/databricks-utils.html" target="_blank">DBUtils</a> module: `dbutils.fs`

In [0]:
%fs help

Run file system commands on DBFS using DBUtils directly

In [0]:
dbutils.fs.ls("/databricks-datasets")

Out[8]: [FileInfo(path='dbfs:/databricks-datasets/', name='databricks-datasets/', size=0),
 FileInfo(path='dbfs:/databricks-datasets/COVID/', name='COVID/', size=0),
 FileInfo(path='dbfs:/databricks-datasets/README.md', name='README.md', size=976),
 FileInfo(path='dbfs:/databricks-datasets/Rdatasets/', name='Rdatasets/', size=0),
 FileInfo(path='dbfs:/databricks-datasets/SPARK_README.md', name='SPARK_README.md', size=3359),
 FileInfo(path='dbfs:/databricks-datasets/adult/', name='adult/', size=0),
 FileInfo(path='dbfs:/databricks-datasets/airlines/', name='airlines/', size=0),
 FileInfo(path='dbfs:/databricks-datasets/amazon/', name='amazon/', size=0),
 FileInfo(path='dbfs:/databricks-datasets/asa/', name='asa/', size=0),
 FileInfo(path='dbfs:/databricks-datasets/atlas_higgs/', name='atlas_higgs/', size=0),
 FileInfo(path='dbfs:/databricks-datasets/bikeSharing/', name='bikeSharing/', size=0),
 FileInfo(path='dbfs:/databricks-datasets/cctvVideos/', name='cctvVideos/', size=0),
 FileInfo

Visualize results in a table using the Databricks <a href="https://docs.databricks.com/notebooks/visualizations/index.html#display-function-1" target="_blank">display</a> function

In [0]:
files = dbutils.fs.ls("/databricks-datasets")
display(files)

path,name,size
dbfs:/databricks-datasets/,databricks-datasets/,0
dbfs:/databricks-datasets/COVID/,COVID/,0
dbfs:/databricks-datasets/README.md,README.md,976
dbfs:/databricks-datasets/Rdatasets/,Rdatasets/,0
dbfs:/databricks-datasets/SPARK_README.md,SPARK_README.md,3359
dbfs:/databricks-datasets/adult/,adult/,0
dbfs:/databricks-datasets/airlines/,airlines/,0
dbfs:/databricks-datasets/amazon/,amazon/,0
dbfs:/databricks-datasets/asa/,asa/,0
dbfs:/databricks-datasets/atlas_higgs/,atlas_higgs/,0


### ![Spark Logo Tiny](https://files.training.databricks.com/images/105/logo_spark_tiny.png) Create table
Run <a href="https://docs.databricks.com/spark/latest/spark-sql/language-manual/index.html#sql-reference" target="_blank">Databricks SQL Commands</a> to create a table named `events` using BedBricks event files on DBFS.

In [0]:
%sql
CREATE TABLE IF NOT EXISTS events USING parquet OPTIONS (path "/mnt/training/ecommerce/events/events.parquet");

This table was saved in the database created for you in classroom setup. See database name printed below.

In [0]:
print(databaseName)

spark_programming_jaimeverapalominogmailcom_py


View your database and table in the Data tab of the UI.

### ![Spark Logo Tiny](https://files.training.databricks.com/images/105/logo_spark_tiny.png) Query table and plot results
Use SQL to query the `events` table

In [0]:
%sql
SELECT * FROM events

Run the query below and then <a href="https://docs.databricks.com/notebooks/visualizations/index.html#plot-types" target="_blank">plot</a> results by selecting the bar chart icon

In [0]:
%sql
SELECT traffic_source, SUM(ecommerce.purchase_revenue_in_usd) AS total_revenue
FROM events
GROUP BY traffic_source

traffic_source,total_revenue
instagram,16177893.0
direct,12704560.0
youtube,8044326.0
email,78800000.29999995
facebook,24797837.0
google,47218429.0
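
A bar chart of these totals is easier to read when the bars are sorted. The sketch below is a plain-Python illustration only: it copies the rows from the output above and orders them by revenue, which mirrors what adding `ORDER BY total_revenue DESC` to the query would do. The variable names `rows` and `ranked` are introduced here for illustration.

```python
# Rows copied from the query output above: (traffic_source, total_revenue).
rows = [
    ("instagram", 16177893.0),
    ("direct", 12704560.0),
    ("youtube", 8044326.0),
    ("email", 78800000.29999995),
    ("facebook", 24797837.0),
    ("google", 47218429.0),
]

# Same ordering that ORDER BY total_revenue DESC would produce in SQL.
ranked = sorted(rows, key=lambda row: row[1], reverse=True)

for source, revenue in ranked:
    print(f"{source:<10} {revenue:>15,.2f}")
```

In the notebook itself, sorting in SQL is usually the simpler choice, since the chart is drawn directly from the query result.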


### ![Spark Logo Tiny](https://files.training.databricks.com/images/105/logo_spark_tiny.png) Add notebook parameters with widgets
Use <a href="https://docs.databricks.com/notebooks/widgets.html" target="_blank">widgets</a> to add input parameters to your notebook.

Create a text input widget using SQL.

In [0]:
%sql
CREATE WIDGET TEXT state DEFAULT "CA"

Access the current value of the widget using the function `getArgument`

In [0]:
%sql
SELECT *
FROM events
WHERE geo.state = getArgument("state")

device,ecommerce,event_name,event_previous_timestamp,event_timestamp,geo,items,traffic_source,user_first_touch_timestamp,user_id
macOS,"List(null, null, null)",add_item,1593878792892652.0,1593878815459100,"List(Salinas, CA)","List(List(null, M_STAN_T, Standard Twin Mattress, 595.0, 595.0, 1))",youtube,1593878455472030,UA000000107375547
Android,"List(null, null, null)",warranty,1593878529774474.0,1593879213196400,"List(Rancho Santa Margarita, CA)",List(),instagram,1593878529774474,UA000000107376205
Windows,"List(null, null, null)",reviews,1593876442432487.0,1593876944661570,"List(Concord, CA)",List(),direct,1593876442432487,UA000000107357467
iOS,"List(null, null, null)",reviews,1593878353149193.0,1593878356880855,"List(Los Angeles, CA)",List(),facebook,1593878353149193,UA000000107374663
macOS,"List(null, null, null)",main,,1593879039209043,"List(Newark, CA)",List(),direct,1593879039209043,UA000000107380780
Linux,"List(null, null, null)",main,,1593877977050669,"List(Fairfield, CA)",List(),instagram,1593877977050669,UA000000107371192
macOS,"List(null, null, null)",original,1593877933738325.0,1593878472750958,"List(Moreno Valley, CA)",List(),google,1593877933738325,UA000000107370829
Windows,"List(null, null, null)",careers,,1593878120615516,"List(Lynwood, CA)",List(),google,1593878120615516,UA000000107372505
Android,"List(null, null, null)",warranty,1593876568777492.0,1593878762928254,"List(San Diego, CA)",List(),email,1593876568777492,UA000000107358569
macOS,"List(null, null, null)",mattresses,,1593879062248629,"List(Davis, CA)",List(),facebook,1593879062248629,UA000000107380991


Remove the text widget

In [0]:
%sql
REMOVE WIDGET state

To create widgets in Python, Scala, and R, use the DBUtils module: `dbutils.widgets`

In [0]:
dbutils.widgets.text("name", "Brickster", "Name")
dbutils.widgets.multiselect("colors", "orange", ["red", "orange", "black", "blue"], "Colors")

Access the current value of the widget using the `dbutils.widgets` function `get`

In [0]:
name = dbutils.widgets.get("name")
colors = dbutils.widgets.get("colors").split(",")

html = "<div>Hi {}! Select your color preference.</div>".format(name)
for c in colors:
  html += """<label for="{}" style="color:{}"><input type="radio"> {}</label><br>""".format(c, c, c)

displayHTML(html)
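
Because the `name` widget accepts free text, its value may contain characters that are meaningful in HTML. The following is a minimal sketch, assuming you want to escape widget values before rendering them; `build_color_form` is a hypothetical helper, not part of Databricks, and it runs outside a notebook because it only builds a string.

```python
import html


def build_color_form(name, colors):
    """Build the greeting and one radio button per color, escaping all values."""
    safe_name = html.escape(name)
    parts = ["<div>Hi {}! Select your color preference.</div>".format(safe_name)]
    for c in colors:
        safe = html.escape(c, quote=True)
        # A shared name attribute makes the radio buttons mutually exclusive.
        parts.append(
            '<label style="color:{0}">'
            '<input type="radio" name="color" value="{0}"> {0}</label><br>'.format(safe)
        )
    return "".join(parts)


# Example values standing in for dbutils.widgets.get(...) results.
page = build_color_form("<b>Brickster</b>", ["red", "orange"])
print(page)
```

In a Databricks Python cell, the returned string could be passed to `displayHTML` in place of the hand-built `html` variable.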

Remove all widgets

In [0]:
dbutils.widgets.removeAll()

## ![Spark Logo Tiny](https://files.training.databricks.com/images/105/logo_spark_tiny.png) BedBricks Datasets Lab
Explore BedBricks datasets
1. View data files in DBFS using magic commands
1. View data files in DBFS using dbutils
1. Create tables from files in DBFS
1. Execute SQL to answer questions on BedBricks datasets

### 1. View data files in DBFS using magic commands
Use a magic command to display files located in the DBFS directory: **`/mnt/training/ecommerce`**

<img alt="Hint" title="Hint" style="vertical-align: text-bottom; position: relative; height:1.75em; top:0.3em" src="https://files.training.databricks.com/static/images/icon-light-bulb.svg"/>&nbsp;**Hint:** You should see four directories (`events`, `products`, `sales`, `users`) alongside a `README.md` file

In [0]:
%fs ls /mnt/training/ecommerce

path,name,size
dbfs:/mnt/training/ecommerce/README.md,README.md,621
dbfs:/mnt/training/ecommerce/events/,events/,0
dbfs:/mnt/training/ecommerce/products/,products/,0
dbfs:/mnt/training/ecommerce/sales/,sales/,0
dbfs:/mnt/training/ecommerce/users/,users/,0


### 2. View data files in DBFS using dbutils
- Use **`dbutils`** to list the files in the directory above and save the result to the variable **`files`**
- Use the Databricks `display()` function to display the contents of **`files`**

<img alt="Hint" title="Hint" style="vertical-align: text-bottom; position: relative; height:1.75em; top:0.3em" src="https://files.training.databricks.com/static/images/icon-light-bulb.svg"/>&nbsp;**Hint:** You should see four items: `events`, `products`, `sales`, `users`

In [0]:
files = dbutils.fs.ls("/mnt/training/ecommerce")
display(files)

path,name,size
dbfs:/mnt/training/ecommerce/README.md,README.md,621
dbfs:/mnt/training/ecommerce/events/,events/,0
dbfs:/mnt/training/ecommerce/products/,products/,0
dbfs:/mnt/training/ecommerce/sales/,sales/,0
dbfs:/mnt/training/ecommerce/users/,users/,0
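
The listing mixes one plain file with the dataset directories. As a rough sketch, the directories can be picked out by the trailing `/` in each entry's `name`. To keep the example runnable outside Databricks, `FileInfo` below is a stand-in `namedtuple` (an assumption, not the real class) populated with the values from the output above; in the notebook you would use the `files` list returned by `dbutils.fs.ls` instead.

```python
from collections import namedtuple

# Stand-in for the objects returned by dbutils.fs.ls (assumption for running
# outside Databricks); values are copied from the listing above.
FileInfo = namedtuple("FileInfo", ["path", "name", "size"])

files = [
    FileInfo("dbfs:/mnt/training/ecommerce/README.md", "README.md", 621),
    FileInfo("dbfs:/mnt/training/ecommerce/events/", "events/", 0),
    FileInfo("dbfs:/mnt/training/ecommerce/products/", "products/", 0),
    FileInfo("dbfs:/mnt/training/ecommerce/sales/", "sales/", 0),
    FileInfo("dbfs:/mnt/training/ecommerce/users/", "users/", 0),
]

# Directory entries carry a trailing slash in their name.
directories = sorted(f.name.rstrip("/") for f in files if f.name.endswith("/"))
print(directories)
```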


### 3. Create tables from files in DBFS
- Create `users` table using files at location `"/mnt/training/ecommerce/users/users.parquet"` 
- Create `sales` table using files at location `"/mnt/training/ecommerce/sales/sales.parquet"` 
- Create `products` table using files at location `"/mnt/training/ecommerce/products/products.parquet"` 

(The `events` table was created above using files at location `"/mnt/training/ecommerce/events/events.parquet"`)

In [0]:
%sql
CREATE TABLE IF NOT EXISTS users USING parquet OPTIONS (path "/mnt/training/ecommerce/users/users.parquet");
CREATE TABLE IF NOT EXISTS products USING parquet OPTIONS (path "/mnt/training/ecommerce/products/products.parquet");
CREATE TABLE IF NOT EXISTS sales USING parquet OPTIONS (path "/mnt/training/ecommerce/sales/sales.parquet");
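
The three statements differ only in the table name, so they can also be generated. This is a small sketch under the assumption that every table follows the `/mnt/training/ecommerce/<name>/<name>.parquet` layout shown above; `create_table_sql` is a hypothetical helper introduced for illustration.

```python
def create_table_sql(table):
    """Return the CREATE TABLE statement for one BedBricks table."""
    path = "/mnt/training/ecommerce/{0}/{0}.parquet".format(table)
    return 'CREATE TABLE IF NOT EXISTS {} USING parquet OPTIONS (path "{}")'.format(
        table, path
    )


statements = [create_table_sql(t) for t in ["users", "products", "sales"]]
for statement in statements:
    print(statement)
    # In a Databricks Python cell, where `spark` is predefined, each statement
    # could be executed with: spark.sql(statement)
```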

Use the Data tab of the workspace UI to confirm your tables were created.

### 4. Execute SQL to answer questions on BedBricks datasets

##### What products are available for purchase at BedBricks?
Execute a SQL query that selects all from the **`products`** table

<img alt="Hint" title="Hint" style="vertical-align: text-bottom; position: relative; height:1.75em; top:0.3em" src="https://files.training.databricks.com/static/images/icon-light-bulb.svg"/>&nbsp;**Hint:** You should see 12 products.

In [0]:
%sql
SELECT * FROM products

item_id,name,price
M_PREM_Q,Premium Queen Mattress,1795.0
M_STAN_F,Standard Full Mattress,945.0
M_PREM_F,Premium Full Mattress,1695.0
M_PREM_T,Premium Twin Mattress,1095.0
M_PREM_K,Premium King Mattress,1995.0
P_DOWN_S,Standard Down Pillow,119.0
M_STAN_Q,Standard Queen Mattress,1045.0
M_STAN_K,Standard King Mattress,1195.0
M_STAN_T,Standard Twin Mattress,595.0
P_FOAM_S,Standard Foam Pillow,59.0


##### What is the average purchase revenue for a transaction at BedBricks?
Execute a SQL query that computes the average **`purchase_revenue_in_usd`** from the **`sales`** table

<img alt="Hint" title="Hint" style="vertical-align: text-bottom; position: relative; height:1.75em; top:0.3em" src="https://files.training.databricks.com/static/images/icon-light-bulb.svg"/>&nbsp;**Hint:** The result should be `1042.79`

In [0]:
%sql
SELECT AVG(purchase_revenue_in_usd) FROM sales

avg(purchase_revenue_in_usd)
1042.7902657223433
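
The hint quotes the average to two decimal places, while the query returns full precision. A quick check in Python, using the value copied from the output above, shows the two agree; in SQL the same rounding could be applied with `ROUND(AVG(purchase_revenue_in_usd), 2)`.

```python
avg_revenue = 1042.7902657223433  # value copied from the query output above
rounded = round(avg_revenue, 2)   # round to two decimal places, as in the hint
print(rounded)
```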


##### What types of events are recorded on the BedBricks website?
Execute a SQL query that selects distinct values in **`event_name`** from the **`events`** table

<img alt="Hint" title="Hint" style="vertical-align: text-bottom; position: relative; height:1.75em; top:0.3em" src="https://files.training.databricks.com/static/images/icon-light-bulb.svg"/>&nbsp;**Hint:** You should see 23 distinct `event_name` values

In [0]:
%sql
SELECT DISTINCT event_name FROM events

event_name
mattresses
down
press
shipping_info
main
warranty
finalize
login
faq
careers


### Clean up classroom

In [0]:
%run ./Includes/Classroom-Cleanup
