
<div style="text-align: center; line-height: 0; padding-top: 9px;">
  <img src="https://databricks.com/wp-content/uploads/2018/03/db-academy-rgb-1200px.png" alt="Databricks Learning" style="width: 600px; height: 163px">
</div>

# 1.4 SQL in Notebooks

## ![Spark Logo Tiny](https://files.training.databricks.com/images/105/logo_spark_tiny.png) In this lesson you:<br>
* Create a database and table
* Compute aggregate statistics against a dataset 
* Create visualizations

### Mounting Data

We are going to run a Classroom Setup script to mount the data we will be using throughout the class.

A [mount](https://docs.databricks.com/spark/latest/data-sources/aws/amazon-s3.html#mount-an-s3-bucket) is a pointer to a remote storage location (typically cloud object storage on AWS or Azure) that lets us access the data from within Databricks.

In [0]:
%run ../Includes/Classroom-Setup

## ![Spark Logo Tiny](https://files.training.databricks.com/images/105/logo_spark_tiny.png) Create a Database and Table

## SF Fire Department Calls for Service

Let's take a look at the [SF Fire Department Calls for Service](https://data.sfgov.org/Public-Safety/Fire-Department-Calls-for-Service/nuek-vuh3/data) dataset. This dataset includes all fire unit responses to calls.

In [0]:
%fs head /mnt/davis/fire-calls/fire-calls-truncated-comma.csv

For this class, we want to use a dedicated database to store our tables. Let's call it `Databricks`.

In [0]:
%sql
CREATE DATABASE IF NOT EXISTS Databricks

In [0]:
%sql
USE Databricks

## Create Table

Let's create a table called `fireCalls` from the CSV file so we can query it using SQL.

In [0]:
%sql
DROP TABLE IF EXISTS fireCalls;

CREATE TABLE IF NOT EXISTS fireCalls
USING csv
OPTIONS (
  header "true",
  path "/mnt/davis/fire-calls/fire-calls-truncated-comma.csv",
  inferSchema "true"
)
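
To make the `header` and `inferSchema` options more concrete, here is a minimal Python sketch of the idea behind them. It is only an illustration and not how Spark implements either option: the helper `infer_value` is hypothetical, it guesses a type per value rather than per column, and the sample text is the first data row above trimmed to three columns.

```python
import csv
import io

# Header plus first data row from the file above, trimmed to three columns for illustration.
sample = "Call Number,Unit ID,ALS Unit\n1030118,E08,False\n"


def infer_value(text):
    """Very rough type guess for one field: int, float, bool, else string (hypothetical helper)."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    if text in ("True", "False"):
        return text == "True"
    return text


# DictReader treats the first line as column names, similar in spirit to header "true".
reader = csv.DictReader(io.StringIO(sample))

# Converting each field mimics the effect of inferSchema "true" on this tiny sample.
records = [{name: infer_value(value) for name, value in row.items()} for row in reader]
print(records)
```

Without the header option the first line would be read as data, and without schema inference every field would stay a string.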

## ![Spark Logo Tiny](https://files.training.databricks.com/images/105/logo_spark_tiny.png) Running Spark SQL Queries

Take a look at a sample of the data.

In [0]:
%sql
SELECT * FROM fireCalls

Call Number,Unit ID,Incident Number,Call Type,Call Date,Watch Date,Received DtTm,Entry DtTm,Dispatch DtTm,Response DtTm,On Scene DtTm,Transport DtTm,Hospital DtTm,Call Final Disposition,Available DtTm,Address,City,Zipcode of Incident,Battalion,Station Area,Box,Original Priority,Priority,Final Priority,ALS Unit,Call Type Group,Number of Alarms,Unit Type,Unit sequence in call dispatch,Fire Prevention District,Supervisor District,Neighborhooods - Analysis Boundaries,Location,RowID
1030118,E08,30625,Medical Incident,04/12/2000,04/12/2000,04/12/2000 09:27:45 PM,04/12/2000 09:28:58 PM,04/12/2000 09:29:21 PM,04/12/2000 09:31:26 PM,04/12/2000 09:32:34 PM,,,Other,04/12/2000 09:45:28 PM,4TH ST/CHANNEL ST,SF,,B03,8,2226,3,3,3,False,,1,ENGINE,1,3.0,6,,"(37.7750268633971, -122.392346204303)",001030118-E08
1030122,M18,30630,Medical Incident,04/12/2000,04/12/2000,04/12/2000 09:31:55 PM,04/12/2000 09:33:48 PM,04/12/2000 09:34:10 PM,04/12/2000 09:35:59 PM,04/12/2000 09:45:22 PM,,,Other,04/12/2000 09:49:52 PM,1800 Block of IRVING ST,SF,94122.0,B08,22,7424,1,1,2,False,,1,MEDIC,1,8.0,4,Sunset/Parkside,"(37.763482287794, -122.477678638767)",001030122-M18
1030154,M36,30662,Medical Incident,04/12/2000,04/12/2000,04/12/2000 10:43:54 PM,04/12/2000 10:45:53 PM,04/12/2000 10:49:59 PM,04/12/2000 10:50:35 PM,04/12/2000 10:53:18 PM,04/12/2000 11:11:36 PM,04/12/2000 11:22:17 PM,Other,04/12/2000 11:42:43 PM,0 Block of SOUTH VAN NESS AVE,SF,94103.0,B02,36,5117,1,1,2,False,,1,MEDIC,1,2.0,6,Mission,"(37.7741251002903, -122.418810211803)",001030154-M36
1040007,E12,30697,Structure Fire,04/13/2000,04/12/2000,04/13/2000 12:19:54 AM,04/13/2000 12:29:24 AM,04/13/2000 12:29:35 AM,04/13/2000 12:31:25 AM,04/13/2000 12:32:36 AM,,,Other,04/13/2000 12:33:18 AM,CLAYTON ST/PARNASSUS AV,SF,94117.0,B05,12,5151,3,3,3,True,,1,ENGINE,1,5.0,5,Haight Ashbury,"(37.7651387353822, -122.44763462758)",001040007-E12
1040021,M14,30711,Medical Incident,04/13/2000,04/12/2000,04/13/2000 01:17:25 AM,04/13/2000 01:18:44 AM,04/13/2000 01:20:02 AM,04/13/2000 01:21:40 AM,04/13/2000 01:24:05 AM,04/13/2000 01:56:02 AM,04/13/2000 02:33:33 AM,Other,04/13/2000 02:40:25 AM,500 Block of 38TH AVE,SF,94121.0,B07,34,7255,3,3,3,True,,1,MEDIC,1,7.0,1,Outer Richmond,"(37.778489948235, -122.498662035969)",001040021-M14
1040061,M43,30749,Medical Incident,04/13/2000,04/12/2000,04/13/2000 07:51:29 AM,04/13/2000 07:55:35 AM,04/13/2000 07:55:54 AM,04/13/2000 07:59:58 AM,,04/13/2000 08:16:30 AM,04/13/2000 08:28:37 AM,Other,04/13/2000 09:26:54 AM,200 Block of MADRID ST,SF,94112.0,B09,43,613,3,3,3,True,,1,MEDIC,3,9.0,11,Excelsior,"(37.7255316247491, -122.429925994016)",001040061-M43
1040079,E10,30766,Alarms,04/13/2000,04/13/2000,04/13/2000 09:31:19 AM,04/13/2000 09:33:04 AM,04/13/2000 09:34:10 AM,04/13/2000 09:35:52 AM,04/13/2000 09:37:59 AM,,,Other,04/13/2000 09:39:36 AM,2800 Block of BROADWAY,SF,94123.0,B04,10,4226,3,3,3,False,,1,ENGINE,1,4.0,2,Pacific Heights,"(37.7931736175933, -122.444028632879)",001040079-E10
1040143,M43,30832,Medical Incident,04/13/2000,04/13/2000,04/13/2000 01:01:56 PM,04/13/2000 01:04:12 PM,04/13/2000 01:13:08 PM,,04/13/2000 01:29:19 PM,04/13/2000 01:36:34 PM,,Other,04/13/2000 01:16:20 PM,2500 Block of OCEAN AVE,SF,94132.0,B08,19,8452,1,1,2,True,,1,MEDIC,1,8.0,7,West of Twin Peaks,"(37.7314853147957, -122.472647880057)",001040143-M43
1040170,T16,30855,Structure Fire,04/13/2000,04/13/2000,04/13/2000 02:09:54 PM,04/13/2000 02:12:27 PM,04/13/2000 02:15:00 PM,,,,,Other,04/13/2000 02:22:01 PM,POLK ST/UNION ST,SF,94109.0,B04,4,3131,3,3,3,False,,1,TRUCK,2,4.0,3,Russian Hill,"(37.7987615790944, -122.422336952094)",001040170-T16
1040233,E48,30914,Alarms,04/13/2000,04/13/2000,04/13/2000 05:23:03 PM,04/13/2000 05:23:58 PM,04/13/2000 05:25:02 PM,,04/13/2000 05:29:16 PM,,,Other,04/13/2000 05:52:48 PM,CALL BOX: FS TI,TI,94130.0,B03,48,2931,3,3,3,False,,1,ENGINE,1,,6,Treasure Island,"(37.8225682263653, -122.371537518925)",001040233-E48


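Next, filter the calls down to a single watch date. Column names that contain spaces must be wrapped in backticks.

In [0]: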
%sql
SELECT * FROM fireCalls WHERE `Watch Date` = '04/13/2016'

Call Number,Unit ID,Incident Number,Call Type,Call Date,Watch Date,Received DtTm,Entry DtTm,Dispatch DtTm,Response DtTm,On Scene DtTm,Transport DtTm,Hospital DtTm,Call Final Disposition,Available DtTm,Address,City,Zipcode of Incident,Battalion,Station Area,Box,Original Priority,Priority,Final Priority,ALS Unit,Call Type Group,Number of Alarms,Unit Type,Unit sequence in call dispatch,Fire Prevention District,Supervisor District,Neighborhooods - Analysis Boundaries,Location,RowID
161041182,T02,16041221,Medical Incident,04/13/2016,04/13/2016,04/13/2016 10:04:26 AM,04/13/2016 10:05:35 AM,04/13/2016 10:07:22 AM,04/13/2016 10:07:40 AM,,,,No Merit,04/13/2016 10:13:37 AM,200 Block of GEARY ST,San Francisco,94102,B01,1,1323,3,3,3,False,Potentially Life-Threatening,1,TRUCK,2,1,3,Financial District/South Beach,"(37.7874094147066, -122.407397325291)",161041182-T02
161040894,63,16041200,Medical Incident,04/13/2016,04/13/2016,04/13/2016 08:49:09 AM,04/13/2016 08:51:31 AM,04/13/2016 08:51:44 AM,04/13/2016 08:51:59 AM,04/13/2016 08:59:30 AM,04/13/2016 09:22:02 AM,04/13/2016 09:48:21 AM,Code 2 Transport,04/13/2016 10:07:08 AM,1600 Block of 30TH AVE,San Francisco,94122,B08,18,7514,2,2,2,True,Non Life-threatening,1,MEDIC,1,8,4,Sunset/Parkside,"(37.7565148633826, -122.488520596486)",161040894-63
161040917,E05,16041204,Medical Incident,04/13/2016,04/13/2016,04/13/2016 09:00:09 AM,04/13/2016 09:00:09 AM,04/13/2016 09:00:22 AM,04/13/2016 09:01:26 AM,04/13/2016 09:04:37 AM,,,Code 2 Transport,04/13/2016 09:11:53 AM,VAN NESS AV/ELM ST,San Francisco,94102,B02,36,3164,3,3,3,True,Potentially Life-Threatening,1,ENGINE,1,2,6,Tenderloin,"(37.7815099697037, -122.420451904276)",161040917-E05
161041247,AM04,16041226,Medical Incident,04/13/2016,04/13/2016,04/13/2016 10:19:11 AM,04/13/2016 10:20:06 AM,04/13/2016 10:20:41 AM,04/13/2016 10:21:08 AM,04/13/2016 10:29:52 AM,,,Fire,04/13/2016 11:00:08 AM,0 Block of TURK ST,San Francisco,94102,B03,1,1365,2,2,2,False,Non Life-threatening,1,PRIVATE,1,3,6,Tenderloin,"(37.7833862379382, -122.409853729941)",161041247-AM04
161041313,E06,16041230,Medical Incident,04/13/2016,04/13/2016,04/13/2016 10:36:27 AM,04/13/2016 10:38:54 AM,04/13/2016 10:39:11 AM,04/13/2016 10:42:08 AM,,,,Code 2 Transport,04/13/2016 10:43:24 AM,500 Block of HAIGHT ST,San Francisco,94117,B05,6,3633,3,3,3,True,Potentially Life-Threatening,1,ENGINE,2,5,5,Hayes Valley,"(37.7720569989498, -122.431274062146)",161041313-E06
161041445,79,16041245,Medical Incident,04/13/2016,04/13/2016,04/13/2016 11:07:06 AM,04/13/2016 11:08:34 AM,04/13/2016 11:08:53 AM,04/13/2016 11:09:10 AM,04/13/2016 11:14:42 AM,04/13/2016 11:28:28 AM,04/13/2016 11:33:23 AM,Code 2 Transport,04/13/2016 12:07:31 PM,1000 Block of POLK ST,San Francisco,94109,B04,3,3121,2,2,2,True,Potentially Life-Threatening,1,MEDIC,1,4,6,Tenderloin,"(37.7861172118379, -122.419854245692)",161041445-79
161041537,AM02,16041253,Medical Incident,04/13/2016,04/13/2016,04/13/2016 11:29:34 AM,04/13/2016 11:29:34 AM,04/13/2016 11:30:07 AM,04/13/2016 11:30:21 AM,04/13/2016 11:39:53 AM,04/13/2016 11:48:50 AM,04/13/2016 11:58:34 AM,Code 2 Transport,04/13/2016 12:29:21 PM,8TH ST/MARKET ST,San Francisco,94103,B02,1,2317,2,2,2,False,Non Life-threatening,1,PRIVATE,1,2,6,Tenderloin,"(37.778719428853, -122.414741223022)",161041537-AM02
161041567,E21,16041257,Traffic Collision,04/13/2016,04/13/2016,04/13/2016 11:36:39 AM,04/13/2016 11:36:39 AM,04/13/2016 11:37:05 AM,04/13/2016 11:38:35 AM,04/13/2016 11:42:31 AM,,,Against Medical Advice,04/13/2016 11:55:40 AM,BEAUMONT AV/TURK BL,San Francisco,94118,B07,21,4557,3,3,3,True,Potentially Life-Threatening,1,ENGINE,1,5,1,Lone Mountain/USF,"(37.777712787956, -122.454331406269)",161041567-E21
161041662,RC2,16041260,Medical Incident,04/13/2016,04/13/2016,04/13/2016 11:56:01 AM,04/13/2016 11:56:45 AM,04/13/2016 11:56:53 AM,04/13/2016 11:57:54 AM,04/13/2016 12:05:44 PM,04/13/2016 12:17:17 PM,,Code 3 Transport,04/13/2016 12:29:51 PM,1600 Block of PINE ST,San Francisco,94109,B04,38,3224,3,3,3,True,Potentially Life-Threatening,1,RESCUE CAPTAIN,3,4,2,Western Addition,"(37.7892533657939, -122.422947215446)",161041662-RC2
161041668,E12,16041261,Medical Incident,04/13/2016,04/13/2016,04/13/2016 11:57:49 AM,04/13/2016 11:57:49 AM,04/13/2016 12:00:39 PM,04/13/2016 12:02:09 PM,04/13/2016 12:07:30 PM,,,Code 2 Transport,04/13/2016 12:17:25 PM,JOHN F KENNEDY DR/STOW LAKE DR,San Francisco,94118,B07,31,7133,2,3,3,True,Potentially Life-Threatening,1,ENGINE,2,7,1,Golden Gate Park,"(37.7711558007622, -122.473655190221)",161041668-E12
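
The filter above uses equality on `Watch Date`, which works even if the column holds plain text. The sketch below assumes the column was read as a string in `MM/dd/yyyy` form, which the result rows suggest but the source does not confirm, and shows in plain Python why range comparisons on such strings would need the values parsed into dates first. The helper name `parse_watch_date` is hypothetical.

```python
from datetime import datetime

# Format assumed from the sample rows above: month/day/year.
DATE_FORMAT = "%m/%d/%Y"


def parse_watch_date(text):
    """Parse an MM/dd/yyyy string into a date object (hypothetical helper)."""
    return datetime.strptime(text, DATE_FORMAT).date()


earlier = "12/31/1999"
later = "04/13/2016"

# Equality on the raw strings behaves as expected.
same_day = later == "04/13/2016"

# Ordering on raw strings compares characters left to right, so the month decides first.
string_order_is_wrong = earlier > later

# Ordering on parsed dates is chronological.
date_order_is_right = parse_watch_date(earlier) < parse_watch_date(later)

print(same_day, string_order_is_wrong, date_order_is_right)
```

The same reasoning applies in SQL: an equality filter on a text date is safe, while `<`, `>`, or `BETWEEN` would need the column converted to a date type first.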


In [0]:
%sql
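-- `birthday` is not a column of fireCalls (see the header above), so this query fails with an analysis error
SELECT `birthday` FROM fireCalls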

Which neighborhoods have the most fire calls?

In [0]:
%sql
SELECT `Neighborhooods - Analysis Boundaries` as neighborhood, 
  COUNT(`Neighborhooods - Analysis Boundaries`) as count 
FROM fireCalls
GROUP BY `Neighborhooods - Analysis Boundaries`
ORDER BY count DESC

neighborhood,count
Tenderloin,31564
South of Market,23212
Mission,21829
Financial District/South Beach,16384
Bayview Hunters Point,13057
Sunset/Parkside,9456
Western Addition,8899
Nob Hill,7981
Outer Richmond,6284
Hayes Valley,6012


## ![Spark Logo Tiny](https://files.training.databricks.com/images/105/logo_spark_tiny.png) Visualizing your Data

We can use the built-in Databricks visualizations to chart which neighborhoods have the most fire calls.

In [0]:
%sql
SELECT `Neighborhooods - Analysis Boundaries` as neighborhood, 
  COUNT(`Neighborhooods - Analysis Boundaries`) as count 
FROM fireCalls 
GROUP BY `Neighborhooods - Analysis Boundaries`
ORDER BY count DESC

neighborhood,count
Tenderloin,31564
South of Market,23212
Mission,21829
Financial District/South Beach,16384
Bayview Hunters Point,13057
Sunset/Parkside,9456
Western Addition,8899
Nob Hill,7981
Outer Richmond,6284
Hayes Valley,6012


&copy; 2020 Databricks, Inc. All rights reserved.<br/>
Apache, Apache Spark, Spark and the Spark logo are trademarks of the <a href="http://www.apache.org/">Apache Software Foundation</a>.<br/>
<br/>
<a href="https://databricks.com/privacy-policy">Privacy Policy</a> | <a href="https://databricks.com/terms-of-use">Terms of Use</a> | <a href="http://help.databricks.com/">Support</a>