# Databricks Quick Start

## Create and use a cluster

1. In the sidebar, right-click the **Clusters** button and open the link in a new window.
1. On the Clusters page, click **Create Cluster**.
1. Name the cluster **demo-default**.
1. In the Databricks Runtime Version drop-down, select **4.3 (includes Apache Spark 2.3.1, Scala 2.11)**.
1. Click **Create Cluster**.
1. Return to this notebook. 
1. In the notebook menu bar, select **<img src="http://docs.databricks.com/_static/images/notebooks/detached.png"/> > demo-default**.
1. When the cluster status changes from <img src="http://docs.databricks.com/_static/images/clusters/cluster-starting.png"/> to <img src="http://docs.databricks.com/_static/images/clusters/cluster-running.png"/>, you can run the commands in this notebook on the cluster.

## Create a table from a Databricks dataset

In [0]:
%sql
DROP TABLE IF EXISTS diamonds;

CREATE TABLE diamonds
  USING csv
  OPTIONS (path "/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv", header "true")

## Manipulate the data and display the results 

The next command does the following:

1. Selects the color and price columns, averages the price, and groups and orders the results by color.
1. Displays a table of the results.

In [0]:
%sql
SELECT color, avg(price) AS price FROM diamonds GROUP BY color ORDER BY color

color,price
D,3169.95409594096
E,3076.7524752475247
F,3724.886396981765
G,3999.135671271697
H,4486.669195568401
I,5091.874953891553
J,5323.81801994302
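
The grouping, averaging, and ordering that the query above performs can be sketched in plain Python. This is only an illustration of the semantics, not how Spark executes the query; the sample rows and the helper name `average_by_group` are hypothetical and are not taken from the diamonds dataset.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (color, price) rows, made up for illustration only.
rows = [("F", 250), ("D", 100), ("E", 300), ("D", 200), ("E", 500)]


def average_by_group(rows):
    """Return (color, average price) pairs sorted by color."""
    prices_by_color = defaultdict(list)
    for color, price in rows:
        prices_by_color[color].append(price)  # GROUP BY color

    # avg(price) AS price, computed once per group
    averages = {color: mean(prices) for color, prices in prices_by_color.items()}

    return sorted(averages.items())  # ORDER BY color


result = average_by_group(rows)
for color, price in result:
    print(color, price)
```

Each color appears once in the result, paired with the mean of its prices, in the same shape as the table above.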


## Repeat using the Python DataFrame API

This is a SQL notebook; by default, command statements are passed to a SQL interpreter. To pass command statements to a Python interpreter, start the cell with the `%python` magic command.

## Create a DataFrame from a Databricks dataset

In [0]:
%python
diamonds = spark.read.csv("/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv", header="true", inferSchema="true")

## Manipulate the data and display the results

In [0]:
%python
from pyspark.sql.functions import avg
  
display(diamonds.select("color","price").groupBy("color").agg(avg("price")))

color,avg(price)
F,3724.886396981765
E,3076.7524752475247
D,3169.95409594096
J,5323.81801994302
G,3999.135671271697
I,5091.874953891553
H,4486.669195568401
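
The rows above are not sorted by color, and the aggregate column is named `avg(price)`, whereas the SQL query returned a `price` column ordered by color. Below is a minimal sketch of one way to reproduce the SQL result with the DataFrame API. It assumes `diamonds` is the DataFrame created earlier; the helper name `average_price_by_color` is hypothetical. In this SQL notebook, the cell would start with `%python`.

```python
def average_price_by_color(df):
    """Average price per color, renamed and sorted to mirror the SQL query.

    `df` is assumed to be a Spark DataFrame with `color` and `price` columns.
    """
    return (
        df.groupBy("color")
        .agg({"price": "avg"})                   # yields a column named avg(price)
        .withColumnRenamed("avg(price)", "price")  # avg(price) AS price
        .orderBy("color")                        # ORDER BY color
    )


# Usage in a Databricks notebook, where `display` and `diamonds` are available:
# display(average_price_by_color(diamonds))
```

Without the `orderBy` call, Spark does not guarantee the order of grouped rows, which is why the output above differs in order from the SQL result.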
