## Tutorial: Query and visualize data from a notebook

This tutorial walks you through using an Azure Databricks notebook to query sample data stored in Unity Catalog using SQL, Python, Scala, and R, and then to visualize the query results in the notebook.

In [0]:
%sql
SELECT * FROM samples.nyctaxi.trips LIMIT 10

In [0]:
display(spark.read.table("samples.nyctaxi.trips"))

Databricks visualization. Run in Databricks to view.

JFK airport flat fare: New York taxis have a very well-known rule.

Manhattan ↔ JFK airport

Base fare: $52 (flat)
Tips, tolls, and surcharges may be added on top of this.
So in the data:

fare_amount = 52

or 52 ± a fractional amount
appears an enormous number of times.
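
To see how often each fare_amount value occurs (for example, the repeated 52 noted above), one option is a grouped count. The following is a minimal sketch that only builds the SQL text. `fare_frequency_query` is a hypothetical helper name, and the sketch assumes the `samples.nyctaxi.trips` table queried earlier, with its `fare_amount` column.

```python
def fare_frequency_query(table: str = "samples.nyctaxi.trips", top_n: int = 10) -> str:
    """Build a SQL query that lists the most frequent fare_amount values."""
    if top_n < 1:
        raise ValueError("top_n must be at least 1")
    return (
        "SELECT fare_amount, COUNT(*) AS trip_count "
        f"FROM {table} "
        "GROUP BY fare_amount "
        "ORDER BY trip_count DESC "
        f"LIMIT {top_n}"
    )


query = fare_frequency_query(top_n=5)
print(query)

# In a Databricks notebook, where `spark` and `display` are predefined,
# the query could then be run and shown with:
# display(spark.sql(query))
```

If the flat fare dominates, 52 should appear near the top of the result; the sketch does not assume any particular counts.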

### Tutorial: Import and visualize CSV data from a notebook

This tutorial walks you through using an Azure Databricks notebook to import data from a CSV file containing baby name data from health.data.ny.gov into your Unity Catalog volume using Python, Scala, and R. You also learn to modify a column name, visualize the data, and save to a table.

In [0]:
%fs ls '/'

In [0]:
%sql
SHOW CATALOGS

In [0]:
%sql
USE CATALOG main;
CREATE SCHEMA IF NOT EXISTS main.demo;
CREATE VOLUME IF NOT EXISTS main.demo.baby;


In [0]:
catalog = "main"
schema = "demo"
volume = "baby"
download_url = "https://health.data.ny.gov/api/views/jxy9-yhdk/rows.csv"
file_name = "baby_names.csv"
table_name = "baby_names"

path_volume = "/Volumes/" + catalog + "/" + schema + "/" + volume
path_table = catalog + "." + schema

print(path_table)   # print the table path prefix (catalog.schema)
print(path_volume)  # print the full volume path
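
The string concatenation above can also be wrapped in a small function, which keeps the `/Volumes/<catalog>/<schema>/<volume>` layout in one place. This is a sketch only; `volume_file_path` is a hypothetical helper name, not part of any Databricks API.

```python
import posixpath


def volume_file_path(catalog: str, schema: str, volume: str, file_name: str) -> str:
    """Return the /Volumes/<catalog>/<schema>/<volume>/<file_name> path."""
    # posixpath.join always uses forward slashes, regardless of the local OS.
    return posixpath.join("/Volumes", catalog, schema, volume, file_name)


csv_path = volume_file_path("main", "demo", "baby", "baby_names.csv")
print(csv_path)
```

With the values defined in this cell, the result matches `path_volume + "/" + file_name`.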

In [0]:
dbutils.fs.cp(download_url, f"{path_volume}/{file_name}")

In this step, you use the spark.read.csv method to create a DataFrame named df from the CSV file that you previously loaded into your Unity Catalog volume.

In [0]:
df = spark.read.csv(f"{path_volume}/{file_name}",
  header=True,
  inferSchema=True,
  sep=",")
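
Because `header=True` takes column names from the first row of the file, it can help to look at that row directly. The sketch below uses only the Python standard library. `read_csv_header` is a hypothetical helper, the sample file contents are made up for illustration, and using it on a real volume assumes the `/Volumes/...` path is readable as an ordinary file path from the notebook.

```python
import csv
import os
import tempfile


def read_csv_header(path: str) -> list:
    """Return the column names from the first row of a CSV file."""
    with open(path, newline="", encoding="utf-8") as handle:
        # An empty file yields an empty list instead of raising StopIteration.
        return next(csv.reader(handle), [])


# Demonstration on a tiny, made-up file (not the real baby names data).
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.csv")
    with open(sample_path, "w", newline="", encoding="utf-8") as handle:
        handle.write("Year,First Name,Count\n2020,EXAMPLE,1\n")
    header = read_csv_header(sample_path)

print(header)

# In the notebook, assuming the volume path is readable as a local file:
# read_csv_header(f"{path_volume}/{file_name}")
```

A header containing a space, such as `First Name`, is what the later rename step addresses.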

In this step, you use the display() method to display the contents of the DataFrame in a table in the notebook, and then visualize the data in a word cloud chart in the notebook.

In [0]:
display(df)

Databricks visualization. Run in Databricks to view.

Copy and paste the following code into an empty notebook cell. This code uses the Apache Spark withColumnRenamed() method to replace the space in the First Name column name with an underscore, because special characters such as spaces are not allowed in column names.

In [0]:
df = df.withColumnRenamed("First Name", "First_Name")
df.printSchema()
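
When more than one column name contains spaces or other special characters, renaming them one by one gets tedious. The following sketch shows one possible way to clean all names at once. `sanitize_column_name` is a hypothetical helper, and the column list is illustrative rather than the dataset's confirmed schema.

```python
import re


def sanitize_column_name(name: str) -> str:
    """Replace each run of characters other than letters, digits, or
    underscores with a single underscore."""
    return re.sub(r"[^0-9A-Za-z_]+", "_", name.strip())


columns = ["Year", "First Name", "County"]  # illustrative column names
cleaned = [sanitize_column_name(c) for c in columns]
print(cleaned)

# In the notebook, the same helper could be applied to every column of df:
# df = df.toDF(*[sanitize_column_name(c) for c in df.columns])
```

For a single column, `withColumnRenamed()` as used above remains the simpler choice.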

Copy and paste the following code into an empty notebook cell. This code saves the contents of the DataFrame to a table in Unity Catalog using the table name variable that you defined at the start of this article.

In [0]:
df.write.mode("overwrite").saveAsTable(f"{path_table}.{table_name}")
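
The three-level name passed to `saveAsTable` has the form catalog.schema.table. As a sketch, the name can be assembled and checked before writing. `qualified_table_name` is a hypothetical helper, and the rule it enforces (letters, digits, and underscores only) is a conservative assumption for this example, not the full Unity Catalog naming rule.

```python
import re

# Conservative pattern assumed for this sketch: letters, digits, underscores.
_SIMPLE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def qualified_table_name(catalog: str, schema: str, table: str) -> str:
    """Return catalog.schema.table, rejecting parts that are not simple names."""
    for part in (catalog, schema, table):
        if not _SIMPLE_NAME.fullmatch(part):
            raise ValueError(f"not a simple identifier: {part!r}")
    return f"{catalog}.{schema}.{table}"


full_name = qualified_table_name("main", "demo", "baby_names")
print(full_name)

# In the notebook:
# df.write.mode("overwrite").saveAsTable(full_name)
```

Failing early on a name such as `baby names` gives a clearer error than one raised partway through the write.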

### Tutorial: Create your first table and grant privileges

This tutorial provides a quick walkthrough of creating a table and granting privileges in Azure Databricks using the Unity Catalog data governance model. As of November 9, 2023, workspaces in new accounts are automatically enabled for Unity Catalog and include the permissions required for all users to complete this tutorial.



In [0]:
%sql
-- select current_catalog()
USE CATALOG main;

In [0]:
%sql
CREATE TABLE IF NOT EXISTS demo.department
(
   deptcode   INT,
   deptname  STRING,
   location  STRING
);


In [0]:
%sql
INSERT INTO demo.department VALUES
   (10, 'FINANCE', 'EDINBURGH'),
   (20, 'SOFTWARE', 'PADDINGTON');

As the original table creator, you're the table owner, and you can grant other users permission to read or write to the table. You can even transfer ownership, but we won't do that here. For more information about the Unity Catalog