# BigQuery Magic Commands and DML

The examples in this notebook introduce features of [BigQuery Standard SQL](https://cloud.google.com/bigquery/sql-reference/) and [BigQuery SQL Data Manipulation Language (beta)](https://cloud.google.com/bigquery/sql-reference/dml-syntax). BigQuery Standard SQL is compliant with the SQL 2011 standard. You've already seen the use of the magic command `%%bq` in the [Hello BigQuery](Hello BigQuery.ipynb) and [BigQuery Commands](BigQuery Commands.ipynb) notebooks. This command and others in the Google Cloud Datalab API support BigQuery Standard SQL.

## Using the BigQuery Magic command with Standard SQL

First, we will cover some more uses of the `%%bq` magic command. Let's define a query to work with:

In [54]:
%%bq query --name UniqueNames2013
WITH UniqueNames2013 AS
(SELECT DISTINCT name
  FROM `bigquery-public-data.usa_names.usa_1910_2013`
  WHERE Year = 2013)
SELECT * FROM UniqueNames2013

Now let's list the commands that are available with `%%bq`:

In [55]:
%%bq -h

usage: bq [-h]
          {datasets,tables,query,execute,extract,sample,dryrun,udf,datasource,load}
          ...

Execute various BigQuery-related operations. Use "%bq <command> -h" for help
on a specific command.

positional arguments:
  {datasets,tables,query,execute,extract,sample,dryrun,udf,datasource,load}
                        commands
    datasets            Operations on BigQuery datasets
    tables              Operations on BigQuery tables
    query               Create or execute a BigQuery SQL query object,
                        optionally using other SQL objects, UDFs, or external
                        datasources. If a query name is not specified, the
                        query is executed.
    execute             Execute a BigQuery SQL query and optionally send the
                        results to a named table. The cell can optionally
                        contain arguments for expanding variables in the
                        query.
    extract           

The `dryrun` command of `%%bq` is helpful for confirming that the syntax of a SQL query is valid. Instead of executing the query, it returns only some statistics about it:

In [56]:
%%bq dryrun -q UniqueNames2013

Now, let's get a small sample of the results using the `sample` command of `%%bq`:

In [57]:
%%bq sample -q UniqueNames2013

name
Coleton
Amberlee
Anwar
Kennedy
Rainier
Joaquin
Gisela
Elienai
Myra
Jentry


Finally, we can use the `execute` command of `%%bq` to display the results of our query:

In [58]:
%%bq execute -q UniqueNames2013

name
Carmelo
Blane
Aryan
Joeziah
Izabell
Kevon
Tsering
Ubaldo
Alyanna
Zahira


# Using Google BigQuery SQL Data Manipulation Language

Below, we will demonstrate how to use Google BigQuery SQL Data Manipulation Language (DML) in Datalab.

## Preparation

First, let's import the BigQuery module, and create a sample dataset and table to help demonstrate the features of Google BigQuery DML.

In [1]:
import google.datalab.bigquery as bq

In [2]:
# Create a new dataset (this will be deleted later in the notebook)
sample_dataset = bq.Dataset('sampleDML')
if not sample_dataset.exists():
  sample_dataset.create(friendly_name = 'Sample Dataset for testing DML',
                        description = 'Created from Sample Notebook in Google Cloud Datalab')
# Confirm that the dataset now exists
sample_dataset.exists()

In [67]:
# To create a table, we need to create a schema for it.
# It's easiest to create a schema from some existing data, so this
# example demonstrates using an example object
fruit_row = {
  'name': 'string value',
  'count': 0
}

sample_table1 = bq.Table("sampleDML.fruit_basket").create(schema = bq.Schema.from_data([fruit_row]), 
                                                          overwrite = True)
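
To make the idea of deriving a schema from sample data concrete, here is a minimal sketch of how field types could be inferred from one example row. The helper `infer_fields` and its type mapping are hypothetical and written only for illustration; they are not Datalab's implementation of `bq.Schema.from_data`, which may cover more types and edge cases.

```python
import datetime

# Hypothetical mapping from Python types to BigQuery column types.
# Order matters: bool is a subclass of int, so it must be checked first.
_TYPE_MAP = [
    (bool, 'BOOLEAN'),
    (int, 'INTEGER'),
    (float, 'FLOAT'),
    (datetime.datetime, 'TIMESTAMP'),
    (str, 'STRING'),
]

def infer_fields(row):
    """Return a list of {'name', 'type'} dicts for one sample row (illustrative only)."""
    fields = []
    for name, value in row.items():
        for py_type, bq_type in _TYPE_MAP:
            if isinstance(value, py_type):
                fields.append({'name': name, 'type': bq_type})
                break
        else:
            # No entry in the mapping matched this value
            raise TypeError('unsupported value for field %r: %r' % (name, value))
    return fields

fruit_row = {'name': 'string value', 'count': 0}
fields = infer_fields(fruit_row)
```

Only the types of the sample values matter in a scheme like this, which is why `fruit_row` can hold placeholders such as `'string value'` and `0`.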

## Inserting Data

We can add rows to our newly created `fruit_basket` table by using an `INSERT` statement in our BigQuery Standard SQL query.

In [68]:
%%bq query
INSERT sampleDML.fruit_basket (name, count)
VALUES('banana', 5),
      ('orange', 10),
      ('apple', 15),
      ('mango', 20)

count,name
15,apple
5,banana
10,orange
20,mango


The same kind of insert can also be written as an `INSERT` with a `SELECT` over `UNNEST`, shown here adding two more rows:

In [69]:
%%bq query
INSERT sampleDML.fruit_basket (name, count)
SELECT * 
FROM UNNEST([('peach', 25), ('watermelon', 30)])

count,name
15,apple
5,banana
10,orange
20,mango
25,peach
30,watermelon


You can also use a `WITH` clause with `INSERT` and `SELECT`.

In [70]:
%%bq query
INSERT sampleDML.fruit_basket(name, count)
WITH w AS (
  SELECT ARRAY<STRUCT<name string, count int64>>
      [('cherry', 35),
      ('cranberry', 40),
      ('pear', 45)] col
)
SELECT name, count FROM w, UNNEST(w.col)

count,name
15,apple
5,banana
10,orange
20,mango
25,peach
30,watermelon
45,pear
35,cherry
40,cranberry
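
The `FROM w, UNNEST(w.col)` clause pairs the single row of `w` with each element of its own array, so the query produces one output row per struct. The following plain-Python sketch mimics that flattening without involving BigQuery; the variable names are illustrative only.

```python
# One row of `w`, holding an array of (name, count) structs in column `col`.
w = [
    {'col': [('cherry', 35), ('cranberry', 40), ('pear', 45)]},
]

# Correlated cross join: each row of `w` is joined with the elements
# of its own array, yielding one flat row per element.
rows = [
    {'name': name, 'count': count}
    for w_row in w
    for (name, count) in w_row['col']
]
```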


Here is an example that copies one table's contents into another. First we will create a new table.

In [71]:
fruit_row_detailed = {
  'name': 'string value',
  'count': 0,
  'readytoeat': False
}
sample_table2 = bq.Table("sampleDML.fruit_basket_detailed").create(schema = bq.Schema.from_data([fruit_row_detailed]), 
                                                                   overwrite = True)

In [72]:
%%bq query
INSERT sampleDML.fruit_basket_detailed (name, count, readytoeat)
SELECT name, count, false
FROM sampleDML.fruit_basket

count,readytoeat,name
20,False,mango
25,False,peach
35,False,cherry
45,False,pear
30,False,watermelon
40,False,cranberry
10,False,orange
5,False,banana
15,False,apple


## Updating Data

You can update rows in the `fruit_basket_detailed` table by using an `UPDATE` statement in a BigQuery Standard SQL query. Here we run the statement with the `%%bq` magic command.

In [73]:
%%bq query
UPDATE sampleDML.fruit_basket_detailed
SET readytoeat = True
WHERE name = 'banana'

count,readytoeat,name
35,False,cherry
10,False,orange
20,False,mango
15,False,apple
30,False,watermelon
25,False,peach
45,False,pear
40,False,cranberry
5,True,banana
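
Because DML statements change data in place, it can be useful to rehearse the logic locally before running it against BigQuery. The sketch below uses Python's built-in `sqlite3` module as a stand-in. SQLite is not BigQuery and its dialect differs (for example, it requires `INSERT INTO` and stores booleans as integers), so this only checks the logic of the copy and update steps above, on a shortened set of rows.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE fruit_basket (name TEXT, "count" INTEGER)')
conn.execute('CREATE TABLE fruit_basket_detailed '
             '(name TEXT, "count" INTEGER, readytoeat INTEGER)')
conn.executemany('INSERT INTO fruit_basket (name, "count") VALUES (?, ?)',
                 [('banana', 5), ('orange', 10), ('apple', 15)])

# Copy every row, filling the new column with a constant (0 stands for false).
conn.execute('INSERT INTO fruit_basket_detailed (name, "count", readytoeat) '
             'SELECT name, "count", 0 FROM fruit_basket')

# Update only the rows matched by the WHERE clause.
cursor = conn.execute(
    "UPDATE fruit_basket_detailed SET readytoeat = 1 WHERE name = 'banana'")
updated = cursor.rowcount  # number of rows changed by the UPDATE

# Collect the names of the rows that are now marked ready to eat.
ready = [name for (name,) in conn.execute(
    'SELECT name FROM fruit_basket_detailed WHERE readytoeat = 1')]
```

Checking `updated` before trusting the result is a cheap way to catch a `WHERE` clause that matches more or fewer rows than intended.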


To view the contents of a table in BigQuery, use the `%%bq tables view` command:

In [None]:
%%bq tables view --name sampleDML.fruit_basket_detailed

## Deleting Data

You can delete rows in the `fruit_basket` table by using a `DELETE` statement in the BigQuery Standard SQL query.

In [75]:
%%bq query
DELETE sampleDML.fruit_basket
WHERE name in ('cherry', 'cranberry')

count,name
15,apple
5,banana
10,orange
20,mango
25,peach
30,watermelon
45,pear


Use the following query to delete the corresponding entries in `sampleDML.fruit_basket_detailed`, that is, the rows whose `name` no longer appears in `sampleDML.fruit_basket`:

In [76]:
%%bq query
DELETE sampleDML.fruit_basket_detailed
WHERE NOT EXISTS
  (SELECT * FROM sampleDML.fruit_basket
  WHERE fruit_basket_detailed.name = fruit_basket.name)

count,readytoeat,name
15,False,apple
10,False,orange
25,False,peach
20,False,mango
30,False,watermelon
45,False,pear
5,True,banana
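
The `NOT EXISTS` subquery acts as an anti-join: a row of `fruit_basket_detailed` is deleted when no row in `fruit_basket` has the same `name`. A small plain-Python sketch of the same logic, using shortened illustrative data rather than the real tables:

```python
# Illustrative stand-ins for the two tables (not read from BigQuery).
fruit_basket = [
    {'name': 'apple', 'count': 15},
    {'name': 'banana', 'count': 5},
]
fruit_basket_detailed = [
    {'name': 'apple', 'count': 15, 'readytoeat': False},
    {'name': 'banana', 'count': 5, 'readytoeat': True},
    {'name': 'cherry', 'count': 35, 'readytoeat': False},
    {'name': 'cranberry', 'count': 40, 'readytoeat': False},
]

# Names that still exist in fruit_basket.
surviving_names = {row['name'] for row in fruit_basket}

# Keep only detailed rows that have a match; everything else is "deleted".
fruit_basket_detailed = [
    row for row in fruit_basket_detailed
    if row['name'] in surviving_names
]
```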


## Deleting Resources

In [3]:
# Clear out sample resources
sample_dataset.delete(delete_contents = True)