# SQL Parameters

BigQuery SQL does not currently support parameterization. Within notebooks, however, it is useful to be able to use Python variables defined in the notebook as parameter values for SQL.

Datalab introduces a pattern for declaring and using parameterized queries.

## Data Preview

In [1]:
%%bigquery sample --count 10
SELECT * FROM [cloud-datalab-samples:httplogs.logs_20140615]

In [2]:
%%sql
SELECT endpoint FROM [cloud-datalab-samples:httplogs.logs_20140615] GROUP BY endpoint

# Parameterization via SQL Modules

Parameters are declared in SQL modules using a `name = default_value` syntax before the SQL, and are then referenced within the SQL as `$name`.

In [3]:
%%sql --module endpoint_stats
endpoint = 'Other'

SELECT endpoint, COUNT(latency) As requests, MIN(latency) AS min_latency, MAX(latency) AS max_latency
FROM [cloud-datalab-samples:httplogs.logs_20140615]
WHERE endpoint = $endpoint
GROUP BY endpoint

This defines a SQL query with a string parameter named `endpoint`, which defaults to the value `'Other'`, as you'll see when the query is executed without specifying a value.

In [4]:
%%bigquery execute --query endpoint_stats
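
To make the `name = default_value` declaration format more concrete, here is a minimal Python sketch of how module text could be split into default values and a SQL body. This is not Datalab's implementation: `split_module` and `DECLARATION` are hypothetical names, and the sketch assumes that declarations are single-line Python literals that appear before the first line of SQL.

```python
import ast
import re

# Hypothetical pattern for a declaration line: an identifier, "=", then a literal.
DECLARATION = re.compile(r'^([A-Za-z_]\w*)\s*=\s*(.+)$')

def split_module(text):
    """Split module text into (defaults, sql). Illustrative only."""
    defaults = {}
    lines = text.strip().splitlines()
    for index, line in enumerate(lines):
        stripped = line.strip()
        if not stripped:
            continue  # blank lines may separate declarations from the SQL
        match = DECLARATION.match(stripped)
        if match is None:
            # First line that is not a declaration: the rest is the SQL body.
            return defaults, '\n'.join(lines[index:])
        # Assumption: default values are plain Python literals.
        defaults[match.group(1)] = ast.literal_eval(match.group(2))
    return defaults, ''

module_text = """
endpoint = 'Other'

SELECT endpoint, COUNT(latency) AS requests
FROM [cloud-datalab-samples:httplogs.logs_20140615]
WHERE endpoint = $endpoint
GROUP BY endpoint
"""

defaults, sql = split_module(module_text)
print(defaults)
print(sql)
```

The defaults dictionary holds the value used when a query is executed without an explicit value for `endpoint`, while the SQL body keeps its `$endpoint` placeholder until a value is supplied.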

## Declarative Query Execution

Parameter values can be specified with a `%%bigquery execute` command by defining them in a YAML block in the body of the cell:

In [5]:
%%bigquery execute --query endpoint_stats
endpoint: Recent

The YAML text can also reference values defined in the notebook, again using the `$variable` syntax.

In [6]:
interesting_endpoint = 'Popular'

In [7]:
%%bigquery execute --query endpoint_stats
endpoint: $interesting_endpoint
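
The lookup of `$variable` references against notebook variables can be pictured with a short Python sketch. It is an illustration under stated assumptions rather than Datalab's actual code: `resolve_references` is a hypothetical helper, the YAML block is assumed to have already been parsed into a dictionary, and the notebook's variables are passed in as a plain dictionary.

```python
def resolve_references(params, namespace):
    """Replace '$name' string values with namespace[name]. Illustrative only."""
    resolved = {}
    for key, value in params.items():
        if isinstance(value, str) and value.startswith('$'):
            variable = value[1:]
            if variable not in namespace:
                raise NameError('Undefined notebook variable: ' + variable)
            resolved[key] = namespace[variable]
        else:
            # Literal values such as "Recent" pass through unchanged.
            resolved[key] = value
    return resolved

interesting_endpoint = 'Popular'

# What a YAML parser would be expected to yield for "endpoint: $interesting_endpoint".
parsed_yaml = {'endpoint': '$interesting_endpoint'}

resolved = resolve_references(
    parsed_yaml, {'interesting_endpoint': interesting_endpoint})
print(resolved)
```

A literal value such as `endpoint: Recent` contains no `$` prefix, so in this sketch it is used as written.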

## Imperative Query Execution

Parameter values can be passed to BigQuery APIs when constructing a `Query` object.

In [8]:
import gcp.bigquery as bq

In [9]:
stats_query = bq.Query(endpoint_stats, endpoint = interesting_endpoint)
print(stats_query.sql)

SELECT endpoint, COUNT(latency) As requests, MIN(latency) AS min_latency, MAX(latency) AS max_latency
FROM [cloud-datalab-samples:httplogs.logs_20140615]
WHERE endpoint = "Popular"
GROUP BY endpoint


In the SQL above, you can see that the `$endpoint` variable was expanded to its value. The parameter replacement happens locally, before the resulting SQL is sent to BigQuery.
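
As a rough picture of what local replacement involves, the following Python sketch expands `$name` placeholders into SQL literals before any request would be made. It is not the `gcp.bigquery` implementation: `expand_parameters` and `quote_literal` are hypothetical helpers, and the quoting rules shown are an assumption chosen to match the double-quoted output above.

```python
import re

def quote_literal(value):
    """Render a Python value as a SQL literal (hypothetical quoting rules)."""
    if isinstance(value, bool):
        return 'true' if value else 'false'
    if isinstance(value, (int, float)):
        return str(value)
    # Escape backslashes and double quotes, then wrap the text in double quotes.
    escaped = str(value).replace('\\', '\\\\').replace('"', '\\"')
    return '"' + escaped + '"'

def expand_parameters(sql, defaults, **overrides):
    """Replace each $name in sql with a literal value. Illustrative only."""
    values = dict(defaults)
    values.update(overrides)  # explicit values win over declared defaults

    def replace(match):
        name = match.group(1)
        if name not in values:
            raise KeyError('No value for parameter $' + name)
        return quote_literal(values[name])

    return re.sub(r'\$([A-Za-z_][A-Za-z0-9_]*)', replace, sql)

template = 'SELECT endpoint FROM logs WHERE endpoint = $endpoint'
expanded = expand_parameters(template, {'endpoint': 'Other'}, endpoint='Popular')
print(expanded)
# SELECT endpoint FROM logs WHERE endpoint = "Popular"
```

Because the expansion is plain text substitution, the server only ever sees the final SQL string, so escaping of string values has to happen on the client side as well.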

In [10]:
stats_query.results()

# Looking Ahead

Parameterization enables one half of SQL and Python integration: taking values from Python code in the notebook and passing them in as part of the query when retrieving data from BigQuery.

The next notebook will cover the second half of SQL and Python integration: retrieving query results into the notebook for use with Python code.

Parameterization is also a building block for creating complex queries, where whole queries can be used as parameter values.