We're going to use the [Google Analytics sample dataset for BigQuery](https://support.google.com/analytics/answer/7586738?hl=en&ref_topic=3416089) to create a model that predicts whether a website visitor will make a transaction. For information on the schema of the Analytics dataset, see [BigQuery export schema](https://support.google.com/analytics/answer/3437719) in the Google Analytics Help Center.

Import the BigQuery Python client library and initialize a client. The client sends requests to the BigQuery API and receives its responses.

In [11]:
from google.cloud import bigquery
from google.cloud import bigquery_storage_v1

# Jobs started through this client run in the US multi-region by default.
client = bigquery.Client(location="US")

Create a BigQuery dataset to store your ML model. The call has the following form, where "dataset_name" is a placeholder for the name you choose:

# dataset = client.create_dataset("dataset_name")

Next, create the bqml_tutorial dataset and train a logistic regression model
on the Google Analytics sample dataset for BigQuery. The model predicts
whether a website visitor will make a transaction.
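
As a rough illustration of what a logistic regression model computes, the sketch below scores a single session in plain Python. The weights, the bias, and the two features used here are made-up values for illustration, not parameters learned by BigQuery ML.

```python
import math


def sigmoid(z: float) -> float:
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(features: dict, weights: dict, bias: float) -> float:
    """Weighted sum of the features, passed through the sigmoid."""
    score = bias + sum(weights[name] * value for name, value in features.items())
    return sigmoid(score)


# Hypothetical weights, chosen only to make the example concrete.
weights = {"pageviews": 0.15, "is_mobile": -0.4}
bias = -4.0

session = {"pageviews": 30, "is_mobile": 0}
probability = predict_probability(session, weights, bias)

# A session is labeled 1 (transaction) when the probability reaches 0.5.
predicted_label = 1 if probability >= 0.5 else 0
print(round(probability, 3), predicted_label)
```

The 0.5 cutoff is an assumption of this sketch; it is what turns a probability into the 0 or 1 label.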

In [12]:
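# exists_ok=True lets this cell be re-run without failing if the dataset already exists.
dataset = client.create_dataset("bqml_tutorial", exists_ok=True)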

In [13]:
%%bigquery
CREATE OR REPLACE MODEL `bqml_tutorial.sample_model`
OPTIONS(model_type='logistic_reg') AS
SELECT
  IF(totals.transactions IS NULL, 0, 1) AS label,
  IFNULL(device.operatingSystem, "") AS os,
  device.isMobile AS is_mobile,
  IFNULL(geoNetwork.country, "") AS country,
  IFNULL(totals.pageviews, 0) AS pageviews
FROM
  `bigquery-public-data.google_analytics_sample.ga_sessions_*`
WHERE
  _TABLE_SUFFIX BETWEEN '20160801' AND '20170630'

AttributeError: 'BigQueryReadGrpcTransport' object has no attribute 'channel'
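
The SELECT list in the training query turns each session into a label and four features. The following sketch mirrors that logic in plain Python, assuming a session is represented as a nested dict that uses the same field names as the export schema. The helper name and the sample records are hypothetical.

```python
def to_training_row(session: dict) -> dict:
    """Mirror the SELECT list of the training query for one session record."""
    totals = session.get("totals", {})
    device = session.get("device", {})
    geo = session.get("geoNetwork", {})

    os_name = device.get("operatingSystem")
    country = geo.get("country")
    pageviews = totals.get("pageviews")

    return {
        # IF(totals.transactions IS NULL, 0, 1): any non-NULL value counts as a purchase.
        "label": 0 if totals.get("transactions") is None else 1,
        # IFNULL(x, default): replace a missing value with the default.
        "os": "" if os_name is None else os_name,
        "is_mobile": device.get("isMobile"),
        "country": "" if country is None else country,
        "pageviews": 0 if pageviews is None else pageviews,
    }


# Two made-up sessions: one with a transaction, one with missing fields.
buyer = {
    "totals": {"transactions": 2, "pageviews": 14},
    "device": {"operatingSystem": "Android", "isMobile": True},
    "geoNetwork": {"country": "Canada"},
}
browser = {"totals": {}, "device": {"isMobile": False}, "geoNetwork": {}}

print(to_training_row(buyer))
print(to_training_row(browser))
```

Note that the label only records whether any transaction occurred, not how many, which is why the model is a binary classifier.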

Get training statistics

In [8]:
%%bigquery
SELECT
  *
FROM
  ML.TRAINING_INFO(MODEL `bqml_tutorial.sample_model`)

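ContextualVersionConflict: (google-api-core 1.22.1 (/opt/conda/lib/python3.7/site-packages), Requirement.parse('google-api-core[grpc]<2.0.0dev,>=1.22.2'), {'google-cloud-bigquery-storage'})

This error means that the installed google-api-core (1.22.1) is older than the minimum version (1.22.2) required by google-cloud-bigquery-storage. Upgrading google-api-core and restarting the kernel should clear it.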
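
To make the version comparison behind this conflict concrete, here is a minimal sketch that checks an installed version against a lower and an upper bound. It uses plain tuples rather than a packaging library, handles only simple dotted numeric versions, and treats the "2.0.0dev" bound from the message as 2.0.0; the function names are hypothetical.

```python
def parse_version(text: str) -> tuple:
    """Turn a dotted numeric version such as '1.22.1' into (1, 22, 1)."""
    return tuple(int(part) for part in text.split("."))


def satisfies(installed: str, minimum: str, below: str) -> bool:
    """Return True when minimum <= installed < below."""
    return parse_version(minimum) <= parse_version(installed) < parse_version(below)


# Values taken from the error message: 1.22.1 is installed, >=1.22.2 is required.
print(satisfies("1.22.1", "1.22.2", "2.0.0"))  # False
print(satisfies("1.22.2", "1.22.2", "2.0.0"))  # True
```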

Evaluate your model

In [9]:
%%bigquery
SELECT
  *
FROM ML.EVALUATE(MODEL `bqml_tutorial.sample_model`, (
  SELECT
    IF(totals.transactions IS NULL, 0, 1) AS label,
    IFNULL(device.operatingSystem, "") AS os,
    device.isMobile AS is_mobile,
    IFNULL(geoNetwork.country, "") AS country,
    IFNULL(totals.pageviews, 0) AS pageviews
  FROM
    `bigquery-public-data.google_analytics_sample.ga_sessions_*`
  WHERE
    _TABLE_SUFFIX BETWEEN '20170701' AND '20170801'))

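
For a logistic regression model, ML.EVALUATE reports metrics such as precision, recall, accuracy, and f1_score. The sketch below shows how those four are defined, computed in plain Python on a small set of made-up labels and predictions. It illustrates the definitions only and is not how BigQuery ML computes them internally.

```python
def classification_metrics(labels, predictions):
    """Compute precision, recall, accuracy, and F1 for binary 0/1 labels."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)  # true positives
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)  # false positives
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)  # false negatives
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)  # true negatives

    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1_score": f1}


# Made-up ground truth and model output for eight sessions.
labels = [1, 0, 0, 1, 0, 1, 0, 0]
predictions = [1, 0, 1, 0, 0, 1, 0, 0]
metrics = classification_metrics(labels, predictions)
print(metrics)
```

Because purchases are rare in this kind of data, accuracy alone can look high even for a model that never predicts a transaction, so precision and recall are usually the more informative columns.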

Use your model to predict outcomes

In [10]:
%%bigquery
SELECT
  country,
  SUM(predicted_label) AS total_predicted_purchases
FROM ML.PREDICT(MODEL `bqml_tutorial.sample_model`, (
  SELECT
    IFNULL(device.operatingSystem, "") AS os,
    device.isMobile AS is_mobile,
    IFNULL(totals.pageviews, 0) AS pageviews,
    IFNULL(geoNetwork.country, "") AS country
  FROM
    `bigquery-public-data.google_analytics_sample.ga_sessions_*`
  WHERE
    _TABLE_SUFFIX BETWEEN '20170701' AND '20170801'))
  GROUP BY country
  ORDER BY total_predicted_purchases DESC
  LIMIT 10

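
The outer query sums predicted_label per country, sorts the totals in descending order, and keeps the top rows. The same aggregation can be sketched in plain Python; the rows below are made up, and the function name is hypothetical.

```python
from collections import Counter


def top_predicted_purchases(rows, key, limit=10):
    """GROUP BY key, SUM(predicted_label), ORDER BY total DESC, LIMIT limit."""
    totals = Counter()
    for row in rows:
        totals[row[key]] += row["predicted_label"]
    # sorted() is stable, so groups with equal totals keep their first-seen order.
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]


# Made-up prediction rows, one per session.
rows = [
    {"country": "United States", "predicted_label": 1},
    {"country": "United States", "predicted_label": 1},
    {"country": "India", "predicted_label": 0},
    {"country": "Canada", "predicted_label": 1},
    {"country": "United States", "predicted_label": 0},
    {"country": "India", "predicted_label": 0},
]
print(top_predicted_purchases(rows, key="country", limit=2))
```

Passing a different key, such as a visitor identifier, groups the same predictions per visitor instead of per country.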

Predict the number of transactions each website visitor will make.

In [11]:
%%bigquery
SELECT
  fullVisitorId,
  SUM(predicted_label) AS total_predicted_purchases
FROM ML.PREDICT(MODEL `bqml_tutorial.sample_model`, (
  SELECT
    IFNULL(device.operatingSystem, "") AS os,
    device.isMobile AS is_mobile,
    IFNULL(totals.pageviews, 0) AS pageviews,
    IFNULL(geoNetwork.country, "") AS country,
    fullVisitorId
  FROM
    `bigquery-public-data.google_analytics_sample.ga_sessions_*`
  WHERE
    _TABLE_SUFFIX BETWEEN '20170701' AND '20170801'))
  GROUP BY fullVisitorId
  ORDER BY total_predicted_purchases DESC
  LIMIT 10


To clean up the resources created by this tutorial, run the following code, which deletes the dataset and its contents:

In [10]:
client.delete_dataset(dataset, delete_contents=True)