# Part 5: Serving the ScaNN Index for Real-Time Similar Item Matching

This tutorial shows how to use the matrix factorization algorithm in BigQuery ML to generate embeddings for items based on their co-occurrence statistics. The generated item embeddings can then be used to find similar items.

Part 5 covers deploying the ScaNN index to AI Platform Prediction, using a custom container, for real-time similar item matching. The matching service works as follows:
1. Accepts a query item ID.
2. Looks up the embedding of the query item ID from the Embedding Lookup Model in AI Platform Prediction.
3. Uses the ScaNN index to find the IDs of items similar to the given query item embedding.
4. Returns the list of similar item IDs for the query item ID.
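
The four steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: a dictionary stands in for the Embedding Lookup Model, an exact dot-product search stands in for the ScaNN index (the similarity measure of the real index is not stated here), and the names `EMBEDDINGS`, `lookup`, `brute_force_match`, and `find_similar_items` are hypothetical, not part of the tutorial's `index_server` package.

```python
# Hypothetical stand-in for the Embedding Lookup Model: item ID -> embedding.
EMBEDDINGS = {
    'a': [1.0, 0.0],
    'b': [0.9, 0.1],
    'c': [0.0, 1.0],
    'd': [-1.0, 0.0],
}


def lookup(item_id):
    """Step 2: fetch the embedding of the query item."""
    return EMBEDDINGS[item_id]


def brute_force_match(query_embedding, num_matches):
    """Step 3: exact dot-product search, standing in for the ScaNN index."""
    scores = {
        item_id: sum(q * v for q, v in zip(query_embedding, embedding))
        for item_id, embedding in EMBEDDINGS.items()
    }
    # Rank item IDs from the highest to the lowest score.
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:num_matches]


def find_similar_items(query_item_id, num_matches=2):
    """Steps 1 to 4: accept an item ID and return the IDs of similar items."""
    query_embedding = lookup(query_item_id)
    return brute_force_match(query_embedding, num_matches)


print(find_similar_items('a'))  # ['a', 'b']
```

In this sketch the query item is its own best match, because nothing filters it out of the results.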


## Setup

In [None]:
!pip install -q scann==1.1.1
!pip install -q pyyaml

### Import libraries

In [None]:
import scann
import numpy as np
import tensorflow as tf

### Configure GCP environment settings

In [None]:
PROJECT_ID = 'ksalama-cloudml'
BUCKET = 'ksalama-cloudml'
REGION = 'us-central1'
ARTIFACTS_REPOSITORY_NAME = 'ml-serving'
INDEX_DIR = f'gs://{BUCKET}/bqml/scann_index'
EMBEDDNIG_LOOKUP_MODEL_NAME = 'item_embedding_lookup'
EMBEDDNIG_LOOKUP_MODEL_VERSION = 'v1'
SCANN_MODEL_NAME = 'index_server'
SCANN_MODEL_VERSION = 'v1'
KIND = 'song'

### Authenticate your GCP account
This is required only if you run the notebook in Colab.

In [None]:
try:
  from google.colab import auth
  auth.authenticate_user()
  print("Colab user is authenticated.")
except ImportError:
  # Not running in Colab; skip the Colab-specific authentication.
  pass

## Test the Index Server APIs

In [None]:
from index_server.lookup import EmbeddingLookup
embedding_lookup = EmbeddingLookup(PROJECT_ID, REGION, EMBEDDNIG_LOOKUP_MODEL_NAME, EMBEDDNIG_LOOKUP_MODEL_VERSION)

from index_server.matching import ScaNNMatcher
scann_matcher = ScaNNMatcher(INDEX_DIR)

In [None]:
vector = embedding_lookup.lookup(['2114402'])[0]
scann_matcher.match(vector, 5)

## Build a Custom Prediction Container for the ScaNN Index
The custom container runs a [gunicorn](https://gunicorn.org/) web server hosting a [Flask](https://flask.palletsprojects.com/en/1.1.x/quickstart/) application. The app loads the ScaNN index and uses it for similar item matching.
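
To make the request handling concrete, here is a framework-independent sketch of what the app's predict handler could look like. The request and response shapes (an `instances` list whose entries carry `query` and `show`, and a `predictions` list in the reply) follow the client code in the test section of this notebook. Everything else, including the names `handle_predict`, `match_fn`, and `dummy_match` and the default of 10 matches, is an assumption for illustration rather than the actual `index_server` code. In a Flask app, a function like this would be called from the view registered on the predict route.

```python
def handle_predict(request_json, match_fn):
    """Turn a prediction request body into a response body.

    match_fn is a hypothetical callable (item_ids, num_matches) -> list of IDs
    that wraps the embedding lookup and the ScaNN index.
    """
    instances = request_json.get('instances') or []
    if not instances:
        return {'error': 'Request body must contain a non-empty "instances" list.'}

    instance = instances[0]
    # The client sends the query item IDs as one space-separated string.
    item_ids = instance.get('query', '').split()
    if not item_ids:
        return {'error': 'Instance must contain a non-empty "query" string.'}
    num_matches = int(instance.get('show', 10))

    return {'predictions': match_fn(item_ids, num_matches)}


# Usage with a dummy matcher that returns fixed IDs instead of querying an index.
def dummy_match(item_ids, num_matches):
    return ['101', '102', '103'][:num_matches]


body = {'instances': [{'query': '2114402', 'show': 2}]}
print(handle_predict(body, dummy_match))  # {'predictions': ['101', '102']}
```

Keeping the handler free of Flask imports makes it testable without starting a web server.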

In [None]:
!gcloud beta artifacts repositories create {ARTIFACTS_REPOSITORY_NAME} \
  --location={REGION} \
  --repository-format=docker

In [None]:
!gcloud beta auth configure-docker {REGION}-docker.pkg.dev --quiet

In [None]:
IMAGE_URL = f'{REGION}-docker.pkg.dev/{PROJECT_ID}/{ARTIFACTS_REPOSITORY_NAME}/{SCANN_MODEL_NAME}:{SCANN_MODEL_VERSION}'
PORT=5001

SUBSTITUTIONS = ''
SUBSTITUTIONS += f'_IMAGE_URL={IMAGE_URL},'
SUBSTITUTIONS += f'_PORT={PORT}'

!gcloud builds submit --config=index_server/cloudbuild.yaml \
  --substitutions={SUBSTITUTIONS} \
  --timeout=1h
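
The substitutions string above is a comma-separated list of `KEY=VALUE` pairs, and Cloud Build requires user-defined substitution names to start with an underscore. As an optional alternative to concatenating strings by hand, the pairs could be kept in a dictionary and formatted by a small helper. The name `format_substitutions` is hypothetical and not part of the tutorial's code.

```python
def format_substitutions(substitutions):
    """Format a dict as the KEY=VALUE,KEY=VALUE string expected by --substitutions."""
    for key in substitutions:
        # Cloud Build user-defined substitutions must start with an underscore.
        if not key.startswith('_'):
            raise ValueError(f'Substitution name must start with "_": {key}')
    return ','.join(f'{key}={value}' for key, value in substitutions.items())


# Example values only; in the notebook these come from IMAGE_URL and PORT.
example = format_substitutions({'_IMAGE_URL': 'example-image:v1', '_PORT': 5001})
print(example)  # _IMAGE_URL=example-image:v1,_PORT=5001
```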

In [None]:
repository_id = f'{REGION}-docker.pkg.dev/{PROJECT_ID}/{ARTIFACTS_REPOSITORY_NAME}'

!gcloud beta artifacts docker images list {repository_id}

## Create a Service Account for the Container to Access Cloud Resources

In [None]:
SERVICE_ACCOUNT_NAME = 'caip-serving'
SERVICE_ACCOUNT_EMAIL = f'{SERVICE_ACCOUNT_NAME}@{PROJECT_ID}.iam.gserviceaccount.com'
!gcloud iam service-accounts create {SERVICE_ACCOUNT_NAME} \
  --description="Service account for AI Platform Prediction to access cloud resources." 

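Grant the Cloud ML Engine (AI Platform) service account the **iam.serviceAccountAdmin** role, and grant the new service account the roles required by the ScaNN matching service: **storage.objectViewer** and **ml.developer**. The AI Platform service account address in the first command contains a hard-coded project number (`900786220115`); replace it with the project number of your own project.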

In [None]:
!gcloud projects add-iam-policy-binding {PROJECT_ID} \
  --role=roles/iam.serviceAccountAdmin \
  --member=serviceAccount:service-900786220115@cloud-ml.google.com.iam.gserviceaccount.com

!gcloud projects add-iam-policy-binding {PROJECT_ID} \
  --role=roles/storage.objectViewer \
  --member=serviceAccount:{SERVICE_ACCOUNT_EMAIL}
    
!gcloud projects add-iam-policy-binding {PROJECT_ID} \
  --role=roles/ml.developer \
  --member=serviceAccount:{SERVICE_ACCOUNT_EMAIL}

## Serve the ScaNN Index Custom Container on AI Platform Prediction

In [None]:
!gcloud ai-platform models create {SCANN_MODEL_NAME} --region={REGION}

In [None]:
HEALTH_ROUTE=f'/v1/models/{SCANN_MODEL_NAME}/versions/{SCANN_MODEL_VERSION}'
PREDICT_ROUTE=f'/v1/models/{SCANN_MODEL_NAME}/versions/{SCANN_MODEL_VERSION}:predict'

ENV_VARIABLES = f'PROJECT_ID={PROJECT_ID},'
ENV_VARIABLES += f'REGION={REGION},'
ENV_VARIABLES += f'INDEX_DIR={INDEX_DIR},'
ENV_VARIABLES += f'EMBEDDNIG_LOOKUP_MODEL_NAME={EMBEDDNIG_LOOKUP_MODEL_NAME},'
ENV_VARIABLES += f'EMBEDDNIG_LOOKUP_MODEL_VERSION={EMBEDDNIG_LOOKUP_MODEL_VERSION}'

!gcloud beta ai-platform versions create {SCANN_MODEL_VERSION} \
  --region={REGION} \
  --model={SCANN_MODEL_NAME} \
  --image={IMAGE_URL} \
  --ports={PORT} \
  --predict-route={PREDICT_ROUTE} \
  --health-route={HEALTH_ROUTE} \
  --machine-type=n1-standard-4 \
  --env-vars={ENV_VARIABLES} \
  --service-account={SERVICE_ACCOUNT_EMAIL}

print("The model version is deployed to AI Platform Prediction.")
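
The values passed through `--env-vars` are exposed to the container as environment variables. As a rough sketch of how the serving code might read them at startup (the name `load_settings` and the fail-fast behavior are assumptions; the actual `index_server` code may differ):

```python
import os

# Names match the ENV_VARIABLES string above, including its spelling.
REQUIRED_SETTINGS = [
    'PROJECT_ID',
    'REGION',
    'INDEX_DIR',
    'EMBEDDNIG_LOOKUP_MODEL_NAME',
    'EMBEDDNIG_LOOKUP_MODEL_VERSION',
]


def load_settings(environ=None):
    """Read the required settings, failing early if any of them is missing."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_SETTINGS if name not in environ]
    if missing:
        raise RuntimeError(f'Missing environment variables: {", ".join(missing)}')
    return {name: environ[name] for name in REQUIRED_SETTINGS}


# Usage with an explicit mapping of placeholder values instead of os.environ.
settings = load_settings({
    'PROJECT_ID': 'my-project',
    'REGION': 'us-central1',
    'INDEX_DIR': 'gs://my-bucket/bqml/scann_index',
    'EMBEDDNIG_LOOKUP_MODEL_NAME': 'item_embedding_lookup',
    'EMBEDDNIG_LOOKUP_MODEL_VERSION': 'v1',
})
print(settings['INDEX_DIR'])  # gs://my-bucket/bqml/scann_index
```

Failing at startup rather than on the first request makes a misconfigured deployment visible as soon as the version is created.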

## Test the Deployed ScaNN Index Service

In [None]:
from google.cloud import datastore

# Datastore client used to fetch item metadata for the matched item IDs.
client = datastore.Client()

In [None]:
import googleapiclient.discovery
from google.api_core.client_options import ClientOptions

api_endpoint = f'https://{REGION}-ml.googleapis.com'
client_options = ClientOptions(api_endpoint=api_endpoint)
service = googleapiclient.discovery.build(
    serviceName='ml', version='v1', client_options=client_options)
scann_index = f'projects/{PROJECT_ID}/models/{SCANN_MODEL_NAME}/versions/{SCANN_MODEL_VERSION}'
print(f'Service name: {scann_index}')

def caip_predict(query_items, show=10):
  """Returns the Datastore entities of the items matched for query_items."""
  request_body = {
      'instances': [{
          'query': ' '.join(query_items),
          'show': show
      }]
  }

  response = service.projects().predict(name=scann_index, body=request_body).execute()

  if 'error' in response:
    raise RuntimeError(response['error'])

  # The service returns the matched item IDs; fetch their metadata from Datastore.
  match_tokens = response['predictions']
  keys = [client.key(KIND, int(key)) for key in match_tokens]
  items = client.get_multi(keys)
  return items


In [None]:
songs = {
    '2114406': 'Metallica: Nothing Else Matters',
    '2114402': 'Metallica: The Unforgiven',
    '2120788': 'Limp Bizkit: My Way',
    '2120786': 'Limp Bizkit: My Generation',
    '1086322': 'Jacques Brel: Ne Me Quitte Pas',
    '3129954': 'Édith Piaf: Non, Je Ne Regrette Rien',
    '53448': 'France Gall: Ella, Elle l\'a',
    '887688': 'Enrique Iglesias: Tired Of Being Sorry',
    '562487': 'Shakira: Hips Don\'t Lie',
    '833391': 'Ricky Martin: Livin\' la Vida Loca',
    '1098069': 'Snoop Dogg: Drop It Like It\'s Hot',
    '910683': '2Pac: California Love',
    '1579481': 'Dr. Dre: The Next Episode',
    '2675403': 'Eminem: Lose Yourself',
    '2954929': 'Black Sabbath: Iron Man',
    '625169': 'Black Sabbath: Paranoid',
}

In [None]:
for item_Id, desc in songs.items():
  print(desc)
  print("==================")
  similar_items = caip_predict([item_Id], 5)
  for similar_item in similar_items:
    print(f'- {similar_item["artist"]}: {similar_item["track_title"]}')
  print()

## License

Copyright 2020 Google LLC

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License. You may obtain a copy of the License at: http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 

See the License for the specific language governing permissions and limitations under the License.

**This is not an official Google product but sample code provided for educational purposes.**