# Part 3: Creating an Item Embedding Lookup Model using Keras

This tutorial shows how to use the Matrix Factorization algorithm in BigQuery ML to generate embeddings for items based on their co-occurrence statistics. The generated item embeddings can then be used to find similar items.

Part 3 covers wrapping the item embeddings in a Keras model and exporting it
as a SavedModel, to act as an item-embedding lookup.
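
To make the idea of an item-embedding lookup concrete, the following plain-Python sketch mimics what such a lookup might do. It is not the TensorFlow code in `embeddings_lookup`. The function name `lookup_embeddings`, the whitespace splitting, the averaging of multiple IDs, the zero vector for unknown IDs, and the toy vectors are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def lookup_embeddings(
    inputs: Sequence[str],
    table: Dict[str, List[float]],
    dim: int,
) -> List[List[float]]:
    """Return one vector per input string (illustrative semantics only).

    Each string is split on whitespace into item IDs, and the vectors of
    the IDs found in `table` are averaged. A string with no known ID maps
    to a zero vector of length `dim`.
    """
    results = []
    for text in inputs:
        vectors = [table[token] for token in text.split() if token in table]
        if not vectors:
            # Assumed behavior for unknown items: an all-zero vector.
            results.append([0.0] * dim)
            continue
        # Element-wise mean over the vectors of all known IDs.
        results.append([sum(column) / len(vectors) for column in zip(*vectors)])
    return results


# Hypothetical two-dimensional embeddings, for illustration only.
toy_table = {
    '2114406': [1.0, 0.0],
    '2114402': [0.0, 2.0],
    '2120788': [2.0, 4.0],
}
toy_inputs = ['2114406', '2114402 2120788', 'abc123']
toy_output = lookup_embeddings(toy_inputs, toy_table, dim=2)
print(toy_output)  # [[1.0, 0.0], [1.0, 3.0], [0.0, 0.0]]
```

The exported SavedModel plays the same role, with the table built from the embedding files and the lookup expressed as TensorFlow ops so it can be served. Its actual handling of multi-ID strings and unknown IDs may differ from this sketch.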



## Setup

In [None]:
!pip install -q -U pip
!pip install -q tensorflow==2.2.0
!pip install -q -U google-auth google-api-python-client google-api-core

### Import libraries

In [None]:
import os
import tensorflow as tf
import numpy as np
print(f'TensorFlow version: {tf.__version__}')

### Configure GCP environment settings

In [None]:
PROJECT_ID = 'ksalama-cloudml' # Change to your project.
BUCKET = 'ksalama-cloudml' # Change to your bucket.
EMBEDDING_FILES_PATH = f'gs://{BUCKET}/bqml/item_embeddings/embeddings-*'
MODEL_OUTPUT_DIR = f'gs://{BUCKET}/bqml/embedding_lookup_model'

!gcloud config set project $PROJECT_ID

### Authenticate your GCP account
This step is required only if you run the notebook in Colab.

In [None]:
try:
  from google.colab import auth
  auth.authenticate_user()
  print("Colab user is authenticated.")
except ImportError:
  # Not running in Colab, so there is nothing to authenticate here.
  pass

### Create and export an embedding lookup SavedModel

In [None]:
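# Start from an empty output directory so files from an earlier export
# are not mixed with the new SavedModel.
if tf.io.gfile.exists(MODEL_OUTPUT_DIR):
  print(f"Removing {MODEL_OUTPUT_DIR} contents...")
  tf.io.gfile.rmtree(MODEL_OUTPUT_DIR)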

In [None]:
from embeddings_lookup import lookup_exporter
lookup_exporter.export_saved_model(EMBEDDING_FILES_PATH, MODEL_OUTPUT_DIR)

In [None]:
!saved_model_cli show --dir {MODEL_OUTPUT_DIR} --tag_set serve --signature_def serving_default

In [None]:
loaded_model = tf.saved_model.load(MODEL_OUTPUT_DIR)

In [None]:
input_items = ['2114406', '2114402 2120788', 'abc123']
output = loaded_model(input_items)
print(f'Embeddings retrieved: {output.shape}')
for idx, embedding in enumerate(output):
  print(f'{input_items[idx]}: {embedding[:5]}')
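
Once embeddings have been retrieved, they can be compared to find similar items. The sketch below ranks candidates by cosine similarity, which is one common choice (dot product is another) and is not necessarily the measure used elsewhere in this tutorial series. The helper names `cosine_similarity` and `most_similar` and the toy vectors are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(
    query: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k candidate IDs with the highest cosine similarity to query."""
    scored = [(item_id, cosine_similarity(query, vector))
              for item_id, vector in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical two-dimensional embeddings, for illustration only.
toy_candidates = {
    'item_a': [1.0, 0.0],
    'item_b': [0.0, 1.0],
    'item_c': [1.0, 1.0],
}
toy_ranked = most_similar([2.0, 0.0], toy_candidates, k=2)
print([item_id for item_id, _ in toy_ranked])  # ['item_a', 'item_c']
```

With the loaded model, the query and candidate vectors could come from rows of `output`, for example via `output.numpy().tolist()`, assuming `output` is an eager tensor. A brute-force scan like this is only practical for small catalogs.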

## License

Copyright 2020 Google LLC

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License. You may obtain a copy of the License at: http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 

See the License for the specific language governing permissions and limitations under the License.

**This is not an official Google product but sample code provided for an educational purpose**