In [None]:
#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# On-device recommendations with Firebase ML and TensorFlow Lite

<table class="tfo-notebook-buttons" align="left">
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/FirebaseExtended/codelab-contentrecommendation-android/blob/master/Firebase_ML_on_device_recommentations.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/FirebaseExtended/codelab-contentrecommendation-android/blob/master/Firebase_ML_on_device_recommentations.ipynb"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />View source on GitHub</a>
  </td>
</table>

## Overview

This is the notebook for step 11 of the codelab [**Add recommendations to your app with TensorFlow Lite and Firebase**](https://codelabs.developers.google.com/codelabs/contentrecommendation-android). Before running the code in this notebook, complete steps 1-10 of the codelab to get your app and console projects set up.

This code base provides a toolkit to train an on-device TensorFlow recommendation model with user data collected in your app with Firebase Analytics. The model is then deployed with Firebase ML to serve movie recommendations in the sample app FireFlix.

This notebook shows an end-to-end example that (1) imports Firebase Analytics data from BigQuery, (2) preprocesses that data to prepare it for training, (3) trains a recommendations model using the data, and (4) exports the model in TFLite format, ready to use in apps to run inference and serve recommendations.

The app we use in the codelab is just a sample app, so it doesn't have the usage necessary to generate a significant number of analytics events. Because training accurate models requires a large amount of data, this codelab and notebook simulate a larger analytics event store with the public [movielens](https://grouplens.org/datasets/movielens/) dataset. You can adapt the data processing script to your own dataset and train your own recommendation model.

## Prerequisites

Run the cell below to clone the TensorFlow recommendations model sample from GitHub. This is the model we will use, with our analytics training data, to create the recommendations model.

The model uses a convolutional neural network (CNN) encoder, which applies multiple convolutional layers to generate an encoding of the user history analytics data. For more details, refer to the [documentation]() for the underlying TensorFlow model.

In [1]:
!git clone https://github.com/tensorflow/examples
%cd /content/examples/lite/examples/recommendation/ml/
!pip install -r requirements.txt
!pip install --upgrade google-cloud-storage google-cloud-bigquery[bqstorage]

Cloning into 'examples'...
remote: Enumerating objects: 20141, done.
remote: Counting objects: 100% (1961/1961), done.
remote: Compressing objects: 100% (1060/1060), done.
remote: Total 20141 (delta 909), reused 1580 (delta 590), pack-reused 18180
Receiving objects: 100% (20141/20141), 33.15 MiB | 27.87 MiB/s, done.
Resolving deltas: 100% (11003/11003), done.
/content/examples/lite/examples/recommendation/ml
Collecting google-cloud-storage
  Downloading google_cloud_storage-1.43.0-py2.py3-none-any.whl (106 kB)
     |████████████████████████████████| 106 kB 5.9 MB/s
Collecting google-api-core<3.0dev,>=1.29.0
  Downloading google_api_core-2.3.2-py2.py3-none-any.whl (109 kB)
     |████████████████████████████████| 109 kB 24.8 MB/s
Collecting google-resumable-media<3.0dev,>=1.3.0
  Downloading google_resumable_media-2.1.0-py2.py3-none-any.whl (75 kB)
     |████████████████████████████████| 75 kB 2.1 MB/s
Collecting google-cloud-core<3.0dev,>=1.6.0
  Downloading goog

## Set up authentication

In this notebook, we use analytics data from BigQuery to generate training data for our recommendations model. To access BigQuery data from the Colab notebook, you need to upload the service account file that you downloaded in step 10 of the codelab.

Note: If this step throws an error, you can either:
1. Manually upload the JSON file to the /content folder using the Folder icon in the left menu, then set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the file path.
For example, if the file was uploaded to /content, run:
`os.environ["GOOGLE_APPLICATION_CREDENTIALS"]='/content/<your_service_acct_file_name>'`
OR,
2. Try disabling third-party cookies in your browser, as [suggested here](https://stackoverflow.com/a/61494336).
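
Option 1 can be sketched as a small helper. This is a minimal sketch rather than part of the codelab: the function name `use_service_account` and the example file name are hypothetical, and it assumes the key file is named `<project-id>-<key-suffix>.json`, the same naming assumption the upload cell relies on when it derives `projectID`.

```python
import os


def use_service_account(key_path):
    """Point Google client libraries at a key file and return the project ID.

    Assumes the file is named "<project-id>-<key-suffix>.json".
    """
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = key_path
    file_name = os.path.basename(key_path)
    # Everything before the last "-" is taken to be the project ID.
    return file_name.rsplit("-", 1)[0]


# Hypothetical file name, used only for illustration.
project_id = use_service_account("/content/my-project-0123456789ab.json")
print(project_id)  # my-project
```

If your key file follows a different naming scheme, set the project ID by hand instead of deriving it from the file name.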

In [1]:
import os
from google.colab import files

uploaded = files.upload()

for fn in uploaded.keys():
  print('User uploaded file "{name}" with length {length} bytes'.format(
      name=fn, length=len(uploaded[fn])))
  with open('/content/' + fn, 'wb') as f:
    f.write(uploaded[fn])
  os.environ["GOOGLE_APPLICATION_CREDENTIALS"]='/content/' + fn
  projectID = fn.rsplit("-", 1)[0]

Saving swish-dev-b4614-37d80edbbe33.json to swish-dev-b4614-37d80edbbe33.json
User uploaded file "swish-dev-b4614-37d80edbbe33.json" with length 2344 bytes


# Import app analytics data from BigQuery

In this step, we load the analytics data that we collected in the app with Firebase Analytics and sent to BigQuery. We load the data into a pandas DataFrame and then preprocess it into the format expected by the model training step.

## Enable BigQuery IPython magics

BigQuery provides several convenience IPython magics that we will use to fetch data. We load them with the %reload_ext magic below.

In [None]:
%reload_ext google.cloud.bigquery

## Import data

We use the following SQL statement to get items from the table we created in BigQuery. Firebase Analytics exports a lot of additional information, such as device type and platform version, that we don't need for training this model. Initially, we fetch only a limited number of rows to briefly explore the form of this data and select which fields are important.

Notice that a row in the dataframe is created for each analytics event logged in the app. This row has many properties, but the ones that are of importance for this notebook are the fields:
* event_name
* event_timestamp
* items
* user_pseudo_id

Notice that some fields, such as the **items** field, are actually objects. We will extract the subfield of interest below.

In [2]:
%%bigquery analytics_test_import
SELECT
    *
FROM `firebase_recommendations_dataset.recommendations_table`
LIMIT 10

Query complete after 0.07s: 100%|██████████| 1/1 [00:00<00:00, 148.59query/s]
Downloading: 100%|██████████| 10/10 [00:01<00:00,  7.91rows/s]


In [3]:
analytics_test_import

Unnamed: 0,provider_id,agency_name,street_address,city,state,zip_code,total_episodes_non_lupa,distinct_users_non_lupa,total_hha_charge_amount_non_lupa,total_hha_medicare_payment_amount_non_lupa,total_hha_medicare_standard_payment_amount_non_lupa,outlier_payments_as_a_percent_of_medicare_payment_amount_non_lupa,total_lupa_episodes,total_hha_medicare_payment_amount_for_lupas
0,337290,"AMERICARE CERTIFIED SPECIAL SERVICES, INC CHHA",5923 STRICKLAND AVENUE,BROOKLYN,NY,11234,3148,2310,14112445,12667998,9482140,10,259,108050
1,747394,GRANDCARE HOME HEALTH LLC,4701 ARDENWOOD DRIVE,FORT WORTH,TX,76123,773,220,2175373,2016621,1971446,10,30,10111
2,17009,ALACARE HOME HEALTH & HOSPICE,2970 LORNA ROAD,BIRMINGHAM,AL,35216,12096,6211,30263356,28901930,35087805,0,1138,320331
3,17013,GENTIVA HEALTH SERVICES,"557 GLOVER STREET, SUITE 5",ENTERPRISE,AL,36330,809,459,2811738,2224491,2862609,0,54,13475
4,17014,AMEDISYS HOME HEALTH OF BLOUNTSVILLE,"1106 2ND AVENUE E, SUITE E",ONEONTA,AL,35121,463,322,1389586,1188646,1463926,0,59,17068
5,17016,SOUTHEAST ALABAMA HOMECARE,804 GLOVER AVENUE,ENTERPRISE,AL,36330,963,502,2842744,2182725,2760829,0,109,29735
6,17018,GENTIVA HEALTH SERVICES,"3225 RAINBOW STREET, SUITE 256",RAINBOW CITY,AL,35906,2649,1227,8885133,6910206,8699973,0,170,42769
7,17020,AMEDISYS HOME HEALTH CARE,"273 AZALEA ROAD, SUITE 104, BLDG 2",MOBILE,AL,36609,1028,641,3032311,2512459,3187450,0,126,35139
8,17024,SOUTHEAST ALABAMA HOMECARE,"810 HEDSTROM DRIVE, SUITE ONE",DOTHAN,AL,36301,2172,1090,6542842,4842608,6141106,0,358,99276
9,17025,SAAD HEALTHCARE,"1515 UNIVERSITY BLVD, SOUTH",MOBILE,AL,36609,2321,1148,7162296,5876044,7395974,0,206,54841


The following cell lists all of the columns included in each analytics event entry.

In [4]:
analytics_test_import.columns

Index(['provider_id', 'agency_name', 'street_address', 'city', 'state',
       'zip_code', 'total_episodes_non_lupa', 'distinct_users_non_lupa',
       'total_hha_charge_amount_non_lupa',
       'total_hha_medicare_payment_amount_non_lupa',
       'total_hha_medicare_standard_payment_amount_non_lupa',
       'outlier_payments_as_a_percent_of_medicare_payment_amount_non_lupa',
       'total_lupa_episodes', 'total_hha_medicare_payment_amount_for_lupas'],
      dtype='object')

Of the information logged under 'items', we are only interested in 'item_id', which corresponds to the ID of the movie the user interacted with.
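
As a rough illustration of that extraction, the sketch below pulls `item_id` out of a nested `items` value. It assumes each event's `items` field arrives as a list of dictionaries and that only the first entry matters. The helper name `first_item_id` and the sample events are hypothetical, not taken from the export.

```python
def first_item_id(event):
    """Return the item_id of the first entry in event["items"], or None."""
    items = event.get("items") or []
    if not items:
        return None
    return items[0].get("item_id")


# Hypothetical events shaped like rows of the analytics table.
events = [
    {"user_pseudo_id": "u1", "event_timestamp": 100,
     "items": [{"item_id": "595", "item_name": "a"}]},
    {"user_pseudo_id": "u1", "event_timestamp": 200, "items": []},
]
movie_ids = [first_item_id(e) for e in events]
print(movie_ids)  # ['595', None]
```

Events without any items yield `None`, so they can be filtered out before training.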

In [17]:
analytics_test_import

Unnamed: 0,provider_id,agency_name,street_address,city,state,zip_code,total_episodes_non_lupa,distinct_users_non_lupa,total_hha_charge_amount_non_lupa,total_hha_medicare_payment_amount_non_lupa,total_hha_medicare_standard_payment_amount_non_lupa,outlier_payments_as_a_percent_of_medicare_payment_amount_non_lupa,total_lupa_episodes,total_hha_medicare_payment_amount_for_lupas
0,337290,"AMERICARE CERTIFIED SPECIAL SERVICES, INC CHHA",5923 STRICKLAND AVENUE,BROOKLYN,NY,11234,3148,2310,14112445,12667998,9482140,10,259,108050
1,747394,GRANDCARE HOME HEALTH LLC,4701 ARDENWOOD DRIVE,FORT WORTH,TX,76123,773,220,2175373,2016621,1971446,10,30,10111
2,17009,ALACARE HOME HEALTH & HOSPICE,2970 LORNA ROAD,BIRMINGHAM,AL,35216,12096,6211,30263356,28901930,35087805,0,1138,320331
3,17013,GENTIVA HEALTH SERVICES,"557 GLOVER STREET, SUITE 5",ENTERPRISE,AL,36330,809,459,2811738,2224491,2862609,0,54,13475
4,17014,AMEDISYS HOME HEALTH OF BLOUNTSVILLE,"1106 2ND AVENUE E, SUITE E",ONEONTA,AL,35121,463,322,1389586,1188646,1463926,0,59,17068
5,17016,SOUTHEAST ALABAMA HOMECARE,804 GLOVER AVENUE,ENTERPRISE,AL,36330,963,502,2842744,2182725,2760829,0,109,29735
6,17018,GENTIVA HEALTH SERVICES,"3225 RAINBOW STREET, SUITE 256",RAINBOW CITY,AL,35906,2649,1227,8885133,6910206,8699973,0,170,42769
7,17020,AMEDISYS HOME HEALTH CARE,"273 AZALEA ROAD, SUITE 104, BLDG 2",MOBILE,AL,36609,1028,641,3032311,2512459,3187450,0,126,35139
8,17024,SOUTHEAST ALABAMA HOMECARE,"810 HEDSTROM DRIVE, SUITE ONE",DOTHAN,AL,36301,2172,1090,6542842,4842608,6141106,0,358,99276
9,17025,SAAD HEALTHCARE,"1515 UNIVERSITY BLVD, SOUTH",MOBILE,AL,36609,2321,1148,7162296,5876044,7395974,0,206,54841


Now we run the following command to import the whole dataset into a variable. Note how we only import the fields which we are interested in for training purposes.

In [31]:
%%bigquery analytics_data_real
SELECT provider_id, agency_name, street_address, city, zip_code
FROM `firebase_recommendations_dataset.recommendations_table`

Query complete after 0.01s: 100%|██████████| 1/1 [00:00<00:00, 569.57query/s] 
Downloading: 100%|██████████| 11062/11062 [00:01<00:00, 10648.22rows/s]


In [32]:
analytics_data_real.head()

Unnamed: 0,provider_id,agency_name,street_address,city,zip_code
0,337290,"AMERICARE CERTIFIED SPECIAL SERVICES, INC CHHA",5923 STRICKLAND AVENUE,BROOKLYN,11234
1,747394,GRANDCARE HOME HEALTH LLC,4701 ARDENWOOD DRIVE,FORT WORTH,76123
2,17009,ALACARE HOME HEALTH & HOSPICE,2970 LORNA ROAD,BIRMINGHAM,35216
3,17013,GENTIVA HEALTH SERVICES,"557 GLOVER STREET, SUITE 5",ENTERPRISE,36330
4,17014,AMEDISYS HOME HEALTH OF BLOUNTSVILLE,"1106 2ND AVENUE E, SUITE E",ONEONTA,35121


# Preprocess the dataset

In this step, we create a lambda function to extract a subfield 'item_id' from the items object. This represents the movie_id, so we also rename the columns to match.

In [33]:
analytics = analytics_data_real
def getAgencyId(row):
  items_obj = row['provider_id']
  return items_obj
analytics['agency_id'] = analytics.apply(lambda row: getAgencyId(row), axis=1)
analytics

Unnamed: 0,provider_id,agency_name,street_address,city,zip_code,agency_id
0,337290,"AMERICARE CERTIFIED SPECIAL SERVICES, INC CHHA",5923 STRICKLAND AVENUE,BROOKLYN,11234,337290
1,747394,GRANDCARE HOME HEALTH LLC,4701 ARDENWOOD DRIVE,FORT WORTH,76123,747394
2,17009,ALACARE HOME HEALTH & HOSPICE,2970 LORNA ROAD,BIRMINGHAM,35216,17009
3,17013,GENTIVA HEALTH SERVICES,"557 GLOVER STREET, SUITE 5",ENTERPRISE,36330,17013
4,17014,AMEDISYS HOME HEALTH OF BLOUNTSVILLE,"1106 2ND AVENUE E, SUITE E",ONEONTA,35121,17014
...,...,...,...,...,...,...
11057,59229,"FIFTH AVE HOME HEALTH CARE, INC","5250 SANTA MONICA, SUITE 208 B",LOS ANGELES,90029,59229
11058,557509,APEX HOME HEALTH SERVICES,3919 W SLAUSON AVE,LOS ANGELES,90043,557509
11059,453107,BRITE HEALTH SERVICES LLC,10715 GULFDALE DR SUITE 240,SAN ANTONIO,78216,453107
11060,679667,SOUTHERN ASSURED HOME HEALTH LLC,4211 GARDENDALE DRIVE SUITE A 210,SAN ANTONIO,78229,679667


We drop the 'items' column since we don't need anything else from it.

In [None]:
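analytics.rename(columns={'user_pseudo_id': 'user_id', 'event_timestamp': 'timestamp'}, inplace=True)
# errors='ignore' keeps this cell from raising a KeyError when 'items' was not selected in the query.
analytics.drop(['items'], axis=1, inplace=True, errors='ignore')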

Here is our processed dataframe containing only the data we want to use in training.

The data has the following properties:
*   UserIDs range between 1 and 6040
*   MovieIDs range between 1 and 3952
*   Timestamp is represented in seconds since the epoch as returned by time(2)
*   Each user has at least 20 ratings

In [34]:
analytics

Unnamed: 0,provider_id,agency_name,street_address,city,zip_code,agency_id
0,337290,"AMERICARE CERTIFIED SPECIAL SERVICES, INC CHHA",5923 STRICKLAND AVENUE,BROOKLYN,11234,337290
1,747394,GRANDCARE HOME HEALTH LLC,4701 ARDENWOOD DRIVE,FORT WORTH,76123,747394
2,17009,ALACARE HOME HEALTH & HOSPICE,2970 LORNA ROAD,BIRMINGHAM,35216,17009
3,17013,GENTIVA HEALTH SERVICES,"557 GLOVER STREET, SUITE 5",ENTERPRISE,36330,17013
4,17014,AMEDISYS HOME HEALTH OF BLOUNTSVILLE,"1106 2ND AVENUE E, SUITE E",ONEONTA,35121,17014
...,...,...,...,...,...,...
11057,59229,"FIFTH AVE HOME HEALTH CARE, INC","5250 SANTA MONICA, SUITE 208 B",LOS ANGELES,90029,59229
11058,557509,APEX HOME HEALTH SERVICES,3919 W SLAUSON AVE,LOS ANGELES,90043,557509
11059,453107,BRITE HEALTH SERVICES LLC,10715 GULFDALE DR SUITE 240,SAN ANTONIO,78216,453107
11060,679667,SOUTHERN ASSURED HOME HEALTH LLC,4211 GARDENDALE DRIVE SUITE A 210,SAN ANTONIO,78229,679667


## Sort and group training data to create training examples

Our analytics events need to be reorganized into the format required by the model training step. We create an object that maps each user_id to the list of movies that user has seen, and we use the timestamp data to put each list in sequential order.

In [47]:
import collections
def convert_to_timelines(df):
  """Convert ratings data to per-user timelines."""
  timelines = collections.defaultdict(list)
  movie_counts = collections.Counter()
  for provider_id, zip_code, agency_id, *_ in df.values:
    timelines[provider_id].append([agency_id, zip_code])
    movie_counts[agency_id] += 1
  # Sort per-user timeline by timestamp
  for (provider_id, timeline) in timelines.items():
    timeline.sort(key=lambda x: x[1])
    timelines[provider_id] = [agency_id for agency_id, _ in timeline]
  return timelines, movie_counts
timelines, counts = convert_to_timelines(analytics)

The timelines object contains a list of movie_ids keyed on user_id, indicating the sequence of movies that user has interacted with.
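
To make the shape of `timelines` concrete, here is a minimal sketch of the same grouping on toy data. It assumes each row is a `(user_id, movie_id, timestamp)` tuple in exactly that order. The function name `build_timelines` and the rows are illustrative only.

```python
import collections


def build_timelines(rows):
    """Group (user_id, movie_id, timestamp) rows into per-user movie lists."""
    by_user = collections.defaultdict(list)
    for user_id, movie_id, timestamp in rows:
        by_user[user_id].append((timestamp, movie_id))
    # Order each user's events by timestamp, then keep only the movie IDs.
    return {
        user_id: [movie_id for _, movie_id in sorted(events)]
        for user_id, events in by_user.items()
    }


rows = [(1, 30, 300), (2, 7, 50), (1, 10, 100), (1, 20, 200)]
print(build_timelines(rows))  # {1: [10, 20, 30], 2: [7]}
```

Because the rows are unpacked by position, the column order of the input matters. Selecting the columns by name before iterating avoids pairing the wrong fields.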

In [48]:
import itertools

for key, val in sorted(timelines.items())[0:10]:
  print(key, val)

17008 ['2201 ARLINGTON AVENUE']
17009 ['2970 LORNA ROAD']
17013 ['557 GLOVER STREET, SUITE 5']
17014 ['1106 2ND AVENUE E, SUITE E']
17016 ['804 GLOVER AVENUE']
17017 ['508 ST CLAIR STREET SE']
17018 ['3225 RAINBOW STREET, SUITE 256']
17020 ['273 AZALEA ROAD, SUITE 104, BLDG 2']
17024 ['810 HEDSTROM DRIVE, SUITE ONE']
17025 ['1515 UNIVERSITY BLVD, SOUTH']


## Generate training examples

We use the timelines data to generate TensorFlow training examples. We discard any timeline with fewer than 3 items and use a context length of up to 100 items. The code performs the following steps:

* Groups movie records by user and orders each user's records by timestamp.
* Generates TensorFlow examples with two features: 1) "context": time-ordered sequential movie IDs, and 2) "label": the next movie ID the user viewed. The "max_history_length" parameter defines the shape of the "context" feature; if a user's history is shorter, it is right-padded with the out-of-vocab ID 0.
* Partitions the available data into a training set and a test set.

Sample generated training example with a max user history of 10:
```
0 : {   # (tensorflow.Example)
  features: {   # (tensorflow.Features)
    feature: {
      key  : "context"
      value: {
        int64_list: {
          value: [ 595, 2687, 745, 588, 1, 2355, 2294, 783, 1566, 1907 ]
        }
      }
    }
    feature: {
      key  : "label"
      value: {
        int64_list: {
          value: [ 48 ]
        }
      }
    }
  }
}
```
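
The context/label construction can be sketched without TensorFlow. The snippet below is an illustrative simplification (the name `context_label_pairs` is hypothetical): it builds every context-label pair for one timeline, right-pads short contexts with the out-of-vocab ID 0, and holds out the pair whose label is the user's last movie as the test example.

```python
OOV_ID = 0  # padding value for contexts shorter than max_context_len


def context_label_pairs(timeline, max_context_len):
    """Return (train_pairs, test_pairs) of (context, label) for one timeline."""
    train, test = [], []
    for label_idx in range(1, len(timeline)):
        start = max(0, label_idx - max_context_len)
        context = timeline[start:label_idx]
        # Right-pad so every context has exactly max_context_len entries.
        context = context + [OOV_ID] * (max_context_len - len(context))
        pair = (context, timeline[label_idx])
        # The final label of the timeline is reserved for testing.
        (test if label_idx == len(timeline) - 1 else train).append(pair)
    return train, test


train, test = context_label_pairs([10, 20, 30, 40], max_context_len=3)
print(train)  # [([10, 0, 0], 20), ([10, 20, 0], 30)]
print(test)   # [([10, 20, 30], 40)]
```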

In [49]:
import tensorflow as tf

# used to pad when user doesn't have enough context
OOV_MOVIE_ID = 0

def generate_examples_from_timelines(timelines,
                                     min_timeline_len=3,
                                     max_context_len=100):
  """Convert user timelines to tf examples.

  Convert user timelines to tf examples by adding all possible context-label
  pairs in the examples pool.

  Args:
    timelines: the user timelines to process.
    min_timeline_len: minimum length of the user timeline.
    max_context_len: maximum length of context signals.

  Returns:
    train_examples: tf example list for training.
    test_examples: tf example list for testing.
  """
  train_examples = []
  test_examples = []
  for timeline in timelines.values():
    # Skip if timeline is shorter than min_timeline_len.
    if len(timeline) < min_timeline_len:
      continue
    for label_idx in range(1, len(timeline)):
      start_idx = max(0, label_idx - max_context_len)
      context = timeline[start_idx:label_idx]
      # Pad context with out-of-vocab movie id 0.
      while len(context) < max_context_len:
        context.append(OOV_MOVIE_ID)
      label = timeline[label_idx]
      feature = {
          "context":
              tf.train.Feature(int64_list=tf.train.Int64List(value=context)),
          "label":
              tf.train.Feature(int64_list=tf.train.Int64List(value=[label]))
      }
      tf_example = tf.train.Example(features=tf.train.Features(feature=feature))
      if label_idx == len(timeline) - 1:
        test_examples.append(tf_example.SerializeToString())
      else:
        train_examples.append(tf_example.SerializeToString())
  return train_examples, test_examples



In [51]:
train_examples, test_examples = generate_examples_from_timelines(timelines)

Write examples to tfrecords, to be loaded in the model training step.

In [52]:
def write_tfrecords(tf_examples, filename):
  """Write tf examples to tfrecord file."""
  with tf.io.TFRecordWriter(filename) as file_writer:
    for example in tf_examples:
      file_writer.write(example)

output_dir = 'data/examples'
OUTPUT_TRAINING_DATA_FILENAME = "train_movielens_1m.tfrecord"
OUTPUT_TESTING_DATA_FILENAME = "test_movielens_1m.tfrecord"

if not tf.io.gfile.exists(output_dir):
  tf.io.gfile.makedirs(output_dir)
write_tfrecords(
    tf_examples=train_examples,
    filename=os.path.join(output_dir, OUTPUT_TRAINING_DATA_FILENAME))
write_tfrecords(
    tf_examples=test_examples,
    filename=os.path.join(output_dir, OUTPUT_TESTING_DATA_FILENAME))




# Train model

The training launcher script uses the TensorFlow Keras compile/fit APIs and performs
the following steps to kick-start the training and evaluation process:

*   Sets up the train and eval dataset input functions.
*   Constructs a Keras model according to the provided configs. Refer to the sample_config.json file in the source code to configure your model architecture, such as the embedding dimension, convolutional neural network params, LSTM units, etc.
*   Sets up the loss function. This code base uses a customized batch softmax loss function.
*   Sets up the optimizer, with a flag-specified learning rate and gradient clipping if needed.
*   Sets up the evaluation metrics. Recall@k metrics are provided by default.
*   Compiles the model with the loss function, optimizer, and defined metrics.
*   Sets up callbacks for TensorBoard and the checkpoint manager.
*   Runs model.fit with the compiled model, where you can specify the number of epochs to train, the number of train steps in each epoch, and the number of eval steps in each epoch.
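
For intuition about the recall@k metric in the list above, here is a minimal pure-Python sketch. It assumes one true label per example, as in the generated training examples, so recall@k reduces to the fraction of examples whose label appears among the top k predicted IDs. The function name and inputs are illustrative; this is not the metric implementation used by the launcher script.

```python
def recall_at_k(ranked_predictions, labels, k):
    """Fraction of examples whose label is within the top-k predicted IDs."""
    hits = sum(
        1 for ranked, label in zip(ranked_predictions, labels)
        if label in ranked[:k]
    )
    return hits / len(labels)


# Each inner list holds predicted movie IDs, best first (made-up values).
ranked = [[48, 3, 7], [5, 9, 2], [1, 4, 6], [2, 8, 0]]
labels = [48, 2, 8, 8]
print(recall_at_k(ranked, labels, k=1))  # 0.25
print(recall_at_k(ranked, labels, k=3))  # 0.75
```

A larger k is more forgiving, so recall@k never decreases as k grows.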

## Model training parameters

### Encoder type

You can train the model using three different encoder types: a convolutional neural net (cnn), a recurrent neural net (rnn), or a bag of words (bow). Select one with the **--encoder_type** parameter by supplying **cnn**, **rnn**, or **bow**. Each encoder has strengths and weaknesses depending on the input/output characteristics of your dataset.

For example, if the input context (here, the user history length) is long, cnn and rnn are more suitable because they summarize longer user histories better.

### Training time / size

Another consideration is training time. RNN generally requires the longest training time, followed by CNN, and finally bag of words with the shortest. Bag of words also produces a smaller model, if size is a consideration.

To start training, execute the following command. Note that we use a very small number of epochs (the **num_epochs** parameter below), 10, to speed up training at the expense of model quality. Generating a high-quality model often requires a much higher number. For this model, setting num_epochs to at least 100 should provide a model of sufficient quality.


In [56]:
!python -m model.recommendation_model_launcher_keras \
  --run_mode "train_and_eval" \
  --encoder_type "cnn" \
  --training_data_filepattern "data/examples/train_movielens_1m.tfrecord" \
  --testing_data_filepattern "data/examples/test_movielens_1m.tfrecord" \
  --model_dir "model/model_dir" \
  --params_path "model/sample_config.json"\
  --batch_size 64 \
  --learning_rate 0.1 \
  --steps_per_epoch 1000 \
  --num_epochs 10 \
  --num_eval_steps 1000 \
  --gradient_clip_norm 1.0 \
  --max_history_length 10

/usr/bin/python3: Error while finding module specification for 'model.recommendation_model_launcher_keras' (ModuleNotFoundError: No module named 'model')


# Export model

Now we export the trained model to a tflite file suitable for on-device inference on mobile devices.
Note that here we use the latest checkpoint, number 10000, in the **checkpoint_path**. This number results from num_epochs (10) x steps_per_epoch (1000). If you change either parameter in the previous training step, update the checkpoint path accordingly so that the latest checkpoint is exported.
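
The checkpoint number follows directly from the training flags. Here is a small sketch of that arithmetic, assuming checkpoints are named `ckpt-<final step>` under the model directory, as in the path used in this notebook. The helper name `latest_checkpoint_path` is hypothetical.

```python
def latest_checkpoint_path(model_dir, num_epochs, steps_per_epoch):
    """Build the path of the checkpoint written after the final training step."""
    final_step = num_epochs * steps_per_epoch
    return "{}/ckpt-{}".format(model_dir, final_step)


print(latest_checkpoint_path("model/model_dir", num_epochs=10, steps_per_epoch=1000))
# model/model_dir/ckpt-10000
```

With the suggested 100 epochs and the same steps per epoch, the path would end in `ckpt-100000` under the same assumption.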

In [None]:
!python -m model.recommendation_model_launcher_keras \
  --run_mode "export" \
  --encoder_type "cnn" \
  --params_path "model/sample_config.json"\
  --model_dir "model/model_dir" \
  --checkpoint_path "model/model_dir/ckpt-10000" \
  --num_predictions 100

# Model inference (Optional)

You can verify your model's performance by running inference with test examples.

In [None]:
import tensorflow as tf
import os
import json

# Example input: the IDs of 10 movies that the user interacted with.
context = tf.constant([1196, 1210, 2628, 260, 480, 2571, 589, 1240, 1, 10])

# Directory containing the exported TensorFlow Lite model.
export_dir = "model/model_dir/export"
tflite_model_path = os.path.join(export_dir, 'model.tflite')
# Read the model file, closing it once its contents are loaded.
with open(tflite_model_path, 'rb') as f:
  interpreter = tf.lite.Interpreter(model_content=f.read())
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
print(input_details)
print(output_details)

interpreter.set_tensor(input_details[0]['index'], context)
interpreter.invoke()
tflite_top_predictions_ids = interpreter.get_tensor(
    output_details[0]['index'])
tflite_top_prediction_scores = interpreter.get_tensor(
    output_details[1]['index'])
print("results >>>>>")
print("input >>>>>")
print(input_details[0])
print("output >>>>>")
print(tflite_top_predictions_ids)

# Deploy model to the Firebase Console

We now deploy the model to the Firebase Console. From there, it can be automatically downloaded to your users' devices with Firebase ML.

Step 1. Initialize Firebase App Instance

In [None]:
import firebase_admin

firebase_admin.initialize_app(options={'projectId': projectID, 
             'storageBucket': projectID + '.appspot.com' })

Step 2. Upload the model file to Cloud Storage

In [None]:
from firebase_admin import ml

# This uploads it to your bucket as model.tflite
source = ml.TFLiteGCSModelSource.from_saved_model(export_dir, 'model.tflite')
print(source.gcs_tflite_uri)

Step 3. Deploy the model to Firebase

In [None]:
# Create a Model Format
model_format = ml.TFLiteFormat(model_source=source)

# Create a Model object
sdk_model_1 = ml.Model(display_name="recommendations", model_format=model_format)

# Make the Create API call to create the model in Firebase
firebase_model_1 = ml.create_model(sdk_model_1)
print(firebase_model_1.as_dict())

# Publish the model
model_id = firebase_model_1.model_id
firebase_model_1 = ml.publish_model(model_id)

# Return to the Firebase Console
At this point, we have deployed the trained model to the Firebase console. You can go to Develop > Machine Learning > Custom to check it out!

Note that for the purposes of this codelab, in order to have a quick training time, we intentionally chose suboptimal training parameters (as described in the model training step above) that sacrifice model quality. To get better results, please use the pre-trained model included in the GitHub code repo [here](https://github.com/FirebaseExtended/codelab-contentrecommendation-android/blob/master/recommendation_cnn_i10o100.tflite).
To replace the model we just published:
1. In the Firebase console, go to Develop > Machine Learning > Custom.
1. Select the settings dropdown under the model named "recommendations".
1. Choose "Replace model" and upload the model file from the GitHub repo.

Finally, please return to the [codelab](https://codelabs.developers.google.com/codelabs/contentrecommendation-android) and complete the last steps to see the app in action!