<a href="https://colab.research.google.com/github/mmahadevisingh/kidoranjan/blob/master/Copy_of_Copy_of_Copy_of_Copy_of_Copy_of_Firebase_ML_on_device_recommentations.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# On-device recommendations with Firebase ML and TensorFlow Lite

<table class="tfo-notebook-buttons" align="left">
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/FirebaseExtended/codelab-contentrecommendation-android/blob/master/Firebase_ML_on_device_recommentations.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/FirebaseExtended/codelab-contentrecommendation-android/blob/master/Firebase_ML_on_device_recommentations.ipynb"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />View source on GitHub</a>
  </td>
</table>

## Overview

This is the notebook for step 11 of the codelab [**Add recommendations to your app with TensorFlow Lite and Firebase**](https://codelabs.developers.google.com/codelabs/contentrecommendation-android). Before running the code in this notebook, complete steps 1-10 of the codelab to get your app and console projects set up.

This code base provides a toolkit to train an on-device TensorFlow recommendation model with user data collected in your app with Firebase Analytics. The model is then deployed with Firebase ML to serve movie recommendations in the sample app FireFlix.

This notebook shows an end-to-end example that 1) imports Firebase Analytics data from BigQuery, 2) preprocesses that data to prepare it for training, 3) trains a recommendations model using the data, and 4) exports the model in tflite format, ready to use in apps to run inference and serve recommendations.

Because the app we use in the codelab is just a sample app, it does not have enough usage to generate a significant number of analytics events, and training accurate models requires a large amount of data. For the purposes of this codelab and notebook, we therefore simulate a larger analytics event store using the public [movielens](https://grouplens.org/datasets/movielens/) dataset. You can adapt the data processing script to your own dataset and train your own recommendation model.

## Prerequisites

Run the cell below to clone the TensorFlow recommendations model sample from GitHub. This is the model we will use, with our analytics training data, to create the recommendations model.

The model uses a convolutional neural network (CNN) encoder, which applies multiple convolutional layers to generate an encoding of the user history analytics data. For more details, refer to the [documentation]() for the underlying TensorFlow model.

In [1]:
!git clone https://github.com/tensorflow/examples
%cd /content/examples/lite/examples/recommendation/ml/
!pip install -r requirements.txt
!pip install db-dtypes
!pip install --upgrade google-cloud-storage google-cloud-bigquery[bqstorage]

fatal: destination path 'examples' already exists and is not an empty directory.
/content/examples/lite/examples/recommendation/ml
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting protobuf<3.20,>=3.9.2
  Downloading protobuf-3.19.4-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB)
     |████████████████████████████████| 1.1 MB 34.9 MB/s 
Installing collected packages: protobuf
  Attempting uninstall: protobuf
    Found existing installation: protobuf 3.20.1
    Uninstalling protobuf-3.20.1:
      Successfully uninstalled protobuf-3.20.1
Successfully installed protobuf-3.19.4


Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


## Set up authentication

In this notebook, we use analytics data from BigQuery to generate training data for our recommendations model. To access BigQuery data from the Colab notebook, you need to upload the service account file that you downloaded in step 10 of the codelab.

Note: If this step throws an error, you can either:
1. Manually upload the JSON file to the /content folder using the Folder icon in the left menu, then set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the file path.
For example, if the file was uploaded to /content, run:
`os.environ["GOOGLE_APPLICATION_CREDENTIALS"]='/content/<your_service_acct_file_name>'`
OR,
2. Try disabling third-party cookies in your browser, as [suggested here](https://stackoverflow.com/a/61494336).
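
The manual route in option 1 can be sketched as a small helper. This is a minimal sketch rather than part of the codelab: the function name `use_service_account` is hypothetical, and it assumes the uploaded file is a standard service account JSON key, which carries a `project_id` field. Reading the project ID from the file contents avoids relying on how the file happens to be named.

```python
import json
import os


def use_service_account(key_path):
    """Point Google client libraries at a key file and return its project ID.

    Hypothetical helper: assumes a standard service account JSON key,
    which contains "type" and "project_id" fields.
    """
    with open(key_path, "r", encoding="utf-8") as f:
        key = json.load(f)
    if key.get("type") != "service_account":
        raise ValueError(f"{key_path} does not look like a service account key")
    # Client libraries read this variable to locate the credentials.
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = key_path
    return key["project_id"]
```

For example, `projectID = use_service_account('/content/<your_service_acct_file_name>')` would also define the `projectID` variable that the deployment step later in this notebook relies on.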

In [2]:
import os
from google.colab import files

uploaded = files.upload()

for fn in uploaded.keys():
  print('User uploaded file "{name}" with length {length} bytes'.format(
      name=fn, length=len(uploaded[fn])))
  with open('/content/' + fn, 'wb') as f:
    f.write(uploaded[fn])
  os.environ["GOOGLE_APPLICATION_CREDENTIALS"]='/content/' + fn
  projectID = fn.rsplit("-", 1)[0]

Saving kidoranjan-f3178-b67e21472114.json to kidoranjan-f3178-b67e21472114.json
User uploaded file "kidoranjan-f3178-b67e21472114.json" with length 2321 bytes


# Import app analytics data from BigQuery

In this step, we will load the analytics data that we collected in the app with Firebase Analytics and sent to BigQuery. We will load the data into the pandas data processing library and then preprocess it into the format expected by the model training step.

## Enable BigQuery IPython magics

BigQuery provides several convenience IPython magics for fetching data. We load them with the %reload_ext magic below.

In [3]:
%reload_ext google.cloud.bigquery

## Import data

We use the following SQL statement to get items from the table we created in BigQuery. Firebase Analytics exports a lot of additional information, such as device type and platform version, that we don't need for the purposes of training this model. Initially, we only fetch a limited number of rows to briefly explore the form of this data and select which fields are important.

Notice that a row in the dataframe is created for each analytics event logged in the app. This row has many properties, but the ones that are of importance for this notebook are the fields:
* event_name
* event_timestamp
* items
* user_pseudo_id

Notice that some fields, such as the **items** field, are actually objects. We will extract the subfield of interest below.

In [4]:
%%bigquery analytics_test_import
SELECT
    *
FROM `analytics_317045996.events_20220628`
LIMIT 10

Query complete after 0.01s: 100%|██████████| 1/1 [00:00<00:00, 641.82query/s] 
Downloading: 100%|██████████| 10/10 [00:01<00:00,  6.26rows/s]


In [5]:
analytics_test_import

Unnamed: 0,event_date,event_timestamp,event_name,event_params,event_previous_timestamp,event_value_in_usd,event_bundle_sequence_id,event_server_timestamp_offset,user_id,user_pseudo_id,...,user_ltv,device,geo,app_info,traffic_source,stream_id,platform,event_dimensions,ecommerce,items
0,20220628,1656418265822012,screen_view,"[{'key': 'firebase_screen_class', 'value': {'s...",1656418146415012,,149,595471,,deb5945fa29be45801049d34f1a52570,...,,"{'category': 'mobile', 'mobile_brand_name': 'X...","{'continent': 'Asia', 'country': 'India', 'reg...",{'id': 'com.google.firebase.codelabs.recommend...,"{'name': '(direct)', 'medium': '(none)', 'sour...",3595545520,ANDROID,,,[]
1,20220628,1656419539126072,user_engagement,"[{'key': 'firebase_screen_id', 'value': {'stri...",1656419520822072,,149,595471,,deb5945fa29be45801049d34f1a52570,...,,"{'category': 'mobile', 'mobile_brand_name': 'X...","{'continent': 'Asia', 'country': 'India', 'reg...",{'id': 'com.google.firebase.codelabs.recommend...,"{'name': '(direct)', 'medium': '(none)', 'sour...",3595545520,ANDROID,,,[]
2,20220628,1656420723987157,user_engagement,"[{'key': 'ga_session_number', 'value': {'strin...",1656419706719157,,149,595471,,deb5945fa29be45801049d34f1a52570,...,,"{'category': 'mobile', 'mobile_brand_name': 'X...","{'continent': 'Asia', 'country': 'India', 'reg...",{'id': 'com.google.firebase.codelabs.recommend...,"{'name': '(direct)', 'medium': '(none)', 'sour...",3595545520,ANDROID,,,[]
3,20220628,1656420740723158,screen_view,"[{'key': 'firebase_screen_class', 'value': {'s...",1656420632600158,,149,595471,,deb5945fa29be45801049d34f1a52570,...,,"{'category': 'mobile', 'mobile_brand_name': 'X...","{'continent': 'Asia', 'country': 'India', 'reg...",{'id': 'com.google.firebase.codelabs.recommend...,"{'name': '(direct)', 'medium': '(none)', 'sour...",3595545520,ANDROID,,,[]
4,20220628,1656401617001000,screen_view,"[{'key': 'firebase_screen_class', 'value': {'s...",1656401325913000,,147,85088155,,deb5945fa29be45801049d34f1a52570,...,,"{'category': 'mobile', 'mobile_brand_name': 'X...","{'continent': 'Asia', 'country': 'India', 'reg...",{'id': 'com.google.firebase.codelabs.recommend...,"{'name': '(direct)', 'medium': '(none)', 'sour...",3595545520,ANDROID,,,[]
5,20220628,1656401641945015,user_engagement,"[{'key': 'firebase_screen_class', 'value': {'s...",1656334688658015,,147,85088155,,deb5945fa29be45801049d34f1a52570,...,,"{'category': 'mobile', 'mobile_brand_name': 'X...","{'continent': 'Asia', 'country': 'India', 'reg...",{'id': 'com.google.firebase.codelabs.recommend...,"{'name': '(direct)', 'medium': '(none)', 'sour...",3595545520,ANDROID,,,[]
6,20220628,1656418060902021,user_engagement,"[{'key': 'ga_session_id', 'value': {'string_va...",1656401641945021,,148,669,,deb5945fa29be45801049d34f1a52570,...,,"{'category': 'mobile', 'mobile_brand_name': 'X...","{'continent': 'Asia', 'country': 'India', 'reg...",{'id': 'com.google.firebase.codelabs.recommend...,"{'name': '(direct)', 'medium': '(none)', 'sour...",3595545520,ANDROID,,,[]
7,20220628,1656418236022011,user_engagement,"[{'key': 'ga_session_id', 'value': {'string_va...",1656418068524011,,149,595471,,deb5945fa29be45801049d34f1a52570,...,,"{'category': 'mobile', 'mobile_brand_name': 'X...","{'continent': 'Asia', 'country': 'India', 'reg...",{'id': 'com.google.firebase.codelabs.recommend...,"{'name': '(direct)', 'medium': '(none)', 'sour...",3595545520,ANDROID,,,[]
8,20220628,1656418494175068,user_engagement,"[{'key': 'engaged_session_event', 'value': {'s...",1656418424051068,,149,595471,,deb5945fa29be45801049d34f1a52570,...,,"{'category': 'mobile', 'mobile_brand_name': 'X...","{'continent': 'Asia', 'country': 'India', 'reg...",{'id': 'com.google.firebase.codelabs.recommend...,"{'name': '(direct)', 'medium': '(none)', 'sour...",3595545520,ANDROID,,,[]
9,20220628,1656419510720069,screen_view,"[{'key': 'firebase_screen_id', 'value': {'stri...",1656418454322069,,149,595471,,deb5945fa29be45801049d34f1a52570,...,,"{'category': 'mobile', 'mobile_brand_name': 'X...","{'continent': 'Asia', 'country': 'India', 'reg...",{'id': 'com.google.firebase.codelabs.recommend...,"{'name': '(direct)', 'medium': '(none)', 'sour...",3595545520,ANDROID,,,[]


These are all of the columns included in each analytics event entry:

In [6]:
analytics_test_import.columns

Index(['event_date', 'event_timestamp', 'event_name', 'event_params',
       'event_previous_timestamp', 'event_value_in_usd',
       'event_bundle_sequence_id', 'event_server_timestamp_offset', 'user_id',
       'user_pseudo_id', 'privacy_info', 'user_properties',
       'user_first_touch_timestamp', 'user_ltv', 'device', 'geo', 'app_info',
       'traffic_source', 'stream_id', 'platform', 'event_dimensions',
       'ecommerce', 'items'],
      dtype='object')

Of the information logged under 'items', we are only interested in 'item_id', which corresponds to the ID of the movie the user interacted with.

In [None]:
analytics_test_import['items'][0][0]

IndexError: ignored
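
The `IndexError` above is what indexing with `[0][0]` produces when the first event has an empty `items` array, as the `[]` values in the table suggest. A more defensive lookup might look like the following sketch; `first_item_id` is a hypothetical helper, and it assumes each entry in `items` behaves like a dict with an `item_id` key.

```python
def first_item_id(items):
    """Return the item_id of the first entry in `items`, or None if empty."""
    if items is None or len(items) == 0:
        return None
    return items[0]["item_id"]


# Toy rows shaped like the 'items' column: most events carry no items.
rows = [[], [], [{"item_id": "40", "item_name": "(not set)"}]]
print([first_item_id(r) for r in rows])  # [None, None, '40']
```

In the notebook, something like `analytics_test_import['items'].map(first_item_id).dropna()` would then list any movie IDs present in the sampled rows without failing on events that have none.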

Now we run the following command to import the whole dataset into a variable. Note how we only import the fields which we are interested in for training purposes.

In [42]:
%%bigquery analytics_data_real
SELECT
    items,user_pseudo_id,event_timestamp
FROM `analytics_317045996.events_20220628`
LIMIT 185

Query complete after 0.00s: 100%|██████████| 1/1 [00:00<00:00, 256.63query/s]
Downloading: 100%|██████████| 185/185 [00:01<00:00, 110.61rows/s]


In [43]:
analytics_data_real.head()

Unnamed: 0,items,user_pseudo_id,event_timestamp
0,[],deb5945fa29be45801049d34f1a52570,1656418265822012
1,[],deb5945fa29be45801049d34f1a52570,1656419539126072
2,[],deb5945fa29be45801049d34f1a52570,1656420723987157
3,[],deb5945fa29be45801049d34f1a52570,1656420740723158
4,[],deb5945fa29be45801049d34f1a52570,1656401617001000


# Preprocess the dataset

In this step, we create a lambda function to extract a subfield 'item_id' from the items object. This represents the movie_id, so we also rename the columns to match.

In [44]:
# Keep only the events that carry at least one item.
analytics = analytics_data_real.loc[analytics_data_real['items'].map(len) > 0].copy()

def getMovieID(row):
  """Return the item_id of the first item logged with the event."""
  items_obj = row['items'][0]
  return items_obj['item_id']

analytics['movie_id'] = analytics.apply(getMovieID, axis=1)
analytics

Unnamed: 0,items,user_pseudo_id,event_timestamp,movie_id
28,"[{'item_id': '40', 'item_name': '(not set)', '...",deb5945fa29be45801049d34f1a52570,1656401366818006,40
29,"[{'item_id': '2', 'item_name': '(not set)', 'i...",deb5945fa29be45801049d34f1a52570,1656401623961002,2
30,"[{'item_id': '16', 'item_name': '(not set)', '...",deb5945fa29be45801049d34f1a52570,1656401632096009,16
31,"[{'item_id': '16', 'item_name': '(not set)', '...",deb5945fa29be45801049d34f1a52570,1656418027906007,16
32,"[{'item_id': '14', 'item_name': '(not set)', '...",deb5945fa29be45801049d34f1a52570,1656418030824010,14
...,...,...,...,...
180,"[{'item_id': '35', 'item_name': '(not set)', '...",deb5945fa29be45801049d34f1a52570,1656419810124124,35
181,"[{'item_id': '7', 'item_name': '(not set)', 'i...",deb5945fa29be45801049d34f1a52570,1656420492697132,7
182,"[{'item_id': '9', 'item_name': '(not set)', 'i...",deb5945fa29be45801049d34f1a52570,1656420493692133,9
183,"[{'item_id': '8', 'item_name': '(not set)', 'i...",deb5945fa29be45801049d34f1a52570,1656420494172134,8


We drop the 'items' column since we don't need anything else from it.

In [45]:
analytics.rename(columns={'user_pseudo_id': 'user_id', 'event_timestamp': 'timestamp'}, inplace=True)
analytics.drop(['items'], axis=1, inplace=True)

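Here is our processed dataframe containing only the data we want to use in training.

The properties below describe the MovieLens 1M dataset that the codelab uses to simulate analytics events; data exported from your own app, like the rows shown here, will have different ID and timestamp ranges:
*   UserIDs range between 1 and 6040
*   MovieIDs range between 1 and 3952
*   Timestamp is represented in seconds since the epoch as returned by time(2)
*   Each user has at least 20 ratings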
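
The `event_timestamp` values in the Firebase Analytics export shown in this notebook are recorded in microseconds since the epoch rather than seconds. Sorting by timestamp is unaffected, but a conversion helps when inspecting the data. A small sketch, where the helper name `event_time_utc` is an assumption made for illustration:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def event_time_utc(event_timestamp_us):
    """Convert a microsecond epoch timestamp to an aware UTC datetime."""
    # Integer timedelta arithmetic avoids float rounding of the microseconds.
    return EPOCH + timedelta(microseconds=int(event_timestamp_us))


print(event_time_utc(1656418265822012).isoformat())
# 2022-06-28T12:11:05.822012+00:00
```

The result falls on 2022-06-28, which agrees with the `event_date` column of the first row in the sample import.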

In [46]:
analytics

Unnamed: 0,user_id,timestamp,movie_id
28,deb5945fa29be45801049d34f1a52570,1656401366818006,40
29,deb5945fa29be45801049d34f1a52570,1656401623961002,2
30,deb5945fa29be45801049d34f1a52570,1656401632096009,16
31,deb5945fa29be45801049d34f1a52570,1656418027906007,16
32,deb5945fa29be45801049d34f1a52570,1656418030824010,14
...,...,...,...
180,deb5945fa29be45801049d34f1a52570,1656419810124124,35
181,deb5945fa29be45801049d34f1a52570,1656420492697132,7
182,deb5945fa29be45801049d34f1a52570,1656420493692133,9
183,deb5945fa29be45801049d34f1a52570,1656420494172134,8


## Sort and group training data to create training examples

Our analytics events need to be reorganized into the format required for the model training step. We will create an object that maps each user_id to a list of movies that user has seen. We use the timestamp data to create the sequential context.

In [47]:
import collections
def convert_to_timelines(df):
  """Convert ratings data to per-user movie timelines and per-movie counts."""
  timelines = collections.defaultdict(list)
  movie_counts = collections.Counter()
  for user_id, timestamp, movie_id in df.values:
    timelines[user_id].append([movie_id, int(timestamp)])
    movie_counts[movie_id] += 1
  # Sort per-user timeline by timestamp
  for (user_id, timeline) in timelines.items():
    timeline.sort(key=lambda x: x[1])
    timelines[user_id] = [movie_id for movie_id, _ in timeline]
  return timelines, movie_counts
timelines, counts = convert_to_timelines(analytics)

The timelines object contains a list of movie_ids keyed on user_id, indicating the sequence of movies that user has interacted with.

In [48]:
import itertools

for key, val in sorted(timelines.items())[0:10]:
  print(key, val)

deb5945fa29be45801049d34f1a52570 ['1', '11', '40', '35', '31', '20', '25', '23', '3', '2', '4', '22', '32', '36', '1', '2', '3', '7', '11', '10', '13', '17', '16', '30', '41', '38', '5', '9', '12', '16', '15', '13', '14', '18', '21', '20', '22', '23', '26', '31', '33', '36', '6', '8', '10', '7', '5', '16', '23', '33', '38', '41', '1', '9', '12', '16', '21', '24', '26', '30', '31', '29', '28', '33', '38', '36', '4', '5', '8', '14', '16', '18', '27', '30', '38', '35', '33', '29', '28', '15', '13', '10', '14', '16', '15', '35', '38', '36', '34', '11', '7', '4', '1', '5', '9', '10', '12', '14', '15', '26', '24', '29', '36', '37', '41', '40', '1', '12', '11', '15', '13', '17', '20', '24', '37', '36', '40', '30', '1', '10', '9', '11', '13', '15', '17', '22', '26', '30', '27', '29', '33', '35', '38', '1', '4', '6', '5', '7', '9', '8', '10', '11', '14', '13', '16', '24', '25', '28', '37', '12', '16', '23', '27', '31', '7', '39', '41']


## Generate training examples

We use the timelines data to generate TensorFlow training examples. We discard any timeline with fewer than 3 items, and we consider contexts of up to 100 items. We perform the following steps:

* Group movie records by user, and order each user's movie records by timestamp.
* Generate TensorFlow examples with two features: 1) "context": time-ordered sequential movie IDs, and 2) "label": the next movie ID the user viewed. The "max_history_length" parameter defines the shape of the "context" feature; if a user does not have enough history, the context is right-padded with the out-of-vocab ID 0.
* Partition the available data into a training set and a test set.

Sample generated training example with max user history as 10:
```
0 : {   # (tensorflow.Example)
  features: {   # (tensorflow.Features)
    feature: {
      key  : "context"
      value: {
        int64_list: {
          value: [ 595, 2687, 745, 588, 1, 2355, 2294, 783, 1566, 1907 ]
        }
      }
    }
    feature: {
      key  : "label"
      value: {
        int64_list: {
          value: [ 48 ]
        }
      }
    }
  }
}
```
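
The windowing and padding behind such an example can be illustrated without TensorFlow. The sketch below is a simplified stand-in for the example generator, with a hypothetical function name and a toy timeline; it only shows how each position in a timeline yields one context/label pair.

```python
def context_label_pairs(timeline, max_context_len, pad_id=0):
    """Yield (context, label) pairs for every label position in a timeline."""
    pairs = []
    for label_idx in range(1, len(timeline)):
        # Context is the history before the label, truncated to the window.
        start_idx = max(0, label_idx - max_context_len)
        context = list(timeline[start_idx:label_idx])
        # Right-pad short histories with the out-of-vocab ID.
        context += [pad_id] * (max_context_len - len(context))
        pairs.append((context, timeline[label_idx]))
    return pairs


for context, label in context_label_pairs([40, 2, 16, 14], max_context_len=3):
    print(context, "->", label)
# [40, 0, 0] -> 2
# [40, 2, 0] -> 16
# [40, 2, 16] -> 14
```

In the notebook's split, the pair built from the last position of each timeline goes to the test set and the rest go to the training set.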

In [49]:
import tensorflow as tf

# used to pad when user doesn't have enough context
OOV_MOVIE_ID = 0

def generate_examples_from_timelines(timelines,
                                     min_timeline_len=3,
                                     max_context_len=100):
  """Convert user timelines to tf examples.

  Convert user timelines to tf examples by adding all possible context-label
  pairs in the examples pool.

  Args:
    timelines: the user timelines to process.
    min_timeline_len: minimum length of the user timeline.
    max_context_len: maximum length of context signals.

  Returns:
    train_examples: tf example list for training.
    test_examples: tf example list for testing.
  """
  train_examples = []
  test_examples = []
  for timeline in timelines.values():
    # Skip if timeline is shorter than min_timeline_len.
    if len(timeline) < min_timeline_len:
      continue
    for label_idx in range(1, len(timeline)):
      start_idx = max(0, label_idx - max_context_len)
      context = timeline[start_idx:label_idx]
      context = list(map(int, context))
      # Pad context with out-of-vocab movie id 0.
      while len(context) < max_context_len:
        context.append(OOV_MOVIE_ID)
      label = timeline[label_idx]
      label = int(label)
      feature = {
          "context":
              tf.train.Feature(int64_list=tf.train.Int64List(value=context)),
          "label":
              tf.train.Feature(int64_list=tf.train.Int64List(value=[label]))
      }
      tf_example = tf.train.Example(features=tf.train.Features(feature=feature))
      if label_idx == len(timeline) - 1:
        test_examples.append(tf_example.SerializeToString())
      else:
        train_examples.append(tf_example.SerializeToString())
  return train_examples, test_examples



In [50]:
train_examples, test_examples = generate_examples_from_timelines(timelines)

Write examples to tfrecords, to be loaded in the model training step.

In [51]:
def write_tfrecords(tf_examples, filename):
  """Write tf examples to tfrecord file."""
  with tf.io.TFRecordWriter(filename) as file_writer:
    for example in tf_examples:
      file_writer.write(example)

output_dir = 'data/examples'
OUTPUT_TRAINING_DATA_FILENAME = "train_movielens_1m.tfrecord"
OUTPUT_TESTING_DATA_FILENAME = "test_movielens_1m.tfrecord"

if not tf.io.gfile.exists(output_dir):
  tf.io.gfile.makedirs(output_dir)
write_tfrecords(
    tf_examples=train_examples,
    filename=os.path.join(output_dir, OUTPUT_TRAINING_DATA_FILENAME))
write_tfrecords(
    tf_examples=test_examples,
    filename=os.path.join(output_dir, OUTPUT_TESTING_DATA_FILENAME))
print(len(train_examples))
print(len(test_examples))




155
1
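
These two counts follow directly from the split rule: each kept timeline contributes its last position to the test set and every other label position to the training set. The sketch below reproduces that arithmetic; the function name is hypothetical, and it takes only timeline lengths as input.

```python
def expected_split_sizes(timeline_lengths, min_timeline_len=3):
    """Count the train/test examples the per-user split would produce."""
    train = test = 0
    for n in timeline_lengths:
        if n < min_timeline_len:
            continue  # Timeline is skipped entirely.
        num_labels = max(n - 1, 0)  # Positions 1 .. n-1 can serve as labels.
        if num_labels == 0:
            continue
        train += num_labels - 1  # Every label position except the last.
        test += 1                # The last label position.
    return train, test


print(expected_split_sizes([157]))  # (155, 1)
```

The printed sizes of 155 and 1 are therefore consistent with a single user timeline of 157 events. With only one user, the test set holds a single example, so evaluation metrics on this data carry very little information.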


# Train model

The training launcher script uses the TensorFlow Keras compile/fit APIs and performs
the following steps to kick-start the training and evaluation process:

*   Set up both the train and eval dataset input functions.
*   Construct the Keras model according to the provided configs. Refer to the sample_config.json file in the source code to configure your model architecture, such as embedding dimension, convolutional neural network params, LSTM units, etc.
*   Set up the loss function. This code base uses a customized batch softmax loss function.
*   Set up the optimizer, with the learning rate and gradient clipping specified by flags if needed.
*   Set up evaluation metrics. Recall@k metrics are provided by default.
*   Compile the model with the loss function, optimizer, and defined metrics.
*   Set up callbacks for TensorBoard and the checkpoint manager.
*   Run model.fit with the compiled model, where you can specify the number of epochs to train, the number of train steps in each epoch, and the number of eval steps in each epoch.
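
As a rough illustration of the default metric, recall@k can be read here as the fraction of evaluation examples whose true next movie appears among the model's top k predictions. The sketch below is a plain-Python illustration of that definition on made-up rankings, not the metric implementation used by the training script.

```python
def recall_at_k(ranked_predictions, labels, k):
    """Fraction of examples whose label is among the top-k predictions."""
    hits = sum(
        1 for ranked, label in zip(ranked_predictions, labels)
        if label in ranked[:k]
    )
    return hits / len(labels)


# Made-up rankings (best first) and true next-movie labels for 3 examples.
ranked = [[16, 2, 40], [7, 9, 8], [35, 14, 1]]
labels = [2, 8, 5]
for k in (1, 2, 3):
    print(k, round(recall_at_k(ranked, labels, k), 3))
```

Recall@k can only stay the same or grow as k increases, since a larger k gives the model more chances to include the true label.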

## Model training parameters

### Encoder type

You can train the model using three different encoder types: a convolutional neural net (cnn), a recurrent neural net (rnn), or a bag of words (bow). You can select between the types with the **--encoder_type** parameter, supplying **cnn**, **rnn**, or **bow**. Different encoders have strengths and weaknesses depending on the input / output characteristics of your dataset.

For example: If the input context (here, the user history length) is long, cnn and rnn would be more suitable as they have better summarization ability with longer user histories.

### Training time / size

Another consideration is training time. RNN generally requires the longest training time, followed by CNN, and finally bag of words with the shortest. A bag-of-words model will also be smaller if space is a consideration.

To start training, execute the following command. Please note that we are using a very small number of epochs (**num_epochs** parameter below) of 10 to speed up training time at the expense of model quality. Generating a high quality model often requires a much higher number. For this model, setting num_epochs to at least 100 should provide a model of sufficient quality. 


In [52]:

!python -m model.recommendation_model_launcher_keras \
  --run_mode "train_and_eval" \
  --encoder_type "cnn" \
  --training_data_filepattern "data/examples/train_movielens_1m.tfrecord" \
  --testing_data_filepattern "data/examples/test_movielens_1m.tfrecord" \
  --model_dir "model/model_dir" \
  --params_path "model/sample_config.json" \
  --batch_size 10 \
  --learning_rate 0.1 \
  --steps_per_epoch 100 \
  --num_epochs 10 \
  --num_eval_steps 1000 \
  --gradient_clip_norm 1.0 \
  --max_history_length 10

INFO:tensorflow:Setting up train and eval input_fns.
I0705 12:05:33.955281 139771579889536 app.py:258] Setting up train and eval input_fns.
INFO:tensorflow:Build keras model for mode: train_and_eval.
I0705 12:05:33.955555 139771579889536 app.py:258] Build keras model for mode: train_and_eval.
2022-07-05 12:05:34.573018: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
INFO:tensorflow:<keras.callbacks.History object at 0x7f1e803394d0>
I0705 12:06:57.866011 139771579889536 app.py:258] <keras.callbacks.History object at 0x7f1e803394d0>


# Export model

Now we export the trained model to a tflite file suitable for on-device inference on mobile devices.
Note that here we use the latest checkpoint, number 1000, in the **checkpoint_path**. This results from num_epochs (10) x steps_per_epoch (100). If you change either parameter in the previous training step, update this parameter accordingly to export the latest checkpoint.
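
The checkpoint number can be derived from the training flags rather than typed by hand. The helper below is a hypothetical sketch that assumes checkpoints are named `ckpt-<total steps>` inside the model directory, as in the export command in this notebook, and uses the flag values from the training command above (`--num_epochs 10`, `--steps_per_epoch 100`).

```python
def latest_checkpoint_path(model_dir, num_epochs, steps_per_epoch):
    """Build the path of the final checkpoint from the training flags."""
    total_steps = num_epochs * steps_per_epoch
    # Assumed naming scheme: "ckpt-<total training steps>".
    return f"{model_dir}/ckpt-{total_steps}"


print(latest_checkpoint_path("model/model_dir", num_epochs=10, steps_per_epoch=100))
# model/model_dir/ckpt-1000
```

If the path does not match a file in the model directory, listing that directory is the more reliable way to find the latest checkpoint.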

In [55]:
import os
os.chdir('/content/examples/lite/examples/recommendation/ml/model-tokonex/model-tokonex/model-tokonex/model-tokonex/')
!python -m model.recommendation_model_launcher_keras \
  --run_mode "export" \
  --encoder_type "cnn" \
  --params_path "model/sample_config.json"\
  --model_dir "model/model_dir" \
  --checkpoint_path "model/model_dir/ckpt-1000" \
  --num_predictions 10

INFO:tensorflow:Setting up train and eval input_fns.
I0705 12:10:20.381891 139822023776128 app.py:258] Setting up train and eval input_fns.
INFO:tensorflow:Build keras model for mode: export.
I0705 12:10:20.382103 139822023776128 app.py:258] Build keras model for mode: export.
2022-07-05 12:10:21.008830: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
INFO:tensorflow:Exporting model to dir: model/model_dir/export
I0705 12:10:21.127792 139822023776128 app.py:258] Exporting model to dir: model/model_dir/export
W0705 12:10:21.611879 139822023776128 save_impl.py:72] Skipping full serialization of Keras layer <model.recommendation_model.RecommendationModel object at 0x7f2a42e662d0>, because it is not built.
2022-07-05 12:10:21.615447: W tensorflow/python/util/util.cc:368] Sets are not currently considered sequences, but this may change in the future, 

# Model inference (Optional)

You can verify your model's performance by running inference with test examples.

In [56]:
import os

import numpy as np
import tensorflow as tf

# Example input: IDs of 10 movies that the user interacted with. The length
# must match the model's input shape ([10], printed below).
context = tf.constant([1196, 1210, 2628, 260, 480, 2571, 589, 1240, 1, 10])

# Directory of the exported TensorFlow Lite model.
export_dir = "model/model_dir/export"
tflite_model_path = os.path.join(export_dir, 'model.tflite')
with open(tflite_model_path, 'rb') as f:
  interpreter = tf.lite.Interpreter(model_content=f.read())
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
print(input_details)
print(output_details)

interpreter.set_tensor(input_details[0]['index'], context)
interpreter.invoke()

# Select outputs by dtype rather than by position, since the order of
# output_details can vary: prediction IDs are integers, scores are floats.
ids_detail = next(
    d for d in output_details if np.issubdtype(d['dtype'], np.integer))
scores_detail = next(
    d for d in output_details if np.issubdtype(d['dtype'], np.floating))
tflite_top_predictions_ids = interpreter.get_tensor(ids_detail['index'])
tflite_top_prediction_scores = interpreter.get_tensor(scores_detail['index'])
print("results >>>>>")
print("input >>>>>")
print(input_details[0])
print("output >>>>>")
print(tflite_top_predictions_ids)

[{'name': 'serving_default_context:0', 'index': 0, 'shape': array([10], dtype=int32), 'shape_signature': array([10], dtype=int32), 'dtype': <class 'numpy.int32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]
[{'name': 'StatefulPartitionedCall:1', 'index': 42, 'shape': array([10], dtype=int32), 'shape_signature': array([10], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'StatefulPartitionedCall:0', 'index': 41, 'shape': array([10], dtype=int32), 'shape_signature': array([10], dtype=int32), 'dtype': <class 'numpy.int32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_

# Deploy model to the Firebase Console

We now deploy the model to the Firebase Console. From there, it can be automatically downloaded to your user's devices with Firebase ML.

Step 1. Initialize Firebase App Instance

In [57]:
import firebase_admin

firebase_admin.initialize_app(options={'projectId': projectID, 
             'storageBucket': projectID + '.appspot.com' })

<firebase_admin.App at 0x7fe0279e2e10>

Step 2. Upload the model file to Cloud Storage

In [59]:
from firebase_admin import ml

# This uploads it to your bucket as model.tflite
source = ml.TFLiteGCSModelSource.from_saved_model(export_dir, 'model.tflite')
print(source.gcs_tflite_uri)



gs://kidoranjan-f3178.appspot.com/Firebase/ML/Models/model.tflite


Step 3. Deploy the model to Firebase

In [60]:
# Create a Model Format
model_format = ml.TFLiteFormat(model_source=source)

# Create a Model object
sdk_model_1 = ml.Model(display_name="recommendations", model_format=model_format)

# Make the Create API call to create the model in Firebase
firebase_model_1 = ml.create_model(sdk_model_1)
print(firebase_model_1.as_dict())

# Publish the model
model_id = firebase_model_1.model_id
firebase_model_1 = ml.publish_model(model_id)

{'name': 'projects/kidoranjan-f3178/models/19169303', 'displayName': 'recommendations', 'createTime': '2022-07-05T12:19:14.413175Z', 'updateTime': '2022-07-05T12:19:14.413175Z', 'state': {}, 'etag': '0096277604f4f801881a5f7d69002b27c91202fe2ba65ba92750947638a02f93', 'modelHash': 'd6bef35f979b639d3d12a78aa4a6b0eaa407a61208027158f4229da0c2a5b1c9', 'tfliteModel': {'sizeBytes': '604408', 'gcsTfliteUri': 'gs://kidoranjan-f3178.appspot.com/Firebase/ML/Models/model.tflite'}}


# Return to the Firebase Console
At this point, we have deployed the trained model to the Firebase console. You can go to Develop > Machine Learning > Custom to check it out!

Note that for the purposes of this codelab, in order to have a quick training time, we intentionally chose suboptimal training parameters (as described in the model training step above) that sacrifice model quality. To get better results, please use the pre-trained model included in the GitHub code repo [here](https://github.com/FirebaseExtended/codelab-contentrecommendation-android/blob/master/recommendation_cnn_i10o100.tflite).
To replace the model we just published:
1. In the Firebase console, go to Develop > Machine Learning > Custom.
1. Select the settings dropdown under the model named "recommendations".
1. Choose "Replace model" and upload the model file from the GitHub repo.

Finally, please return to the [codelab](https://codelabs.developers.google.com/codelabs/contentrecommendation-android) and complete the last steps to see the app in action!