<a href="https://colab.research.google.com/github/rkrissada/google_ml_training/blob/master/hybrid_recommendations.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Neural network hybrid recommendation system on Google Analytics data model and training

This notebook demonstrates how to implement a hybrid recommendation system using a neural network to combine content-based and collaborative filtering recommendation models using Google Analytics data. We are going to use the learned user embeddings from [wals.ipynb](../wals.ipynb) and combine that with our previous content-based features from [content_based_using_neural_networks.ipynb](../content_based_using_neural_networks.ipynb)

Now that we have our data preprocessed from BigQuery and Cloud Dataflow, we can build our neural network hybrid recommendation model to our preprocessed data. Then we can train locally to make sure everything works and then use the power of Google Cloud ML Engine to scale it out.

We're going to use TensorFlow Hub to use trained text embeddings, so let's first pip install that and reset our session.

In [0]:
!pip3 install tensorflow_hub



In [0]:
%%bash
pip install --upgrade tensorflow

Requirement already up-to-date: tensorflow in /usr/local/envs/py3env/lib/python3.5/site-packages (1.13.1)


Now reset the notebook's session kernel! Since we're no longer using Cloud Dataflow, we'll be using the python3 kernel from here on out so don't forget to change the kernel if it's still python2.

In [0]:
# Import helpful libraries and setup our project, bucket, and region
import os
import tensorflow as tf
import tensorflow_hub as hub

# PROJECT = "cloud-training-demos" # REPLACE WITH YOUR PROJECT ID
# BUCKET = "cloud-training-demos-ml" # REPLACE WITH YOUR BUCKET NAME
# REGION = "us-central1" # REPLACE WITH YOUR BUCKET REGION e.g. us-central1
PROJECT = "qwiklabs-gcp-d54a1ed2fe64d873" # REPLACE WITH YOUR PROJECT ID
BUCKET = "qwiklabs-gcp-d54a1ed2fe64d873" # REPLACE WITH YOUR BUCKET NAME
REGION = "us-central1" # REPLACE WITH YOUR BUCKET REGION e.g. us-central1

# do not change these
os.environ["PROJECT"] = PROJECT
os.environ["BUCKET"] = BUCKET
os.environ["REGION"] = REGION
os.environ["TFVERSION"] = "1.13"

In [0]:
%%bash
gcloud config set project $PROJECT
gcloud config set compute/region $REGION

Updated property [core/project].
Updated property [compute/region].


In [0]:
%%bash
if ! gsutil ls | grep -q gs://${BUCKET}/hybrid_recommendation/preproc; then
    gsutil mb -l ${REGION} gs://${BUCKET}
    # copy canonical set of preprocessed files if you didn't do preprocessing notebook
    gsutil -m cp -R gs://cloud-training-demos/courses/machine_learning/deepdive/10_recommendation/hybrid_recommendation gs://${BUCKET}
fi

Creating gs://qwiklabs-gcp-d54a1ed2fe64d873/...
Copying gs://cloud-training-demos/courses/machine_learning/deepdive/10_recommendation/hybrid_recommendation/preproc/features/eval.csv-00000-of-00001 [Content-Type=text/plain]...
Copying gs://cloud-training-demos/courses/machine_learning/deepdive/10_recommendation/hybrid_recommendation/preproc/features/tmp/staging/preprocess-hybrid-recommendation-features-181217-164834.1545065316.946936/apache_beam-2.9.0-cp27-cp27mu-manylinux1_x86_64.whl [Content-Type=application/octet-stream]...
/ [0 files][    0.0 B/ 11.4 MiB]                                                / [0 files][    0.0 B/ 13.6 MiB]                                                Copying gs://cloud-training-demos/courses/machine_learning/deepdive/10_recommendation/hybrid_recommendation/preproc/features/tmp/staging/preprocess-hybrid-recommendation-features-181217-164834.1545065316.946936/dataflow_python_sdk.tar [Content-Type=application/octet-stream]...
/ [0 files][    0.0 B/ 16.0 

<h2> Create hybrid recommendation system model using TensorFlow </h2>

Now that we've created our training and evaluation input files as well as our categorical feature vocabulary files, we can create our TensorFlow hybrid recommendation system model.

Let's first get some of our aggregate information that we will use in the model from some of our preprocessed files we saved in Google Cloud Storage.

In [0]:
from tensorflow.python.lib.io import file_io

In [0]:
# Get number of content ids from text file in Google Cloud Storage
with file_io.FileIO(tf.gfile.Glob(filename = "gs://{}/hybrid_recommendation/preproc/vocab_counts/content_id_vocab_count.txt*".format(BUCKET))[0], mode = 'r') as ifp:
    number_of_content_ids = int([x for x in ifp][0])
print("number_of_content_ids = {}".format(number_of_content_ids))

number_of_content_ids = 15634


In [0]:
# Get number of categories from text file in Google Cloud Storage
with file_io.FileIO(tf.gfile.Glob(filename = "gs://{}/hybrid_recommendation/preproc/vocab_counts/category_vocab_count.txt*".format(BUCKET))[0], mode = 'r') as ifp:
    number_of_categories = int([x for x in ifp][0])
print("number_of_categories = {}".format(number_of_categories))

number_of_categories = 3


In [0]:
# Get number of authors from text file in Google Cloud Storage
with file_io.FileIO(tf.gfile.Glob(filename = "gs://{}/hybrid_recommendation/preproc/vocab_counts/author_vocab_count.txt*".format(BUCKET))[0], mode = 'r') as ifp:
    number_of_authors = int([x for x in ifp][0])
print("number_of_authors = {}".format(number_of_authors))

number_of_authors = 1103


In [0]:
# Get mean months since epoch from text file in Google Cloud Storage
with file_io.FileIO(tf.gfile.Glob(filename = "gs://{}/hybrid_recommendation/preproc/vocab_counts/months_since_epoch_mean.txt*".format(BUCKET))[0], mode = 'r') as ifp:
    mean_months_since_epoch = float([x for x in ifp][0])
print("mean_months_since_epoch = {}".format(mean_months_since_epoch))

mean_months_since_epoch = 573.60733908


In [0]:
# Determine CSV and label columns
NON_FACTOR_COLUMNS = "next_content_id,visitor_id,content_id,category,title,author,months_since_epoch".split(',')
FACTOR_COLUMNS = ["user_factor_{}".format(i) for i in range(10)] + ["item_factor_{}".format(i) for i in range(10)]
CSV_COLUMNS = NON_FACTOR_COLUMNS + FACTOR_COLUMNS
LABEL_COLUMN = "next_content_id"

# Set default values for each CSV column
NON_FACTOR_DEFAULTS = [["Unknown"],["Unknown"],["Unknown"],["Unknown"],["Unknown"],["Unknown"],[mean_months_since_epoch]]
FACTOR_DEFAULTS = [[0.0] for i in range(10)] + [[0.0] for i in range(10)] # user and item
DEFAULTS = NON_FACTOR_DEFAULTS + FACTOR_DEFAULTS

Create input function for training and evaluation to read from our preprocessed CSV files.

In [0]:
# Create input function for train and eval
def read_dataset(filename, mode, batch_size = 512):
    def _input_fn():
        def decode_csv(value_column):
            columns = tf.decode_csv(records = value_column, record_defaults = DEFAULTS)
            features = dict(zip(CSV_COLUMNS, columns))          
            label = features.pop(LABEL_COLUMN)         
            return features, label

        # Create list of files that match pattern
        file_list = tf.gfile.Glob(filename = filename)

        # Create dataset from file list
        dataset = tf.data.TextLineDataset(filenames = file_list).map(map_func = decode_csv)

        if mode == tf.estimator.ModeKeys.TRAIN:
            num_epochs = None # indefinitely
            dataset = dataset.shuffle(buffer_size = 10 * batch_size)
        else:
            num_epochs = 1 # end-of-input after this

        dataset = dataset.repeat(count = num_epochs).batch(batch_size = batch_size)
        return dataset.make_one_shot_iterator().get_next()
    return _input_fn

Next, we will create our feature columns using our read in features.

In [0]:
# Create feature columns to be used in model
def create_feature_columns(args):
    # Create content_id feature column
    content_id_column = tf.feature_column.categorical_column_with_hash_bucket(
        key = "content_id",
        hash_bucket_size = number_of_content_ids)

    # Embed content id into a lower dimensional representation
    embedded_content_column = tf.feature_column.embedding_column(
        categorical_column = content_id_column,
        dimension = args["content_id_embedding_dimensions"])

    # Create category feature column
    categorical_category_column = tf.feature_column.categorical_column_with_vocabulary_file(
        key = "category",
        vocabulary_file = tf.gfile.Glob(filename = "gs://{}/hybrid_recommendation/preproc/vocabs/category_vocab.txt*".format(args["bucket"]))[0],
        num_oov_buckets = 1)

    # Convert categorical category column into indicator column so that it can be used in a DNN
    indicator_category_column = tf.feature_column.indicator_column(categorical_column = categorical_category_column)

    # Create title feature column using TF Hub
    embedded_title_column = hub.text_embedding_column(
        key = "title", 
        module_spec = "https://tfhub.dev/google/nnlm-de-dim50-with-normalization/1",
        trainable = False)

    # Create author feature column
    author_column = tf.feature_column.categorical_column_with_hash_bucket(
        key = "author",
        hash_bucket_size = number_of_authors + 1)

    # Embed author into a lower dimensional representation
    embedded_author_column = tf.feature_column.embedding_column(
        categorical_column = author_column,
        dimension = args["author_embedding_dimensions"])

    # Create months since epoch boundaries list for our binning
    months_since_epoch_boundaries = list(range(400, 700, 20))

    # Create months_since_epoch feature column using raw data
    months_since_epoch_column = tf.feature_column.numeric_column(
        key = "months_since_epoch")

    # Create bucketized months_since_epoch feature column using our boundaries
    months_since_epoch_bucketized = tf.feature_column.bucketized_column(
        source_column = months_since_epoch_column,
        boundaries = months_since_epoch_boundaries)

    # Cross our categorical category column and bucketized months since epoch column
    crossed_months_since_category_column = tf.feature_column.crossed_column(
        keys = [categorical_category_column, months_since_epoch_bucketized],
        hash_bucket_size = len(months_since_epoch_boundaries) * (number_of_categories + 1))

    # Convert crossed categorical category and bucketized months since epoch column into indicator column so that it can be used in a DNN
    indicator_crossed_months_since_category_column = tf.feature_column.indicator_column(
            categorical_column = crossed_months_since_category_column)

    # Create user and item factor feature columns from our trained WALS model
    user_factors = [tf.feature_column.numeric_column(key = "user_factor_" + str(i)) for i in range(10)]
    item_factors =  [tf.feature_column.numeric_column(key = "item_factor_" + str(i)) for i in range(10)]

    # Create list of feature columns
    feature_columns = [embedded_content_column,
    embedded_author_column,
    indicator_category_column,
    embedded_title_column,
    indicator_crossed_months_since_category_column] + user_factors + item_factors

    return feature_columns

Now we'll create our model function

In [0]:
# Create custom model function for our custom estimator
def model_fn(features, labels, mode, params):
    # Create neural network input layer using our feature columns defined above
    net = tf.feature_column.input_layer(features = features, feature_columns = params["feature_columns"])

    # Create hidden layers by looping through hidden unit list
    for units in params["hidden_units"]:
        net = tf.layers.dense(inputs = net, units = units, activation = tf.nn.relu)

    # Compute logits (1 per class) using the output of our last hidden layer
    logits = tf.layers.dense(inputs = net, units = params["n_classes"], activation = None)

    # Find the predicted class indices based on the highest logit (which will result in the highest probability)
    predicted_classes = tf.argmax(input = logits, axis = 1)

  # Read in the content id vocabulary so we can tie the predicted class indices to their respective content ids
    with file_io.FileIO(tf.gfile.Glob(filename = "gs://{}/hybrid_recommendation/preproc/vocabs/content_id_vocab.txt*".format(BUCKET))[0], mode = "r") as ifp:
        content_id_names = tf.constant(value = [x.rstrip() for x in ifp])

    # Gather predicted class names based predicted class indices
    predicted_class_names = tf.gather(params = content_id_names, indices = predicted_classes)

    # If the mode is prediction
    if mode == tf.estimator.ModeKeys.PREDICT:
        # Create predictions dict
        predictions_dict = {
            "class_ids": tf.expand_dims(input = predicted_classes, axis = -1),
            "class_names" : tf.expand_dims(input = predicted_class_names, axis = -1),
            "probabilities": tf.nn.softmax(logits = logits),
            "logits": logits
        }

        # Create export outputs
        export_outputs = {"predict_export_outputs": tf.estimator.export.PredictOutput(outputs = predictions_dict)}

        return tf.estimator.EstimatorSpec( # return early since we"re done with what we need for prediction mode
          mode = mode,
          predictions = predictions_dict,
          loss = None,
          train_op = None,
          eval_metric_ops = None,
          export_outputs = export_outputs)

    # Continue on with training and evaluation modes

    # Create lookup table using our content id vocabulary
    table = tf.contrib.lookup.index_table_from_file(
        vocabulary_file = tf.gfile.Glob(filename = "gs://{}/hybrid_recommendation/preproc/vocabs/content_id_vocab.txt*".format(BUCKET))[0])

    # Look up labels from vocabulary table
    labels = table.lookup(keys = labels)

    # Compute loss using sparse softmax cross entropy since this is classification and our labels (content id indices) and probabilities are mutually exclusive
    loss = tf.losses.sparse_softmax_cross_entropy(labels = labels, logits = logits)

    # If the mode is evaluation
    if mode == tf.estimator.ModeKeys.EVAL:
        # Compute evaluation metrics of total accuracy and the accuracy of the top k classes
        accuracy = tf.metrics.accuracy(labels = labels, predictions = predicted_classes, name = "acc_op")
        top_k_accuracy = tf.metrics.mean(values = tf.nn.in_top_k(predictions = logits, targets = labels, k = params["top_k"]))
        map_at_k = tf.metrics.average_precision_at_k(labels = labels, predictions = predicted_classes, k = params["top_k"])

        # Put eval metrics into a dictionary
        eval_metric_ops = {
            "accuracy": accuracy,
            "top_k_accuracy": top_k_accuracy,
            "map_at_k": map_at_k}

        # Create scalar summaries to see in TensorBoard
        tf.summary.scalar(name = "accuracy", tensor = accuracy[1])
        tf.summary.scalar(name = "top_k_accuracy", tensor = top_k_accuracy[1])
        tf.summary.scalar(name = "map_at_k", tensor = map_at_k[1])
        
        return tf.estimator.EstimatorSpec( # return early since we"re done with what we need for evaluation mode
            mode = mode,
            predictions = None,
            loss = loss,
            train_op = None,
            eval_metric_ops = eval_metric_ops,
            export_outputs = None)

    # Continue on with training mode

    # If the mode is training
    assert mode == tf.estimator.ModeKeys.TRAIN

    # Create a custom optimizer
    optimizer = tf.train.AdagradOptimizer(learning_rate = params["learning_rate"])

    # Create train op
    train_op = optimizer.minimize(loss = loss, global_step = tf.train.get_global_step())

    return tf.estimator.EstimatorSpec( # final return since we"re done with what we need for training mode
        mode = mode,
        predictions = None,
        loss = loss,
        train_op = train_op,
        eval_metric_ops = None,
        export_outputs = None)

Now create a serving input function

In [0]:
# Create serving input function
def serving_input_fn():  
    feature_placeholders = {
        colname : tf.placeholder(dtype = tf.string, shape = [None]) \
        for colname in NON_FACTOR_COLUMNS[1:-1]
    }
    feature_placeholders["months_since_epoch"] = tf.placeholder(dtype = tf.float32, shape = [None])

    for colname in FACTOR_COLUMNS:
        feature_placeholders[colname] = tf.placeholder(dtype = tf.float32, shape = [None])

    features = {
        key: tf.expand_dims(tensor, -1) \
        for key, tensor in feature_placeholders.items()
    }

    return tf.estimator.export.ServingInputReceiver(features = features, receiver_tensors = feature_placeholders)

Now that all of the pieces are assembled let's create and run our train and evaluate loop

In [0]:
# Create train and evaluate loop to combine all of the pieces together.
tf.logging.set_verbosity(tf.logging.INFO)
def train_and_evaluate(args):
    estimator = tf.estimator.Estimator(
        model_fn = model_fn,
        model_dir = args["output_dir"],
        params = {
        "feature_columns": create_feature_columns(args),
        "hidden_units": args["hidden_units"],
        "n_classes": number_of_content_ids,
        "learning_rate": args["learning_rate"],
        "top_k": args["top_k"],
        "bucket": args["bucket"]
        }
    )

    train_spec = tf.estimator.TrainSpec(
        input_fn = read_dataset(filename = args["train_data_paths"], mode = tf.estimator.ModeKeys.TRAIN, batch_size = args["batch_size"]),
        max_steps = args["train_steps"])

    exporter = tf.estimator.LatestExporter(name = "exporter", serving_input_receiver_fn = serving_input_fn)

    eval_spec = tf.estimator.EvalSpec(
        input_fn = read_dataset(filename = args["eval_data_paths"], mode = tf.estimator.ModeKeys.EVAL, batch_size = args["batch_size"]),
        steps = None,
        start_delay_secs = args["start_delay_secs"],
        throttle_secs = args["throttle_secs"],
        exporters = exporter)

    tf.estimator.train_and_evaluate(estimator = estimator, train_spec = train_spec, eval_spec = eval_spec)

Run train_and_evaluate!

In [0]:
# Call train and evaluate loop
import shutil

outdir = "hybrid_recommendation_trained"
shutil.rmtree(path = outdir, ignore_errors = True) # start fresh each time

arguments = {
    "bucket": BUCKET,
    "train_data_paths": "gs://{}/hybrid_recommendation/preproc/features/train.csv*".format(BUCKET),
    "eval_data_paths": "gs://{}/hybrid_recommendation/preproc/features/eval.csv*".format(BUCKET),
    "output_dir": outdir,
    "batch_size": 128,
    "learning_rate": 0.1,
    "hidden_units": [256, 128, 64],
    "content_id_embedding_dimensions": 10,
    "author_embedding_dimensions": 10,
    "top_k": 10,
    "train_steps": 1000,
    "start_delay_secs": 30,
    "throttle_secs": 30
}

train_and_evaluate(arguments)

INFO:tensorflow:vocabulary_size = 3 in category is inferred from the number of elements in the vocabulary_file gs://qwiklabs-gcp-d54a1ed2fe64d873/hybrid_recommendation/preproc/vocabs/category_vocab.txt-00000-of-00001.


I0508 13:52:52.445419 140200235099904 feature_column_v2.py:1625] vocabulary_size = 3 in category is inferred from the number of elements in the vocabulary_file gs://qwiklabs-gcp-d54a1ed2fe64d873/hybrid_recommendation/preproc/vocabs/category_vocab.txt-00000-of-00001.


INFO:tensorflow:Using default config.


I0508 13:53:22.992029 140200235099904 estimator.py:1739] Using default config.


INFO:tensorflow:Using config: {'_task_id': 0, '_model_dir': 'hybrid_recommendation_trained', '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_master': '', '_log_step_count_steps': 100, '_save_summary_steps': 100, '_eval_distribute': None, '_num_ps_replicas': 0, '_experimental_distribute': None, '_save_checkpoints_steps': None, '_protocol': None, '_train_distribute': None, '_evaluation_master': '', '_global_id_in_cluster': 0, '_num_worker_replicas': 1, '_device_fn': None, '_is_chief': True, '_task_type': 'worker', '_tf_random_seed': None, '_keep_checkpoint_every_n_hours': 10000, '_service': None, '_keep_checkpoint_max': 5, '_save_checkpoints_secs': 600, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f82a2fe66a0>}


I0508 13:53:23.021491 140200235099904 estimator.py:201] Using config: {'_task_id': 0, '_model_dir': 'hybrid_recommendation_trained', '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_master': '', '_log_step_count_steps': 100, '_save_summary_steps': 100, '_eval_distribute': None, '_num_ps_replicas': 0, '_experimental_distribute': None, '_save_checkpoints_steps': None, '_protocol': None, '_train_distribute': None, '_evaluation_master': '', '_global_id_in_cluster': 0, '_num_worker_replicas': 1, '_device_fn': None, '_is_chief': True, '_task_type': 'worker', '_tf_random_seed': None, '_keep_checkpoint_every_n_hours': 10000, '_service': None, '_keep_checkpoint_max': 5, '_save_checkpoints_secs': 600, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f82a2fe66a0>}


INFO:tensorflow:Not using Distribute Coordinator.


I0508 13:53:23.032705 140200235099904 estimator_training.py:185] Not using Distribute Coordinator.


INFO:tensorflow:Running training and evaluation locally (non-distributed).


I0508 13:53:23.037605 140200235099904 training.py:610] Running training and evaluation locally (non-distributed).


INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 600.


I0508 13:53:23.041005 140200235099904 training.py:698] Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 600.


Instructions for updating:
Colocations handled automatically by placer.


W0508 13:53:23.075108 140200235099904 deprecation.py:323] From /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.


INFO:tensorflow:Calling model_fn.


I0508 13:53:23.325376 140200235099904 estimator.py:1111] Calling model_fn.


Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


W0508 13:53:23.334492 140200235099904 deprecation.py:323] From /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/feature_column/feature_column.py:205: EmbeddingColumn._get_dense_tensor (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed after 2018-11-30.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


W0508 13:53:23.342680 140200235099904 deprecation.py:323] From /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/feature_column/feature_column_v2.py:3100: HashedCategoricalColumn._get_sparse_tensors (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed after 2018-11-30.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


W0508 13:53:23.345519 140200235099904 deprecation.py:323] From /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/feature_column/feature_column.py:2121: HashedCategoricalColumn._transform_feature (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed after 2018-11-30.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


W0508 13:53:23.364204 140200235099904 deprecation.py:323] From /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/feature_column/feature_column_v2.py:3040: HashedCategoricalColumn._num_buckets (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed after 2018-11-30.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


W0508 13:53:23.431527 140200235099904 deprecation.py:323] From /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/feature_column/feature_column.py:206: EmbeddingColumn._variable_shape (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed after 2018-11-30.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


W0508 13:53:23.447360 140200235099904 deprecation.py:323] From /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/feature_column/feature_column.py:205: IndicatorColumn._get_dense_tensor (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed after 2018-11-30.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


W0508 13:53:23.454761 140200235099904 deprecation.py:323] From /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/feature_column/feature_column.py:2121: IndicatorColumn._transform_feature (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed after 2018-11-30.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


W0508 13:53:23.458384 140200235099904 deprecation.py:323] From /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/feature_column/feature_column_v2.py:4295: CrossedColumn._get_sparse_tensors (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed after 2018-11-30.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


W0508 13:53:23.460149 140200235099904 deprecation.py:323] From /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/feature_column/feature_column.py:2121: CrossedColumn._transform_feature (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed after 2018-11-30.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


W0508 13:53:23.463267 140200235099904 deprecation.py:323] From /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/feature_column/feature_column_v2.py:4115: VocabularyFileCategoricalColumn._get_sparse_tensors (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed after 2018-11-30.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


W0508 13:53:23.465841 140200235099904 deprecation.py:323] From /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/feature_column/feature_column.py:2121: VocabularyFileCategoricalColumn._transform_feature (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed after 2018-11-30.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


W0508 13:53:23.507817 140200235099904 deprecation.py:323] From /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/feature_column/feature_column_v2.py:4115: BucketizedColumn._get_sparse_tensors (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed after 2018-11-30.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


W0508 13:53:23.515704 140200235099904 deprecation.py:323] From /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/feature_column/feature_column.py:2121: BucketizedColumn._transform_feature (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed after 2018-11-30.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


W0508 13:53:23.518620 140200235099904 deprecation.py:323] From /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/feature_column/feature_column.py:2121: NumericColumn._transform_feature (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed after 2018-11-30.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


Instructions for updating:
Use tf.cast instead.


W0508 13:53:23.527299 140200235099904 deprecation.py:323] From /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/feature_column/feature_column_v2.py:2703: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.


Instructions for updating:
Use tf.cast instead.


W0508 13:53:23.565364 140200235099904 deprecation.py:323] From /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/feature_column/feature_column_v2.py:2898: to_int64 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.


Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


W0508 13:53:23.581118 140200235099904 deprecation.py:323] From /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/feature_column/feature_column_v2.py:4266: IndicatorColumn._variable_shape (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed after 2018-11-30.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


W0508 13:53:23.588698 140200235099904 deprecation.py:323] From /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/feature_column/feature_column_v2.py:4321: CrossedColumn._num_buckets (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed after 2018-11-30.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


W0508 13:53:23.611190 140200235099904 deprecation.py:323] From /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/feature_column/feature_column_v2.py:4321: VocabularyFileCategoricalColumn._num_buckets (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed after 2018-11-30.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


W0508 13:53:23.705620 140200235099904 deprecation.py:323] From /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/feature_column/feature_column.py:205: NumericColumn._get_dense_tensor (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed after 2018-11-30.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


W0508 13:53:23.717336 140200235099904 deprecation.py:323] From /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/feature_column/feature_column.py:206: NumericColumn._variable_shape (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed after 2018-11-30.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


I0508 13:53:23.828721 140200235099904 saver.py:1483] Saver not created because there are no variables in the graph to restore


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


I0508 13:53:23.888029 140200235099904 saver.py:1483] Saver not created because there are no variables in the graph to restore


Instructions for updating:
Use keras.layers.dense instead.


W0508 13:53:23.990691 140200235099904 deprecation.py:323] From <ipython-input-21-a83e8c390d82>:8: dense (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.dense instead.



For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
If you depend on functionality not listed there, please file an issue.

Instructions for updating:
Use tf.cast instead.


W0508 13:53:27.641224 140200235099904 deprecation.py:323] From /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/ops/array_grad.py:425: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.


INFO:tensorflow:Done calling model_fn.


I0508 13:53:27.761714 140200235099904 estimator.py:1113] Done calling model_fn.


INFO:tensorflow:Create CheckpointSaverHook.


I0508 13:53:27.772295 140200235099904 basic_session_run_hooks.py:527] Create CheckpointSaverHook.


INFO:tensorflow:Graph was finalized.


I0508 13:53:28.634097 140200235099904 monitored_session.py:222] Graph was finalized.


INFO:tensorflow:Running local_init_op.


I0508 13:53:29.081258 140200235099904 session_manager.py:491] Running local_init_op.


INFO:tensorflow:Done running local_init_op.


I0508 13:53:30.626458 140200235099904 session_manager.py:493] Done running local_init_op.


INFO:tensorflow:Saving checkpoints for 0 into hybrid_recommendation_trained/model.ckpt.


I0508 13:53:31.519102 140200235099904 basic_session_run_hooks.py:594] Saving checkpoints for 0 into hybrid_recommendation_trained/model.ckpt.


INFO:tensorflow:loss = 9.656681, step = 1


I0508 13:53:33.203202 140200235099904 basic_session_run_hooks.py:249] loss = 9.656681, step = 1


INFO:tensorflow:global_step/sec: 8.13911


I0508 13:53:45.489241 140200235099904 basic_session_run_hooks.py:680] global_step/sec: 8.13911


INFO:tensorflow:loss = 5.378806, step = 101 (12.301 sec)


I0508 13:53:45.503906 140200235099904 basic_session_run_hooks.py:247] loss = 5.378806, step = 101 (12.301 sec)


INFO:tensorflow:global_step/sec: 8.53595


I0508 13:53:57.204404 140200235099904 basic_session_run_hooks.py:680] global_step/sec: 8.53595


INFO:tensorflow:loss = 4.375754, step = 201 (11.712 sec)


I0508 13:53:57.216357 140200235099904 basic_session_run_hooks.py:247] loss = 4.375754, step = 201 (11.712 sec)


INFO:tensorflow:global_step/sec: 8.7093


I0508 13:54:08.686373 140200235099904 basic_session_run_hooks.py:680] global_step/sec: 8.7093


INFO:tensorflow:loss = 4.790354, step = 301 (11.481 sec)


I0508 13:54:08.697651 140200235099904 basic_session_run_hooks.py:247] loss = 4.790354, step = 301 (11.481 sec)


INFO:tensorflow:global_step/sec: 9.04668


I0508 13:54:19.740157 140200235099904 basic_session_run_hooks.py:680] global_step/sec: 9.04668


INFO:tensorflow:loss = 5.121499, step = 401 (11.054 sec)


I0508 13:54:19.751897 140200235099904 basic_session_run_hooks.py:247] loss = 5.121499, step = 401 (11.054 sec)


INFO:tensorflow:global_step/sec: 9.06497


I0508 13:54:30.771639 140200235099904 basic_session_run_hooks.py:680] global_step/sec: 9.06497


INFO:tensorflow:loss = 4.6208563, step = 501 (11.031 sec)


I0508 13:54:30.782983 140200235099904 basic_session_run_hooks.py:247] loss = 4.6208563, step = 501 (11.031 sec)


INFO:tensorflow:global_step/sec: 9.10653


I0508 13:54:41.752774 140200235099904 basic_session_run_hooks.py:680] global_step/sec: 9.10653


INFO:tensorflow:loss = 4.603697, step = 601 (10.982 sec)


I0508 13:54:41.764964 140200235099904 basic_session_run_hooks.py:247] loss = 4.603697, step = 601 (10.982 sec)


INFO:tensorflow:global_step/sec: 8.82413


I0508 13:54:53.085345 140200235099904 basic_session_run_hooks.py:680] global_step/sec: 8.82413


INFO:tensorflow:loss = 5.044692, step = 701 (11.332 sec)


I0508 13:54:53.096912 140200235099904 basic_session_run_hooks.py:247] loss = 5.044692, step = 701 (11.332 sec)


INFO:tensorflow:global_step/sec: 9.07482


I0508 13:55:04.104972 140200235099904 basic_session_run_hooks.py:680] global_step/sec: 9.07482


INFO:tensorflow:loss = 4.4556427, step = 801 (11.017 sec)


I0508 13:55:04.113814 140200235099904 basic_session_run_hooks.py:247] loss = 4.4556427, step = 801 (11.017 sec)


INFO:tensorflow:global_step/sec: 8.88594


I0508 13:55:15.358545 140200235099904 basic_session_run_hooks.py:680] global_step/sec: 8.88594


INFO:tensorflow:loss = 5.346167, step = 901 (11.255 sec)


I0508 13:55:15.368909 140200235099904 basic_session_run_hooks.py:247] loss = 5.346167, step = 901 (11.255 sec)


INFO:tensorflow:Saving checkpoints for 1000 into hybrid_recommendation_trained/model.ckpt.


I0508 13:55:26.716445 140200235099904 basic_session_run_hooks.py:594] Saving checkpoints for 1000 into hybrid_recommendation_trained/model.ckpt.


INFO:tensorflow:Calling model_fn.


I0508 13:55:27.632009 140200235099904 estimator.py:1111] Calling model_fn.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


I0508 13:55:27.962472 140200235099904 saver.py:1483] Saver not created because there are no variables in the graph to restore


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


I0508 13:55:28.027020 140200235099904 saver.py:1483] Saver not created because there are no variables in the graph to restore


Instructions for updating:
Use tf.cast instead.


W0508 13:55:28.628957 140200235099904 deprecation.py:323] From /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/ops/metrics_impl.py:2295: to_double (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.


Instructions for updating:
Deprecated in favor of operator or tf.math.divide.


W0508 13:55:28.643571 140200235099904 deprecation.py:323] From /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/ops/metrics_impl.py:3040: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.


INFO:tensorflow:Done calling model_fn.


I0508 13:55:28.692947 140200235099904 estimator.py:1113] Done calling model_fn.


INFO:tensorflow:Starting evaluation at 2019-05-08T13:55:28Z


I0508 13:55:28.729183 140200235099904 evaluation.py:257] Starting evaluation at 2019-05-08T13:55:28Z


INFO:tensorflow:Graph was finalized.


I0508 13:55:29.117606 140200235099904 monitored_session.py:222] Graph was finalized.


Instructions for updating:
Use standard file APIs to check for files with this prefix.


W0508 13:55:29.126167 140200235099904 deprecation.py:323] From /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.


INFO:tensorflow:Restoring parameters from hybrid_recommendation_trained/model.ckpt-1000


I0508 13:55:29.134988 140200235099904 saver.py:1270] Restoring parameters from hybrid_recommendation_trained/model.ckpt-1000


INFO:tensorflow:Running local_init_op.


I0508 13:55:29.496692 140200235099904 session_manager.py:491] Running local_init_op.


INFO:tensorflow:Done running local_init_op.


I0508 13:55:30.901467 140200235099904 session_manager.py:493] Done running local_init_op.


INFO:tensorflow:Finished evaluation at 2019-05-08-13:55:47


I0508 13:55:47.821115 140200235099904 evaluation.py:277] Finished evaluation at 2019-05-08-13:55:47


INFO:tensorflow:Saving dict for global step 1000: accuracy = 0.02687605, global_step = 1000, loss = 5.4521894, map_at_k = 0.059553174603174636, top_k_accuracy = 0.16360015


I0508 13:55:47.839330 140200235099904 estimator.py:1979] Saving dict for global step 1000: accuracy = 0.02687605, global_step = 1000, loss = 5.4521894, map_at_k = 0.059553174603174636, top_k_accuracy = 0.16360015


INFO:tensorflow:Saving 'checkpoint_path' summary for global step 1000: hybrid_recommendation_trained/model.ckpt-1000


I0508 13:55:48.069063 140200235099904 estimator.py:2039] Saving 'checkpoint_path' summary for global step 1000: hybrid_recommendation_trained/model.ckpt-1000


INFO:tensorflow:Calling model_fn.


I0508 13:55:48.143230 140200235099904 estimator.py:1111] Calling model_fn.


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


I0508 13:55:48.501814 140200235099904 saver.py:1483] Saver not created because there are no variables in the graph to restore


INFO:tensorflow:Saver not created because there are no variables in the graph to restore


I0508 13:55:48.585599 140200235099904 saver.py:1483] Saver not created because there are no variables in the graph to restore


INFO:tensorflow:Done calling model_fn.


I0508 13:55:48.997986 140200235099904 estimator.py:1113] Done calling model_fn.


Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.build_tensor_info or tf.compat.v1.saved_model.build_tensor_info.


W0508 13:55:49.007808 140200235099904 deprecation.py:323] From /usr/local/envs/py3env/lib/python3.5/site-packages/tensorflow/python/saved_model/signature_def_utils_impl.py:205: build_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.build_tensor_info or tf.compat.v1.saved_model.build_tensor_info.


INFO:tensorflow:Signatures INCLUDED in export for Predict: ['predict_export_outputs', 'serving_default']


I0508 13:55:49.016834 140200235099904 export.py:587] Signatures INCLUDED in export for Predict: ['predict_export_outputs', 'serving_default']


INFO:tensorflow:Signatures INCLUDED in export for Regress: None


I0508 13:55:49.020572 140200235099904 export.py:587] Signatures INCLUDED in export for Regress: None


INFO:tensorflow:Signatures INCLUDED in export for Train: None


I0508 13:55:49.023398 140200235099904 export.py:587] Signatures INCLUDED in export for Train: None


INFO:tensorflow:Signatures INCLUDED in export for Classify: None


I0508 13:55:49.026103 140200235099904 export.py:587] Signatures INCLUDED in export for Classify: None


INFO:tensorflow:Signatures INCLUDED in export for Eval: None


I0508 13:55:49.027393 140200235099904 export.py:587] Signatures INCLUDED in export for Eval: None


INFO:tensorflow:Restoring parameters from hybrid_recommendation_trained/model.ckpt-1000


I0508 13:55:49.111948 140200235099904 saver.py:1270] Restoring parameters from hybrid_recommendation_trained/model.ckpt-1000


INFO:tensorflow:Assets added to graph.


I0508 13:55:49.419217 140200235099904 builder_impl.py:654] Assets added to graph.


INFO:tensorflow:Assets written to: hybrid_recommendation_trained/export/exporter/temp-b'1557323748'/assets


I0508 13:55:49.510589 140200235099904 builder_impl.py:763] Assets written to: hybrid_recommendation_trained/export/exporter/temp-b'1557323748'/assets


INFO:tensorflow:SavedModel written to: hybrid_recommendation_trained/export/exporter/temp-b'1557323748'/saved_model.pb


I0508 13:55:50.038672 140200235099904 builder_impl.py:414] SavedModel written to: hybrid_recommendation_trained/export/exporter/temp-b'1557323748'/saved_model.pb


INFO:tensorflow:Loss for final step: 4.702016.


I0508 13:55:50.827052 140200235099904 estimator.py:359] Loss for final step: 4.702016.


## Run on module locally

Now let's place our code into a python module with model.py and task.py files so that we can train using Google Cloud's ML Engine! First, let's test our module locally.

In [0]:
%writefile requirements.txt
tensorflow_hub

Writing requirements.txt


In [0]:
%%bash
echo "bucket=${BUCKET}"
rm -rf hybrid_recommendation_trained
export PYTHONPATH=${PYTHONPATH}:${PWD}/hybrid_recommendations_module
python -m trainer.task \
    --bucket=${BUCKET} \
    --train_data_paths=gs://${BUCKET}/hybrid_recommendation/preproc/features/train.csv* \
    --eval_data_paths=gs://${BUCKET}/hybrid_recommendation/preproc/features/eval.csv* \
    --output_dir=${OUTDIR} \
    --batch_size=128 \
    --learning_rate=0.1 \
    --hidden_units="256 128 64" \
    --content_id_embedding_dimensions=10 \
    --author_embedding_dimensions=10 \
    --top_k=10 \
    --train_steps=1000 \
    --start_delay_secs=30 \
    --throttle_secs=60

bucket=qwiklabs-gcp-d54a1ed2fe64d873
number_of_content_ids = 15634
number_of_categories = 3
number_of_authors = 1103
mean_months_since_epoch = 573.60733908

For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
If you depend on functionality not listed there, please file an issue.



  from ._conv import register_converters as _register_converters
W0508 13:55:58.750374 140240352323328 __init__.py:56] Some hub symbols are not available because TensorFlow version is less than 1.14
INFO:tensorflow:vocabulary_size = 3 in category is inferred from the number of elements in the vocabulary_file gs://qwiklabs-gcp-d54a1ed2fe64d873/hybrid_recommendation/preproc/vocabs/category_vocab.txt-00000-of-00001.
I0508 13:56:00.326117 140240352323328 feature_column_v2.py:1625] vocabulary_size = 3 in category is inferred from the number of elements in the vocabulary_file gs://qwiklabs-gcp-d54a1ed2fe64d873/hybrid_recommendation/preproc/vocabs/category_vocab.txt-00000-of-00001.
INFO:tensorflow:Using default config.
I0508 13:56:00.330513 140240352323328 estimator.py:1739] Using default config.
W0508 13:56:00.331419 140240352323328 estimator.py:1760] Using temporary folder as model directory: /tmp/tmppy222di0
INFO:tensorflow:Using config: {'_is_chief': True, '_keep_checkpoint_every_n_hours'

# Run on Google Cloud ML Engine
If our module locally trained fine, let's now use of the power of ML Engine to scale it out on Google Cloud.

In [0]:
%%bash
OUTDIR=gs://${BUCKET}/hybrid_recommendation/small_trained_model
JOBNAME=hybrid_recommendation_$(date -u +%y%m%d_%H%M%S)
echo $OUTDIR $REGION $JOBNAME
gsutil -m rm -rf $OUTDIR
gcloud ml-engine jobs submit training $JOBNAME \
    --region=$REGION \
    --module-name=trainer.task \
    --package-path=$(pwd)/hybrid_recommendations_module/trainer \
    --job-dir=$OUTDIR \
    --staging-bucket=gs://$BUCKET \
    --scale-tier=STANDARD_1 \
    --runtime-version=$TFVERSION \
    -- \
    --bucket=${BUCKET} \
    --train_data_paths=gs://${BUCKET}/hybrid_recommendation/preproc/features/train.csv* \
    --eval_data_paths=gs://${BUCKET}/hybrid_recommendation/preproc/features/eval.csv* \
    --output_dir=${OUTDIR} \
    --batch_size=128 \
    --learning_rate=0.1 \
    --hidden_units="256 128 64" \
    --content_id_embedding_dimensions=10 \
    --author_embedding_dimensions=10 \
    --top_k=10 \
    --train_steps=1000 \
    --start_delay_secs=30 \
    --throttle_secs=30

gs://qwiklabs-gcp-d54a1ed2fe64d873/hybrid_recommendation/small_trained_model us-central1 hybrid_recommendation_190508_135951
jobId: hybrid_recommendation_190508_135951
state: QUEUED


CommandException: 1 files/objects could not be removed.
Job [hybrid_recommendation_190508_135951] submitted successfully.
Your job is still active. You may view the status of your job with the command

  $ gcloud ml-engine jobs describe hybrid_recommendation_190508_135951

or continue streaming the logs with the command

  $ gcloud ml-engine jobs stream-logs hybrid_recommendation_190508_135951


Let's add some hyperparameter tuning!

In [0]:
%%writefile hyperparam.yaml
trainingInput:
    hyperparameters:
        goal: MAXIMIZE
        maxTrials: 5
        maxParallelTrials: 1
        hyperparameterMetricTag: accuracy
        params:
            - parameterName: batch_size
              type: INTEGER
              minValue: 8
              maxValue: 64
              scaleType: UNIT_LINEAR_SCALE
            - parameterName: learning_rate
              type: DOUBLE
              minValue: 0.01
              maxValue: 0.1
              scaleType: UNIT_LINEAR_SCALE
            - parameterName: hidden_units
              type: CATEGORICAL
              categoricalValues: ["1024 512 256", "1024 512 128", "1024 256 128", "512 256 128", "1024 512 64", "1024 256 64", "512 256 64", "1024 128 64", "512 128 64", "256 128 64", "1024 512 32", "1024 256 32", "512 256 32", "1024 128 32", "512 128 32", "256 128 32", "1024 64 32", "512 64 32", "256 64 32", "128 64 32"]
            - parameterName: content_id_embedding_dimensions
              type: INTEGER
              minValue: 5
              maxValue: 250
              scaleType: UNIT_LOG_SCALE
            - parameterName: author_embedding_dimensions
              type: INTEGER
              minValue: 5
              maxValue: 30
              scaleType: UNIT_LINEAR_SCALE

Writing hyperparam.yaml


In [0]:
%%bash
OUTDIR=gs://${BUCKET}/hybrid_recommendation/hypertuning
JOBNAME=hybrid_recommendation_$(date -u +%y%m%d_%H%M%S)
echo $OUTDIR $REGION $JOBNAME
gsutil -m rm -rf $OUTDIR
gcloud ml-engine jobs submit training $JOBNAME \
    --region=$REGION \
    --module-name=trainer.task \
    --package-path=$(pwd)/hybrid_recommendations_module/trainer \
    --job-dir=$OUTDIR \
    --staging-bucket=gs://$BUCKET \
    --scale-tier=STANDARD_1 \
    --runtime-version=$TFVERSION \
    --config=hyperparam.yaml \
    -- \
    --bucket=${BUCKET} \
    --train_data_paths=gs://${BUCKET}/hybrid_recommendation/preproc/features/train.csv* \
    --eval_data_paths=gs://${BUCKET}/hybrid_recommendation/preproc/features/eval.csv* \
    --output_dir=${OUTDIR} \
    --batch_size=128 \
    --learning_rate=0.1 \
    --hidden_units="256 128 64" \
    --content_id_embedding_dimensions=10 \
    --author_embedding_dimensions=10 \
    --top_k=10 \
    --train_steps=1000 \
    --start_delay_secs=30 \
    --throttle_secs=30

gs://qwiklabs-gcp-d54a1ed2fe64d873/hybrid_recommendation/hypertuning us-central1 hybrid_recommendation_190508_140000
jobId: hybrid_recommendation_190508_140000
state: QUEUED


CommandException: 1 files/objects could not be removed.
Job [hybrid_recommendation_190508_140000] submitted successfully.
Your job is still active. You may view the status of your job with the command

  $ gcloud ml-engine jobs describe hybrid_recommendation_190508_140000

or continue streaming the logs with the command

  $ gcloud ml-engine jobs stream-logs hybrid_recommendation_190508_140000


Now that we know the best hyperparameters, run a big training job!

In [0]:
%%bash
OUTDIR=gs://${BUCKET}/hybrid_recommendation/big_trained_model
JOBNAME=hybrid_recommendation_$(date -u +%y%m%d_%H%M%S)
echo $OUTDIR $REGION $JOBNAME
gsutil -m rm -rf $OUTDIR
gcloud ml-engine jobs submit training $JOBNAME \
    --region=$REGION \
    --module-name=trainer.task \
    --package-path=$(pwd)/hybrid_recommendations_module/trainer \
    --job-dir=$OUTDIR \
    --staging-bucket=gs://$BUCKET \
    --scale-tier=STANDARD_1 \
    --runtime-version=$TFVERSION \
    -- \
    --bucket=${BUCKET} \
    --train_data_paths=gs://${BUCKET}/hybrid_recommendation/preproc/features/train.csv* \
    --eval_data_paths=gs://${BUCKET}/hybrid_recommendation/preproc/features/eval.csv* \
    --output_dir=${OUTDIR} \
    --batch_size=128 \
    --learning_rate=0.1 \
    --hidden_units="256 128 64" \
    --content_id_embedding_dimensions=10 \
    --author_embedding_dimensions=10 \
    --top_k=10 \
    --train_steps=10000 \
    --start_delay_secs=30 \
    --throttle_secs=30

gs://qwiklabs-gcp-d54a1ed2fe64d873/hybrid_recommendation/big_trained_model us-central1 hybrid_recommendation_190508_140004
jobId: hybrid_recommendation_190508_140004
state: QUEUED


CommandException: 1 files/objects could not be removed.
Job [hybrid_recommendation_190508_140004] submitted successfully.
Your job is still active. You may view the status of your job with the command

  $ gcloud ml-engine jobs describe hybrid_recommendation_190508_140004

or continue streaming the logs with the command

  $ gcloud ml-engine jobs stream-logs hybrid_recommendation_190508_140004
