<a href="https://colab.research.google.com/github/leoninekev/training-frcnn-google-ml-engine/blob/master/ml_engine_training_walkthrough.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
!ls

sample_data


## Overview

This tutorial shows how to train a neural network on AI Platform
using the Keras sequential API and how to serve predictions from that
model.

Keras is a high-level API for building and training deep learning models.
[tf.keras](https://www.tensorflow.org/guide/keras) is TensorFlow’s
implementation of this API.

The first two parts of the tutorial walk through training a model on Cloud
AI Platform using prewritten Keras code, deploying the trained model to
AI Platform, and serving online predictions from the deployed model.

The last part of the tutorial digs into the training code used for this model and ensuring it's compatible with AI Platform. To learn more about building
machine learning models in Keras more generally, read [TensorFlow's Keras
tutorials](https://www.tensorflow.org/tutorials/keras).

### Set up your GCP project

**The following steps are required, regardless of your notebook environment.**

1. [Select or create a GCP project.](https://console.cloud.google.com/cloud-resource-manager)

2. [Make sure that billing is enabled for your project.](https://cloud.google.com/billing/docs/how-to/modify-project)

3. [Enable the AI Platform ("Cloud Machine Learning Engine") and Compute Engine APIs.](https://console.cloud.google.com/flows/enableapi?apiid=ml.googleapis.com,compute_component)

4. Enter your project ID in the cell below. Then run the  cell to make sure the
Cloud SDK uses the right project for all the commands in this notebook.

**Note**: Jupyter runs lines prefixed with `!` as shell commands, and it interpolates Python variables prefixed with `$` into these commands.

In [2]:
PROJECT_ID = "nifty-episode-231612" #@param {type:"string"}
! gcloud config set project $PROJECT_ID

Updated property [core/project].


To take a quick anonymous survey, run:
  $ gcloud alpha survey



**If you are using Colab**, run the cell below and follow the instructions
when prompted to authenticate your account via oAuth.

**Otherwise**, follow these steps:

1. In the GCP Console, go to the [**Create service account key**
   page](https://console.cloud.google.com/apis/credentials/serviceaccountkey).

2. From the **Service account** drop-down list, select **New service account**.

3. In the **Service account name** field, enter a name.

4. From the **Role** drop-down list, select
   **Machine Learning Engine > AI Platform Admin** and
   **Storage > Storage Object Admin**.

5. Click *Create*. A JSON file that contains your key downloads to your
local environment.

6. Enter the path to your service account key as the
`GOOGLE_APPLICATION_CREDENTIALS` variable in the cell below and run the cell.

In [4]:
import sys

# If you are running this notebook in Colab, run this cell and follow the
# instructions to authenticate your GCP account. This provides access to your
# Cloud Storage bucket and lets you submit training jobs and prediction
# requests.

if 'google.colab' in sys.modules:
  from google.colab import auth as google_auth
  google_auth.authenticate_user()

# If you are running this notebook locally, replace the string below with the
# path to your service account key and run this cell to authenticate your GCP
# account.
else:
  %env GOOGLE_APPLICATION_CREDENTIALS ''


W0726 04:51:05.403162 140558765889408 lazy_loader.py:50] 
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.



In [5]:
!gcloud config list

[component_manager]
disable_update_check = True
[core]
account = quantumbisht@gmail.com
project = nifty-episode-231612

Your active configuration is: [default]


### Create a Cloud Storage bucket

**The following steps are required, regardless of your notebook environment.**

When you submit a training job using the Cloud SDK, you upload a Python package
containing your training code to a Cloud Storage bucket. AI Platform runs
the code from this package. In this tutorial, AI Platform also saves the
trained model that results from your job in the same bucket. You can then
create an AI Platform model version based on this output in order to serve
online predictions.

Set the name of your Cloud Storage bucket below. It must be unique across all
Cloud Storage buckets. 

You may also change the `REGION` variable, which is used for operations
throughout the rest of this notebook. Make sure to [choose a region where Cloud
AI Platform services are
available](https://cloud.google.com/ml-engine/docs/tensorflow/regions). You may
not use a Multi-Regional Storage bucket for training with AI Platform.

In [0]:
BUCKET_NAME = "nifty-episode-231612-mlengine" #@param {type:"string"}
REGION = "asia-east1" #@param {type:"string"}

**Finally, validate access to your Cloud Storage bucket by examining its contents:**


In [7]:
! gsutil ls -al gs://$BUCKET_NAME

 209581009  2019-06-24T12:45:15Z  gs://nifty-episode-231612-mlengine/flask_app_git_140619.zip#1561380315059208  metageneration=1
      1136  2019-03-14T10:52:42Z  gs://nifty-episode-231612-mlengine/img_gcloudpath.txt#1552560762918268  metageneration=1
       885  2019-05-06T12:32:48Z  gs://nifty-episode-231612-mlengine/my_job_filesconfig.pickle#1557145968875019  metageneration=1
     10925  2019-03-20T12:29:13Z  gs://nifty-episode-231612-mlengine/preprocessing_test.py#1553084953044119  metageneration=1
  87552713  2019-07-24T12:21:59Z  gs://nifty-episode-231612-mlengine/train_on_gcloud_bilkul_final.zip#1563970919338895  metageneration=1
     32636  2019-03-19T12:12:00Z  gs://nifty-episode-231612-mlengine/yeaah.jpg#1552997520104445  metageneration=1
                                 gs://nifty-episode-231612-mlengine/cloud_test_package/
                                 gs://nifty-episode-231612-mlengine/cloud_test_package_2/
                                 gs://nifty-episode-231612-mlen

##1. Training in AI Platform

This section of the tutorial walks you through submitting a training job to Cloud
AI Platform. This job runs sample code that uses Keras to train a deep neural
network on the United States Census data. It outputs the trained model as a
[TensorFlow SavedModel
directory](https://www.tensorflow.org/guide/saved_model#save_and_restore_models)
in your Cloud Storage bucket.

### Get training code and dependencies

Run the following cell to 
* First, download the training code.

* install Python dependencies needed to train the model locally. WBut hen you run the training job in AI Platform, dependencies are preinstalled based on the [runtime version](https://cloud.google.com/ml-engine/docs/tensorflow/runtime-version-list) you choose.
* change the notebook's working directory.

In [8]:
!git clone https://github.com/leoninekev/training-frcnn-google-ml-engine.git

! pip install -r requirements.txt

# Set the working directory to the sample code directory
%cd training-frcnn-google-ml-engine/move_to_cloudshell/

Cloning into 'training-frcnn-google-ml-engine'...
remote: Enumerating objects: 686, done.[K
remote: Counting objects: 100% (686/686), done.[K
remote: Compressing objects: 100% (349/349), done.[K
remote: Total 686 (delta 340), reused 672 (delta 331), pack-reused 0[K
Receiving objects: 100% (686/686), 35.21 MiB | 35.80 MiB/s, done.
Resolving deltas: 100% (340/340), done.
/content/training-frcnn-google-ml-engine/move_to_cloudshell


In [9]:
! ls -pR

.:
annotations.txt  setup.py  trainer/

./trainer:
config.py		    __init__.py     RoiPoolingConv.py	   vgg.py
data_augment.py		    losses.py	    simple_parser_pkl.py
data_generators.py	    resnet.py	    simple_parser_text.py
FixedBatchNormalization.py  roi_helpers.py  task.py


### Train your model using AI Platform

Next, submit a training job to AI Platform. This runs the training module
in the cloud and exports the training package and trained model to Cloud Storage.

Proceed as follows.

* Navigate to trainer/ directory to modify bucket path, model name, config name in task.py in accordance with your gcp service account. 

In [16]:
%cd trainer
!ls

/content/training-frcnn-google-ml-engine/move_to_cloudshell/trainer
config.py		    __init__.py     RoiPoolingConv.py	   vgg.py
data_augment.py		    losses.py	    simple_parser_pkl.py
data_generators.py	    resnet.py	    simple_parser_text.py
FixedBatchNormalization.py  roi_helpers.py  task.py


*  Run **%pycat task.py** (this draws a pop displayinf content of task.py)
* Copy all code to local Python IDE and edit the default arguments values in parsers for
 * **--path**
 * **--config_filename** 
 * **--output_weight_path** 
 * **--bucket_path**

  and name for **model_weights** before saving.


In [0]:
%pycat task.py
#copy the code from popup, paste it to a python IDLE locally, edit it and again copy the whole post edit

* Copy the edited code from local IDE in following colab cell beneath the command: **%%writefile task.py**
and run the cell - The new edits will be overwritten to a new task.py file(you may also save a new task_file.py, and later delete the older task.py)

In [18]:
%%writefile task.py
from __future__ import division
import random
import pprint
import sys
import time
import numpy as np
from optparse import OptionParser
import pickle

from tensorflow.python.lib.io import file_io

from keras import backend as K
from keras.optimizers import Adam, SGD, RMSprop
from keras.layers import Input
from keras.models import Model

import config, data_generators
import losses as losses
import roi_helpers
from keras.utils import generic_utils

sys.setrecursionlimit(40000)

parser = OptionParser()

parser.add_option("-p", "--path", dest="train_path", help="Path to training data(annotation.txt file).",default="gs://nifty-episode-231612-mlengine/training_data_and_annotations_for_cloud_060519/annotations.txt")# /data.pickle -- for pickled annotations 
parser.add_option("-o", "--parser", dest="parser", help="Parser to use. One of simple_text or simple_pickle",
                  default="simple")# simple_pick --for simple_parser_pkl

parser.add_option("-n", "--num_rois", type="int", dest="num_rois", help="Number of RoIs to process at once.", default=32)
parser.add_option("--network", dest="network", help="Base network to use. Supports vgg or resnet50.", default='resnet50')
parser.add_option("--hf", dest="horizontal_flips", help="Augment with horizontal flips in training. (Default=false).", action="store_true", default=False)
parser.add_option("--vf", dest="vertical_flips", help="Augment with vertical flips in training. (Default=false).", action="store_true", default=False)
parser.add_option("--rot", "--rot_90", dest="rot_90", help="Augment with 90 degree rotations in training. (Default=false).",
                                  action="store_true", default=False)
parser.add_option("--num_epochs", type="int", dest="num_epochs", help="Number of epochs.", default=1)# deafult=1 --for test
parser.add_option("--config_filename", dest="config_filename", help=
                                "Location to store all the metadata related to the training (to be used when testing).",
                                default="config_new.pickle")
parser.add_option("--output_weight_path", dest="output_weight_path", help="Output path for weights.",default='gs://nifty-episode-231612-mlengine/my_job_files/')
parser.add_option("--input_weight_path", dest="input_weight_path", help="Input path for weights. If not specified, will try to load default weights provided by keras.",
                  default='https://github.com/fchollet/deep-learning-models/releases/download/v0.2/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5')
parser.add_option("--bucket_path", dest="bucket_path", help="bucket path for stroing weights & configs",  default='gs://nifty-episode-231612-mlengine/my_job_files/')

(options, args) = parser.parse_args()

if not options.train_path:# if filename is not given
        parser.error('Error: path to training data must be specified. Pass --path to command line')

if options.parser == 'simple':
        from simple_parser_text import get_data
elif options.parser == 'simple_pick':
        from simple_parser_pkl import get_data
else:
        raise ValueError("Command line option parser must be one of 'pascal_voc' or 'simple'")

# pass the settings from the command line, and persist them in the config object
C = config.Config()

C.use_horizontal_flips = bool(options.horizontal_flips)
C.use_vertical_flips = bool(options.vertical_flips)
C.rot_90 = bool(options.rot_90)

C.model_path = options.output_weight_path
C.num_rois = int(options.num_rois)

if options.network == 'vgg':
        C.network = 'vgg'
        from keras_frcnn import vgg as nn
elif options.network == 'resnet50':
        from keras_frcnn import resnet as nn
        C.network = 'resnet50'
else:
        print('Not a valid model')
        raise ValueError


# check if weight path was passed via command line
if options.input_weight_path:
        C.base_net_weights = options.input_weight_path
else:
        # set the path to weights based on backend and model
        C.base_net_weights = nn.get_weight_path()# 'resnet50_weights_th_dim_ordering_th_kernels_notop.h5'

all_imgs, classes_count, class_mapping = get_data(options.train_path)

if 'bg' not in classes_count:
        classes_count['bg'] = 0
        class_mapping['bg'] = len(class_mapping)

C.class_mapping = class_mapping

inv_map = {v: k for k, v in class_mapping.items()}

print('Training images per class:')
pprint.pprint(classes_count)
print('Num classes (including bg) = {}'.format(len(classes_count)))

config_output_filename = options.bucket_path + options.config_filename# gs://input-your-bucket-name/train_on_gcloud/my_job_files/config.pickle

def new_open(name, mode, buffering=-1):# to open & load files from gcloud storage
        return file_io.FileIO(name, mode)


with new_open(config_output_filename, 'wb') as config_f:
        pickle.dump(C,config_f, protocol=2)# dumps config.pickle(compatible for python 2) in gcloud bucket
        print('Config has been written to {}, and can be loaded when testing to ensure correct results'.format(config_output_filename))

random.shuffle(all_imgs)

num_imgs = len(all_imgs)

train_imgs = [s for s in all_imgs if s['imageset'] == 'trainval']
val_imgs = [s for s in all_imgs if s['imageset'] == 'test']

print('Num train samples {}'.format(len(train_imgs)))
print('Num val samples {}'.format(len(val_imgs)))


data_gen_train = data_generators.get_anchor_gt(train_imgs, classes_count, C, nn.get_img_output_length, K.image_dim_ordering(), mode='train')
data_gen_val = data_generators.get_anchor_gt(val_imgs, classes_count, C, nn.get_img_output_length,K.image_dim_ordering(), mode='val')

if K.image_dim_ordering() == 'th':
        input_shape_img = (3, None, None)
else:
        input_shape_img = (None, None, 3)

img_input = Input(shape=input_shape_img)
roi_input = Input(shape=(None, 4))

# define the base network (resnet here, can be VGG, Inception, etc)
shared_layers = nn.nn_base(img_input, trainable=True)

# define the RPN, built on the base layers
num_anchors = len(C.anchor_box_scales) * len(C.anchor_box_ratios)
rpn = nn.rpn(shared_layers, num_anchors)

classifier = nn.classifier(shared_layers, roi_input, C.num_rois, nb_classes=len(classes_count), trainable=True)

model_rpn = Model(img_input, rpn[:2])
model_classifier = Model([img_input, roi_input], classifier)

# this is a model that holds both the RPN and the classifier, used to load/save weights for the models
model_all = Model([img_input, roi_input], rpn[:2] + classifier)

try:
        print('loading weights from {}'.format(C.base_net_weights))

        weights_path = get_file('base_weights.h5',C.base_net_weights)# downloading and adding weight paths
        model_rpn.load_weights(weights_path)
        model_classifier.load_weights(weights_path)
        print('weights loaded.')
except:
        print('Could not load pretrained model weights. Weights can be found in the keras application folder \
                https://github.com/fchollet/keras/tree/master/keras/applications')

optimizer = Adam(lr=1e-5)
optimizer_classifier = Adam(lr=1e-5)
model_rpn.compile(optimizer=optimizer, loss=[losses.rpn_loss_cls(num_anchors), losses.rpn_loss_regr(num_anchors)])
model_classifier.compile(optimizer=optimizer_classifier, loss=[losses.class_loss_cls, losses.class_loss_regr(len(classes_count)-1)], metrics={'dense_class_{}'.format(len(classes_count)): 'accuracy'})
model_all.compile(optimizer='sgd', loss='mae')

epoch_length = 1000
num_epochs = int(options.num_epochs)
iter_num = 0

losses = np.zeros((epoch_length, 5))
rpn_accuracy_rpn_monitor = []
rpn_accuracy_for_epoch = []
start_time = time.time()

best_loss = np.Inf

class_mapping_inv = {v: k for k, v in class_mapping.items()}
print('Starting training')

vis = True

for epoch_num in range(num_epochs):

        progbar = generic_utils.Progbar(epoch_length)
        print('Epoch {}/{}'.format(epoch_num + 1, num_epochs))

        while True:
                try:

                        if len(rpn_accuracy_rpn_monitor) == epoch_length and C.verbose:
                                mean_overlapping_bboxes = float(sum(rpn_accuracy_rpn_monitor))/len(rpn_accuracy_rpn_monitor)
                                rpn_accuracy_rpn_monitor = []
                                print('Average number of overlapping bounding boxes from RPN = {} for {} previous iterations'.format(mean_overlapping_bboxes, epoch_length))
                                if mean_overlapping_bboxes == 0:
                                        print('RPN is not producing bounding boxes that overlap the ground truth boxes. Check RPN settings or keep training.')

                        X, Y, img_data = next(data_gen_train)

                        loss_rpn = model_rpn.train_on_batch(X, Y)

                        P_rpn = model_rpn.predict_on_batch(X)

                        R = roi_helpers.rpn_to_roi(P_rpn[0], P_rpn[1], C, K.image_dim_ordering(), use_regr=True, overlap_thresh=0.7, max_boxes=300)
                        # note: calc_iou converts from (x1,y1,x2,y2) to (x,y,w,h) format
                        X2, Y1, Y2, IouS = roi_helpers.calc_iou(R, img_data, C, class_mapping)

                        if X2 is None:
                                rpn_accuracy_rpn_monitor.append(0)
                                rpn_accuracy_for_epoch.append(0)
                                continue

                        neg_samples = np.where(Y1[0, :, -1] == 1)
                        pos_samples = np.where(Y1[0, :, -1] == 0)

                        if len(neg_samples) > 0:
                                neg_samples = neg_samples[0]
                        else:
                                neg_samples = []

                        if len(pos_samples) > 0:
                                pos_samples = pos_samples[0]
                        else:
                                pos_samples = []
                        
                        rpn_accuracy_rpn_monitor.append(len(pos_samples))
                        rpn_accuracy_for_epoch.append((len(pos_samples)))

                        if C.num_rois > 1:
                                if len(pos_samples) < C.num_rois//2:
                                        selected_pos_samples = pos_samples.tolist()
                                else:
                                        selected_pos_samples = np.random.choice(pos_samples, C.num_rois//2, replace=False).tolist()
                                try:
                                        selected_neg_samples = np.random.choice(neg_samples, C.num_rois - len(selected_pos_samples), replace=False).tolist()
                                except:
                                        selected_neg_samples = np.random.choice(neg_samples, C.num_rois - len(selected_pos_samples), replace=True).tolist()

                                sel_samples = selected_pos_samples + selected_neg_samples
                        else:
                                # in the extreme case where num_rois = 1, we pick a random pos or neg sample
                                selected_pos_samples = pos_samples.tolist()
                                selected_neg_samples = neg_samples.tolist()
                                if np.random.randint(0, 2):
                                        sel_samples = random.choice(neg_samples)
                                else:
                                        sel_samples = random.choice(pos_samples)

                        loss_class = model_classifier.train_on_batch([X, X2[:, sel_samples, :]], [Y1[:, sel_samples, :], Y2[:, sel_samples, :]])

                        losses[iter_num, 0] = loss_rpn[1]
                        losses[iter_num, 1] = loss_rpn[2]

                        losses[iter_num, 2] = loss_class[1]
                        losses[iter_num, 3] = loss_class[2]
                        losses[iter_num, 4] = loss_class[3]

                        progbar.update(iter_num+1, [('rpn_cls', losses[iter_num, 0]), ('rpn_regr', losses[iter_num, 1]),
                                                                          ('detector_cls', losses[iter_num, 2]), ('detector_regr', losses[iter_num, 3])])

                        iter_num += 1
                        
                        if iter_num == epoch_length:
                                loss_rpn_cls = np.mean(losses[:, 0])
                                loss_rpn_regr = np.mean(losses[:, 1])
                                loss_class_cls = np.mean(losses[:, 2])
                                loss_class_regr = np.mean(losses[:, 3])
                                class_acc = np.mean(losses[:, 4])

                                mean_overlapping_bboxes = float(sum(rpn_accuracy_for_epoch)) / len(rpn_accuracy_for_epoch)
                                rpn_accuracy_for_epoch = []

                                if C.verbose:
                                        print('Mean number of bounding boxes from RPN overlapping ground truth boxes: {}'.format(mean_overlapping_bboxes))
                                        print('Classifier accuracy for bounding boxes from RPN: {}'.format(class_acc))
                                        print('Loss RPN classifier: {}'.format(loss_rpn_cls))
                                        print('Loss RPN regression: {}'.format(loss_rpn_regr))
                                        print('Loss Detector classifier: {}'.format(loss_class_cls))
                                        print('Loss Detector regression: {}'.format(loss_class_regr))
                                        print('Elapsed time: {}'.format(time.time() - start_time))

                                curr_loss = loss_rpn_cls + loss_rpn_regr + loss_class_cls + loss_class_regr
                                iter_num = 0
                                start_time = time.time()

                                if curr_loss < best_loss:
                                        if C.verbose:
                                                print('Total loss decreased from {} to {}, saving weights'.format(best_loss,curr_loss))
                                        best_loss = curr_loss
                                        model_weights= 'model_frcnn_new.hdf5'
                                        model_all.save_weights(model_weights)

                                        with new_open(model_weights, mode='r') as infile:# to write hdf5 file to gs://input-your-bucket-name/train_on_gcloud/my_job_files/
                                                with new_open(C.model_path + model_weights, mode='w+') as outfile:
                                                        outfile.write(infile.read())

                                break

                except Exception as e:
                        print('Exception: {}'.format(e))
                        continue

print('Training complete, exiting.')

Overwriting task.py


* Exit out of **trainer/**, to the directory containing **setup.py** for initiating dependency packaging  before training on gcloud.

In [19]:
%cd ..
!ls

/content/training-frcnn-google-ml-engine/move_to_cloudshell
annotations.txt  setup.py  trainer


* Define a JOB_NAME

In [0]:
JOB_NAME = 'test_job_gcloudColab'

Run the following cell to package the **`trainer/`** directory:
* It uploads the package to specified **gs://$BUCKET_NAME/JOB_NAME/**, and instruct AI Platform to run the **`trainer.task`** module from that package.

* The **`--stream-logs`** flag lets you view training logs in the cell below (One can
also view logs and other job details in the GCP Console, if you've enbaled **Stackdriver logging service**.)

For staging to package and further training:
* BUCKET_NAME = 'nifty-episode-231612-mlengine'
* JOB_NAME = 'test_job_GoogleColab'
* REGION= 'asia-east1'
* package-path= trainer/
* modele-name = trainer.task

In [23]:
!ls trainer/

config.py		    __init__.py     RoiPoolingConv.py	   vgg.py
data_augment.py		    losses.py	    simple_parser_pkl.py
data_generators.py	    resnet.py	    simple_parser_text.py
FixedBatchNormalization.py  roi_helpers.py  task.py


In [25]:
! gcloud ai-platform jobs submit training $JOB_NAME --package-path trainer/ --module-name trainer.task --region $REGION  --scale-tier=CUSTOM --master-machine-type=standard_gpu --staging-bucket gs://$BUCKET_NAME --stream-logs

Job [test_job_gcloudColab] submitted successfully.
INFO	2019-07-26 05:53:21 +0000	service		Validating job requirements...
INFO	2019-07-26 05:53:21 +0000	service		Job creation request has been successfully validated.
INFO	2019-07-26 05:53:22 +0000	service		Waiting for job to be provisioned.
INFO	2019-07-26 05:53:22 +0000	service		Job test_job_gcloudColab is queued.
INFO	2019-07-26 05:53:24 +0000	service		Waiting for training program to start.
INFO	2019-07-26 05:55:20 +0000	master-replica-0		Running task with arguments: --cluster={"master": ["127.0.0.1:2222"]} --task={"type": "master", "index": 0} --job={  "scale_tier": "CUSTOM",  "master_type": "standard_gpu",  "package_uris": ["gs://nifty-episode-231612-mlengine/test_job_gcloudColab/48b4b2e4a8eb7b77540ebec80d974c8d1f1a217b6c1e4f02297bb524c2d11c4b/frcnn_trainer-0.1.tar.gz"],  "python_module": "trainer.task",  "region": "asia-east1",  "run_on_raw_vm": true}
INFO	2019-07-26 05:55:40 +0000	master-replica-0		Running module trainer.task.
INF

## 2. Online predictions in AI Platform

### Create model and version resources in AI Platform

To serve online predictions using the model you trained and exported in Part 1,
create a *model* resource in AI Platform and a *version* resource
within it. The version resource is what actually uses your trained model to
serve predictions. This structure lets you adjust and retrain your model many times and
organize all the versions together in AI Platform. Learn more about [models
and
versions](https://cloud.google.com/ml-engine/docs/tensorflow/projects-models-versions-jobs).


* First, Define a name and create the model resource;
Also Enable Online prediction logging, to stream logs that contain the **stderr and stdout streams** from your prediction nodes, and can be useful for debugging during version creation and inferencing.

In [0]:
MODEL_NAME = "food_predictor"

! gcloud beta ai-platform models create $MODEL_NAME \
  --regions $REGION --enable-console-logging

Next, create the model version. The training job is exported to a timestamped directory in your Cloud Storage bucket. AI Platform uses this directory to create a model version.
Learn more about [SavedModel and
AI Platform](https://cloud.google.com/ml-engine/docs/tensorflow/deploying-models).

You may be able to find the path to this directory in your training job's logs.
Look for a line like:

```
Model exported to:  gs://<your-bucket-name>/keras-job-dir/keras_export/1545439782
```

Execute the following command to identify your SavedModel directory and use it to create a model version resource:

In [0]:
python setup.py sdist --formats=gztar

gsutil cp dist/test_code_new_model_beta5-0.1.tar.gz gs://nifty-episode-231612-mlengine/cloud_test_package_2/cloud_test_package_v5

MODEL_NAME="FoodPredictor_060619"
VERSION_NAME='v5_a'
REGION=asia-east1


gcloud beta ai-platform versions create $VERSION_NAME --model $MODEL_NAME
--python-version 3.5 --runtime-version 1.5 --machine-type mls1-c4-m2
--origin gs://nifty-episode-231612-mlengine/cloud_test_package_2/cloud_test_package_v5
--package-uris gs://nifty-episode-231612-mlengine/cloud_test_package_2/cloud_test_package_v5/test_code_new_model_beta5-0.1.tar.gz
--prediction-class predictor.MyPredictor



[m[1mNAME[m
    gcloud - manage Google Cloud Platform resources and developer workflow

[m[1mSYNOPSIS[m
    [1mgcloud[m [4mGROUP[m | [4mCOMMAND[m [[1m--account[m=[4mACCOUNT[m]
        [[1m--billing-project[m=[4mBILLING_PROJECT[m] [[1m--configuration[m=[4mCONFIGURATION[m]
        [[1m--flags-file[m=[4mYAML_FILE[m] [[1m--flatten[m=[[4mKEY[m,...]] [[1m--format[m=[4mFORMAT[m]
        [[1m--help[m] [[1m--project[m=[4mPROJECT_ID[m] [[1m--quiet[m, [1m-q[m]
        [[1m--impersonate-service-account[m=[4mSERVICE_ACCOUNT_EMAIL[m] [[1m--log-http[m]
        [[1m--trace-token[m=[4mTRACE_TOKEN[m] [[1m--no-user-output-enabled[m]

[m[1mDESCRIPTION[m
    The [1mgcloud[m CLI manages authentication, local configuration, developer
    workflow, and interactions with the Google Cloud Platform APIs.

[m[1mGLOBAL FLAGS[m
     [1m--account[m=[4mACCOUNT[m
        Google Cloud Platform user account to use for invocation. Overrides the
       