<h1> Structured Machine Learning using TensorFlow, Google Cloud Datalab and Cloud ML - 1/2</h1>
<hr />
<b>This notebook demonstrates a process to train, evaluate and export an ML model to Google Cloud Storage. It uses a pre-built machine learning model to predict Length of Stay in emergency department (ED) and inpatient care settings. This is step 1 of 2.</b>
<h3>
<br />
<ol>
<li> Setup Environment </li> <br />
<li> Label generation - Generate Labels in TFRecord format </li> <br />
<li> Generate TFSequenceExamples </li> <br />
<li> Train and Evaluate Machine Learning Model </li> <br />
</ol></h3>
<hr />

<h2> 1. Setup environment</h2>
<ul>
    <li>Initialize the variables for your environment.</li>
    <li>Before executing the rest of the cells in this notebook, change the values of the following: <br />
        <b>1. GCP_PROJECT </b> <br />
        <b>2. GCS_BUCKET </b> <br />
        <b>3. GCS_REGION </b>
    </li>
</ul>

In [22]:
import os
GCP_PROJECT = 'dp-workspace'
GCS_BUCKET = 'gs://cluster19-bkt'
GCS_REGION = 'us-central1'
os.putenv("REGION", GCS_REGION)
LABELS_JOB = 'bundlesTolabels'
SEQEX_JOB = 'gen_seqex'
STAGING_LOCATION = GCS_BUCKET+'/staging'
TEMP_LOCATION = GCS_BUCKET+'/temp'
RUNNER = 'DirectRunner'
TF_RECORD_BUNDLES = GCS_BUCKET+'/synthea/bundles/bundles*'
os.putenv("BUNDLES_IN_GCS", TF_RECORD_BUNDLES)
LABELS_PATH = GCS_BUCKET+'/synthea/train/label'
TF_RECORD_LABELS = GCS_BUCKET+'/synthea/train/label-00000-of-00001.tfrecords'
os.putenv("LABELS_IN_GCS", TF_RECORD_LABELS)
SEQEX_PATH = GCS_BUCKET+'/synthea/train/seqex'
TF_RECORD_SEQEX = GCS_BUCKET+'/synthea/train/seqex*'
os.putenv("SEQEX_IN_GCS", TF_RECORD_SEQEX)
MODEL_PATH = GCS_BUCKET+'/synthea/model/'
os.putenv("MODEL_IN_GCS", MODEL_PATH+"*")
SAVED_MODEL_PATH = MODEL_PATH + 'export'
os.putenv("SAVED_MODEL_IN_GCS", SAVED_MODEL_PATH+"*")
TRAINING_DATASET = GCS_BUCKET+'/synthea/train/seqex-00000-of-00003.tfrecords'
VALIDATION_DATASET = GCS_BUCKET+'/synthea/train/seqex-00001-of-00003.tfrecords'
SERVING_DATASET = GCS_BUCKET+'/synthea/train/seqex-00002-of-00003.tfrecords'
os.putenv("SERV_DS", SERVING_DATASET)
os.putenv("SERV_LOC", GCS_BUCKET+"/synthea/serv/")
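
One caveat about the cell above: `os.putenv` changes the environment that child processes such as `%bash` cells inherit, but Python's documentation notes that it does not update `os.environ`, so later Python code cannot read the values back. A minimal sketch of an alternative, assuming you want the values visible in both places, is to assign through `os.environ`. The helper name `export_settings` and the `DEMO_*` variable names are illustrative and not part of this notebook.

```python
import os
import subprocess
import sys

def export_settings(settings):
    """Export each name/value pair so Python and child processes both see it."""
    for name, value in settings.items():
        # Assigning to os.environ also updates the real process environment.
        os.environ[name] = value

# Hypothetical values, for illustration only.
export_settings({"DEMO_REGION": "us-central1", "DEMO_BUCKET": "gs://example-bucket"})

# A child process inherits the variable, much as a %bash cell would.
child_output = subprocess.check_output(
    [sys.executable, "-c", "import os; print(os.environ['DEMO_REGION'])"]
).decode().strip()
```

After this runs, `os.environ["DEMO_REGION"]` and `child_output` hold the same string.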

<b>Import dependencies. </b>

In [23]:
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.options.pipeline_options import GoogleCloudOptions
from apache_beam.options.pipeline_options import StandardOptions
import apache_beam as beam
from tensorflow.core.example import example_pb2
import tensorflow as tf
import time

from proto.stu3 import google_extensions_pb2
from proto.stu3 import resources_pb2
from proto.stu3 import version_config_pb2

from google.protobuf import text_format
from py.google.fhir.labels import label
from py.google.fhir.labels import bundle_to_label
from py.google.fhir.seqex import bundle_to_seqex
from py.google.fhir.models import model

<b>Optionally, enable logging for debugging.</b>

In [24]:
import logging
logger = logging.getLogger()
#logger.setLevel(logging.INFO)
logger.setLevel(logging.ERROR)

<h2> 2. Label generation - Generate Labels in TFRecord format </h2>
<ul>
    <li>The next few cells generate labels in TFRecord format.</li>
    <li>Bundles in TFRecord format have already been generated from Synthetic FHIR data</li>
    <li>Bundles will be used as inputs and are stored in Google Cloud Storage</li>
    <li>Output labels will also be stored in Google Cloud Storage </li>
</ul>

<b>2a. Let's examine the GCS bucket that holds the bundles in TFRecord format.</b>

In [25]:
%bash
gsutil ls -l ${BUNDLES_IN_GCS}

  40287400  2019-03-04T21:57:54Z  gs://cluster19-bkt/synthea/bundles/bundles-00001-of-00010.tfrecords
  32071078  2019-03-02T02:26:22Z  gs://cluster19-bkt/synthea/bundles/bundles-00002-of-00010.tfrecords
  34101053  2019-03-02T02:26:22Z  gs://cluster19-bkt/synthea/bundles/bundles-00003-of-00010.tfrecords
  34407672  2019-03-02T02:26:22Z  gs://cluster19-bkt/synthea/bundles/bundles-00004-of-00010.tfrecords
  48389070  2019-03-02T02:26:22Z  gs://cluster19-bkt/synthea/bundles/bundles-00005-of-00010.tfrecords
  29202755  2019-03-02T02:26:23Z  gs://cluster19-bkt/synthea/bundles/bundles-00006-of-00010.tfrecords
  32379919  2019-03-02T02:26:23Z  gs://cluster19-bkt/synthea/bundles/bundles-00007-of-00010.tfrecords
  46919280  2019-03-02T02:26:23Z  gs://cluster19-bkt/synthea/bundles/bundles-00008-of-00010.tfrecords
  47418405  2019-03-04T16:33:36Z  gs://cluster19-bkt/synthea/bundles/bundles-00009-of-00010.tfrecords
TOTAL: 9 objects, 345176632 bytes (329.19 MiB)
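
The listing reports 9 objects, yet the file names follow an `-of-00010` pattern and no `bundles-00000-of-00010` entry appears. Whether that shard was removed on purpose is not stated here. The sketch below is one way to spot such gaps; `missing_shards` is a hypothetical helper, and it assumes the `-SSSSS-of-NNNNN` naming seen in the listing.

```python
import re

SHARD_RE = re.compile(r"-(\d{5})-of-(\d{5})")

def missing_shards(paths):
    """Return the shard indices that a '-SSSSS-of-NNNNN' file set lacks."""
    seen = set()
    total = None
    for path in paths:
        match = SHARD_RE.search(path)
        if match is None:
            continue  # Not a sharded file name; ignore it.
        seen.add(int(match.group(1)))
        total = int(match.group(2))
    if total is None:
        return []
    return [i for i in range(total) if i not in seen]

# Object names as they appear in the listing above (shards 00001 through 00009).
listed = [
    "gs://cluster19-bkt/synthea/bundles/bundles-%05d-of-00010.tfrecords" % i
    for i in range(1, 10)
]
gaps = missing_shards(listed)
```

For the names in this listing, `gaps` contains only index 0.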


<b>2b. Delete labels generated from previous runs. </b>

In [26]:
%bash
gsutil rm ${LABELS_IN_GCS}

Removing gs://cluster19-bkt/synthea/train/label-00000-of-00001.tfrecords...
/ [1 objects]                                                                   
Operation completed over 1 objects.                                              


<b>2c. Set options needed to initialize the pipeline. </b>

In [6]:
options = PipelineOptions()
google_cloud_options = options.view_as(GoogleCloudOptions)
google_cloud_options.project = GCP_PROJECT
google_cloud_options.job_name = LABELS_JOB
google_cloud_options.staging_location = STAGING_LOCATION
google_cloud_options.temp_location = TEMP_LOCATION
options.view_as(StandardOptions).runner = RUNNER

<b>2d. Initialize the pipeline to generate labels. </b>

In [7]:
p = beam.Pipeline(options=options)

bundles = p | 'read' >> beam.io.ReadFromTFRecord(
    TF_RECORD_BUNDLES, coder=beam.coders.ProtoCoder(resources_pb2.Bundle))

labels = bundles | 'BundleToLabel' >> beam.ParDo(
    bundle_to_label.LengthOfStayRangeLabelAt24HoursFn(for_synthea=True))

_ = labels | beam.io.WriteToTFRecord(
    LABELS_PATH,
    coder=beam.coders.ProtoCoder(google_extensions_pb2.EventLabel),
    file_name_suffix='.tfrecords')

Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.
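
The `LengthOfStayRangeLabelAt24HoursFn` transform produces the length-of-stay labels, and section 4 uses the class names `less_or_equal_3`, `3_7`, `7_14` and `above_14`. The exact boundary handling lives in the `bundle_to_label` module and is not shown in this notebook. As a rough illustration only, the sketch below assumes the boundaries are in days and that each upper bound is inclusive; `length_of_stay_class` is a hypothetical helper, not the library's function.

```python
from datetime import datetime

def length_of_stay_class(admit, discharge):
    """Map an encounter's duration to one of four range classes.

    Assumption: boundaries are in days and upper bounds are inclusive.
    """
    days = (discharge - admit).total_seconds() / 86400.0
    if days <= 3:
        return "less_or_equal_3"
    if days <= 7:
        return "3_7"
    if days <= 14:
        return "7_14"
    return "above_14"

# A made-up 5-day stay.
example_class = length_of_stay_class(
    datetime(2019, 3, 1, 8, 0), datetime(2019, 3, 6, 8, 0))
```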



<b>2e. Run the pipeline to generate labels. </b>

In [8]:
logger.setLevel(logging.INFO)
start = time.time()
p.run().wait_until_finish()
end = time.time()
print(end-start)

I0306 22:16:00.090907 139695413741312 fn_api_runner.py:912] Running ((ref_AppliedPTransform_WriteToTFRecord/Write/WriteImpl/DoOnce/Read_9)+((ref_AppliedPTransform_WriteToTFRecord/Write/WriteImpl/InitializeWrite_10)+(ref_PCollection_PCollection_4/Write)))+(ref_PCollection_PCollection_3/Write)
I0306 22:16:00.101476 139695413741312 bundle_processor.py:291] start <DataOutputOperation ref_PCollection_PCollection_3/Write >
I0306 22:16:00.104068 139695413741312 bundle_processor.py:291] start <DataOutputOperation ref_PCollection_PCollection_4/Write >
I0306 22:16:00.106225 139695413741312 bundle_processor.py:291] start <DoOperation WriteToTFRecord/Write/WriteImpl/InitializeWrite output_tags=['out']>
I0306 22:16:00.111064 139695413741312 bundle_processor.py:291] start <ReadOperation WriteToTFRecord/Write/WriteImpl/DoOnce/Read source=SourceBundle(weight=1.0, source=<apache_beam.transforms.create_source._CreateSource object at 0x7f0d10b6d410>, start_position=None, stop_position=None)>
I0306 22:16:

I0306 22:16:28.931315 139695413741312 bundle_processor.py:303] finish <DataInputOperation WriteToTFRecord/Write/WriteImpl/GroupByKey/Read receivers=[ConsumerSet[WriteToTFRecord/Write/WriteImpl/GroupByKey/Read.out0, coder=WindowedValueCoder[TupleCoder[LengthPrefixCoder[FastPrimitivesCoder], IterableCoder[LengthPrefixCoder[FastPrimitivesCoder]]]], len(consumers)=1]]>
I0306 22:16:28.933068 139695413741312 bundle_processor.py:303] finish <DoOperation WriteToTFRecord/Write/WriteImpl/Extract output_tags=['out'], receivers=[ConsumerSet[WriteToTFRecord/Write/WriteImpl/Extract.out0, coder=WindowedValueCoder[LengthPrefixCoder[FastPrimitivesCoder]], len(consumers)=1]]>
I0306 22:16:28.934916 139695413741312 bundle_processor.py:303] finish <DataOutputOperation ref_PCollection_PCollection_11/Write >
I0306 22:16:28.939438 139695413741312 fn_api_runner.py:912] Running ((ref_PCollection_PCollection_3/Read)+(ref_AppliedPTransform_WriteToTFRecord/Write/WriteImpl/PreFinalize_19))+(ref_PCollection_PCollect

30.2971529961
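
The start/stop timing pattern used above recurs later in the notebook. One way to package it, sketched here with a hypothetical `timed` helper, is a context manager that records the elapsed wall-clock time even if the wrapped code raises.

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(results, name):
    """Store the wall-clock seconds spent inside the block in results[name]."""
    start = time.time()
    try:
        yield
    finally:
        results[name] = time.time() - start

timings = {}
with timed(timings, "demo"):
    time.sleep(0.01)  # Stand-in for p.run().wait_until_finish()
```

In the notebook, the body of the `with` block would be the pipeline run, and `timings` would collect one entry per pipeline.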


<b>2f. Let's examine the location in GCS where the generated labels have been stored. </b>

In [10]:
%bash
gsutil ls -l ${LABELS_IN_GCS}

     92641  2019-03-06T22:16:29Z  gs://cluster19-bkt/synthea/train/label-00000-of-00001.tfrecords
TOTAL: 1 objects, 92641 bytes (90.47 KiB)


<h2> 3. Generate TFSequenceExamples</h2>
<ul><b>
    <li>The next few cells generate TensorFlow sequence examples and save them to Google Cloud Storage (GCS) for later use.</li>
    <li>Bundles in TFRecord format have already been generated from Synthetic FHIR data.</li>
    <li>Bundles will be used as inputs and are stored in GCS</li>
    <li>Generated sequence examples will also be stored in GCS</li></b>
</ul>

<b>3a. Delete Sequence Examples generated from previous runs and regenerate. </b>

In [11]:
%bash
gsutil rm ${SEQEX_IN_GCS}

Removing gs://cluster19-bkt/synthea/train/seqex-00000-of-00003.tfrecords...
/ [1 objects]
Removing gs://cluster19-bkt/synthea/train/seqex-00001-of-00003.tfrecords...
/ [2 objects]
Removing gs://cluster19-bkt/synthea/train/seqex-00002-of-00003.tfrecords...
/ [3 objects]
Operation completed over 3 objects.                                              


<b>3b. Initialize the pipeline to generate Sequence Examples. </b>

In [12]:
google_cloud_options.job_name = SEQEX_JOB
p1 = beam.Pipeline(options=options)

def _get_version_config(version_config_path):
  with open(version_config_path) as f:
    return text_format.Parse(f.read(), version_config_pb2.VersionConfig())

version_config = _get_version_config("/usr/local/fhir/proto/stu3/version_config.textproto")

keyed_bundles = ( 
    p1 
    | 'readBundles' >> beam.io.ReadFromTFRecord(
        TF_RECORD_BUNDLES, coder=beam.coders.ProtoCoder(resources_pb2.Bundle))
    | 'KeyBundlesByPatientId' >> beam.ParDo(
        bundle_to_seqex.KeyBundleByPatientIdFn()))

event_labels = ( 
    p1 | 'readEventLabels' >> beam.io.ReadFromTFRecord(
        TF_RECORD_LABELS,
        coder=beam.coders.ProtoCoder(google_extensions_pb2.EventLabel)))

keyed_event_labels = bundle_to_seqex.CreateTriggerLabelsPairLists(
    event_labels)

bundles_and_labels = bundle_to_seqex.CreateBundleAndLabels(
    keyed_bundles, keyed_event_labels)
_ = ( 
    bundles_and_labels
    | 'GenerateSeqex' >> beam.ParDo(
        bundle_to_seqex.BundleAndLabelsToSeqexDoFn(
            version_config=version_config,
            enable_attribution=False,
            generate_sequence_label=False))
    | 'WriteSeqex' >> beam.io.WriteToTFRecord(
        SEQEX_PATH,
        coder=beam.coders.ProtoCoder(example_pb2.SequenceExample),
        file_name_suffix='.tfrecords',
        num_shards=3))

I0306 22:16:59.596043 139695413741312 gcsio.py:446] Starting the size estimation of the input
I0306 22:16:59.600212 139695413741312 client.py:614] Attempting refresh to obtain initial access_token
I0306 22:16:59.764015 139695413741312 gcsio.py:460] Finished listing 9 files in 0.167971134186 seconds.
I0306 22:16:59.778098 139695413741312 client.py:614] Attempting refresh to obtain initial access_token
I0306 22:16:59.977226 139695413741312 client.py:614] Attempting refresh to obtain initial access_token
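
Conceptually, this pipeline keys both inputs by patient and joins them, so that each patient's bundle is paired with that patient's labels before a sequence example is produced. The in-memory sketch below illustrates only that keyed-join idea with plain dictionaries. The record layouts and the `join_by_patient` helper are assumptions for illustration and do not reflect the internals of `bundle_to_seqex`.

```python
from collections import defaultdict

def join_by_patient(bundles, labels):
    """Pair each patient's bundle with the list of that patient's labels.

    bundles: iterable of (patient_id, bundle) pairs.
    labels:  iterable of (patient_id, label) pairs.
    Patients without any label are dropped (an assumption of this sketch).
    """
    labels_by_patient = defaultdict(list)
    for patient_id, label in labels:
        labels_by_patient[patient_id].append(label)
    return [
        (patient_id, bundle, labels_by_patient[patient_id])
        for patient_id, bundle in bundles
        if patient_id in labels_by_patient
    ]

# Hypothetical toy records.
joined = join_by_patient(
    bundles=[("p1", "bundle-1"), ("p2", "bundle-2")],
    labels=[("p1", "3_7"), ("p1", "less_or_equal_3")],
)
```

In Beam itself, a join of this kind is typically expressed over keyed collections rather than Python dictionaries, so the data never has to fit in one process.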


<b>3c. Run the pipeline to generate Sequence Examples. Depending on the size of the dataset, this step may take more than 5 minutes. </b> <br />
The pipeline writes three shards of sequence examples: one for <b>training</b>, one for <b>evaluation</b>, and one for <b>inference</b>.

In [13]:
logger.setLevel(logging.CRITICAL)
start = time.time()
p1.run().wait_until_finish()
end = time.time()
print(end-start)

226.426615


<b> 3d. Let's examine the location in GCS where the generated Sequence Examples have been stored. Copy the third set of sequence examples to a serving area in GCS.  </b>

In [14]:
%bash
gsutil ls -l ${SEQEX_IN_GCS}
gsutil cp ${SERV_DS} ${SERV_LOC}

  45853651  2019-03-06T22:20:48Z  gs://cluster19-bkt/synthea/train/seqex-00000-of-00003.tfrecords
  45438501  2019-03-06T22:20:48Z  gs://cluster19-bkt/synthea/train/seqex-00001-of-00003.tfrecords
  45296041  2019-03-06T22:20:48Z  gs://cluster19-bkt/synthea/train/seqex-00002-of-00003.tfrecords
TOTAL: 3 objects, 136588193 bytes (130.26 MiB)


Copying gs://cluster19-bkt/synthea/train/seqex-00002-of-00003.tfrecords [Content-Type=application/octet-stream]...
/ [0 files][    0.0 B/ 43.2 MiB]
/ [1 files][ 43.2 MiB/ 43.2 MiB]
Operation completed over 1 objects/43.2 MiB.                                     
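
With `num_shards=3` and the `.tfrecords` suffix, the writer produced names of the form `seqex-SSSSS-of-NNNNN.tfrecords`, as the listing shows. The sketch below rebuilds those names from the prefix and assigns one shard to each role instead of hard-coding the three paths. `shard_paths` is a hypothetical helper and the bucket name is a placeholder.

```python
def shard_paths(prefix, num_shards, suffix=".tfrecords"):
    """Return file names following the '-SSSSS-of-NNNNN' pattern seen above."""
    return [
        "%s-%05d-of-%05d%s" % (prefix, index, num_shards, suffix)
        for index in range(num_shards)
    ]

# Placeholder bucket; substitute your own GCS_BUCKET value.
paths = shard_paths("gs://example-bucket/synthea/train/seqex", 3)
training, validation, serving = paths
```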


<h2> 4. Train and Evaluate ML Model</h2>
<ul>
    <li>The next few cells demonstrate how to train an ML model using the training dataset created in Step 3.</li>
    <li>Training requires sequence examples in TFRecord format.</li>
    <li>The trained ML model will be stored in Google Cloud Storage.</li>
    <li>The model will be evaluated and the evaluation output will be printed.</li>
</ul>

<b>4a. Delete previously trained model and retrain it with the new dataset. </b>

In [15]:
%bash
gsutil -m rm -r ${MODEL_IN_GCS}

Removing gs://cluster19-bkt/synthea/model/#1551735301753782...
Removing gs://cluster19-bkt/synthea/model/checkpoint#1551735304980143...
Removing gs://cluster19-bkt/synthea/model/events.out.tfevents.1551735264.cluster19-m#1551735328485151...
Removing gs://cluster19-bkt/synthea/model/graph.pbtxt#1551735273356477...
Removing gs://cluster19-bkt/synthea/model/model.ckpt-0.data-00000-of-00002#1551735279142064...
Removing gs://cluster19-bkt/synthea/model/model.ckpt-0.data-00001-of-00002#1551735278653956...
Removing gs://cluster19-bkt/synthea/model/model.ckpt-0.index#1551735279569651...
Removing gs://cluster19-bkt/synthea/model/model.ckpt-0.meta#1551735283073927...
Removing gs://cluster19-bkt/synthea/model/model.ckpt-300.data-00000-of-00002#1551735303096410...
Removing gs://cluster19-bkt/synthea/model/model.ckpt-300.data-00001-of-00002#1551735302571612...
Removing gs://cluster19-bkt/synthea/model/model.ckpt-300.index#1551735303571176...
Removing gs://cluster19-bkt/synthea/model/model.ckpt-300.

<b> 4b. Import the TensorFlow model and prepare it for training and validation. </b>

In [16]:
import tensorflow as tf
from py.google.fhir.models import model
from py.google.fhir.models.model import create_hparams
from py.google.fhir.models.model import get_input_fn
from py.google.fhir.models.model import make_estimator

tf.reset_default_graph()
hparams = model.create_hparams()

time_crossed_features = [
        cross.split(':') for cross in hparams.time_crossed_features if cross
    ]

train_input_fn = get_input_fn(tf.estimator.ModeKeys.TRAIN, TRAINING_DATASET, 'label.length_of_stay_range.class',
                              True, hparams.time_windows, hparams.include_age, hparams.categorical_context_features,
                              hparams.sequence_features, time_crossed_features, batch_size=24)
validation_input_fn = get_input_fn(tf.estimator.ModeKeys.EVAL, VALIDATION_DATASET, 'label.length_of_stay_range.class',
                                   True, hparams.time_windows, hparams.include_age, hparams.categorical_context_features,
                                   hparams.sequence_features, time_crossed_features, batch_size=24)
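
The list comprehension above turns each colon-separated cross specification into a list of feature names and skips empty strings. The sketch below shows the effect on made-up input; the feature names are placeholders, not the values returned by `create_hparams()`.

```python
def parse_crosses(specs):
    """Split 'a:b' style cross specs into lists, skipping empty entries."""
    return [spec.split(":") for spec in specs if spec]

# Placeholder feature names for illustration.
parsed = parse_crosses(["feature_a:feature_b", "", "feature_c:feature_d:feature_e"])
```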

<b> 4c. Check that we can read data.</b>

In [17]:
# Build the feature and label tensors for one training batch.
map_, label_ = train_input_fn()
with tf.train.MonitoredSession() as sess:
  # Fetch the labels together with the features.
  map_['label'] = label_
  print(sess.run(map_))
  print("Successfully read an input batch")

W0306 22:24:11.604448 139695413741312 tf_logging.py:125] From /usr/local/fhir/py/google/fhir/models/model.py:410: parallel_interleave (from tensorflow.contrib.data.python.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.experimental.parallel_interleave(...)`.


INFO:tensorflow:Graph was finalized.


I0306 22:24:15.063673 139695413741312 tf_logging.py:115] Graph was finalized.


INFO:tensorflow:Running local_init_op.


I0306 22:24:15.674421 139695413741312 tf_logging.py:115] Running local_init_op.


INFO:tensorflow:Done running local_init_op.


I0306 22:24:15.743711 139695413741312 tf_logging.py:115] Done running local_init_op.


{'s-Observation.code_Observation.value.quantity.value_Observation.value.quantity.unit_Observation.value.string-til-0': SparseTensorValue(indices=array([[21,  0],
       [21,  1],
       [21,  2],
       [21,  3],
       [21,  4],
       [21,  5],
       [21,  6],
       [21,  7],
       [21,  8],
       [21,  9],
       [21, 10],
       [21, 11],
       [21, 12],
       [21, 13],
       [21, 14],
       [21, 15],
       [21, 16],
       [21, 17],
       [21, 18],
       [21, 19],
       [21, 20],
       [21, 21],
       [21, 22],
       [21, 23],
       [21, 24],
       [21, 25],
       [21, 26],
       [21, 27],
       [21, 28],
       [21, 29],
       [21, 30],
       [21, 31],
       [21, 32],
       [21, 33],
       [21, 34],
       [21, 35],
       [21, 36],
       [21, 37],
       [21, 38],
       [21, 39],
       [21, 40],
       [21, 41],
       [21, 42],
       [21, 43]]), values=array(['loinc:20454-5-n/a-n/a-n/a', 'loinc:5767-9-n/a-n/a-n/a',
       'loinc:32623-1-9.786781-fL-
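
Each feature in the printed batch is a `SparseTensorValue`, which stores only the populated cells: `indices` holds `[row, position]` pairs and `values` holds the corresponding entries. In the output above, every index of the first feature has row 21, so all 44 listed tokens belong to the example at batch position 21. A minimal pure-Python sketch of how such a pair expands into a dense grid follows; `to_dense` and the toy tokens are illustrative assumptions.

```python
def to_dense(indices, values, shape, default=""):
    """Expand (indices, values) into a dense list of lists of the given shape."""
    rows, cols = shape
    dense = [[default] * cols for _ in range(rows)]
    for (row, col), value in zip(indices, values):
        dense[row][col] = value
    return dense

# Toy data: only row 1 of a 3 x 2 batch is populated.
dense = to_dense(
    indices=[[1, 0], [1, 1]],
    values=["token-a", "token-b"],
    shape=(3, 2),
)
```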

<b> 4d. Define Estimator. </b>

In [18]:
LABEL_VALUES = ['less_or_equal_3', '3_7', '7_14', 'above_14']
estimator = make_estimator(hparams, LABEL_VALUES, MODEL_PATH)

W0306 22:24:33.192502 139695413741312 tf_logging.py:125] From /usr/local/fhir/py/google/fhir/models/model.py:684: __init__ (from tensorflow.contrib.learn.python.learn.estimators.run_config) is deprecated and will be removed in a future version.
Instructions for updating:
When switching to tf.estimator.Estimator, use tf.estimator.RunConfig instead.


INFO:tensorflow:Using config: {'_save_checkpoints_secs': 180, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_task_type': None, '_train_distribute': None, '_is_chief': True, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f0d102c9a90>, '_tf_config': gpu_options {
  per_process_gpu_memory_fraction: 1.0
}
, '_protocol': None, '_save_checkpoints_steps': None, '_keep_checkpoint_every_n_hours': 10000, '_num_ps_replicas': 0, '_model_dir': 'gs://cluster19-bkt/synthea/model/', '_tf_random_seed': None, '_master': '', '_device_fn': None, '_num_worker_replicas': 0, '_task_id': 0, '_log_step_count_steps': 100, '_evaluation_master': '', '_eval_distribute': None, '_environment': 'local', '_save_summary_steps': 100}


I0306 22:24:33.199081 139695413741312 tf_logging.py:115] Using config: {'_save_checkpoints_secs': 180, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_task_type': None, '_train_distribute': None, '_is_chief': True, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f0d102c9a90>, '_tf_config': gpu_options {
  per_process_gpu_memory_fraction: 1.0
}
, '_protocol': None, '_save_checkpoints_steps': None, '_keep_checkpoint_every_n_hours': 10000, '_num_ps_replicas': 0, '_model_dir': 'gs://cluster19-bkt/synthea/model/', '_tf_random_seed': None, '_master': '', '_device_fn': None, '_num_worker_replicas': 0, '_task_id': 0, '_log_step_count_steps': 100, '_evaluation_master': '', '_eval_distribute': None, '_environment': 'local', '_save_summary_steps': 100}


INFO:tensorflow:Using config: {'_save_checkpoints_secs': 180, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_task_type': None, '_train_distribute': None, '_is_chief': True, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f0d102c9350>, '_tf_config': gpu_options {
  per_process_gpu_memory_fraction: 1.0
}
, '_protocol': None, '_save_checkpoints_steps': None, '_keep_checkpoint_every_n_hours': 10000, '_num_ps_replicas': 0, '_model_dir': 'gs://cluster19-bkt/synthea/model/', '_tf_random_seed': None, '_master': '', '_device_fn': None, '_num_worker_replicas': 0, '_task_id': 0, '_log_step_count_steps': 100, '_evaluation_master': '', '_eval_distribute': None, '_environment': 'local', '_save_summary_steps': 100}


I0306 22:24:33.204991 139695413741312 tf_logging.py:115] Using config: {'_save_checkpoints_secs': 180, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_task_type': None, '_train_distribute': None, '_is_chief': True, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f0d102c9350>, '_tf_config': gpu_options {
  per_process_gpu_memory_fraction: 1.0
}
, '_protocol': None, '_save_checkpoints_steps': None, '_keep_checkpoint_every_n_hours': 10000, '_num_ps_replicas': 0, '_model_dir': 'gs://cluster19-bkt/synthea/model/', '_tf_random_seed': None, '_master': '', '_device_fn': None, '_num_worker_replicas': 0, '_task_id': 0, '_log_step_count_steps': 100, '_evaluation_master': '', '_eval_distribute': None, '_environment': 'local', '_save_summary_steps': 100}
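
With four label classes, a classifier that assigns equal probability to every class has a cross-entropy of ln 4, about 1.386. The training log in the next step reports `loss = 1.3862944` at step 1, which is consistent with a model that has not yet learned to prefer any class. The check below is plain arithmetic and makes no assumption about the model internals.

```python
import math

LABEL_VALUES = ['less_or_equal_3', '3_7', '7_14', 'above_14']

def uniform_cross_entropy(num_classes):
    """Cross-entropy (in nats) when every class gets probability 1/num_classes."""
    return -math.log(1.0 / num_classes)

baseline_loss = uniform_cross_entropy(len(LABEL_VALUES))
```

A loss that stays near this baseline after training would suggest the model is not using its inputs.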


<b> 4e. Train and Evaluate</b>

In [19]:
train_spec = tf.estimator.TrainSpec(input_fn=train_input_fn, max_steps=300)
eval_spec = tf.estimator.EvalSpec(input_fn=validation_input_fn, steps=40, throttle_secs=60)

tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)

INFO:tensorflow:Not using Distribute Coordinator.


I0306 22:24:36.847106 139695413741312 tf_logging.py:115] Not using Distribute Coordinator.


INFO:tensorflow:Running training and evaluation locally (non-distributed).


I0306 22:24:36.851262 139695413741312 tf_logging.py:115] Running training and evaluation locally (non-distributed).


INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 180.


I0306 22:24:36.855804 139695413741312 tf_logging.py:115] Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 180.


INFO:tensorflow:Calling model_fn.


I0306 22:24:40.926853 139695413741312 tf_logging.py:115] Calling model_fn.


INFO:tensorflow:Calling model_fn.


I0306 22:24:40.930792 139695413741312 tf_logging.py:115] Calling model_fn.


INFO:tensorflow:Done calling model_fn.


I0306 22:24:47.121191 139695413741312 tf_logging.py:115] Done calling model_fn.


INFO:tensorflow:Done calling model_fn.


I0306 22:24:47.125195 139695413741312 tf_logging.py:115] Done calling model_fn.


INFO:tensorflow:Create CheckpointSaverHook.


I0306 22:24:47.129283 139695413741312 tf_logging.py:115] Create CheckpointSaverHook.


INFO:tensorflow:Graph was finalized.


I0306 22:24:51.849224 139695413741312 tf_logging.py:115] Graph was finalized.


INFO:tensorflow:Running local_init_op.


I0306 22:24:53.015283 139695413741312 tf_logging.py:115] Running local_init_op.


INFO:tensorflow:Done running local_init_op.


I0306 22:24:53.204015 139695413741312 tf_logging.py:115] Done running local_init_op.


INFO:tensorflow:Saving checkpoints for 0 into gs://cluster19-bkt/synthea/model/model.ckpt.


I0306 22:25:00.018141 139695413741312 tf_logging.py:115] Saving checkpoints for 0 into gs://cluster19-bkt/synthea/model/model.ckpt.


INFO:tensorflow:loss = 1.3862944, step = 1


I0306 22:25:16.209939 139695413741312 tf_logging.py:115] loss = 1.3862944, step = 1


INFO:tensorflow:global_step/sec: 15.9075


I0306 22:25:22.495450 139695413741312 tf_logging.py:115] global_step/sec: 15.9075


INFO:tensorflow:loss = 0.5688575, step = 101 (6.293 sec)


I0306 22:25:22.502588 139695413741312 tf_logging.py:115] loss = 0.5688575, step = 101 (6.293 sec)


INFO:tensorflow:global_step/sec: 67.5985


I0306 22:25:23.974694 139695413741312 tf_logging.py:115] global_step/sec: 67.5985


INFO:tensorflow:loss = 0.54932564, step = 201 (1.479 sec)


I0306 22:25:23.981414 139695413741312 tf_logging.py:115] loss = 0.54932564, step = 201 (1.479 sec)


INFO:tensorflow:Saving checkpoints for 300 into gs://cluster19-bkt/synthea/model/model.ckpt.


I0306 22:25:25.432375 139695413741312 tf_logging.py:115] Saving checkpoints for 300 into gs://cluster19-bkt/synthea/model/model.ckpt.


INFO:tensorflow:Calling model_fn.


I0306 22:25:41.183012 139695413741312 tf_logging.py:115] Calling model_fn.


INFO:tensorflow:Calling model_fn.


I0306 22:25:41.186968 139695413741312 tf_logging.py:115] Calling model_fn.


INFO:tensorflow:Done calling model_fn.


I0306 22:25:44.601139 139695413741312 tf_logging.py:115] Done calling model_fn.


INFO:tensorflow:Done calling model_fn.


I0306 22:25:45.867211 139695413741312 tf_logging.py:115] Done calling model_fn.


INFO:tensorflow:Starting evaluation at 2019-03-06-22:25:45


I0306 22:25:45.896682 139695413741312 tf_logging.py:115] Starting evaluation at 2019-03-06-22:25:45


INFO:tensorflow:Graph was finalized.


I0306 22:25:46.363262 139695413741312 tf_logging.py:115] Graph was finalized.


INFO:tensorflow:Restoring parameters from gs://cluster19-bkt/synthea/model/model.ckpt-300


I0306 22:25:46.493905 139695413741312 tf_logging.py:115] Restoring parameters from gs://cluster19-bkt/synthea/model/model.ckpt-300


INFO:tensorflow:Running local_init_op.


I0306 22:25:48.002795 139695413741312 tf_logging.py:115] Running local_init_op.


INFO:tensorflow:Done running local_init_op.


I0306 22:25:48.193170 139695413741312 tf_logging.py:115] Done running local_init_op.


INFO:tensorflow:Evaluation [4/40]


I0306 22:25:53.838809 139695413741312 tf_logging.py:115] Evaluation [4/40]


INFO:tensorflow:Finished evaluation at 2019-03-06-22:25:54


I0306 22:25:54.449218 139695413741312 tf_logging.py:115] Finished evaluation at 2019-03-06-22:25:54


INFO:tensorflow:Saving dict for global step 300: accuracy = 0.93877554, auc_pr_at_most_7d = 1.0, auc_roc_at_most_7d = 0.0, average_loss = 0.42557618, global_step = 300, loss = 0.41817275, precision_3_7 = nan, precision_7_14 = nan, precision_above_14 = nan, precision_at_1 = 0.9387755102040817, precision_at_2 = 0.5, precision_at_most_7d = 1.0, precision_less_or_equal_3 = 0.9387755102040817, recall_3_7 = 0.0, recall_7_14 = nan, recall_above_14 = nan, recall_at_1 = 0.9387755102040817, recall_at_2 = 1.0, recall_at_most_7d = 1.0, recall_less_or_equal_3 = 1.0


I0306 22:25:54.453027 139695413741312 tf_logging.py:115] Saving dict for global step 300: accuracy = 0.93877554, auc_pr_at_most_7d = 1.0, auc_roc_at_most_7d = 0.0, average_loss = 0.42557618, global_step = 300, loss = 0.41817275, precision_3_7 = nan, precision_7_14 = nan, precision_above_14 = nan, precision_at_1 = 0.9387755102040817, precision_at_2 = 0.5, precision_at_most_7d = 1.0, precision_less_or_equal_3 = 0.9387755102040817, recall_3_7 = 0.0, recall_7_14 = nan, recall_above_14 = nan, recall_at_1 = 0.9387755102040817, recall_at_2 = 1.0, recall_at_most_7d = 1.0, recall_less_or_equal_3 = 1.0


INFO:tensorflow:Saving 'checkpoint_path' summary for global step 300: gs://cluster19-bkt/synthea/model/model.ckpt-300


I0306 22:25:58.658610 139695413741312 tf_logging.py:115] Saving 'checkpoint_path' summary for global step 300: gs://cluster19-bkt/synthea/model/model.ckpt-300


INFO:tensorflow:Loss for final step: 0.44990516.


I0306 22:26:00.918080 139695413741312 tf_logging.py:115] Loss for final step: 0.44990516.


({'accuracy': 0.93877554,
  'auc_pr_at_most_7d': 1.0,
  'auc_roc_at_most_7d': 0.0,
  'average_loss': 0.42557618,
  'global_step': 300,
  'loss': 0.41817275,
  'precision_3_7': nan,
  'precision_7_14': nan,
  'precision_above_14': nan,
  'precision_at_1': 0.9387755102040817,
  'precision_at_2': 0.5,
  'precision_at_most_7d': 1.0,
  'precision_less_or_equal_3': 0.9387755102040817,
  'recall_3_7': 0.0,
  'recall_7_14': nan,
  'recall_above_14': nan,
  'recall_at_1': 0.9387755102040817,
  'recall_at_2': 1.0,
  'recall_at_most_7d': 1.0,
  'recall_less_or_equal_3': 1.0},
 [])
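
Several of these metrics can be read together. `precision_3_7` is `nan` while `recall_3_7` is `0.0`, which is what happens when a class is present in the evaluation data but never predicted: precision becomes 0/0. Likewise, `precision_less_or_equal_3` equals `accuracy` and its recall is `1.0`. This pattern is consistent with a model that predicts `less_or_equal_3` for every example, so the 0.94 accuracy may largely reflect class imbalance. The sketch below reproduces the pattern from hypothetical counts: 46 and 3 were chosen only because 46/49 matches the reported ratio, and the real evaluation counts are not shown in the log.

```python
def precision_recall(true_labels, predicted_labels, cls):
    """Per-class precision and recall; 0/0 is reported as nan."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    predicted = sum(1 for _, p in pairs if p == cls)
    actual = sum(1 for t, _ in pairs if t == cls)
    nan = float("nan")
    precision = tp / float(predicted) if predicted else nan
    recall = tp / float(actual) if actual else nan
    return precision, recall

# Hypothetical evaluation set: 46 short stays and 3 stays in the 3_7 range.
truth = ["less_or_equal_3"] * 46 + ["3_7"] * 3
# A degenerate model that always predicts the majority class.
preds = ["less_or_equal_3"] * len(truth)

majority = precision_recall(truth, preds, "less_or_equal_3")
minority = precision_recall(truth, preds, "3_7")
absent = precision_recall(truth, preds, "7_14")
```

Under these assumed counts, `majority` matches the reported `less_or_equal_3` metrics, `minority` gives `nan` precision with zero recall, and `absent` is `nan` for both, mirroring the `7_14` and `above_14` rows.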

<b> 4f. Inspect the TF runs and graphs using TensorBoard. </b>

In [20]:
from google.datalab.ml import TensorBoard as tb
tb.start(MODEL_PATH)

4745

<b>4g. List the GCS location where the model has been stored. </b>

In [None]:
%bash
gsutil ls -l ${MODEL_IN_GCS}