This notebook (and the slides from lecture 8) will help you go straight from training a model in Colab to deploying it in a webpage with TensorFlow.js, all without leaving the browser.

Configure this notebook to work with your GitHub account by populating these fields.

In [1]:
!pip install tensorflowjs



In [0]:
# your github username
USER_NAME = "" 

# the email associated with your commits
# (a placeholder address is probably fine)
USER_EMAIL = "" 

# the user token you've created 
TOKEN = "" 

# site name
# for example, if my user_name is "foo", then this notebook will create
# a site at https://foo.github.io/dljsonfly/
SITE_NAME = "dljsonfly"

Next, run this cell to configure git.

In [0]:
!git config --global user.email "{USER_EMAIL}"
!git config --global user.name "{USER_NAME}"

Clone your GitHub pages repo (see the lecture 8 slides for instructions on how to create one).

In [0]:
import os
repo_path = USER_NAME + '.github.io'
if not os.path.exists(os.path.join(os.getcwd(), repo_path)):
  !git clone https://{USER_NAME}:{TOKEN}@github.com/{USER_NAME}/{USER_NAME}.github.io

In [5]:
os.chdir(repo_path)
!git pull

From https://github.com/rahulvasaikar/rahulvasaikar.github.io
 * [new branch]      master     -> origin/master
Already up to date.


Create a folder for your site.

In [0]:
project_path = os.path.join(os.getcwd(), SITE_NAME)
if not os.path.exists(project_path): 
  os.mkdir(project_path)
os.chdir(project_path)

These paths will be used by the converter script.

In [0]:
# DO NOT MODIFY
MODEL_DIR = os.path.join(project_path, "model_js")
if not os.path.exists(MODEL_DIR):
  os.mkdir(MODEL_DIR)

As an example, we will download three books, split them into sentences, and vectorize those sentences. (Check out https://www.gutenberg.org/ for a bunch of free e-books.)

In [8]:
!wget http://www.gutenberg.org/cache/epub/28885/pg28885.txt -O alice_book.txt

--2019-01-22 07:41:05--  http://www.gutenberg.org/cache/epub/28885/pg28885.txt
Resolving www.gutenberg.org (www.gutenberg.org)... 152.19.134.47, 2610:28:3090:3000:0:bad:cafe:47
Connecting to www.gutenberg.org (www.gutenberg.org)|152.19.134.47|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 177428 (173K) [text/plain]
Saving to: ‘alice_book.txt’


2019-01-22 07:41:05 (752 KB/s) - ‘alice_book.txt’ saved [177428/177428]



In [0]:
with open('alice_book.txt','r') as f:
  lines=f.read()

In [10]:
import re
pat = re.compile(r'([A-Z][^\.!?]*[\.!?])', re.M)
sentences = pat.findall(lines)
for s in sentences:
  print (s + '$')

Project Gutenberg's Alice's Adventures in Wonderland, by Lewis Carroll

This eBook is for the use of anyone anywhere at no cost and with
almost no restrictions whatsoever.$
You may copy it, give it away or
re-use it under the terms of the Project Gutenberg License included
with this eBook or online at www.$
Title: Alice's Adventures in Wonderland
       Illustrated by Arthur Rackham.$
With a Proem by Austin Dobson

Author: Lewis Carroll

Illustrator: Arthur Rackham

Release Date: May 19, 2009 [EBook #28885]

Language: English


*** START OF THIS PROJECT GUTENBERG EBOOK ALICE'S ADVENTURES IN WONDERLAND ***




Produced by Jana Srna, Emmy and the Online Distributed
Proofreading Team at http://www.$
This file was
produced from images generously made available by the
University of Florida Digital Collections.$
ALICE'S ADVENTURES IN WONDERLAND

[Illustration: "Alice"]

[Illustration:

          ALICE'S·ADVENTURES
          IN·WONDERLAND
          BY·LEWIS·CARROLL
          ILLUSTRATED·BY
  

In [11]:
sentences

["Project Gutenberg's Alice's Adventures in Wonderland, by Lewis Carroll\n\nThis eBook is for the use of anyone anywhere at no cost and with\nalmost no restrictions whatsoever.",
 'You may copy it, give it away or\nre-use it under the terms of the Project Gutenberg License included\nwith this eBook or online at www.',
 "Title: Alice's Adventures in Wonderland\n       Illustrated by Arthur Rackham.",
 "With a Proem by Austin Dobson\n\nAuthor: Lewis Carroll\n\nIllustrator: Arthur Rackham\n\nRelease Date: May 19, 2009 [EBook #28885]\n\nLanguage: English\n\n\n*** START OF THIS PROJECT GUTENBERG EBOOK ALICE'S ADVENTURES IN WONDERLAND ***\n\n\n\n\nProduced by Jana Srna, Emmy and the Online Distributed\nProofreading Team at http://www.",
 'This file was\nproduced from images generously made available by the\nUniversity of Florida Digital Collections.',
 'ALICE\'S ADVENTURES IN WONDERLAND\n\n[Illustration: "Alice"]\n\n[Illustration:\n\n          ALICE\'S·ADVENTURES\n          IN·WONDERLAND\n  

In [0]:
alice_book_first= sentences[30:1100]
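
The sentence pattern used in these cells treats any run that starts at a capital letter and ends at the first `.`, `!` or `?` as one sentence. That is why the output cuts URLs at `www.` and why text that does not begin with a capital letter is dropped. A minimal sketch on a made-up string (the sample text is an assumption, not taken from the books):

```python
import re

# Same pattern as in the notebook cells.
pat = re.compile(r'([A-Z][^\.!?]*[\.!?])', re.M)

sample = "Alice was tired. Visit www.example.org today! is this one kept?"
found = pat.findall(sample)
print(found)
```

The lower-case tail of the sample never matches, so a splitter like this is only a rough approximation of real sentence boundaries.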

In [13]:
!wget http://www.gutenberg.org/files/46/46-0.txt -O christmas_carol_book.txt

--2019-01-22 07:41:10--  http://www.gutenberg.org/files/46/46-0.txt
Resolving www.gutenberg.org (www.gutenberg.org)... 152.19.134.47, 2610:28:3090:3000:0:bad:cafe:47
Connecting to www.gutenberg.org (www.gutenberg.org)|152.19.134.47|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 182057 (178K) [text/plain]
Saving to: ‘christmas_carol_book.txt’


2019-01-22 07:41:11 (623 KB/s) - ‘christmas_carol_book.txt’ saved [182057/182057]



In [0]:
with open('christmas_carol_book.txt','r') as s:
  lines_s=s.read()

In [15]:
import re
pat = re.compile(r'([A-Z][^\.!?]*[\.!?])', re.M)
sentences = pat.findall(lines_s)
for s in sentences:
  print (s)

The Project Gutenberg EBook of A Christmas Carol, by Charles Dickens

This eBook is for the use of anyone anywhere at no cost and with
almost no restrictions whatsoever.
You may copy it, give it away or
re-use it under the terms of the Project Gutenberg License included
with this eBook or online at www.
Title: A Christmas Carol
       A Ghost Story of Christmas

Author: Charles Dickens

Release Date: August 11, 2004 [EBook #46]
Last Updated: March 4, 2018

Language: English

Character set encoding: UTF-8

*** START OF THIS PROJECT GUTENBERG EBOOK A CHRISTMAS CAROL ***




Produced by Jose Menendez




A CHRISTMAS CAROL

IN PROSE
BEING
A Ghost Story of Christmas

by Charles Dickens



PREFACE

I HAVE endeavoured in this Ghostly little book,
to raise the Ghost of an Idea, which shall not put my
readers out of humour with themselves, with each other,
with the season, or with me.
May it haunt their houses
pleasantly, and no one wish to lay it.
Their faithful Friend and Servant,
           

In [16]:
sentences[8:1008]

['There is no doubt\nwhatever about that.',
 'The register of his burial was\nsigned by the clergyman, the clerk, the undertaker,\nand the chief mourner.',
 "Scrooge signed it: and\nScrooge's name was good upon 'Change, for anything he\nchose to put his hand to.",
 'Old Marley was as dead as a\ndoor-nail.',
 'Mind!',
 "I don't mean to say that I know, of my\nown knowledge, what there is particularly dead about\na door-nail.",
 'I might have been inclined, myself, to\nregard a coffin-nail as the deadest piece of ironmongery\nin the trade.',
 "But the wisdom of our ancestors\nis in the simile; and my unhallowed hands\nshall not disturb it, or the Country's done for.",
 'You\nwill therefore permit me to repeat, emphatically, that\nMarley was as dead as a door-nail.',
 'Scrooge knew he was dead?',
 'Of course he did.',
 'How could it be otherwise?',
 "Scrooge and he were\npartners for I don't know how many years.",
 'Scrooge\nwas his sole executor, his sole administrator, his sole\nassign,

In [0]:
carol_book_second= sentences[8:1008]

In [18]:
!wget http://www.gutenberg.org/cache/epub/345/pg345.txt -O dracula_book.txt

--2019-01-22 07:41:16--  http://www.gutenberg.org/cache/epub/345/pg345.txt
Resolving www.gutenberg.org (www.gutenberg.org)... 152.19.134.47, 2610:28:3090:3000:0:bad:cafe:47
Connecting to www.gutenberg.org (www.gutenberg.org)|152.19.134.47|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 883160 (862K) [text/plain]
Saving to: ‘dracula_book.txt’


2019-01-22 07:41:16 (1.68 MB/s) - ‘dracula_book.txt’ saved [883160/883160]



In [0]:
with open('dracula_book.txt','r') as t:
  lines_t=t.read()

In [20]:
import re
pat = re.compile(r'([A-Z][^\.!?]*[\.!?])', re.M)
sentences = pat.findall(lines_t)
for s in sentences:
  print (s)

The Project Gutenberg EBook of Dracula, by Bram Stoker

This eBook is for the use of anyone anywhere at no cost and with
almost no restrictions whatsoever.
You may copy it, give it away or
re-use it under the terms of the Project Gutenberg License included
with this eBook or online at www.
Title: Dracula

Author: Bram Stoker

Release Date: August 16, 2013 [EBook #345]

Language: English


*** START OF THIS PROJECT GUTENBERG EBOOK DRACULA ***




Produced by Chuck Greif and the Online Distributed
Proofreading Team at http://www.
This file was
produced from images generously made available by The
Internet Archive)







                                DRACULA





                                DRACULA

                                  _by_

                              Bram Stoker

                        [Illustration: colophon]

                                NEW YORK

                            GROSSET & DUNLAP

                              _Publishers_

      Copyright, 1897,

In [21]:
sentences[30:1030]

['Mina.',
 'I asked the\nwaiter, and he said it was called "paprika hendl," and that, as it was a\nnational dish, I should be able to get it anywhere along the\nCarpathians.',
 "I found my smattering of German very useful here; indeed, I\ndon't know how I should be able to get on without it.",
 'Having had some time at my disposal when in London, I had visited the\nBritish Museum, and made search among the books and maps in the library\nregarding Transylvania; it had struck me that some foreknowledge of the\ncountry could hardly fail to have some importance in dealing with a\nnobleman of that country.',
 'I find that the district he named is in the\nextreme east of the country, just on the borders of three states,\nTransylvania, Moldavia and Bukovina, in the midst of the Carpathian\nmountains; one of the wildest and least known portions of Europe.',
 'I was\nnot able to light on any map or work giving the exact locality of the\nCastle Dracula, as there are no maps of this country as ye

In [0]:
dracula_book_third= sentences[30:1030]

In [0]:
x_train = []
y_train = []

# Label each sentence with the index of the book it came from.
for sentence in alice_book_first:
    x_train.append(sentence)
    y_train.append(0)

for sentence in carol_book_second:
    x_train.append(sentence)
    y_train.append(1)

for sentence in dracula_book_third:
    x_train.append(sentence)
    y_train.append(2)
    
 

**X train and Y train**

In [24]:
print(len(x_train))
print(len(y_train))

3070
3070
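
The total of 3070 follows from the three slices taken earlier. The sketch below rebuilds only the labels from the slice bounds, assuming each slice was completely filled (which the printed total supports), to show that the Alice class is slightly larger than the other two.

```python
from collections import Counter

# Slice bounds copied from the cells above; each slice is assumed to be full.
slices = {0: (30, 1100), 1: (8, 1008), 2: (30, 1030)}

y_sketch = [label
            for label, (start, stop) in slices.items()
            for _ in range(stop - start)]

class_counts = Counter(y_sketch)
total = sum(class_counts.values())
print(class_counts, total)
```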


Tokenize the documents, create a word index (word -> number).

In [25]:
max_len = 20
num_words = 1000
from keras.preprocessing.text import Tokenizer
# Fit the tokenizer on the training data
t = Tokenizer(num_words=num_words)
t.fit_on_texts(x_train)

Using TensorFlow backend.


In [26]:
print(t.word_index)



Here's how we vectorize a document.

In [27]:
vectorized = t.texts_to_sequences([alice_book_first[835]])
print(vectorized)

[[6, 453, 364, 79, 63, 654, 107]]
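
Roughly, the tokenizer lowercases each text, strips punctuation, ranks words by frequency (index 1 is the most frequent word), and at lookup time drops any word whose index is `num_words` or higher. The pure-Python sketch below imitates that behaviour on toy sentences. `build_word_index` and `to_sequences` are hypothetical helpers, not part of Keras, and they simplify the real filtering rules.

```python
from collections import Counter
import string


def _words(text):
    # Lowercase, turn punctuation (except apostrophes) into spaces, then split.
    table = str.maketrans({c: ' ' for c in string.punctuation if c != "'"})
    return text.lower().translate(table).split()


def build_word_index(texts):
    counts = Counter(w for text in texts for w in _words(text))
    # Most frequent word gets index 1; index 0 is left free for padding.
    return {w: i + 1 for i, (w, _) in enumerate(counts.most_common())}


def to_sequences(texts, word_index, num_words):
    # Unknown words and words ranked at num_words or beyond are skipped.
    return [[word_index[w] for w in _words(text)
             if w in word_index and word_index[w] < num_words]
            for text in texts]


docs = ["The cat sat.", "The cat ran!", "The dog ran."]
word_index = build_word_index(docs)
sequences = to_sequences(["The dog sat on the cat"], word_index, num_words=4)
print(word_index)
print(sequences)
```

Note that with `num_words=4` only indices 1 to 3 survive, so a vocabulary size of 1000 in the notebook keeps at most 999 distinct words.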


Apply padding if necessary.

In [0]:
from keras.preprocessing.sequence import pad_sequences
padded = pad_sequences(vectorized, maxlen=max_len, padding='post')

In [29]:
padded

array([[  6, 453, 364,  79,  63, 654, 107,   0,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   0]], dtype=int32)
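
With `padding='post'` the zeros go on the right. Truncation is left at its default, `truncating='pre'`, so a sentence longer than `max_len` keeps its last `max_len` tokens rather than its first. A minimal sketch of that rule for a single sequence (`pad_post` is a hypothetical helper, and it assumes `maxlen` is at least 1):

```python
def pad_post(seq, maxlen, value=0):
    # Keep the last `maxlen` items (the 'pre' truncation default),
    # then pad on the right (the 'post' padding used in the notebook).
    kept = list(seq)[-maxlen:]
    return kept + [value] * (maxlen - len(kept))


short = pad_post([6, 453, 364, 79, 63, 654, 107], 20)
long_head = pad_post(list(range(1, 26)), 20)[:3]
print(short)
print(long_head)
```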

We will save the word index in metadata. Later, we'll use it to convert words typed in the browser to numbers for prediction.

In [0]:
metadata = {
  'word_index': t.word_index,
  'max_len': max_len,
  'vocabulary_size': num_words,
}
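
The three metadata fields are all a client needs to turn raw text into model input. As an illustration, the sketch below sends a toy metadata dict through JSON and encodes a sentence from the restored copy alone. `encode` is a hypothetical helper and the toy values are assumptions; the real browser code is written in JavaScript later in this notebook.

```python
import json


def encode(text, metadata):
    index = metadata['word_index']
    limit = metadata['vocabulary_size']
    max_len = metadata['max_len']
    # Skip unknown words and words outside the vocabulary.
    ids = [index[w] for w in text.lower().split()
           if w in index and index[w] < limit]
    ids = ids[-max_len:]                     # keep the last max_len tokens
    return ids + [0] * (max_len - len(ids))  # pad on the right


toy = {'word_index': {'the': 1, 'cat': 2, 'sat': 3},
       'max_len': 5,
       'vocabulary_size': 3}

restored = json.loads(json.dumps(toy))
encoded = encode("The cat sat on the mat", restored)
print(encoded)
```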

Define a model.

In [31]:
embedding_size = 8
n_classes = 3
epochs = 1

import keras
model = keras.Sequential()
model.add(keras.layers.Embedding(num_words, embedding_size, input_shape=(max_len,)))
model.add(keras.layers.LSTM(16,return_sequences = True))
model.add(keras.layers.LSTM(16))
model.add(keras.layers.Dense(n_classes, activation='softmax'))
model.compile('adam', 'sparse_categorical_crossentropy', metrics=['accuracy'])
model.summary()



_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_1 (Embedding)      (None, 20, 8)             8000      
_________________________________________________________________
lstm_1 (LSTM)                (None, 20, 16)            1600      
_________________________________________________________________
lstm_2 (LSTM)                (None, 16)                2112      
_________________________________________________________________
dense_1 (Dense)              (None, 3)                 51        
Total params: 11,763
Trainable params: 11,763
Non-trainable params: 0
_________________________________________________________________


Prepare some training data.

In [32]:
x_train = t.texts_to_sequences(x_train)
x_train = pad_sequences(x_train, maxlen=max_len, padding='post')
print(x_train)

[[  6 304 385 ...   0   0   0]
 [  1 914 653 ...   0   0   0]
 [  1 556   0 ...   0   0   0]
 ...
 [ 23 130   5 ...   0   0   0]
 [ 52  79   1 ...   0   0   0]
 [ 16   6 419 ...   0   0   0]]


In [33]:
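import numpy as np
# Pass the labels as an array; newer Keras versions may not accept a plain list.
model.fit(x_train, np.array(y_train), epochs=epochs)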

Epoch 1/1


<keras.callbacks.History at 0x7fec100d0be0>

Demo using the model to make predictions.

In [34]:
test_example = "Alice took up the fan and gloves, and, as the hall was very hot, she kept fanning herself all the time she went on talking!"
x_test = t.texts_to_sequences([test_example])
x_test = pad_sequences(x_test, maxlen=max_len, padding='post')
print(x_test)

[[530   2 575   2  11   1 310   9  42 413  14 315 116  27   1  59  14  60
   20 321]]


In [35]:
preds = model.predict(x_test)
print(preds)
import numpy as np
print(np.argmax(preds))

[[0.43711308 0.42073983 0.14214708]]
0
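
The printed index maps back to a book through the labels assigned when building `y_train` (0 for Alice, 1 for A Christmas Carol, 2 for Dracula). A small sketch using the scores printed above; the `LABELS` list is an assumption added for readability:

```python
LABELS = ["Alice's Adventures in Wonderland", "A Christmas Carol", "Dracula"]

# Scores copied from the output above.
preds = [[0.43711308, 0.42073983, 0.14214708]]

scores = preds[0]
best = max(range(len(scores)), key=scores.__getitem__)
margin = scores[best] - sorted(scores)[-2]
print(LABELS[best], round(margin, 3))
```

The margin between the top two classes is under 0.02 here, which is not surprising after a single epoch of training.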


Convert the model

In [36]:
import json
import tensorflowjs as tfjs

metadata_json_path = os.path.join(MODEL_DIR, 'metadata.json')
# Use a context manager so the file is flushed and closed before committing.
with open(metadata_json_path, 'wt') as f:
  json.dump(metadata, f)
tfjs.converters.save_keras_model(model, MODEL_DIR)
print('\nSaved model artifacts in directory: %s' % MODEL_DIR)


Saved model artifacts in directory: /content/rahulvasaikar.github.io/dljsonfly/model_js


Write an index.html and an index.js file configured to load our model.

In [0]:
index_js = """
const HOSTED_URLS = {
  model:
      'model_js/model.json',
  metadata:
      'model_js/metadata.json'
};

const examples = {
  'example1':
      'Alice took up the fan and gloves, and, as the hall was very hot, she kept fanning herself all the time she went on talking!',
  'example2':
      'Bob, with a sudden declension in his high spirits, for he had been Tims blood horse all the way from church, and had come home rampant.',
  'example3':
      'Sometimes we saw little towns or castles on the top of steep hills such as we see in old missals'      
};

function status(statusText) {
  console.log(statusText);
  document.getElementById('status').textContent = statusText;
}

function showMetadata(metadataJSON) {
  document.getElementById('vocabularySize').textContent =
      metadataJSON['vocabulary_size'];
  document.getElementById('maxLen').textContent =
      metadataJSON['max_len'];
}

function settextField(text, predict) {
  const textField = document.getElementById('text-entry');
  textField.value = text;
  doPredict(predict);
}

function setPredictFunction(predict) {
  const textField = document.getElementById('text-entry');
  textField.addEventListener('input', () => doPredict(predict));
}

function disableLoadModelButtons() {
  document.getElementById('load-model').style.display = 'none';
}

function doPredict(predict) {
  const textField = document.getElementById('text-entry');
  const result = predict(textField.value);
  let scoreString = 'Class scores: ';
  for (const x in result.score) {
    scoreString += x + ' ->  ' + result.score[x].toFixed(3) + ', ';
  }
  //console.log(scoreString);
  status(
      scoreString + ' (elapsed: ' + result.elapsed.toFixed(3) + ' ms)');
}

function prepUI(predict) {
  setPredictFunction(predict);
  const testExampleSelect = document.getElementById('example-select');
  testExampleSelect.addEventListener('change', () => {
    settextField(examples[testExampleSelect.value], predict);
  });
  settextField(examples['example1'], predict);
}

async function urlExists(url) {
  status('Testing url ' + url);
  try {
    const response = await fetch(url, {method: 'HEAD'});
    return response.ok;
  } catch (err) {
    return false;
  }
}

async function loadHostedPretrainedModel(url) {
  status('Loading pretrained model from ' + url);
  try {
    // loadLayersModel is the TensorFlow.js 1.x+ loader for converted Keras models.
    const model = await tf.loadLayersModel(url);
    status('Done loading pretrained model.');
    disableLoadModelButtons();
    return model;
  } catch (err) {
    console.error(err);
    status('Loading pretrained model failed.');
  }
}

async function loadHostedMetadata(url) {
  status('Loading metadata from ' + url);
  try {
    const metadataJson = await fetch(url);
    const metadata = await metadataJson.json();
    status('Done loading metadata.');
    return metadata;
  } catch (err) {
    console.error(err);
    status('Loading metadata failed.');
  }
}

class Classifier {

  async init(urls) {
    this.urls = urls;
    this.model = await loadHostedPretrainedModel(urls.model);
    await this.loadMetadata();
    return this;
  }

  async loadMetadata() {
    const metadata =
        await loadHostedMetadata(this.urls.metadata);
    showMetadata(metadata);
    this.maxLen = metadata['max_len'];
    this.vocabularySize = metadata['vocabulary_size'];
    console.log('maxLen = ' + this.maxLen);
    this.wordIndex = metadata['word_index'];
  }

  predict(text) {
    // Convert to lower case and strip common punctuation.
    const inputText =
        text.trim().toLowerCase().replace(/[.,!?;:]/g, '').split(' ');
    // Look up word indices, skipping words the model cannot embed:
    // unknown words and indices outside the vocabulary.
    const indices = inputText.map(word => this.wordIndex[word])
        .filter(index => index !== undefined && index < this.vocabularySize);
    // Keep the last maxLen indices, matching the default truncating='pre'
    // of pad_sequences on the Python side.
    const kept = indices.slice(-this.maxLen);
    const inputBuffer = tf.buffer([1, this.maxLen], 'float32');
    kept.forEach((index, i) => inputBuffer.set(index, 0, i));
    const input = inputBuffer.toTensor();
    //console.log(input);

    status('Running inference');
    const beginMs = performance.now();
    const predictOut = this.model.predict(input);
    //console.log(predictOut.dataSync());
    const score = predictOut.dataSync();//[0];
    predictOut.dispose();
    const endMs = performance.now();

    return {score: score, elapsed: (endMs - beginMs)};
  }
};

async function setup() {
  if (await urlExists(HOSTED_URLS.model)) {
    status('Model available: ' + HOSTED_URLS.model);
    const button = document.getElementById('load-model');
    button.addEventListener('click', async () => {
      const predictor = await new Classifier().init(HOSTED_URLS);
      prepUI(x => predictor.predict(x));
    });
    button.style.display = 'inline-block';
  }

  status('Standing by.');
}

setup();
"""

In [0]:
index_html = """
<!doctype html>

<body>
  <style>
    #text-entry {
      font-size: 120%;
      width: 60%;
      height: 200px;
    }
  </style>
  <h1>
    Title
  </h1>
  <hr>
  <div class="create-model">
    <button id="load-model" style="display:none">Load model</button>
  </div>
  <div>
    <div>
      <span>Vocabulary size: </span>
      <span id="vocabularySize"></span>
    </div>
    <div>
      <span>Max length: </span>
      <span id="maxLen"></span>
    </div>
  </div>
  <hr>
  <div>
    <select id="example-select" class="form-control">
      <option value="example1">Alice's Adventures in Wonderland</option>
      <option value="example2">Carol</option>
      <option value="example3">Dracula</option>
    </select>
  </div>
  <div>
    <textarea id="text-entry"></textarea>
  </div>
  <hr>
  <div>
    <span id="status">Standing by.</span>
  </div>

  <script src='https://cdn.jsdelivr.net/npm/@tensorflow/tfjs/dist/tf.min.js'></script>
  <script src='index.js'></script>
</body>
"""

In [0]:
with open('index.html','w') as f:
  f.write(index_html)
  
with open('index.js','w') as f:
  f.write(index_js)

In [40]:
!ls

alice_book.txt		  dracula_book.txt  index.js
christmas_carol_book.txt  index.html	    model_js
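
Before committing, it can help to confirm that every file the page fetches is actually present. The sketch below checks for the two page files and the two JSON files referenced in `index.js`. `missing_site_files` is a hypothetical helper, and the demonstration runs against a throwaway directory rather than the real site folder.

```python
from pathlib import Path
import tempfile

# Paths are relative to the site folder, as referenced by index.html and index.js.
REQUIRED = ["index.html", "index.js",
            "model_js/model.json", "model_js/metadata.json"]


def missing_site_files(site_dir="."):
    root = Path(site_dir)
    return [name for name in REQUIRED if not (root / name).is_file()]


# Demonstration on a temporary folder that only contains index.html.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "index.html").write_text("<!doctype html>")
    demo_missing = missing_site_files(tmp)
print(demo_missing)
```

In the notebook itself, calling `missing_site_files()` from the site folder should return an empty list once the converter and the file-writing cells have run.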


Commit and push everything. Note: we're storing large binary files in GitHub, which isn't ideal. If you want to deploy a model down the road, it is better to host it in a cloud storage bucket.

In [41]:
!git add . 
!git commit -m "colab -> github"
!git push https://{USER_NAME}:{TOKEN}@github.com/{USER_NAME}/{USER_NAME}.github.io/ master

[master c632df9] colab -> github
 1 file changed, 0 insertions(+), 0 deletions(-)
 rewrite dljsonfly/model_js/group1-shard1of1 (98%)
Counting objects: 5, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (4/4), done.
Writing objects: 100% (5/5), 43.13 KiB | 14.38 MiB/s, done.
Total 5 (delta 1), reused 0 (delta 0)
remote: Resolving deltas: 100% (1/1), completed with 1 local object.
To https://github.com/rahulvasaikar/rahulvasaikar.github.io/
   10f7724..c632df9  master -> master


All done! Hopefully everything worked. You may need to wait a few moments for the changes to appear on your site. If the page is not working, check the JavaScript console for errors (in Chrome: View -> Developer -> JavaScript Console).

In [42]:
print("Now, visit https://%s.github.io/%s/" % (USER_NAME, SITE_NAME))

Now, visit https://rahulvasaikar.github.io/dljsonfly/


If you are debugging and Chrome is failing to pick up your changes, though you've verified they're present in your GitHub repo, see the second answer to: https://superuser.com/questions/89809/how-to-force-refresh-without-cache-in-google-chrome