To run this model directly in the browser with zero setup, open it in [Colab here](https://colab.research.google.com/github/sararob/keras-wine-model/blob/master/predictions.ipynb).

In [0]:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

In [0]:
import itertools
import os
import math
import numpy as np
import pandas as pd
import tensorflow as tf
import pickle

from sklearn.preprocessing import LabelEncoder
from tensorflow import keras
layers = keras.layers

# This code was tested with TensorFlow v1.7
print("You have TensorFlow version", tf.__version__)

You have TensorFlow version 1.7.0


In [0]:
# Load our model
!wget 'https://storage.googleapis.com/keras_wine/final_wine_model.h5'
model = keras.models.load_model('final_wine_model.h5')

--2018-05-09 18:32:50--  https://storage.googleapis.com/keras_wine/final_wine_model.h5
Resolving storage.googleapis.com (storage.googleapis.com)... 74.125.197.128, 2607:f8b0:400e:c03::80
Connecting to storage.googleapis.com (storage.googleapis.com)|74.125.197.128|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 38208016 (36M) [application/octet-stream]
Saving to: ‘final_wine_model.h5.2’


2018-05-09 18:32:51 (124 MB/s) - ‘final_wine_model.h5.2’ saved [38208016/38208016]



In [0]:
# Load our vocabulary tokenizer and variety encoder
!wget 'https://storage.googleapis.com/keras_wine/word_tokenizer.p'
with open('word_tokenizer.p', 'rb') as f:
    tokenizer = pickle.load(f)

!wget 'https://storage.googleapis.com/keras_wine/variety_encoder.p'
with open('variety_encoder.p', 'rb') as f:
    encoder = pickle.load(f)

--2018-05-09 18:32:53--  https://storage.googleapis.com/keras_wine/word_tokenizer.p
Resolving storage.googleapis.com (storage.googleapis.com)... 74.125.197.128, 2607:f8b0:400e:c03::80
Connecting to storage.googleapis.com (storage.googleapis.com)|74.125.197.128|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1235247 (1.2M) [application/octet-stream]
Saving to: ‘word_tokenizer.p.2’


2018-05-09 18:32:53 (66.0 MB/s) - ‘word_tokenizer.p.2’ saved [1235247/1235247]

--2018-05-09 18:32:54--  https://storage.googleapis.com/keras_wine/variety_encoder.p
Resolving storage.googleapis.com (storage.googleapis.com)... 74.125.197.128, 2607:f8b0:400e:c03::80
Connecting to storage.googleapis.com (storage.googleapis.com)|74.125.197.128|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1029 (1.0K) [application/octet-stream]
Saving to: ‘variety_encoder.p.1’


2018-05-09 18:32:54 (15.2 MB/s) - ‘variety_encoder.p.1’ saved [1029/1029]





In [0]:
# Let's make predictions on some raw data

# Enter wine descriptions here
test_descriptions = [
    'From 18-year-old vines, this supple well-balanced effort blends flavors of mocha, cherry, vanilla and breakfast tea. Superbly integrated and delicious even at this early stage, this wine seems destined for a long and savory cellar life. Drink now through 2028.',
    'The Quarts de Chaume, the four fingers of land that rise above the Layon Valley, are one of the pinnacles of sweet wines in the Loire. Showing botrytis and layers of dryness over the honey and peach jelly flavors, but also has great freshness. The aftertaste just lasts.',
    'Nicely oaked blackberry, licorice, vanilla and charred aromas are smooth and sultry. This is an outstanding wine from an excellent year. Forward barrel-spice and mocha flavors adorn core blackberry and raspberry fruit, while this runs long and tastes vaguely chocolaty on the velvety finish. Enjoy this top-notch Tempranillo through 2030.',
    'Bright, light oak shadings dress up this medium-bodied wine, complementing the red cherry and strawberry flavors. Its fresh, fruity and not very tannic—easy to drink and enjoy.',
    'This wine features black cherry, blackberry, blueberry with aromas of black licorice and earth. Ending with a creamy vanilla finish.'
]

# Enter the corresponding varieties here
test_varieties = [
    'Pinot Noir',
    'Chenin Blanc',
    'Tempranillo',
    'Sauvignon Blanc',
    'Syrah'
]

# Enter the corresponding prices here (the labels the model's predictions are compared against)
labels = [
    48,
    152,
    80,
    10,
    23
]

In [0]:
# Vocab and variety lookup
vocab_lookup = tokenizer.word_index
first_20_words = {k: vocab_lookup[k] for k in list(vocab_lookup)[:20]}
print("Sample vocab\n", first_20_words, "\n")
print("Variety encoder\n", encoder.classes_, "\n")

Sample vocab
 {'and': 1, 'the': 2, 'a': 3, 'of': 4, 'with': 5, 'this': 6, 'is': 7, 'flavors': 8, 'wine': 9, 'in': 10, 'to': 11, 'it': 12, 'fruit': 13, 'but': 14, 'on': 15, 'that': 16, "it's": 17, 'finish': 18, 'cherry': 19, 'aromas': 20} 

Variety encoder
 ['Albariño' 'Barbera' 'Bordeaux-style Red Blend'
 'Bordeaux-style White Blend' 'Cabernet Franc' 'Cabernet Sauvignon'
 'Carmenère' 'Champagne Blend' 'Chardonnay' 'Chenin Blanc'
 'Corvina, Rondinella, Molinara' 'Gewürztraminer' 'Grenache'
 'Grüner Veltliner' 'Malbec' 'Merlot' 'Nebbiolo' 'Petite Sirah'
 'Pinot Grigio' 'Pinot Gris' 'Pinot Noir' 'Port' 'Portuguese Red'
 'Portuguese White' 'Prosecco' 'Red Blend' 'Rhône-style Red Blend'
 'Riesling' 'Rosé' 'Sangiovese' 'Sangiovese Grosso' 'Sauvignon Blanc'
 'Shiraz' 'Sparkling Blend' 'Syrah' 'Tempranillo' 'Tempranillo Blend'
 'Viognier' 'White Blend' 'Zinfandel'] 
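
The two lookups printed above can be pictured with a small pure-Python sketch. It is not the Keras or scikit-learn implementation, only an illustration under simplifying assumptions: words are split on whitespace with no punctuation filtering, and the helper names `build_word_index` and `build_class_index` are hypothetical. It shows why frequent words such as 'and' get the lowest ids (ranks start at 1) and why the variety list comes out sorted.

```python
from collections import Counter


def build_word_index(texts):
    """Rank words by frequency; the most frequent word gets id 1."""
    counts = Counter()
    for text in texts:
        # Simplification: lowercase and split on whitespace only.
        counts.update(text.lower().split())
    # most_common() orders by descending count; ids start at 1, leaving 0 free.
    return {word: rank
            for rank, (word, _) in enumerate(counts.most_common(), start=1)}


def build_class_index(labels):
    """Map each distinct label to its position in sorted order."""
    return {label: i for i, label in enumerate(sorted(set(labels)))}


# Toy inputs, made up for illustration.
toy_texts = ["cherry and vanilla", "cherry and oak and spice"]
word_index = build_word_index(toy_texts)
class_index = build_class_index(["Syrah", "Pinot Noir", "Chenin Blanc", "Syrah"])

print(word_index)
print(class_index)
```

Because id 0 is never assigned to a word, it stays available as a padding value for the sequence inputs.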



In [0]:
# Wide model features
bow_description = tokenizer.texts_to_matrix(test_descriptions)
variety = encoder.transform(test_varieties)
variety = keras.utils.to_categorical(variety, len(encoder.classes_))

# Print an example for the model inputs
print("Bag of words matrix")
print(bow_description[0], "\n")
print("Variety matrix")
print(variety[0], "\n")

Bag of words matrix
[0. 1. 0. ... 0. 0. 0.] 

Variety matrix
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] 
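
To make the two printed rows easier to read, here is a rough pure-Python sketch of what a binary bag-of-words row and a one-hot row contain. The helpers `binary_bag_of_words` and `one_hot` are hypothetical stand-ins, not the Keras functions, and the token ids and vocabulary size are toy values. The class id 20 and class count 40 match the 'Pinot Noir' row above.

```python
def binary_bag_of_words(token_ids, vocab_size):
    """Return a row with 1.0 at every word id that occurs at least once."""
    row = [0.0] * vocab_size
    for idx in token_ids:
        # Column 0 is never set because word ids start at 1.
        if 0 < idx < vocab_size:
            row[idx] = 1.0
    return row


def one_hot(class_id, num_classes):
    """Return a row of zeros with a single 1.0 at class_id."""
    row = [0.0] * num_classes
    row[class_id] = 1.0
    return row


# Toy description containing 'and' (1) twice, 'cherry' (19) and 'flavors' (8).
bow_row = binary_bag_of_words([1, 19, 1, 8], vocab_size=21)
# 'Pinot Noir' is class 20 of the 40 classes listed by the encoder.
variety_row = one_hot(20, num_classes=40)

print(bow_row)
print(variety_row)
```

Repeated words do not change a binary row, so word order and word counts are both discarded in this representation.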



In [0]:
# Deep model feature: word embeddings of wine descriptions
embed_description = tokenizer.texts_to_sequences(test_descriptions)
embed_description = keras.preprocessing.sequence.pad_sequences(
    embed_description, maxlen=170, padding="post")

print(embed_description[0])

[  25 1475  284  347  504    6  344   55   85  552 1007    8    4  226
   19   52    1 3614  360 2997  476    1  150  232   60    6  887  871
    6    9  218 4282   21    3   86    1  245  318 1282   32   39   80
 7083    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0]
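
The printed row is the first description's 43 word ids followed by zeros up to length 170. A minimal sketch of that post-padding step is shown below. `pad_post` is a hypothetical helper, not the Keras function; it assumes `maxlen` is positive and mirrors the Keras default of dropping ids from the front when a sequence is longer than `maxlen`.

```python
def pad_post(sequence, maxlen, value=0):
    """Pad with `value` at the end; keep only the last `maxlen` ids if too long."""
    trimmed = sequence[-maxlen:]
    return trimmed + [value] * (maxlen - len(trimmed))


short = pad_post([25, 1475, 284], maxlen=6)
long_seq = pad_post([1, 2, 3, 4, 5], maxlen=3)

print(short)
print(long_seq)
```

Padding every description to the same length lets the batch be stacked into one rectangular array for the embedding input.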


In [0]:
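# Input order: the wide features (bag of words, one-hot variety), then the deep feature (padded word ids)
predictions = model.predict([bow_description, variety, embed_description])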

In [0]:
# Each prediction is a one-element array, so take its first entry as the price
for description, prediction, actual in zip(test_descriptions, predictions, labels):
    print(description)
    print('Predicted: ', prediction[0], 'Actual: ', actual, '\n')

From 18-year-old vines, this supple well-balanced effort blends flavors of mocha, cherry, vanilla and breakfast tea. Superbly integrated and delicious even at this early stage, this wine seems destined for a long and savory cellar life. Drink now through 2028.
Predicted:  46.476532 Actual:  48 

The Quarts de Chaume, the four fingers of land that rise above the Layon Valley, are one of the pinnacles of sweet wines in the Loire. Showing botrytis and layers of dryness over the honey and peach jelly flavors, but also has great freshness. The aftertaste just lasts.
Predicted:  117.74728 Actual:  152 

Nicely oaked blackberry, licorice, vanilla and charred aromas are smooth and sultry. This is an outstanding wine from an excellent year. Forward barrel-spice and mocha flavors adorn core blackberry and raspberry fruit, while this runs long and tastes vaguely chocolaty on the velvety finish. Enjoy this top-notch Tempranillo through 2030.
Predicted:  90.468124 Actual:  80 

Bright, light oak sh