# Intro
Load the trained *Wine Spectator* Top 100 text classifier model and test it on sample wine reviews.
Reference: [Dataset Splitting Best Practices in Python](https://www.kdnuggets.com/2020/05/dataset-splitting-best-practices-python.html)

# Load and test the *Wine Spectator* Top 100 model

## File setup

In [1]:
import os
from flask import Flask, render_template, url_for, flash, redirect, request
# ML Imports
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras import losses
from tensorflow.keras import preprocessing
from tensorflow.keras.layers.experimental.preprocessing import TextVectorization
from tensorflow.keras.models import load_model
# Used by custom_standardization (text vectorization workaround)
import re
import string

In [2]:
# Print the TensorFlow version; this notebook expects at least 2.4.0
print(tf.__version__)

2.4.1
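
The cell above only prints the version, so the comparison against 2.4.0 is left to the reader. A minimal sketch of an explicit check follows, assuming plain `major.minor.patch` version strings. The helper name `version_at_least` is hypothetical and is not part of TensorFlow.

```python
def version_at_least(version, minimum):
    """Return True if a dotted version string is >= minimum.

    Only the leading digits of the first three components are compared,
    so a suffix such as "-rc1" is ignored (an assumption of this sketch).
    """
    def to_tuple(v):
        parts = []
        for piece in v.split(".")[:3]:
            digits = ""
            for ch in piece:
                if not ch.isdigit():
                    break
                digits += ch
            parts.append(int(digits or 0))
        # Pad short versions such as "2.4" to three components.
        while len(parts) < 3:
            parts.append(0)
        return tuple(parts)

    return to_tuple(version) >= to_tuple(minimum)


# "2.4.1" is the version printed above; in the notebook, pass tf.__version__.
print(version_at_least("2.4.1", "2.4.0"))
print(version_at_least("2.3.2", "2.4.0"))
```

In the notebook this could be used as `assert version_at_least(tf.__version__, "2.4.0")` so that an older install fails early instead of at model load time.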


In [3]:
app = Flask(__name__)

## Load model

In [4]:
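# Text vectorization workaround; see
# https://github.com/tensorflow/tensorflow/issues/45231
# Registering the function lets Keras look it up by name when the saved
# model is loaded.
@tf.keras.utils.register_keras_serializable()
def custom_standardization(input_data):
    """Lowercase the input strings and strip ASCII punctuation."""
    lowercase = tf.strings.lower(input_data)
    # string.punctuation is escaped so it can be used as a regex character class
    return tf.strings.regex_replace(lowercase,
                                    '[%s]' % re.escape(string.punctuation),
                                    '')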
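
To see what this standardization does to a review without running TensorFlow, here is a pure-Python analogue. It is a sketch, not the function the model uses: the name `standardize_py` is hypothetical, and Python's `str.lower()` is Unicode-aware, so the two are only expected to agree on ASCII text.

```python
import re
import string

# Same character class that custom_standardization passes to regex_replace.
PUNCTUATION_PATTERN = '[%s]' % re.escape(string.punctuation)


def standardize_py(text):
    """Pure-Python analogue: lowercase, then delete ASCII punctuation."""
    return re.sub(PUNCTUATION_PATTERN, '', text.lower())


print(standardize_py("Drink by 2040. ANNE KREBIEHL MW"))
print(standardize_py("green-tinged ripeness"))
# \u2013 is an en dash, which is not in string.punctuation and so is kept.
print(standardize_py("Best from 2028\u20132038."))
```

Hyphenated words are merged ("green-tinged" becomes "greentinged"), while non-ASCII punctuation such as the en dash in a year range passes through unchanged.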

In [5]:
# Load the saved model from the local ./model directory
# basedir = os.path.abspath(os.path.dirname(__file__))
wine_classifier = tf.keras.models.load_model('./model/wine_classifier')
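
The relative path `./model/wine_classifier` depends on the current working directory, and the commented-out `basedir` line relies on `__file__`, which is not defined inside a notebook. A minimal sketch of building an absolute path from an explicit base directory follows. The helper name `resolve_model_dir` is hypothetical, and using `os.getcwd()` as the base is an assumption.

```python
import os


def resolve_model_dir(base_dir, *parts):
    """Join base_dir with the given path parts and return an absolute path."""
    return os.path.abspath(os.path.join(base_dir, *parts))


# Assumption: the notebook is started from the directory that contains model/.
model_dir = resolve_model_dir(os.getcwd(), "model", "wine_classifier")
print(model_dir)
```

The resulting `model_dir` could then be passed to `tf.keras.models.load_model` in place of the relative path.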

In [6]:
print(wine_classifier)

<tensorflow.python.keras.engine.sequential.Sequential object at 0x7fe0b83ae518>


In [7]:
# summary() prints the table itself and returns None, so no print() is needed
wine_classifier.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
text_vectorization (TextVect (None, 50)                0         
_________________________________________________________________
sequential (Sequential)      (None, 6)                 160118    
_________________________________________________________________
activation (Activation)      (None, 6)                 0         
Total params: 160,118
Trainable params: 160,118
Non-trainable params: 0
_________________________________________________________________


In [8]:
sample_text = ["Dusty plum and berry aromas are generic and easygoing. A plump and jammy palate is chunky and a bit scratchy, while this everyday Malbec tastes of saucy plum and berry fruits prior to a finish that leaves lighter raspberry and red plum notes. (Credit: Wine Enthusiast)"]

In [9]:
examples = [
    # https://www.winemag.com/buying-guide/far-niente-2018-cabernet-sauvignon-napa-valley/
    'This memorable Cabernet Sauvignon has small amounts of Petit Verdot, Cabernet Franc, Merlot and Malbec blended in. Aged 17 months in a majority of new French oak, it unfurls flavors of red fruit, dried herb and clove over a core of youthful tannin and spicy oak. Best from 2028–2038.',
    # https://www.winemag.com/buying-guide/wohlmuth-2018-ried-hochsteinriegl-sauvignon-blanc-sudsteiermark/
    'An initial hint of crushed ivy and citrus leaf peeks through on the nose. The palate then shows green-tinged ripeness, as if a juicy Mirabelle were spritzed with lime. All is bedded on a light-footed yet profound palate. It offers a gorgeous combination of smoothness and freshness. Drink by 2040. ANNE KREBIEHL MW',
    # https://www.winemag.com/buying-guide/g-h-mumm-2013-brut-millesime-champagne/
    'A Pinot Noir-dominated Champagne, this is richly textured and showing attractive signs of maturity. The toastiness is balanced by crispness with a tangy lemon flavor that broadens into ripe apples. Drink this seductive wine now. ROGER VOSS'
]

In [10]:
wine_classifier.predict(examples)

array([[0.5006297 , 0.73010963, 0.5000723 , 0.500203  , 0.50028676,
        0.5000135 ],
       [0.53989583, 0.5466102 , 0.5098959 , 0.5355584 , 0.6121535 ,
        0.5036642 ],
       [0.51753193, 0.5353362 , 0.5044297 , 0.52936447, 0.6566458 ,
        0.501142  ]], dtype=float32)

In [11]:
wine_classifier.predict(sample_text)

array([[0.5003417 , 0.73068464, 0.5000357 , 0.5000711 , 0.50001603,
        0.5000107 ]], dtype=float32)
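
Each row of the output holds one score per class, six in total, and the rows do not sum to 1, so the values read as independent per-class scores and not as a probability distribution. A minimal sketch of picking the top-scoring class per review follows, using the scores copied from the outputs above. The helper name `top_class` is hypothetical.

```python
def top_class(scores):
    """Return (index, score) of the highest-scoring class in one row."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best, scores[best]


# Scores copied from the predict() outputs above.
example_scores = [
    [0.5006297, 0.73010963, 0.5000723, 0.500203, 0.50028676, 0.5000135],
    [0.53989583, 0.5466102, 0.5098959, 0.5355584, 0.6121535, 0.5036642],
    [0.51753193, 0.5353362, 0.5044297, 0.52936447, 0.6566458, 0.501142],
]
sample_scores = [0.5003417, 0.73068464, 0.5000357, 0.5000711, 0.50001603, 0.5000107]

for row in example_scores + [sample_scores]:
    index, score = top_class(row)
    print(index, round(score, 3))
```

On the NumPy array returned by `predict`, `np.argmax(predictions, axis=1)` gives the same indices. Mapping an index to a class name requires the label order used during training, which this notebook does not show.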