<a href="https://colab.research.google.com/github/tomdyer10/wine_expert/blob/master/Generating_classification_profiles.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Let's attempt to build a generative model that writes a description of each wine region's characteristics.

I will first build a 'pseudo' GAN: a language model generates many candidate 3-grams, and a region classifier selects the one most applicable to the target class. Repeating this 10 times gives a regional description roughly 30 words long (the runs below use 5 rounds).
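The generate-and-select idea can be sketched without the fastai models. This is a minimal illustration only: `grow_text`, `toy_generate`, `toy_score`, and `VOCAB` are hypothetical stand-ins for the language model and classifier, not part of this notebook or of fastai.

```python
import random

def grow_text(base, generate, score, rounds, candidates_per_round):
    """Greedy generate-and-select: each round keeps the best-scoring extension."""
    for _ in range(rounds):
        # Sample several candidate extensions of the current text.
        candidates = [generate(base) for _ in range(candidates_per_round)]
        # Keep the candidate that the scorer rates highest for the target class.
        base = max(candidates, key=score)
    return base

# Toy stand-ins (assumptions for illustration, not the real models).
VOCAB = ["dry", "crisp", "oaky", "sweet", "floral", "tannic"]
rng = random.Random(0)

def toy_generate(text, n_words=3):
    # Append n_words random words, mimicking a language model sampling a 3-gram.
    return text + " " + " ".join(rng.choice(VOCAB) for _ in range(n_words))

def toy_score(text):
    # Pretend the target class is associated with the words "dry" and "crisp".
    words = text.split()
    return sum(w in ("dry", "crisp") for w in words) / len(words)

description = grow_text("This wine", toy_generate, toy_score,
                        rounds=10, candidates_per_round=20)
print(description)
```

With 10 rounds of 3 words each, the 2-word seed grows to 32 words. Nothing is trained during this loop, and the selection is greedy, which is why the approach is only a 'pseudo' GAN.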

In [0]:
from fastai.text import *
import pandas as pd 
import numpy as np

In [0]:
data_lm = load_data('drive/My Drive/wine_reviews', file='data/region_clas_data_lm')
data_clas = load_data('drive/My Drive/wine_reviews', file='data/region_clas_data_clas')

In [0]:
classifier = text_classifier_learner(data_clas, AWD_LSTM, drop_mult=0.5)
classifier.load('region_classifier/fifth')

Downloading https://s3.amazonaws.com/fast-ai-modelzoo/wt103-fwd


RNNLearner(data=TextClasDataBunch;

Train: LabelList (86085 items)
x: TextList
xxbos xxmaj aromas include tropical fruit , broom , brimstone and dried herb . xxmaj the palate is n't overly expressive , offering unripened apple , citrus and dried sage alongside brisk acidity .,xxbos xxmaj tart and snappy , the flavors of lime flesh and rind dominate . xxmaj some green pineapple pokes through , with crisp acidity underscoring the flavors . xxmaj the wine was all stainless - steel fermented .,xxbos xxmaj pineapple rind , lemon pith and orange blossom start off the aromas . xxmaj the palate is a bit more opulent , with notes of honey - drizzled guava and mango giving way to a slightly astringent , semidry finish .,xxbos xxmaj much like the regular bottling from 2012 , this comes across as rather rough and tannic , with rustic , earthy , herbal characteristics . xxmaj nonetheless , if you think of it as a pleasantly unfussy country wine , it 's a good companion to a hearty winter stew .,xxb

In [0]:
lang_model = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.3)

In [0]:
lang_model.load_encoder('regional_encoder')

LanguageLearner(data=TextLMDataBunch;

Train: LabelList (86085 items)
x: LMTextList
xxbos xxmaj aromas include tropical fruit , broom , brimstone and dried herb . xxmaj the palate is n't overly expressive , offering unripened apple , citrus and dried sage alongside brisk acidity .,xxbos xxmaj tart and snappy , the flavors of lime flesh and rind dominate . xxmaj some green pineapple pokes through , with crisp acidity underscoring the flavors . xxmaj the wine was all stainless - steel fermented .,xxbos xxmaj pineapple rind , lemon pith and orange blossom start off the aromas . xxmaj the palate is a bit more opulent , with notes of honey - drizzled guava and mango giving way to a slightly astringent , semidry finish .,xxbos xxmaj much like the regular bottling from 2012 , this comes across as rather rough and tannic , with rustic , earthy , herbal characteristics . xxmaj nonetheless , if you think of it as a pleasantly unfussy country wine , it 's a good companion to a hearty winter stew 

# Pseudo-GAN

Now that the models are loaded, let's define a simple loop to create our class descriptions.

Each round makes 100 language model predictions, each extending the current text by three words, and keeps the phrase that the classifier scores highest for the selected class.

In [0]:
# name of the selected class
data_clas.classes[0]

'France - Alsace'

In [0]:
base = 'Wine from this region'
rounds = 5
predictions_per_round = 100
target_class_idx = 0

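for i in range(0, rounds):
  # sample candidate texts, each extending the current base by 3 words
  text_preds = [lang_model.predict(base, 3) for x in range(0, predictions_per_round)]
  # probability the classifier assigns to the target class for each candidate
  target_class_preds = [classifier.predict(x)[2][target_class_idx] for x in text_preds]
  # keep the candidate with the highest target-class probability as the new base
  base = [x for y,x in sorted(zip(target_class_preds, text_preds), reverse=True)][0]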

print(base)

Wine from this region , bone dry and bone dry , you 'll try to find Prosecco


The sentence doesn't make much sense, but the output does focus on dry wines. A quick check of the Wikipedia page for the Alsace region suggests that it mainly produces dry white wines.

In [0]:
classifier.predict(base)

(Category France - Alsace,
 tensor(0),
 tensor([9.9717e-01, 5.5278e-08, 5.7698e-09, 7.3604e-13, 9.5528e-08, 5.0648e-09,
         1.3603e-16, 8.5859e-10, 2.4294e-10, 1.5693e-14, 3.0071e-08, 3.9155e-07,
         1.9107e-06, 1.1499e-08, 3.0257e-08, 1.5858e-08, 3.0872e-09, 6.8109e-09,
         1.7725e-08, 1.1407e-03, 1.6897e-03, 1.3916e-07, 3.0922e-12, 6.0380e-13,
         6.7875e-08, 4.5105e-14, 2.6126e-07]))
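To see which other classes received any probability mass, the probabilities can be ranked. The sketch below uses plain Python with the values copied from the output above; `top_k` is a hypothetical helper name, not a fastai function.

```python
# Class probabilities copied from the classifier output above (27 classes).
probs = [
    9.9717e-01, 5.5278e-08, 5.7698e-09, 7.3604e-13, 9.5528e-08, 5.0648e-09,
    1.3603e-16, 8.5859e-10, 2.4294e-10, 1.5693e-14, 3.0071e-08, 3.9155e-07,
    1.9107e-06, 1.1499e-08, 3.0257e-08, 1.5858e-08, 3.0872e-09, 6.8109e-09,
    1.7725e-08, 1.1407e-03, 1.6897e-03, 1.3916e-07, 3.0922e-12, 6.0380e-13,
    6.7875e-08, 4.5105e-14, 2.6126e-07,
]

def top_k(values, k):
    """Return the indices of the k largest values, largest first."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]

for idx in top_k(probs, 3):
    print(idx, probs[idx])
```

The three largest entries are at indices 0, 20, and 19. Index 0 is 'France - Alsace' and index 20 is 'US - California' according to the `data_clas.classes` lookups in this notebook. On the actual tensor returned by `classifier.predict`, PyTorch's `topk` method should give the same ranking.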

In [0]:
base = 'This wine'
rounds = 5
predictions_per_round = 50
target_class_idx = 7

for i in range(0, rounds):
  text_preds = [lang_model.predict(base, 3) for x in range(0, predictions_per_round)]
  target_class_preds = [classifier.predict(x)[2][target_class_idx] for x in text_preds]
  base = [x for y,x in sorted(zip(target_class_preds, text_preds), reverse=True)][0]

print(base)

This wine , highly aromatic , with all five Xxbos Associated point Nouveau


In [0]:
data_clas.classes[7]

'France - Loire Valley'

In [0]:
base = 'This wine'
rounds = 5
predictions_per_round = 50
target_class_idx = 20

for i in range(0, rounds):
  text_preds = [lang_model.predict(base, 3) for x in range(0, predictions_per_round)]
  target_class_preds = [classifier.predict(x)[2][target_class_idx] for x in text_preds]
  base = [x for y,x in sorted(zip(target_class_preds, text_preds), reverse=True)][0]

print(base)

This wine consistently produces basic Satisfy , while Lodi , Monterey


In [0]:
data_clas.classes[20]

'US - California'

This one picks up two California place names, Lodi and Monterey.