### Latent Dirichlet Allocation (LDA): model selection and evaluation

Modified from [Evaluate Topic Models: Latent Dirichlet Allocation (LDA)](https://towardsdatascience.com/evaluate-topic-model-in-python-latent-dirichlet-allocation-lda-7d57484bb5d0).

LDA is used to cluster documents in particular topics. It builds a topic per document model and words per topic model, modeled as Dirichlet distributions. 

* Each document is modeled as a multinomial distribution of topics and each topic is modeled as a multinomial distribution of words.
* LDA assumes that the every chunk of text we feed into it will contain words that are somehow related. Therefore choosing the right corpus of data is crucial. 
* It also assumes documents are produced from a mixture of topics. Those topics then generate words based on their probability distribution. 

We will use [Gensim](https://radimrehurek.com/gensim/) for topic modelling with LDA. Gensim is a popular open-source library for unsupervised topic modeling and natural language processing.

Gensim is implemented in Python and Cython, the latter for improved performance. Gensim is designed to handle large text collections using data streaming and incremental online algorithms, which differentiates it from most other machine learning software packages that target only in-memory processing. 

**We will evaluate and select topic models using a measure of topic coherence.**

### Why evaluate topic models?

Probabilistic topic models, such as LDA, are popular tools for text analysis, providing both a predictive and latent topic representation of the corpus. There is a longstanding assumption that the latent space discovered by these models is meaningful and useful, but evaluating such assumptions is challenging due to its unsupervised training process. Since there is a no-gold standard list of topics to compare against every corpus, we cannot compute the normal performance metrics we do for supervised learning.

However, it is still critically important to identify if a trained model is objectively good or bad, as well to compare different models/methods, and to do so, we need a quality measure. While implicit knowledge and "eyeballing" are popular, they are not objective approaches that can be applied systematically. We need something better, that perferably captures the model's quality in a single metric that can be maximized and compared. 

These are the commonly used approaches for evaluation:

**Eye Balling Models**
- Top N words
- Topics / Documents

**Intrinsic Evaluation Metrics**
- Capturing model semantics
- Topics interpretability

**Human Judgements**
- [What is a topic?](https://proceedings.neurips.cc/paper/2009/file/f92586a25bb3145facd64ab20fd554ff-Paper.pdf)

**Extrinsic Evaluation Metrics/Evaluation at task**
- Is model good at performing predefined tasks, such as clustering 

A big problem is that natural language is messy, ambiguous and full of subjective interpretation, and sometimes trying to de-ambiguite reduces the language to an unnatural form. Nevertheless, in order to use a single quality metric we will have to accept such risks.

### What is Topic Coherence?

Perplexity is often used as an example of an intrinsic evaluation measure. It comes from the language modeling community and aims to capture how surprised a model is of new data it has not seen before. It is measured as the normalized log-likelihood of a held-out test set.

Focussing on the log-likelihood part, you can think of the perplexity metric as measuring how probable some new unseen data is given the model that was learned earlier. That is to say, how well does the model represent or reproduce the statistics of the held-out data.

[However, past research has shown that predictive likelihood (or equivalently, perplexity) and human judgment are often not correlated, and even sometimes slightly anti-correlated.](http://qpleple.com/perplexity-to-evaluate-topic-models/) And that served as a motivation for more work trying to model the human judgment, and thus `Topic Coherence`.

The topic coherence concept combines a number of papers into one framework that allows evaluating the coherence of topics inferred by a topic model. But,

#### What is topic coherence?
Topic Coherence measures score a single topic by measuring the degree of semantic similarity between high scoring words in the topic. These measurements help distinguish between topics that are semantically interpretable topics and topics that are artifacts of statistical inference. But,

#### What is coherence?
A set of statements or facts is said to be coherent, if they support each other. Thus, a coherent fact set can be interpreted in a context that covers all or most of the facts. An example of a coherent fact set is "the game is a team sport", "the game is played with a ball", "the game demands great physical efforts"

### Coherence Measures

1. `C_v` measure is based on a sliding window, one-set segmentation of the top words and an indirect confirmation measure that uses normalized pointwise mutual information (NPMI) and the cosine similarity. See explainations in the paper by [Syed and Spruit](http://www.saf21.eu/wp-content/uploads/2017/09/5004a165.pdf) the [paper by Röder et al](https://svn.aksw.org/papers/2015/WSDM_Topic_Evaluation/public.pdf) and in [this blog post](https://towardsdatascience.com/c%E1%B5%A5-topic-coherence-explained-fc70e2a85227).
2. `C_p` is based on a sliding window, one-preceding segmentation of the top words and the confirmation measure of Fitelson's coherence. See explainations in the [paper by Röder et al](https://svn.aksw.org/papers/2015/WSDM_Topic_Evaluation/public.pdf).
3. [`C_uci` measure is based on a sliding window and the pointwise mutual information (PMI) of all word pairs of the given top words.](http://svn.aksw.org/papers/2015/WSDM_Topic_Evaluation/public.pdf)
4. [`C_umass` is based on document cooccurrence counts, a one-preceding segmentation and a logarithmic conditional probability as confirmation measure.](http://svn.aksw.org/papers/2015/WSDM_Topic_Evaluation/public.pdf)
5. [`C_npmi` is an enhanced version of the C_uci coherence using the normalized pointwise mutual information (NPMI).](http://svn.aksw.org/papers/2015/WSDM_Topic_Evaluation/public.pdf)
6. `C_a` is baseed on a context window, a pairwise comparison of the top words and an indirect confirmation measure that uses normalized pointwise mutual information (NPMI) and the cosine similarity.



### Model Implementation
1. Loading Data
2. Data Cleaning
3. Phrase Modeling: Bi-grams and Tri-grams
4. Data Transformation: Corpus and Dictionary
5. Base Model
6. Hyper-parameter Tuning
7. Final model
8. Visualize Results

** **

For this tutorial, we’ll use the dataset of papers published in NeurIPS (NIPS) conference which is one of the most prestigious yearly events in the machine learning community. The CSV data file contains information on the different NeurIPS papers that were published from 1987 until 2016 (29 years!). These papers discuss a wide variety of topics in machine learning, from neural networks to optimization methods, and many more.

<img src="https://s3.amazonaws.com/assets.datacamp.com/production/project_158/img/nips_logo.png" alt="The logo of NIPS (Neural Information Processing Systems)">

Let’s start by looking at the content of the file

** **
#### Step 0: Install the latest version of Gensim
** **

An old (and buggy) version of Gensim is installed by default on Google Colab. Please upgrade to the latest version using the command above and restart the runtime so that the new version is loaded.

While we're at it, let's also install [pyLDAvis](https://pyldavis.readthedocs.io/en/latest/readme.html#:~:text=pyLDAvis%20is%20designed%20to%20help,an%20interactive%20web%2Dbased%20visualization.), a package that helps users interpret the topics in a topic model. The package extracts information from an LDA topic model to inform an interactive web-based visualization.

In [18]:
!pip install -U gensim pyldavis




[notice] A new release of pip available: 22.2.2 -> 23.2.1
[notice] To update, run: python.exe -m pip install --upgrade pip


In [19]:
import gensim
from packaging import version
assert version.parse("3.7") < version.parse(gensim.__version__)

** **
#### Step 1: Loading Data
** **

In [20]:
# from google.colab import files

# uploaded = files.upload()

# for fn in uploaded.keys():
#   print('User uploaded file "{name}" with length {length} bytes'.format(
#       name=fn, length=len(uploaded[fn])))

In [21]:
# Importing modules
import pandas as pd
import os

# Read data into papers
# bigger data files on canvas to get a better performing model
papers = pd.read_csv(R"C:\Users\darre\Downloads\papers_100.csv")

# Print head
papers.head()

Unnamed: 0.1,Unnamed: 0,paper_text
0,3081,Localizing Bugs in Program Executions\nwith Gr...
1,6184,Combinatorial Energy Learning for Image\nSegme...
2,4615,A multi-agent control framework for co-adaptat...
3,3936,Distributed Non-Stochastic Experts\n\nVarun Ka...
4,1986,Learning Rankings via Convex Hull Separation\n...


** **
#### Step 2: Data Cleaning
** **

Since the goal of this analysis is to perform topic modeling, we will solely focus on the text data from each paper, and drop other metadata columns

In [22]:
# Remove the columns
papers = papers.drop(columns=['Unnamed: 0'], axis=1)

# sample only 100 papers
# papers = papers.sample(100)

# Print out the first rows of papers
papers.head()

Unnamed: 0,paper_text
0,Localizing Bugs in Program Executions\nwith Gr...
1,Combinatorial Energy Learning for Image\nSegme...
2,A multi-agent control framework for co-adaptat...
3,Distributed Non-Stochastic Experts\n\nVarun Ka...
4,Learning Rankings via Convex Hull Separation\n...


##### Remove punctuation/lower casing

Next, let’s perform a simple preprocessing on the content of paper_text column to make them more amenable for analysis, and reliable results. To do that, we’ll use a regular expression to remove any punctuation, and then lowercase the text

In [23]:
# Load the regular expression library
import re

# Remove punctuation
papers['paper_text_processed'] = papers['paper_text'].map(lambda x: re.sub('[,\.!?]', '', x))

# Convert the titles to lowercase
papers['paper_text_processed'] = papers['paper_text_processed'].map(lambda x: x.lower())

# Print out the first rows of papers
papers['paper_text_processed'].head()

0    localizing bugs in program executions\nwith gr...
1    combinatorial energy learning for image\nsegme...
2    a multi-agent control framework for co-adaptat...
3    distributed non-stochastic experts\n\nvarun ka...
4    learning rankings via convex hull separation\n...
Name: paper_text_processed, dtype: object

In [24]:
papers.head()

Unnamed: 0,paper_text,paper_text_processed
0,Localizing Bugs in Program Executions\nwith Gr...,localizing bugs in program executions\nwith gr...
1,Combinatorial Energy Learning for Image\nSegme...,combinatorial energy learning for image\nsegme...
2,A multi-agent control framework for co-adaptat...,a multi-agent control framework for co-adaptat...
3,Distributed Non-Stochastic Experts\n\nVarun Ka...,distributed non-stochastic experts\n\nvarun ka...
4,Learning Rankings via Convex Hull Separation\n...,learning rankings via convex hull separation\n...


##### Tokenize words and further clean-up text

Let’s tokenize each sentence into a list of words, removing punctuations and unnecessary characters altogether.

In [25]:
import gensim
from gensim.utils import simple_preprocess

def sent_to_words(sentences):
    for sentence in sentences:
        yield(gensim.utils.simple_preprocess(str(sentence), deacc=True))  # deacc=True removes punctuations

data = papers.paper_text_processed.values.tolist()
data_words = list(sent_to_words(data))

print(data_words[:1][0][:30])

['localizing', 'bugs', 'in', 'program', 'executions', 'with', 'graphical', 'models', 'valentin', 'dallmeier', 'saarland', 'university', 'saarbruecken', 'germany', 'dallmeier', 'csuni', 'saarlandde', 'laura', 'dietz', 'max', 'planck', 'institute', 'for', 'computer', 'science', 'saarbruecken', 'germany', 'dietz', 'mpi', 'infmpgde']


** **
#### Step 3: Phrase Modeling: Bigram and Trigram Models
** **

Bigrams are two words frequently occurring together in the document. Trigrams are 3 words frequently occurring. Some examples in our example are: 'back_bumper', 'oil_leakage', 'maryland_college_park' etc.

Gensim's Phrases model can build and implement the bigrams, trigrams, quadgrams and more. The two important arguments to Phrases are min_count and threshold.

In [26]:
# Build the bigram and trigram models
bigram = gensim.models.Phrases(data_words, min_count=5, threshold=100) # higher threshold fewer phrases.
trigram = gensim.models.Phrases(bigram[data_words], threshold=100)  

# Faster way to get a sentence clubbed as a trigram/bigram
bigram_mod = gensim.models.phrases.Phraser(bigram)
trigram_mod = gensim.models.phrases.Phraser(trigram)

#### Remove Stopwords, Make Bigrams and Lemmatize

The phrase models are ready. Let’s define the functions to remove the stopwords, make trigrams and lemmatization and call them sequentially.

In [27]:
# NLTK Stop words
!pip install nltk
import nltk
nltk.download('stopwords')
from nltk.corpus import stopwords

stop_words = stopwords.words('english')
stop_words.extend(['from', 'subject', 're', 'edu', 'use'])




[notice] A new release of pip available: 22.2.2 -> 23.2.1
[notice] To update, run: python.exe -m pip install --upgrade pip
[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\darre\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


In [28]:
# Define functions for stopwords, bigrams, trigrams and lemmatization
def remove_stopwords(texts):
    return [[word for word in simple_preprocess(str(doc)) if word not in stop_words] for doc in texts]

def make_bigrams(texts):
    return [bigram_mod[doc] for doc in texts]

def make_trigrams(texts):
    return [trigram_mod[bigram_mod[doc]] for doc in texts]

def lemmatization(texts, allowed_postags=['NOUN', 'ADJ', 'VERB', 'ADV']):
    """https://spacy.io/api/annotation"""
    texts_out = []
    for sent in texts:
        doc = nlp(" ".join(sent)) 
        texts_out.append([token.lemma_ for token in doc if token.pos_ in allowed_postags])
    return texts_out

Let's call the functions in order.

But first download `en_core_web_sm`, a pre-trained English Spacy pipeline optimized for CPU.

In [29]:
!python -m spacy download en_core_web_sm

c:\Users\darre\AppData\Local\Programs\Python\Python310\python.exe: No module named spacy


In [31]:
!pip install spacy
# strange error, library might need a complicated install
# unchanged code should just work in colab
import spacy

# Remove Stop Words
data_words_nostops = remove_stopwords(data_words)

# Form Bigrams
data_words_bigrams = make_bigrams(data_words_nostops)

# Initialize spacy 'en' model, keeping only tagger component (for efficiency)
nlp = spacy.load("en_core_web_sm", disable=['parser', 'ner'])

# Do lemmatization keeping only noun, adj, vb, adv
data_lemmatized = lemmatization(data_words_bigrams, allowed_postags=['NOUN', 'ADJ', 'VERB', 'ADV'])

print(data_lemmatized[:1][0][:30])

Collecting spacy
  Downloading spacy-3.7.0-cp310-cp310-win_amd64.whl (12.1 MB)
     ---------------------------------------- 12.1/12.1 MB 8.7 MB/s eta 0:00:00
Collecting murmurhash<1.1.0,>=0.28.0
  Downloading murmurhash-1.0.10-cp310-cp310-win_amd64.whl (25 kB)
Collecting spacy-legacy<3.1.0,>=3.0.11
  Downloading spacy_legacy-3.0.12-py2.py3-none-any.whl (29 kB)
Collecting typer<0.10.0,>=0.3.0
  Downloading typer-0.9.0-py3-none-any.whl (45 kB)
     ---------------------------------------- 45.9/45.9 kB 2.4 MB/s eta 0:00:00
Collecting pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4
  Downloading pydantic-2.4.2-py3-none-any.whl (395 kB)
     -------------------------------------- 395.8/395.8 kB 8.2 MB/s eta 0:00:00
Collecting thinc<8.3.0,>=8.1.8
  Downloading thinc-8.2.1-cp310-cp310-win_amd64.whl (1.5 MB)
     ---------------------------------------- 1.5/1.5 MB 6.7 MB/s eta 0:00:00
Collecting catalogue<2.1.0,>=2.0.6
  Downloading catalogue-2.0.10-py3-none-any.whl (17 kB)
Collecting wasabi<1.2.0,>=0.9


[notice] A new release of pip available: 22.2.2 -> 23.2.1
[notice] To update, run: python.exe -m pip install --upgrade pip


OSError: [E050] Can't find model 'en_core_web_sm'. It doesn't seem to be a Python package or a valid path to a data directory.

In [None]:
len(data_lemmatized)

** **
#### Step 4: Data transformation: Corpus and Dictionary
** **

The two main inputs to the LDA topic model are the dictionary (`id2word`) and the `corpus`. Let’s create them.

In [None]:
import gensim.corpora as corpora

# Create Dictionary
id2word = corpora.Dictionary(data_lemmatized)

# Create Corpus
texts = data_lemmatized

# Term Document Frequency
corpus = [id2word.doc2bow(text) for text in texts]

# View
print(corpus[:1][0][:30])

** **
#### Step 5: Base Model 
** **

We have everything required to train the base LDA model. In addition to the corpus and dictionary, we need to provide the number of topics as well. Apart from that, `alpha` and `eta` are hyperparameters that affect sparsity of the topics. According to the Gensim docs, both defaults to `1.0/num_topics` prior (we'll use default for the base model).

`chunksize` controls how many documents are processed at a time in the training algorithm. Increasing chunksize will speed up training, at least as long as the chunk of documents easily fit into memory.

`passes` controls how often we train the model on the entire corpus (set to 10). Another word for passes might be "epochs". `iterations` is somewhat technical, but essentially it controls how often we repeat a particular loop over each document. It is important to set the number of `passes` and `iterations` high enough.

In [None]:
# not sure but alpha and eta might be different alphabets. 

# Build LDA model
lda_model = gensim.models.LdaMulticore(corpus=corpus,
                                       id2word=id2word,
                                       num_topics=10, 
                                       random_state=100,
                                       chunksize=100,
                                       passes=10,
                                       per_word_topics=True)

** **
The above LDA model is built with 10 different topics where each topic is a combination of keywords and each keyword contributes a certain weightage to the topic.

We can see the keywords for each topic and the weightage(importance) of each keyword using `lda_model.print_topics()`

In [None]:
from pprint import pprint

# Print the Keyword in the 10 topics
pprint(lda_model.print_topics())
doc_lda = lda_model[corpus]

#### Compute Model Perplexity and Coherence Score

Let's calculate the baseline coherence score

In [None]:
from gensim.models import CoherenceModel

# Compute Coherence Score
coherence_model_c_v = CoherenceModel(model=lda_model, texts=data_lemmatized, dictionary=id2word, coherence='c_v')
coherence_model_u_mass = CoherenceModel(model=lda_model, corpus=corpus, dictionary=id2word, coherence='u_mass')
coherence_c_v = coherence_model_c_v.get_coherence()
print('Coherence Score: ', coherence_c_v)

coherence_u_mass = coherence_model_u_mass.get_coherence()
print('Coherence Score: ', coherence_u_mass)


** **
#### Step 6: Hyperparameter tuning
** **
First, let's differentiate between model hyperparameters and model parameters :

- `Model hyperparameters` can be thought of as settings for a machine learning algorithm that are tuned by the data scientist before training. Examples would be the number of trees in the random forest, or in our case, number of topics K

- `Model parameters` can be thought of as what the model learns during training, such as the weights for each word in a given topic.

Now that we have the baseline coherence score for the default LDA model, let's perform a series of sensitivity tests to help determine the following model hyperparameters: 
- Number of Topics (`k`)
- Dirichlet hyperparameter `alpha`: Document-Topic Density
- Dirichlet hyperparameter `eta`: Word-Topic Density

We'll perform these tests in sequence, one parameter at a time by keeping others constant and run them over the two difference validation corpus sets. We'll use `u_mass` as our choice of metric for performance comparison 

In [None]:
# supporting function
def compute_coherence_values(corpus, dictionary, k, alpha, eta):
    
    lda_model = gensim.models.LdaMulticore(corpus=corpus,
                                           id2word=dictionary,
                                           num_topics=k, 
                                           random_state=100,
                                           chunksize=100,
                                           passes=10,
                                           alpha=alpha,
                                           eta=eta)
    
    # coherence_model = CoherenceModel(model=lda_model, corpus=corpus, 
    #                                  dictionary=id2word, coherence='u_mass')
    
    coherence_model = CoherenceModel(model=lda_model, texts=data_lemmatized, 
                                     dictionary=id2word, coherence='c_v')    
    
    return coherence_model.get_coherence()

Let's call the function, and iterate it over the range of topics, alpha, and beta parameter values

In [None]:
import numpy as np
import tqdm

grid = {}
grid['Validation_Set'] = {}

# Topics range
min_topics = 5
max_topics = 13
step_size = 1
topics_range = range(min_topics, max_topics, step_size)

# Alpha parameters
alphas = list(np.arange(0.31, 1, 0.3))

# Eta parameters
etas = list(np.arange(0.31, 1, 0.3))

# Validation sets
num_of_docs = len(corpus)
valid_set = gensim.utils.ClippedCorpus(corpus, int(num_of_docs*0.75))

results = {'Number of topics': [],
           'Alpha': [],
           'Eta': [],
           'Coherence': []}

# Can take a long time to run
pbar = tqdm.tqdm(total=(len(etas)*len(alphas)*len(topics_range)))

# iterate through number of topics
for k in topics_range:
    # iterate through alpha values
    for alpha in alphas:
        # iterare through beta values
        for eta in etas:
            # get the coherence score for the given parameters
            coherence = compute_coherence_values(corpus=valid_set, 
                                                  dictionary=id2word, 
                                                  k=k, 
                                                  alpha=alpha, 
                                                  eta=eta)
            # Save the model results
            results['Number of topics'].append(k)
            results['Alpha'].append(alpha)
            results['Eta'].append(eta)
            results['Coherence'].append(coherence)
            
            pbar.update(1)

pbar.close()

results = pd.DataFrame(results)            
results.to_csv('lda_tuning_results.csv', index=False)

In [None]:
results

In [None]:
import matplotlib.pyplot as plt

fig = plt.figure()
ax = fig.add_subplot()
ax.plot(results['Number of topics'], results['Coherence'], 'ok')
ax.set_xlabel('Number of topics')
ax.set_ylabel('Coherence -- c_v')

In [None]:
import matplotlib.pyplot as plt

fig = plt.figure()
ax = fig.add_subplot()
ax.plot(results['Number of topics'], results['Coherence'], 'ok')
ax.set_xlabel('Number of topics')
ax.set_ylabel('Coherence -- c_v')

In [None]:
results.loc[results['Coherence'].argmax()]

** **
#### Step 7: Final Model
** **

Based on the above model selection, let's train the final model with parameters yielding highest coherence score.

In [None]:
num_topics = 9

lda_model = gensim.models.LdaMulticore(corpus=corpus,
                                       id2word=id2word,
                                       num_topics=num_topics, 
                                       random_state=100,
                                       chunksize=100,
                                       passes=10,
                                       alpha=0.91,
                                       eta=0.91)

In [None]:
from pprint import pprint

# Print the Keyword in the 10 topics
pprint(lda_model.print_topics())
doc_lda = lda_model[corpus]

** **
#### Step 8: Visualize Results
** **

In [None]:
import pyLDAvis.gensim_models as gensimvis
#import pyLDAvis.gensim
import pickle 
import pyLDAvis

# Visualize the topics
pyLDAvis.enable_notebook()

LDAvis_data_filepath = os.path.join(f'ldavis_tuned_{num_topics}')

# this is a bit time consuming - make the if statement True
# if you want to execute visualization prep yourself
if True:

    LDAvis_prepared = gensimvis.prepare(lda_model, corpus, id2word)
    with open(LDAvis_data_filepath, 'wb') as f:
      
        pickle.dump(LDAvis_prepared, f)

# load the pre-prepared pyLDAvis data from disk
with open(LDAvis_data_filepath, 'rb') as f:
    LDAvis_prepared = pickle.load(f)

pyLDAvis.save_html(LDAvis_prepared, f'ldavis_tuned_{num_topics}.html')

LDAvis_prepared

** **
#### Step 9: Predict
** **

In [None]:
unseen_doc = """The replica method is a non-rigorous but widely-accepted 
technique from statistical physics used in the asymptotic analysis of large, random, nonlinear problems. 
This paper applies the replica method to non-Gaussian maximum a posteriori (MAP) estimation. 
It is shown that with random linear measurements and Gaussian noise, 
the asymptotic behavior of the MAP estimate of an n-dimensional
vector ?decouples? as n scalar MAP estimators. The result is a counterpart to Guo
and Verd?u?s replica analysis of minimum mean-squared error estimation.
The replica MAP analysis can be readily applied to many estimators used in
compressed sensing, including basis pursuit, lasso, linear estimation with thresholding, 
and zero norm-regularized estimation. In the case of lasso estimation
the scalar estimator reduces to a soft-thresholding operator, 
and for zero normregularized estimation it reduces to a hard-threshold. 
Among other benefits, the replica method provides a computationally-tractable method 
for exactly computing various performance metrics including mean-squared error 
and sparsity pattern recovery probability."""

# Data preprocessing step for the unseen document
unseen_words = gensim.utils.simple_preprocess(str(unseen_doc), deacc=True)
unseen_words_nostops = remove_stopwords([unseen_words])
unseen_words_bigrams = make_bigrams(unseen_words_nostops)
unseen_lemmatized = lemmatization(unseen_words_bigrams, allowed_postags=['NOUN', 'ADJ', 'VERB', 'ADV'])
bow_vector = id2word.doc2bow(unseen_lemmatized[0])

for index, score in sorted(lda_model[bow_vector], key=lambda tup: -1*tup[1]):
    print(f"Score: {score}\t Topic: {lda_model.print_topic(index, 5)}")


#### References:
1. http://qpleple.com/perplexity-to-evaluate-topic-models/
2. https://www.amazon.com/Machine-Learning-Probabilistic-Perspective-Computation/dp/0262018020
3. https://papers.nips.cc/paper/3700-reading-tea-leaves-how-humans-interpret-topic-models.pdf
4. https://github.com/mattilyra/pydataberlin-2017/blob/master/notebook/EvaluatingUnsupervisedModels.ipynb
5. https://www.machinelearningplus.com/nlp/topic-modeling-gensim-python/
6. http://svn.aksw.org/papers/2015/WSDM_Topic_Evaluation/public.pdf
7. http://palmetto.aksw.org/palmetto-webapp/