# Gibbs Sampler

```yaml
Course:    DS 5001 
Module:    08 Lab
Topic:     Gibbs Sampler
Author:    R.C. Alvarado
Date:      03 March 2023 (revised)
```
**Purpose:** We develop a simple topic modeler using collapsed Gibbs sampling as described by [Griffiths and Steyvers (2004)](https://collab.its.virginia.edu/access/content/group/b9e58ce7-0f44-48fe-9861-b7a7657f551a/Articles/sciencetopics.pdf).

# Setup

In [1]:
import pandas as pd
import numpy as np
from tqdm import tqdm
import re
from nltk.corpus import stopwords 

# Convert F1 Corpus 

We want to convert any given F1 corpus (DOC) into unannotated TOKEN and VOCAB tables.

This is so we can work with ad hoc training data.

In [2]:
class Corpus:

    def __init__(self, doc_list:list, doc_col='doc_str'):
        "Create DOC table from F1 list"
        self.DOC = pd.DataFrame(doc_list, columns=[doc_col])
        self.DOC.index.name = 'doc_id'
        self.stop_words = set(stopwords.words('english')) 
        
    def convert_corpus(self):        
        "Convert raw docs into TOKEN and BOW tables"
        tokens = []
        for i, row in self.DOC.iterrows():
            for j, token in enumerate(row.doc_str.split()):
                term_str = re.sub(r'[\W_]+', '', token).lower()
                # Skip tokens that are empty after stripping punctuation, and stop words
                if term_str and term_str not in self.stop_words:
                    tokens.append((i, j, term_str))
        self.TOKEN = pd.DataFrame(tokens, columns=['doc_id','token_num','term_str'])\
            .set_index(['doc_id','token_num'])
        self.BOW = self.TOKEN.groupby(['doc_id','term_str']).term_str.count().to_frame('n')
        return self
        
    def extract_vocab(self):
        "Extract vocabulary VOCAB"
        self.VOCAB = self.TOKEN.term_str.value_counts().to_frame('n')
        self.VOCAB.index.name = 'term_str'   
        return self 

In [3]:
raw_docs = """
I ate a banana and a spinach smoothie for breakfast.
I like to eat broccoli and bananas.
Chinchillas and kittens are cute.
My sister adopted a kitten yesterday.
Look at this cute hamster munching on a piece of broccoli.
""".split("\n")[1:-1]

In [4]:
corpus1 = Corpus(raw_docs).convert_corpus().extract_vocab()

In [5]:
corpus1.BOW

doc_id,term_str,n
0,ate,1
0,banana,1
0,breakfast,1
0,smoothie,1
0,spinach,1
1,bananas,1
1,broccoli,1
1,eat,1
1,like,1
2,chinchillas,1
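
To make the shape of these tables concrete, here is a minimal standalone sketch of the same conversion steps. The two sentences and the tiny `stop_words` set are hypothetical stand-ins (the lab uses NLTK's English stop word list), and the check for empty strings is an added assumption about how punctuation-only tokens should be handled.

```python
import re
import pandas as pd

# Hypothetical two-document corpus and a tiny stand-in stop word list
# (the lab itself uses NLTK's English stop words).
docs = ["I like to eat broccoli and bananas.", "Kittens are cute, cute, cute!"]
stop_words = {"i", "to", "and", "are"}

tokens = []
for doc_id, doc_str in enumerate(docs):
    for token_num, token in enumerate(doc_str.split()):
        # Strip non-word characters and underscores, then lowercase
        term_str = re.sub(r"[\W_]+", "", token).lower()
        # Keep only non-empty terms that are not stop words
        if term_str and term_str not in stop_words:
            tokens.append((doc_id, token_num, term_str))

# One row per token occurrence
TOKEN = pd.DataFrame(tokens, columns=["doc_id", "token_num", "term_str"]) \
    .set_index(["doc_id", "token_num"])

# One row per (document, term) pair, with its count n
BOW = TOKEN.groupby(["doc_id", "term_str"]).term_str.count().to_frame("n")

# One row per term, with its corpus-wide count n
VOCAB = TOKEN.term_str.value_counts().to_frame("n")
VOCAB.index.name = "term_str"
```

Note that BOW collapses repeated words within a document into a single row with a count, so a sampler that iterates over BOW rows visits each document and term pair once, not each token.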


# Gibbs Sampler

We sample each document and word combination in the BOW table. In each case,
we are looking for two values:

* the topic with which a word has been most frequently labeled
* the topic with which the document has the most labeled words

We combine these values in order to align the label of the current word with the rest of the data.\
If a topic is highly associated with both the word and the document, then that topic will get a high value.

Note that all that is going on here is a sorting operation -- the random assignment does not predict anything.\
Instead, we are just gathering words under topics and topics under documents.
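
A single resampling step can be sketched in isolation. The following is an illustration with made-up counts, not the lab's implementation: the names `n_dk`, `n_kw`, and `n_k` are hypothetical, and the counts are assumed to already exclude the word being resampled. The weight for topic $k$ is the product of a document factor $(n_{d,k} + a) / (n_d + aK)$ and a word factor $(n_{k,w} + b) / (n_k + bW)$, where $K$ is the number of topics and $W$ is the vocabulary size.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only for reproducibility

K, W = 3, 5          # hypothetical number of topics and vocabulary size
a, b = 1.0, 1.0      # symmetric smoothing values

# Hypothetical counts, with the current word's assignment already removed
n_dk = np.array([4, 0, 1])    # words in this document assigned to each topic
n_kw = np.array([3, 0, 0])    # times this word is assigned to each topic
n_k = np.array([10, 6, 8])    # total words assigned to each topic

# How strongly each topic is associated with the document
doc_topic = (n_dk + a) / (n_dk.sum() + a * K)

# How strongly each topic is associated with the word
topic_term = (n_kw + b) / (n_k + b * W)

# Combine and normalize into a probability distribution over topics
p_z = doc_topic * topic_term
p_z = p_z / p_z.sum()

# Draw the new topic label for the word
z_new = rng.choice(K, p=p_z)
```

With these counts the first topic dominates, because it is the only one already linked to both the word and the document. The draw is still random, so the other topics keep a small chance of being chosen.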

**From Darling 2011:**
<hr />
<div style="float:left;">
<img src="images/gibbs-algo-text.png" width="650px" />
<img src="images/gibbs-algo.png" width="650px" />
</div>

In [73]:
class GibbsSampler:

    def __init__(self, n_topics=10, iters=100, a = 1, b = 1):

        # Map arguments
        self.n_topics = n_topics
        self.iters = iters
        self.a = a
        self.b = b 
        
        # Define topic table
        topic_names = [f"T{str(t).zfill(len(str(self.n_topics)))}" for t in range(self.n_topics)]
        self.TOPIC = pd.DataFrame({'top_terms':'TBD'}, index=topic_names)
        self.TOPIC.index.name = 'topic_id'

    def add_corpus(self, corpus:Corpus):
        
        # Copy BOW and assign random topics        
        self.BOW = corpus.BOW.copy()
        self.BOW['topic_id'] = self.TOPIC.sample(len(self.BOW), replace=True).index
        
        # Get vocab length
        self.VOCAB = corpus.VOCAB
        self.W = self.VOCAB.shape[0]       
        
        return self
            
    def compute_topics(self):
        
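        # Create count tables, reindexed so that every topic and term is present
        # even if the random initialization never assigned it
        self.THETA = self.BOW.groupby(['doc_id', 'topic_id']).size().unstack(fill_value=0)\
            .reindex(columns=self.TOPIC.index, fill_value=0)
        self.PHI = self.BOW.groupby(['topic_id', 'term_str']).size().unstack(fill_value=0)\
            .reindex(index=self.TOPIC.index, columns=self.VOCAB.index, fill_value=0)
        self.TOPIC['n'] = self.BOW['topic_id'].value_counts().reindex(self.TOPIC.index, fill_value=0)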

        # Iterate
        for i in tqdm(range(self.iters)):

            # Estimate topic per word
            for (doc_id, term_str), row in self.BOW.iterrows():

                # Get the current topic
                z = row['topic_id']

                # Decrement counts for the current assignment
                self.THETA.at[doc_id, z] -= 1
                self.PHI.at[z, term_str] -= 1
                self.TOPIC.at[z, 'n'] -= 1

                # Calculate topic probabilities
                # Note: self.W is the vocabulary size and stays fixed during sampling
                doc_topic_prob = (self.THETA.loc[doc_id] + self.a) / (self.THETA.loc[doc_id].sum() + self.a * self.n_topics)
                topic_term_prob = (self.PHI[term_str] + self.b) / (self.TOPIC['n'] + self.b * self.W)

                PZ = doc_topic_prob * topic_term_prob
                PZ /= PZ.sum()  # Normalize

                # Sample new topic
                z_new = PZ.sample(weights=PZ).index[0]
                self.BOW.at[(doc_id, term_str), 'topic_id'] = z_new

                # Increment counts for the new assignment
                self.THETA.at[doc_id, z_new] += 1
                self.PHI.at[z_new, term_str] += 1
                self.TOPIC.at[z_new, 'n'] += 1
                
        return self
        
    def get_top_terms(self):
        # Get top words for each topic
        for topic_id in self.TOPIC.index:
            self.TOPIC.loc[topic_id, 'top_terms'] = ' '.join(self.PHI.loc[topic_id, self.PHI.loc[topic_id] > 0].index.values)
            
        return self

In [51]:
def do_all(f1_list:list, k=4, iters=100):
    corpus = Corpus(f1_list).convert_corpus().extract_vocab()
    model = GibbsSampler(n_topics=k, iters=iters, a=1, b=1).add_corpus(corpus).compute_topics().get_top_terms()
    corpus = corpus.DOC.join(model.THETA.astype('int'))
    return corpus, model

# Demo 1

We use a toy example to see if the method works.\
Because our code is not very efficient, we just use a small corpus.

## Data

A small F1 corpus.

In [8]:
raw_docs = """
I ate a banana and a spinach smoothie for breakfast.
I like to eat broccoli and bananas.
Chinchillas and kittens are cute.
My sister adopted a kitten yesterday.
Look at this cute hamster munching on a piece of broccoli.
""".split("\n")[1:-1]

## Process

In [74]:
cp1, tm1 = do_all(raw_docs, k=5, iters=500)

  1%|          | 5/500 [00:00<00:22, 21.68it/s]

T topic_id
T0    5
T1    3
T2    2
T3    5
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    4
T2    2
T3    4
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    4
T2    2
T3    5
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    3
T2    2
T3    5
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    4
T2    1
T3    5
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    4
T2    2
T3    4
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    4
T2    2
T3    4
T4    6
Name: n, dtype: int64
P 1

T topic_id
T0    6
T1    4
T2    2
T3    3
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    4
T2    2
T3    4
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    5
T2    2
T3    4
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    5
T2    2
T3    4
T4    5
Name: n, dtype: int64
P 1

T topic_id
T0    5
T1    5
T2    2
T3    4
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    5
T2    1
T3    5
T4    5
Name: n, dtyp

  2%|▏         | 11/500 [00:00<00:21, 23.28it/s]

T topic_id
T0    2
T1    6
T2    3
T3    4
T4    6
Name: n, dtype: int64
P 1

T topic_id
T0    2
T1    6
T2    3
T3    4
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    6
T2    2
T3    4
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    6
T2    2
T3    3
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    6
T2    2
T3    3
T4    6
Name: n, dtype: int64
P 1

T topic_id
T0    4
T1    6
T2    2
T3    3
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    6
T2    3
T3    3
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    6
T2    3
T3    2
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    6
T2    4
T3    2
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    6
T2    4
T3    2
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    6
T2    3
T3    2
T4    6
Name: n, dtype: int64
P 1

T topic_id
T0    4
T1    7
T2    3
T3    2
T4    5
Name: n, dtype: int64
P 1

T topic_id
T0    4
T1    6
T2    3
T3    2
T4    6
Name: n, dtyp

  4%|▍         | 20/500 [00:00<00:18, 25.41it/s]

T topic_id
T0    2
T1    8
T2    4
T3    1
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    9
T2    4
T3    1
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    8
T2    4
T3    1
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    8
T2    3
T3    1
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    8
T2    3
T3    1
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    8
T2    3
T3    0
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    7
T2    4
T3    0
T4    7
Name: n, dtype: int64
P 1

T topic_id
T0    3
T1    6
T2    4
T3    1
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    6
T2    4
T3    1
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    5
T2    4
T3    1
T4    8
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    5
T2    4
T3    2
T4    7
Name: n, dtype: int64
P 1

T topic_id
T0    3
T1    5
T2    4
T3    3
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    4
T2    5
T3    3
T4    6
Name: n, dtyp

  5%|▌         | 27/500 [00:01<00:17, 26.96it/s]

T topic_id
T0    2
T1    7
T2    5
T3    3
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    7
T2    5
T3    3
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    7
T2    6
T3    2
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    7
T2    7
T3    2
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    7
T2    7
T3    3
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    7
T2    7
T3    4
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    7
T2    6
T3    4
T4    1
Name: n, dtype: int64
P 1

T topic_id
T0    4
T1    7
T2    6
T3    4
T4    0
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    7
T2    5
T3    4
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    6
T2    5
T3    4
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    6
T2    5
T3    4
T4    1
Name: n, dtype: int64
P 1

T topic_id
T0    5
T1    5
T2    6
T3    4
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    5
T2    5
T3    4
T4    1
Name: n, dtyp

  7%|▋         | 37/500 [00:01<00:18, 25.56it/s]

T topic_id
T0    4
T1    2
T2    6
T3    6
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    2
T2    6
T3    5
T4    4
Name: n, dtype: int64
P 1

T topic_id
T0    4
T1    2
T2    6
T3    5
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    3
T2    6
T3    5
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    3
T2    6
T3    5
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    3
T2    6
T3    5
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    3
T2    5
T3    5
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    3
T2    4
T3    5
T4    2
Name: n, dtype: int64
P 1

T topic_id
T0    7
T1    3
T2    3
T3    6
T4    2
Name: n, dtype: int64
P 1

T topic_id
T0    7
T1    2
T2    3
T3    7
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    2
T2    3
T3    7
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    2
T2    3
T3    7
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    2
T2    3
T3    7
T4    3
Name: n, dtyp

  9%|▉         | 46/500 [00:01<00:16, 27.89it/s]

T topic_id
T0    6
T1    8
T2    2
T3    1
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    8
T2    2
T3    1
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    9
T2    2
T3    0
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    8
T2    3
T3    0
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    8
T2    3
T3    0
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    8
T2    4
T3    0
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    8
T2    4
T3    1
T4    4
Name: n, dtype: int64
P 1

T topic_id
T0    3
T1    8
T2    4
T3    2
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    8
T2    4
T3    3
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    7
T2    4
T3    4
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    7
T2    4
T3    4
T4    4
Name: n, dtype: int64
P 1

T topic_id
T0    1
T1    8
T2    4
T3    4
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    9
T2    4
T3    4
T4    3
Name: n, dtyp

 11%|█         | 55/500 [00:02<00:15, 28.27it/s]

T topic_id
T0    4
T1    1
T2    7
T3    1
T4    8
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    1
T2    7
T3    1
T4    9
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    1
T2    7
T3    1
T4    9
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    1
T2    8
T3    1
T4    8
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    1
T2    8
T3    2
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    1
T2    7
T3    3
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    1
T2    7
T3    3
T4    7
Name: n, dtype: int64
P 1

T topic_id
T0    3
T1    2
T2    6
T3    3
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    2
T2    7
T3    3
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    2
T2    7
T3    4
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    2
T2    8
T3    4
T4    6
Name: n, dtype: int64
P 1

T topic_id
T0    2
T1    2
T2    8
T3    3
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    2
T2    8
T3    4
T4    5
Name: n, dtyp

 13%|█▎        | 64/500 [00:02<00:16, 25.83it/s]

T topic_id
T0    5
T1    4
T2    6
T3    2
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    4
T2    6
T3    2
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    5
T2    6
T3    2
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    5
T2    6
T3    1
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    4
T2    6
T3    1
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    4
T2    6
T3    2
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    4
T2    6
T3    1
T4    6
Name: n, dtype: int64
P 1

T topic_id
T0    4
T1    4
T2    6
T3    1
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    4
T2    7
T3    1
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    4
T2    7
T3    1
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    4
T2    7
T3    1
T4    5
Name: n, dtype: int64
P 1

T topic_id
T0    4
T1    4
T2    7
T3    1
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    3
T2    8
T3    1
T4    5
Name: n, dtyp

 15%|█▍        | 73/500 [00:02<00:15, 26.70it/s]

T topic_id
T0    4
T1    2
T2    9
T3    5
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0     4
T1     2
T2    10
T3     4
T4     1
Name: n, dtype: int64
P 0

T topic_id
T0     4
T1     2
T2    10
T3     4
T4     1
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    2
T2    9
T3    4
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0     4
T1     2
T2    10
T3     4
T4     1
Name: n, dtype: int64
P 0

T topic_id
T0     5
T1     2
T2    10
T3     3
T4     1
Name: n, dtype: int64
P 0

T topic_id
T0     5
T1     2
T2    10
T3     3
T4     1
Name: n, dtype: int64
P 1

T topic_id
T0    5
T1    2
T2    9
T3    4
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    2
T2    9
T3    3
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    2
T2    8
T3    4
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    2
T2    9
T3    4
T4    1
Name: n, dtype: int64
P 1

T topic_id
T0    5
T1    2
T2    9
T3    4
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    3
T2    9
T3  

 16%|█▌        | 79/500 [00:03<00:16, 25.69it/s]

T topic_id
T0    7
T1    1
T2    3
T3    6
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    1
T2    3
T3    6
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    2
T2    3
T3    6
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    2
T2    3
T3    7
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    2
T2    3
T3    7
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    3
T2    3
T3    7
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    3
T2    3
T3    6
T4    4
Name: n, dtype: int64
P 1

T topic_id
T0    5
T1    3
T2    3
T3    6
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    3
T2    2
T3    6
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    3
T2    2
T3    7
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    4
T2    2
T3    7
T4    5
Name: n, dtype: int64
P 1

T topic_id
T0    3
T1    4
T2    2
T3    6
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    4
T2    2
T3    6
T4    6
Name: n, dtyp

 16%|█▋        | 82/500 [00:03<00:18, 23.01it/s]

T topic_id
T0    2
T1    7
T2    5
T3    3
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    7
T2    5
T3    3
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    7
T2    5
T3    3
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    7
T2    4
T3    4
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    7
T2    4
T3    4
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    7
T2    5
T3    3
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    7
T2    5
T3    3
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    7
T2    5
T3    3
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    7
T2    5
T3    3
T4    5
Name: n, dtype: int64
P 1

T topic_id
T0    1
T1    7
T2    5
T3    3
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    0
T1    7
T2    5
T3    3
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    0
T1    7
T2    6
T3    3
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    0
T1    7
T2    6
T3    3
T4    5
Name: n, dtyp

 18%|█▊        | 91/500 [00:03<00:16, 24.22it/s]

T topic_id
T0    4
T1    5
T2    1
T3    5
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    5
T2    2
T3    4
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    5
T2    2
T3    5
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    5
T2    2
T3    4
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    5
T2    3
T3    4
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    5
T2    3
T3    5
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    5
T2    3
T3    5
T4    3
Name: n, dtype: int64
P 1

T topic_id
T0    5
T1    6
T2    3
T3    4
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    6
T2    4
T3    4
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    6
T2    4
T3    5
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    6
T2    4
T3    5
T4    1
Name: n, dtype: int64
P 1

T topic_id
T0    5
T1    6
T2    4
T3    5
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    6
T2    3
T3    5
T4    1
Name: n, dtyp

 20%|██        | 100/500 [00:03<00:16, 23.92it/s]

T topic_id
T0    5
T1    2
T2    2
T3    7
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    2
T2    3
T3    7
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    2
T2    3
T3    7
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    2
T2    4
T3    7
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    2
T2    4
T3    6
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    2
T2    4
T3    6
T4    6
Name: n, dtype: int64
P 1

T topic_id
T0    3
T1    2
T2    4
T3    5
T4    7
Name: n, dtype: int64
P 1

T topic_id
T0    3
T1    2
T2    5
T3    5
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    2
T2    5
T3    5
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    2
T2    5
T3    4
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    2
T2    5
T3    3
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    2
T2    6
T3    3
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    2
T2    7
T3    3
T4    6
Name: n, dtyp

 21%|██        | 106/500 [00:04<00:16, 23.59it/s]

T topic_id
T0    1
T1    6
T2    4
T3    7
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    6
T2    4
T3    7
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    5
T2    5
T3    7
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    5
T2    5
T3    7
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    4
T2    5
T3    8
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    4
T2    5
T3    8
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    4
T2    5
T3    7
T4    4
Name: n, dtype: int64
P 1

T topic_id
T0    1
T1    5
T2    4
T3    7
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    5
T2    4
T3    7
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    4
T2    4
T3    8
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    4
T2    4
T3    8
T4    4
Name: n, dtype: int64
P 1

T topic_id
T0    1
T1    3
T2    4
T3    8
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    3
T2    4
T3    8
T4    5
Name: n, dtyp

 23%|██▎       | 115/500 [00:04<00:15, 24.27it/s]

T topic_id
T0    3
T1    1
T2    3
T3    8
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    1
T2    3
T3    8
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    2
T2    3
T3    8
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    2
T2    2
T3    8
T4    6
Name: n, dtype: int64
P 1

T topic_id
T0    3
T1    2
T2    2
T3    9
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    3
T2    2
T3    8
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    3
T2    2
T3    9
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    4
T2    2
T3    9
T4    4
Name: n, dtype: int64
P 1

T topic_id
T0    2
T1    4
T2    2
T3    9
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    5
T2    2
T3    9
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    6
T2    2
T3    8
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    6
T2    2
T3    8
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    6
T2    2
T3    8
T4    3
Name: n, dtyp

 25%|██▍       | 124/500 [00:05<00:16, 23.04it/s]

T topic_id
T0    5
T1    3
T2    4
T3    3
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    3
T2    4
T3    2
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    3
T2    4
T3    2
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    3
T2    5
T3    2
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    3
T2    5
T3    1
T4    8
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    3
T2    5
T3    1
T4    8
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    3
T2    5
T3    1
T4    8
Name: n, dtype: int64
P 1

T topic_id
T0    4
T1    3
T2    5
T3    1
T4    8
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    2
T2    5
T3    2
T4    8
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    2
T2    5
T3    3
T4    8
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    2
T2    5
T3    4
T4    8
Name: n, dtype: int64
P 1

T topic_id
T0    1
T1    2
T2    6
T3    4
T4    8
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    2
T2    6
T3    4
T4    8
Name: n, dtyp

 25%|██▌       | 127/500 [00:05<00:17, 21.45it/s]

T topic_id
T0    1
T1    4
T2    6
T3    4
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    3
T2    6
T3    5
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    3
T2    5
T3    6
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    3
T2    6
T3    6
T4    5
Name: n, dtype: int64
P 1

T topic_id
T0    2
T1    3
T2    5
T3    6
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    3
T2    5
T3    5
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    3
T2    5
T3    5
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    4
T2    4
T3    5
T4    7
Name: n, dtype: int64
P 1

T topic_id
T0    2
T1    3
T2    4
T3    5
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    3
T2    4
T3    5
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    3
T2    4
T3    5
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    4
T2    3
T3    5
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    4
T2    3
T3    4
T4    7
Name: n, dtyp

 27%|██▋       | 136/500 [00:05<00:16, 22.00it/s]

T topic_id
T0    2
T1    4
T2    3
T3    5
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    4
T2    3
T3    5
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    4
T2    4
T3    5
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    5
T2    4
T3    4
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    5
T2    4
T3    4
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    4
T2    4
T3    5
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    4
T2    4
T3    4
T4    8
Name: n, dtype: int64
P 1

T topic_id
T0    1
T1    5
T2    4
T3    3
T4    8
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    5
T2    4
T3    3
T4    8
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    6
T2    4
T3    3
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    6
T2    4
T3    3
T4    6
Name: n, dtype: int64
P 1

T topic_id
T0    1
T1    6
T2    5
T3    3
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    6
T2    5
T3    2
T4    7
Name: n, dtyp

 30%|██▉       | 148/500 [00:06<00:14, 24.68it/s]

T topic_id
T0    1
T1    4
T2    5
T3    4
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    3
T2    5
T3    5
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    3
T2    5
T3    5
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    4
T2    5
T3    4
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    4
T2    5
T3    3
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    4
T2    5
T3    4
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    4
T2    5
T3    3
T4    7
Name: n, dtype: int64
P 1

T topic_id
T0    2
T1    4
T2    6
T3    3
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    4
T2    6
T3    3
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    5
T2    6
T3    3
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    6
T2    5
T3    3
T4    6
Name: n, dtype: int64
P 1

T topic_id
T0    2
T1    6
T2    4
T3    3
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    6
T2    4
T3    3
T4    5
Name: n, dtyp

 30%|███       | 151/500 [00:06<00:14, 24.81it/s]

T topic_id
T0    4
T1    2
T2    4
T3    7
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    3
T2    3
T3    7
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    2
T2    3
T3    7
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    2
T2    3
T3    7
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    2
T2    3
T3    7
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    2
T2    3
T3    6
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    3
T2    3
T3    6
T4    4
Name: n, dtype: int64
P 1

T topic_id
T0    5
T1    3
T2    3
T3    6
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    3
T2    3
T3    5
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    3
T2    2
T3    5
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    3
T2    2
T3    5
T4    5
Name: n, dtype: int64
P 1

T topic_id
T0    6
T1    3
T2    3
T3    5
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    3
T2    3
T3    6
T4    3
Name: n, dtyp

 32%|███▏      | 160/500 [00:06<00:13, 24.83it/s]

T topic_id
T0    4
T1    6
T2    2
T3    5
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    6
T2    2
T3    5
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    7
T2    2
T3    5
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    7
T2    2
T3    5
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    7
T2    2
T3    5
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    7
T2    2
T3    5
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    7
T2    2
T3    5
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    7
T2    2
T3    5
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    7
T2    1
T3    5
T4    4
Name: n, dtype: int64
P 1

T topic_id
T0    4
T1    8
T2    1
T3    4
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    7
T2    1
T3    4
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    7
T2    1
T3    4
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    6
T2    2
T3    4
T4    4
Name: n, dtyp

 33%|███▎      | 166/500 [00:06<00:14, 23.08it/s]

T topic_id
T0    7
T1    4
T2    4
T3    3
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    3
T2    5
T3    3
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    3
T2    4
T3    3
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    3
T2    4
T3    2
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    4
T2    3
T3    2
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    4
T2    3
T3    2
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    4
T2    3
T3    2
T4    4
Name: n, dtype: int64
P 1

T topic_id
T0    7
T1    5
T2    3
T3    2
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    5
T2    4
T3    2
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    5
T2    4
T3    3
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    5
T2    4
T3    3
T4    4
Name: n, dtype: int64
P 1

T topic_id
T0    4
T1    5
T2    5
T3    3
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    5
T2    6
T3    3
T4    3
Name: n, dtyp

 36%|███▌      | 178/500 [00:07<00:13, 24.40it/s]

T topic_id
T0    7
T1    4
T2    5
T3    2
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    4
T2    4
T3    2
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    4
T2    3
T3    2
T4    4
Name: n, dtype: int64
P 1

T topic_id
T0    8
T1    5
T2    3
T3    2
T4    3
Name: n, dtype: int64
P 1

T topic_id
T0    8
T1    5
T2    3
T3    2
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    5
T2    4
T3    2
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    5
T2    4
T3    2
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    5
T2    4
T3    2
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    5
T2    4
T3    2
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    4
T2    4
T3    2
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    4
T2    4
T3    2
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    3
T2    5
T3    2
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    4
T2    5
T3    2
T4    4
Name: n, dtyp

 38%|███▊      | 190/500 [00:07<00:12, 24.22it/s]

T topic_id
T0    9
T1    3
T2    1
T3    4
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    10
T1     2
T2     1
T3     4
T4     4
Name: n, dtype: int64
P 0

T topic_id
T0    9
T1    3
T2    1
T3    4
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    10
T1     3
T2     0
T3     4
T4     4
Name: n, dtype: int64
P 0

T topic_id
T0    9
T1    3
T2    0
T3    4
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    9
T1    3
T2    1
T3    4
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    9
T1    3
T2    1
T3    5
T4    3
Name: n, dtype: int64
P 1

T topic_id
T0    8
T1    3
T2    2
T3    5
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    3
T2    2
T3    6
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    3
T2    3
T3    6
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    3
T2    3
T3    6
T4    2
Name: n, dtype: int64
P 1

T topic_id
T0    7
T1    3
T2    3
T3    6
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    3
T2    3
T3    6
T4    2
Nam

 40%|████      | 202/500 [00:08<00:11, 25.22it/s]

T topic_id
T0    4
T1    5
T2    4
T3    5
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    5
T2    4
T3    5
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    5
T2    4
T3    5
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    5
T2    4
T3    5
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    5
T2    4
T3    5
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    5
T2    4
T3    5
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    5
T2    4
T3    4
T4    4
Name: n, dtype: int64
P 1

T topic_id
T0    4
T1    5
T2    4
T3    4
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    5
T2    4
T3    3
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    4
T2    4
T3    3
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    5
T2    4
T3    3
T4    4
Name: n, dtype: int64
P 1

T topic_id
T0    5
T1    5
T2    4
T3    3
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    5
T2    4
T3    3
T4    4
Name: n, dtype: int64

 43%|████▎     | 214/500 [00:08<00:11, 25.08it/s]

T topic_id
T0     2
T1    12
T2     0
T3     2
T4     5
Name: n, dtype: int64
P 0

T topic_id
T0     2
T1    11
T2     0
T3     3
T4     5
Name: n, dtype: int64
P 0

T topic_id
T0     2
T1    10
T2     0
T3     4
T4     5
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    9
T2    0
T3    4
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    8
T2    0
T3    4
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    7
T2    0
T3    4
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    7
T2    0
T3    4
T4    7
Name: n, dtype: int64
P 1

T topic_id
T0    3
T1    8
T2    0
T3    4
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    9
T2    0
T3    3
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0     3
T1    10
T2     0
T3     2
T4     6
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    9
T2    0
T3    3
T4    6
Name: n, dtype: int64
P 1

T topic_id
T0    3
T1    9
T2    0
T3    3
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    8
T2    0
T3    3
T

 45%|████▍     | 223/500 [00:09<00:11, 23.53it/s]

T topic_id
T0    6
T1    3
T2    5
T3    6
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    3
T2    6
T3    6
T4    0
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    3
T2    7
T3    6
T4    0
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    3
T2    6
T3    6
T4    0
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    3
T2    6
T3    5
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    2
T2    6
T3    5
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    3
T2    5
T3    5
T4    2
Name: n, dtype: int64
P 1

T topic_id
T0    6
T1    3
T2    4
T3    6
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    2
T2    4
T3    6
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    2
T2    3
T3    6
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    2
T2    2
T3    7
T4    2
Name: n, dtype: int64
P 1

T topic_id
T0    9
T1    2
T2    2
T3    6
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    2
T2    2
T3    6
T4    3
Name: n, dtype: int64

 46%|████▋     | 232/500 [00:09<00:10, 24.56it/s]

T topic_id
T0    6
T1    2
T2    2
T3    9
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    2
T2    2
T3    9
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    2
T2    2
T3    9
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    2
T2    2
T3    9
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    2
T2    2
T3    9
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    2
T2    3
T3    9
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    2
T2    3
T3    9
T4    1
Name: n, dtype: int64
P 1

T topic_id
T0    6
T1    1
T2    4
T3    9
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    1
T2    4
T3    9
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    1
T2    4
T3    9
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    1
T2    4
T3    9
T4    1
Name: n, dtype: int64
P 1

T topic_id
T0    6
T1    1
T2    4
T3    8
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    1
T2    4
T3    7
T4    2
Name: n, dtype: int64

 48%|████▊     | 241/500 [00:09<00:10, 24.99it/s]

T topic_id
T0    3
T1    4
T2    2
T3    6
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    4
T2    2
T3    7
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    4
T2    2
T3    7
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    4
T2    2
T3    7
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    4
T2    3
T3    6
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    3
T2    3
T3    7
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    2
T2    3
T3    7
T4    5
Name: n, dtype: int64
P 1

T topic_id
T0    4
T1    2
T2    3
T3    7
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    2
T2    4
T3    7
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    2
T2    4
T3    7
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    2
T2    4
T3    6
T4    5
Name: n, dtype: int64
P 1

T topic_id
T0    3
T1    3
T2    4
T3    6
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    3
T2    4
T3    5
T4    6
Name: n, dtype: int64

 51%|█████     | 253/500 [00:10<00:09, 24.77it/s]

T topic_id
T0    2
T1    7
T2    7
T3    1
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    7
T2    8
T3    1
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    7
T2    9
T3    1
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    8
T2    9
T3    1
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    9
T2    9
T3    1
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0     1
T1     8
T2    10
T3     1
T4     1
Name: n, dtype: int64
P 0

T topic_id
T0     1
T1     8
T2    10
T3     1
T4     1
Name: n, dtype: int64
P 1

T topic_id
T0     1
T1     8
T2    10
T3     1
T4     1
Name: n, dtype: int64
P 0

T topic_id
T0     1
T1     8
T2    10
T3     1
T4     1
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    9
T2    9
T3    1
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    9
T2    9
T3    1
T4    1
Name: n, dtype: int64
P 1

T topic_id
T0    1
T1    9
T2    9
T3    1
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    9
T2    9
T3    0
T

 52%|█████▏    | 259/500 [00:10<00:10, 22.18it/s]

T topic_id
T0    8
T1    4
T2    6
T3    1
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    3
T2    6
T3    1
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    9
T1    3
T2    5
T3    1
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    3
T2    5
T3    2
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    3
T2    4
T3    3
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    3
T2    4
T3    3
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    3
T2    4
T3    3
T4    4
Name: n, dtype: int64
P 1

T topic_id
T0    8
T1    2
T2    4
T3    3
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    2
T2    4
T3    3
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    3
T2    4
T3    3
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    3
T2    4
T3    3
T4    4
Name: n, dtype: int64
P 1

T topic_id
T0    7
T1    3
T2    4
T3    3
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    3
T2    4
T3    4
T4    3
Name: n, dtype: int64

 54%|█████▎    | 268/500 [00:11<00:10, 22.87it/s]

T topic_id
T0    5
T1    6
T2    4
T3    1
T4    5
Name: n, dtype: int64
P 1

T topic_id
T0    4
T1    6
T2    4
T3    1
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    6
T2    3
T3    2
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    5
T2    4
T3    2
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    4
T2    4
T3    2
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    4
T2    4
T3    2
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    4
T2    3
T3    2
T4    7
Name: n, dtype: int64
P 1

T topic_id
T0    5
T1    4
T2    3
T3    2
T4    7
Name: n, dtype: int64
P 1

T topic_id
T0    4
T1    4
T2    4
T3    2
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    4
T2    4
T3    2
T4    8
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    4
T2    4
T3    2
T4    8
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    4
T2    4
T3    2
T4    9
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    4
T2    4
T3    2
T4    8
Name: n, dtype: int64

 55%|█████▌    | 277/500 [00:11<00:08, 24.83it/s]

T topic_id
T0    4
T1    4
T2    4
T3    3
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    5
T2    4
T3    3
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    5
T2    5
T3    3
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    5
T2    5
T3    2
T4    7
Name: n, dtype: int64
P 1

T topic_id
T0    2
T1    5
T2    5
T3    1
T4    8
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    5
T2    5
T3    1
T4    8
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    4
T2    5
T3    2
T4    8
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    4
T2    4
T3    2
T4    8
Name: n, dtype: int64
P 1

T topic_id
T0    3
T1    4
T2    3
T3    2
T4    9
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    4
T2    3
T3    2
T4    8
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    5
T2    3
T3    2
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    5
T2    4
T3    2
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    5
T2    4
T3    2
T4    6
Name: n, dtype: int64

 58%|█████▊    | 289/500 [00:11<00:08, 24.58it/s]

T topic_id
T0    4
T1    9
T2    2
T3    1
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0     4
T1    10
T2     2
T3     0
T4     5
Name: n, dtype: int64
P 0

T topic_id
T0     3
T1    11
T2     2
T3     0
T4     5
Name: n, dtype: int64
P 0

T topic_id
T0     3
T1    11
T2     2
T3     0
T4     5
Name: n, dtype: int64
P 0

T topic_id
T0     3
T1    11
T2     2
T3     1
T4     4
Name: n, dtype: int64
P 0

T topic_id
T0     3
T1    10
T2     2
T3     1
T4     5
Name: n, dtype: int64
P 0

T topic_id
T0     4
T1    10
T2     2
T3     1
T4     4
Name: n, dtype: int64
P 1

T topic_id
T0     4
T1    10
T2     2
T3     1
T4     4
Name: n, dtype: int64
P 0

T topic_id
T0     4
T1    10
T2     2
T3     2
T4     3
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    9
T2    2
T3    3
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0     4
T1    10
T2     2
T3     3
T4     2
Name: n, dtype: int64
P 1

T topic_id
T0     4
T1    10
T2     2
T3     3
T4     2
Name: n, dtype: int64
P 0

T topic_id
T0 

 60%|██████    | 301/500 [00:12<00:07, 25.31it/s]

T topic_id
T0    8
T1    1
T2    4
T3    8
T4    0
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    1
T2    4
T3    8
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    1
T2    4
T3    7
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    1
T2    4
T3    6
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    1
T2    5
T3    6
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    1
T2    6
T3    6
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    0
T2    6
T3    6
T4    3
Name: n, dtype: int64
P 1

T topic_id
T0    6
T1    0
T2    6
T3    6
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    0
T2    6
T3    6
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    0
T2    6
T3    6
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    0
T2    6
T3    5
T4    3
Name: n, dtype: int64
P 1

T topic_id
T0    6
T1    0
T2    7
T3    5
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    0
T2    6
T3    5
T4    4
Name: n, dtype: int64

 63%|██████▎   | 313/500 [00:12<00:07, 25.66it/s]

T topic_id
T0     1
T1     4
T2    12
T3     1
T4     3
Name: n, dtype: int64
P 0

T topic_id
T0     2
T1     4
T2    11
T3     1
T4     3
Name: n, dtype: int64
P 0

T topic_id
T0     3
T1     4
T2    10
T3     1
T4     3
Name: n, dtype: int64
P 0

T topic_id
T0     2
T1     4
T2    10
T3     2
T4     3
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    4
T2    9
T3    2
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    4
T2    9
T3    1
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    4
T2    9
T3    1
T4    4
Name: n, dtype: int64
P 1

T topic_id
T0    3
T1    5
T2    9
T3    1
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    5
T2    9
T3    1
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    6
T2    8
T3    1
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    6
T2    8
T3    1
T4    3
Name: n, dtype: int64
P 1

T topic_id
T0    3
T1    7
T2    7
T3    1
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    7
T2    7
T3    1
T

 65%|██████▌   | 325/500 [00:13<00:07, 24.58it/s]

T topic_id
T0    4
T1    5
T2    2
T3    9
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    6
T2    2
T3    9
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    7
T2    2
T3    8
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    7
T2    2
T3    8
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    7
T2    2
T3    8
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    8
T2    2
T3    7
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    8
T2    1
T3    8
T4    1
Name: n, dtype: int64
P 1

T topic_id
T0    3
T1    8
T2    1
T3    8
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    8
T2    1
T3    8
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    9
T2    1
T3    8
T4    0
Name: n, dtype: int64
P 0

T topic_id
T0     3
T1    10
T2     1
T3     7
T4     0
Name: n, dtype: int64
P 1

T topic_id
T0     3
T1    10
T2     1
T3     7
T4     0
Name: n, dtype: int64
P 0

T topic_id
T0     3
T1    11
T2     1
T3     6
T4     

 66%|██████▌   | 331/500 [00:13<00:06, 24.98it/s]

T topic_id
T0    6
T1    3
T2    7
T3    5
T4    0
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    3
T2    7
T3    5
T4    0
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    3
T2    8
T3    5
T4    0
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    3
T2    8
T3    5
T4    0
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    3
T2    8
T3    5
T4    0
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    3
T2    7
T3    5
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    4
T2    7
T3    5
T4    1
Name: n, dtype: int64
P 1

T topic_id
T0    4
T1    4
T2    6
T3    6
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    3
T2    6
T3    6
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    3
T2    6
T3    6
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    3
T2    6
T3    6
T4    2
Name: n, dtype: int64
P 1

T topic_id
T0    4
T1    4
T2    5
T3    6
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    4
T2    6
T3    6
T4    2
Name: n, dtype: int64

 67%|██████▋   | 334/500 [00:13<00:07, 21.50it/s]

T topic_id
T0    5
T1    3
T2    1
T3    7
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    3
T2    1
T3    7
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    3
T2    1
T3    6
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    3
T2    1
T3    5
T4    7
Name: n, dtype: int64
P 1

T topic_id
T0    6
T1    2
T2    1
T3    5
T4    7
Name: n, dtype: int64
P 1

T topic_id
T0    7
T1    1
T2    1
T3    5
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    1
T2    1
T3    4
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    1
T2    1
T3    4
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    1
T2    0
T3    4
T4    8
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    2
T2    0
T3    4
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    2
T2    0
T3    5
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    2
T2    0
T3    5
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    2
T2    0
T3    4
T4    7
Name: n, dtype: int64

 69%|██████▉   | 346/500 [00:14<00:06, 23.59it/s]

T topic_id
T0    3
T1    4
T2    5
T3    7
T4    2
Name: n, dtype: int64
P 1

T topic_id
T0    3
T1    4
T2    5
T3    7
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    3
T2    6
T3    7
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    3
T2    6
T3    7
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    3
T2    6
T3    7
T4    2
Name: n, dtype: int64
P 1

T topic_id
T0    4
T1    3
T2    6
T3    7
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    3
T2    6
T3    6
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    2
T2    7
T3    6
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    2
T2    6
T3    7
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    1
T2    6
T3    7
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    1
T2    6
T3    6
T4    3
Name: n, dtype: int64
P 1

T topic_id
T0    4
T1    1
T2    7
T3    6
T4    3
Name: n, dtype: int64
P 1

T topic_id
T0    4
T1    1
T2    7
T3    6
T4    3
Name: n, dtype: int64

 72%|███████▏  | 361/500 [00:14<00:05, 24.71it/s]

T topic_id
T0    5
T1    5
T2    5
T3    2
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    6
T2    4
T3    2
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    5
T2    4
T3    3
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    5
T2    4
T3    3
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    6
T2    4
T3    2
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    7
T2    4
T3    2
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    6
T2    4
T3    2
T4    4
Name: n, dtype: int64
P 1

T topic_id
T0    6
T1    6
T2    3
T3    2
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    6
T2    2
T3    2
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    6
T2    2
T3    2
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    7
T2    2
T3    2
T4    3
Name: n, dtype: int64
P 1

T topic_id
T0    8
T1    7
T2    2
T3    2
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    6
T2    2
T3    3
T4    2
Name: n, dtype: int64

 75%|███████▍  | 373/500 [00:15<00:05, 22.81it/s]

T topic_id
T0    4
T1    2
T2    4
T3    5
T4    6
Name: n, dtype: int64
P 1

T topic_id
T0    4
T1    2
T2    3
T3    5
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    2
T2    3
T3    5
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    2
T2    3
T3    5
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    2
T2    2
T3    6
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    2
T2    2
T3    6
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    2
T2    2
T3    5
T4    6
Name: n, dtype: int64
P 1

T topic_id
T0    6
T1    2
T2    2
T3    4
T4    7
Name: n, dtype: int64
P 1

T topic_id
T0    6
T1    2
T2    2
T3    4
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    2
T2    2
T3    5
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    2
T2    2
T3    5
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    2
T2    2
T3    5
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    2
T2    2
T3    5
T4    7
Name: n, dtype: int64

 76%|███████▋  | 382/500 [00:15<00:05, 23.20it/s]

T topic_id
T0    8
T1    5
T2    4
T3    2
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    5
T2    4
T3    2
T4    2
Name: n, dtype: int64
P 1

T topic_id
T0    9
T1    5
T2    4
T3    1
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    6
T2    4
T3    1
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    6
T2    4
T3    2
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    5
T2    4
T3    3
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    4
T2    5
T3    3
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    4
T2    6
T3    2
T4    2
Name: n, dtype: int64
P 1

T topic_id
T0    7
T1    4
T2    5
T3    3
T4    2
Name: n, dtype: int64
P 1

T topic_id
T0    6
T1    5
T2    5
T3    3
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    5
T2    5
T3    3
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    5
T2    5
T3    3
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    5
T2    5
T3    3
T4    2
Name: n, dtype: int64

 78%|███████▊  | 388/500 [00:16<00:05, 22.07it/s]

T topic_id
T0    4
T1    6
T2    2
T3    3
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    6
T2    2
T3    3
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    6
T2    2
T3    3
T4    4
Name: n, dtype: int64
P 1

T topic_id
T0    5
T1    6
T2    2
T3    3
T4    5
Name: n, dtype: int64
P 1

T topic_id
T0    5
T1    6
T2    2
T3    3
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    6
T2    2
T3    3
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    6
T2    2
T3    3
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    6
T2    2
T3    3
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    6
T2    2
T3    4
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    6
T2    3
T3    3
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    5
T2    4
T3    3
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    4
T2    5
T3    3
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    5
T2    5
T3    3
T4    5
Name: n, dtype: int64

 79%|███████▉  | 394/500 [00:16<00:04, 22.74it/s]

T topic_id
T0    5
T1    6
T2    6
T3    0
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    6
T2    5
T3    0
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    6
T2    5
T3    0
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    6
T2    5
T3    0
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    6
T2    5
T3    0
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    5
T2    5
T3    0
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    6
T2    5
T3    0
T4    2
Name: n, dtype: int64
P 1

T topic_id
T0    8
T1    5
T2    5
T3    1
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    4
T2    5
T3    2
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    4
T2    4
T3    3
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    4
T2    4
T3    3
T4    2
Name: n, dtype: int64
P 1

T topic_id
T0    8
T1    4
T2    4
T3    3
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    3
T2    5
T3    3
T4    2
Name: n, dtype: int64

 80%|████████  | 400/500 [00:16<00:04, 22.46it/s]

T topic_id
T0    7
T1    6
T2    2
T3    4
T4    2
Name: n, dtype: int64
P 1

T topic_id
T0    7
T1    6
T2    2
T3    4
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    5
T2    2
T3    5
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    5
T2    2
T3    5
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    4
T2    2
T3    5
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    3
T2    2
T3    5
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    3
T2    2
T3    5
T4    3
Name: n, dtype: int64
P 1

T topic_id
T0    7
T1    4
T2    2
T3    5
T4    3
Name: n, dtype: int64
P 1

T topic_id
T0    7
T1    3
T2    3
T3    5
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    4
T2    3
T3    5
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    3
T2    4
T3    5
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    4
T2    4
T3    5
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    4
T2    3
T3    5
T4    3
Name: n, dtype: int64

 81%|████████  | 406/500 [00:16<00:03, 24.60it/s]

T topic_id
T0     2
T1     3
T2     6
T3    10
T4     0
Name: n, dtype: int64
P 0

T topic_id
T0     2
T1     3
T2     6
T3    10
T4     0
Name: n, dtype: int64
P 1

T topic_id
T0     1
T1     4
T2     6
T3    10
T4     0
Name: n, dtype: int64
P 0

T topic_id
T0     1
T1     4
T2     5
T3    11
T4     0
Name: n, dtype: int64
P 0

T topic_id
T0     1
T1     3
T2     6
T3    11
T4     0
Name: n, dtype: int64
P 0

T topic_id
T0     2
T1     2
T2     6
T3    11
T4     0
Name: n, dtype: int64
P 0

T topic_id
T0     2
T1     2
T2     6
T3    11
T4     0
Name: n, dtype: int64
P 0

T topic_id
T0     2
T1     2
T2     7
T3    10
T4     0
Name: n, dtype: int64
P 1

T topic_id
T0     2
T1     2
T2     7
T3    10
T4     0
Name: n, dtype: int64
P 1

T topic_id
T0    2
T1    2
T2    8
T3    9
T4    0
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    2
T2    8
T3    9
T4    0
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    2
T2    8
T3    9
T4    0
Name: n, dtype: int64
P 0

T topic_id
T0    3


 82%|████████▏ | 412/500 [00:17<00:03, 23.45it/s]

T topic_id
T0    4
T1    5
T2    4
T3    2
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    5
T2    4
T3    2
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    5
T2    4
T3    2
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    6
T2    4
T3    1
T4    6
Name: n, dtype: int64
P 1

T topic_id
T0    3
T1    6
T2    4
T3    1
T4    7
Name: n, dtype: int64
P 1

T topic_id
T0    3
T1    6
T2    4
T3    1
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    7
T2    4
T3    1
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    6
T2    4
T3    1
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    7
T2    3
T3    1
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    7
T2    3
T3    1
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    7
T2    3
T3    1
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    7
T2    3
T3    0
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    6
T2    4
T3    0
T4    7
Name: n, dtype: int64

 84%|████████▍ | 421/500 [00:17<00:03, 24.02it/s]

T topic_id
T0    2
T1    7
T2    5
T3    2
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    7
T2    5
T3    2
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    6
T2    5
T3    3
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    6
T2    6
T3    2
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    6
T2    6
T3    2
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    6
T2    6
T3    2
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    6
T2    6
T3    2
T4    5
Name: n, dtype: int64
P 1

T topic_id
T0    2
T1    6
T2    6
T3    2
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    7
T2    6
T3    2
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    7
T2    5
T3    3
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    1
T1    8
T2    5
T3    3
T4    4
Name: n, dtype: int64
P 1

T topic_id
T0    2
T1    8
T2    5
T3    3
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    8
T2    4
T3    3
T4    4
Name: n, dtype: int64

 86%|████████▌ | 430/500 [00:17<00:02, 25.61it/s]

T topic_id
T0    5
T1    5
T2    2
T3    6
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    5
T2    3
T3    6
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    5
T2    3
T3    5
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    5
T2    3
T3    6
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    5
T2    3
T3    7
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    4
T2    3
T3    7
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    4
T2    4
T3    7
T4    2
Name: n, dtype: int64
P 1

T topic_id
T0    5
T1    4
T2    4
T3    7
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    3
T2    5
T3    7
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    2
T2    5
T3    7
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    2
T2    5
T3    7
T4    1
Name: n, dtype: int64
P 1

T topic_id
T0    6
T1    3
T2    5
T3    6
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    3
T2    5
T3    6
T4    1
Name: n, dtype: int64

 87%|████████▋ | 436/500 [00:18<00:03, 21.02it/s]

T topic_id
T0    5
T1    3
T2    2
T3    5
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    2
T2    2
T3    5
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    1
T2    2
T3    6
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    1
T2    2
T3    6
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    2
T2    2
T3    6
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    2
T2    3
T3    5
T4    6
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    2
T2    3
T3    5
T4    5
Name: n, dtype: int64
P 1

T topic_id
T0    6
T1    3
T2    3
T3    5
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    3
T2    3
T3    4
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    3
T2    3
T3    3
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    2
T2    3
T3    3
T4    5
Name: n, dtype: int64
P 1

T topic_id
T0    7
T1    2
T2    4
T3    3
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    2
T2    4
T3    3
T4    5
Name: n, dtype: int64

 88%|████████▊ | 442/500 [00:18<00:02, 23.10it/s]

T topic_id
T0    4
T1    7
T2    7
T3    2
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    8
T2    7
T3    1
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    9
T2    7
T3    1
T4    0
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    8
T2    7
T3    1
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    8
T2    7
T3    1
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    8
T2    6
T3    2
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    8
T2    5
T3    3
T4    2
Name: n, dtype: int64
P 1

T topic_id
T0    3
T1    7
T2    5
T3    4
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    6
T2    5
T3    4
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    6
T2    4
T3    5
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    7
T2    4
T3    5
T4    2
Name: n, dtype: int64
P 1

T topic_id
T0    4
T1    7
T2    3
T3    5
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    8
T2    3
T3    5
T4    2
Name: n, dtype: int64

 91%|█████████ | 454/500 [00:18<00:01, 25.80it/s]

T topic_id
T0    10
T1     5
T2     1
T3     2
T4     3
Name: n, dtype: int64
P 0

T topic_id
T0    11
T1     4
T2     1
T3     2
T4     3
Name: n, dtype: int64
P 0

T topic_id
T0    10
T1     4
T2     2
T3     2
T4     3
Name: n, dtype: int64
P 0

T topic_id
T0    9
T1    4
T2    3
T3    2
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    9
T1    5
T2    2
T3    2
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    9
T1    5
T2    3
T3    1
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    5
T2    3
T3    1
T4    4
Name: n, dtype: int64
P 1

T topic_id
T0    7
T1    5
T2    3
T3    2
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    5
T2    3
T3    2
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    5
T2    3
T3    3
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    5
T2    3
T3    3
T4    4
Name: n, dtype: int64
P 1

T topic_id
T0    6
T1    5
T2    3
T3    3
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    5
T2    3
T3    4
T4    

 92%|█████████▏| 460/500 [00:19<00:01, 26.53it/s]

T topic_id
T0    4
T1    4
T2    3
T3    3
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    5
T2    2
T3    3
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    5
T2    2
T3    3
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    5
T2    2
T3    3
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    5
T2    2
T3    3
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    5
T2    2
T3    3
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    5
T2    2
T3    3
T4    6
Name: n, dtype: int64
P 1

T topic_id
T0    5
T1    4
T2    2
T3    3
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    4
T2    2
T3    2
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    3
T2    2
T3    2
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    3
T2    2
T3    2
T4    7
Name: n, dtype: int64
P 1

T topic_id
T0    7
T1    3
T2    2
T3    2
T4    7
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    3
T2    2
T3    3
T4    6
Name: n, dtype: int64

 93%|█████████▎| 466/500 [00:19<00:01, 26.99it/s]

T topic_id
T0    2
T1    5
T2    3
T3    8
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    6
T2    3
T3    7
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    6
T2    4
T3    7
T4    2
Name: n, dtype: int64
P 1

T topic_id
T0    2
T1    5
T2    4
T3    8
T4    2
Name: n, dtype: int64
P 1

T topic_id
T0    2
T1    4
T2    4
T3    8
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    2
T1    3
T2    4
T3    9
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    2
T2    4
T3    9
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    2
T2    4
T3    8
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    3
T1    2
T2    4
T3    7
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    2
T2    4
T3    6
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    4
T1    2
T2    4
T3    6
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    2
T2    4
T3    5
T4    5
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    2
T2    4
T3    5
T4    5
Name: n, dtyp

 95%|█████████▌| 475/500 [00:19<00:01, 22.86it/s]

T topic_id
T0    8
T1    5
T2    1
T3    3
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    5
T2    1
T3    3
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    6
T2    1
T3    2
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    6
T2    1
T3    2
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    8
T1    6
T2    1
T3    2
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    6
T2    2
T3    2
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    7
T2    2
T3    1
T4    4
Name: n, dtype: int64
P 1

T topic_id
T0    7
T1    8
T2    2
T3    0
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    9
T2    2
T3    0
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    9
T2    2
T3    0
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    8
T2    2
T3    0
T4    4
Name: n, dtype: int64
P 1

T topic_id
T0    6
T1    8
T2    3
T3    0
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    8
T2    3
T3    0
T4    4
Name: n, dtyp

 97%|█████████▋| 487/500 [00:20<00:00, 24.74it/s]

T topic_id
T0    6
T1    2
T2    3
T3    7
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    2
T2    3
T3    7
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    5
T1    3
T2    3
T3    7
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    3
T2    3
T3    6
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    4
T2    3
T3    5
T4    3
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    4
T2    2
T3    5
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    4
T2    2
T3    5
T4    4
Name: n, dtype: int64
P 1

T topic_id
T0    6
T1    4
T2    2
T3    5
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    4
T2    2
T3    5
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    3
T2    2
T3    5
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    3
T2    3
T3    4
T4    4
Name: n, dtype: int64
P 1

T topic_id
T0    8
T1    3
T2    2
T3    4
T4    4
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    3
T2    3
T3    4
T4    4
Name: n, dtyp

 99%|█████████▊| 493/500 [00:20<00:00, 24.94it/s]

T topic_id
T0    7
T1    3
T2    3
T3    7
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    3
T2    3
T3    8
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    3
T2    3
T3    8
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    3
T2    4
T3    7
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    3
T2    4
T3    7
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    2
T2    5
T3    7
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    6
T1    2
T2    6
T3    6
T4    1
Name: n, dtype: int64
P 1

T topic_id
T0    6
T1    1
T2    7
T3    6
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    1
T2    6
T3    6
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    1
T2    6
T3    5
T4    2
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    1
T2    6
T3    6
T4    1
Name: n, dtype: int64
P 1

T topic_id
T0    7
T1    1
T2    6
T3    6
T4    1
Name: n, dtype: int64
P 0

T topic_id
T0    7
T1    1
T2    6
T3    6
T4    1
Name: n, dtyp

100%|██████████| 500/500 [00:20<00:00, 24.23it/s]


In [55]:
tm1.TOPIC

topic_id,top_terms,n
T0,ate banana smoothie,3
T1,adopted broccoli eat yesterday,4
T2,bananas breakfast cute kitten kittens look sis...,9
T3,,0
T4,broccoli chinchillas hamster like munching piece,6


In [56]:
cp1.style.background_gradient(axis=None)

doc_id,doc_str,T0,T1,T2,T3,T4
0,I ate a banana and a spinach smoothie for breakfast.,3,0,2,0,0
1,I like to eat broccoli and bananas.,0,2,1,0,1
2,Chinchillas and kittens are cute.,0,0,2,0,1
3,My sister adopted a kitten yesterday.,0,2,2,0,0
4,Look at this cute hamster munching on a piece of broccoli.,0,0,2,0,4
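
The document-topic counts above can be read in two directions. The following is a minimal sketch, not part of the original notebook: it copies the counts from the table by hand into plain Python lists (the names `doc_topic_counts`, `topic_totals`, and `doc_topic_props` are illustrative) and computes per-topic totals and unsmoothed per-document topic proportions.

```python
# Document-topic counts copied by hand from the cp1 table (columns T0..T4).
topics = ["T0", "T1", "T2", "T3", "T4"]
doc_topic_counts = [
    [3, 0, 2, 0, 0],  # doc 0
    [0, 2, 1, 0, 1],  # doc 1
    [0, 0, 2, 0, 1],  # doc 2
    [0, 2, 2, 0, 0],  # doc 3
    [0, 0, 2, 0, 4],  # doc 4
]

# Column sums: number of tokens assigned to each topic across all documents.
topic_totals = {
    t: sum(row[i] for row in doc_topic_counts) for i, t in enumerate(topics)
}

# Row-normalized counts: each document's topic mixture, without any
# Dirichlet smoothing (an assumption made to keep the arithmetic visible).
doc_topic_props = [[n / sum(row) for n in row] for row in doc_topic_counts]

print(topic_totals)        # {'T0': 3, 'T1': 4, 'T2': 9, 'T3': 0, 'T4': 6}
print(doc_topic_props[0])  # [0.6, 0.0, 0.4, 0.0, 0.0]
```

The column sums agree with the `n` column of `tm1.TOPIC`, including the empty topic T3.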


# Demo 2

## Data

In [34]:
some_documents = [
    ["Hadoop", "Big Data", "HBase", "Java", "Spark", "Storm", "Cassandra"],
    ["NoSQL", "MongoDB", "Cassandra", "HBase", "Postgres"],
    ["Python", "scikit-learn", "scipy", "numpy", "statsmodels", "pandas"],
    ["R", "Python", "statistics", "regression", "probability"],
    ["machine learning", "regression", "decision trees", "libsvm"],
    ["Python", "R", "Java", "C++", "Haskell", "programming languages"],
    ["statistics", "probability", "mathematics", "theory"],
    ["machine learning", "scikit-learn", "Mahout", "neural networks"],
    ["neural networks", "deep learning", "Big Data", "artificial intelligence"],
    ["Hadoop", "Java", "MapReduce", "Big Data"],
    ["statistics", "R", "statsmodels"],
    ["C++", "deep learning", "artificial intelligence", "probability"],
    ["pandas", "R", "Python"],
    ["databases", "HBase", "Postgres", "MySQL", "MongoDB"],
    ["libsvm", "regression", "support vector machines"]
]
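# Join each term list into a single document string; the tokenizer in
# Corpus.convert_corpus splits on whitespace, so multi-word terms are split again.
raw_docs2 = [' '.join(item) for item in some_documents]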
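
Because the term lists are joined with spaces, multi-word terms do not survive as units. The sketch below is illustrative only: it reuses the cleaning rule from `Corpus.convert_corpus` (split on whitespace, strip non-word characters and underscores, lowercase) in a hypothetical helper named `normalize`, and leaves out stop-word filtering so that it runs without NLTK data.

```python
import re

def normalize(doc_str):
    "Tokenize and clean a document string as Corpus.convert_corpus does, minus stop-word removal."
    return [re.sub(r'[\W_]+', '', token).lower() for token in doc_str.split()]

# A few terms taken from some_documents.
sample = ["Big Data", "scikit-learn", "C++", "machine learning"]
terms = normalize(' '.join(sample))

print(terms)  # ['big', 'data', 'scikitlearn', 'c', 'machine', 'learning']
```

This is why the topic tables below list terms such as `big`, `data`, `scikitlearn`, and `c` rather than the original multi-word or punctuated labels.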

## Process

In [42]:
cp2, tm2 = do_all(raw_docs2, k=10, iters=500)

100%|██████████| 500/500 [00:52<00:00,  9.47it/s]


In [43]:
tm2.TOPIC

topic_id,top_terms,n
T00,hbase intelligence java languages machines mys...,8
T01,decision java learning libsvm networks statist...,7
T02,cassandra mahout mongodb r scikitlearn,6
T03,cassandra mapreduce numpy postgres r regressio...,9
T04,artificial big c data databases deep haskell h...,15
T05,artificial c data hadoop intelligence neural p...,13
T06,hbase mongodb pandas r,4
T07,machine mathematics probability regression sci...,6
T08,big deep hadoop learning libsvm machine networ...,11
T09,pandas python statistics,3


In [37]:
cp2.style.background_gradient(axis=None)

doc_id,doc_str,T00,T01,T02,T03,T04,T05,T06,T07,T08,T09
0,Hadoop Big Data HBase Java Spark Storm Cassandra,1,1,1,0,1,0,3,0,0,1
1,NoSQL MongoDB Cassandra HBase Postgres,0,1,1,0,0,0,1,0,2,0
2,Python scikit-learn scipy numpy statsmodels pandas,0,2,3,1,0,0,0,0,0,0
3,R Python statistics regression probability,0,0,1,0,1,2,0,0,0,1
4,machine learning regression decision trees libsvm,0,0,1,0,0,0,0,2,2,1
5,Python R Java C++ Haskell programming languages,1,0,0,2,1,1,0,0,1,1
6,statistics probability mathematics theory,0,0,0,0,0,1,2,1,0,0
7,machine learning scikit-learn Mahout neural networks,2,0,0,0,0,1,3,0,0,0
8,neural networks deep learning Big Data artificial intelligence,3,0,0,0,0,2,1,0,1,1
9,Hadoop Java MapReduce Big Data,1,0,0,0,0,0,0,3,0,1
