# **Tutorial** - (semi)-supervised topic modeling
(last updated 26-04-2021)

In this tutorial, we will look at a new feature of BERTopic: (semi)-supervised topic modeling. It allows us to steer the dimensionality reduction of the embeddings into a space that closely follows any labels we already have.

## Semi-supervised modeling
(Semi)-supervised topic modeling is a class of methods that lets the user perform topic modeling with previously defined labels. This can help nudge the model towards specific topics or classes for which labels are available.

<br>

<img src="https://raw.githubusercontent.com/MaartenGr/BERTopic/master/images/logo.png" width="40%">

# Enabling the GPU

First, you'll need to enable GPUs for the notebook:

- Navigate to Edit→Notebook Settings
- Select GPU from the Hardware Accelerator drop-down

[Reference](https://colab.research.google.com/notebooks/gpu.ipynb)

# Installing BERTopic

We start by installing BERTopic from PyPI:

In [3]:
%%capture
!pip install bertopic

## Restart the Notebook
Installing BERTopic updates some packages that were already loaded. To use the updated versions correctly, we should now restart the notebook.

From the Menu:

Runtime → Restart Runtime

# **Data**
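For this example, the tutorial text refers to the popular 20 Newsgroups dataset, which contains roughly 18,000 newsgroup posts, each assigned to one of 20 topics. In this notebook that loader is commented out, and two CSV files (labeled and unlabeled documents) are read from Google Drive instead: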

In [1]:
import pandas as pd
from bertopic import BERTopic
#from sklearn.datasets import fetch_20newsgroups

#data = fetch_20newsgroups(subset='all',  remove=('headers', 'footers', 'quotes'))
#docs = data["data"]
#targets = data["target"]
#target_names = data["target_names"]
#classes = [data["target_names"][i] for i in data["target"]]

In [2]:
import numpy as np
import io
from google.colab import drive
drive.mount('/content/drive')
unlabeled_data = pd.read_csv('/content/drive/My Drive/BERTopic_Unlabeled.csv')
labeled_data = pd.read_csv('/content/drive/My Drive/BERTopic_Labeled.csv')

Mounted at /content/drive




In [3]:
def make_string(text):
    text_preprocessed = (" ").join(text)
    return text_preprocessed

In [4]:
from ast import literal_eval
unlabeled_data['tokens'] = unlabeled_data['tokens'].apply(lambda row: literal_eval(row))
unlabeled_data['tokens'] = unlabeled_data['tokens'].apply(lambda x: make_string(x))
labeled_data['tokens'] = labeled_data['tokens'].apply(lambda row: literal_eval(row))
labeled_data['tokens'] = labeled_data['tokens'].apply(lambda x: make_string(x))
labeled_data = labeled_data.replace(np.nan, False)
labeled_data['true count'] = labeled_data[['university', 'relationships','break ups', 'divorce', 'weddings', 'death', 'family', 'friendship']].sum(axis=1)
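
The `tokens` column is stored in the CSV as the string form of a Python list, so it has to be parsed before the tokens can be joined into one document string. A minimal, pandas-free sketch of that round trip (the sample string and the helper name `tokens_to_text` are illustrative, not taken from the dataset):

```python
from ast import literal_eval

def tokens_to_text(raw):
    """Parse a stringified token list and join the tokens with spaces."""
    tokens = literal_eval(raw)   # "['a', 'b']" -> ['a', 'b']
    return " ".join(tokens)

sample = "['wall', 'street', 'journal']"  # hypothetical cell value
text = tokens_to_text(sample)
print(text)
```

`literal_eval` only evaluates Python literals, which makes it a safer choice than `eval` for values read from a file.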

In [5]:
labeled_data['true count'].value_counts()

1    409
0    393
2    149
3     38
4      9
5      2
Name: true count, dtype: int64

In [6]:
labeled_data['true list'] = labeled_data[['university', 'relationships','break ups', 'divorce', 'weddings', 'death', 'family', 'friendship']].apply(lambda row: row[row == True].index.tolist(), axis=1)
labeled_data['true list'] = labeled_data['true list'].apply(lambda x: [-1] if x == [] else x )
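
The row-wise step above turns a set of boolean flags into a list of label names and falls back to `[-1]` when no flag is set. The same logic in plain Python, using hypothetical rows represented as dicts (the helper name `true_labels` is an assumption for illustration):

```python
LABEL_COLUMNS = ['university', 'relationships', 'break ups', 'divorce',
                 'weddings', 'death', 'family', 'friendship']

def true_labels(row):
    """Return the names of all columns flagged True, or [-1] if none are."""
    labels = [col for col in LABEL_COLUMNS if row.get(col) is True]
    return labels if labels else [-1]

# Hypothetical rows; missing keys are treated as False.
row_a = {'relationships': True, 'family': True}
row_b = {}
print(true_labels(row_a))
print(true_labels(row_b))
```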

In [7]:
labeled_data

Unnamed: 0.3,Unnamed: 0,Unnamed: 0.2,Unnamed: 0.1,index,Unnamed: 0.1.1,bookId,title,series,author,rating,...,divorce,weddings,death,family,friendship,labeled?,Contains True?,tokens,true count,true list
0,0,0,0,0,39822,34838660-not-part-of-the-plan,Not Part of the Plan,Blue Moon #4,Lucy Score (Goodreads Author),4.46,...,False,False,False,False,False,Yes,1.0,wall street journal amazon bestselling author ...,1,[relationships]
1,1,1,1,1,34235,20176552-dragon-age-volume-1,"Dragon Age, Volume 1",Dragon Age Graphic Novels #1-3,"David Gaider, Chad Hardin (Illustrator), Antho...",4.26,...,False,False,False,False,False,Yes,0.0,helping set stage biowares hotly anticipated d...,0,[-1]
2,2,2,2,2,27904,124110.Dangerous_to_Know,Dangerous to Know,False,Barbara Taylor Bradford (Goodreads Author),3.73,...,True,False,True,True,False,Yes,1.0,sebastian locke fiftysixyearold patriarch powe...,4,"[relationships, divorce, death, family]"
3,3,3,3,3,10515,1046450.The_Wheel_of_Fortune,The Wheel of Fortune,False,Susan Howatch,4.11,...,False,False,False,True,False,Yes,1.0,take back oxmoon lost paradise childhood take ...,2,"[relationships, family]"
4,4,4,4,4,935,872333.Blue_Bloods,Blue Bloods,Blue Bloods #1,Melissa de la Cruz (Goodreads Author),3.69,...,False,False,False,False,False,Yes,0.0,mayflower set sail carried board men women wou...,0,[-1]
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
995,995,995,995,995,17361,588326.The_Blue_Helmet,The Blue Helmet,False,William Bell,3.42,...,False,False,False,False,True,False,1.0,lee wants tarantula member biggest powerful ga...,1,[friendship]
996,996,996,996,996,9029,93007.The_Merry_Adventures_of_Robin_Hood,The Merry Adventures of Robin Hood,False,Howard Pyle,4.07,...,False,False,False,False,False,False,0.0,merry adventures robin hood great renown notti...,0,[-1]
997,997,997,997,997,32216,1085376.Before_You_Sleep,Before You Sleep,False,"Linn Ullmann, Tiina Nunnally (Translator)",3.34,...,False,False,False,True,False,False,1.0,moving presentday oslo brooklyn sleep tells st...,1,[family]
998,998,998,998,998,1036,28195.Inkspell,Inkspell,Inkworld #2,"Cornelia Funke (Goodreads Author), Anthea Bell...",3.91,...,False,False,False,False,False,False,0.0,captivating sequel inkheart critically acclaim...,0,[-1]


In [8]:
import random

In [9]:
labeled_data['target'] = labeled_data['true list'].apply(lambda x: random.choice(x))
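
Because `random.choice` is unseeded here, a document with several labels can receive a different `target` on every run. A small sketch of a reproducible variant, assuming a fixed seed is acceptable (the seed value and the helper name `pick_target` are arbitrary choices for illustration):

```python
import random

def pick_target(labels, rng):
    """Pick one label from a non-empty list using the supplied generator."""
    return rng.choice(labels)

labels = ['relationships', 'divorce', 'death', 'family']

# Two generators created with the same seed make the same pick.
first = pick_target(labels, random.Random(42))
second = pick_target(labels, random.Random(42))
print(first == second)
```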

In [10]:
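# Hold out rows 700-898 as a test set; copy so later changes to labeled_data do not affect it.
test_set = labeled_data[700:899].copy()
# Keep an independent copy of all labeled rows (plain assignment would only create an alias).
full_labeled = labeled_data.copy()
# Remove rows 700-899 from the training data (note that row 899 ends up in neither set).
labeled_data = labeled_data.drop(labeled_data.index[700:900])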

In [11]:
labeled_data['tokens'][900]



Each document can be put into one of the following categories:

In [17]:
target_names = ['university', 'relationships','break ups', 'divorce', 'weddings', 'death', 'family', 'friendship']
labeled_data['target_num'] = labeled_data['target'].apply(lambda x: target_names.index(x) if x != -1 else x)
unlabeled_data['target_num'] = -1
unlabeled_data['target'] = -1

In [20]:
L_data = labeled_data[['tokens', 'target', 'target_num']]
UL_data = unlabeled_data[['tokens', 'target', 'target_num']]
# DataFrame.append was removed in pandas 2.0; pd.concat gives the same result.
data = pd.concat([L_data, UL_data], ignore_index=True)

In [21]:
data

Unnamed: 0,tokens,target,target_num
0,wall street journal amazon bestselling author ...,relationships,1
1,helping set stage biowares hotly anticipated d...,-1,-1
2,sebastian locke fiftysixyearold patriarch powe...,divorce,3
3,take back oxmoon lost paradise childhood take ...,relationships,1
4,mayflower set sail carried board men women wou...,-1,-1
...,...,...,...
29447,katy carr lively daredevil oldest sister big f...,-1,-1
29448,one day could become anonymous free obligation...,-1,-1
29449,jesus son first collection stories denis johns...,-1,-1
29450,akiko crew spuckler boach mr beeba poog gax fa...,-1,-1


In [22]:
classes = data['target']
docs = data['tokens']
targets = data['target_num']

# **(semi)-Supervised modeling**


## Basic Model
Before we start with semi-supervised modeling, let us first take a look at the output of the basic model.

The topics that were created mostly make sense. There are some clearly defined topics such as "nasa, orbit, spacecraft, moon" but also some topics that seem mostly derived from other topics. We can visualize this by extracting the topic representations per class and see if our unsupervised model closely resembles this. 

**NOTE**: You can **hover** over the bars to see the representation per class!!

The results do seem promising. Topics like "nasa, space, etc" seem to be clearly related to sci.space, but some topics were created that span many categories. For example, we expect the topic "bike, bikes, etc"  to only appear in rec.motorcycles.  
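
One way to check whether a topic spans several categories is to count, for each topic, how many of its documents fall into each class. A minimal sketch with made-up topic assignments and class names (none of these values come from the model in this notebook):

```python
from collections import Counter, defaultdict

# Hypothetical per-document topic ids and class labels, aligned by position.
doc_topics = [0, 0, 1, 1, 1, 2]
doc_classes = ['sci.space', 'sci.space', 'rec.motorcycles',
               'rec.autos', 'rec.motorcycles', 'sci.space']

class_counts = defaultdict(Counter)
for topic, label in zip(doc_topics, doc_classes):
    class_counts[topic][label] += 1

# A topic is "spread out" if its documents come from more than one class.
spread_topics = [t for t, counts in class_counts.items() if len(counts) > 1]
print(spread_topics)
```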

## Semi-supervised
In the example above you might notice that some topics were somewhat smushed together. What we would like to see is a clear separation between those topics. Fortunately, we have the labels and can use them to improve the model.

Since we are only interested in the topics we have labels for, this method is called semi-supervised topic modeling. In practice, this means that we have the labels of some documents but not all.

For this example let's say we only have the labels of all computer-related categories:
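
A minimal sketch of such partial labeling, assuming `target_names` and `targets` hold the category names and the integer class of each document, as in the commented-out 20 Newsgroups loader. The tiny lists below are stand-ins, and the names `labels_to_keep`, `keep_indices` and `new_labels` are illustrative:

```python
# Stand-in data: three category names and the class index of five documents.
target_names = ['comp.graphics', 'rec.autos', 'sci.space']
targets = [0, 1, 2, 0, 1]

# Keep labels only for the computer-related categories.
labels_to_keep = [name for name in target_names if name.startswith('comp.')]
keep_indices = {target_names.index(name) for name in labels_to_keep}

# Every other document is marked as unlabeled.
new_labels = [t if t in keep_indices else -1 for t in targets]
print(new_labels)
```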

When generating our new labels it is important to mark unknown classes as **-1**. Next, we use those newly constructed labels to again run BERTopic:

In [24]:
topic_model = BERTopic(verbose=True, calculate_probabilities=True)
topics, probs = topic_model.fit_transform(docs, y=targets)

2022-07-27 13:34:19,447 - BERTopic - Transformed documents to Embeddings
2022-07-27 13:35:32,646 - BERTopic - Reduced dimensionality
2022-07-27 13:36:57,598 - BERTopic - Clustered reduced embeddings


In [42]:
topic_model.save("Bertopic_all_data_v1")

In [43]:
topic_model.get_topic_info()

Unnamed: 0,Topic,Count,Name
0,-1,19432,-1_one_life_new_world
1,0,769,0_dragon_dragons_magic_king
2,1,652,1_killer_detective_case_murder
3,2,410,2_vampire_vampires_blood_werewolf
4,3,262,3_lady_duke_earl_london
...,...,...,...
220,219,10,219_kimihiro_watanuki_yko_chapters
221,220,10,220_oscar_bailey_smiler_beth
222,221,10,221_photographer_mcknight_derrick_suspense
223,222,10,222_ellie_rachel_wes_kiss


In [44]:
topic_model.visualize_topics()

In [45]:
# Capture the returned (topics, probabilities) tuple so the full topic list is not dumped as cell output.
new_topics, new_probs = topic_model.reduce_topics(docs, topics, nr_topics=30)

2022-07-27 14:30:52,680 - BERTopic - Reduced number of topics from 225 to 31

In [46]:
topic_model.get_topic_info()

Unnamed: 0,Topic,Count,Name
0,-1,23501,-1_one_life_new_world
1,0,858,0_dragon_magic_king_must
2,1,762,1_killer_murder_case_detective
3,2,455,2_vampire_vampires_blood_werewolf
4,3,282,3_nazi_german_war_jewish
5,4,262,4_lady_duke_man_london
6,5,251,5_planet_earth_alien_human
7,6,242,6_book_allah_jesus_bible
8,7,199,7_paris_french_novel_one
9,8,190,8_fantasy_trilogy_magic_book


In [47]:
topic_model.visualize_topics()

In [2]:
all_tops_model = BERTopic.load("Bertopic_all_data_v1")


In [1]:
tops, t_similarity = all_tops_model.find_topics('friendship')
tops


In [83]:
t_similarity

[0.7473531465860221,
 0.6069352770781393,
 0.6020480569764376,
 0.598251909942831,
 0.5605734669081204]
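
Scores like these can be used to keep only the topics that are sufficiently similar to the search term. A minimal sketch, pairing the similarity values above with hypothetical topic ids and an assumed cut-off of 0.6, chosen only so that something is filtered out (the helper name `filter_topics` is also an assumption):

```python
# Similarity scores from the output above; the topic ids are hypothetical.
topic_ids = [40, 183, 25, 81, -1]
similarities = [0.7473531465860221, 0.6069352770781393, 0.6020480569764376,
                0.598251909942831, 0.5605734669081204]

def filter_topics(topic_ids, similarities, threshold):
    """Keep topics at or above the threshold and drop the outlier topic -1.

    Building a new list avoids removing items from a list while looping
    over it, which can silently skip elements.
    """
    return [t for t, s in zip(topic_ids, similarities)
            if s >= threshold and t != -1]

kept = filter_topics(topic_ids, similarities, threshold=0.6)
print(kept)
```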

In [85]:
def refine_topics(LE):
  # Find the topics most similar to the life event (LE).
  tops, t_similarity = all_tops_model.find_topics(LE)
  # Keep topics with similarity of at least 0.5 and drop the outlier topic (-1).
  # A new list is built instead of removing items while iterating, which can skip elements.
  kept = [t for t, s in zip(tops, t_similarity) if s >= 0.5 and t != -1]
  return {LE: kept}
  

In [86]:
def LE_tops():
  LE_tops_dict = {}
  for life_event in ['university', 'relationships', 'break ups', 'divorce', 'weddings', 'death', 'family', 'friendship']:
    d = refine_topics(life_event)
    LE_tops_dict.update(d)
  return LE_tops_dict

In [87]:
LE_tops()

{'break ups': [81, 69],
 'death': [14, 1, 25, 82],
 'divorce': [86, 25, 3, 183],
 'family': [25, 85, 11, 15],
 'friendship': [40, 183, 25, 81],
 'relationships': [183, 25, 26, 11],
 'university': [81, 173, 3],
 'weddings': [86, 25, 84]}

In [88]:
LE_Top_Map = LE_tops()

In [89]:
LE_Top_Map

{'break ups': [81, 69],
 'death': [14, 1, 25, 82],
 'divorce': [86, 25, 3, 183],
 'family': [25, 85, 11, 15],
 'friendship': [40, 183, 25, 81],
 'relationships': [183, 25, 26, 11],
 'university': [81, 173, 3],
 'weddings': [86, 25, 84]}

In [93]:
docs = test_set['tokens']

In [94]:
docs

700    nobel prizewinning genetic researcher diednow ...
701    female population sunnydale starts strutting g...
702    medusa project team arrived sydney another exc...
703    eugenie markham shaman hire shes paidto bind b...
704    cassandra caravello one renaissance venices lu...
                             ...                        
894    fans alice sebold scott smith dark gripping de...
895    nurse amy leatheran agrees look american archa...
896    heres chance read one fifteen stories created ...
897    previously published print anthologies five go...
898    fans kane chronicles series adore gorgeous pri...
Name: tokens, Length: 199, dtype: object

In [96]:
docs[700]



In [97]:
doc_tops, doc_probs = all_tops_model.transform(docs[700])

Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:25:47,644 - BERTopic - Reduced dimensionality
2022-07-27 15:25:47,686 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:25:47,689 - BERTopic - Predicted clusters


In [98]:
doc_tops

[-1]
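
With `calculate_probabilities=True`, `transform` also returns topic probabilities for each document. A minimal sketch of how such probabilities could be turned into yes/no predictions per life event, using a made-up probability row and a made-up topic mapping (it assumes that position `t` in the row holds the probability of topic `t`; the helper name `predict_events` is hypothetical):

```python
# Hypothetical mapping from life events to topic ids.
event_topics = {'death': [0, 2], 'weddings': [1], 'family': [2, 3]}

# Hypothetical probabilities for one document; index t = topic t.
doc_probs_row = [0.05, 0.10, 0.70, 0.15]

def predict_events(probs_row, event_topics, threshold=0.5):
    """Flag an event if any of its topics exceeds the probability threshold."""
    return {event: any(probs_row[t] > threshold for t in topic_ids)
            for event, topic_ids in event_topics.items()}

predictions = predict_events(doc_probs_row, event_topics)
print(predictions)
```

Because one topic can belong to several life events, a single document may be flagged for more than one event, which matches the multi-label nature of the data.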

In [106]:
#for each doc, for each LE, get list of probabilities of topic in LE. If any probability in list > 0.5, classify as LE. 
docs = test_set['tokens']
results = []
for doc in docs:
  doc_tops, doc_probs = all_tops_model.transform(doc)
  temp = []
  for LE in ['university', 'relationships', 'break ups', 'divorce', 'weddings', 'death', 'family', 'friendship']:
    LE_topics = LE_Top_Map.get(LE)
    print(LE_topics)
    pred = False
    for t in LE_topics:
      if doc_probs[0][t+1] > 0.5:
        pred = True
    temp.append(pred)
  results.append(temp)

Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:37:27,220 - BERTopic - Reduced dimensionality
2022-07-27 15:37:27,299 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:37:27,305 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:37:31,251 - BERTopic - Reduced dimensionality
2022-07-27 15:37:31,298 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:37:31,300 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:37:34,378 - BERTopic - Reduced dimensionality
2022-07-27 15:37:34,416 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:37:34,418 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:37:37,524 - BERTopic - Reduced dimensionality
2022-07-27 15:37:37,564 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:37:37,566 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:37:41,150 - BERTopic - Reduced dimensionality
2022-07-27 15:37:41,191 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:37:41,194 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:37:44,280 - BERTopic - Reduced dimensionality
2022-07-27 15:37:44,316 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:37:44,318 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:37:47,417 - BERTopic - Reduced dimensionality
2022-07-27 15:37:47,458 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:37:47,460 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:37:51,005 - BERTopic - Reduced dimensionality
2022-07-27 15:37:51,044 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:37:51,045 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:37:54,120 - BERTopic - Reduced dimensionality
2022-07-27 15:37:54,158 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:37:54,161 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:37:57,263 - BERTopic - Reduced dimensionality
2022-07-27 15:37:57,300 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:37:57,303 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:38:01,327 - BERTopic - Reduced dimensionality
2022-07-27 15:38:01,364 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:38:01,367 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:38:04,431 - BERTopic - Reduced dimensionality
2022-07-27 15:38:04,468 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:38:04,470 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:38:07,505 - BERTopic - Reduced dimensionality
2022-07-27 15:38:07,543 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:38:07,545 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:38:10,571 - BERTopic - Reduced dimensionality
2022-07-27 15:38:10,607 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:38:10,609 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:38:14,190 - BERTopic - Reduced dimensionality
2022-07-27 15:38:14,232 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:38:14,234 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:38:17,344 - BERTopic - Reduced dimensionality
2022-07-27 15:38:17,382 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:38:17,385 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:38:20,440 - BERTopic - Reduced dimensionality
2022-07-27 15:38:20,478 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:38:20,480 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:38:24,060 - BERTopic - Reduced dimensionality
2022-07-27 15:38:24,097 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:38:24,100 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:38:27,163 - BERTopic - Reduced dimensionality
2022-07-27 15:38:27,199 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:38:27,202 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:38:30,234 - BERTopic - Reduced dimensionality
2022-07-27 15:38:30,278 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:38:30,280 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:38:33,889 - BERTopic - Reduced dimensionality
2022-07-27 15:38:33,931 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:38:33,934 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:38:37,009 - BERTopic - Reduced dimensionality
2022-07-27 15:38:37,049 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:38:37,051 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:38:40,165 - BERTopic - Reduced dimensionality
2022-07-27 15:38:40,203 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:38:40,205 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:38:43,879 - BERTopic - Reduced dimensionality
2022-07-27 15:38:43,920 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:38:43,923 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:38:47,082 - BERTopic - Reduced dimensionality
2022-07-27 15:38:47,121 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:38:47,123 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:38:50,200 - BERTopic - Reduced dimensionality
2022-07-27 15:38:50,239 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:38:50,241 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:38:53,870 - BERTopic - Reduced dimensionality
2022-07-27 15:38:53,909 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:38:53,911 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:38:57,038 - BERTopic - Reduced dimensionality
2022-07-27 15:38:57,077 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:38:57,078 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:39:00,123 - BERTopic - Reduced dimensionality
2022-07-27 15:39:00,160 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:39:00,164 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:39:03,799 - BERTopic - Reduced dimensionality
2022-07-27 15:39:03,843 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:39:03,845 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:39:08,909 - BERTopic - Reduced dimensionality
2022-07-27 15:39:08,946 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:39:08,949 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:39:12,025 - BERTopic - Reduced dimensionality
2022-07-27 15:39:12,066 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:39:12,068 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:39:15,933 - BERTopic - Reduced dimensionality
2022-07-27 15:39:15,972 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:39:15,974 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:39:19,582 - BERTopic - Reduced dimensionality
2022-07-27 15:39:19,622 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:39:19,624 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:39:22,701 - BERTopic - Reduced dimensionality
2022-07-27 15:39:22,748 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:39:22,750 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:39:25,897 - BERTopic - Reduced dimensionality
2022-07-27 15:39:25,934 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:39:25,937 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:39:29,748 - BERTopic - Reduced dimensionality
2022-07-27 15:39:29,787 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:39:29,789 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:39:32,911 - BERTopic - Reduced dimensionality
2022-07-27 15:39:32,956 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:39:32,959 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:39:36,098 - BERTopic - Reduced dimensionality
2022-07-27 15:39:36,135 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:39:36,137 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:39:39,773 - BERTopic - Reduced dimensionality
2022-07-27 15:39:39,809 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:39:39,813 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:39:42,911 - BERTopic - Reduced dimensionality
2022-07-27 15:39:42,948 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:39:42,950 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:39:46,008 - BERTopic - Reduced dimensionality
2022-07-27 15:39:46,052 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:39:46,059 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:45:27,348 - BERTopic - Reduced dimensionality
2022-07-27 15:45:27,386 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:45:27,388 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:45:31,320 - BERTopic - Reduced dimensionality
2022-07-27 15:45:31,360 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:45:31,363 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:45:34,531 - BERTopic - Reduced dimensionality
2022-07-27 15:45:34,572 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:45:34,575 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:45:37,840 - BERTopic - Reduced dimensionality
2022-07-27 15:45:37,884 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:45:37,886 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:45:41,067 - BERTopic - Reduced dimensionality
2022-07-27 15:45:41,109 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:45:41,111 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:45:44,247 - BERTopic - Reduced dimensionality
2022-07-27 15:45:44,289 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:45:44,292 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:45:48,325 - BERTopic - Reduced dimensionality
2022-07-27 15:45:48,366 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:45:48,368 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:45:51,557 - BERTopic - Reduced dimensionality
2022-07-27 15:45:51,598 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:45:51,600 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:45:54,833 - BERTopic - Reduced dimensionality
2022-07-27 15:45:54,873 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:45:54,875 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:45:58,076 - BERTopic - Reduced dimensionality
2022-07-27 15:45:58,118 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:45:58,120 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:46:02,122 - BERTopic - Reduced dimensionality
2022-07-27 15:46:02,162 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:46:02,164 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:46:06,180 - BERTopic - Reduced dimensionality
2022-07-27 15:46:06,241 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:46:06,244 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:46:10,467 - BERTopic - Reduced dimensionality
2022-07-27 15:46:10,507 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:46:10,510 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:46:13,689 - BERTopic - Reduced dimensionality
2022-07-27 15:46:13,730 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:46:13,734 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:46:17,692 - BERTopic - Reduced dimensionality
2022-07-27 15:46:17,746 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:46:17,754 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:46:20,900 - BERTopic - Reduced dimensionality
2022-07-27 15:46:20,942 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:46:20,945 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:46:24,146 - BERTopic - Reduced dimensionality
2022-07-27 15:46:24,189 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:46:24,191 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:46:27,418 - BERTopic - Reduced dimensionality
2022-07-27 15:46:27,469 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:46:27,472 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:46:30,649 - BERTopic - Reduced dimensionality
2022-07-27 15:46:30,699 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:46:30,702 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:46:34,717 - BERTopic - Reduced dimensionality
2022-07-27 15:46:34,755 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:46:34,757 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:46:37,974 - BERTopic - Reduced dimensionality
2022-07-27 15:46:38,016 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:46:38,019 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:46:41,211 - BERTopic - Reduced dimensionality
2022-07-27 15:46:41,253 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:46:41,254 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:46:44,397 - BERTopic - Reduced dimensionality
2022-07-27 15:46:44,438 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:46:44,440 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:46:48,602 - BERTopic - Reduced dimensionality
2022-07-27 15:46:48,642 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:46:48,645 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:46:51,868 - BERTopic - Reduced dimensionality
2022-07-27 15:46:51,907 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:46:51,910 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:46:55,084 - BERTopic - Reduced dimensionality
2022-07-27 15:46:55,123 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:46:55,126 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:46:58,390 - BERTopic - Reduced dimensionality
2022-07-27 15:46:58,434 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:46:58,436 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:47:01,648 - BERTopic - Reduced dimensionality
2022-07-27 15:47:01,693 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:47:01,695 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:47:05,813 - BERTopic - Reduced dimensionality
2022-07-27 15:47:05,852 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:47:05,854 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:47:09,306 - BERTopic - Reduced dimensionality
2022-07-27 15:47:09,347 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:47:09,349 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:47:12,558 - BERTopic - Reduced dimensionality
2022-07-27 15:47:12,613 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:47:12,619 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:47:15,754 - BERTopic - Reduced dimensionality
2022-07-27 15:47:15,792 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:47:15,795 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:47:19,789 - BERTopic - Reduced dimensionality
2022-07-27 15:47:19,829 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:47:19,831 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:47:23,046 - BERTopic - Reduced dimensionality
2022-07-27 15:47:23,084 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:47:23,087 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:47:26,279 - BERTopic - Reduced dimensionality
2022-07-27 15:47:26,319 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:47:26,321 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:47:29,522 - BERTopic - Reduced dimensionality
2022-07-27 15:47:29,561 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:47:29,563 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:47:32,759 - BERTopic - Reduced dimensionality
2022-07-27 15:47:32,801 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:47:32,803 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:47:36,932 - BERTopic - Reduced dimensionality
2022-07-27 15:47:36,973 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:47:36,975 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:47:40,181 - BERTopic - Reduced dimensionality
2022-07-27 15:47:40,234 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:47:40,235 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:47:43,435 - BERTopic - Reduced dimensionality
2022-07-27 15:47:43,479 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:47:43,482 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:47:46,612 - BERTopic - Reduced dimensionality
2022-07-27 15:47:46,650 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:47:46,653 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:47:49,794 - BERTopic - Reduced dimensionality
2022-07-27 15:47:49,838 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:47:49,840 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:47:53,919 - BERTopic - Reduced dimensionality
2022-07-27 15:47:53,961 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:47:53,963 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:47:57,130 - BERTopic - Reduced dimensionality
2022-07-27 15:47:57,179 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:47:57,182 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:48:00,336 - BERTopic - Reduced dimensionality
2022-07-27 15:48:00,379 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:48:00,382 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:48:03,551 - BERTopic - Reduced dimensionality
2022-07-27 15:48:03,590 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:48:03,594 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:48:07,633 - BERTopic - Reduced dimensionality
2022-07-27 15:48:07,673 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:48:07,675 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:48:10,809 - BERTopic - Reduced dimensionality
2022-07-27 15:48:10,868 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:48:10,877 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:48:14,097 - BERTopic - Reduced dimensionality
2022-07-27 15:48:14,137 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:48:14,139 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:48:17,343 - BERTopic - Reduced dimensionality
2022-07-27 15:48:17,384 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:48:17,386 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:48:20,648 - BERTopic - Reduced dimensionality
2022-07-27 15:48:20,693 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:48:20,695 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:48:24,719 - BERTopic - Reduced dimensionality
2022-07-27 15:48:24,759 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:48:24,761 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:48:29,127 - BERTopic - Reduced dimensionality
2022-07-27 15:48:29,189 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:48:29,194 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:48:32,884 - BERTopic - Reduced dimensionality
2022-07-27 15:48:32,926 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:48:32,928 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:48:36,184 - BERTopic - Reduced dimensionality
2022-07-27 15:48:36,228 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:48:36,230 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:48:39,466 - BERTopic - Reduced dimensionality
2022-07-27 15:48:39,509 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:48:39,512 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:48:43,826 - BERTopic - Reduced dimensionality
2022-07-27 15:48:43,873 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:48:43,875 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:48:47,083 - BERTopic - Reduced dimensionality
2022-07-27 15:48:47,123 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:48:47,125 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 15:48:50,331 - BERTopic - Reduced dimensionality
2022-07-27 15:48:50,371 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 15:48:50,374 - BERTopic - Predicted clusters


[81, 173, 3]
[183, 25, 26, 11]
[81, 69]
[86, 25, 3, 183]
[86, 25, 84]
[14, 1, 25, 82]
[25, 85, 11, 15]
[40, 183, 25, 81]


In [None]:
#topic to life_event refinement
#remove -1 from topics
#remove all topics < 0.5
#map LE to topics
#if topic probability > 0.5
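
The refinement steps listed in these comments could be sketched roughly as follows. This is only an illustration, not the notebook's actual code: the helper `life_event_flags`, the `event_topics` mapping, and the way the 0.5 threshold is applied to a single probability are assumptions. The topic ids are sample values, and which ids belong to which life event is assumed.

```python
from typing import Dict, List


def life_event_flags(topic: int, prob: float,
                     event_topics: Dict[str, List[int]],
                     threshold: float = 0.5) -> Dict[str, bool]:
    """Flag each life event whose topic list contains the predicted topic."""
    # Outliers (-1) and low-confidence predictions map to no life event.
    confident = topic != -1 and prob > threshold
    return {event: confident and topic in ids
            for event, ids in event_topics.items()}


# Hypothetical mapping from life events (LE) to topic ids.
event_topics = {"weddings": [86, 25, 84], "friendship": [40, 183, 25, 81]}
flags = life_event_flags(topic=86, prob=0.91, event_topics=event_topics)
```

Collecting one such dictionary per document would give rows of booleans, one column per life event.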

In [107]:
results_df = pd.DataFrame(results, columns=['university', 'relationships', 'break ups', 'divorce', 'weddings', 'death', 'family', 'friendship'])

In [109]:
results_df.value_counts()

university  relationships  break ups  divorce  weddings  death  family  friendship
False       False          False      False    False     False  False   False         193
                                                         True   False   False           2
            True           False      True     False     False  False   True            2
                                      False    False     False  False   False           1
                                                                True    False           1
dtype: int64

In [110]:
results_df.to_csv('semi_sv_bertopic_results.csv')

In [111]:
!cp semi_sv_bertopic_results.csv "drive/My Drive/"

In [34]:
tops, t_similarity = topic_model.find_topics('wedding')
for i in tops:
  print(topic_model.get_topic_info(i))

   Topic  Count                                Name
0     86     31  86_wedding_jaclyn_weddings_married
   Topic  Count                     Name
0      3    262  3_lady_duke_earl_london
   Topic  Count                           Name
0     25     74  25_love_family_teamwork_novel
   Topic  Count                                   Name
0    183     12  183_emi_contemporary_romance_promises
   Topic  Count                              Name
0     84     31  84_amish_baxter_lancaster_miriam


In [36]:
tops, t_similarity = topic_model.find_topics('friendship')
for i in tops:
  print(topic_model.get_topic_info(i))

   Topic  Count                          Name
0     40     55  40_best_friends_friend_alice
   Topic  Count                                   Name
0    183     12  183_emi_contemporary_romance_promises
   Topic  Count                   Name
0     -1  19432  -1_one_life_new_world
   Topic  Count                           Name
0     25     74  25_love_family_teamwork_novel
   Topic  Count                                    Name
0     81     32  81_school_landaubanks_observation_fern
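
Rather than querying one term per cell, the same lookup could be run for several search terms in a loop. The sketch below is an assumption about how that might look: `topics_for_terms` is a hypothetical helper that relies only on `find_topics(term, top_n=...)` returning a pair of topic ids and similarities, as in the cells above, and it drops the outlier topic `-1`.

```python
from typing import Dict, Iterable, List


def topics_for_terms(topic_model, terms: Iterable[str],
                     top_n: int = 5) -> Dict[str, List[int]]:
    """Map each search term to its most similar topic ids, excluding outliers."""
    result = {}
    for term in terms:
        topic_ids, _similarity = topic_model.find_topics(term, top_n=top_n)
        # Topic -1 collects outliers, so it is not a meaningful match.
        result[term] = [t for t in topic_ids if t != -1]
    return result
```

With a fitted model, `topics_for_terms(topic_model, ['wedding', 'friendship'])` would return a dictionary keyed by search term.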


Finally, we can again extract the topics per class to see whether our semi-supervised approach had an effect:

In [37]:
topics_per_class = topic_model.topics_per_class(docs, topics, classes=classes)
fig_semi_supervised = topic_model.visualize_topics_per_class(topics_per_class)
fig_semi_supervised

9it [00:08,  1.12it/s]


In [None]:
test_topic, test_probs = topic_model.transform(test_set['tokens'][700])

Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 04:37:39,383 - BERTopic - Reduced dimensionality
2022-07-27 04:37:39,390 - BERTopic - Calculated probabilities with HDBSCAN
2022-07-27 04:37:39,396 - BERTopic - Predicted clusters


In [None]:
test_probs

array([[1.43701307e-20, 8.74104916e-15, 2.72220487e-19, 1.15641793e-14,
        9.64154277e-15, 1.04078157e-14, 9.07016529e-01, 1.50201422e-14,
        1.44848562e-20, 3.89370317e-19, 1.02133808e-14]])
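
One way to read a probability row like this is to take the most likely column and keep it only if it clears a cut-off. The sketch below assumes that column `i` corresponds to topic `i` (the outlier topic `-1` has no column) and uses a hypothetical helper `best_topic` with an illustrative 0.5 threshold. The values are rounded from the output above.

```python
from typing import List, Tuple


def best_topic(prob_row: List[float], threshold: float = 0.5) -> Tuple[int, float]:
    """Return (topic, probability); topic is -1 if the best probability is below threshold."""
    idx = max(range(len(prob_row)), key=lambda i: prob_row[i])
    p = prob_row[idx]
    return (idx if p >= threshold else -1), p


# Rounded copy of the single row in test_probs.
row = [1.4e-20, 8.7e-15, 2.7e-19, 1.2e-14, 9.6e-15, 1.0e-14,
       0.907, 1.5e-14, 1.4e-20, 3.9e-19, 1.0e-14]
topic, prob = best_topic(row)
```

Here almost all of the probability mass sits in a single column, so the choice of threshold matters little for this document.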

We can clearly see that many more topics about computers were created and that the separation between those topics is solid. This indicates that even if you do not have all the labels, you can still improve the model!

However, there are still some clusters that could be improved with the labels that we have. 
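
A common way to set up the partially labeled case is to keep the labels you trust and mark every other document as unlabeled. The sketch below assumes the convention that `-1` means "no label" in the `y` array passed to `fit_transform`. The helper `mask_labels` and the sample values are hypothetical.

```python
from typing import Iterable, List, Set


def mask_labels(targets: Iterable[int], keep: Set[int]) -> List[int]:
    """Keep labels listed in `keep`; replace all others with -1 (unlabeled)."""
    return [label if label in keep else -1 for label in targets]


# Toy class ids; only classes 0 and 3 are treated as known.
targets = [0, 3, 1, 3, 2, 0]
partial_targets = mask_labels(targets, keep={0, 3})
# partial_targets could then be passed as y, e.g.
# topic_model.fit_transform(docs, y=partial_targets)
```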

## Supervised

Finally, we use all labels. These labels help BERTopic understand where most clusters can be found. However, this does not mean that it will only find the 20 clusters that we have defined. If there are sub-clusters to be found, then there is a good chance BERTopic will find them!

Not only do we see a nice separation of the topics, there are also significantly fewer outliers, which suggests that BERTopic has improved at connecting documents to topics.

Let's see the results by again visualizing the topic representation per class:

Now that we have used all labels, BERTopic seems to closely match our pre-defined labels. Moreover, it still allows us to discover topics that were not previously defined. Thus, you can use this method to find unknown topics within pre-defined topics!

In [None]:
# Guided topic modeling expects one list of seed words per topic
seed_topic_list = [['weddings'], ['friendship'], ['family'], ['break ups'], ['relationships'], ['death'], ['divorce'], ['university']]
topic_model = BERTopic(seed_topic_list=seed_topic_list, calculate_probabilities=True, verbose=True)
topics, probs = topic_model.fit_transform(docs, y=targets)


Batches:   0%|          | 0/25 [00:00<?, ?it/s]

2022-07-27 04:45:14,554 - BERTopic - Transformed documents to Embeddings


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

2022-07-27 04:45:21,768 - BERTopic - Reduced dimensionality
2022-07-27 04:45:21,836 - BERTopic - Clustered reduced embeddings


In [None]:
probs[0]

array([2.24355509e-308, 3.44345959e-308, 3.18122837e-308, 1.48034627e-308,
       1.00000000e+000, 2.23852284e-308, 2.71970757e-308, 2.02957495e-308,
       2.25570805e-308])

In [None]:
topic_model.get_topic_info()

   Topic  Count                               Name
0     -1    371              -1_one_new_life_world
1      0    111            0_love_life_shes_school
2      1    107            1_novel_life_love_story
3      2     61           2_killer_one_murder_body
4      3     51         3_queen_kingdom_one_prince
5      4     40             4_stoker_new_one_times
6      5     23     5_earth_world_planet_humankind
7      6     14            6_doctor_mack_rose_book
8      7     12      7_vampire_vampires_saba_blood
9      8     10  8_unbounded_expected_landon_never
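
The share of documents that ended up in the outlier topic `-1` can be read off a table like this one. A minimal sketch, using a hypothetical helper `outlier_share` and the counts copied from the output above:

```python
from typing import Dict


def outlier_share(topic_counts: Dict[int, int]) -> float:
    """Fraction of documents assigned to the outlier topic -1."""
    total = sum(topic_counts.values())
    return topic_counts.get(-1, 0) / total if total else 0.0


# Topic id -> number of documents, copied from get_topic_info().
counts = {-1: 371, 0: 111, 1: 107, 2: 61, 3: 51,
          4: 40, 5: 23, 6: 14, 7: 12, 8: 10}
share = outlier_share(counts)  # 371 / 800
```

Computing the same number for each fitted model gives a simple way to compare how many documents were left unassigned.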


In [None]:
topics_per_class = topic_model.topics_per_class(docs, topics, classes=classes)
fig_semi_supervised = topic_model.visualize_topics_per_class(topics_per_class, top_n_topics=10)
fig_semi_supervised

9it [00:00, 14.43it/s]


In [None]:
topic_model.get_topic_info(10)

NameError: ignored

In [None]:
similar_topics, similarity = topic_model.find_topics("university", top_n=5)
for i in similar_topics:
    print(topic_model.get_topic(i))

[('london', 0.05192365395005623), ('city', 0.020787882842927326), ('londons', 0.015272069092610306), ('squid', 0.014952913069499684), ('jas', 0.014952913069499684), ('hannay', 0.012145605798262236), ('streets', 0.011840191293395055), ('richard', 0.01057144067236192), ('tube', 0.010031996644519975), ('malkanis', 0.009064856059962663)]
[('school', 0.013431133094552485), ('best', 0.009823454304689973), ('friends', 0.009464132922901597), ('summer', 0.00925477248264793), ('friend', 0.007521725997622359), ('shes', 0.007428361508460959), ('girl', 0.00669198165047273), ('girls', 0.006624445942547353), ('hes', 0.006591530574952619), ('year', 0.006245443569027486)]
[('kent', 0.0429726320735342), ('war', 0.02448913251228431), ('civil', 0.022706699046543346), ('america', 0.013737854573026915), ('south', 0.013472278520865457), ('kents', 0.013392252290198192), ('american', 0.01212188497955491), ('richmond', 0.012082065340881935), ('sid', 0.011619056906458657), ('philip', 0.010782754203034435)]
[('la

In [None]:
similar_topics, similarity = topic_model.find_topics("weddings", top_n=5)
for i in similar_topics:
    print(topic_model.get_topic(i))

[('wedding', 0.03916593957420759), ('kady', 0.020050764215207333), ('courcy', 0.01981639926858164), ('lavon', 0.019255973011817765), ('laurel', 0.016080266033420866), ('weddings', 0.015425903691891344), ('saoirse', 0.015404236943605848), ('fake', 0.01468915082628304), ('dresses', 0.013383102735489205), ('paddy', 0.012373499051415122)]
[('lady', 0.012961724086009402), ('duke', 0.012770980444443641), ('london', 0.01041792680048587), ('earl', 0.010334892305214215), ('handsome', 0.009158555970069062), ('marry', 0.009114910700602493), ('marriage', 0.008750041093465399), ('lord', 0.007848802719804085), ('sebastian', 0.007683679399235111), ('man', 0.007219229312385458)]
[('leigh', 0.03543143098235519), ('laura', 0.02560559461904678), ('roe', 0.025499112589130388), ('beau', 0.023053917227500064), ('clayborne', 0.022632620821345602), ('retirement', 0.02203703943189772), ('husband', 0.02161702700636077), ('moriarty', 0.020418432812756977), ('veronica', 0.019738595664811454), ('tess', 0.019580747

In [None]:
similar_topics, similarity = topic_model.find_topics("break ups", top_n=5)
for i in similar_topics:
    print(topic_model.get_topic(i))

[('life', 0.009562962604971383), ('sang', 0.009357778784174663), ('heart', 0.007853420914954843), ('im', 0.007665856079708666), ('love', 0.0073053206784513035), ('never', 0.00715828303561882), ('hes', 0.006350655925898168), ('away', 0.006010356821241176), ('everything', 0.005785891180156243), ('didnt', 0.005626585367011441)]
[('life', 0.003676018545439045), ('one', 0.0036720614476845933), ('world', 0.0034660000369633424), ('new', 0.003419656703564767), ('love', 0.0033461391623841523), ('family', 0.00292463995291762), ('story', 0.0028872719143259577), ('time', 0.00287022316312423), ('shes', 0.0028274234966838984), ('find', 0.0027607571398824892)]
[('shes', 0.00958059681358302), ('boyfriend', 0.008298512449777798), ('job', 0.008202797414343876), ('hollywood', 0.008131362487835914), ('movie', 0.0070544027894235235), ('perfect', 0.006878291545082357), ('alex', 0.006701325830191191), ('career', 0.006605880610103581), ('becky', 0.006084279908527788), ('lucy', 0.006078596701183006)]
[('aubrey

In [None]:
similar_topics, similarity = topic_model.find_topics("friendship", top_n=5)
for i in similar_topics:
    print(topic_model.get_topic(i))

[('school', 0.013431133094552485), ('best', 0.009823454304689973), ('friends', 0.009464132922901597), ('summer', 0.00925477248264793), ('friend', 0.007521725997622359), ('shes', 0.007428361508460959), ('girl', 0.00669198165047273), ('girls', 0.006624445942547353), ('hes', 0.006591530574952619), ('year', 0.006245443569027486)]
[('life', 0.003676018545439045), ('one', 0.0036720614476845933), ('world', 0.0034660000369633424), ('new', 0.003419656703564767), ('love', 0.0033461391623841523), ('family', 0.00292463995291762), ('story', 0.0028872719143259577), ('time', 0.00287022316312423), ('shes', 0.0028274234966838984), ('find', 0.0027607571398824892)]
[('love', 0.006861141948113139), ('family', 0.00632674839651223), ('novel', 0.005236338713958875), ('life', 0.004602173525220112), ('story', 0.004353950772237878), ('mother', 0.004266249748517645), ('new', 0.00391445130865658), ('author', 0.003881338566236123), ('taylor', 0.003806573611090984), ('characters', 0.0037501324252181517)]
[('vegetar

In [None]:
similar_topics, similarity = topic_model.find_topics("divorce", top_n=5)
for i in similar_topics:
    print(topic_model.get_topic(i))

[('lady', 0.012961724086009402), ('duke', 0.012770980444443641), ('london', 0.01041792680048587), ('earl', 0.010334892305214215), ('handsome', 0.009158555970069062), ('marry', 0.009114910700602493), ('marriage', 0.008750041093465399), ('lord', 0.007848802719804085), ('sebastian', 0.007683679399235111), ('man', 0.007219229312385458)]
[('leigh', 0.03543143098235519), ('laura', 0.02560559461904678), ('roe', 0.025499112589130388), ('beau', 0.023053917227500064), ('clayborne', 0.022632620821345602), ('retirement', 0.02203703943189772), ('husband', 0.02161702700636077), ('moriarty', 0.020418432812756977), ('veronica', 0.019738595664811454), ('tess', 0.019580747015524435)]
[('love', 0.006861141948113139), ('family', 0.00632674839651223), ('novel', 0.005236338713958875), ('life', 0.004602173525220112), ('story', 0.004353950772237878), ('mother', 0.004266249748517645), ('new', 0.00391445130865658), ('author', 0.003881338566236123), ('taylor', 0.003806573611090984), ('characters', 0.003750132425

In [None]:
similar_topics, similarity = topic_model.find_topics("relationships", top_n=5)
for i in similar_topics:
    print(topic_model.get_topic(i))

[('aubrey', 0.013444589718433698), ('jenny', 0.01145067057633071), ('love', 0.011145459886762916), ('rosie', 0.010972596937458995), ('hopey', 0.010667048810013272), ('relationship', 0.01057607388291274), ('things', 0.010325908158456543), ('lincoln', 0.010305225036077018), ('alex', 0.010171013308748934), ('bertha', 0.010000709776472558)]
[('love', 0.006861141948113139), ('family', 0.00632674839651223), ('novel', 0.005236338713958875), ('life', 0.004602173525220112), ('story', 0.004353950772237878), ('mother', 0.004266249748517645), ('new', 0.00391445130865658), ('author', 0.003881338566236123), ('taylor', 0.003806573611090984), ('characters', 0.0037501324252181517)]
[('life', 0.003676018545439045), ('one', 0.0036720614476845933), ('world', 0.0034660000369633424), ('new', 0.003419656703564767), ('love', 0.0033461391623841523), ('family', 0.00292463995291762), ('story', 0.0028872719143259577), ('time', 0.00287022316312423), ('shes', 0.0028274234966838984), ('find', 0.0027607571398824892)]

In [None]:
similar_topics, similarity = topic_model.find_topics("death", top_n=5)
for i in similar_topics:
    print(topic_model.get_topic(i))

[('zombie', 0.033486816651960336), ('virus', 0.025997025440850807), ('zombies', 0.019739186508940614), ('survivors', 0.015725412605839144), ('dead', 0.014763352191325197), ('plague', 0.011536406850210333), ('infected', 0.011472722316704112), ('walking', 0.009021840062440546), ('apocalypse', 0.008793566271296235), ('disease', 0.008658269074107511)]
[('killer', 0.014232142563428791), ('murder', 0.01137766122305586), ('detective', 0.011061228975776148), ('case', 0.010989466299622266), ('police', 0.008939553520021225), ('crime', 0.00798741800201927), ('investigation', 0.0073132879804698485), ('serial', 0.006808359834051321), ('murdered', 0.005687811497059156), ('victims', 0.005535739160214359)]
[('life', 0.009562962604971383), ('sang', 0.009357778784174663), ('heart', 0.007853420914954843), ('im', 0.007665856079708666), ('love', 0.0073053206784513035), ('never', 0.00715828303561882), ('hes', 0.006350655925898168), ('away', 0.006010356821241176), ('everything', 0.005785891180156243), ('didn

In [None]:
similar_topics, similarity = topic_model.find_topics("family", top_n=5)
for i in similar_topics:
    print(topic_model.get_topic(i))

[('love', 0.006861141948113139), ('family', 0.00632674839651223), ('novel', 0.005236338713958875), ('life', 0.004602173525220112), ('story', 0.004353950772237878), ('mother', 0.004266249748517645), ('new', 0.00391445130865658), ('author', 0.003881338566236123), ('taylor', 0.003806573611090984), ('characters', 0.0037501324252181517)]
[('life', 0.003676018545439045), ('one', 0.0036720614476845933), ('world', 0.0034660000369633424), ('new', 0.003419656703564767), ('love', 0.0033461391623841523), ('family', 0.00292463995291762), ('story', 0.0028872719143259577), ('time', 0.00287022316312423), ('shes', 0.0028274234966838984), ('find', 0.0027607571398824892)]
[('lucy', 0.022611974004423364), ('father', 0.016254596988697984), ('holly', 0.012558803919365157), ('mother', 0.012084998778411434), ('mothers', 0.009925133982837196), ('alfie', 0.00840786068204534), ('daughter', 0.008394862654365925), ('amy', 0.007827823640422782), ('life', 0.0078250906410042), ('sally', 0.0074682581384666515)]
[('ami
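
The cells above repeat the same three lines for each search term and never print the `similarity` scores that `find_topics` returns. A minimal sketch of a helper that bundles the lookup and keeps the score next to each topic is shown here. The function name `describe_similar_topics` and its `n_words` parameter are hypothetical (not part of BERTopic); it only assumes that the model exposes `find_topics` and `get_topic` as used above.

```python
def describe_similar_topics(topic_model, query, top_n=5, n_words=5):
    """Return (topic_id, similarity, top_words) for the topics closest to `query`.

    Hypothetical helper: it only relies on `find_topics` and `get_topic`
    behaving as they do in the cells above.
    """
    topic_ids, scores = topic_model.find_topics(query, top_n=top_n)
    rows = []
    for topic_id, score in zip(topic_ids, scores):
        # get_topic yields (word, weight) pairs; fall back to an empty list
        # if the model returns a falsy value for an unknown topic.
        words = topic_model.get_topic(topic_id) or []
        top_words = [word for word, _ in words[:n_words]]
        rows.append((topic_id, float(score), top_words))
    return rows


# Example usage with a fitted model (left commented out because
# `topic_model` is defined elsewhere in the notebook):
# for query in ["weddings", "divorce", "family"]:
#     for topic_id, score, words in describe_similar_topics(topic_model, query):
#         print(query, topic_id, round(score, 3), words)
```

Printing the score alongside the top words makes it easier to judge whether a returned topic is a close match for the search term or merely the least distant one.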

In [None]:
topic_model.visualize_topics()

In [None]:
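# `probs` must already be defined in this session (e.g. the probabilities
# returned by `fit_transform`); otherwise this call raises a NameError.
topic_model.visualize_distribution(probs[200], min_probability=0.015)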

NameError: ignored
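
The `min_probability` argument limits the plot to topics whose probability for the chosen document clears a cut-off. The sketch below mimics that filter on a plain list of probabilities so the effect of the threshold can be inspected without the plot. The helper name `topics_above_threshold` is hypothetical, and keeping values at or above the threshold (rather than strictly above) is an assumption about how the library compares them.

```python
def topics_above_threshold(prob_row, min_probability=0.015):
    """Return (topic_index, probability) pairs that clear the threshold.

    Hypothetical helper: `prob_row` stands in for one document's row of
    topic probabilities, such as `probs[200]` in the cell above.
    """
    kept = [
        (index, float(p))
        for index, p in enumerate(prob_row)
        if p >= min_probability  # assumed comparison; the library may differ
    ]
    # Highest probability first, which is the order of most interest.
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Example with made-up probabilities for a four-topic model:
example_row = [0.01, 0.5, 0.02, 0.0]
print(topics_above_threshold(example_row, min_probability=0.015))
```

If this returns an empty list for a document, lowering `min_probability` is the first thing to try before concluding that the document belongs to no topic.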

In [None]:
topic_model.visualize_hierarchy(top_n_topics=50)

In [None]:
topic_model.visualize_barchart(top_n_topics=5)

In [None]:
topic_model.visualize_heatmap(n_clusters=20, width=1000, height=1000)