# Create LDA for Corpus<a id='top'></a>

0. Download an available corpus or create a new one. For the latter, create one JSON file for each subcorpus (group of texts) of your corpus; each text is then a line in that JSON file. One way to do this is to crawl websites using [scrapy](https://scrapy.org) with the flags `-o result.json -t json` (see [sample crawlers](./scripts/scraper/spiders) and [example item file](./scripts/scraper/items.py)). An example JSON file is [here](./scripts/example.json).
1. [Prepare corpus for the LDA](#prepare). This notebook demonstrates how to load a (German) TEI XML file, extract metadata and texts, and filter out unwanted parts of speech (only nouns are kept). The result is saved as a JSON file that is used in the subsequent cells. You can also prepare your corpus externally; see my [example](./scripts/text.py), which is tailored to Russian texts. It removes all non-Cyrillic characters, removes all words that are not nouns, and uses POS tagging to reduce all nouns to their base form (nominative singular). The result is again saved in a JSON file.
2. [Create LDA model for the corpus](#create)
3. [Compute topic distribution for corpus](#compute)
4. [Explore corpus](corpus.ipynb) (different notebook)

For copyright reasons, I cannot publish the scraped raw data. The results of the smoothing process in step 2 are [here](./projects/); they are used in the examples below.
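
As a rough illustration of the external preparation described in step 1, the following sketch strips non-Cyrillic characters from a raw string and produces entries in the shape the cells below read: a list of objects whose `text` field is a list of tokens. The function name `clean_tokens` and the sample entry (including its URL, date, volume and issue values) are hypothetical. Noun filtering and lemmatisation need a POS tagger and are deliberately left out here.

```python
import json
import re

# Every run of characters that is neither a Cyrillic letter nor whitespace
# is replaced by a single space.
NON_CYRILLIC = re.compile(r"[^а-яёА-ЯЁ\s]+")


def clean_tokens(raw_text):
    """Drop non-Cyrillic characters, lower-case the rest and split on whitespace."""
    return NON_CYRILLIC.sub(" ", raw_text).lower().split()


# Hypothetical raw entry; real entries would come from the scraper output.
raw_entries = [
    {
        "url": "http://example.org/1",
        "date": "2001-01-01",
        "volume": "1",
        "issue": "2",
        "text": "Война и мир, 1869 (War and Peace)",
    },
]

# Keep the metadata and replace the raw string by its token list.
prepared = [dict(entry, text=clean_tokens(entry["text"])) for entry in raw_entries]

# The serialized list is what would be stored as <project>.json.
serialized = json.dumps(prepared, ensure_ascii=False)
print(serialized)
```

Keeping `text` as a list of tokens rather than a single string matters because the later cells pass it directly to `doc2bow`.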

In [14]:
import os
import sys
from gensim import corpora, models
import logging
import errno
import pandas as pd
from dateutil import parser
import pytz
import numpy as np
import json
import xml.etree.ElementTree as ET
import re
from tqdm import tqdm
logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO)

# set global path for corpus etc.
project = "russian_literature"
project_path = "projects" + os.path.sep + project


## Create LDA model for corpus<a id='create'></a>

This cell creates the topic model for the specified corpus stored in JSON files

[Back to top](#top)
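
Before training, each tokenised text is converted to a bag-of-words vector, that is, a list of `(token_id, count)` pairs. The plain-Python sketch below mimics what `Dictionary` and `doc2bow` do in the cell that follows, so that the data shape is easier to see. It is not gensim's implementation (gensim may assign different ids), and the helper names `build_vocabulary` and `to_bow` as well as the toy texts are made up for illustration.

```python
from collections import Counter


def build_vocabulary(texts):
    """Assign a consecutive integer id to every distinct token."""
    token2id = {}
    for text in texts:
        for token in text:
            token2id.setdefault(token, len(token2id))
    return token2id


def to_bow(text, token2id):
    """Return sorted (token_id, count) pairs, skipping unknown tokens."""
    counts = Counter(token2id[token] for token in text if token in token2id)
    return sorted(counts.items())


# Toy corpus: two already tokenised documents.
texts = [["дом", "сад", "дом"], ["сад", "река"]]
token2id = build_vocabulary(texts)
bow_corpus = [to_bow(text, token2id) for text in texts]
print(bow_corpus)
```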

In [15]:
number_of_topics = 100

# load corpus
corpus = []
try:
    # load prepared corpus
    corpus = corpora.MmCorpus(os.path.join(project_path, project + ".corp"))
    dictionary = corpora.Dictionary.load(os.path.join(project_path, project + ".dict"))
except FileNotFoundError:
    # no serialized corpus yet: build it from the prepared JSON file
    with open(os.path.join(project_path, project + ".json")) as json_data:
        data = json.load(json_data)
    for entry in data:
        corpus.append(entry["text"])

    print("File extraction complete.")

    dictionary = corpora.Dictionary(corpus)
    dictionary.save(os.path.join(project_path, project + ".dict"))

    corpus = [dictionary.doc2bow(text) for text in corpus]
    corpora.MmCorpus.serialize(os.path.join(project_path, project + ".corp"), corpus)    

lda = models.ldamodel.LdaModel(corpus=corpus, id2word=dictionary, num_topics=number_of_topics, alpha='auto', eval_every=5, passes=20)

# find the first unused model number so that earlier models are not overwritten
start = 1
while os.path.isfile(os.path.join(project_path, project + "_" + str(start) + ".lda")):
    start += 1

lda.save(os.path.join(project_path, project + "_" + str(start) + ".lda"))

print("LDA saved as", os.path.join(project_path, project + "_" + str(start) + ".lda"))

2021-09-17 09:46:51,387 : INFO : loaded corpus index from projects/russian_literature/russian_literature.corp.index
2021-09-17 09:46:51,388 : INFO : initializing cython corpus reader from projects/russian_literature/russian_literature.corp
2021-09-17 09:46:51,392 : INFO : accepted corpus with 975 documents, 488602 features, 2354218 non-zero entries
2021-09-17 09:46:51,393 : INFO : loading Dictionary object from projects/russian_literature/russian_literature.dict
2021-09-17 09:46:51,796 : INFO : loaded projects/russian_literature/russian_literature.dict
2021-09-17 09:46:51,888 : INFO : using autotuned alpha, starting with [0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01

KeyboardInterrupt: 
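
The loop at the end of the cell above picks the first unused model number so that models from earlier runs are not overwritten. A minimal stand-alone sketch of that idea follows; the helper name `next_free_path` and the temporary directory are assumptions made for the demonstration only.

```python
import os
import tempfile


def next_free_path(directory, stem, suffix=".lda", start=1):
    """Return the first path <stem>_<n><suffix> in directory that does not exist yet."""
    number = start
    while os.path.isfile(os.path.join(directory, f"{stem}_{number}{suffix}")):
        number += 1
    return os.path.join(directory, f"{stem}_{number}{suffix}")


with tempfile.TemporaryDirectory() as tmp:
    # Pretend that two models were saved by earlier runs.
    for n in (1, 2):
        open(os.path.join(tmp, f"russian_literature_{n}.lda"), "w").close()
    free_name = os.path.basename(next_free_path(tmp, "russian_literature"))

print(free_name)
```

Computing the path once and reusing it for both saving and printing avoids the two getting out of sync.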

## Compute topic distribution for corpus<a id='compute'></a>

[Back to top](#top)
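
For each document, the model returns a list of `(topic, probability)` pairs, which may omit some topics. The cell below expands that list into one dense row with a column per topic. The sketch shows the expansion in isolation; the function name `dense_topic_row` and the sample values are hypothetical, and no model is loaded.

```python
def dense_topic_row(topic_pairs, num_topics, metadata):
    """Expand (topic, probability) pairs into a row with one column per topic."""
    row = dict(metadata)
    # Topics missing from topic_pairs keep a probability of 0.0.
    row.update({str(topic): 0.0 for topic in range(num_topics)})
    for topic, probability in topic_pairs:
        row[str(topic)] = probability
    return row


row = dense_topic_row(
    topic_pairs=[(0, 0.7), (3, 0.3)],
    num_topics=4,
    metadata={"group": "1:2", "url": "http://example.org/1", "words": 120},
)
print(row)
```

Using the topic number as a string column name matches the `columns` list built in the cell below, which makes selecting and ordering the columns of the resulting data frame straightforward.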

In [13]:
# load LDA model and dictionary
dictionary = corpora.Dictionary.load(os.path.join(project_path, project + ".dict"))
#model = models.LdaModel.load(os.path.join(project_path, file_name))
model = models.LdaModel.load(os.path.join(project_path, project + "_mallet.lda"))

columns = ['group', 'url', 'date', 'comment_count', 'words']
columns.extend([str(topic) for topic in range(model.num_topics)])

result = []

with open(os.path.join(project_path, project + ".json")) as json_data:
    data = json.load(json_data)

too_short = 0

for entry in data:
    # get topic distribution for entry
    line = {}
                
    # filter too short entries
    if len(entry["text"]) < 5:
        too_short += 1
        continue

    topics = [0] * model.num_topics
    for (topic, prop) in model[dictionary.doc2bow(entry["text"])]:
        topics[topic] = prop
    line["group"] = entry["volume"]+":"+entry["issue"]
    line["url"] = entry["url"]
    line["date"] = entry['date']
    line["words"] = len(entry["text"])
    # entries without a comment count default to 0
    line["comment_count"] = entry.get("comment_count", 0)
    for counter in range(len(topics)):
        line[str(counter)] = topics[counter]
    result.append(line)

print("Total number of entries:", len(data))
print("Removed because too short: ", too_short)
            
frame = pd.DataFrame(result)
print(columns)
frame = frame[columns]
number = 1
if number > 0:
    file_name = project + "_topics_" + str(number) + ".json"
else: 
    file_name = project + "_topics.json"

frame.to_json(os.path.join(project_path, file_name), orient='split')
print("Created", os.path.join(project_path, file_name))

2021-09-16 23:42:41,229 : INFO : loading Dictionary object from projects/russian_literature/russian_literature.dict
2021-09-16 23:42:41,597 : INFO : loaded projects/russian_literature/russian_literature.dict
2021-09-16 23:42:41,647 : INFO : loading LdaModel object from projects/russian_literature/russian_literature_mallet.lda
2021-09-16 23:42:41,963 : INFO : loading id2word recursively from projects/russian_literature/russian_literature_mallet.lda.id2word.* with mmap=None
2021-09-16 23:42:41,964 : INFO : loading word_topics from projects/russian_literature/russian_literature_mallet.lda.word_topics.npy with mmap=None
2021-09-16 23:42:42,101 : INFO : loading wordtopics from projects/russian_literature/russian_literature_mallet.lda.wordtopics.npy with mmap=None
2021-09-16 23:42:42,330 : INFO : loaded projects/russian_literature/russian_literature_mallet.lda
2021-09-16 23:42:42,331 : INFO : dtype was not set in saved LdaMallet file projects/russian_literature/russian_literature_mallet.lda,

2021-09-16 23:44:00,133 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-16 23:44:05,588 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-16 23:44:05,593 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet

2021-09-17 00:31:49,800 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 00:31:49,805 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 00:31:52,967 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 00:31:58,443 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 00:31:58,449 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --rem

2021-09-17 00:33:36,038 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 00:33:42,016 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 00:33:42,026 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 00:33:45,489 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num

2021-09-17 00:35:33,107 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 00:35:33,113 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 00:35:36,278 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 00:35:41,664 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 00:35:41,668 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --rem

2021-09-17 00:37:16,431 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 00:37:22,161 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 00:37:22,166 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 00:37:25,510 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num

2021-09-17 00:39:01,074 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 00:39:01,083 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 00:39:04,254 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 00:39:09,725 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 00:39:09,730 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --rem

2021-09-17 00:40:44,260 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 00:40:50,319 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 00:40:50,330 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 00:40:54,131 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num

2021-09-17 00:42:37,353 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 00:42:37,361 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 00:42:41,192 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 00:42:47,360 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 00:42:47,367 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --rem

2021-09-17 00:44:26,349 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 00:44:31,835 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 00:44:31,841 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 00:44:34,965 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num

2021-09-17 00:46:06,057 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 00:46:06,064 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 00:46:09,201 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 00:46:14,578 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 00:46:14,583 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --rem

2021-09-17 00:47:41,109 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 00:47:46,394 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 00:47:46,400 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 00:47:49,485 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num

2021-09-17 00:49:18,419 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 00:49:18,423 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 00:49:21,497 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 00:49:26,703 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 00:49:26,711 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --rem

2021-09-17 00:50:53,759 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 00:50:59,009 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 00:50:59,017 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 00:51:02,054 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num

2021-09-17 00:52:30,939 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 00:52:30,946 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 00:52:33,983 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 00:52:39,404 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 00:52:39,409 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --rem

2021-09-17 00:54:06,632 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 00:54:12,063 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 00:54:12,072 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 00:54:15,119 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num

2021-09-17 00:55:44,284 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 00:55:44,288 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 00:55:47,355 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 00:55:52,622 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 00:55:52,630 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --rem

2021-09-17 00:57:18,634 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 00:57:23,881 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 00:57:23,886 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 00:57:26,896 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num

2021-09-17 00:58:55,440 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 00:58:55,445 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 00:58:58,587 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 00:59:03,776 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 00:59:03,782 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --rem

2021-09-17 01:00:29,561 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:00:34,871 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:00:34,887 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:00:38,037 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num

2021-09-17 01:02:06,700 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:02:06,704 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:02:09,766 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:02:15,040 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:02:15,046 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --rem

2021-09-17 01:03:41,898 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:03:47,244 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:03:47,252 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:03:50,299 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num

2021-09-17 01:05:19,289 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:05:19,298 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:05:22,368 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:05:27,812 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:05:27,819 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --rem

2021-09-17 01:06:54,486 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:06:59,933 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:06:59,939 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:07:02,975 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num

2021-09-17 01:08:31,422 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:08:31,427 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:08:34,555 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:08:39,766 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:08:39,771 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --rem

2021-09-17 01:10:06,328 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:10:11,537 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:10:11,545 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:10:14,617 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num

2021-09-17 01:11:43,309 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:11:43,315 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:11:46,353 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:11:51,725 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:11:51,730 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --rem

2021-09-17 01:13:17,597 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:13:22,945 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:13:22,950 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:13:25,927 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num

2021-09-17 01:14:54,354 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:14:54,359 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:14:57,486 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:15:02,902 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:15:02,907 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --rem

2021-09-17 01:16:29,091 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:16:34,479 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:16:34,481 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:16:37,308 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num

2021-09-17 01:18:06,014 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:18:06,019 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:18:09,103 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:18:14,348 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:18:14,354 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --rem

2021-09-17 01:19:41,405 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:19:46,766 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:19:46,775 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:19:49,707 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num

2021-09-17 01:21:18,386 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:21:18,390 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:21:21,329 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:21:26,500 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:21:26,506 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --rem

2021-09-17 01:22:52,769 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:22:57,958 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:22:57,965 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:23:01,012 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num

2021-09-17 01:24:29,620 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:24:29,625 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:24:32,753 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:24:37,969 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:24:37,975 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --rem

2021-09-17 01:26:04,338 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:26:09,481 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:26:09,485 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:26:12,399 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num

2021-09-17 01:27:41,237 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:27:41,243 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:27:44,242 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:27:49,575 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:27:49,579 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --rem

2021-09-17 01:29:15,327 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:29:20,551 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:29:20,555 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:29:23,419 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num

2021-09-17 01:30:51,842 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:30:51,864 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:30:54,941 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:31:00,561 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:31:00,566 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --rem

2021-09-17 01:32:27,262 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:32:32,368 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:32:32,373 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:32:35,382 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num

2021-09-17 01:34:03,738 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:34:03,744 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:34:06,739 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:34:11,958 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:34:11,962 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --rem

2021-09-17 01:35:38,010 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:35:43,230 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:35:43,238 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:35:46,352 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num

2021-09-17 01:37:15,228 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:37:15,231 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:37:18,148 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:37:23,421 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:37:23,423 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --rem

2021-09-17 01:38:48,814 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:38:54,157 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:38:54,162 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:38:57,078 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num

2021-09-17 01:40:24,791 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:40:24,794 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:40:27,664 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:40:32,819 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:40:32,829 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --rem

2021-09-17 01:41:59,525 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:42:04,714 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:42:04,720 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:42:07,655 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num

2021-09-17 01:43:35,555 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:43:35,561 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:43:38,412 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:43:43,690 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:43:43,693 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --rem

2021-09-17 01:45:10,103 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:45:15,548 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:45:15,557 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:45:18,568 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num

2021-09-17 01:46:46,677 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:46:46,684 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:46:49,607 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:46:54,859 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:46:54,864 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --rem

2021-09-17 01:48:21,885 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:48:27,131 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:48:27,137 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:48:30,112 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num

2021-09-17 01:49:58,875 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:49:58,879 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:50:01,847 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:50:07,086 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:50:07,093 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --rem

2021-09-17 01:51:32,594 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:51:37,933 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:51:37,939 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:51:40,991 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num

2021-09-17 01:53:10,153 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:53:10,161 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:53:13,087 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:53:18,467 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:53:18,476 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --rem

2021-09-17 01:54:44,794 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:54:50,039 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:54:50,043 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:54:53,066 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num

2021-09-17 01:56:20,988 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:56:20,996 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:56:23,908 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:56:29,255 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:56:29,259 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --rem

2021-09-17 01:57:55,333 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:58:00,590 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:58:00,598 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:58:03,501 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num

2021-09-17 01:59:31,769 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:59:31,779 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 01:59:34,827 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 01:59:40,264 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 01:59:40,269 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --rem

2021-09-17 02:01:05,997 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 02:01:11,217 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 02:01:11,222 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 02:01:14,150 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num

2021-09-17 02:02:41,912 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 02:02:41,914 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/b011e4_corpus.txt --output /tmp/b011e4_corpus.mallet.infer --use-pipe-from /tmp/b011e4_corpus.mallet
2021-09-17 02:02:44,819 : INFO : inferring topics with MALLET LDA '/home/ghowa/working/mallet-2.0.8/bin/mallet infer-topics --input /tmp/b011e4_corpus.mallet.infer --inferencer /tmp/b011e4_inferencer.mallet --output-doc-topics /tmp/b011e4_doctopics.txt.infer --num-iterations 100 --doc-topics-threshold 0.0 --random-seed 0'
2021-09-17 02:02:50,068 : INFO : serializing temporary corpus to /tmp/b011e4_corpus.txt
2021-09-17 02:02:50,074 : INFO : converting temporary corpus to MALLET format with /home/ghowa/working/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --rem

Total number of entries: 975
Removed because too short:  0
['group', 'url', 'date', 'comment_count', 'words', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '40', '41', '42', '43', '44', '45', '46', '47', '48', '49', '50', '51', '52', '53', '54', '55', '56', '57', '58', '59', '60', '61', '62', '63', '64', '65', '66', '67', '68', '69', '70', '71', '72', '73', '74', '75', '76', '77', '78', '79', '80', '81', '82', '83', '84', '85', '86', '87', '88', '89', '90', '91', '92', '93', '94', '95', '96', '97', '98', '99']
Created projects/russian_literature/russian_literature_topics.json
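
The column list printed above suggests that each entry carries five metadata fields followed by one column per topic (`'0'` to `'99'`). A minimal sketch of how such a row could be inspected, assuming the numbered columns hold the per-document topic weights; the helper `dominant_topic` and the example row are hypothetical and not part of the notebook:

```python
# Column layout as printed above: five metadata fields, then one column per topic.
META_COLUMNS = ["group", "url", "date", "comment_count", "words"]
NUM_TOPICS = 100
TOPIC_COLUMNS = [str(i) for i in range(NUM_TOPICS)]


def dominant_topic(row):
    """Return (topic_id, weight) for the topic column with the highest value in a row."""
    best = max(TOPIC_COLUMNS, key=lambda col: row[col])
    return int(best), row[best]


# Hypothetical row for illustration only; the values are made up, not taken from the corpus.
example_row = {
    "group": "example_group",
    "url": "http://example.org/text",
    "date": "2021-09-17",
    "comment_count": 0,
    "words": 1200,
}
example_row.update({col: 0.005 for col in TOPIC_COLUMNS})
example_row["42"] = 0.505  # make one topic clearly dominant

topic_id, weight = dominant_topic(example_row)
print(topic_id, weight)
```

Keeping the topic columns as strings mirrors the printed column names; whether the saved JSON file uses the same layout row by row is an assumption that should be checked against the file itself.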
