# What other Topic Models are There?

___

## Contents:

- [What we (often) use](#What-we-(often)-use)
    - LDA
    - STM
- [Beyond the BOW approach](#Beyond-the-Bag-of-Words-approach)
    - CTM
- [Textual Information as Networks](#Textual-Information-as-Networks)
    - TopSBM
- [Sources](#Sources)

# What we (often) use

___

## A very short introduction to LDA and STM

- basic idea: use text as data and try to understand what a text is about
- three main components and a "target": words, documents, corpora and *topics*
- closely related to dimensionality reduction
    - tf-idf
    - LSI
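
As a small illustration of the first item, tf-idf weights can be sketched in plain Python. This assumes the simplest textbook variant (relative term frequency times the log of the inverse document frequency) and a hypothetical toy corpus; libraries typically add smoothing and normalization on top.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {word: weight} dict per tokenized document."""
    n_docs = len(docs)
    # document frequency: in how many documents does each word occur?
    df = Counter(word for doc in docs for word in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            word: (count / len(doc)) * math.log(n_docs / df[word])
            for word, count in counts.items()
        })
    return weights

# hypothetical toy corpus
docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "stocks fell as markets closed".split(),
]
weights = tf_idf(docs)
```

Words that occur in few documents (here "mat") get a higher weight than words shared across documents (here "cat"), which is what makes the weighting useful for characterizing a text.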

*What we (often) use - LDA and STM*
___
- LDA [[1]](#Sources) assumes a set of underlying topics for a corpus of documents: each topic is a probability distribution over words, and each document is a mixture of those topics


- this way we get 
    - probabilities for documents to belong to certain topics
    - a characterization of topics by frequent words
    - information about the topic proportions in our corpus
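
The assumption behind LDA can be read as a generative story: draw topic proportions for a document from a Dirichlet prior, then draw a topic and a word for every token. The following is a minimal sketch of that story only, not of inference; the two topics, their word probabilities, and the prior `alpha` are hypothetical.

```python
import random

# hypothetical topics: each one is a probability distribution over words
topics = [
    {"goal": 0.5, "match": 0.4, "vote": 0.1},   # a "sports"-like topic
    {"vote": 0.5, "party": 0.4, "match": 0.1},  # a "politics"-like topic
]

def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet distribution by normalizing gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(topics, alpha, length, rng):
    """Generate one document following LDA's generative story."""
    theta = sample_dirichlet(alpha, rng)  # document-specific topic proportions
    words = []
    for _ in range(length):
        # pick a topic for this token, then a word from that topic
        z = rng.choices(range(len(topics)), weights=theta)[0]
        topic = topics[z]
        words.append(rng.choices(list(topic), weights=list(topic.values()))[0])
    return theta, words

rng = random.Random(0)
theta, words = generate_document(topics, alpha=[0.5, 0.5], length=10, rng=rng)
```

Fitting an LDA model runs this story in reverse: given only the words, it estimates the topics and the per-document proportions `theta`.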


- STM [[2]](#Sources) extends LDA
    - adds a linear term (a regression on document-level covariates) for the topic proportions
    - covariates (e.g. publication date and/or source) can be used to get a better representation of topic prevalence
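
A minimal sketch of how a linear term in covariates can shift topic prevalence. It assumes a softmax link and ignores the noise term of the full model; the coefficient values and the `late` indicator covariate are hypothetical and not estimated from any data.

```python
import math

def softmax(values):
    """Map real-valued scores to proportions that sum to one."""
    shift = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def prevalence(covariates, gamma):
    """Topic proportions implied by the linear predictor alone.

    covariates: one document's covariate values (first entry is the intercept)
    gamma: one list of coefficients per topic (hypothetical values here)
    """
    linear = [sum(x * g for x, g in zip(covariates, coefs)) for coefs in gamma]
    return softmax(linear)

# three topics, two covariates: intercept and a 0/1 indicator "late"
gamma = [
    [0.0, 1.5],   # topic 0 becomes more prevalent in late documents
    [0.5, -1.0],  # topic 1 becomes less prevalent in late documents
    [0.0, 0.0],   # topic 2 serves as the reference topic
]
early = prevalence([1.0, 0.0], gamma)
late = prevalence([1.0, 1.0], gamma)
```

Comparing `early` and `late` shows the expected shift: the share of topic 0 rises and the share of topic 1 falls when the covariate switches from 0 to 1.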
    
- [ ] Can an STM be reduced to an LDA?
- [ ] Does STM also have a Dirichlet prior?

___
## Pros and Cons

- LDA is widely used and has implementations in both R and Python
- LDA does not allow covariates


- STM is only implemented in R
- covariates (supposedly) make the model more interpretable
- STM is not as widely used as LDA (yet)


- both rely on the bag-of-words (BOW) approach
- both are questionable for short documents, where word co-occurrence is sparse

# Beyond the Bag of Words approach

___
## Contextualized Topic Modeling (CTM):
- CTM [[3]](#Sources) uses pre-trained language models to go beyond the BOW approach by adding semantic and syntactic context information
- the main point of interest is the pre-trained language model, specifically **BERT** [[4]](#Sources) (**B**idirectional **E**ncoder **R**epresentations from **T**ransformers)
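
To see what the BOW approach throws away, consider two sentences with opposite meanings but identical word counts. This is a minimal sketch with hypothetical example sentences and naive whitespace tokenization.

```python
from collections import Counter

def bag_of_words(text):
    """Count words and ignore their order (naive whitespace tokenization)."""
    return Counter(text.lower().split())

a = "the dog bites the man"
b = "the man bites the dog"

# both sentences map to exactly the same representation
same_bow = bag_of_words(a) == bag_of_words(b)
print(same_bow)  # True
```

A BOW-based topic model cannot tell these two sentences apart, whereas a contextual language model receives the words in order and can represent them differently.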

*Beyond the BOW approach - CTM*
___
- Transformers are deep learning models that can predict outcomes* from contextual information
- used e.g. for translation tasks
- computationally expensive to train but relatively cheap to apply once trained
- competitive, if not state-of-the-art, performance on many language modeling tasks
- why they work so well is not yet fully understood

___
\* E.g.: what is the next sentence *y* if we have sentence *x* before and sentence *z* after.

___
### Why should we care about CTM?
- contextual information increases topic coherence compared to LDA [[3]](#Sources)
- pre-trained models are available for different domains and languages
- multi-language topic modeling [[5]](#Sources)
- there are already implementations (at least for Python) [[6]](#Sources)
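
Topic coherence can be quantified in several ways; one common choice is the normalized pointwise mutual information (NPMI) between a topic's top words. The sketch below assumes document-level co-occurrence and a hypothetical toy corpus, and it is not necessarily the exact metric or setup used in [[3]](#Sources).

```python
import math
from itertools import combinations

def npmi_coherence(top_words, docs):
    """Average NPMI over all pairs of a topic's top words (needs >= 2 words)."""
    doc_sets = [set(doc) for doc in docs]
    n_docs = len(doc_sets)

    def doc_share(*words):
        # share of documents that contain all of the given words
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(top_words, 2):
        p1, p2, p12 = doc_share(w1), doc_share(w2), doc_share(w1, w2)
        if p12 == 0:
            scores.append(-1.0)  # the words never co-occur
        elif p12 == 1:
            scores.append(0.0)   # degenerate: both words are in every document
        else:
            scores.append(math.log(p12 / (p1 * p2)) / -math.log(p12))
    return sum(scores) / len(scores)

# hypothetical toy corpus
docs = [
    ["goal", "match", "team"],
    ["goal", "team", "coach"],
    ["vote", "party", "election"],
    ["vote", "election", "match"],
]
coherent = npmi_coherence(["goal", "team"], docs)  # words that appear together
mixed = npmi_coherence(["goal", "vote"], docs)     # words that never co-occur
```

Values range from -1 (the words never co-occur) to 1 (the words only occur together), so a topic whose top words score higher is easier to interpret as a single theme.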

# Textual Information as Networks

___
## TopSBM - Topic models based on Stochastic Block Models

- Block modeling is a method of community detection used in social network analysis (SNA) [[7]](#Sources)
- the network used is a (weighted) bipartite network built from the word-document matrix*

___
\* Words and documents are nodes that are connected if a word occurs within a document. This way, words can be linked via documents and vice versa. The word frequency is reflected in weighted ties.
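
The bipartite network described in the footnote can be sketched as a weighted edge list. This is a minimal illustration with a hypothetical toy corpus and made-up node labels (`doc:…`, `word:…`); it only builds the network and does not fit a stochastic block model.

```python
from collections import Counter

def bipartite_edges(docs):
    """Return {(doc_node, word_node): weight} for a word-document network."""
    edges = {}
    for doc_id, tokens in enumerate(docs):
        for word, count in Counter(tokens).items():
            # the tie weight is the frequency of the word in the document
            edges[(f"doc:{doc_id}", f"word:{word}")] = count
    return edges

def documents_linked_by(edges, word):
    """Documents that are connected to each other via the given word."""
    return {doc for (doc, w) in edges if w == f"word:{word}"}

# hypothetical toy corpus
docs = [
    "cat cat mat".split(),
    "cat dog".split(),
]
edges = bipartite_edges(docs)
linked = documents_linked_by(edges, "cat")
```

Here both documents are linked through the word node for "cat", with a tie weight of 2 for the first document and 1 for the second.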

# Sources

- [[1]](https://www.jmlr.org/papers/volume3/blei03a/blei03a.pdf?ref=https://githubhelp.com) Blei, David M., Andrew Y. Ng, and Michael I. Jordan. "Latent Dirichlet allocation." Journal of Machine Learning Research 3.Jan (2003): 993-1022.


- [[2]](https://www.jstatsoft.org/article/view/v091i02) Roberts, Margaret E., Brandon M. Stewart, and Dustin Tingley. "stm: An R package for structural topic models." Journal of Statistical Software 91 (2019): 1-40.


- [[3]](https://arxiv.org/abs/2004.03974) Bianchi, Federico, Silvia Terragni, and Dirk Hovy. "Pre-training is a hot topic: Contextualized document embeddings improve topic coherence." arXiv preprint arXiv:2004.03974 (2020).


- [[4]](https://arxiv.org/abs/1810.04805v2) Devlin, Jacob, et al. "BERT: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018).


- [[5]](https://arxiv.org/abs/2004.07737) Bianchi, Federico, et al. "Cross-lingual contextualized topic models with zero-shot learning." arXiv preprint arXiv:2004.07737 (2020).


- [[6]](https://github.com/MilaNLProc/contextualized-topic-models) Contextualized Topic Modeling on GitHub.


- [[7]](https://methods.sagepub.com/book/the-sage-handbook-of-social-network-analysis/n31.xml) Van Duijn, Marijtje AJ, and Mark Huisman. "Statistical models for ties and actors." The SAGE handbook of social network analysis (2011): 459-483.

- [[8]](https://www.science.org/doi/10.1126/sciadv.aaq1360) Gerlach, Martin, Tiago P. Peixoto, and Eduardo G. Altmann. "A network approach to topic models." Science Advances 4.7 (2018): eaaq1360.


- [[9]](https://topsbm.github.io/) Topic Models based on Stochastic Block Models, blog on GitHub.


- [[10]](https://github.com/martingerlach/hSBM_Topicmodel) hSBM Topic Model on GitHub.

# Overall to-dos:

- [ ] Examples and images
- [ ] 'Hands on' examples with code
- [ ] Open Questions