Implementation of the Topical State Space LSTM model for text sequence analysis, as described in the paper "State Space LSTM Models with Particle MCMC Inference" by Xun Zheng et al.
This repository contains the implementation of the Topical State Space LSTM model, which combines the interpretability of state space models with the power of LSTMs for text sequence analysis. The model introduces topics into the LSTM framework, allowing for improved understanding of latent structures in sequential data.
- Topical State Space LSTM Model: Implementation of the model proposed in the paper.
- Efficient Gibbs Inference: Uses a Sequential Monte Carlo (SMC) method for joint posterior sampling.
- NLP Applications: Adaptable for various natural language processing (NLP) tasks.
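To give a feel for the SMC component mentioned above, here is a minimal sketch of one multinomial resampling step over a set of weighted particles. It is an illustration under stated assumptions, not the code in this repository: the function name `resample` and the representation of particles as a plain list are hypothetical.

```python
import math
import random


def resample(particles, log_weights, rng=random):
    """Multinomial resampling (hypothetical helper, not from this repo).

    Draws len(particles) particles with replacement, each with probability
    proportional to exp(log_weight).
    """
    # Subtract the max log-weight before exponentiating for numerical stability.
    m = max(log_weights)
    weights = [math.exp(lw - m) for lw in log_weights]
    return rng.choices(particles, weights=weights, k=len(particles))


# Example with P = 10 particles holding arbitrary integer states.
particles = list(range(10))
log_weights = [-0.5 * (p - 3) ** 2 for p in particles]  # favours states near 3
new_particles = resample(particles, log_weights, rng=random.Random(0))
```

Particles with higher weight tend to be duplicated and low-weight particles tend to be dropped, which is what keeps the particle set concentrated on plausible latent trajectories.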
- Python (>=3.6)
- Other dependencies (specified in requirements.txt)
git clone https://github.com/yanisrem/SSM-Project
cd src
pip install -r requirements.txt
The IMDB dataset contains 25K movie reviews for NLP and text analytics. For more information about the dataset, see the following link
The goal is to create a generative model. Given a sequence
- P = 10, number of particles
- $K \in \{10, 50, 100\}$, number of topics
- n_epochs = 5, number of epochs
- The evaluation metric is perplexity
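For reference, perplexity is the exponential of the average negative log-likelihood per token, so lower is better. The sketch below assumes the per-token natural-log probabilities have already been computed by the model; the function name `perplexity` is hypothetical and not taken from this repository.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp(-(1/N) * sum of per-token natural-log probabilities)."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Sanity check: a model that assigns probability 1/4 to every token
# is as uncertain as a uniform choice among 4 words.
uniform_log_probs = [math.log(0.25)] * 10
print(perplexity(uniform_log_probs))
```

A model that predicts each token uniformly from a vocabulary of size V has perplexity V, which gives a useful baseline when comparing runs with different numbers of topics.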