Dynamic Author-Persona Topic Model (DAP)

Introduction

See /docs/dap_2018_arxiv.pdf for technical information on the dynamic author-persona topic model (DAP).

Getting Started

  1. Clone the repo:

    cd ~
    git clone https://github.com/robert-giaquinto/dynamic-author-persona-topic-model.git
  2. Virtual environments.

    It may be easiest to install dependencies into a virtual environment. To create the virtual environment for Python 2.7, run:

    cd dynamic-author-persona-topic-model
    virtualenv venv

    For Python 3+, run:

    cd dynamic-author-persona-topic-model
    python -m venv ./venv

    To activate the virtual environment, run:

    source ~/dynamic-author-persona-topic-model/venv/bin/activate
  3. Installing the necessary Python packages.

    A requirements.txt file listing all packages used for this project is included in the repository. To install them, first make sure your virtual environment is activated, then run:

    pip install --upgrade pip
    pip install -r ~/dynamic-author-persona-topic-model/requirements.txt
  4. Install the dap package.

    This allows absolute imports, which make it easy to load Python files that are spread across different folders. To do this, navigate to the ~/dynamic-author-persona-topic-model directory and run:

    python setup.py develop
  5. Preparing data for the model

    See Signal Media 1M to download the Signal Media dataset.

    See /src/preprocessing/preprocess_signalmedia.py for tools to prepare the Signal Media data, or use the already preprocessed data included in this repository.

  6. Running the model

    See /scripts/ for examples of running the model and setting various model parameters.
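
Before running anything in /scripts/, it can help to confirm that steps 2 through 4 took effect. The following is a minimal sketch, not part of the repository: the helper names `venv_is_active` and `module_is_importable` are made up for illustration, and it assumes the package installed in step 4 is importable under the name `dap`. It relies on the fact that the `activate` script exports the `VIRTUAL_ENV` variable.

```shell
#!/usr/bin/env bash
# Sanity checks for the setup steps above.
# Helper names are illustrative; they do not exist in the repository.

# Succeeds if a virtual environment is active.
# The activate script exports VIRTUAL_ENV, so a non-empty value means "active".
venv_is_active() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

# Succeeds if module $1 can be imported by the interpreter $2 (default: python).
module_is_importable() {
    "${2:-python}" -c "import $1" >/dev/null 2>&1
}

if venv_is_active; then
    echo "virtual environment active: ${VIRTUAL_ENV}"
else
    echo "no virtual environment active; run the activate command from step 2"
fi

# Assumption: step 4 makes the project importable as "dap".
if module_is_importable dap; then
    echo "dap is importable"
else
    echo "dap is not importable; re-run step 4 from the repository root"
fi
```

If the second check fails while the first passes, the most likely cause is that step 4 was run outside the activated environment or from a directory other than the repository root.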

Project Structure

  • docs/ - Documentation on the model, including derivation and papers related to the model.
  • log/ - Log files from running the programs.
  • scripts/ - Bash scripts for running programs.
  • dap/ - Directory containing various sub-packages of the project and any files shared across sub-packages.

About

Base implementation of the DAP model using a variational EM algorithm.
