lanchuhuong/Dynamic-topic-modelling

This project uses the UN General Debate Corpus to study dynamic topic modelling. For details on how dynamic topic modelling works, see the original paper.

This article uses a dataset built from the UN General Debate Corpus. It contains all the statements made by each country's representative at the UN General Debate from 1970 to 2020. The dataset is open data and is available online at this link. You will need to request access to the data. The dataset is provided as text files. To build the raw data in dataframe format, run processing_textfile.py. Alternatively, download the ready-to-use raw dataset that I created from the text files in this folder.
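
As a rough illustration of the text-files-to-dataframe step, here is a minimal sketch. It is not the contents of processing_textfile.py. It assumes each speech is a `.txt` file named `<country code>_<session>_<year>.txt` (for example `USA_25_1970.txt`); the function name `load_speeches` and the record fields are hypothetical.

```python
import re
from pathlib import Path

# Assumed filename pattern: <3-letter country code>_<session>_<year>.txt
FILENAME_RE = re.compile(
    r"^(?P<country>[A-Za-z]{3})_(?P<session>\d+)_(?P<year>\d{4})\.txt$"
)


def load_speeches(root):
    """Walk `root` recursively and return one record (dict) per speech file."""
    records = []
    for path in sorted(Path(root).rglob("*.txt")):
        match = FILENAME_RE.match(path.name)
        if match is None:
            # Skip files that do not follow the assumed naming scheme.
            continue
        records.append(
            {
                "country": match["country"].upper(),
                "session": int(match["session"]),
                "year": int(match["year"]),
                "text": path.read_text(encoding="utf-8", errors="replace"),
            }
        )
    return records
```

If pandas is installed, `pandas.DataFrame(load_speeches("path/to/corpus"))` turns these records into a dataframe with one row per speech.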

To clean the data, run the preprocessing pipeline in Dataprocessing.py. Alternatively, use the preprocessed dataset at this link.
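
For readers unfamiliar with text cleaning for topic models, the sketch below shows the kind of steps such a pipeline commonly performs. It is an assumption about typical preprocessing, not a description of what Dataprocessing.py actually does; the function `clean_text` and the tiny `STOPWORDS` set are hypothetical placeholders.

```python
import re

# Tiny illustrative stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"the", "of", "and", "to", "in", "a", "is", "that", "for", "on"}


def clean_text(text):
    """Lowercase, replace non-letter characters, and filter out noise tokens."""
    text = text.lower()
    # Replace anything that is not a lowercase ASCII letter or whitespace.
    text = re.sub(r"[^a-z\s]", " ", text)
    # Keep tokens that are not stopwords and are longer than two characters.
    return [tok for tok in text.split() if tok not in STOPWORDS and len(tok) > 2]
```

For example, `clean_text("The General Assembly of 1970, in New York!")` returns `["general", "assembly", "new", "york"]`.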
