NLP: An Application for Public Policy, PyCon Ireland 2018

Natural Language Processing: An Application in Public Policy

This repository contains the files used for the Natural Language Processing: An Application in Public Policy talk presented at PyCon Ireland 2018 by Ancil Crayton.

Slides for the presentation can be found HERE.


In this talk, I aim to show how NLP can enhance public policy analysis. Specifically, we will focus on using text analysis to examine important documents and to provide both qualitative and quantitative analysis.

If you are not familiar with public policy, you may be wondering exactly what I mean by this. In short, I am referring to any efforts taken by government authorities that may result in action, regulation, or laws related to aspects of societal well-being. From an academic perspective, public policy spans many fields of the social sciences, such as economics, sociology, political economy, and public management. This is the view I take in this talk. Hence, when I refer to public policy, I mean studying applications in these fields that relate to the motivation behind, evolution of, or effectiveness of government policy.

So, how can we use text analysis to aid in public policy analysis? Here are a few possible uses:

  • Analyze historical government documents
  • Understand the evolution of ideologies through transcriptions of speeches
  • Understand responses to public policy by analyzing social media
  • Summarize a collection of documents to find the most important topics related to a certain policy

What's covered?

In general, I cover the basics of text analysis and the relevant Python tools for performing it. In more detail, we will briefly look at:

Text preprocessing: This is where we will clean text of extraneous information.
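As a rough illustration of what this cleaning step can look like, here is a minimal sketch using only the standard library. It is not the notebook's actual code: the `preprocess` function, the tiny `STOP_WORDS` set, and the sample sentence are all made up for this example, and a real pipeline would more likely rely on a fuller stop-word list such as the one shipped with nltk.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration only;
# a real pipeline would use a fuller list (for example, nltk's).
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "that", "at", "its"}


def preprocess(text):
    """Lowercase, strip non-letters, split on whitespace, drop stop words."""
    text = text.lower()
    # Replace anything that is not a lowercase letter or whitespace with a space.
    text = re.sub(r"[^a-z\s]", " ", text)
    tokens = text.split()
    # Keep tokens longer than one character that are not stop words.
    return [tok for tok in tokens if tok not in STOP_WORDS and len(tok) > 1]


# Made-up sentence in the style of a policy statement, not real data.
sample = "The Committee decided to maintain the target range at 2 percent."
tokens = preprocess(sample)
print(tokens)
```

Digits and punctuation are dropped here for simplicity; whether that is appropriate depends on the documents being analyzed.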

Feature extraction: Transform raw text into numerical measures to feed into machine learning algorithms and perform regression analysis.
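To make the idea of "numerical measures" concrete, the sketch below builds a bag-of-words matrix and a TF-IDF matrix with scikit-learn. The three documents are invented for illustration and are not taken from this repository's data; the choice of `CountVectorizer` and `TfidfVectorizer` with default settings is an assumption, not necessarily what the notebook does.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Made-up mini corpus for illustration; not the repository's data.
docs = [
    "inflation remains low and employment is strong",
    "inflation is rising while employment remains strong",
    "the committee will keep rates low",
]

# Bag-of-words: each row is a document, each column a vocabulary term,
# each cell the number of times the term occurs in the document.
count_vec = CountVectorizer()
counts = count_vec.fit_transform(docs)

# TF-IDF: same layout, but terms that appear in many documents are down-weighted.
tfidf_vec = TfidfVectorizer()
tfidf = tfidf_vec.fit_transform(docs)

# vocabulary_ maps each term to its column index in the matrix.
inflation_col = count_vec.vocabulary_["inflation"]
print(counts.shape)
print(counts[:, inflation_col].sum())
```

Both matrices are sparse, so they can be passed directly to most scikit-learn estimators or converted with `.toarray()` for inspection.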

Topic modelling: ML algorithms that can learn to summarize documents into "topics".
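One common topic model is Latent Dirichlet Allocation (LDA). The sketch below uses scikit-learn's `LatentDirichletAllocation` purely to keep the example short; the talk lists gensim among its tools, and this is not claimed to be the notebook's implementation. The four documents, the choice of two topics, and the `random_state` value are all assumptions made for illustration.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Made-up documents for illustration; not the repository's data.
docs = [
    "inflation and prices rose as energy costs increased",
    "prices and inflation expectations moved higher",
    "employment grew and the labor market strengthened",
    "labor market conditions improved as employment expanded",
]

# LDA works on raw term counts, so start from a bag-of-words matrix.
vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(docs)

# Assumed settings: two topics and a fixed seed for reproducibility.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(X)  # shape (n_docs, n_topics); each row sums to 1

# vocabulary_ maps term -> column index; sort terms by index to label columns.
terms = sorted(vectorizer.vocabulary_, key=vectorizer.vocabulary_.get)
for k, weights in enumerate(lda.components_):
    top_terms = [terms[i] for i in weights.argsort()[::-1][:3]]
    print(f"Topic {k}: {top_terms}")
```

With a corpus this small the topics are not meaningful; the point is only to show the shape of the inputs and outputs.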

Regression analysis: Statistical models that allow us to perform a quantitative policy analysis.

Tools: We will cover basic tools within Python libraries such as gensim, nltk, scikit-learn, wordcloud, and statsmodels.


The notebook provided in this repository was written in Python 3.6.

Dependencies can be installed by running the following command in the terminal: pip install -r requirements.txt
