Unsupervised learning for illicit activity


Docker container on VM

Credit to @syoh for creating the initial Docker image and for help setting up the computational environment. See the GitHub repo :octocat: for a demo on spinning up Docker containers that work with Binder.

  • Start a Jupyter Notebook environment with docker-compose.yml
  • The Docker image will be created from Dockerfile and includes the necessary packages to run Jupyter Book
  • will download a utility to create a password and encryption keys for your Jupyter notebook

Goals of the project

This project works with the latest trade mis-invoicing estimates from the United Nations Economic Commission for Africa: Lépissier, A., Davis, W., & Ibrahim, G. (2019). Trade Mis-Invoicing Dataset (Version 1). DOI

While estimates of the dollar value of illicit trade have helped shed light on the severity of the problem, the next step in the analysis is to understand the nature of the illicit activity in terms of its origins, destinations, and sectors.

The goal of this project is therefore to extract insights on illicit trade using unsupervised machine learning techniques. By doing so, I can identify analytically relevant categories and dimensions of variation, which in turn help generate hypotheses and guide further work.

This project will apply the following techniques to the data:

  1. Dimension reduction using Principal Components Analysis (PCA)
  2. Clustering
  3. Graph analysis
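
The three techniques above can be sketched in a few lines. This is a minimal illustration only, assuming NumPy is available: the data is synthetic, the function names (pca, kmeans, weighted_out_degree) are hypothetical, and nothing here reflects the dataset's actual columns or the code in this repository's notebooks.

```python
import numpy as np

def pca(X, n_components):
    """Project rows of X onto the top principal components via SVD."""
    Xc = X - X.mean(axis=0)                      # center each feature
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T            # coordinates in component space
    explained = (S ** 2) / (S ** 2).sum()        # share of variance per component
    return scores, explained[:n_components]

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per row."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance from every point to every center, then nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def weighted_out_degree(flows, n_nodes):
    """Sum of outgoing edge weights per node in a directed flow graph."""
    out = np.zeros(n_nodes)
    for src, _dst, weight in flows:
        out[src] += weight
    return out

# Synthetic placeholder data: two well-separated groups of 20 observations
# with 5 features each. These numbers are NOT trade mis-invoicing estimates.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0, 1, (20, 5)), rng.normal(8, 1, (20, 5))])

scores, explained = pca(X, n_components=2)       # 1. dimension reduction
labels = kmeans(scores, k=2)                     # 2. clustering on the scores

# 3. graph analysis: hypothetical (origin, destination, value) edges
flows = [(0, 1, 10.0), (0, 2, 5.0), (1, 2, 2.5)]
out = weighted_out_degree(flows, n_nodes=3)
```

In this sketch the clustering runs on the PCA scores rather than the raw features, which is one common ordering of the first two steps; the graph step treats each origin and destination as a node and each flow as a weighted directed edge.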

Jupyter Book

A Jupyter Book write-up of this project is available at