Project Repository for arms-tracker, an interactive web-application visualising the flow of arms ex- and imports and the impact on global conflict.

Kafkaese/taro

Arms-tracker

Welcome to arms-tracker, an interactive web-application visualising the flow of arms ex- and imports and the impact on global conflict.

You can visit the app at www.arms-tracker.app

The app is both a passion project and a project for my portfolio. If you are interested in how I created and maintain this project, this is where you can find all the details. If you are only interested in the finished product, you can stop reading now.

Structure

This gives a rough idea of the structure of the project. The individual components are discussed below.

[Diagram: project structure and workflow]

Repositories

The app consists of three repositories, plus this very repo you are reading right now.

TL;DR:

Taro

This very repository, which serves as documentation and a landing page.

taro-data

This repository contains the backend of the arms-tracker app. This includes:

EDA Jupyter Notebooks

Every data project starts with the collection and exploration of data. Here you can find EDA notebooks as well as notebooks containing the first code for the data pipelines. Since some of the additional data is scraped from Wikipedia, you can also find notebooks with web scrapers in this section.

API Code

The REST API is written in Python using the FastAPI package and is served by a Uvicorn web server. The API is the interface between the PostgreSQL database and the frontend. It exposes various endpoints for GET requests that retrieve data in a safe way and encapsulate the complexity of the SQL queries. The whole API is containerized with Docker for fast and easy deployment.
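As a rough illustration of how an endpoint can hide a SQL query behind a small function, the sketch below uses Python's built-in sqlite3 module as a stand-in for PostgreSQL. The table `arms_transfers`, its columns and the functions `get_exports` and `make_demo_db` are hypothetical names, not taken from the actual API code, and the rows are made-up values. In the real application a FastAPI route handler would presumably call something along these lines.

```python
import sqlite3


def get_exports(conn: sqlite3.Connection, country: str, year: int) -> list:
    """Return all exports of one country in one year (hypothetical schema)."""
    # Placeholders keep user input out of the SQL string itself,
    # which protects the query against SQL injection.
    rows = conn.execute(
        "SELECT importer, value FROM arms_transfers "
        "WHERE exporter = ? AND year = ? ORDER BY importer",
        (country, year),
    ).fetchall()
    return [{"importer": importer, "value": value} for importer, value in rows]


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory database with made-up example rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE arms_transfers "
        "(exporter TEXT, importer TEXT, year INTEGER, value REAL)"
    )
    conn.executemany(
        "INSERT INTO arms_transfers VALUES (?, ?, ?, ?)",
        [("A", "B", 2020, 10.0), ("A", "C", 2020, 5.0), ("B", "A", 2020, 1.0)],
    )
    return conn
```

The caller only passes a country and a year and never sees the SQL, which is the encapsulation described above.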

Pipeline

To get the raw data preprocessed and loaded into the PostgreSQL database that the API queries, there are various data pipelines. They take the form of a custom Python package named 'taro', so they can easily be containerized and run from that container.
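A pipeline step of this kind boils down to reading raw records, cleaning them and writing them to the database. The following is a minimal sketch under assumptions: the CSV columns, the helpers `clean_row` and `load` and the table name are invented for illustration and are not the actual 'taro' package, and sqlite3 stands in for PostgreSQL.

```python
import csv
import io
import sqlite3
from typing import Optional


def clean_row(row: dict) -> Optional[tuple]:
    """Normalise one raw record; return None if it cannot be used."""
    try:
        return (
            row["exporter"].strip(),
            row["importer"].strip(),
            int(row["year"]),
            float(row["value"]),
        )
    except (KeyError, ValueError, TypeError, AttributeError):
        return None  # skip rows with missing or malformed fields


def load(raw_csv: str, conn: sqlite3.Connection) -> int:
    """Clean the raw CSV text and insert the valid rows; return their count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS arms_transfers "
        "(exporter TEXT, importer TEXT, year INTEGER, value REAL)"
    )
    reader = csv.DictReader(io.StringIO(raw_csv))
    cleaned = [r for r in map(clean_row, reader) if r is not None]
    conn.executemany("INSERT INTO arms_transfers VALUES (?, ?, ?, ?)", cleaned)
    conn.commit()
    return len(cleaned)
```

Keeping the cleaning logic in a pure function like `clean_row` makes it easy to test without a database.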

Tests

To ensure a good development workflow, the code for both the API and the pipelines comes with a number of tests, written with the Pytest package. The tests are crucial for the CI workflow discussed later.
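Pytest collects plain functions whose names start with `test_` and treats a failing `assert` as a test failure. Here is a minimal sketch of that style, using a hypothetical helper `share_of_total` that is not part of the actual code base:

```python
def share_of_total(value: float, total: float) -> float:
    """Hypothetical helper: fraction of `total` made up by `value`."""
    if total <= 0:
        raise ValueError("total must be positive")
    return value / total


def test_share_of_total_returns_fraction():
    assert share_of_total(25.0, 100.0) == 0.25


def test_share_of_total_rejects_zero_total():
    # Written without pytest.raises so the sketch also runs without pytest.
    try:
        share_of_total(1.0, 0.0)
    except ValueError:
        return
    raise AssertionError("expected ValueError")
```

Saved in a file whose name starts with `test_`, these functions would be picked up by running `pytest` in the same directory.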

Development Environment

For local development there are a number of files that provide a convenient development environment. This includes a Docker Compose configuration with services for the API, a PostgreSQL server and the pipelines. Optionally, a frontend container can be enabled, but it often turned out to be more convenient to run the frontend on a separate development server. To allow SSL encryption already at the development stage, the necessary files are also in this repository. The SSL certificate and key are trusted locally only and serve to allow the development of SSL-encrypted content on a local machine. The certificate and key were created with mkcert.

Continuous Integration

Due to the microservice architecture of the application and the resulting split of the code into several repositories, part of the CI pipeline lives in this repository. Specifically, a Test Environment is run every time a non-draft pull request into the main branch of the taro-data repository is opened or synchronized. A GitHub Actions workflow is used for this purpose. The workflow uses Terraform to provision a Test Environment on Microsoft Azure. This includes:

  • A Resource Group
  • A Container Registry
  • A PostgreSQL Flexible Server
  • A Container Group

After the environment has been provisioned, the images for the API and the data pipelines are built and pushed to the Container Registry, and the pipeline is run. Then the Container Group starts an instance of the API image and the tests can be run. In a final step, regardless of the outcome of any previous step, the Test Environment is destroyed. This minimizes the cost of the infrastructure.
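The important property of this workflow is that the teardown runs regardless of what happened before. In GitHub Actions this is commonly expressed with an `if: always()` condition on a step; whether this project does it exactly that way is an assumption. The same control flow can be sketched in Python, with placeholder callables standing in for the Terraform and Docker steps (`run_test_environment` is a hypothetical name):

```python
from typing import Callable


def run_test_environment(
    provision: Callable[[], None],
    run_tests: Callable[[], None],
    destroy: Callable[[], None],
) -> bool:
    """Provision, test and always destroy; return True if every step passed."""
    try:
        provision()  # stands in for provisioning and building/pushing images
        run_tests()  # stands in for running the tests against the API container
        return True
    except Exception:
        return False  # any failing step marks the whole run as failed
    finally:
        destroy()  # runs on success and on failure alike
```

The `finally` block plays the role of the final destroy step: it runs even when provisioning or the tests raise an error.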

taro-map

This repository contains the frontend of the arms-tracker app.

React.js Application

The frontend is written in JavaScript using the React framework. It is containerized with Docker and served by an Nginx web server.

Continuous Integration

Similarly to the backend repository, the frontend repository also contains some elements of the CI pipeline. Again, a Test Workflow, using GitHub Actions, is run every time a non-draft pull request into the main branch of the taro-map repository is opened or synchronized. The workflow uses Terraform to provision a Test Environment on Microsoft Azure, which includes, among other resources, a Container Registry and a Container Group.

After the environment has been provisioned, the image for the frontend is built and pushed to the Container Registry. Then the Container Group starts an instance of the frontend image and the tests can be run. In a final step, regardless of the outcome of any previous step, the Test Environment is destroyed.

The third repository contains the Terraform IaC (Infrastructure as Code) configuration for provisioning the production environment on Microsoft Azure. The structure of the infrastructure is visualized in the following graph.

[Diagram: infrastructure of the production environment]

The individual components of the infrastructure are listed below:

Resource Group, Storage, Registry and Network

All resources on Azure must be part of a Resource Group, so the production environment has a dedicated Resource Group. Part of the Resource Group is a Storage Account that stores the backend for all the Terraform configurations, including the one for the production environment. For this reason the Resource Group and the Storage Account are marked as indestructible.

A dedicated Virtual Network for the production environment is also created, as well as a Container Registry for all the Docker images needed.

Postgres Server

An Azure PostgreSQL Flexible Server. The server is initialized with a database for the backend. It also comes with a private DNS zone that assigns an FQDN within the Virtual Network to the Postgres server. The server has a dedicated subnet with a service delegation to 'Azure Postgres Flexible Server'.

API and Data Pipeline

The API and the Data Pipeline are both containerized and deployed as part of a Container Group. The Data Pipeline is deployed as an init container that is run exactly once during creation of the Container Group. Then the API container is deployed in the same Container Group. The API container runs a Uvicorn server, serving a FastAPI application. Like most resources, the Container Group has a dedicated Subnet with a service delegation.

Frontend

The React frontend is also containerized and deployed in a dedicated Container Group, again with its own subnet with a service delegation. The container runs an Nginx webserver that serves the built React application.

Reverse Proxy

The application is reachable via a single public IP address that has a number of DNS records on Google Domains. The public IP is associated with a Network Interface connected to a Virtual Machine, which runs an Nginx reverse proxy server. The server is configured to listen on port 443 (HTTPS), and for this purpose the SSL certificate and keychain are stored on the VM. The certificate is valid for arms-tracker.app as well as api.arms-tracker.app, so all traffic is SSL encrypted.

The Network Interface has a Network Security Group attached, with inbound rules for HTTPS, for regular encrypted traffic to the API and the frontend, and for SSH, for development and maintenance purposes. The reverse proxy sends traffic for arms-tracker.app to the Frontend Container Group's private IP and traffic for api.arms-tracker.app to the API Container Group's private IP.
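The routing rule described here amounts to a lookup from the requested host name to a private upstream address. The sketch below expresses that logic in Python purely for illustration. The actual proxy is Nginx, the function `pick_upstream` is a hypothetical name, and the private IP addresses are made-up placeholders.

```python
from typing import Optional

# Placeholder private addresses; the real ones are assigned inside the Virtual Network.
UPSTREAMS = {
    "arms-tracker.app": "10.0.1.4",      # Frontend Container Group
    "api.arms-tracker.app": "10.0.2.4",  # API Container Group
}


def pick_upstream(host_header: str) -> Optional[str]:
    """Return the upstream for a Host header, or None for unknown hosts."""
    # Drop an optional port and normalise case; host names are case-insensitive.
    host = host_header.split(":", 1)[0].strip().lower()
    return UPSTREAMS.get(host)
```

Because both Container Groups are only reachable through their private IPs, the reverse proxy is the single public entry point to the application.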

Staging Environment

The staging environment is currently being reworked to more closely resemble the final production environment. There is still code for the older, currently inactive staging environment in the repositories. This includes GitHub Actions workflows and Terraform configurations. In the future you will find more information here.

Roadmap / Future

There are many more features I want to implement in the future, both app features and additions to the infrastructure. Here is a list of future to-dos, roughly in order of priority.

  • Staging Environment (see above)
  • Continuous Deployment
  • Monitoring and Logging (see also structure diagram)
  • Additional app features, including:
    • Linking currently ongoing armed conflicts to arms im- and exports
    • Including more data about the relative importance of the military in a nation's economy, e.g. military budget as a percentage of GDP

Contributing

If you are interested in contributing to the project there are several ways:

  • Open an issue about a bug you found or a feature/enhancement you would like to suggest.
  • Pick an open issue and fix it.
  • Start coding and make a PR with your enhancement.

All the code is open source. I greatly appreciate any input and help!

Why taro?

Gerda Taro was a war photographer during the Spanish Civil War. She was immortalized in the song "Taro" by the indie band Alt-J. A taro is also a root vegetable not too dissimilar to a potato, but that is neither here nor there.
