From 8d56ce836c36b6d9dab61f5393d308e95ee7329b Mon Sep 17 00:00:00 2001
From: Clemens Wolff
Date: Wed, 31 Jan 2018 13:30:47 -0500
Subject: [PATCH] Split README into multiple documents

---
 README.md                                     | 208 +-----------------
 project-fortis-pipeline/docs/background.md    |  33 +++
 .../docs/development-setup.md                 | 131 +++++++++++
 .../docs/production-setup.md                  |  51 +++++
 4 files changed, 218 insertions(+), 205 deletions(-)
 create mode 100644 project-fortis-pipeline/docs/background.md
 create mode 100644 project-fortis-pipeline/docs/development-setup.md
 create mode 100644 project-fortis-pipeline/docs/production-setup.md

diff --git a/README.md b/README.md
index 1b627759..da930dc2 100644
--- a/README.md
+++ b/README.md
@@ -1,211 +1,9 @@
 # project-fortis
 
-[![Travis CI status](https://api.travis-ci.org/CatalystCode/project-fortis.svg?branch=master)](https://travis-ci.org/CatalystCode/project-fortis)
-
-## Background
-
-### Overview
-
 Project Fortis is a data ingestion, analysis and visualization pipeline. The
 Fortis pipeline collects social media conversations and postings from the public
 web and darknet data sources.
 
-Learn more about Fortis in our [article](https://aka.ms/fortis-story) and in our
-[dashboard walkthrough (in Spanish)](http://aka.ms/fortis-colombia-demo).
-
-![Overview of the Fortis project workflow](https://user-images.githubusercontent.com/1086421/35058654-a326c8e8-fb86-11e7-9dbb-f3e719aabf48.png)
-
-### Monitoring
-
-Fortis is a flexible project and can be configured for many situations, e.g.:
-* Ingesting from multiple data sources, including:
-  - Twitter
-  - Facebook
-  - Public Web (via Bing)
-  - RSS
-  - Reddit
-  - Instagram
-  - Radio Broadcasts
-  - ACLED
-* Fortis also comes with pre-configured terms to monitor sites of these types:
-  - Humanitarian
-  - Climate Change
-  - Health
-
-### Architecture
-
-![Overview of the Fortis pipeline architecture](https://user-images.githubusercontent.com/1086421/33353437-d1ed7fc8-d47b-11e7-9f05-818723f8c09c.png)
-
-## Deployment
-
-### Local deployment
-
-#### One-time setup
-
-First, you need to get the code:
-
-```sh
-git clone https://github.com/CatalystCode/project-fortis.git
-cd project-fortis
-```
-
-Then, you need to set up some services in Azure by running the following Bash
-script (e.g. via the [Windows Subsystem for Linux](https://docs.microsoft.com/en-us/windows/wsl/about)):
-
-```sh
-./project-fortis-pipeline/localdeploy/fortis-deploy.sh \
-  -i YOUR_SUBSCRIPTION_ID_HERE \
-  -l YOUR_CLOSEST_AZURE_LOCATION_HERE \
-  -o .env-secrets
-```
-
-This script will deploy to Azure a number of services used by Fortis, such as
-ServiceBus, EventHubs, Cognitive Services, and so forth. The secrets to access
-these services are stored in a `.env-secrets` file which the rest of the
-development setup will leverage. All the services are stored inside of a single
-resource group whose name is stored under the `FORTIS_RESOURCE_GROUP_NAME` key
-in the secrets file.
-
-Next, you need to create a Mapbox access token. If you don't have one yet, you
-can create a new one for free by [signing up](https://www.mapbox.com/signup/).
-Once you have the token, append it to the `.env-secrets` file like so:
-
-```
-MAPBOX_ACCESS_TOKEN=your_mapbox_access_token_here
-```
-
-You may also want to add your email address to the `USERS` or `ADMINS` list in
-the `.env` file; otherwise you wouldn't be able to log into your Fortis site!
-Alternatively, you can also clear out the value for the key `AD_CLIENT_ID` in
-the `.env` file in order to disable authentication entirely. This is for example
-useful if you want to talk directly to the GraphQL server via the GraphiQL tool
-instead of accessing the API endpoints via the UI.
-
-#### Preparing Docker
-
-This project runs entirely inside of Docker containers orchestrated by
-docker-compose, so please ensure that you have installed Docker on your system,
-e.g. [Docker for Windows](https://docs.docker.com/docker-for-windows/install/).
-
-We're using a volume mount to enable support for code hot-reload. As such,
-please ensure that you've shared the drive on which your code resides with
-Docker via the "Shared Drives" tab in the [Docker settings](https://docs.docker.com/docker-for-windows/#docker-settings).
-
-
-
-The containers created for this project use quite a lot of resources, so if any
-of the services die with exit code 137, please give more memory to Docker via
-the "Advanced" tab in the [Docker settings](https://docs.docker.com/docker-for-windows/#docker-settings).
-
-
-
-#### Running the service
-
-Now you can start the full Fortis pipeline with one command:
-
-```sh
-docker-compose up --build
-```
-
-This will start all the Docker services and gather logs in the terminal. After
-all the Docker services started, head over to the following URLs to play with
-the services:
-
-* Frontend
-  - http://localhost:8888/#/site/mta/admin
-  - http://localhost:8888/#/site/mta
-  - http://localhost:8888/#/site/food/admin
-  - http://localhost:8888/#/site/food
-* Backend
-  - http://localhost:8889/api/edges/graphiql
-  - http://localhost:8889/api/messages/graphiql
-  - http://localhost:8889/api/settings/graphiql
-  - http://localhost:8889/api/tiles/graphiql
-* Spark
-  - http://localhost:7777/jobs/
-
-After making changes, you can re-build and re-start the affected services using:
-
-```sh
-docker-compose up --build -d
-```
-
-Note that any changes to the React code in project-fortis-interfaces folder will
-be automatically detected and re-loaded so the re-build step above won't be
-necessary for changes to the frontend.
-
-#### Accessing Cassandra
-
-If you need more low-level access to the Cassandra database, you can execute the
-following command to log into a CQL shell:
-
-```
-docker-compose exec project_fortis_services /app/cqlsh
-```
-
-#### Too many Twitter connections
-
-If you're getting an error from project-fortis-spark that there are too many
-simultaneous Twitter connections, please follow these steps:
-
-1. Create a new set of [Twitter credentials](https://apps.twitter.com/app/new).
-2. Make a copy of the [seed-data-twitter.tar.gz](https://github.com/CatalystCode/project-fortis/blob/master/project-fortis-pipeline/ops/storage-ddls/seed-data-twitter.tar.gz) archive, e.g. suffixing it with your name.
-3. Update the `streams.csv` file in your copy of the archive with your Twitter credentials.
-4. Commit and push your copy of the archive.
-5. Edit the `CASSANDRA_SEED_DATA_URL` variable in the `.env` file to point to your copy of the archive.
-
-### Production deployment
-
-#### Prerequisites
-
-* First and foremost, you'll need an Azure subscription. You can create one for
-  free [here](https://azure.microsoft.com/en-us/free/).
-
-* Generate an SSH key pair following [these](https://help.github.com/articles/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent/)
-  instructions. The contents from the generated `MyKey.pub` file will be used
-  for the `SSH Public Key` field in the Azure deployment.
-
-* You'll need an Azure service principal. You can follow these [instructions](https://docs.microsoft.com/en-us/azure/azure-resource-manager/resource-group-create-service-principal-portal)
-  if you need to generate a new service principal. During the Azure deployment,
-  the `Application ID` will be used for the `Service Principal App ID` field
-  and the `Authentication Key` will be used for the `Service Principal App Key`.
-
-* You'll need a Mapbox access token. If you don't have one yet, [sign up](https://www.mapbox.com/signup/)
-  to create a new token for free.
-
-#### Setting up a new Azure deployment
-
-Hit the deploy to Azure button below:
-
-[![Deploy to Azure](http://azuredeploy.net/deploybutton.svg)](https://deploy.azure.com/?repository=https://github.com/catalystcode/project-fortis/tree/master?ptmpl=azuredeploy.parameters.json)
-
-Fill in the wizard that comes up:
-
-![Screenshot of ARM wizard](https://user-images.githubusercontent.com/7635865/27882830-e785819c-6193-11e7-9b27-5fc452f23b1a.png)
-
-Now grab a large cup of coffee as the deployment can take north of an hour to
-complete.
-
-Once the deployment has finished, click on the `Manage your resources`
-(highlighted below).
-
-![Screenshot of ARM template after successful deployment with highlight of management link to access the newly created resource group](https://user-images.githubusercontent.com/1086421/33331326-4437a7fe-d42f-11e7-8b4a-19b968b4705b.png)
-
-Now click on the `Tags` tab in the Azure Portal (highlighted below) and find the
-`FORTIS_ADMIN_INTERFACE_URL` (also highlighted below).
-
-![Screenshot of Azure portal with highlight of the Fortis admin site URL accessed via Azure Portal tags](https://user-images.githubusercontent.com/1086421/33331249-1b1ce1f4-d42f-11e7-8341-0100660e9e74.png)
-
-Point your browser to the admin interface URL. Once the Fortis admin portal
-loads, you can now finalize the setup of your Fortis deployment using the portal:
-
-![Screenshot showing the Fortis admin interface](https://user-images.githubusercontent.com/1086421/33331562-e9e589be-d42f-11e7-870c-6b758ec2141a.png)
-
-Once you've completed all the admin configuration, your deployment is ready to
-be used.
-
-For more detailed information on the admin page, refer to this [guide](project-fortis-pipeline/docs/user/admin.md).
+- [Find out more about the project](project-fortis-pipeline/docs/background.md)
+- [Learn how to set up Fortis in Azure](project-fortis-pipeline/docs/production-setup.md)
+- [Onboarding guide for developers](project-fortis-pipeline/docs/development-setup.md)
diff --git a/project-fortis-pipeline/docs/background.md b/project-fortis-pipeline/docs/background.md
new file mode 100644
index 00000000..311e9ecf
--- /dev/null
+++ b/project-fortis-pipeline/docs/background.md
@@ -0,0 +1,33 @@
+# Fortis background information
+
+## Overview
+
+Project Fortis is a data ingestion, analysis and visualization pipeline. The
+Fortis pipeline collects social media conversations and postings from the public
+web and darknet data sources.
+
+Learn more about Fortis in our [article](https://aka.ms/fortis-story) and in our
+[dashboard walkthrough (in Spanish)](http://aka.ms/fortis-colombia-demo).
+
+![Overview of the Fortis project workflow](https://user-images.githubusercontent.com/1086421/35058654-a326c8e8-fb86-11e7-9dbb-f3e719aabf48.png)
+
+## Monitoring
+
+Fortis is a flexible project and can be configured for many situations, e.g.:
+* Ingesting from multiple data sources, including:
+  - Twitter
+  - Facebook
+  - Public Web (via Bing)
+  - RSS
+  - Reddit
+  - Instagram
+  - Radio Broadcasts
+  - ACLED
+* Fortis also comes with pre-configured terms to monitor sites of these types:
+  - Humanitarian
+  - Climate Change
+  - Health
+
+## Architecture
+
+![Overview of the Fortis pipeline architecture](https://user-images.githubusercontent.com/1086421/33353437-d1ed7fc8-d47b-11e7-9f05-818723f8c09c.png)
diff --git a/project-fortis-pipeline/docs/development-setup.md b/project-fortis-pipeline/docs/development-setup.md
new file mode 100644
index 00000000..f4df6ed6
--- /dev/null
+++ b/project-fortis-pipeline/docs/development-setup.md
@@ -0,0 +1,131 @@
+# Fortis development setup
+
+[![Travis CI status](https://api.travis-ci.org/CatalystCode/project-fortis.svg?branch=master)](https://travis-ci.org/CatalystCode/project-fortis)
+
+## One-time setup
+
+### Getting the code
+
+First, you need to get the code:
+
+```sh
+git clone https://github.com/CatalystCode/project-fortis.git
+cd project-fortis
+```
+
+### Setting up Azure resources
+
+Then, you need to set up some services in Azure by running the following Bash
+script (e.g. via the [Windows Subsystem for Linux](https://docs.microsoft.com/en-us/windows/wsl/about)):
+
+```sh
+./project-fortis-pipeline/localdeploy/fortis-deploy.sh \
+  -i YOUR_SUBSCRIPTION_ID_HERE \
+  -l YOUR_CLOSEST_AZURE_LOCATION_HERE \
+  -o .env-secrets
+```
+
+This script will deploy to Azure a number of services used by Fortis, such as
+ServiceBus, EventHubs, Cognitive Services, and so forth. The secrets to access
+these services are stored in a `.env-secrets` file which the rest of the
+development setup will leverage. All the services are stored inside of a single
+resource group whose name is stored under the `FORTIS_RESOURCE_GROUP_NAME` key
+in the secrets file.
+
+### Generating Mapbox access token
+
+Next, you need to create a Mapbox access token. If you don't have one yet, you
+can create a new one for free by [signing up](https://www.mapbox.com/signup/).
+Once you have the token, append it to the `.env-secrets` file like so:
+
+```
+MAPBOX_ACCESS_TOKEN=your_mapbox_access_token_here
+```
+
+### Authentication
+
+You may also want to add your email address to the `USERS` or `ADMINS` list in
+the `.env` file; otherwise you wouldn't be able to log into your Fortis site!
+Alternatively, you can also clear out the value for the key `AD_CLIENT_ID` in
+the `.env` file in order to disable authentication entirely. This is for example
+useful if you want to talk directly to the GraphQL server via the GraphiQL tool
+instead of accessing the API endpoints via the UI.
+
+### Preparing Docker
+
+This project runs entirely inside of Docker containers orchestrated by
+docker-compose, so please ensure that you have installed Docker on your system,
+e.g. [Docker for Windows](https://docs.docker.com/docker-for-windows/install/).
+
+We're using a volume mount to enable support for code hot-reload. As such,
+please ensure that you've shared the drive on which your code resides with
+Docker via the "Shared Drives" tab in the [Docker settings](https://docs.docker.com/docker-for-windows/#docker-settings).
+
+
+
+The containers created for this project use quite a lot of resources, so if any
+of the services die with exit code 137, please give more memory to Docker via
+the "Advanced" tab in the [Docker settings](https://docs.docker.com/docker-for-windows/#docker-settings).
+
+
+
+## Running the service
+
+Now you can start the full Fortis pipeline with one command:
+
+```sh
+docker-compose up --build
+```
+
+This will start all the Docker services and gather logs in the terminal. After
+all the Docker services started, head over to the following URLs to play with
+the services:
+
+* Frontend
+  - http://localhost:8888/#/site/mta/admin
+  - http://localhost:8888/#/site/mta
+  - http://localhost:8888/#/site/food/admin
+  - http://localhost:8888/#/site/food
+* Backend
+  - http://localhost:8889/api/edges/graphiql
+  - http://localhost:8889/api/messages/graphiql
+  - http://localhost:8889/api/settings/graphiql
+  - http://localhost:8889/api/tiles/graphiql
+* Spark
+  - http://localhost:7777/jobs/
+
+After making changes, you can re-build and re-start the affected services using:
+
+```sh
+docker-compose up --build -d
+```
+
+Note that any changes to the React code in project-fortis-interfaces folder will
+be automatically detected and re-loaded so the re-build step above won't be
+necessary for changes to the frontend.
+
+## Tips and tricks
+
+### Accessing Cassandra
+
+If you need more low-level access to the Cassandra database, you can execute the
+following command to log into a CQL shell:
+
+```
+docker-compose exec project_fortis_services /app/cqlsh
+```
+
+### Too many Twitter connections
+
+If you're getting an error from project-fortis-spark that there are too many
+simultaneous Twitter connections, please follow these steps:
+
+1. Create a new set of [Twitter credentials](https://apps.twitter.com/app/new).
+2. Make a copy of the [seed-data-twitter.tar.gz](https://github.com/CatalystCode/project-fortis/blob/master/project-fortis-pipeline/ops/storage-ddls/seed-data-twitter.tar.gz) archive, e.g. suffixing it with your name.
+3. Update the `streams.csv` file in your copy of the archive with your Twitter credentials.
+4. Commit and push your copy of the archive.
+5. Edit the `CASSANDRA_SEED_DATA_URL` variable in the `.env` file to point to your copy of the archive.
diff --git a/project-fortis-pipeline/docs/production-setup.md b/project-fortis-pipeline/docs/production-setup.md
new file mode 100644
index 00000000..232e1586
--- /dev/null
+++ b/project-fortis-pipeline/docs/production-setup.md
@@ -0,0 +1,51 @@
+# Fortis production setup
+
+## Prerequisites
+
+* First and foremost, you'll need an Azure subscription. You can create one for
+  free [here](https://azure.microsoft.com/en-us/free/).
+
+* Generate an SSH key pair following [these](https://help.github.com/articles/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent/)
+  instructions. The contents from the generated `MyKey.pub` file will be used
+  for the `SSH Public Key` field in the Azure deployment.
+
+* You'll need an Azure service principal. You can follow these [instructions](https://docs.microsoft.com/en-us/azure/azure-resource-manager/resource-group-create-service-principal-portal)
+  if you need to generate a new service principal. During the Azure deployment,
+  the `Application ID` will be used for the `Service Principal App ID` field
+  and the `Authentication Key` will be used for the `Service Principal App Key`.
+
+* You'll need a Mapbox access token. If you don't have one yet, [sign up](https://www.mapbox.com/signup/)
+  to create a new token for free.
+
+## Setting up a new Azure deployment
+
+Hit the deploy to Azure button below:
+
+[![Deploy to Azure](http://azuredeploy.net/deploybutton.svg)](https://deploy.azure.com/?repository=https://github.com/catalystcode/project-fortis/tree/master?ptmpl=azuredeploy.parameters.json)
+
+Fill in the wizard that comes up:
+
+![Screenshot of ARM wizard](https://user-images.githubusercontent.com/7635865/27882830-e785819c-6193-11e7-9b27-5fc452f23b1a.png)
+
+Now grab a large cup of coffee as the deployment can take north of an hour to
+complete.
+
+Once the deployment has finished, click on the `Manage your resources`
+(highlighted below).
+
+![Screenshot of ARM template after successful deployment with highlight of management link to access the newly created resource group](https://user-images.githubusercontent.com/1086421/33331326-4437a7fe-d42f-11e7-8b4a-19b968b4705b.png)
+
+Now click on the `Tags` tab in the Azure Portal (highlighted below) and find the
+`FORTIS_ADMIN_INTERFACE_URL` (also highlighted below).
+
+![Screenshot of Azure portal with highlight of the Fortis admin site URL accessed via Azure Portal tags](https://user-images.githubusercontent.com/1086421/33331249-1b1ce1f4-d42f-11e7-8341-0100660e9e74.png)
+
+Point your browser to the admin interface URL. Once the Fortis admin portal
+loads, you can now finalize the setup of your Fortis deployment using the portal:
+
+![Screenshot showing the Fortis admin interface](https://user-images.githubusercontent.com/1086421/33331562-e9e589be-d42f-11e7-870c-6b758ec2141a.png)
+
+Once you've completed all the admin configuration, your deployment is ready to
+be used.
+
+For more detailed information on the admin page, refer to this [guide](user/admin.md).