- Vagrant (`2.2.19`): https://www.vagrantup.com/downloads
- VirtualBox (`6.1.32r149290`): https://download.virtualbox.org/virtualbox/6.1.32/

Assuming the `vagrant` and VirtualBox (e.g., `vboxmanage`, `vboxheadless`) commands are accessible from the command line:
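As a quick sanity check before continuing, the snippet below verifies that each required command is reachable on `PATH`. This is only a sketch: the `check_tools` helper is hypothetical and not part of the repository.

```sh
#!/bin/sh
# Hypothetical helper (not part of the repository): report whether each
# named command exists on PATH; return non-zero if any is missing.
check_tools() {
  missing=0
  for cmd in "$@"; do
    if command -v "$cmd" >/dev/null 2>&1; then
      echo "$cmd: found"
    else
      echo "$cmd: NOT FOUND" >&2
      missing=1
    fi
  done
  return "$missing"
}

check_tools vagrant vboxmanage vboxheadless \
  || echo "install the missing tools before continuing" >&2
```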
- Download the repository
  - If `git` is installed on your machine, run:

    ```sh
    git clone git@github.com:game-sales-analytics/game-sales-analytics.git game-sales-analytics
    cd game-sales-analytics
    ```

  - Otherwise, download the project archive from the following URL, then extract it:

    https://github.com/game-sales-analytics/game-sales-analytics/archive/refs/heads/main.zip
- In the project directory, start the machines:

  ```sh
  vagrant up
  ```

  This command downloads the base box image (~130MB); depending on your internet speed, it might take a few minutes to complete. After downloading the box, it spins up the machines with the help of VirtualBox.
- Initialize the Docker Swarm:

  ```sh
  vagrant docker-swarm-init
  ```

  This command:
  - Creates the Docker Swarm on the `manager` machine.
  - Joins all the worker machines to the Swarm.
  - Uploads configuration files to the `manager` machine.
- Set up Sentry:
  - Connect to the `sentry` VM:

    ```sh
    vagrant ssh sentry
    ```

  - Install Sentry:

    ```sh
    cd self-hosted-21.12.0/
    ./install.sh
    ```

  - Configure it to use the dummy mail backend:

    ```sh
    sed -i s/"^# mail.backend: 'smtp'"/"mail.backend: 'dummy'"/ sentry/config.yml
    sed -i s/"^mail.host: 'smtp'$"/"# mail.host: 'smtp'"/ sentry/config.yml
    ```

  - Start it:

    ```sh
    docker-compose up -d
    ```

  - Set it up:

    Once Sentry becomes available, head over to http://localhost:9000. It first asks for the initial configuration options; leave them at their defaults and proceed.

    Go to Projects > internal > Settings (gear icon) > DSN. Copy the DSN and use it for configuring the other services. Remember to replace `localhost` in the DSN with `sentry.internal` when using it to configure other services.

    You can later see performance metrics of the services in the Performance view of the project.
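The `localhost` to `sentry.internal` host replacement can be done with a one-line `sed` substitution. A minimal sketch, where the DSN value is a made-up example and not a real key:

```sh
# Rewrite the host part of a copied Sentry DSN for use inside the swarm.
# The DSN below is a made-up example value, not a real key.
dsn='http://examplepublickey@localhost:9000/1'
swarm_dsn=$(printf '%s' "$dsn" | sed 's/@localhost:/@sentry.internal:/')
echo "$swarm_dsn"   # http://examplepublickey@sentry.internal:9000/1
```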
- Configure the application services:

  Set values for the different application services. Copy each of the following templates to its respective `.env` file as shown below:
  - Copy `swarm/dbs/.env.core.template` to `swarm/dbs/.env.core`
  - Copy `swarm/dbs/.env.users.template` to `swarm/dbs/.env.users`
  - Copy `swarm/dba/.env.core.template` to `swarm/dba/.env.core`
  - Copy `swarm/dba/.env.users.template` to `swarm/dba/.env.users`
  - Copy `swarm/gsa/.env.cache.template` to `swarm/gsa/.env.cache`
  - Copy `swarm/gsa/.env.coresrv.template` to `swarm/gsa/.env.coresrv`
  - Copy `swarm/gsa/.env.userssrv.template` to `swarm/gsa/.env.userssrv`

  Each of these files is a set of `KEY=VALUE` options. Fill each provided key with the proper value. Consult the commented documentation in each file for further information on what each field is and how it will be used.
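The copy steps above can also be scripted. A minimal sketch, run from the repository root; the `copy_env_templates` helper is hypothetical and not part of the repository:

```sh
# Hypothetical helper: copy every .env.*.template under a directory to its
# matching .env file, skipping targets that already exist so filled-in
# values are never overwritten.
copy_env_templates() {
  find "$1" -name '.env.*.template' -print | while read -r tpl; do
    dst="${tpl%.template}"   # e.g. swarm/dbs/.env.core.template -> swarm/dbs/.env.core
    [ -e "$dst" ] || cp "$tpl" "$dst"
  done
}

# From the repository root:
if [ -d swarm ]; then
  copy_env_templates swarm
fi
```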
- Start the Docker services:
  - Connect to the `manager` machine:

    ```sh
    vagrant ssh manager
    ```

  - Go to the swarm configuration directory:

    ```sh
    cd swarm
    ```

  - Deploy the GSA stack:

    ```sh
    docker stack deploy --compose-file compose.gsa.yaml gsa
    ```

  - Deploy the Monitoring stack:

    ```sh
    docker stack deploy --compose-file compose.mon.yaml mon
    ```

  - Deploy the Telegraf metrics collector:

    ```sh
    docker stack deploy --compose-file compose.tel.yaml tel
    ```

  These commands download and run all the Docker images; depending on your internet connection speed, it might take a while for the stacks to become available and healthy. Meanwhile, you can check the state of the stacks using the following commands:
  - List the stacks: `docker stack ls`
  - List the GSA stack services: `docker stack services gsa`
  - List the Monitoring stack services: `docker stack services mon`
  - List the Telegraf collector stack services: `docker stack services tel`
  - List all created services: `docker service ls`

  These list commands show the deployment status, health status, and number of replicas of each service. Once all the deployed replicas are ready and healthy, you can move on to the next step.
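The readiness check above can be automated by polling the replica counts. A sketch under stated assumptions: the `all_replicas_ready` helper is hypothetical, and it assumes no per-node replica limits are configured (which would add a suffix to the `Replicas` column):

```sh
# Hypothetical helper: succeed only when every "current/desired" replica
# pair in the input (one per line, as printed by
# `docker service ls --format '{{.Replicas}}'`) is fully converged.
all_replicas_ready() {
  ! printf '%s\n' "$1" | grep -qv '^\([0-9][0-9]*\)/\1$'
}

# Example polling loop, run inside the manager VM:
# until all_replicas_ready "$(docker service ls --format '{{.Replicas}}')"; do
#   sleep 5
# done
```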
- Feed the dataset:

  The `swarm/run-prepper.sh` script runs a job service which uploads the dataset to the Core service. On the first run, it pulls its Docker image and retries 10 times before giving up (simply re-run it if it fails even after the retries). You only need to run it once per deployment.
- Accessing the services

  Using your favorite browser, you can reach the following addresses:
  - http://localhost:3000: Grafana monitoring dashboard. Use the credentials set in `swarm/mon/.env.grafana` to log in to the dashboard. Create a Prometheus connection; once connected to Prometheus, you can create dashboards as you need. You can also start by importing the dashboards available at https://grafana.com/grafana/dashboards/.
  - http://localhost:8086: InfluxDB dashboard. Use the credentials set in `swarm/mon/.env.influxdb` to log in to the dashboard. Create a Telegraf connection API Token from Data > Telegraf: click the + Create Configuration button and activate the System configuration. Click Continue, choose a name (and an optional description) for the Telegraf configuration, then click the Create And Verify button. Copy the generated token, which is shown in the format `export INFLUX_TOKEN=HERE_MUST_BE_THE_TOKEN`. Once you have copied the Telegraf API Token, set it as `INFLUXDB_TELEGRAF_TOKEN` in `swarm/mon/.env.telegraf`. Upload it to the `manager` machine using `vagrant upload-swarm-files`, then (re)start the Telegraf stack using the command mentioned above. After a successful run, Telegraf will send the metrics from all nodes to InfluxDB, and you can reach the dashboards from the InfluxDB Boards section.
  - http://localhost:8181: Users database admin dashboard. Use the credentials set in `swarm/dba/.env.users` to log in to the dashboard.
  - http://localhost:8585: Core database admin dashboard. Use the credentials set in `swarm/dba/.env.core` to log in to the dashboard. After the first successful login, create a server from the panel with the following configuration:
    - Server name: `CoreDB`
    - Host: `coredb`
    - Database Name: the database name you have set in `swarm/dbs/.env.core` (`POSTGRESQL_DATABASE`)
    - Username: the application user's username you have set in `swarm/dbs/.env.core` (`POSTGRESQL_USERNAME`)
    - Password: the application user's password you have set in `swarm/dbs/.env.core` (`POSTGRESQL_PASSWORD`)
  - http://localhost:8383: Docker Swarm Visualizer service. You can see a live graphical representation of the swarm nodes and the services running on each node. Use the username and the un-encrypted version of the password you have already set for `MONITORING_ADMIN_PASSWORD` in `swarm/mon/.env.caddy` to log in.
  - http://localhost:8888: Chronograf dashboard. It is a simpler version of the InfluxDB dashboard with the sole purpose of viewing metrics in dashboards. It connects to InfluxDB using an API Token, and you can create visualization dashboards in Chronograf.
  - http://localhost:9000: Sentry dashboard.
  - http://localhost:9292: GSA API interface. You can use Postman to interact with the APIs.
  - http://localhost:9393: Prometheus dashboard. Use the username and the un-encrypted version of the password you have already set for `MONITORING_ADMIN_PASSWORD` in `swarm/mon/.env.caddy` to log in.
- To clean up all the state created by the swarm:
  - If you have run the `prepper` service script:
    - Connect to the `manager` VM:

      ```sh
      vagrant ssh manager
      ```

    - Remove the `prepper` service:

      ```sh
      docker service rm prepper
      ```

  - Stop and remove the Sentry containers (in the `sentry` VM):

    ```sh
    cd self-hosted-21.12.0/
    docker-compose down --rmi local
    docker container prune -f
    docker network prune -f
    docker volume prune -f
    ```

  - Remove the Telegraf stack (in the `manager` VM):

    ```sh
    docker stack rm tel
    ```

  - Remove the Monitoring stack (in the `manager` VM):

    ```sh
    docker stack rm mon
    ```

  - Remove the GSA stack (in the `manager` VM):

    ```sh
    docker stack rm gsa
    ```

  - (Optional) Remove any remaining Docker data from the VMs (on the host machine):

    ```sh
    vagrant docker-swarm-prune
    ```

  - Make the nodes leave the swarm:

    ```sh
    vagrant docker-swarm-leave
    ```

  - Finally, to delete all the VMs:

    ```sh
    vagrant destroy --graceful --force
    ```
A Postman collection for the REST APIs is available at: https://www.postman.com/xeptore/workspace/gsa/collection/6663032-e7ea02bf-4666-4820-a8ff-dfa3ecbf3fbe

With the default setup, a 4-core CPU and ~25GB of memory should be enough. If you want to decrease the amount of memory or the number of CPU cores allocated to each virtual machine, you can do so in the `Vagrantfile`. Of course, there is no guarantee that the application will work correctly after those changes!
- I have waited for a long time, but there are still services or replicas waiting to run, without any changes

  You have waited for a relatively long time, watching the stack services list, and there are still services or replicas that have not started. This might be due to slow download speed, in which case the problem should resolve itself if you wait longer until all the necessary Docker images are downloaded and run on the virtual machines.

  If you notice that nothing is being downloaded (e.g., by checking your system's network usage), but there are still services that have not started, the maximum number of retries for downloading Docker images may have been reached. In this case, you can simply remove the stack(s) using the `docker stack rm STACK [STACK...]` command and re-deploy them using the commands explained above.

- Services are deployed and ready, but I cannot access one or some of them from my machine

  If listing the stack services shows that all the services are successfully deployed and in the ready state, but you cannot reach some or any of them by hitting their URLs (e.g., you receive a connection reset error), there might be a bug in VirtualBox. One solution is to re-deploy the stack(s) that contain the affected service(s). For example, if accessing the application APIs returns a connection reset error, remove the stack from the `manager` machine using `docker stack rm gsa`, wait about 1-2 minutes for the stack to be completely removed from all swarm nodes, and re-deploy it using the command explained above.
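The remove-wait-redeploy cycle above can be wrapped in a small function. A sketch only: the `redeploy_stack` helper is hypothetical, and the 90-second default wait is a rough guess, not a documented value.

```sh
# Hypothetical helper: remove a stack, wait for it to drain from all swarm
# nodes, then deploy it again from its compose file.
redeploy_stack() {   # usage: redeploy_stack NAME COMPOSE_FILE [WAIT_SECONDS]
  docker stack rm "$1"
  sleep "${3:-90}"   # give the swarm time to remove the stack everywhere
  docker stack deploy --compose-file "$2" "$1"
}

# Inside the manager VM, from the swarm/ directory:
# redeploy_stack gsa compose.gsa.yaml
```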
- Fix the automatic monitoring stack `$DNS_SERVER_IP` variable setup
- Fix service startup ordering
- Listen on the private interface only for the Docker Swarm manager
- Revise the swarm services' restart policy condition (shutting down a service due to a service health check timeout results in a `0` exit status code)
- Use an internal network for swarm-internal communication
- Run the swarm nodes' vagrant commands in parallel
- Enable secure access to the admin dashboards
- Add a health check test command for the Docker Swarm Visualizer service
- Add a prepper job executor command
- Add a Sentry setup section
- Add the gRPC logo as the inter-service communication mechanism to the diagram
- Add the APM (Sentry) VM to the diagram
- Fix the DMZ nodes' Caddy server Prometheus metrics scrape errors