Some docs #29
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,88 @@ | ||
# Configuration | ||
|
||
## What does terraboot do | ||
|
||
It generates [terraform](https://www.terraform.io/)-readable JSON. Make sure you have the latest terraform installed, since bugs tend to get fixed from one version to the next. | ||
|
||
## Modules | ||
|
||
terraboot is currently divided into 4 modules: | ||
|
||
* vpc (generated by the _vpc-vpn-infra_ function): for the vpc, subnets, NAT, vpn, ELK, monitoring and alerting boxes. The setup of the boxes is not automated at this point; see install-dns, install-icinga, install-influx, install-logstash. | ||
Review comment: Use backtick to isolate fn names. |
||
* vpc dns (generated by the _vpc-public-dns_ function): for a number of public DNS names on the VPC (requires having a public domain on Route53). | ||
* cluster (generated by the _cluster-infra_ function): for individual clusters, the idea being that you can have several clusters per vpn. | ||
* cluster dns (generated by the _cluster-public-dns_ function): for cluster-specific DNS (also requires having a public domain on Route53). | ||
|
||
Configuration lives in two places: one edn file per module holds the mostly fixed details, while the more variable parameters (like instance types or open ports) are passed in the calls to terraboot. | ||
Review comment: Find this sentence a little confusing. |
||
|
||
## edn files | ||
|
||
These files should contain details which are mostly fixed across the whole infrastructure. One of these files is passed in as an argument to `lein run`, for instance `lein run resources/terraboot-staging.edn`. | ||
|
||
{:region "your-aws-region" | ||
:bucket-name "your-s3-bucket" | ||
:aws-profile "your-aws-profile" | ||
:account-number "your-aws-account-number" ;; to generate ARN | ||
:azs [:a :b :c] | ||
Review comment: Add examples for |
||
:target "your-target"} | ||
|
||
|
||
|
||
## Variable configuration | ||
|
||
The more variable configuration goes in the Clojure code: IP address ranges, names, open ports, instance types, disk sizes. | ||
As this is likely to be a moving target, it's best to check the signatures of the various functions to work out which parameters should be used. The scenario we're going for now is a 'main' function that calls the module functions with both the values read from an EDN file and the parameters set in code. | ||
|
||
``` | ||
(ns my-setup.infra | ||
(:require [terraboot.core :refer :all] | ||
[terraboot.vpc :as vpc] | ||
[terraboot.public-dns :as dns] | ||
[terraboot.cluster :as cluster] | ||
[clojure.edn :as edn] | ||
[clojure.java.io :as io])) | ||
|
||
(defn get-config | ||
"Gets info from a config file." | ||
[url] | ||
(edn/read-string (slurp url))) | ||
|
||
(defn generate-json [edn-path] | ||
(let [{:keys [account-number | ||
region | ||
azs | ||
bucket-name | ||
aws-profile | ||
                target]} (get-config edn-path) | ||
mesos-ami "" ;; desired CoreOS AMI | ||
default-ami "" ;; desired Ubuntu AMI | ||
vpc-cidr-block "" ;; VPC address range | ||
dns-zone "mastodonc.net" ;; if public dns is to be used | ||
dns-zone-id ""] ;; AWS zone id | ||
(condp = target | ||
"vpc" (do (to-file (vpc/vpc-vpn-infra {... parameters}) "vpc/vpc.tf") | ||
(to-file (dns/vpc-public-dns {... parameters}) "vpc/vpc-dns.tf")) | ||
      "dataplatform" (do (to-file (cluster/cluster-infra {... parameters}) "dataplatform/dataplatform.tf") | ||
(to-file (dns/cluster-public-dns {... parameters}) "dataplatform/dataplatform-dns.tf"))))) | ||
|
||
(defn -main [edn-path] | ||
(generate-json edn-path)) | ||
``` | ||
|
||
The *.tf files referred to in the code are the terraform JSON files that terraform will consume. It's recommended to put each set in its own directory, since terraform reads all tf files in a directory. | ||
|
||
## Running it all | ||
|
||
To run a component, first generate the relevant configuration file: | ||
|
||
``` | ||
lein run resources/terraboot-vpc.edn # takes edn path | ||
``` | ||
Then, in the relevant directory (where the tf files live): | ||
``` | ||
terraform plan . | ||
``` | ||
If the plan looks like what you'd expect (green, with the expected number of resources), apply it: | ||
``` | ||
terraform apply . | ||
``` |
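The three steps above could be chained in a small wrapper. This is only a sketch: `run_component` and its two arguments are hypothetical names, it assumes `lein` and `terraform` are on the PATH, and it assumes the tf directory is given relative to where `lein run` is invoked. It keeps the manual check between plan and apply.

```shell
# Generate the terraform JSON for one component, then plan and (optionally) apply it.
# $1: path to the edn file, $2: directory holding the generated tf files.
# Both the function name and its arguments are illustrative, not part of terraboot.
run_component() {
  edn_path="$1"
  tf_dir="$2"

  # Step 1: generate the *.tf files.
  lein run "$edn_path" || return 1

  (
    cd "$tf_dir" || exit 1

    # Step 2: show what terraform intends to change.
    terraform plan . || exit 1

    # Step 3: only apply after a human has looked at the plan.
    printf 'Apply this plan? [y/N] '
    read -r answer
    if [ "$answer" = "y" ]; then
      terraform apply .
    fi
  )
}
```

For the vpc module this might be called as `run_component resources/terraboot-vpc.edn vpc`.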
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
# Index | ||
|
||
This is a list of the documentation in this repository. | ||
|
||
How to use terraboot | ||
|
||
* [Configuration](cluster-configuration.md) | ||
|
||
For day-to-day use of a terraboot-generated DC/OS cluster | ||
|
||
* [Useful links](useful-links.md) | ||
* [Troubleshooting](troubleshooting.md) | ||
|
||
Some extra information on things that haven't been automated yet (so TODO). | ||
|
||
* [To install logstash](install-logstash.md) | ||
* [To install influxdb and grafana](install-influx.md) | ||
* [To install the Kibana proxy](install-kibana.md) | ||
* [To install Icinga2 for alerting](install-icinga.md) | ||
|
||
## External documentation | ||
|
||
* [DC/OS documentation](https://dcos.io/docs/1.8/) | ||
* [Terraform for AWS](https://www.terraform.io/docs/providers/aws/index.html) | ||
* [mesos](https://mesos.apache.org/) | ||
* [CoreOS](https://coreos.com/) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,64 @@ | ||
# Troubleshooting | ||
|
||
## Starting the cluster | ||
|
||
### DC/OS console is not starting | ||
|
||
ssh into a master box and run | ||
|
||
journalctl | ||
|
||
to see if there are any errors in the DC/OS setup. | ||
|
||
|
||
## Deployment | ||
|
||
### Is the slave disk full (without mesos knowing) because of docker? | ||
|
||
Log into the slave (`ssh core@<ip>` when on the VPN). | ||
Check disk space with `df -h`: if the main disk is at 100%, this may be your problem. | ||
Solution: | ||
|
||
docker rm $(docker ps -a -q) | ||
docker rmi $(docker images -a -q) | ||
|
||
If it doesn't release space properly, stop docker, rm -rf /var/lib/docker and restart (but this means pulling all images again). | ||
Review comment: Use backticks around bash command |
Review comment: indented by 4 spaces has same effect for markdown (https://github.com/MastodonC/terraboot/blob/master/doc/troubleshooting.md) |
||
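The disk check described above can also be scripted. The following is a rough sketch, not something terraboot provides: `disk_usage_pct` is a hypothetical helper and the 95% threshold is an arbitrary choice for illustration. It relies only on the POSIX output format of `df -P`.

```shell
# Print the usage percentage (without the % sign) of the filesystem
# holding the given path; defaults to the root filesystem.
disk_usage_pct() {
  df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# The 95% threshold is an assumption; pick whatever suits your slaves.
pct=$(disk_usage_pct /)
if [ "$pct" -ge 95 ]; then
  echo "disk at ${pct}%: consider cleaning up docker containers and images"
else
  echo "disk at ${pct}%"
fi
```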
|
||
|
||
### Is it not even starting to deploy (staying in staging mode, 0 of 1)? | ||
|
||
There may be resource contention over memory, CPU or ports. Check whether the ports your application requires are free, and whether the resources it needs are available on at least one of the slaves. This can be done by checking the mesos state-summary. | ||
|
||
curl http://staging-masters.sandpit-vpc.kixi/mesos/state-summary | ||
|
||
(The output is easier to read as formatted JSON.) | ||
|
||
### It's deploying and starting but doesn't turn green | ||
|
||
Does the health check work? The health check should work from the masters. It can be TCP, UDP or HTTP, documentation here <https://mesosphere.github.io/marathon/docs/health-checks.html>. | ||
|
||
ssh into either the slave or the master box, and attempt to check manually. | ||
|
||
HTTP: with curl | ||
TCP, UDP: | ||
|
||
netstat -a | grep LISTEN | ||
|
||
to see all the listening ports. | ||
|
||
## Marathon | ||
|
||
### After starting a marathon framework and stopping it, it sometimes keeps a new one from starting (C*, kafka) | ||
|
||
Sometimes just removing a process from Marathon doesn't completely remove all the traces of a process. Sometimes the framework needs to be torn down. | ||
|
||
curl -d@delete.txt -X POST http://staging-masters.sandpit-vpc.kixi/mesos/master/teardown | ||
|
||
with `delete.txt` containing the string `frameworkId=xyz`, where `xyz` is the id of the framework to tear down. | ||
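As a small sketch of preparing that file: `write_teardown_payload` is a hypothetical helper name, and `xyz` stands in for the real framework id, which has to be looked up on the cluster.

```shell
# Write the POST body for the Mesos teardown call into delete.txt.
# $1: the framework id to tear down.
write_teardown_payload() {
  framework_id="$1"
  printf 'frameworkId=%s' "$framework_id" > delete.txt
}

write_teardown_payload "xyz"   # "xyz" is a placeholder, not a real id
cat delete.txt                 # prints frameworkId=xyz
```

The resulting `delete.txt` is what the `curl -d@delete.txt` call above sends.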
|
||
Then all traces must be removed in Zookeeper and similar (described [here](https://docs.mesosphere.com/1.7/usage/managing-services/uninstall/)). | ||
For instance, for cassandra: | ||
|
||
``` | ||
docker run mesosphere/janitor /janitor.py -r cassandra-role -p cassandra-principal -z dcos-service-cassandra | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,66 @@ | ||
# Useful links | ||
|
||
## Prerequisites | ||
|
||
DNS must be set up to use the Amazon DNS. | ||
The openvpn starting script will do this if you're on linux; on mac you might need to manually add the Amazon DNS address, which is the base of the VPC address range plus two. | ||
For example, if your VPC has CIDR 172.20.0.0/20, | ||
then your nameserver should be 172.20.0.2. | ||
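The nameserver address can be derived from the CIDR. A minimal sketch, assuming the CIDR is written with its network base address (as in the example above); `vpc_nameserver` is a hypothetical helper name:

```shell
# Print the Amazon DNS address for a VPC CIDR: the network base address plus two.
# Assumes the CIDR is given with its base address, e.g. 172.20.0.0/20.
vpc_nameserver() {
  cidr="$1"
  base="${cidr%/*}"      # strip the prefix length -> 172.20.0.0
  last="${base##*.}"     # last octet              -> 0
  prefix="${base%.*}"    # first three octets      -> 172.20.0
  echo "${prefix}.$((last + 2))"
}

vpc_nameserver "172.20.0.0/20"   # prints 172.20.0.2
```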
|
||
## Getting on the VPN | ||
|
||
The VPN on the kixi cluster is | ||
|
||
vpn.mastodonc.net | ||
|
||
(but the public IP address for the <vpc>-vpn box would also work) | ||
|
||
## DC/OS Links | ||
|
||
These are all set up by the DC/OS installation process, and live behind an nginx proxy. | ||
|
||
DC/OS console | ||
|
||
http://<cluster-name>-masters.<vpc>-vpc.kixi | ||
|
||
Mesos console | ||
|
||
http://<cluster-name>-masters.<vpc>-vpc.kixi/mesos | ||
|
||
Marathon console | ||
|
||
http://<cluster-name>-masters.<vpc>-vpc.kixi/marathon | ||
|
||
Exhibitor | ||
|
||
http://<cluster-name>-masters.<vpc>-vpc.kixi/exhibitor | ||
|
||
For Cassandra API calls (and possibly other services when set up via the dcos cli) | ||
|
||
http://<cluster-name>-masters.<vpc>-vpc.kixi/service/cassandra/ | ||
|
||
|
||
## Monitoring, alerting | ||
|
||
Logstash DNS: for posting to logstash, internal DNS | ||
|
||
logstash.<vpc>-vpc.kixi | ||
|
||
Kibana | ||
|
||
ELB link, need to create DNS | ||
|
||
|
||
Influxdb: for posting to influx, internal DNS | ||
|
||
|
||
influxdb.<vpc>-vpc.kixi | ||
|
||
Icinga2 | ||
|
||
https://alerts.mastodonc.net/icingaweb2/dashboard | ||
|
||
|
||
Grafana | ||
|
||
https://grafana.mastodonc.net/ |
Review comment: wants a link :)