OpenPaaS

A single command, openpaas sync --config.file config.yaml, sets up a full "Hashistack" cluster consisting of:

  • Infrastructure (currently supports Hetzner; other clouds coming):
    • Load Balancer
    • Firewalls
    • Private Networks
    • Servers
    • Volumes
  • Hashistack cluster, secured with TLS & token ACL:
    • Consul (networking, service discovery, service mesh)
    • Nomad (compute scheduling)
    • Vault (secrets management)
    • Fabio LB ingress.
  • Observability:
    • Grafana
    • Loki (logs)
    • Tempo (traces)
    • Prometheus (metrics)
    • Mimir (coming soon: long-term metrics storage)

Networking is set up so that your bastion host(s) (the allowed_ips list) have full access to the cluster, while Cloudflare IPs can only reach port 443, which is exposed by the Fabio LB.

Below is a reference architecture of what is created and how it should be used. OpenPaaS concerns itself with the left side of the diagram; the GitOps repo and operator are for you to implement (we may build something for this later):

[Hashistack reference architecture diagram]

Pre-requisites

Pre-requisite software

Your machine/operator node will need the following pre-installed (openpaas will check for presence before execution):

  • nomad
  • consul
  • vault
  • ansible (important: version 2.13 or higher; please verify, as some package managers install older versions)
  • terraform
  • cfssl & cfssljson

You probably also want to use git secret to protect the [base_dir]/secrets directory in the generated files. Additionally, direnv will make life easier, as openpaas genenv --config.file [config] will generate a direnv-compatible .envrc file for you.

To ensure other environment variables are preserved when running openpaas genenv, just add this line to your existing .envrc file:

### GENERATED CONFIG BELOW THIS LINE, DO NOT EDIT!
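For example, a minimal .envrc might look like this (a sketch; the values are placeholders, and the marker line must match exactly):

```shell
# Example .envrc -- values below are placeholders, replace with your own.
# Keep your own variables above the marker; `openpaas genenv` rewrites
# everything below it.
export S3_ENDPOINT="s3.example.com"   # root domain, no https:// or path
export S3_ACCESS_KEY="replace-me"
export S3_SECRET_KEY="replace-me"
export HETZNER_TOKEN="replace-me"

### GENERATED CONFIG BELOW THIS LINE, DO NOT EDIT!
```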

Other requirements

  • An SSL certificate (certificate and key file) uploaded to your Hetzner project.
    • You will need the ID of the SSL certificate, which can be extracted with Hetzner's hcloud CLI: hcloud certificate list
  • An SSH key and project already setup in Hetzner (when using Hetzner).
  • The following four environment variables set in your environment (the S3 settings can point at any S3-compatible store, including Cloudflare R2; they are used for the observability stack's long-term storage):
    • S3_ENDPOINT (root domain, without an https:// prefix or / path suffix)
    • S3_ACCESS_KEY
    • S3_SECRET_KEY
    • HETZNER_TOKEN (generated from your Hetzner account)
  • A config.yaml file. Review the similarly named file in the root of this repository for the available options, and ensure that the IP of your machine/bastion host is in the allowed_ips section.
  • S3-compatible buckets pre-created as per your config.yaml.
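As a rough illustration of the shape such a file takes (allowed_ips and management_domain appear elsewhere in this README, but the exact key names and nesting here are assumptions; the sample config.yaml in the repository root is authoritative):

```yaml
# Illustrative sketch only -- see the sample config.yaml in the repo root
# for the real schema; key names below are assumptions.
allowed_ips:                    # bastion/operator IPs with full cluster access
  - "203.0.113.10/32"
management_domain: example.com  # grafana.<domain> and consul.<domain> ingresses
```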

Setup

Once all of the above steps are complete, just run openpaas sync --config.file [config file]. If no cluster exists, one will be set up for you; if one exists, it will be synced with your config, bringing the entire cluster in line.

IMPORTANT! If you intend to use an SSH key other than your system-default one, please run the following first:

eval "$(ssh-agent)" && ssh-add <path-to-key>

Steps requiring manual intervention

DNS

  • Create an A-record for your domain pointing at the public IP of the generated Load Balancer.
  • Create CNAME records for the grafana and consul sub-domains.

By default, we will set up ingresses for:

  • grafana.[management_domain]
  • consul.[management_domain]

Once you have set up DNS records for these, you'll be able to log in. Please change the default Grafana password (admin/admin) immediately!

The Consul login is consul/[CONSUL ROOT-TOKEN]; the root token can be found in your .envrc after running openpaas genenv.

Observability

Add data sources in Grafana

Loki: http://loki-http.service.consul:3100

Prometheus: http://prometheus.service.consul:9090

Tempo: http://tempo.service.consul:3200
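If you prefer file-based provisioning over clicking through the UI, the same three data sources can be declared in a standard Grafana provisioning file (a sketch; the file path is an assumption, the URLs are the ones listed above):

```yaml
# e.g. /etc/grafana/provisioning/datasources/hashistack.yaml (path assumed)
apiVersion: 1
datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://loki-http.service.consul:3100
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus.service.consul:9090
  - name: Tempo
    type: tempo
    access: proxy
    url: http://tempo.service.consul:3200
```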

Add Nomad & Node dashboards

  • Nomad: add dashboard ID 15764
  • Nodes: add dashboard ID 12486

Link Loki to Tempo traces

This linking assumes your app is set up like the demo-app example; importantly, logs must be in JSON format. Add the following derived fields:

Name: trace_id
Regex: "trace_id":"([A-Za-z0-9]+)"   (this regex assumes JSON-formatted logs)
Query: ${__value.raw}
Url label: Trace
Internal link: Tempo
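The same derived field can also be provisioned from a file instead of the UI (a sketch; the tempo datasourceUid is an assumption and must match the UID of your Tempo data source; note that Grafana provisioning files require $$ to escape a literal $):

```yaml
apiVersion: 1
datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://loki-http.service.consul:3100
    jsonData:
      derivedFields:
        - name: trace_id
          matcherRegex: '"trace_id":"([A-Za-z0-9]+)"'
          url: '$${__value.raw}'   # $$ escapes the literal $ in provisioning files
          urlDisplayLabel: Trace
          datasourceUid: tempo     # assumption: must match your Tempo data source UID
```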

Nomad jobs, using consul & vault

There are examples in the examples/ folder of this repo.

Use Vault from a Nomad job (example)

To create app-specific policies:

access.hcl:

path "secret/*" { # some path under secrets
    capabilities = ["read"]
}

Then write the policy to Vault:

vault policy write backend access.hcl

In the Nomad task definition:

      vault {
        policies = ["backend"] # policy given above

        change_mode   = "signal"
        change_signal = "SIGUSR1"
      }
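With that policy attached, the task can render secrets into its environment with a template stanza; the secret path and key below are assumptions for illustration (the syntax shown targets Vault's KV v2 engine):

```hcl
      template {
        # Path and field names are illustrative; adjust to your Vault layout.
        data        = <<EOF
{{ with secret "secret/data/backend/config" }}
DB_PASSWORD={{ .Data.data.db_password }}
{{ end }}
EOF
        destination = "secrets/app.env"
        env         = true   # export the rendered keys as environment variables
      }
```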

Kill orphaned Nomad mounts when taking down a client node

export NOMAD_DATA_ROOT=«Path to your Nomad data_dir»

for ALLOC in "$NOMAD_DATA_ROOT"/alloc/*; do
  for JOB in $(ls "$ALLOC" | grep -v alloc); do
    umount "$ALLOC/$JOB/secrets"
    umount "$ALLOC/$JOB/dev"
    umount "$ALLOC/$JOB/proc"
    umount "$ALLOC/$JOB/alloc"
  done
done

TODO

  • Harden servers
    • Add SSH Key login
    • Setup UFW firewall rules
    • Template to allow hosts in cluster access to all ports
    • Restart firewall
    • Disable password login
    • Run firewall script
  • Install all required software
  • Consul setup
    • Setup cluster secrets
    • Template configs
    • Add configs to cluster
    • Systemctl script & startup
    • Verify cluster setup
    • Automate consul ACL bootstrap
    • Allow anonymous DNS access and default Consul as DNS for .consul domains
  • Nomad setup
    • Setup cluster secrets
    • Template configs
    • Add configs to cluster
    • Systemctl scripts and startup
  • Set Nomad & Consul bootstrap_expect based on inventory
  • Vault setup
    • Setup cluster secrets
    • Template configs
    • Systemctl script & startup
    • Auto-unlock with script/ansible/terraform
    • Integrate with Nomad
  • Observability
    • Server health
      • CPU monitor
      • RAM usage monitor
      • HD usage monitor
    • Nomad metrics
    • Consul metrics
    • Log aggregation of jobs
    • Metrics produced by jobs
    • Job tracing
    • Host monitoring (disk, cpu, memory)
  • Networking
    • Understand service mesh/ingress etc from consul
    • Ingress to outside world with http/https
    • Use consul as DNS
    • Pull private docker images
    • Observability ingress
    • Auto-accept server signatures on first time connect
  • Overall setup
    • Terraform var generation
    • Generate Ansible inventory from Terraform output
  • Grafana/Dashboards
    • Dashboards
      • Consul health
      • Nomad health
      • Vault health
      • Host health
    • SLO templates
      • Web/api service
      • Headless backend service
    • Alerts
      • Consul health
      • Nomad health
      • Vault health
      • Host health (CPU, memory, disk)

