Merged in dev/gideon/initial_linode_ems_terraform_2021.10.29 (pull request elastic#287)

Initial terraform for Linode EMS cluster

* Add terraform script to deploy a variable number of EMS servers and one
  Redis server in Linode. Servers are protected by Linode firewalls.
* Deploy a load balancer in front of the EMS servers. Currently at the TCP
  layer because Linode does not support HTTPS between the LB and nodes.
* Some updates to EMS rc.local to support AWS or Linode hosting.
* Add ExecCondition to EMS systemd script to abort if env.sh is not available.
* Also: clean up installations of the awscli apt package. It installs v1, and
  v2 is installed by provision-common.sh.
* Get the TLS private key from AWS Secrets Manager and the cert chain from tfvars.

Approved-by: Can Yildiz
Gideon Avida committed Nov 5, 2021
1 parent d99cf2a commit e7bc98f
Showing 12 changed files with 701 additions and 51 deletions.
136 changes: 92 additions & 44 deletions instance-files/ems/etc/rc.local
@@ -5,17 +5,48 @@ set -e

 EMS_DIR="/usr/local/engageli-media-server"
 ENV_FILE=/etc/engageli/env.sh
+HOSTING_ENV_FILE=/etc/engageli/hosting.env
+MISC_EMS_CONFIG_FLAGS=()
+
 mkdir -p $(dirname $ENV_FILE)

-REGION=$(ec2metadata --availability-zone | sed "s/.$//")
-INSTANCE_ID=$(ec2metadata --instance-id)
+if [ -f $HOSTING_ENV_FILE ]
+then
+    . $HOSTING_ENV_FILE
+fi
+
+if [[ "$HOSTING" == "linode" ]]
+then
+    # env file is created by terraform file provisioner.
+    # wait until it's available.
+    COUNT=0
+    while [ ! -f $ENV_FILE ]
+    do
+        if (( $COUNT > 100 ))
+        then
+            echo "Giving up on $ENV_FILE"
+            exit 1
+        fi
+        COUNT=$(( COUNT + 1 ))
+        echo "==> $ENV_FILE not there. COUNT=$COUNT"
+        sleep 5
+    done
+    REGION="" # FIXME: currently region implies AWS
+    # Currently linode based EMS uses different host name than stack
+    MISC_EMS_CONFIG_FLAGS+=(--client-uses-hostname)
+else
+    # default to AWS - AKA amazon-ebs (packer speak)
+    HOSTING=${HOSTING:="amazon-ebs"}
+    REGION=$(ec2metadata --availability-zone | sed "s/.$//")
+    INSTANCE_ID=$(ec2metadata --instance-id)

-# save tags to env file
-aws ec2 describe-tags \
-    --region $REGION \
-    --filters "Name=resource-id,Values=$INSTANCE_ID" \
-    --query "Tags[*].{Key:Key,Value:Value}" \
-    --output text | sed "s/:/_/g" | sed "s/\s/=/" > $ENV_FILE
+    # save tags to env file
+    aws ec2 describe-tags \
+        --region $REGION \
+        --filters "Name=resource-id,Values=$INSTANCE_ID" \
+        --query "Tags[*].{Key:Key,Value:Value}" \
+        --output text | sed "s/:/_/g" | sed "s/\s/=/" > $ENV_FILE
+fi

 # set default:
 ClusterID=""
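
Note on the tag export in this hunk: with `--output text`, `describe-tags` is expected to emit one tab-separated key/value row per tag, and the two `sed` expressions turn each row into a `KEY=value` line for `$ENV_FILE`. The sketch below replays the same expressions on a made-up sample. The helper name `tags_to_env` and the tag values are illustrative assumptions, not part of the commit, and `\s` assumes GNU sed.

```shell
#!/usr/bin/env bash
# tags_to_env is a hypothetical wrapper around the sed pipeline used in rc.local.
tags_to_env() {
    # 1) replace every ':' with '_' (e.g. aws:foo becomes aws_foo)
    # 2) replace the first whitespace character (the tab after the key) with '='
    sed "s/:/_/g" | sed "s/\s/=/"
}

# Sample rows in Key<TAB>Value form; the values are invented for illustration.
printf 'StackEnv\tdev\naws:autoscaling:groupName\tems-asg\n' | tags_to_env
# prints (with GNU sed):
#   StackEnv=dev
#   aws_autoscaling_groupName=ems-asg
```

Two caveats follow from the expressions themselves: the first one also rewrites colons inside values, and a value containing spaces would not survive being sourced unquoted.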
@@ -25,21 +56,27 @@ ClusterID=""
 # Add authorized users
 /usr/local/engageli-utils/populate-os-users.sh -e $StackEnv

-# Wait for local and public IPs to be assigned
-# This (and other things) would fail if we ever disable public IP
-IPS_ASSIGNED=false
-for (( i = 0; i < 12; i++ )); do
-    LOCAL_IP=$(ec2metadata --local-ipv4)
-    PUBLIC_IP=$(ec2metadata --public-ipv4)
-    if [[ "$LOCAL_IP" =~ ^[0-9]+ ]] && [[ "$PUBLIC_IP" =~ ^[0-9]+ ]]; then
-        IPS_ASSIGNED=true
-        break
+if [[ "$HOSTING" == "amazon-ebs" ]]
+then
+    AUTH_TOKEN_SECRET="asm:$EMS_AUTH_SEC"
+    STREAM_TOKEN_SECRET="asm:$EMS_STREAM_SEC"
+
+    # Wait for local and public IPs to be assigned
+    # This (and other things) would fail if we ever disable public IP
+    IPS_ASSIGNED=false
+    for (( i = 0; i < 12; i++ )); do
+        LOCAL_IP=$(ec2metadata --local-ipv4)
+        PUBLIC_IP=$(ec2metadata --public-ipv4)
+        if [[ "$LOCAL_IP" =~ ^[0-9]+ ]] && [[ "$PUBLIC_IP" =~ ^[0-9]+ ]]; then
+            IPS_ASSIGNED=true
+            break
+        fi
+        sleep 10
+    done
+    if ! $IPS_ASSIGNED; then
+        echo "IPs are not assigned to instance after 2 minutes. Exiting with error!"
+        exit 1
     fi
-    sleep 10
-done
-if ! $IPS_ASSIGNED; then
-    echo "IPs are not assigned to instance after 2 minutes. Exiting with error!"
-    exit 1
 fi

 # need to remove trailing . from FQDN
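
Note: the two polling loops in this file (waiting for `$ENV_FILE` on Linode, waiting for IPs on AWS) share the same retry-with-sleep shape. A minimal sketch of how that could be factored into one helper follows. `retry_until` and its argument order are hypothetical and not part of this commit.

```shell
#!/usr/bin/env bash
# retry_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS tries have been made.
# Returns 0 on the first success, 1 if every attempt failed.
retry_until() {
    local attempts="$1" delay="$2" i=1
    shift 2
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # No need to sleep after the final failed attempt.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Example (not executed here): 12 tries, 10 seconds apart, for the env file.
# retry_until 12 10 test -f /etc/engageli/env.sh || { echo "Giving up"; exit 1; }
```

Because the command is tested inside `if`, a failing attempt does not trip `set -e` in the calling script.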
@@ -58,42 +95,53 @@ fi

 # TODO(?): Move config to pre-start script?
 /usr/local/engageli-media-server/utils/generate_local_config.js \
-    -H $FQDN \
-    --ip $LOCAL_IP \
-    --announced-ip $PUBLIC_IP \
-    --auth-token-secret asm:$EMS_AUTH_SEC \
-    --stream-token-secret asm:$EMS_STREAM_SEC \
-    --base-path /ems$ClusterID \
-    --region $REGION \
+    -H "$FQDN" \
+    --ip "$LOCAL_IP" \
+    --announced-ip "$PUBLIC_IP" \
+    --auth-token-secret "$AUTH_TOKEN_SECRET" \
+    --stream-token-secret "$STREAM_TOKEN_SECRET" \
+    --base-path "/ems$ClusterID" \
+    --region "$REGION" \
+    ${MISC_EMS_CONFIG_FLAGS[*]} \
     ${REDIS_FLAGS[*]} \
     --force
 # Adding --force we stopped using EIPs since switching to autoscaling. Perhaps
 # move reading the IPs into the service?

-# FIXME: Better way to get certs?
+# FIXME: This will go away when we start using load balancers for linode
 # Get certificates using default nginx config
 # Only exec if certs do not already exist
-# if sudo [ -e /etc/letsencrypt/live/$FQDN/cert.pem ]
-# then
-# echo "Certificate already exists"
-# else
-# sudo sudo certbot --agree-tos --email 'administrator@engageli.com' --nginx -n -d $FQDN
-# fi
+if sudo [ -e /etc/letsencrypt/live/$FQDN/fullchain.pem ] && [[ "$HOSTING" == "linode" ]]
+then
+    echo "Certificate already exists"
+else
+    # TODO: maybe use "--manual --preferred-challenges dns" to skip the need of
+    # nginx? will require some work to get the DNS entry into route53 the first
+    # time...
+    # Ignore certbot errors (in case of rate limiting)
+    set +e
+    sudo certbot --agree-tos --email 'administrator@engageli.com' --nginx -n -d $FQDN
+    set -e
+fi
 # Disable nginx, and make sure that not running
 sudo systemctl disable nginx
 sudo systemctl stop nginx
 set +e
 sudo killall -9 nginx
 set -e

-sudo mkdir -p /etc/letsencrypt/live/$FQDN
-sudo openssl req \
-    -new -sha256 -nodes -batch -x509 \
-    -newkey rsa:4096 \
-    -keyout /etc/letsencrypt/live/$FQDN/privkey.pem \
-    -out /etc/letsencrypt/live/$FQDN/fullchain.pem \
-    -subj /CN=$FQDN \
-    -days +365
+if sudo [ ! -e /etc/letsencrypt/live/$FQDN/fullchain.pem ]
+then
+    echo "Generating self signed cert"
+    sudo mkdir -p /etc/letsencrypt/live/$FQDN
+    sudo openssl req \
+        -new -sha256 -nodes -batch -x509 \
+        -newkey rsa:4096 \
+        -keyout /etc/letsencrypt/live/$FQDN/privkey.pem \
+        -out /etc/letsencrypt/live/$FQDN/fullchain.pem \
+        -subj /CN=$FQDN \
+        -days +365
+fi

 # Ensure the directory exists
 mkdir -p $EMS_DIR/ems-static/
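
Note on the flag arrays: the `generate_local_config.js` call passes `${MISC_EMS_CONFIG_FLAGS[*]}` unquoted. That works for single-word flags such as `--client-uses-hostname`, but an element containing whitespace would be split into several arguments. A small sketch of the difference, using a hypothetical `count_args` helper and invented flag values (requires bash for arrays):

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper that reports how many arguments it received.
count_args() { echo "$#"; }

# Invented values; the second element deliberately contains a space.
FLAGS=(--client-uses-hostname "--label=media server")

# Unquoted [*]: every element is subject to word splitting.
count_args ${FLAGS[*]}      # prints 3

# Quoted [@]: each element stays exactly one argument.
count_args "${FLAGS[@]}"    # prints 2
```

With the flags currently added in this commit both forms behave the same; the quoted `[@]` form only matters once a flag value can contain spaces or glob characters.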
35 changes: 35 additions & 0 deletions linode/README.md
@@ -6,3 +6,38 @@
1. Optionally configure `linode.env` to replace command line options.
1. Run `provision-new-ems-instance.sh` to install and configure EMS on the node.
1. Update your engageli sandbox to set `ms.hostname` to the DNS from above.

## Experiment 2: use packer and terraform to do the thing.

* To build (from `packer` dir):
```
./build_and_distribute_ami.py --build --type ems -v v1.11.11-1026-dev --builder linode
```

* See `ems/variables.tf` for required and optional variables

## Experiment 3: add load balancing and clustering

1. Generate a certificate with certbot:\
`certbot certonly --manual --preferred-challenges dns -d linode.gideon.engageli-dev.com`\
To automate we can consider [certbot shell hooks](https://certbot.eff.org/docs/using.html?highlight=hooks#pre-and-post-validation-hooks)
or [Route53 plugin](https://certbot-dns-route53.readthedocs.io/en/stable/)
1. Upload the private key to AWS Secrets Manager
```
aws secretsmanager create-secret \
--name dev-linode-key \
--description "Private key for linode.gideon.engageli-dev.com" \
--secret-string file://privkey1.pem \
--tags "Key=FQDN,Value=linode.gideon.engageli-dev.com" "Key=StackName,Value=dev"
```
3. Set the cert and key variables in the `tfvars` file
```
lb_privkey_secret_name = "dev-linode-key"
lb_cert_pem = <<EOT
-----BEGIN CERTIFICATE-----
...
...
-----END CERTIFICATE-----
EOT
```
4. Run TF apply...
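
Step 3 above can be scripted so the certificate chain is not pasted by hand. A minimal sketch, assuming the chain from step 1 is available as a local PEM file: `write_lb_tfvars`, its arguments, and the file names in the usage line are hypothetical, while the two variable names are the ones shown in step 3.

```shell
#!/usr/bin/env bash
# write_lb_tfvars SECRET_NAME CERT_PEM_FILE OUT_FILE
# Writes the two load balancer variables, with the cert chain as a heredoc.
write_lb_tfvars() {
    local secret_name="$1" cert_file="$2" out_file="$3" line
    if [ ! -r "$cert_file" ]; then
        echo "cannot read $cert_file" >&2
        return 1
    fi
    {
        printf 'lb_privkey_secret_name = "%s"\n' "$secret_name"
        printf 'lb_cert_pem = <<EOT\n'
        # Copy line by line so the closing EOT always starts on its own line,
        # even if the PEM file has no trailing newline.
        while IFS= read -r line || [ -n "$line" ]; do
            printf '%s\n' "$line"
        done < "$cert_file"
        printf 'EOT\n'
    } > "$out_file"
}

# Usage (file names are placeholders):
# write_lb_tfvars dev-linode-key fullchain1.pem lb.tfvars
```

Only the certificate chain goes into the file; the private key stays in Secrets Manager and is referenced by name.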
