The purpose of this repo is to:
- Automate the provisioning of a Kubernetes cluster in GKE (only), using Terraform.
- Provision the Elastic Stack (Elasticsearch, Kibana) using ECK (if requested) on the same K8s cluster.
- Deploy metricbeat/filebeat/standalone agent on the K8s cluster.
- Stress-test the cluster by deploying multiple pods in various namespaces using a CLI tool.
- Collect statistics of the target Elasticsearch indices (storage size, document count) and the execution time of query 12 by running the `es_bench` script.
- configured gcloud SDK
- kubectl >= 1.7.0
- terraform >= 0.14
- helm > 2.4.1
- golang >= 1.17.0
- jq >= 1.6
- an Elasticsearch cluster reachable from GCP (only in case `provision_elasticsearch` is set to `false`)
```
cd infra
terraform init
```
- Set the Google Cloud `project_id`, the K8s `cluster_name`, the K8s nodes `machine_type` and the cluster `region` in the `terraform.tfvars` file. For `project_id`, `region` and `machine_type` the defaults can be used; `cluster_name` has to be unique.
- Configure the Elasticsearch cluster in the `terraform.tfvars` file. There are two options available:
  - In case you want a new Elasticsearch cluster to be provisioned using ECK, `provision_elasticsearch` must be set to `true`. In that case the variables `es_password` and `es_host` can be left empty, `es_user` should keep the default value and `imageTag` should be set to the required version.
  - In case you already have an Elasticsearch cluster deployed and reachable from GCP (e.g. Elastic Cloud), `provision_elasticsearch = false` must be set, along with the right values for the variables `es_host`, `es_user`, `es_password` and `imageTag`.
- Set the size of the cluster via the variables `gke_num_nodes` and `gke_max_num_nodes` inside the `terraform.tfvars` file. As the cluster is regional with 3 zones per region, the value set in those variables results in 3x that number of nodes being created (`gke_num_nodes` * number of zones in the region). `gke_max_num_nodes` enables cluster autoscaling in case more resources are needed.
- Configure monitoring. You can choose whether the cluster is monitored by metricbeat/filebeat or by elastic-agent in standalone mode by setting the appropriate values in the variables `deployBeat` and `deployAgent`. Both options can be used.

```
terraform apply
```
- Configure `kubectl` by running:

```
gcloud container clusters get-credentials <cluster-name> --zone europe-west1 --project elastic-obs-integrations-dev
```

  The correct command can be obtained from Kubernetes Engine in GCP.
- Check the cluster with `kubectl get node` and `kubectl get pod -A`.
- Bring up a Kubernetes cluster with 3 nodes and no autoscaling, without provisioning Elasticsearch and without monitoring.
- Example configuration:
```
project_id              = "elastic-obs-integrations-dev"
region                  = "europe-west1"
cluster_name            = "test-k8s-cluster-simple"
machine_type            = "e2-standard-4"
gke_num_nodes           = 1
gke_max_num_nodes       = 1
provision_elasticsearch = false
es_password             = ""
es_user                 = "elastic"
es_host                 = ""
deployBeat              = false
deployAgent             = false
imageTag                = ""
namespace               = "kube-system"
```
- Bring up a Kubernetes cluster with 3 nodes and autoscaling up to 18 nodes, without provisioning Elasticsearch and with Beats monitoring version 8.3.0. Prerequisite: an existing Elastic Stack.
- Example configuration:
```
project_id              = "elastic-obs-integrations-dev"
region                  = "europe-west1"
cluster_name            = "test-k8s-cluster-autoscaling-beats"
machine_type            = "e2-standard-4"
gke_num_nodes           = 1
gke_max_num_nodes       = 6
provision_elasticsearch = false
es_password             = "mypassword"
es_user                 = "elastic"
es_host                 = "https://bxxxxxed.europe-west1.gcp.cloud.es.io:9243"
deployBeat              = true
deployAgent             = false
imageTag                = "8.3.0"
namespace               = "kube-system"
```
- Bring up a Kubernetes cluster with 3 nodes and autoscaling up to 18 nodes, with Elasticsearch provisioning and with elastic-agent monitoring version 8.3.0.
- Example configuration:
```
project_id              = "elastic-obs-integrations-dev"
region                  = "europe-west1"
cluster_name            = "test-k8s-cluster-autoscaling-elasticsearch-agent"
machine_type            = "e2-standard-4"
gke_num_nodes           = 1
gke_max_num_nodes       = 6
provision_elasticsearch = true
es_password             = ""
es_user                 = "elastic"
es_host                 = ""
deployBeat              = false
deployAgent             = true
imageTag                = "8.3.0"
namespace               = "kube-system"
```
NOTE: The command may end with an error like the following, but everything should still be deployed successfully:
```
Error: Kubernetes cluster unreachable: Get "https://35.239.222.162/version?timeout=32s": dial tcp 35.239.222.162:443: connect: connection refused
```
```
cd scripts
go build
./stress_test_k8s --kubeconfig=/Users/<username>/.kube/config --deployments=20 --namespaces=10 --podlabels=4 --podannotations=4
```
The above command will create 10 namespaces and deploy one demo nginx deployment in each, with as many replicas (20) as indicated by the `--deployments` flag. Each pod will have 4 labels and 4 annotations.
By default, no logs are produced. If you want your pods to create logs, run the stress_test tool with the `--logs` argument:

```
./stress_test_k8s --kubeconfig=/Users/andreasgkizas/.kube/config --deployments=1 --namespaces=2 --podlabels=2 --podannotations=2 --logs --periodoflogs 2
```

`--periodoflogs` (in seconds): default value is 1 second.
#### Prerequisite: Existence of 2 Elasticsearch clusters, one with TSDB enabled for the metricbeat index and one without.
In order to get a quick estimation of the status of the 2 Elasticsearch indices (one simple and one TSDB-enabled), execute `scripts/es_bench`. For now the script can only be executed manually. More specifically, the script provides the following information about the cluster:
- `pri.store.size`
- `docs.count`

This information is also available through the `_cat/indices?v=true&s=index` API. In addition, the script executes q12, which is considered "expensive" for our use case. The query is executed 20 times sequentially against each ES cluster, and the median of the execution times is reported.
Execution example:

```
TSDB_ES_URL="https://35.157.42.42:9200/" TSDB_ES_PASS="passpasstsdb" SIMPLE_ES_URL="https://104.199.42.42:9200/" SIMPLE_ES_PASS="passpasssimple" TSDB_INDEX=".ds-metricbeat-tsdb-8.3.0-2022.05.24-000001" SIMPLE_INDEX=".ds-metricbeat-8.3.0-2022.05.24-000001" go run main.go
```

Example output:
```
Executing against new ES cluster
Client: 8.2.0 Server: 8.3.0-SNAPSHOT
index name: .ds-metricbeat-tsdb-8.3.0-2022.05.24-000001
pri.store.size: 5.8gb
docs.count: 25635493
median query time is: 2ms
Executing against new ES cluster
Client: 8.2.0 Server: 8.3.0-SNAPSHOT
index name: .ds-metricbeat-8.3.0-2022.05.24-000001
pri.store.size: 23gb
docs.count: 39051417
median query time is: 333ms
```