This project demonstrates Kubernetes cluster logging integrated with Quine.
Discover the series of related events that leads to a Kubernetes autoscaling operation by connecting thatDot Quine to streaming log data.
- Is the scale operation due to a need for optimization?
- Is the scale operation due to a bug such as a memory leak?
- Is the scale operation due to high traffic on a popular container service?
- Is the scale operation due to high traffic from possible unusual, hacker-like activity?
- Should the operations leading to the autoscaling be shut down?
- Kind - A tool for running local Kubernetes clusters.
- Kubernetes - Container Orchestration System.
- Fluentbit - A scalable logging and metrics processor and forwarder.
- Kafka - A platform for handling real-time data feeds, used here to stream log messages.
- Quine - Streaming graph system that connects to the log data streams and builds high-volume data into a stateful graph for analysis.
Below are the outcomes from the project.
Below is sample data used to form the Quine "standing query".
Here is a 📦 link to a compressed sample data file exported from an active Kafka log stream.
Here are sample log file snippets exported from an active Kafka log stream.
{
"@timestamp": 1646239194.285511,
"log": "2022-03-02T16:39:18.641434142Z stdout F {\"@timestamp\":1646239156.170701,\"log\":\"2022-03-02T16:39:09.141771216Z stdout F {\\\"@timestamp\\\":1646239049.969076,\\\"log\\\":\\\"2022-03-02T15:40:09.606002004Z stderr F I0302 15:40:09.605799 1 controllermanager.go:574] Started \\\\\\\"horizontalpodautoscaling\\\\\\\"\\\"}\"}"
}
{"@timestamp":1646644110.080202,"log":"2022-03-07T09:08:30.051273291Z stderr F 2022-03-07 09:08:30.051054 W | etcdserver: read-only range request \"key:\\\"/registry/horizontalpodautoscalers/\\\" range_end:\\\"/registry/horizontalpodautoscalers0\\\" count_only:true \" with result \"range_response_count:0 size:8\" took too long (158.268848ms) to execute"}
Quine configurations, outputs, and findings are documented here.
This project is based on the sample projects in the "References" below, extended with Fluentbit, Kafka, and Quine. Details are below.
The complete stack is set up on a single-node Kubernetes cluster hosted on Ubuntu. This is for demonstration purposes only; in production this would be set up on a multi-node Kubernetes cluster.
It can be set up on any cloud-based VM or locally on a *nix-compatible OS.
For the example provisioning code below to work, an apt-based package manager is needed. The commands can easily be ported to whatever package manager is in use.
Kind is used to manage the single-node cluster in this project.
- Install Kind from binaries on Linux. Once complete, you will have a working version of Kind on your server, which can be used to manage a sample cluster on a single-node instance of Kubernetes.
curl -Lo ./kind https://kind.sigs.k8s.io/dl/v0.11.1/kind-linux-amd64
chmod +x ./kind
mv ./kind /some-dir-in-your-PATH/kind
- Create a sample Kubernetes cluster using Kind:
kind create cluster --name hpa --image kindest/node:v1.18.4
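If you need to customize the node (extra port mappings, mounts, or a pinned image), an equivalent Kind config file can be used instead of the flags. This is a sketch; the `v1alpha4` config API matches Kind v0.11.x:

```yaml
# kind-config.yaml - a single control-plane node, same image as the command above
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
    image: kindest/node:v1.18.4
```

Then create the cluster with kind create cluster --name hpa --config kind-config.yaml.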
- Set up the logging namespace in Kubernetes. This is used by a number of containers:
kubectl create namespace logging
This setup is based on this Fluentbit guide:
kubectl -n logging create -f https://raw.githubusercontent.com/fluent/fluent-bit-kubernetes-logging/master/fluent-bit-service-account.yaml
kubectl -n logging create -f https://raw.githubusercontent.com/fluent/fluent-bit-kubernetes-logging/master/output/kafka/fluent-bit-configmap.yaml
kubectl -n logging create -f https://raw.githubusercontent.com/fluent/fluent-bit-kubernetes-logging/master/output/kafka/fluent-bit-ds.yaml
The Kafka setup is based on the Strimzi project, "Kafka on Kubernetes in a few minutes".
kubectl -n logging create -f 'https://strimzi.io/install/latest?namespace=logging'
kubectl -n logging apply -f https://strimzi.io/examples/latest/kafka/kafka-persistent-single.yaml
kubectl -n logging wait kafka/my-cluster --for=condition=Ready --timeout=300s
Find the Kafka Service name and port on Kubernetes and store for later use.
Hint: it will be something like "my-cluster-kafka-bootstrap:9092"; Kafka generally runs on port 9092.
kubectl -n logging get svc
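The bootstrap address can also be pulled out of the service listing mechanically. A sketch, using a stand-in line in place of the live kubectl output so the pipeline can be tried offline:

```shell
# Stand-in for one line of `kubectl -n logging get svc` output
svc_output='my-cluster-kafka-bootstrap   ClusterIP   10.96.12.34   <none>   9091/TCP,9092/TCP,9093/TCP   5m'

# Keep the service name and append the standard Kafka port
broker=$(printf '%s\n' "$svc_output" | awk '/kafka-bootstrap/ {print $1 ":9092"}')
echo "$broker"   # my-cluster-kafka-bootstrap:9092
```

Against the live cluster this becomes: kubectl -n logging get svc | awk '/kafka-bootstrap/ {print $1 ":9092"}'.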
Set the above Kafka Service name and port in the Fluentbit configuration.
kubectl -n logging edit configmap fluent-bit-config
- Set the Brokers entry to Brokers=my-cluster-kafka-bootstrap:9092.
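For reference, the kafka [OUTPUT] section of the ConfigMap should end up looking roughly like this (the broker and topic values are this project's examples; the other keys in your ConfigMap stay as they are):

```
[OUTPUT]
    Name        kafka
    Match       *
    Brokers     my-cluster-kafka-bootstrap:9092
    Topics      ops.kube-logs-fluentbit.stream.json.001
```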
kubectl -n logging rollout restart ds/fluent-bit
The command below creates the Quine service in our Kubernetes cluster from the open-source thatdot/quine Docker image.
kubectl -n logging create -f ./src/kubernetes/quine/quine-service-deployment.yaml
Find the topic name which Fluentbit will use for sending data and store it for later use when configuring the Quine ingest; see the topics= setting.
kubectl -n logging get pods -l k8s-app=fluent-bit-logging --no-headers -o custom-columns=NAME:metadata.name | xargs kubectl -n logging logs | grep output:kafka
Hint: Something like "ops.kube-logs-fluentbit.stream.json.001"
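The topic name can also be extracted from the Fluentbit startup logs mechanically. A sketch using a stand-in log line, since the exact wording varies by Fluentbit version (adjust the pattern to what the grep output:kafka command shows on your cluster):

```shell
# Stand-in for a Fluentbit kafka-output startup log line
log_line='[2022/03/02 15:40:09] [ info] [output:kafka:kafka.0] brokers=my-cluster-kafka-bootstrap:9092 topics=ops.kube-logs-fluentbit.stream.json.001'

# Capture the value after "topics=" up to the next space
topic=$(printf '%s\n' "$log_line" | sed -n 's/.*topics=\([^ ]*\).*/\1/p')
echo "$topic"   # ops.kube-logs-fluentbit.stream.json.001
```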
Set the above topic name in the --topic entry of the command below. You should see a stream of data coming from Fluentbit.
kubectl -n logging run kafka-consumer -ti --image=quay.io/strimzi/kafka:0.27.1-kafka-3.0.0 --rm=true --restart=Never -- bin/kafka-console-consumer.sh --bootstrap-server my-cluster-kafka-bootstrap:9092 --topic ops.kube-logs-fluentbit.stream.json.001 --from-beginning
Using the above topic name, set up a Quine ingest.
NOTE: This can be done via the API or through the Quine web UI, using an ingress server to proxy requests to Quine.
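A sketch of what creating the ingest via the API could look like. The field names follow Quine's KafkaIngest request body; the service hostname quine, the ingest name kafka-logs, and the Cypher query are assumptions to adapt to your deployment:

```shell
# Quine Kafka ingest request body (KafkaIngest); the Cypher query here simply
# copies each record's fields onto a node keyed by idFrom($that) - adjust to taste.
payload='{
  "type": "KafkaIngest",
  "topics": ["ops.kube-logs-fluentbit.stream.json.001"],
  "bootstrapServers": "my-cluster-kafka-bootstrap:9092",
  "format": {
    "type": "CypherJson",
    "query": "MATCH (n) WHERE id(n) = idFrom($that) SET n = $that"
  }
}'

# To create the ingest from inside the cluster (not run here):
#   curl -X POST -H 'Content-Type: application/json' \
#        -d "$payload" http://quine:8080/api/v1/ingest/kafka-logs

# Sanity-check that the payload is well-formed JSON before posting it
printf '%s' "$payload" | python3 -m json.tool > /dev/null && echo "payload OK"
```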
See the Makefile for helpful commands via make help. These should help manage the running Kubernetes cluster for this project.
- YouTube: Kubernetes pod autoscaling for beginners with kind - Uses Kind.
- Github: Kubernetes pod autoscaling for beginners with kind
- K8s System Logs - System component logs record events happening in cluster, which can be very useful for debugging. You can configure log verbosity to see more or less detail. Logs can be as coarse-grained as showing errors within a component, or as fine-grained as showing step-by-step traces of events (like HTTP access logs, pod state changes, controller actions, or scheduler decisions).
- K8s JSON Log Format - The --logging-format=json flag changes the format of logs from the klog native format to JSON.
- K8s API Server Configuration - The Kubernetes API server validates and configures data for the API objects, which include pods, services, replicationcontrollers, and others.
- Kind Runtime Config - Options for configuring the Kubernetes API server runtime when creating a cluster with Kind.
- JSON Lines Format - JSON Lines text format, also called newline-delimited JSON. JSON Lines is a convenient format for storing structured data that may be processed one record at a time. It works well with unix-style text processing tools and shell pipelines. It's a great format for log files. It's also a flexible format for passing messages between cooperating processes.