
kubequery powered by Osquery

kubequery is an Osquery extension that provides SQL-based analytics for Kubernetes clusters.

kubequery will be packaged as a Docker image available from Docker Hub. It is expected to be deployed as a Kubernetes Deployment, one per cluster. A sample deployment template is available here

The kubequery table schema is available here


Build

Go 1.16 and make are required to build kubequery. Run: make

The container image for the master branch will be available on Docker Hub:

docker pull uptycs/kubequery:latest

For production, tagged container images should be used instead of latest.


Deployment

kubequery.yaml is a template that creates the following Kubernetes resources:

kubequery is the Namespace that holds all of the namespaced resources.

kubequery-sa is a ServiceAccount that is associated with the kubequery deployment pod specification. The container uses the service account token to authenticate with the API server.

kubequery-clusterrole is a ClusterRole that allows get and list operations on all resources in the following API groups:

  • "" (core)
  • admissionregistration.k8s.io
  • apps
  • autoscaling
  • batch
  • networking.k8s.io
  • policy
  • rbac.authorization.k8s.io
  • storage.k8s.io

kubequery-clusterrolebinding is a ClusterRoleBinding that binds the cluster role with the service account.

kubequery-config is a ConfigMap that will be mounted inside the container as a directory. The contents of this config map should be similar to /etc/osquery. For example, kubequery.flags, kubequery.conf, etc. should be part of this config map.

kubequery is the Deployment that creates one replica pod. The container launched as part of the pod runs as a non-root user.
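The read-only permissions described for kubequery-clusterrole can be pictured as a single RBAC rule. The following is only a sketch built from the description above, not a copy of the shipped kubequery.yaml; the helper name build_cluster_role is hypothetical, and the actual template may split the rule differently.

```python
import json

# API groups listed in the description above; "" is the core group.
API_GROUPS = [
    "",
    "admissionregistration.k8s.io",
    "apps",
    "autoscaling",
    "batch",
    "networking.k8s.io",
    "policy",
    "rbac.authorization.k8s.io",
    "storage.k8s.io",
]


def build_cluster_role(name="kubequery-clusterrole"):
    """Build a ClusterRole manifest granting get/list on all resources
    in API_GROUPS. Hypothetical helper, for illustration only."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRole",
        "metadata": {"name": name},
        "rules": [
            {
                "apiGroups": API_GROUPS,
                "resources": ["*"],
                "verbs": ["get", "list"],
            }
        ],
    }


if __name__ == "__main__":
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(build_cluster_role(), indent=2))
```

Because only get and list are granted, the service account cannot modify or delete cluster resources.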

By default, pod resource requests and limits are set to 500m (half a core) of CPU and 200MB of memory. The kubequery.yaml file should be tweaked to suit your needs before applying:

kubectl apply -f kubequery.yaml

Check the status of the pod using the following command. The pod should be in the Running status.

kubectl get pods -n kubequery

Validate the installation was successful by first executing:

kubectl exec -it $(kubectl get pods -n kubequery -o jsonpath='{.items[0].metadata.name}') -n kubequery -- kubequeryi '.tables'

This should produce output similar to the following:

  => kubernetes_api_resources
  => kubernetes_cluster_role_binding_subjects
  => kubernetes_cluster_role_policy_rule
  => kubernetes_config_maps
  => kubernetes_cron_jobs
  => kubernetes_csi_drivers
  => kubernetes_csi_node_drivers
  => kubernetes_daemon_set_containers
  ...

Queries can be run using kubequeryi on the deployed container:

kubectl exec -it $(kubectl get pods -n kubequery -o jsonpath='{.items[0].metadata.name}') -n kubequery -- kubequeryi --line 'SELECT * FROM kubernetes_pods'

Pod logs can be viewed using:

kubectl logs $(kubectl get pods -n kubequery -o jsonpath='{.items[0].metadata.name}') -n kubequery
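When queries need to be run repeatedly, the kubectl commands above can be wrapped in a small script. The sketch below assumes kubectl is on the PATH and already configured for the cluster; the function names are hypothetical, and it uses only the kubectl and kubequeryi arguments shown above (without -it, since no interactive terminal is needed).

```python
import subprocess

NAMESPACE = "kubequery"


def first_pod_name(namespace=NAMESPACE):
    """Return the name of the first pod in the namespace, using kubectl."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace,
         "-o", "jsonpath={.items[0].metadata.name}"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def kubequeryi_argv(pod, query, namespace=NAMESPACE):
    """Build the argument list for running one query through kubequeryi."""
    return ["kubectl", "exec", pod, "-n", namespace,
            "--", "kubequeryi", "--line", query]


def run_query(query, namespace=NAMESPACE):
    """Run a query on the kubequery pod and return the raw text output."""
    pod = first_pod_name(namespace)
    result = subprocess.run(
        kubequeryi_argv(pod, query, namespace),
        check=True, capture_output=True, text=True,
    )
    return result.stdout
```

For example, run_query('SELECT * FROM kubernetes_pods') would return the same line-formatted text as the command above. Passing the arguments as a list avoids shell quoting issues with the jsonpath expression and the SQL string.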

FAQ

Should kubequery be used instead of Osquery in Kubernetes?

No. kubequery should be deployed as a Kubernetes Deployment, which means there will be one kubequery Pod running per Kubernetes cluster. Osquery should be deployed to every node in the cluster. Querying most Osquery tables from an ephemeral pod does not provide much value. The kubequery container image also runs as a non-root user, which means most of the Osquery tables will either return an error or partial data.

Why are some columns JSON?

Normalizing nested JSON data such as Kubernetes API responses would create an explosion of tables, so some of the columns in the Kubernetes tables are left as JSON. The data is eventually processed by SQLite within Osquery, and SQLite has very good JSON support.

For example, if run_as_user in the kubernetes_pod_security_policies table looks like the following:

{"rule": "MustRunAsNonRoot"}

To get the value of rule, the following query can be used:

SELECT value AS 'rule'
FROM kubernetes_pod_security_policies, json_tree(kubernetes_pod_security_policies.run_as_user)
WHERE key = 'rule';

+------------------+
| rule             |
+------------------+
| MustRunAsNonRoot |
+------------------+

json_each can be used to explode JSON array types. For example, if volumes in the kubernetes_pod_security_policies table looks like the following:

["configMap","emptyDir","projected","secret","downwardAPI","persistentVolumeClaim"]

To get a separate row for each volume, the following query can be used:

SELECT value
FROM kubernetes_pod_security_policies, json_each(kubernetes_pod_security_policies.volumes);

+-----------------------+
| value                 |
+-----------------------+
| configMap             |
| emptyDir              |
| projected             |
| secret                |
| downwardAPI           |
| persistentVolumeClaim |
+-----------------------+
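The two JSON queries above can be tried without a cluster. The sketch below uses an in-memory SQLite table as a stand-in for kubernetes_pod_security_policies; the name column and the sample rows are made up for illustration, and it assumes the SQLite library bundled with Python was built with the JSON functions (which is common). It also shows that when an array is nested under a key, json_each accepts a path as its second argument.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Stand-in table: only the two JSON columns discussed above, plus a
# hypothetical name column.
conn.execute(
    "CREATE TABLE kubernetes_pod_security_policies "
    "(name TEXT, run_as_user TEXT, volumes TEXT)"
)
conn.execute(
    "INSERT INTO kubernetes_pod_security_policies VALUES (?, ?, ?)",
    ("restricted",
     '{"rule": "MustRunAsNonRoot"}',
     '["configMap", "emptyDir", "secret"]'),
)

# json_tree walks the whole JSON document; filter on the key of interest.
rules = [row[0] for row in conn.execute(
    "SELECT value FROM kubernetes_pod_security_policies AS p, "
    "json_tree(p.run_as_user) WHERE key = 'rule'"
)]

# json_each yields one row per element of a top-level array.
volumes = [row[0] for row in conn.execute(
    "SELECT value FROM kubernetes_pod_security_policies AS p, "
    "json_each(p.volumes)"
)]

# If the array sits under a key, pass a path to select it.
nested = [row[0] for row in conn.execute(
    "SELECT value FROM json_each(?, '$.volumes')",
    ('{"volumes": ["configMap", "secret"]}',),
)]

print(rules, volumes, nested)
```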

Osquery loggers such as the TLS and Kafka loggers can be used to export scheduled query data to remote fleet management or security analytics platforms. Lambda-like functions can be applied to rows of streaming data in these platforms. These lambda functions can extract the necessary fields from embedded JSON to detect compliance issues or security concerns. If tables were normalized and streamed on different schedules, it would not be trivial to JOIN across tables and trigger events or alerts.
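As a rough illustration of such a lambda-like function, the sketch below inspects one streamed result row and extracts the rule from the embedded run_as_user JSON. The row shape (a JSON object with name and columns fields) is an assumption about how the receiving platform delivers results, the function name is hypothetical, and treating RunAsAny as a finding is only an example policy.

```python
import json


def check_run_as_user(log_line):
    """Return a finding dict if the policy lets containers run as any
    user, otherwise None. Assumes one JSON result row per log line."""
    event = json.loads(log_line)
    columns = event.get("columns", {})

    # The column itself holds JSON text, so it is decoded a second time.
    try:
        run_as_user = json.loads(columns.get("run_as_user") or "{}")
    except json.JSONDecodeError:
        return None
    if not isinstance(run_as_user, dict):
        return None

    if run_as_user.get("rule") == "RunAsAny":
        return {
            "query": event.get("name"),
            "policy": columns.get("name"),
            "issue": "containers may run as any user, including root",
        }
    return None


if __name__ == "__main__":
    sample = json.dumps({
        "name": "pod_security_policies",
        "columns": {"name": "permissive",
                    "run_as_user": '{"rule": "RunAsAny"}'},
    })
    print(check_run_as_user(sample))
```

Because each row carries its nested data with it, the check needs no JOIN against rows from other tables that may arrive on a different schedule.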