Add owner references to EnhancedEvent, consolidate calls to apiserver, make cache size configurable and add metrics for reads #144

Merged
merged 5 commits into from
Nov 17, 2023

@ronaknnathani commented Nov 17, 2023

Details

There are a few changes in this PR.

  • OwnerReferences is added to the exported events.

  • A deleted field is added to the exported events.
    This is to capture whether the object is deleted. This helps receivers identify whether a resource is deleted and create rules for it when needed.

  • Consolidate the number of read calls to kube-apiserver
    Currently, every time there's an event, the events exporter runs the GetObject method independently for each piece of metadata, such as labels and annotations. This results in multiple read calls to kube-apiserver for the same object. The number of read calls grows as we look up additional information about the object, like ownerReferences.

    So, in this change, a struct called ObjectMetadata is created to capture all the pieces of information about the object that need to be added to the EnhancedEvent. Every time there's an event, the object is fetched from kube-apiserver if it's not in the cache already and all pieces of metadata require only 1 call. The metadata is cached so repeated events about the same object don't result in more calls.

    Additionally, UID + ResourceVersion is used as the cache key, so if the object changes, it's looked up again.

  • Cache size is made configurable with a default size of 1024 as before

  • Metrics are added to capture reads served from cache and apiserver
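
A minimal sketch of the consolidated lookup described above, assuming a hand-rolled LRU built on the standard library. The names `MetadataCache`, `fetchFunc` and `cacheKey`, and the fields of `ObjectMetadata`, are illustrative and not taken from the PR's code; the fetch callback stands in for the single kube-apiserver read, and the sketch is not safe for concurrent use.

```go
package main

import (
	"container/list"
	"fmt"
)

// ObjectMetadata gathers everything attached to an event in one value.
// Field names are assumptions for illustration only.
type ObjectMetadata struct {
	Labels          map[string]string
	Annotations     map[string]string
	OwnerReferences []string
	Deleted         bool
}

// fetchFunc is a hypothetical stand-in for the one read against kube-apiserver.
type fetchFunc func(uid string) (ObjectMetadata, error)

type entry struct {
	key   string
	value ObjectMetadata
}

// MetadataCache is a small LRU keyed by UID + ResourceVersion.
type MetadataCache struct {
	size   int
	order  *list.List // front = most recently used
	items  map[string]*list.Element
	fetch  fetchFunc
	Hits   int
	Misses int
}

func NewMetadataCache(size int, fetch fetchFunc) *MetadataCache {
	return &MetadataCache{size: size, order: list.New(), items: map[string]*list.Element{}, fetch: fetch}
}

// cacheKey changes whenever the object changes, which forces a fresh lookup.
func cacheKey(uid, resourceVersion string) string {
	return uid + "/" + resourceVersion
}

// Get returns cached metadata or performs exactly one fetch on a miss.
func (c *MetadataCache) Get(uid, resourceVersion string) (ObjectMetadata, error) {
	key := cacheKey(uid, resourceVersion)
	if el, ok := c.items[key]; ok {
		c.Hits++
		c.order.MoveToFront(el)
		return el.Value.(*entry).value, nil
	}
	c.Misses++
	md, err := c.fetch(uid)
	if err != nil {
		return ObjectMetadata{}, err
	}
	c.items[key] = c.order.PushFront(&entry{key: key, value: md})
	if c.order.Len() > c.size {
		// Evict the least recently used entry once the cache is over capacity.
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
	return md, nil
}

func main() {
	calls := 0
	cache := NewMetadataCache(1024, func(uid string) (ObjectMetadata, error) {
		calls++
		return ObjectMetadata{Labels: map[string]string{"k8s-app": "kube-dns"}}, nil
	})
	cache.Get("968ca75e", "2354080") // miss: first event for this object
	cache.Get("968ca75e", "2354080") // hit: repeated event, same version
	cache.Get("968ca75e", "2354081") // miss: the object changed
	fmt.Println(cache.Hits, cache.Misses, calls) // prints "1 2 2"
}
```

Because the key includes the ResourceVersion, stale entries are never served for a changed object; they simply age out of the LRU.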

Addresses issues

Testing done

Tests
go test -cover -mod=mod -v ./...
?   	github.com/resmoio/kubernetes-event-exporter	[no test files]
=== RUN   TestSimpleWriter
--- PASS: TestSimpleWriter (0.00s)
=== RUN   TestCorrectnessManyTimes
--- PASS: TestCorrectnessManyTimes (0.07s)
=== RUN   TestLargerThanBatchSize
--- PASS: TestLargerThanBatchSize (0.00s)
=== RUN   TestSimpleInterval
--- PASS: TestSimpleInterval (0.06s)
=== RUN   TestIntervalComplex
--- PASS: TestIntervalComplex (0.06s)
=== RUN   TestIntervalComplexAfterFlush
--- PASS: TestIntervalComplexAfterFlush (0.06s)
=== RUN   TestRetry
--- PASS: TestRetry (0.20s)
PASS
	github.com/resmoio/kubernetes-event-exporter/pkg/batch	coverage: 100.0% of statements
ok  	github.com/resmoio/kubernetes-event-exporter/pkg/batch	(cached)	coverage: 100.0% of statements
?   	github.com/resmoio/kubernetes-event-exporter/pkg/metrics	[no test files]
?   	github.com/resmoio/kubernetes-event-exporter/pkg/version	[no test files]
=== RUN   Test_ParseConfig
--- PASS: Test_ParseConfig (0.00s)
=== RUN   TestValidate_IsCheckingMaxEventAgeSeconds_WhenNotSet
{"level":"info","time":"2023-11-17T10:45:44-05:00","message":"setting config.maxEventAgeSeconds=5 (default)"}
{"level":"warn","time":"2023-11-17T10:45:44-05:00","message":"metrics name prefix is empty, setting config.metricsNamePrefix='event_exporter_' is recommended"}
--- PASS: TestValidate_IsCheckingMaxEventAgeSeconds_WhenNotSet (0.00s)
=== RUN   TestValidate_IsCheckingMaxEventAgeSeconds_WhenThrottledPeriodSet
--- PASS: TestValidate_IsCheckingMaxEventAgeSeconds_WhenThrottledPeriodSet (0.00s)
=== RUN   TestValidate_IsCheckingMaxEventAgeSeconds_WhenMaxEventAgeSecondsSet
--- PASS: TestValidate_IsCheckingMaxEventAgeSeconds_WhenMaxEventAgeSecondsSet (0.00s)
=== RUN   TestValidate_IsCheckingMaxEventAgeSeconds_WhenMaxEventAgeSecondsAndThrottledPeriodSet
--- PASS: TestValidate_IsCheckingMaxEventAgeSeconds_WhenMaxEventAgeSecondsAndThrottledPeriodSet (0.00s)
=== RUN   TestValidate_MetricsNamePrefix_WhenEmpty
--- PASS: TestValidate_MetricsNamePrefix_WhenEmpty (0.00s)
=== RUN   TestValidate_MetricsNamePrefix_WhenValid
--- PASS: TestValidate_MetricsNamePrefix_WhenValid (0.00s)
=== RUN   TestValidate_MetricsNamePrefix_WhenInvalid
--- PASS: TestValidate_MetricsNamePrefix_WhenInvalid (0.00s)
=== RUN   TestSetDefaults
--- PASS: TestSetDefaults (0.00s)
=== RUN   TestEngineNoRoutes
--- PASS: TestEngineNoRoutes (0.00s)
=== RUN   TestEngineSimple
--- PASS: TestEngineSimple (0.00s)
=== RUN   TestEngineDropSimple
--- PASS: TestEngineDropSimple (0.00s)
=== RUN   TestEmptyRoute
--- PASS: TestEmptyRoute (0.00s)
=== RUN   TestBasicRoute
--- PASS: TestBasicRoute (0.00s)
=== RUN   TestDropRule
--- PASS: TestDropRule (0.00s)
=== RUN   TestSingleLevelMultipleMatchRoute
--- PASS: TestSingleLevelMultipleMatchRoute (0.00s)
=== RUN   TestSubRoute
--- PASS: TestSubRoute (0.00s)
=== RUN   TestSubSubRoute
--- PASS: TestSubSubRoute (0.00s)
=== RUN   TestSubSubRouteWithDrop
--- PASS: TestSubSubRouteWithDrop (0.00s)
=== RUN   Test_GHIssue51
--- PASS: Test_GHIssue51 (0.00s)
=== RUN   TestEmptyRule
--- PASS: TestEmptyRule (0.00s)
=== RUN   TestBasicRule
--- PASS: TestBasicRule (0.00s)
=== RUN   TestBasicNoMatchRule
--- PASS: TestBasicNoMatchRule (0.00s)
=== RUN   TestBasicRegexRule
--- PASS: TestBasicRegexRule (0.00s)
=== RUN   TestLabelRegexRule
--- PASS: TestLabelRegexRule (0.00s)
=== RUN   TestOneLabelMatchesRule
--- PASS: TestOneLabelMatchesRule (0.00s)
=== RUN   TestOneLabelDoesNotMatchRule
--- PASS: TestOneLabelDoesNotMatchRule (0.00s)
=== RUN   TestTwoLabelMatchesRule
--- PASS: TestTwoLabelMatchesRule (0.00s)
=== RUN   TestTwoLabelRequiredRule
--- PASS: TestTwoLabelRequiredRule (0.00s)
=== RUN   TestTwoLabelRequiredOneMissingRule
--- PASS: TestTwoLabelRequiredOneMissingRule (0.00s)
=== RUN   TestOneAnnotationMatchesRule
--- PASS: TestOneAnnotationMatchesRule (0.00s)
=== RUN   TestOneAnnotationDoesNotMatchRule
--- PASS: TestOneAnnotationDoesNotMatchRule (0.00s)
=== RUN   TestTwoAnnotationsMatchesRule
--- PASS: TestTwoAnnotationsMatchesRule (0.00s)
=== RUN   TestTwoAnnotationsRequiredOneMissingRule
--- PASS: TestTwoAnnotationsRequiredOneMissingRule (0.00s)
=== RUN   TestComplexRuleNoMatch
--- PASS: TestComplexRuleNoMatch (0.00s)
=== RUN   TestComplexRuleMatches
--- PASS: TestComplexRuleMatches (0.00s)
=== RUN   TestComplexRuleAnnotationsNoMatch
--- PASS: TestComplexRuleAnnotationsNoMatch (0.00s)
=== RUN   TestComplexRuleMatchesRegexp
--- PASS: TestComplexRuleMatchesRegexp (0.00s)
=== RUN   TestComplexRuleNoMatchRegexp
--- PASS: TestComplexRuleNoMatchRegexp (0.00s)
=== RUN   TestMessageRegexp
--- PASS: TestMessageRegexp (0.00s)
=== RUN   TestCount
--- PASS: TestCount (0.00s)
PASS
	github.com/resmoio/kubernetes-event-exporter/pkg/exporter	coverage: 68.9% of statements
ok  	github.com/resmoio/kubernetes-event-exporter/pkg/exporter	0.516s	coverage: 68.9% of statements
=== RUN   TestEnhancedEvent_DeDot
=== RUN   TestEnhancedEvent_DeDot/nothing
=== RUN   TestEnhancedEvent_DeDot/dedot
--- PASS: TestEnhancedEvent_DeDot (0.00s)
    --- PASS: TestEnhancedEvent_DeDot/nothing (0.00s)
    --- PASS: TestEnhancedEvent_DeDot/dedot (0.00s)
=== RUN   TestEnhancedEvent_DeDot_MustNotAlternateOriginal
--- PASS: TestEnhancedEvent_DeDot_MustNotAlternateOriginal (0.00s)
=== RUN   TestEventWatcher_EventAge_whenEventCreatedBeforeStartup
--- PASS: TestEventWatcher_EventAge_whenEventCreatedBeforeStartup (0.00s)
=== RUN   TestEventWatcher_EventAge_whenEventCreatedAfterStartupAndBeforeMaxAge
--- PASS: TestEventWatcher_EventAge_whenEventCreatedAfterStartupAndBeforeMaxAge (0.00s)
=== RUN   TestEventWatcher_EventAge_whenEventCreatedAfterStartupAndAfterMaxAge
--- PASS: TestEventWatcher_EventAge_whenEventCreatedAfterStartupAndAfterMaxAge (0.00s)
=== RUN   TestOnEvent_WithObjectMetadata
--- PASS: TestOnEvent_WithObjectMetadata (0.00s)
=== RUN   TestOnEvent_DeletedObjects
--- PASS: TestOnEvent_DeletedObjects (0.00s)
PASS
	github.com/resmoio/kubernetes-event-exporter/pkg/kube	coverage: 30.4% of statements
ok  	github.com/resmoio/kubernetes-event-exporter/pkg/kube	0.739s	coverage: 30.4% of statements
=== RUN   Test_ParseConfigFromBites_ExampleConfigIsCorrect
--- PASS: Test_ParseConfigFromBites_ExampleConfigIsCorrect (0.00s)
=== RUN   Test_ParseConfigFromBites_NoErrors
--- PASS: Test_ParseConfigFromBites_NoErrors (0.00s)
=== RUN   Test_ParseConfigFromBites_ErrorWhenCurlyBracesNotEscaped
--- PASS: Test_ParseConfigFromBites_ErrorWhenCurlyBracesNotEscaped (0.00s)
=== RUN   Test_ParseConfigFromBites_OkWhenCurlyBracesEscaped
--- PASS: Test_ParseConfigFromBites_OkWhenCurlyBracesEscaped (0.00s)
=== RUN   Test_ParseConfigFromBites_ErrorErrorNotWithCurlyBraces
--- PASS: Test_ParseConfigFromBites_ErrorErrorNotWithCurlyBraces (0.00s)
PASS
	github.com/resmoio/kubernetes-event-exporter/pkg/setup	coverage: 100.0% of statements
ok  	github.com/resmoio/kubernetes-event-exporter/pkg/setup	1.160s	coverage: 100.0% of statements
=== RUN   TestOpsCenterSink_Send
=== RUN   TestOpsCenterSink_Send/Simple_Create
=== RUN   TestOpsCenterSink_Send/Invalid_Priority:_Want_err
--- PASS: TestOpsCenterSink_Send (0.00s)
    --- PASS: TestOpsCenterSink_Send/Simple_Create (0.00s)
    --- PASS: TestOpsCenterSink_Send/Invalid_Priority:_Want_err (0.00s)
=== RUN   TestTeams_Send
--- PASS: TestTeams_Send (0.00s)
=== RUN   TestTeams_Send_WhenTeamsReturnsRateLimited
--- PASS: TestTeams_Send_WhenTeamsReturnsRateLimited (0.00s)
=== RUN   TestLayoutConvert
--- PASS: TestLayoutConvert (0.00s)
PASS
	github.com/resmoio/kubernetes-event-exporter/pkg/sinks	coverage: 13.6% of statements
ok  	github.com/resmoio/kubernetes-event-exporter/pkg/sinks	1.435s	coverage: 13.6% of statements
Events
# Object that's being deleted and has an owner
{
  "metadata": {
    "name": "coredns-5dd5756b68-wnj8t.17988034c78ed0a4",
    "namespace": "kube-system",
    "uid": "4c313ac2-5b4c-47a0-874c-eb55988573c7",
    "resourceVersion": "2363871",
    "creationTimestamp": "2023-11-17T19:37:03Z"
  },
  "reason": "Killing",
  "message": "Stopping container coredns",
  "source": {
    "component": "kubelet",
    "host": "kind-control-plane"
  },
  "firstTimestamp": "2023-11-17T19:37:03Z",
  "lastTimestamp": "2023-11-17T19:37:03Z",
  "count": 1,
  "type": "Normal",
  "eventTime": null,
  "reportingComponent": "kubelet",
  "reportingInstance": "kind-control-plane",
  "clusterName": "my-super-local-cluster",
  "involvedObject": {
    "kind": "Pod",
    "namespace": "kube-system",
    "name": "coredns-5dd5756b68-wnj8t",
    "uid": "968ca75e-681c-4f76-8ddd-61ddc2667afc",
    "apiVersion": "v1",
    "resourceVersion": "2354080",
    "fieldPath": "spec.containers{coredns}",
    "labels": {
      "k8s-app": "kube-dns",
      "pod-template-hash": "5dd5756b68"
    },
    "ownerReferences": [
      {
        "apiVersion": "apps/v1",
        "kind": "ReplicaSet",
        "name": "coredns-5dd5756b68",
        "uid": "86ca77ef-fce0-4033-99b3-35ee9a45a1d7",
        "controller": true,
        "blockOwnerDeletion": true
      }
    ],
    "deleted": true
  }
}

# Object that's not being deleted and has an owner
{
  "metadata": {
    "name": "coredns-5dd5756b68.17988034c833eed3",
    "namespace": "kube-system",
    "uid": "be1a2437-3dc0-407c-8d1a-482b407bbf19",
    "resourceVersion": "2363876",
    "creationTimestamp": "2023-11-17T19:37:03Z"
  },
  "reason": "SuccessfulCreate",
  "message": "Created pod: coredns-5dd5756b68-2vg7q",
  "source": {
    "component": "replicaset-controller"
  },
  "firstTimestamp": "2023-11-17T19:37:03Z",
  "lastTimestamp": "2023-11-17T19:37:03Z",
  "count": 1,
  "type": "Normal",
  "eventTime": null,
  "reportingComponent": "replicaset-controller",
  "reportingInstance": "",
  "clusterName": "my-super-local-cluster",
  "involvedObject": {
    "kind": "ReplicaSet",
    "namespace": "kube-system",
    "name": "coredns-5dd5756b68",
    "uid": "86ca77ef-fce0-4033-99b3-35ee9a45a1d7",
    "apiVersion": "apps/v1",
    "resourceVersion": "2363841",
    "labels": {
      "k8s-app": "kube-dns",
      "pod-template-hash": "5dd5756b68"
    },
    "annotations": {
      "deployment.kubernetes.io/desired-replicas": "2",
      "deployment.kubernetes.io/max-replicas": "3",
      "deployment.kubernetes.io/revision": "1"
    },
    "ownerReferences": [
      {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "name": "coredns",
        "uid": "0f39b1dd-8cb4-4374-a95d-11d96c0b9d6a",
        "controller": true,
        "blockOwnerDeletion": true
      }
    ],
    "deleted": false
  }
}

# Object doesn't have an owner
{
  "metadata": {
    "name": "web-server.17988093f5b84ae4",
    "namespace": "rnathani",
    "uid": "9805e637-a970-4a8a-8edf-1c727d0a52d8",
    "resourceVersion": "2364431",
    "creationTimestamp": "2023-11-17T19:43:52Z"
  },
  "reason": "Killing",
  "message": "Stopping container web-server",
  "source": {
    "component": "kubelet",
    "host": "kind-control-plane"
  },
  "firstTimestamp": "2023-11-17T19:43:52Z",
  "lastTimestamp": "2023-11-17T19:43:52Z",
  "count": 1,
  "type": "Normal",
  "eventTime": null,
  "reportingComponent": "kubelet",
  "reportingInstance": "kind-control-plane",
  "clusterName": "my-super-local-cluster",
  "involvedObject": {
    "kind": "Pod",
    "namespace": "rnathani",
    "name": "web-server",
    "uid": "2e4ea176-6733-44f6-aacb-5f50801c0dd2",
    "apiVersion": "v1",
    "resourceVersion": "2212679",
    "fieldPath": "spec.containers{web-server}",
    "labels": {
      "app": "web-server",
      "version": "0.0.1"
    },
    "annotations": {
      "kubectl.kubernetes.io/last-applied-configuration": "{\"apiVersion\":\"v1\",\"kind\":\"Pod\",\"metadata\":{\"annotations\":{},\"labels\":{\"app\":\"web-server\",\"version\":\"0.0.1\"},\"name\":\"web-server\",\"namespace\":\"rnathani\"},\"spec\":{\"containers\":[{\"command\":[\"/root/web-server\"],\"image\":\"lnkdin.cr/temp/web-server:0.0.1\",\"livenessProbe\":{\"httpGet\":{\"path\":\"/\",\"port\":8080},\"periodSeconds\":5},\"name\":\"web-server\",\"ports\":[{\"containerPort\":8080,\"name\":\"tcp\"}],\"resources\":{\"limits\":{\"cpu\":\"200m\",\"memory\":\"300Mi\"},\"requests\":{\"cpu\":\"200m\",\"memory\":\"100Mi\"}},\"startupProbe\":{\"failureThreshold\":1,\"httpGet\":{\"path\":\"/\",\"port\":8080},\"periodSeconds\":5}}]}}\n"
    },
    "deleted": true
  }
}
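
As a rough sketch of how a receiver might use the new fields, the Go snippet below decodes an exported event and reports whether the involved object is deleted and which owner controls it. The struct and function names (`ExportedEvent`, `Describe`) are hypothetical; only the JSON keys come from the sample events above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// OwnerReference mirrors the keys shown under "ownerReferences" in the samples.
type OwnerReference struct {
	APIVersion string `json:"apiVersion"`
	Kind       string `json:"kind"`
	Name       string `json:"name"`
	UID        string `json:"uid"`
	Controller bool   `json:"controller"`
}

// InvolvedObject keeps only the fields this sketch needs.
type InvolvedObject struct {
	Kind            string           `json:"kind"`
	Namespace       string           `json:"namespace"`
	Name            string           `json:"name"`
	OwnerReferences []OwnerReference `json:"ownerReferences"`
	Deleted         bool             `json:"deleted"`
}

// ExportedEvent is a hypothetical receiver-side view of an exported event.
type ExportedEvent struct {
	Reason         string         `json:"reason"`
	InvolvedObject InvolvedObject `json:"involvedObject"`
}

// Describe summarises the involved object, its deleted flag and its controlling owner.
func Describe(raw []byte) (string, error) {
	var ev ExportedEvent
	if err := json.Unmarshal(raw, &ev); err != nil {
		return "", err
	}
	owner := "none"
	for _, ref := range ev.InvolvedObject.OwnerReferences {
		if ref.Controller {
			owner = ref.Kind + "/" + ref.Name
			break
		}
	}
	obj := ev.InvolvedObject
	return fmt.Sprintf("%s %s/%s deleted=%t owner=%s", obj.Kind, obj.Namespace, obj.Name, obj.Deleted, owner), nil
}

func main() {
	// Trimmed copy of the first sample event.
	raw := []byte(`{
	  "reason": "Killing",
	  "involvedObject": {
	    "kind": "Pod",
	    "namespace": "kube-system",
	    "name": "coredns-5dd5756b68-wnj8t",
	    "ownerReferences": [
	      {"apiVersion": "apps/v1", "kind": "ReplicaSet", "name": "coredns-5dd5756b68", "controller": true}
	    ],
	    "deleted": true
	  }
	}`)
	summary, err := Describe(raw)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(summary)
}
```

When the object has no owner, as in the third sample, the `ownerReferences` key is absent and the slice simply decodes as empty.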
New metrics
✗ curl -sSL localhost:2112/metrics | rg kube_api
# HELP kube_api_read_cache_hits The total number of read requests served from cache when looking up object metadata
# TYPE kube_api_read_cache_hits counter
kube_api_read_cache_hits 3
# HELP kube_api_read_cache_misses The total number of read requests served from kube-apiserver when looking up object metadata
# TYPE kube_api_read_cache_misses counter
kube_api_read_cache_misses 5
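
One way to read these two counters is as a cache hit ratio, hits / (hits + misses). The Go sketch below computes it from scraped text; `HitRatio` is a hypothetical helper, and it assumes the metric names appear exactly as above, with no configured name prefix and no labels.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// HitRatio scans Prometheus text output for the two counters and returns
// hits / (hits + misses). Comment lines and unrelated metrics are skipped.
func HitRatio(exposition string) (float64, error) {
	var hits, misses float64
	sc := bufio.NewScanner(strings.NewReader(exposition))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 2 || strings.HasPrefix(fields[0], "#") {
			continue
		}
		name := fields[0]
		if name != "kube_api_read_cache_hits" && name != "kube_api_read_cache_misses" {
			continue
		}
		v, err := strconv.ParseFloat(fields[1], 64)
		if err != nil {
			return 0, err
		}
		if name == "kube_api_read_cache_hits" {
			hits = v
		} else {
			misses = v
		}
	}
	if err := sc.Err(); err != nil {
		return 0, err
	}
	if hits+misses == 0 {
		return 0, fmt.Errorf("no cache reads recorded")
	}
	return hits / (hits + misses), nil
}

func main() {
	scraped := `# TYPE kube_api_read_cache_hits counter
kube_api_read_cache_hits 3
# TYPE kube_api_read_cache_misses counter
kube_api_read_cache_misses 5
`
	ratio, err := HitRatio(scraped)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%.3f\n", ratio) // prints "0.375" for 3 hits and 5 misses
}
```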

ronaknnathani and others added 5 commits November 15, 2023 11:08
This commit uses the same approach as labels and annotations and adds ownerReferences to the EnhancedEvent struct.
The flow is as follows:
* use an LRU cache to store the ownerReferences with object UID as the key
* if the object doesn't exist in cache, look up using dynamic client and store it in cache
* if the object exists in cache, return the value from cache
… all labels, annotations and ownerReferences

Currently, every time there's an event, the events exporter runs GetObject independently for each piece of metadata, such as labels
and annotations. This results in the same object being looked up multiple times for different pieces of the metadata.
The number of calls grows as we look up additional information about the object, like ownerReferences.
So, in this change, a struct called `ObjectMetadata` is created to capture all the pieces of information that need to be added
to the EnhancedEvent. And every time there's an event, the object is fetched from the kube-apiserver if it's not in the cache already
and all pieces of metadata require only 1 call. The metadata is cached so repeated events about the same object don't
result in more calls.

Additionally, UID + ResourceVersion is used as the cacheKey, so if the object changes, it's looked up again.

One more change here is the introduction of a `deleted` field in the `EnhancedEvent.InvolvedObject` to capture whether the object
is deleted. This helps receivers identify whether a resource is deleted and create rules for it when needed.

Tests are added for these updates and the mock functions are moved to the test files.
@mustafaakin

This looks great, thanks for the contribution.

@mustafaakin merged commit f48d5bf into resmoio:master Nov 17, 2023
1 check passed