Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Heartbeat] Fix hint-based monitor gen #34376

Merged
merged 8 commits into from Feb 1, 2023

Conversation

emilioalvap
Copy link
Collaborator

@emilioalvap emilioalvap commented Jan 24, 2023

What does this PR do?

Make hint generator more flexible by removing host/port matching requirements.
Closes #17771.
Closes #34214.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

How to test this PR locally

Deploy the following config and check that exactly one monitor is generated with an unmatched host/port:

apiVersion: apps/v1
kind: Deployment
metadata:
  name:  nginx-svc
  namespace: default
  labels:
    app:  nginx-svc
spec:
  selector:
    matchLabels:
      app: nginx-svc
  replicas: 1
  template:
    metadata:
      labels:
        app:  nginx-svc
    spec:
      containers:
      - name:  nginx-svc
        image: nginx:1.23.1-alpine
        imagePullPolicy: Always
        resources:
          requests:
            cpu: 300m
            memory: 200Mi
          limits:
            cpu: 300m
            memory: 200Mi
        ports:
        - containerPort: 3000
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-svc
  namespace: default
  annotations:
    co.elastic.monitor/1.hosts: 127.0.0.1:30000
    co.elastic.monitor/1.schedule: '@every 5s'
    co.elastic.monitor/1.type: tcp
spec:
  selector:
    app: nginx-svc
  type: NodePort
  ports:
  - protocol: TCP
    port: 80
    nodePort: 30000
    name: main

@emilioalvap emilioalvap added bug Team:obs-ds-hosted-services Label for the Observability Hosted Services team v8.7.0 labels Jan 24, 2023
@botelastic botelastic bot added needs_team Indicates that the issue/PR needs a Team:* label and removed needs_team Indicates that the issue/PR needs a Team:* label labels Jan 24, 2023
@mergify
Copy link
Contributor

mergify bot commented Jan 24, 2023

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @emilioalvap? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v8./d.0 is the label to automatically backport to the 8./d branch. /d is the digit

@elasticmachine
Copy link
Collaborator

elasticmachine commented Jan 24, 2023

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2023-02-01T18:30:10.606+0000

  • Duration: 48 min 44 sec

Test stats 🧪

Test Results
Failed 0
Passed 1889
Skipped 25
Total 1914

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

}

if p != port {
// Different static port, skip
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this line would not solve an extra bug that I discovered while investigating this issue.

we could have a use case where a container export a port 80 that is remapped to a nodePort 30000 (or any other port different from 80) like in the following snippet

apiVersion: v1
kind: Service
metadata:
  name: nginx-svc
  namespace: default
  annotations:
    co.elastic.monitor/1.hosts: 127.0.0.1:30000
    co.elastic.monitor/1.schedule: '@every 5s'
    co.elastic.monitor/1.type: tcp
spec:
  selector:
    app: nginx-svc
  type: NodePort
  ports:
  - protocol: TCP
    port: 80
    nodePort: 30000
    name: main

In that case the hint port (here 30000) needs to match the node port (here 30000) and not the container port (here 80).
If the previous code is not change 80 != 30000 will skip adding this heartbeat in this case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that this extra check for the containerPort (aka port here) and the hints port is just not useful and it might cause issues when that containerPort is mapped to another port

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, I'm not too worried about this scenario, considering the limitations autodiscovery actually has

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what limitations you are referring to but this is clearly a bug to me.

if mType == "icmp" && strings.Contains(h, ":") {
hb.logger.Warnf("ICMP scheme does not support port specification: %s", h)
continue
} else if strings.Contains(h, ":") && strings.Contains(h, "data.port") && port == 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we removing the check for data.host altogether?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also I don't think we should check for data.port in any part of the string. We should check it for :${data.port}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we removing the check for data.host altogether?

IMO, the complicated thing here is building a general case. Pods are usually addresed via ip, services can be addressed by dns, some monitors might make sense for a single pod, others will be broken for a replica set.

@andrewvc I think we have two paths to solve this, wdyt?:

  • allow any value combination in without trying to match host-port, should fix errors like the ones listed here with the downside that it can have unexpected results to be handled by the user, eg: generating a unique monitor config for a replica set will delete the monitor if a pod is removed from the set.
  • Fix the empty host error and effectively have an even stricter matching host-port approach.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you provide some more detail on the stricter matching approach? What would that look like?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, it would involve making sure that the monitor generated is dynamic in relation to the hint event data. It's basically what we do now by checking ${data.host} and ${data.port} but fixing the empty config scenario we have.

The problem with this approach is that it prevents some common use cases, that's why there're so many request to remove the restriction altogether.

@emilioalvap emilioalvap marked this pull request as ready for review January 31, 2023 18:21
@emilioalvap emilioalvap requested a review from a team as a code owner January 31, 2023 18:21
@elasticmachine
Copy link
Collaborator

Pinging @elastic/uptime (Team:Uptime)

Copy link
Contributor

@andrewvc andrewvc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM pending changelog

@emilioalvap emilioalvap merged commit 9d9f8dc into elastic:main Feb 1, 2023
1 check passed
@gsantoro gsantoro added the backport-v7.17.0 Automated backport with mergify label Mar 7, 2023
mergify bot pushed a commit that referenced this pull request Mar 7, 2023
* ignore matching event port with hints port

* fixed compilation error

* Refactor hint-based monitor gen

* Update heartbeat/autodiscover/builder/hints/monitors.go

* Remove host-port matching from hint generator

* Add changelog

---------

Co-authored-by: gsantoro <giuseppe.santoro@elastic.co>
Co-authored-by: Andrew Cholakian <andrewvc@elastic.co>
(cherry picked from commit 9d9f8dc)

# Conflicts:
#	heartbeat/autodiscover/builder/hints/monitors.go
#	heartbeat/autodiscover/builder/hints/monitors_test.go
gsantoro pushed a commit to gsantoro/beats that referenced this pull request Mar 7, 2023
* ignore matching event port with hints port

* fixed compilation error

* Refactor hint-based monitor gen

* Update heartbeat/autodiscover/builder/hints/monitors.go

* Remove host-port matching from hint generator

* Add changelog

---------

Co-authored-by: gsantoro <giuseppe.santoro@elastic.co>
Co-authored-by: Andrew Cholakian <andrewvc@elastic.co>
(cherry picked from commit 9d9f8dc)
emilioalvap added a commit that referenced this pull request Mar 10, 2023
* [Heartbeat] Fix hint-based monitor gen (#34376)

* ignore matching event port with hints port

* fixed compilation error

* Refactor hint-based monitor gen

* Update heartbeat/autodiscover/builder/hints/monitors.go

* Remove host-port matching from hint generator

* Add changelog

---------

Co-authored-by: gsantoro <giuseppe.santoro@elastic.co>
Co-authored-by: Andrew Cholakian <andrewvc@elastic.co>
(cherry picked from commit 9d9f8dc)

* removed dependency to `elastic-agent-autodiscover`

* linter fixes from PR

* new line

* Fix linter

---------

Co-authored-by: Emilio Alvarez Piñeiro <95703246+emilioalvap@users.noreply.github.com>
chrisberkhout pushed a commit that referenced this pull request Jun 1, 2023
* ignore matching event port with hints port

* fixed compilation error

* Refactor hint-based monitor gen

* Update heartbeat/autodiscover/builder/hints/monitors.go

* Remove host-port matching from hint generator

* Add changelog

---------

Co-authored-by: gsantoro <giuseppe.santoro@elastic.co>
Co-authored-by: Andrew Cholakian <andrewvc@elastic.co>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-v7.17.0 Automated backport with mergify bug Team:obs-ds-hosted-services Label for the Observability Hosted Services team v8.7.0
Projects
None yet
4 participants