Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

statsd_input: Fix config to take host instead of hosts #6592

Merged
merged 3 commits into from
Jun 26, 2023

Conversation

shmsr
Copy link
Member

@shmsr shmsr commented Jun 16, 2023

What does this PR do?

While debugging why the client is not able to send data to the statsd_input's spawned server, I found that the configuration used is incorrect making the server listen on a default IP which is not always correct unless someone uses the standalone elastic-agent. Also, accepting multiple hosts for a server doesn't make sense and that's why also only one host is considered in metricbeat module side of beats. Also, the IP shouldn't have http:// prefix and it should be without it.

Found this when trying to send data using a statsd client from a different container which is part of the same network as of elastic-agent's container. Also made a toy UDP server to check if it receiving the packets in the same container but metricbeat was not receiving them. When checking with tcpdump, I noticed that UDP packets are reaching the correct destination but the server always running on localhost where the packet is destined for another IP and that is the reason for metricbeat module not receiving the packets. But if we run the client on the same container as elastic-agent then it's possible to send it to its localhost directly and it is even true locally and that's why it was working there before.

Ref:

metricbeat's statsd module is dependent on the UDP helper package and from this we can see that it accepts host and hosts.

Update:

  • Based on review comments from @andrewkroh, made it more like UDP input because field names, defaults, description etc. there makes more sense.

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.

How to test this PR locally

  • Start the integration
  • Try sending using a statsd_client from a container or host
  • Check logs of elastic-agent if the server is spawned on the correct host and port
  • Check if data is coming to Elasticsearch

@shmsr shmsr requested a review from a team as a code owner June 16, 2023 03:26
@shmsr shmsr changed the title statsd_input: Fix config to take host instead of hosts statsd_input: Fix config to take host instead of hosts Jun 16, 2023
@shmsr shmsr self-assigned this Jun 16, 2023
@shmsr shmsr added the bugfix Pull request that fixes a bug issue label Jun 16, 2023
@elasticmachine
Copy link

elasticmachine commented Jun 16, 2023

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2023-06-16T15:27:01.081+0000

  • Duration: 15 min 40 sec

🤖 GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

@shmsr shmsr added bug Something isn't working, use only for issues and removed bugfix Pull request that fixes a bug issue labels Jun 16, 2023
@ishleenk17
Copy link
Contributor

With a single host also, the default remains the same.
In case you are not using a standalone, ideally a change of host from localhost to the expected address should work.
How is this different ?

@ali786XI
Copy link
Contributor

@ishleenk17 I think it's working in standalone because I am running the Node JS client from the same machine. Hence there are no issues in collecting the metrics. But when it is running in some other container and if I give the host of my agent in the client script, it does not recognize my host. By the way, I am able to run the system tests with this change.

Copy link
Contributor

@ali786XI ali786XI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@shmsr
Copy link
Member Author

shmsr commented Jun 16, 2023

With a single host also, the default remains the same. In case you are not using a standalone, ideally a change of host from localhost to the expected address should work. How is this different?

No. Change would not reflect for hosts as it is expected to have host. If you give any other host, still the default (default on metricbeat side is taken) is considered i.e., localhost. I've linked the parts of code of metricbeat in description itself.

Copy link
Member

@andrewkroh andrewkroh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add a system test that starts the input and streams in a few StatsD events over UDP. Assuming the protocol is simply text based then you could follow the same setup as the udp package. Like drop data like this into a file and have elastic/stream send it

python.counter:1|cpython.gauge.foo:10|gpython.average.avg:5|apython.timer.total:0.07200241|mspython.counter:1|cpython.gauge.foo:10|gpython.average.avg:5|apython.timer.total:0.29730797|mspython.counter:1|cpython.gauge.foo:10|gpython.average.avg:5|apython.timer.total:0.30040741|mspython.counter:1|cpython.gauge.foo:10|gpython.average.avg:5|apython.timer.total:0.29134750|ms

This would verify that the input is using 0.0.0.0 as the bind address because the test is performed between containers. And it would be mandatory for 0.0.0.0 to be working property for this test to pass.

required: true
show_user: true
default:
- http://127.0.0.1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually like to have the user opt-in to having the Agent listen on all interfaces for security reasons. I don't know that there is precedence for using 0.0.0.0 as the default. The UDP input would be a good example to follow. IIRC the description explains when to change to the wildcard address.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I kept it 0.0.0.0 so that by default it listens on all interfaces but yes if we are usually keeping it as an opt-in then it makes sense. I'll make similar changes here. Thanks for pointing that out!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andrewkroh

Did some more changes based on udp package:

  • s/host/listen_address
  • s/port/listen_port

and so on.

Because it makes more sense. What do you think?

@elasticmachine
Copy link

💚 Build Succeeded

History

cc @shmsr

@shmsr
Copy link
Member Author

shmsr commented Jun 16, 2023

We should add a system test that starts the input and streams in a few StatsD events over UDP. Assuming the protocol is simply text based then you could follow the same setup as the udp package. Like drop data like this into a file and have elastic/stream send it

python.counter:1|cpython.gauge.foo:10|gpython.average.avg:5|apython.timer.total:0.07200241|mspython.counter:1|cpython.gauge.foo:10|gpython.average.avg:5|apython.timer.total:0.29730797|mspython.counter:1|cpython.gauge.foo:10|gpython.average.avg:5|apython.timer.total:0.30040741|mspython.counter:1|cpython.gauge.foo:10|gpython.average.avg:5|apython.timer.total:0.29134750|ms

This would verify that the input is using 0.0.0.0 as the bind address because the test is performed between containers. And it would be mandatory for 0.0.0.0 to be working property for this test to pass.

So, @aliabbas-elastic is adding some system tests as part of this PR (waiting for this PR to be merged). Should I work with him to add the required tests you suggested in that PR itself?

Thanks for suggesting elastic/stream as I didn't know about this before. I was also thinking of having something similar because it's ultimately plaintext and the format of the metric that statsd expects is very simple and we do not require to build a statsd client (using any xyz programming language) for testing. But the simplest option that I had was nc and that again cannot stream and that's why I built one toy client in Go for testing. I gave stream a try and it works as per expectation. Thanks again!

cc: @aliabbas-elastic (you can remove the node.js script that you are using as a statsd client and instead use elastic/stream)

packages/statsd_input/manifest.yml Show resolved Hide resolved
- http://127.0.0.1
- name: port
default: localhost
- name: listen_port
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are using the statsd/server input, let's keep it in sync with that.
If we think this change is required we should change in the beats code as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, we should go with listen_{address,port} because:

  • In packages/statsd_input/agent/input/input.yml.hbs, the mappings are done correctly so that it works well with the beats code. I tested the current changes and they are working fine.
  • Changes in the beats code involve the following breakage:
    • statsd uses a common udp package used by even other modules in metricbeat.
    • Changes in the configuration will impact all modules which is even fine but if any users upgrade the module they are using then their older configuration would break as we'd moving from host -> listen_address.
  • Also, I think we should go with this because it conveys the message better:
    • listen_address compared to host makes it clear that statsd package is listening on an address whereas host is not implying that clear as listen_address. This sets the expectation clear even if some newbie is starting with integrations and statsd.

What do you think? I even asked the same here.

Copy link
Contributor

@ishleenk17 ishleenk17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@ali786XI ali786XI merged commit 4db004d into elastic:main Jun 26, 2023
@elasticmachine
Copy link

Package statsd_input - 0.2.1 containing this change is available at https://epr.elastic.co/search?package=statsd_input

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working, use only for issues Integration:statsd_input StatsD Input
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants