feat(inputs.conntrack): Parse conntrack stats #8958

Merged
merged 1 commit into from
Oct 12, 2022

Conversation


@deric deric commented Mar 8, 2021

The current conntrack module is quite limited: it supports only a single metric, ip_conntrack_count (the other stats usually don't change over time).

This PR adds support for collecting conntrack stats, relying on the github.com/shirou/gopsutil module. By default no stats are collected; they must be explicitly enabled:

[[inputs.conntrack]]
  ## all - aggregated statistics
  ## percpu - include detailed statistics with cpu tag
  collect = ["all", "percpu"]

The idea is to support the netfilter stats that are available on Linux systems via the conntrack -S (--stats) command:

$ conntrack -S
cpu=0           found=0 invalid=41 insert=0 insert_failed=0 drop=0 early_drop=0 error=0 search_restart=17 
cpu=1           found=0 invalid=14 insert=0 insert_failed=0 drop=0 early_drop=0 error=0 search_restart=12 
cpu=2           found=0 invalid=11772 insert=0 insert_failed=6 drop=6 early_drop=0 error=233 search_restart=391 
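To make the line format above concrete, here is a minimal Go sketch that splits one `conntrack -S` line into counters. It is only an illustration of the `key=value` layout: the PR itself relies on gopsutil rather than parsing command output, and the function name `parseStatsLine` and the map-based result are assumptions made for this example.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseStatsLine splits one line of `conntrack -S` output into key/value
// pairs. Hypothetical helper for illustration; not the plugin's actual code.
func parseStatsLine(line string) (map[string]uint64, error) {
	stats := make(map[string]uint64)
	// strings.Fields handles the variable padding after "cpu=N" and the
	// trailing space at the end of each line.
	for _, field := range strings.Fields(line) {
		key, value, ok := strings.Cut(field, "=")
		if !ok {
			return nil, fmt.Errorf("malformed field %q", field)
		}
		n, err := strconv.ParseUint(value, 10, 64)
		if err != nil {
			return nil, fmt.Errorf("field %q: %w", key, err)
		}
		stats[key] = n
	}
	return stats, nil
}

func main() {
	line := "cpu=2           found=0 invalid=11772 insert=0 insert_failed=6 drop=6 early_drop=0 error=233 search_restart=391 "
	stats, err := parseStatsLine(line)
	if err != nil {
		panic(err)
	}
	fmt.Println("cpu:", stats["cpu"], "invalid:", stats["invalid"], "error:", stats["error"])
}
```

A map keeps the sketch independent of which counters a given kernel or conntrack-tools version reports.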

Related issues:

TODO: Currently it is not possible to collect only stats and no other metrics because of the len(fields) == 0 check.

Required for all PRs:

  • Associated README.md updated.
  • Has appropriate unit tests.


@telegraf-tiger telegraf-tiger bot left a comment


🤝 ✅ CLA has been signed. Thank you!

@telegraf-tiger telegraf-tiger bot added the feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin label Mar 8, 2021

@sspaink sspaink left a comment


Thank you for submitting a pull request! I've added some comments for you to review. It also looks like you will need to rebase against the latest master for the CI to pass.

plugins/inputs/conntrack/README.md (outdated, resolved)
plugins/inputs/conntrack/conntrack.go (outdated, resolved)
plugins/inputs/conntrack/conntrack_test.go (resolved)
@sspaink sspaink added the waiting for response waiting for response from contributor label Mar 22, 2022
@deric deric changed the title Support parsing conntrack stats (#8955) feat(inputs.conntrack): Parse conntrack stats (#8955) Mar 23, 2022
@deric deric changed the title feat(inputs.conntrack): Parse conntrack stats (#8955) feat(inputs.conntrack): Parse conntrack stats Mar 23, 2022

deric commented Mar 23, 2022

@sspaink Thanks for the comments!

The percpu syntax was inspired by the cpu input, which also uses shirou/gopsutil. But it's probably not the best way to approach this in telegraf.

The suggested config directive allows adding future features, which is nice.

extra_stats = ["conntrack", "percpu"]

It might be better to call these "features":

collect = ["stats_all", "stats_percpu"]

stats_all input

> conntrack,cpu=all,fqdn=host.example.com,host=host delete=0i,delete_list=0i,drop=2i,early_drop=0i,entries=6552i,expect_create=0i,expect_delete=0i,expect_new=0i,found=1556i,icmp_error=6315i,ignore=1911479058i,insert=0i,insert_failed=8i,invalid=212671651i,new=0i,search_restart=1457459i,searched=0i 1648052724000000000

stats_percpu input:

> conntrack,cpu=cpu0,fqdn=host.example.com,host=host delete=0i,delete_list=0i,drop=0i,early_drop=0i,entries=527i,expect_create=0i,expect_delete=0i,expect_new=0i,found=59i,icmp_error=0i,ignore=904833019i,insert=0i,insert_failed=3i,invalid=15616394i,new=0i,search_restart=63091i,searched=0i 1648052905000000000
> conntrack,cpu=cpu1,fqdn=host.example.com,host=host delete=0i,delete_list=0i,drop=0i,early_drop=0i,entries=527i,expect_create=0i,expect_delete=0i,expect_new=0i,found=70i,icmp_error=0i,ignore=865728837i,insert=0i,insert_failed=0i,invalid=17950581i,new=0i,search_restart=62340i,searched=0i 1648052905000000000
> conntrack,cpu=cpu2,fqdn=host.example.com,host=host delete=0i,delete_list=0i,drop=0i,early_drop=0i,entries=527i,expect_create=0i,expect_delete=0i,expect_new=0i,found=73i,icmp_error=400i,ignore=1211717610i,insert=0i,insert_failed=1i,invalid=20188665i,new=0i,search_restart=112730i,searched=0i 1648052905000000000
> conntrack,cpu=cpu3,fqdn=host.example.com,host=host delete=0i,delete_list=0i,drop=0i,early_drop=0i,entries=527i,expect_create=0i,expect_delete=0i,expect_new=0i,found=324i,icmp_error=1089i,ignore=2273071456i,insert=0i,insert_failed=0i,invalid=17114035i,new=0i,search_restart=186352i,searched=0i 1648052905000000000
> conntrack,cpu=cpu4,fqdn=host.example.com,host=host delete=0i,delete_list=0i,drop=0i,early_drop=0i,entries=527i,expect_create=0i,expect_delete=0i,expect_new=0i,found=98i,icmp_error=0i,ignore=877931988i,insert=0i,insert_failed=0i,invalid=19777350i,new=0i,search_restart=59594i,searched=0i 1648052905000000000
> conntrack,cpu=cpu5,fqdn=host.example.com,host=host delete=0i,delete_list=0i,drop=0i,early_drop=0i,entries=527i,expect_create=0i,expect_delete=0i,expect_new=0i,found=90i,icmp_error=0i,ignore=862837152i,insert=0i,insert_failed=0i,invalid=20104525i,new=0i,search_restart=57898i,searched=0i 1648052905000000000
> conntrack,cpu=cpu6,fqdn=host.example.com,host=host delete=0i,delete_list=0i,drop=0i,early_drop=0i,entries=527i,expect_create=0i,expect_delete=0i,expect_new=0i,found=57i,icmp_error=0i,ignore=908892157i,insert=0i,insert_failed=1i,invalid=23140832i,new=0i,search_restart=58805i,searched=0i 1648052905000000000
> conntrack,cpu=cpu7,fqdn=host.example.com,host=host delete=0i,delete_list=0i,drop=1i,early_drop=0i,entries=527i,expect_create=0i,expect_delete=0i,expect_new=0i,found=208i,icmp_error=1564i,ignore=3016991626i,insert=0i,insert_failed=1i,invalid=18514840i,new=0i,search_restart=260460i,searched=0i 1648052905000000000
> conntrack,cpu=cpu8,fqdn=host.example.com,host=host delete=0i,delete_list=0i,drop=0i,early_drop=0i,entries=527i,expect_create=0i,expect_delete=0i,expect_new=0i,found=163i,icmp_error=1111i,ignore=2705022169i,insert=0i,insert_failed=0i,invalid=13915833i,new=0i,search_restart=175942i,searched=0i 1648052905000000000
> conntrack,cpu=cpu9,fqdn=host.example.com,host=host delete=0i,delete_list=0i,drop=1i,early_drop=0i,entries=527i,expect_create=0i,expect_delete=0i,expect_new=0i,found=85i,icmp_error=467i,ignore=1420448635i,insert=0i,insert_failed=1i,invalid=14982275i,new=0i,search_restart=119368i,searched=0i 1648052905000000000
> conntrack,cpu=cpu10,fqdn=host.example.com,host=host delete=0i,delete_list=0i,drop=0i,early_drop=0i,entries=527i,expect_create=0i,expect_delete=0i,expect_new=0i,found=105i,icmp_error=0i,ignore=905537001i,insert=0i,insert_failed=1i,invalid=16795435i,new=0i,search_restart=58677i,searched=0i 1648052905000000000
> conntrack,cpu=cpu11,fqdn=host.example.com,host=host delete=0i,delete_list=0i,drop=0i,early_drop=0i,entries=527i,expect_create=0i,expect_delete=0i,expect_new=0i,found=224i,icmp_error=1684i,ignore=3138352738i,insert=0i,insert_failed=0i,invalid=14570886i,new=0i,search_restart=242207i,searched=0i 1648052905000000000

cpu=all is just the aggregated version of stats_percpu. The per-CPU variant might generate data with high cardinality, though someone might find it useful.

Also, I was wondering about renaming the input from conntrack to conntrack_stats. What do you think?
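As a rough illustration of how the cpu=all series relates to the per-CPU ones, the following Go sketch sums a few counters across CPUs. The `cpuStats` type and the `aggregate` function are hypothetical names for this example and only cover a subset of the fields. `entries` is left out on purpose, since every per-CPU line above reports the same value for it.

```go
package main

import "fmt"

// cpuStats holds a few of the per-CPU counters shown above.
// Hypothetical type for illustration; the real data comes from gopsutil.
type cpuStats struct {
	Drop          uint64
	InsertFailed  uint64
	Invalid       uint64
	SearchRestart uint64
}

// aggregate adds up the counters of all CPUs into a single summary,
// which corresponds to the idea behind the cpu=all series.
func aggregate(perCPU []cpuStats) cpuStats {
	var total cpuStats
	for _, s := range perCPU {
		total.Drop += s.Drop
		total.InsertFailed += s.InsertFailed
		total.Invalid += s.Invalid
		total.SearchRestart += s.SearchRestart
	}
	return total
}

func main() {
	// Values copied from the cpu0 and cpu1 lines above.
	perCPU := []cpuStats{
		{Drop: 0, InsertFailed: 3, Invalid: 15616394, SearchRestart: 63091},
		{Drop: 0, InsertFailed: 0, Invalid: 17950581, SearchRestart: 62340},
	}
	fmt.Printf("%+v\n", aggregate(perCPU))
}
```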

> conntrack_stats,cpu=all,fqdn=host.example.com,host=host delete=0i,delete_list=0i,drop=2i,early_drop=0i,entries=6552i,expect_create=0i,expect_delete=0i,expect_new=0i,found=1556i,icmp_error=6315i,ignore=1911479058i,insert=0i,insert_failed=8i,invalid=212671651i,new=0i,search_restart=1457459i,searched=0i 1648052724000000000
conntrack,fqdn=host.example.com,host=host ip_conntrack_count=527 1648052905000000000

Currently the stats are collected using the ConntrackStats method. conntrackStatsFromFile or ConntrackStatsWithContext might also be used in the future to provide more detailed metrics.

@telegraf-tiger telegraf-tiger bot removed the waiting for response waiting for response from contributor label Mar 23, 2022

sspaink commented Mar 23, 2022

Thanks for the quick and thorough response!

It seems redundant to prepend stats_ to the configuration options. What do you think of this:

collect = ["all", "percpu"]
I assume it would also support collect = ["percpu"] or collect = ["all"] on their own, right?
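One way such a collect list could be handled is sketched below in Go: each entry toggles one mode, so "all", "percpu", both, or neither are all valid. The `collectOptions` type and `parseCollect` function are assumptions for illustration and not the code that was merged.

```go
package main

import "fmt"

// collectOptions records which statistic modes were requested.
// Hypothetical type for illustration.
type collectOptions struct {
	All    bool // aggregated statistics
	PerCPU bool // detailed statistics with a cpu tag
}

// parseCollect maps the values of the `collect` setting to flags and
// rejects anything it does not recognise. An empty list enables nothing,
// matching the "no stats by default" behaviour described in the PR.
func parseCollect(collect []string) (collectOptions, error) {
	var opts collectOptions
	for _, c := range collect {
		switch c {
		case "all":
			opts.All = true
		case "percpu":
			opts.PerCPU = true
		default:
			return collectOptions{}, fmt.Errorf("unknown collect option %q", c)
		}
	}
	return opts, nil
}

func main() {
	for _, cfg := range [][]string{{"all", "percpu"}, {"percpu"}, {"all"}, {}} {
		opts, err := parseCollect(cfg)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%v -> %+v\n", cfg, opts)
	}
}
```

Rejecting unknown values early means a typo in the config surfaces at startup instead of silently disabling collection.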

I'm also not sure that renaming the input plugin to conntrack_stats would make it clearer, seeing as the stats it gathers are from Netfilter's conntrack-tools, which doesn't mention stats. Although I don't personally use this plugin, how would renaming it help you?


deric commented Mar 23, 2022

@sspaink Just trying to make the config more readable to the end user. Currently the metric naming seems to be unique, so there shouldn't be any name clashes.

The collect keyword in the config is just a suggestion; if you have a better idea (or a naming convention), we can definitely change it.

I guess the number of users of this plugin is very limited. As the README suggests, there's just the ip_conntrack_count metric, while ip_conntrack_max and many others are just constants that might differ between distributions but are not really exciting to observe in real time.

$ tail /proc/sys/net/netfilter/*
==> /proc/sys/net/netfilter/nf_conntrack_acct <==
0
==> /proc/sys/net/netfilter/nf_conntrack_buckets <==
65536
==> /proc/sys/net/netfilter/nf_conntrack_checksum <==
1
==> /proc/sys/net/netfilter/nf_conntrack_count <==
64922
==> /proc/sys/net/netfilter/nf_conntrack_events <==
1
==> /proc/sys/net/netfilter/nf_conntrack_expect_max <==
1024
==> /proc/sys/net/netfilter/nf_conntrack_frag6_high_thresh <==
4194304
==> /proc/sys/net/netfilter/nf_conntrack_frag6_low_thresh <==
3145728
==> /proc/sys/net/netfilter/nf_conntrack_frag6_timeout <==
60
==> /proc/sys/net/netfilter/nf_conntrack_generic_timeout <==
600
==> /proc/sys/net/netfilter/nf_conntrack_helper <==
0
==> /proc/sys/net/netfilter/nf_conntrack_icmp_timeout <==
30
==> /proc/sys/net/netfilter/nf_conntrack_icmpv6_timeout <==
30
==> /proc/sys/net/netfilter/nf_conntrack_log_invalid <==
0
==> /proc/sys/net/netfilter/nf_conntrack_max <==
262144
==> /proc/sys/net/netfilter/nf_conntrack_tcp_be_liberal <==
0
==> /proc/sys/net/netfilter/nf_conntrack_tcp_loose <==
1
==> /proc/sys/net/netfilter/nf_conntrack_tcp_max_retrans <==
3
==> /proc/sys/net/netfilter/nf_conntrack_tcp_timeout_close <==
10
==> /proc/sys/net/netfilter/nf_conntrack_tcp_timeout_close_wait <==
60
==> /proc/sys/net/netfilter/nf_conntrack_tcp_timeout_established <==
432000
==> /proc/sys/net/netfilter/nf_conntrack_tcp_timeout_fin_wait <==
120
==> /proc/sys/net/netfilter/nf_conntrack_tcp_timeout_last_ack <==
30
==> /proc/sys/net/netfilter/nf_conntrack_tcp_timeout_max_retrans <==
300
==> /proc/sys/net/netfilter/nf_conntrack_tcp_timeout_syn_recv <==
60
==> /proc/sys/net/netfilter/nf_conntrack_tcp_timeout_syn_sent <==
120
==> /proc/sys/net/netfilter/nf_conntrack_tcp_timeout_time_wait <==
120
==> /proc/sys/net/netfilter/nf_conntrack_tcp_timeout_unacknowledged <==
300
==> /proc/sys/net/netfilter/nf_conntrack_timestamp <==
0
==> /proc/sys/net/netfilter/nf_conntrack_udp_timeout <==
30
==> /proc/sys/net/netfilter/nf_conntrack_udp_timeout_stream <==
180
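Each of the files listed above holds a single integer, so reading one of them is straightforward. The Go sketch below shows the idea for nf_conntrack_count. The helper names `parseProcValue` and `readProcValue` are made up for this example, and the path only exists on Linux hosts with the nf_conntrack module loaded.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseProcValue converts the raw contents of a single-value /proc file
// (a decimal number followed by a newline) into an integer.
func parseProcValue(raw string) (int64, error) {
	return strconv.ParseInt(strings.TrimSpace(raw), 10, 64)
}

// readProcValue reads a file such as
// /proc/sys/net/netfilter/nf_conntrack_count and parses its content.
func readProcValue(path string) (int64, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return 0, err
	}
	return parseProcValue(string(data))
}

func main() {
	const path = "/proc/sys/net/netfilter/nf_conntrack_count"
	count, err := readProcValue(path)
	if err != nil {
		// Expected on non-Linux systems or when conntrack is not loaded.
		fmt.Println("could not read", path+":", err)
		return
	}
	fmt.Println("nf_conntrack_count =", count)
}
```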

Eventually the conntrack input might be used for monitoring conntrack -L connections. But that might be too verbose. It would require filtering for tcp/udp/icmp, connection state (ESTABLISHED, TIME_WAIT, etc.), source IP, destination IP and ports.

On the other hand, conntrack -S provides just high-level metrics.


sspaink commented Sep 28, 2022

@deric Sorry for not responding for a while. Are you still interested in working on this pull request? Looking it over again, the way it is seems fine to me. Were there any other changes you wanted to make?

@sspaink sspaink added the waiting for response waiting for response from contributor label Sep 28, 2022

deric commented Sep 29, 2022

@sspaink Sorry for the late response. Hopefully all issues are now addressed.

@telegraf-tiger telegraf-tiger bot removed the waiting for response waiting for response from contributor label Sep 29, 2022

@sspaink sspaink left a comment


Thank you for your continued work. There were changes to how the sample configuration is managed, so I requested some changes to add the logic for embedding it into the plugin back in. I also have a concern about error handling.

plugins/inputs/conntrack/conntrack.go (outdated, resolved)
plugins/inputs/conntrack/conntrack.go (outdated, resolved)
plugins/inputs/conntrack/conntrack.go (outdated, resolved)
plugins/inputs/conntrack/conntrack.go (outdated, resolved)
plugins/inputs/conntrack/conntrack.go (outdated, resolved)
@deric deric requested a review from sspaink September 30, 2022 08:25
@sspaink sspaink added the ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review. label Oct 5, 2022

@srebhan srebhan left a comment


Thanks @deric for this nice contribution! I have some comments in addition to those from @sspaink, though only minor ones.

plugins/inputs/conntrack/README.md (outdated, resolved)
plugins/inputs/conntrack/conntrack.go (outdated, resolved)
plugins/inputs/conntrack/conntrack.go (outdated, resolved)
plugins/inputs/conntrack/conntrack.go (outdated, resolved)
plugins/inputs/conntrack/conntrack.go (outdated, resolved)
plugins/inputs/conntrack/conntrack_test.go (outdated, resolved)
plugins/inputs/conntrack/conntrack_test.go (outdated, resolved)
plugins/inputs/conntrack/sample.conf (outdated, resolved)

deric commented Oct 6, 2022

@srebhan Thanks for the comments; it should be fixed now.

@deric deric requested review from srebhan and sspaink and removed request for srebhan October 6, 2022 07:06

telegraf-tiger bot commented Oct 6, 2022


@srebhan srebhan left a comment


Looks good to me! Thanks for your work @deric!

@sspaink sspaink merged commit 0087a5d into influxdata:master Oct 12, 2022
dba-leshop pushed a commit to dba-leshop/telegraf that referenced this pull request Oct 30, 2022
Labels
feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review.

3 participants