Skip to content

in_docker: first implementation#790

Closed
ashutoshdhundhara wants to merge 7 commits intofluent:masterfrom
ashutoshdhundhara:in_docker
Closed

in_docker: first implementation#790
ashutoshdhundhara wants to merge 7 commits intofluent:masterfrom
ashutoshdhundhara:in_docker

Conversation

@ashutoshdhundhara
Copy link
Copy Markdown
Contributor

@ashutoshdhundhara ashutoshdhundhara commented Sep 23, 2018

This is for #319.
Just a draft. Also attaching a sample output. Am I moving in right direction?

[0] docker: [1537730279.000958364, {"id"=>"a7e9ae541af46e1", "cpu_used"=>88037091, "mem_used"=>1921024, "mem_limit"=>20971520}]
[1] docker: [1537730279.000966259, {"id"=>"b471533d3e517ce", "cpu_used"=>55996183, "mem_used"=>1417216, "mem_limit"=>10485760}]
[2] docker: [1537730279.000968985, {"id"=>"06d3de15fc7bcad", "cpu_used"=>127468132, "mem_used"=>1343488, "mem_limit"=>10485760}]

Config:

[INPUT]
  Name docker
  Tag docker

@edsiper
Copy link
Copy Markdown
Member

edsiper commented Sep 26, 2018

hi, thanks for your work on this.

would you please specify how it operate and the plans to query metrics per containers ?

@edsiper edsiper self-requested a review September 26, 2018 04:49
@ashutoshdhundhara
Copy link
Copy Markdown
Contributor Author

Hi Eduardo,
This implementation is based on docker's official guide and is right now functional. You can pull this branch and try it out.
Here CPU/Memory metrics per container are fetched from cgroup metric files as follows:

  • CPU Usage: /sys/fs/cgroup/cpu/docker/<docker_id>/cpuacct.usage
  • Memory Usage: /sys/fs/cgroup/memory/docker/<docker_id>/memory.usage_in_bytes
  • Memory Limit: /sys/fs/cgroup/memory/docker/<docker_id>/memory.limit_in_bytes

Configuration is very simple (as mentioned in previous comment) like in_cpu but still there is scope for more.
Right now, these metrics are iteratively read from their respective files. Can this be a performance issue?

@edsiper
Copy link
Copy Markdown
Member

edsiper commented Sep 26, 2018

thanks, this looks definitely very good to be included in the next major version. Let's keep working in the implementation:

  • append container name/id to the record
  • add option to include or exclude containers:
    • containers: list of containers to include, if not set do all
    • _exclude_containers: list of containers to exclude, if not set include all

now from a dev cycle I recommend:

  • work on top of GIT master
  • when doing tests, run under valgrind (I saw some alerts)

@ashutoshdhundhara
Copy link
Copy Markdown
Contributor Author

thanks, this looks definitely very good to be included in the next major version. Let's keep working in the implementation:

Thanks.

* append container name/id to the record

Short id (12 char long) is appended to the record

* add option to include or exclude containers:
  
  * _containers_: list of containers to include, if not set do all
  * _exclude_containers: list of containers to exclude, if not set include all

Done.

[INPUT]
  Name docker
# Include f735b07819a1 4fd301e1e402
  Exclude f735b07819a1 4fd301e1e402
  Tag docker

now from a dev cycle I recommend:

* work on top of GIT master

I created my current branch in_docker from master.

* when doing tests, run under valgrind (I saw some alerts)

Although I tried fixing some alerts by starting the binary using following command:

G_SLICE=always-malloc G_DEBUG=gc-friendly  valgrind -v --tool=memcheck --leak-check=full --num-callers=40 --log-file=valgrind.log fluent-bit -c fluent-bit.conf

Can you please guide my how to run tests with valgrind or point me to some resource?

@edsiper
Copy link
Copy Markdown
Member

edsiper commented Oct 1, 2018

Valgrind can be used to test memory issues, it will slowdown the problem but is good to find problems when developing, I normally do:

valgrind --leak-check=full fluent-bit ....

just make sure that when exiting with ctrl-c there are not problems reported.

@ashutoshdhundhara
Copy link
Copy Markdown
Contributor Author

Valgrind can be used to test memory issues, it will slowdown the problem but is good to find problems when developing, I normally do:

valgrind --leak-check=full fluent-bit ....

just make sure that when exiting with ctrl-c there are not problems reported.

Thanks Eduardo.
Finally I got the message:

==10655== All heap blocks were freed -- no leaks are possible

I have pushed my changes to the same branch.

@edsiper
Copy link
Copy Markdown
Member

edsiper commented Nov 21, 2018

thanks.

I think would be good to have the container name, from a data analysis the container id/hash is not enough

@edsiper edsiper added the waiting-for-user Waiting for more information, tests or requested changes label Nov 21, 2018
@ashutoshdhundhara
Copy link
Copy Markdown
Contributor Author

ashutoshdhundhara commented Nov 26, 2018

thanks.

I think would be good to have the container name, from a data analysis the container id/hash is not enough

I considered this as well, but wasn't able to get container name from any of the cgroup pseudo files or anything similar to this. Although docker stats does exactly what we want. Will now look more deeper into how this command collects metrics. If you have something in your mind in this direction, please suggest.

@ashutoshdhundhara
Copy link
Copy Markdown
Contributor Author

@edsiper I've added container name to the record as well .

user@hostname:~/fluent-bit/build$ sudo valgrind --leak-check=full ./bin/fluent-bit --config=/home/user/confi
g.yaml
==21178== Memcheck, a memory error detector
==21178== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==21178== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==21178== Command: ./bin/fluent-bit --config=/home/user/config.yaml
==21178== 
Fluent-Bit v0.13.0
Copyright (C) Treasure Data
[2019/01/03 16:34:02] [ info] [engine] started
[0] docker.0: [1546533243.095125318, {"id"=>"b2e0d9268755", "name"=>"redis-1", "cpu_used"=>124548762542, "mem_used"=>2043904, "mem_limit"=
>4294963200}]
[1] docker.0: [1546533243.101521258, {"id"=>"be8144fe5cd8", "name"=>"redis-2", "cpu_used"=>124716612528, "mem_used"=>1761280, "mem_limit"=
>4294963200}]
[2] docker.0: [1546533244.001340963, {"id"=>"b2e0d9268755", "name"=>"redis-1", "cpu_used"=>124549692866, "mem_used"=>2043904, "mem_limit"=
>4294963200}]
[3] docker.0: [1546533244.001373850, {"id"=>"be8144fe5cd8", "name"=>"redis-2", "cpu_used"=>124717779757, "mem_used"=>1761280, "mem_limit"=
>4294963200}]
[4] docker.0: [1546533245.001376290, {"id"=>"b2e0d9268755", "name"=>"redis-1", "cpu_used"=>124550795330, "mem_used"=>2043904, "mem_limit"=
>4294963200}]
[5] docker.0: [1546533245.001409345, {"id"=>"be8144fe5cd8", "name"=>"redis-2", "cpu_used"=>124718936079, "mem_used"=>1761280, "mem_limit"=
>4294963200}]
[6] docker.0: [1546533246.005077304, {"id"=>"b2e0d9268755", "name"=>"redis-1", "cpu_used"=>124552048887, "mem_used"=>2043904, "mem_limit"=
>4294963200}]
[7] docker.0: [1546533246.005109867, {"id"=>"be8144fe5cd8", "name"=>"redis-2", "cpu_used"=>124719879755, "mem_used"=>1761280, "mem_limit"=
>4294963200}]
^C[engine] caught signal
[2019/01/03 16:34:11] [ info] [input] pausing docker.0
==21178== 
==21178== HEAP SUMMARY:
==21178==     in use at exit: 0 bytes in 0 blocks
==21178==   total heap usage: 527 allocs, 527 frees, 962,633 bytes allocated
==21178== 
==21178== All heap blocks were freed -- no leaks are possible
==21178== 
==21178== For counts of detected and suppressed errors, rerun with: -v
==21178== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

@edsiper
Copy link
Copy Markdown
Member

edsiper commented Jan 18, 2019

thanks. To start merging this we need a few changes:

Signed-off-by: Ashutosh Dhundhara <ashutoshdhundhara@yahoo.com>
Signed-off-by: Ashutosh Dhundhara <ashutoshdhundhara@yahoo.com>
Signed-off-by: Ashutosh Dhundhara <ashutoshdhundhara@yahoo.com>
Signed-off-by: Ashutosh Dhundhara <ashutoshdhundhara@yahoo.com>
Signed-off-by: Ashutosh Dhundhara <ashutoshdhundhara@yahoo.com>
@ashutoshdhundhara ashutoshdhundhara force-pushed the in_docker branch 3 times, most recently from 4be294a to 2285386 Compare January 19, 2019 17:05
Signed-off-by: Ashutosh Dhundhara <ashutoshdhundhara@yahoo.com>
@ashutoshdhundhara
Copy link
Copy Markdown
Contributor Author

thanks. To start merging this we need a few changes:

@edsiper I have fixed these. But my tests are failing because no dockers are running on build machine and no dockers means no output. How should I handle this?

Signed-off-by: Ashutosh Dhundhara <ashutoshdhundhara@yahoo.com>
@edsiper
Copy link
Copy Markdown
Member

edsiper commented Feb 19, 2019

can you emulate some running dockers through file system static files ? e,g: tests/runtime/data/...

@ashutoshdhundhara
Copy link
Copy Markdown
Contributor Author

can you emulate some running dockers through file system static files ? e,g: tests/runtime/data/...

There are certain paths which are defined as macros in in_docker.h. How should I change them to those of test data directories when running the test suite? If I dynamically create dummy files without changing the actual paths before running the tests, then we will have to launch the make test command with sudo.

@edsiper
Copy link
Copy Markdown
Member

edsiper commented Mar 26, 2019

@ashutoshdhundhara

I reviewed the latest changes and looks good, the only thing we need to show in a different way is the CPU metrics, ideally in % from 0-100 instead of the current value.

@edsiper
Copy link
Copy Markdown
Member

edsiper commented Apr 3, 2019

ping.

note: please also rebase on top of master.

@ashutoshdhundhara
Copy link
Copy Markdown
Contributor Author

Showing percentage from 0-100% makes sense instead of current value. I will need to change the logic a bit. Will update once done.

@hanoch
Copy link
Copy Markdown

hanoch commented Aug 11, 2019

@ashutoshdhundhara @edsiper what is the status of this PR?

@edsiper
Copy link
Copy Markdown
Member

edsiper commented Sep 2, 2019

waiting for the latest changes requested.

@nigels-com
Copy link
Copy Markdown
Contributor

Unfortunately this branch is non-trival (but not impossible) to rebase onto master.

One thing that has changed in the meanwhile is the flb_config_prop migration to the new kv interface:

commit 5a0f8a303161525f4fff3e22f8a6c462a8f78e6c
Author: Eduardo Silva <eduardo@treasure-data.com>
Date:   Mon Aug 5 17:27:11 2019 -0600

    input: use new kv interface to handle properties

It looks like a bit of work to have this suitable for merging to the mainline.

edsiper pushed a commit that referenced this pull request Sep 24, 2019
Signed-off-by: Ashutosh Dhundhara <ashutoshdhundhara@yahoo.com>
Signed-off-by: Eduardo Silva <eduardo@treasure-data.com>
@edsiper
Copy link
Copy Markdown
Member

edsiper commented Sep 24, 2019

@ashutoshdhundhara

first of all my apologies for the very long delay on this. I've just merged your new plugin and this one will be part of v1.3 release (this week), I've made small changes to adjust to the new internal API and now is fully compatible.

I've merged this plugin manually through d19be0d (you remain the author of the plugin in the commit). Since the plugin is ready I am closing this PR.

thanks for your contribution!!!

@edsiper
Copy link
Copy Markdown
Member

edsiper commented Sep 24, 2019

Plugin merged manually, closing PR.

@Reevzee
Copy link
Copy Markdown

Reevzee commented Mar 2, 2021

I'm not able to retrieve the metrics using this input whilst running fluentbit agent as a docker container, should i be mounting to the cgroup path for this to work ?

@Reevzee
Copy link
Copy Markdown

Reevzee commented Mar 2, 2021

I'm not able to retrieve the metrics using this input whilst running fluentbit agent as a docker container, should i be mounting to the cgroup path for this to work ?

"I'm currently mounting path to /sys/fs/cgroup/ to get this working in a docker container"

rawahars pushed a commit to rawahars/fluent-bit that referenced this pull request Oct 24, 2022
* out_http: remove successful_response_code

Signed-off-by: Takahiro Yamashita <nokute78@gmail.com>

* in_http: add successful_response_code

Signed-off-by: Takahiro Yamashita <nokute78@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

waiting-for-user Waiting for more information, tests or requested changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants