Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial support for Mesos master #682

Closed
wants to merge 19 commits into from
Closed

Initial support for Mesos master #682

wants to merge 19 commits into from

Conversation

tripledes
Copy link

This PR adds support for Mesos masters metrics. Let me know if anything should be change/modified.

Last commit could be reverted, although I think it could help avoiding issues with existing users when adding more mesos components.

@tripledes
Copy link
Author

Sorry, missed that issue reported by the test, will fix right away.

}
}

func TestSampleConfig(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you don't need to test this

@sparrc
Copy link
Contributor

sparrc commented Feb 11, 2016

looks good, thanks @tripledes, I'll do a full review tomorrow

@tripledes
Copy link
Author

@sparrc Thanks, I'll remove those tests later today.

@tripledes
Copy link
Author

Hmm seems the test failed but not because of my changes :)

@sparrc
Copy link
Contributor

sparrc commented Feb 12, 2016

Code looks good, but can you write a README based on this template?: https://github.com/influxdata/telegraf/blob/master/plugins/inputs/EXAMPLE_README.md

@tripledes
Copy link
Author

@sparrc Just added the README, test seems to fail because of something else.

@sparrc sparrc removed the Needs Work label Feb 16, 2016
@sparrc
Copy link
Contributor

sparrc commented Feb 16, 2016

@tripledes I pulled down your changes but I'm also getting the config file panic that CircleCI is hitting. I can't see anything in your code that would cause this, still investigating...

@sparrc
Copy link
Contributor

sparrc commented Feb 16, 2016

@tripledes I'm sorry but I really can't figure out what the problem is here, when I create a sample-config file using your code, I get toml parse panic. When I simply delete the inputs.mesos table, I don't get the panic (using your code or the master code)

@sparrc
Copy link
Contributor

sparrc commented Feb 16, 2016

these are basically the steps to reproduce:

  1. telegraf -sample-config > telegraf.conf
  2. telegraf -config telegraf.conf -input-filter cpu:mem -test

@tripledes
Copy link
Author

@sparrc I'll have a look, thanks for taking the time to check.

@tripledes
Copy link
Author

@sparrc not sure about this, but feels like a bug in the TOML parser, could you please give a try to the following config snippet?

# Telegraf plugin for gathering metrics from N Mesos masters
[[inputs.mesos]]
  # Timeout, in ms
  timeout = 100
  # A list of Mesos masters. e.g. master1:5050, master2:5050, etc.
  # The port can be skipped if using the default (5050)
  # Default value is localhost:5050.
  masters = ["localhost:5050"]
  # Metrics groups to be collected.
  master_collections = ["resources"]

As soon as I add another comment on top of master_collections or more elements to it, the parser crashes. There are currently some issues opened against naoina/toml that feel somehow related.

@tripledes
Copy link
Author

@sparrc quick update, seems related to multi-line comments, leaving the sample config to the following, works:

# Telegraf plugin for gathering metrics from N Mesos masters
[[inputs.mesos]]
  # Timeout, in ms.
  timeout = 100
  # Default value is localhost:5050.
  masters = ["localhost:5050"]
  # Default, all enabled.
  master_collections = ["resources","master","system","slaves","frameworks","messages","evqueue","registrar"]

Please, give a try and let me know whether that would be enough for the sample config and I'll modify the method.

Still feels like something funny is going on in the parser.

Cheers.

@sparrc
Copy link
Contributor

sparrc commented Feb 16, 2016

hm, definitely seems like a bug in the parser, I don't understand....there are other plugins with multi-line comments, correct? Why does this seem to only affect yours?

@tripledes
Copy link
Author

Good question :)

So, different combinations:

  • The following ones work:
# Telegraf plugin for gathering metrics from N Mesos masters
[[inputs.mesos]]
  # Timeout, in ms.
  timeout = 100
  # Default value is localhost:5050.
  masters = ["localhost:5050"]
  # Metrics groups to be collected.
  # Default, all enabled.
  master_collections = ["resources","master","system","slaves","frameworks","messages","evqueue","registrar"]
# Telegraf plugin for gathering metrics from N Mesos masters
[[inputs.mesos]]
  # Timeout, in ms.
  timeout = 100
  # The quick brown fox jumps over the lazy dog
  masters = ["localhost:5050"]
  # The quick brown fox jumps over the lazy dog
  master_collections = ["resources","master","system","slaves","frameworks","messages","evqueue","registrar"]
# Telegraf plugin for gathering metrics from N Mesos masters
[[inputs.mesos]]
  # Timeout, in ms.
  timeout = 100
  # The quick brown fox jumps over the lazy dog
  masters = ["localhost:5050"]
  # The quick brown fox jumps over the lazy dog
  # blahblah
  master_collections = ["resources","master","system","slaves","frameworks","messages","evqueue","registrar"]
# Telegraf plugin for gathering metrics from N Mesos masters
[[inputs.mesos]]
  # Timeout, in ms.
  timeout = 100
  # The quick brown fox jumps over the lazy dog
  # blahblah
  masters = ["localhost:5050"]
  # The quick brown fox jumps over the lazy dog
  # blahblah
  master_collections = ["resources","master","system","slaves","frameworks","messages","evqueue","registrar"]
  • The following and any other attempt to add more chars to the comments, don't work:
# Telegraf plugin for gathering metrics from N Mesos masters
[[inputs.mesos]]
  # Timeout, in ms.
  timeout = 100
  # The quick brown fox jumps over the lazy dog
  # blahblaha
  masters = ["localhost:5050"]
  # The quick brown fox jumps over the lazy dog
  # blahblah
  master_collections = ["resources","master","system","slaves","frameworks","messages","evqueue","registrar"]

So still feels like something wrong in the parser...although I'm not 100% what's the combination that triggers it.

@sparrc
Copy link
Contributor

sparrc commented Feb 16, 2016

that's very odd, I wish I could get rid of the naoina toml parser but unfortunately I haven't been able to find anything that has the same level of functionality :(

@tripledes
Copy link
Author

@sparrc so if you agree, I'll shorten the comments and push the change.

Regarding the TOML parser, didn't you guys wanted to fork it? I kind of remember some comments in the influxdb project. I understand your point, but the naoina toml project seems rather dead...it hadn't got any activity for some months.

@sparrc
Copy link
Contributor

sparrc commented Feb 17, 2016

yes please push the change to shorten comments.....I'm not really following the status of our config library, but it certainly seems like naoina is not maintaining the library very actively.

@tripledes
Copy link
Author

I know this makes no sense, but just moving all mesos settings to memcache plugin section and removing the mesos section, makes the parser to crash in the same fashion....

Ok then, I'll push right away.

@sparrc
Copy link
Contributor

sparrc commented Feb 17, 2016

🙈 🔫

@sparrc
Copy link
Contributor

sparrc commented Feb 17, 2016

thanks @tripledes, I'll try to get this merged tomorrow

@@ -184,6 +184,18 @@
# If no servers are specified, then localhost is used as the host.
servers = ["localhost"]

# Telegraf plugin for gathering metrics from N Mesos masters
[[inputs.mesos]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you make sure match the new SampleConfig?

@sparrc
Copy link
Contributor

sparrc commented Feb 18, 2016

thanks @tripledes, just a couple places to make the new SampleConfig consistent

Sergio Jimenez added 2 commits February 18, 2016 09:00
The plugin is able to query a Mesos master and push the metrics, a
blacklist can be configured and a timeout, it's still not used.

Added unit test, might be a good idea to have system test using docker.
Sergio Jimenez added 17 commits February 18, 2016 09:00
The plugin will iterate over the Servers slice and create a goroutine
for each of them.
* Now the user selects what to push instead of what not
* Required to check and improve tests
* Missing checks in the code when MetricsCol is empty
* Defined global var for holding default metric groups
* Refactor removeGroup() to work with the whitelist
* Refactor TestRemoveGroup()
* Use timeout as parameter in the http request
* A bit of cleanup
* More tests
* Fixed import for JSONFlattener{} it's now in parsers, broke after
  rebasing.
* This should help backwards compatibility when adding more features or
  supported Mesos components
* Use it as a paramater for the closure
And on the test configuration file
* It was still using the previous config name
@sparrc sparrc closed this in 963c51f Feb 18, 2016
@sparrc
Copy link
Contributor

sparrc commented Feb 18, 2016

merged, thanks @tripledes

@sparrc sparrc mentioned this pull request Feb 19, 2016
@asdfsx
Copy link

asdfsx commented Feb 22, 2016

excellent job!!! @tripledes

geodimm pushed a commit to miketonks/telegraf that referenced this pull request Mar 10, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants