integrate go.d into netdata #5006

Closed
ilyam8 opened this issue Dec 17, 2018 · 38 comments · Fixed by #5199
Labels
area/collectors Everything related to data collection area/packaging Packaging and operating systems support collectors/go.d feature request New features priority/high Super important issue

@ilyam8
Member

ilyam8 commented Dec 17, 2018

Summary

I see we have a Go orchestrator.

It lives in its own repo. We need to find a way to integrate it into netdata.

@paulfantom said we have 3 options for how to do it.

@netdatabot netdatabot added needs triage Issues which need to be manually labelled no changelog Issues which are not going to be added to changelog labels Dec 17, 2018
@paulfantom paulfantom added area/packaging Packaging and operating systems support feature request New features area/external priority/high Super important issue and removed needs triage Issues which need to be manually labelled no changelog Issues which are not going to be added to changelog labels Dec 18, 2018
@paulfantom
Contributor

paulfantom commented Dec 18, 2018

Three options:

move code into netdata/netdata

Pros:

  • simple to do
  • simple packaging
  • nothing new
  • simple documentation creation

Cons:

  • only one CI pipeline instead of separate specific ones
  • lower development velocity
  • external plugins aren't really external

git submodule

Use a concept of git submodules for vendoring go.d.plugin code into netdata repo.

Pros:

  • external repo
  • showing a path for including other external plugins (ex. java)
  • creating a path to move other external plugins out of tree
  • CI systems can be domain specific

Cons:

  • people don't like managing git submodules
  • repo synchronization

Releases

Do separate releases of go.d.plugin

Pros:

  • same as with git submodules
  • super simple split packaging

Cons:

  • we need to have a version compatibility matrix or lock development cycles

@paulfantom
Contributor

Also everything from #4129 is relevant to this issue.

@cakrit
Contributor

cakrit commented Dec 18, 2018

Additional cons for separate releases:

  • Complexity for the users (install/update separately, another place to look for info).
  • Additional package maintenance.
  • Separate documentation?

I'm personally leaning towards having it in netdata/netdata, with the submodule as my second choice. Can you explain the 'repo synchronization' issue a bit more?

@ilyam8 which ones are your favorites?

@Ferroin
Member

Ferroin commented Dec 18, 2018

@cakrit A Git submodule works by specifying an exact remote commit hash which is checked out when the submodule is initialized. This has a whole lot of nasty implications, with the biggest two being that you can't track the top commit of a branch in the submodule (including not being able to track the tip of the default branch), and once you commit an update in the main repository to what commit is tracked from the submodule, you can't rebase that commit or anything before it in the submodule's repo or everything falls apart.

Personally, I'm also in favor of the combined repo, it's a bit more complicated for CI, but it significantly simplifies things for users.

@ilyam8
Member Author

ilyam8 commented Dec 18, 2018

which ones are your favorites?

I want go.d to remain a separate repo. External plugins should be external; if I remember right, that is the direction we chose.

@Ferroin
Member

Ferroin commented Dec 18, 2018

I would very much suggest that we look into versioning the external plugin API, as that will greatly simplify the version compatibility issues (Netdata would then be easily able to tell people that the plugin version is bad, instead of just randomly failing).

@paulfantom
Contributor

Personally, I am in favor of doing separate releases.

@Ferroin
Member

Ferroin commented Dec 20, 2018

In light of the number of issues being opened about moving stuff to go.d, I would very much suggest that we decide on this and then figure out how we're going to handle the transition, especially considering that what's probably one of the most widely used python.d collectors is getting switched over to go.d (namely, the web_log plugin). If we screw this up, it's likely to alienate a lot of users.

If we go with merging go.d into the repo, it becomes a lot easier to ensure that stuff keeps working for end users across the upgrade. If not, we need to have some way to get this pulled in and handled by upgrades until we get full packaging working properly (and we need an easy way to bundle it into the build for people who are just building themselves).

@ktsaou, your opinion would be helpful here.

@ktsaou
Member

ktsaou commented Dec 30, 2018

I think the technical solution is not that important. We can pick whichever solution is most convenient, provided that:

  1. kickstart.sh will continue to work the way it does, including all official plugins
  2. kickstart-static64.sh will also install everything (so makeself has to be adapted too)
  3. deb, rpm, etc packages can be built, for the whole thing (not separate packages)
  4. netdata-installer.sh will manage to install the whole thing - keep in mind there are 2 use-cases here: with git, without git.

So, I agree with @Ferroin that the key here is the user experience. If you can make it simple for users, any solution is acceptable.

If you can't make it simple for users, then I suggest to use the solution that is simpler for them.

There is also another alternative: merge everything in now (the currently simplest for users), and split it when we will have our own packaging (it will be the simplest for users at that time).

@Wing924
Contributor

Wing924 commented Dec 31, 2018

If we move the Go plugin into the netdata repo, users need Go to compile the whole project.
We can expect users to have gcc since it’s the default compiler on Linux, but how can we expect them to have Go?
I think providing a single prebuilt Go binary and downloading it from the kickstart script is better.

@ktsaou
Member

ktsaou commented Dec 31, 2018

I think providing a single prebuilt Go binary and downloading it from the kickstart script is better.

This is a nice idea too.

@Wing924
Contributor

Wing924 commented Dec 31, 2018

In addition, Go relies on git commits and tags for versioning.
Because Go compiles everything into a single binary, anyone who wants to add a custom module needs to fork the plugin or use it as a library. In this case, git tags and commits become very important.

Enforcing netdata’s git tags may not fit the go plugin’s development cycle.

@paulfantom
Contributor

I think providing a single prebuilt Go binary and downloading it from the kickstart script is better.

Enforcing netdata’s git tags may not fit the go plugin’s development cycle.

Those are solved when doing separate releases :)

@paulfantom
Contributor

I think the technical solution is not that important.

Fully disagree. In this case technical solution directly translates to user experience. If we have two separate repos, then:

  • we also have two separate support channels
  • we show how a user can maintain their own external plugin without needing to fork the netdata repo and keep it up to date

@Ferroin
Member

Ferroin commented Jan 2, 2019

I think providing a single prebuilt Go binary and downloading it from the kickstart script is better.

Enforcing netdata’s git tags may not fit the go plugin’s development cycle.

Those are solved when doing separate releases :)

No, only the tagging issue is. If go.d is still an official plugin, then it should get installed as part of an install with kickstart.sh. So, we need to:

  • Detect if there is a version of Go on the system.
  • Have an option to outright disable building go.d, even if we could build it.
  • Either find some way to support building on systems which don't have a system version of Go (isolated local build will probably be better for this than a pre-built bundle from upstream), or include a warning in kickstart.sh that Go is not installed.
  • Set up something to poke the user to rebuild if the locally installed version of Go changes (this is actually kind of important, Go updates have had ABI incompatibilities before).
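The checks in the list above could be sketched roughly as follows. This is only an illustration in Python, not the installer's actual logic: the function names, the `disable_go` option and the three outcome strings are all hypothetical.

```python
import shutil
from typing import Callable, Optional


def decide_go_build(
    disable_go: bool,
    find_executable: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Decide what a hypothetical installer should do about go.d.

    "skip"  - the user explicitly disabled the go.d build
    "build" - a `go` binary is on PATH, so a local build is possible
    "warn"  - no Go toolchain was found; warn instead of failing
    """
    if disable_go:
        return "skip"
    if find_executable("go") is not None:
        return "build"
    return "warn"


def needs_rebuild(built_with: Optional[str], current: Optional[str]) -> bool:
    """True when go.d was built with a different Go version than the one installed now."""
    return built_with is not None and current is not None and built_with != current


if __name__ == "__main__":
    # The result depends on whether Go is installed on this machine.
    print(decide_go_build(disable_go=False))
```

Passing the lookup function as a parameter keeps the decision testable on machines with or without Go.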

@ktsaou
Member

ktsaou commented Jan 3, 2019

I think the technical solution is not that important.

Fully disagree. In this case technical solution directly translates to user experience.

This is what I said too. User experience is the important factor. Not the technical solution.

  • we also have two separate support channels

This is not bad. Actually I would love if we could have more, so that focus communities can be built.

  • we show how a user can maintain their own external plugin without needing to fork the netdata repo and keep it up to date

Since Go is a compiled language, it would also be nice if our Go installer could fetch third party plugins into it. This means we could split go.d.plugin even more (multiple repos).

But let's solve the most basic problem first: how to have go.d.plugin be installed with netdata...

@Wing924
Contributor

Wing924 commented Jan 3, 2019

installing and compiling are very easy. please check this dockerfile.
https://github.com/netdata/go.d.plugin/blob/master/Dockerfile.dev

@paulfantom paulfantom added this to the v1.12 milestone Jan 3, 2019
@Ferroin
Member

Ferroin commented Jan 3, 2019

@Wing924 The issue isn't any difficulty in installing and compiling, it's getting it integrated with the regular Netdata installs, which is kind of a prerequisite for actually switching anything it supports to using it instead of the existing (usually Python) modules. Then there's also the fact that it has to be rebuilt to include any extra modules you might want that are written in Go.

@ilyam8
Member Author

ilyam8 commented Jan 3, 2019

I think option 3 from @paulfantom post is very nice and a way to go.

Then there's also the fact that it has to be rebuilt to include any extra modules you might want that are written in Go.

If you are talking about custom modules: there is no official custom module support for now. If someone needs a custom module, they should use Python.

@Ferroin
Member

Ferroin commented Jan 3, 2019

I think option 3 from @paulfantom post is very nice and a way to go.

I still contend that if we are going to do that we need to:

  • Version the plugin API and have a handshake so that Netdata can figure out what API version the plugin wants. This will let us give actually useful error messages to users if the plugin doesn't work correctly, instead of some potentially cryptic gibberish.
  • Have a way for Netdata to query the plugin's version (because if we're doing separate versioning, we need to include the plugin version in the version info displayed by Netdata).
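To make the handshake idea concrete, here is a minimal sketch in Python. It assumes the plugin reports its own version and a single integer API version at startup; no such handshake exists in the external plugin API discussed here, so `SUPPORTED_API`, `check_plugin_api` and the message wording are hypothetical.

```python
from typing import Tuple

# Hypothetical inclusive range of external plugin API versions
# that this Netdata build accepts.
SUPPORTED_API: Tuple[int, int] = (1, 2)


def check_plugin_api(
    plugin_name: str,
    plugin_version: str,
    wanted_api: int,
    supported: Tuple[int, int] = SUPPORTED_API,
) -> Tuple[bool, str]:
    """Return (accepted, message) for a plugin announcing the API version it wants."""
    low, high = supported
    if low <= wanted_api <= high:
        return True, f"{plugin_name} {plugin_version}: plugin API v{wanted_api} accepted"
    return False, (
        f"{plugin_name} {plugin_version} wants plugin API v{wanted_api}, "
        f"but this Netdata supports v{low} to v{high}"
    )


if __name__ == "__main__":
    ok, message = check_plugin_api("go.d.plugin", "0.1.0", wanted_api=3)
    print(ok, message)
```

The point of the sketch is the second return value: a mismatch produces a message naming both versions instead of an unexplained failure.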

@ilyam8
Member Author

ilyam8 commented Jan 3, 2019

Version the plugin API

External plugin API ?

@Ferroin
Member

Ferroin commented Jan 3, 2019

Version the plugin API

External plugin API ?

Yes, the external plugin API for Netdata itself. My reasoning on this is about the same as versioning a REST API, it lets you check that things behave the way you expect them to, makes sure you know what you can actually use, and ensures that you can complain in a useful manner if those requirements aren't met.

@paulfantom
Contributor

We agreed not to include go.d.plugin in the v1.12 release, mostly due to problems with configuration. Just after cutting the v1.12 release I will create a PR that allows go.d.plugin to be installed optionally.

How it will work:

On go.d.plugin side:

  • go.d.plugin will be separate repo with separate CI and release schedule
  • each release will consist of the following artifacts:
    • one tarball with configuration files
    • multiple pre-built binary files for different CPU architectures
    • sha256sum.txt file with checksums

Steps on installer side:

  • The installer will contain a map of SHA-256 checksums for a particular go.d.plugin release. We won't use latest, so installations stay reproducible.
  • Installation will execute these steps:
    1. detect host CPU architecture
    2. calculate checksum for already installed go.d.plugin binary (only when upgrading)
    3. compare calculated checksums with embedded ones and do nothing when they match (only when upgrading)
    4. if the checksum cannot be calculated or differs from the one in the sha256sum.txt file, then download the new binary and config files
    5. compare checksums of downloaded files with the ones embedded into the installer
    6. unpack tarball with configuration files and replace old ones in /usr... (respecting prefixed installations)
    7. replace go.d.plugin with a new one

All checksum embedding and comparison is needed to be sure we are downloading what we need and when we need it. I am considering extending it with GPG in the future to increase security even more.
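The checksum-driven part of these steps (1 to 5) can be sketched as below. The real installer is a shell script, so this Python version only illustrates the decision logic; the architecture mapping and all function names are assumptions, and downloading and unpacking are left out.

```python
import hashlib
import platform
from pathlib import Path
from typing import Optional

# Assumed mapping from platform.machine() values to release artifact suffixes.
ARCH_ALIASES = {"x86_64": "amd64", "amd64": "amd64", "aarch64": "arm64", "arm64": "arm64"}


def normalize_arch(machine: Optional[str] = None) -> str:
    """Step 1: map the host CPU architecture to an artifact suffix."""
    key = (machine or platform.machine()).lower()
    if key not in ARCH_ALIASES:
        raise ValueError(f"no pre-built binary known for architecture {key!r}")
    return ARCH_ALIASES[key]


def sha256_of(path: Path) -> Optional[str]:
    """Hex SHA-256 of a file, or None if it cannot be read (e.g. first install)."""
    try:
        digest = hashlib.sha256()
        with path.open("rb") as handle:
            for chunk in iter(lambda: handle.read(65536), b""):
                digest.update(chunk)
        return digest.hexdigest()
    except OSError:
        return None


def needs_download(installed: Path, pinned_sha256: str) -> bool:
    """Steps 2-4: download unless the installed binary already matches the pin."""
    return sha256_of(installed) != pinned_sha256


def verify_download(downloaded: Path, pinned_sha256: str) -> bool:
    """Step 5: accept a download only if it matches the checksum embedded in the installer."""
    return sha256_of(downloaded) == pinned_sha256
```

Because an unreadable file yields `None`, a fresh install and a stale binary both take the same download path.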

Installation conditionality

Due to problems with configuration handling we will include go.d.plugin as an optional installation, disabled by default until #5144 is done. Until then the above mechanism will run only when the --enable-go flag is passed to the installer script. After we have a translator we will switch to using go.d.plugin by default and the --enable-go flag will be removed.

Translator

We want to kill two birds with one stone. To do this go.d.plugin will store its configuration files in conf.d directory which is breaking previous schema (old schema would store config in go.d directory). Translator will populate /etc/netdata/conf.d with configuration from /etc/netdata/python.d so we can have a clean transition. Later we might want to include new configuration schema mechanism into python.d.plugin and remove old one.

@ktsaou
Member

ktsaou commented Jan 10, 2019

As we discussed, there are 3 cases about go.d.plugin modules:

  1. new modules, that do not overlap with other netdata modules. These modules will be immediately used when go.d.plugin is installed with netdata.

  2. overlapping modules, that do not have any user configuration. These modules could also be used immediately when go.d.plugin is installed with netdata.

  3. overlapping modules that have user configuration in python.d. These configurations need to be migrated to enable them in go.d.plugin.

I understand that 1 and 2 are pretty simple and the initial integration of go.d.plugin in netdata could take care of them.

Then, for 3 we need a configuration converter.
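As a rough illustration of the three cases, the classification could look like the Python sketch below. It assumes the installer can list the python.d module names and the ones a user has configured; `ModuleCase` and `classify_module` are hypothetical names, and `only_in_go` is a made-up module used just for the example.

```python
from enum import Enum
from typing import Set


class ModuleCase(Enum):
    NEW = 1                  # case 1: no python.d counterpart
    OVERLAP_NO_CONFIG = 2    # case 2: counterpart exists, user never configured it
    OVERLAP_USER_CONFIG = 3  # case 3: counterpart exists and has user configuration


def classify_module(
    name: str,
    python_modules: Set[str],
    user_configured: Set[str],
) -> ModuleCase:
    """Classify a go.d module against the python.d modules and user configs found on disk."""
    if name not in python_modules:
        return ModuleCase.NEW
    if name not in user_configured:
        return ModuleCase.OVERLAP_NO_CONFIG
    return ModuleCase.OVERLAP_USER_CONFIG


if __name__ == "__main__":
    python_modules = {"web_log"}
    user_configured = {"web_log"}
    for module in ("only_in_go", "web_log"):
        print(module, classify_module(module, python_modules, user_configured).name)
```

Only modules in the third case would need the configuration converter.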

@ilyam8
Member Author

ilyam8 commented Jan 10, 2019

overlapping modules, that do not have any user configuration. These modules could also be used immediately when go.d.plugin is installed with netdata.

Not so simple. Python will use stock configuration in that case.

@ktsaou
Member

ktsaou commented Jan 10, 2019

Not so simple. Python will use stock configuration in that case.

I guess the whole idea is to find a way to selectively disable python.d.plugin modules and enable the corresponding go.d.plugin modules.

Can we avoid this?

@ktsaou
Member

ktsaou commented Jan 10, 2019

We want to kill two birds with one stone. To do this go.d.plugin will store its configuration files in conf.d directory which is breaking previous schema (old schema would store config in go.d directory). Translator will populate /etc/netdata/conf.d with configuration from /etc/netdata/python.d so we can have a clean transition. Later we might want to include new configuration schema mechanism into python.d.plugin and remove old one.

I think we should stick to /etc/netdata/go.d for go.d.plugin config files, until we have the configuration manager we discussed.

@ilyam8
Member Author

ilyam8 commented Jan 10, 2019

Maybe this flow:
Global announcement: netdata is migrating from python to go.

1.12:

  • go.d.plugin with new modules
  • announce that we have a group of python modules rewritten in go; they are disabled by default, but ready to use.
  • announce that this group will be disabled in 2 months (1.13)

1.13:

  • go.d.plugin with new modules
  • disable the python group from 1.12
  • announce that we have a group of python modules rewritten in go; they are disabled by default, but ready to use.
  • announce that this group will be disabled in 2 months (1.14)

In that case we don't need any translators. It's clear to me.

I understand that there will always be some users who didn't read the announcement. But they still get 2 months.

@Ferroin
Member

Ferroin commented Jan 10, 2019

I think the approach proposed by @ilyam8 is probably the best way to go, but it should be kept in mind that we can't realistically convert everything quickly. We should probably also continue to accept new modules written in Python unless we want to potentially exclude a sizable number of developers from contributing.

@ktsaou
Member

ktsaou commented Jan 10, 2019

@ilyam8 I am very sorry. This makes netdata significantly worse.

I can only accept solutions that will improve user experience.

@Ferroin
Member

Ferroin commented Jan 10, 2019

@ktsaou Some feedback on how you feel it makes things worse would be good. Given that the plan (at least, as I understood it) has been to migrate stuff from python.d to go.d, this seems to be the most logical approach to integrating the go.d plugin with Netdata while giving users adequate warning that things are changing.

@ktsaou
Member

ktsaou commented Jan 10, 2019

@Ferroin the current proposal means that all users that have configured netdata python.d.plugin data collection plugins:

  1. will suddenly one day stop collecting metrics from these plugins
  2. will have to migrate their configuration to go.d.plugin by hand

To my understanding the above can be avoided:

  • job configuration for python and go could be the same.

  • if the new go.d.plugin job configs are so much better and such a breaking change is really required, a migration tool should be provided. Actually developing such a migration tool could be simpler than writing a complicated migration guide that will explain everything field by field.

  • if the new go.d.plugin job configs are so much better, then we should also migrate python.d.plugin to them. There is no point to have 2 different ways for configuring the same data collection jobs.

Therefore, I understand that the current proposal will unnecessarily complicate things, requiring from our users to manually re-configure data collection. A lot of wasted effort.

It also assumes that netdata will forever support 2 different ways for configuring the same thing: the python way and the go way.

If you check the netdata home page and README.md, netdata is about making things simpler. Zero configuration, immediate results, a system that offers its users the ability to focus on their monitoring work, not on the internals of the tool, etc.

So, to my understanding, the proposed way increases configuration complexity unnecessarily.

@Ferroin
Member

Ferroin commented Jan 10, 2019

I see no reason that what @ilyam8 proposed couldn't be combined with proper configuration migration/management.

Overall, the transition should ideally look similar to how the conversion of the cpufreq and cpuidle modules to C went (IOW, when the switch happens, the python.d modules don't run, and the go.d modules use the same name as the python.d modules), albeit with config migration, such that a user who doesn't pay attention to the change log notices no difference unless they go and try to reconfigure the python.d module, but still gets the benefits of the new go.d code.

The only problematic part I see is handling of systems which don't have Go installed, but that shouldn't be too hard if we just keep the python.d modules around and compute the list of which ones to not run during the build (so for systems without Go, the Python code continues to be used).

@ilyam8
Member Author

ilyam8 commented Jan 10, 2019

@ktsaou understood.

Here is another suggestion:

We are not in a hurry with replacing python modules with go. It's very nice and all but not high priority.

I suggest we stick to new modules. As you all know, we have a veeery long list of modules to implement.

We will:

  • add new modules to go.d (main task)
  • rewrite python to go (background task)

We will wait for the Central Configuration thing; I really like it, it is very nice! It will handle migration.

The migration process is too vague for now; I suggest postponing it. I see no reason to hurry.

All overlapping modules will be disabled by default.

@ilyam8
Member Author

ilyam8 commented Jan 10, 2019

if we just keep the python.d modules around and compute the list of which ones to not run during the build (so for systems without Go, the Python code continues to be used).

Central Configuration Service is responsible for that.

@ilyam8
Member Author

ilyam8 commented Jan 10, 2019

All overlapping modules will be disabled by default.

Don't worry. They will be tested by community. I am 100% sure a lot of (advanced) users will switch to go version.

@ilyam8
Member Author

ilyam8 commented Jan 11, 2019

It also assumes that netdata will forever support 2 different ways for configuring the same thing: the python way and the go way.

Basically it's one way: multi-job yaml configuration.
The main differences:

  • python and go have different options for modules. E.g. python uses scheme, host, port; go uses url. That is OK and we should allow it.
  • go jobs are a list of jobs; python jobs are an (ordered) dict of jobs.
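To illustrate both differences at once, here is a small Python sketch that turns a python-style mapping of jobs into a go-style list of jobs. Only the `scheme`, `host`, `port` and `url` options come from the discussion; the `name` key, the defaults and the `update_every` option in the example are assumptions made for illustration.

```python
from typing import Any, Dict, List


def python_jobs_to_go(jobs: Dict[str, Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Convert a python-style dict of jobs into a go-style list of jobs."""
    converted = []
    for name, options in jobs.items():
        # Options other than scheme/host/port are copied through unchanged.
        rest = {k: v for k, v in options.items() if k not in ("scheme", "host", "port")}
        # Assumed defaults for jobs that omit scheme or host.
        url = f"{options.get('scheme', 'http')}://{options.get('host', 'localhost')}"
        if "port" in options:
            url += f":{options['port']}"
        converted.append({"name": name, "url": url, **rest})
    return converted


if __name__ == "__main__":
    python_style = {
        "local": {"scheme": "http", "host": "127.0.0.1", "port": 8080, "update_every": 5},
    }
    print(python_jobs_to_go(python_style))
```

Going the other way (splitting a `url` back into three options) is equally mechanical, which is why a converter looks feasible for file-based configs.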

We don't need universal config, we need universal output format. Example: json.

Central Configuration Service gathers jobs config from different places and returns list of jobs in json to go.d, python.d, whatever.d.

We don't need one configuration file for all plugins, because if we have 1 config file we have 2 options:

  • mix different options in one file (like if you use python please use scheme, host and port, if you use go please use url) - ugly.
  • force all plugins to use same options (rewrite python) - not flexible and ... i see no reason to do it.

That is how i see Central Configuration Service. It is a collection of plugins.

It has input plugins (service discovery plugins). Python file reader reads python files, tags and formats them and waits for request from python.d.plugin, same goes for go.d.

And CCS will decide (somehow) which plugin for specific module will take precedence over other.
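A very rough Python sketch of that idea follows. Nothing like this exists yet, so every name is hypothetical, and the precedence rule (a fixed ordered list of plugins) is just one possible answer to the "somehow" above.

```python
import json
from typing import Callable, Dict, Iterable, List

Job = Dict[str, str]
Reader = Callable[[], Iterable[Job]]


class CentralConfig:
    """Hypothetical service: input plugins feed it jobs, orchestrators ask for theirs."""

    def __init__(self, precedence: List[str]) -> None:
        self._precedence = precedence  # earlier entries win, e.g. ["go.d", "python.d"]
        self._readers: List[Reader] = []

    def add_reader(self, reader: Reader) -> None:
        """Register an input plugin (file reader, service discovery, ...)."""
        self._readers.append(reader)

    def _rank(self, plugin: str) -> int:
        try:
            return self._precedence.index(plugin)
        except ValueError:
            return len(self._precedence)  # unknown plugins rank last

    def jobs_for(self, plugin: str) -> str:
        """Return, as JSON, the jobs `plugin` should run after resolving overlaps."""
        all_jobs = [job for reader in self._readers for job in reader()]
        winners: Dict[str, str] = {}
        for job in all_jobs:
            current = winners.get(job["module"])
            if current is None or self._rank(job["plugin"]) < self._rank(current):
                winners[job["module"]] = job["plugin"]
        selected = [
            job for job in all_jobs
            if job["plugin"] == plugin and winners[job["module"]] == plugin
        ]
        return json.dumps(selected)
```

Each job here carries a `plugin` tag and a `module` name, standing in for the tagging step described above.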


EDIT:

python and go have different options for modules. E.g. python uses scheme, host, port; go uses url. That is OK and we should allow it.

Ok, this is still a problem when we have another input source (ex. consul, not file) 😢

@cakrit
Contributor

cakrit commented Jan 11, 2019

As discussed this morning, the first step is clear: we will include the go.d plugin with the modules it has in common with python disabled by default.
To plan the next step, @ilyam8 will have on Monday a draft documentation of what is included in step 1 and what users can do for each common plugin to migrate manually from python to go.d. This document will let us identify all the difficulties, so we can plan the next step.

We also discussed the introduction of the central configuration as a breaking change, but we agreed that we have a more general issue with breaking changes, that doesn't make them feasible in the short term: We would first need to ensure that we have a communication channel that guarantees we reach the vast majority of our user base. Without any such guarantee, breaking changes are off the table.

We will continue the discussion on the next step for the go.d plugin after we have the data from the document, but the discussion on the comm. channel and breaking changes will proceed independently.


7 participants