RFC: Move modules to separate repos #4129

Closed · paulfantom opened this issue Sep 7, 2018 · 24 comments
Labels: discussion · feature request (New features) · priority/low (a 'nice to have', not critical issue)

@paulfantom (Contributor)

Netdata's architecture allows decoupling the main program from plugins/modules. We could keep "less popular" modules (example: #4008) in a separate repository with a more relaxed QA process, and the main program here. We could even run it as a "community" project with governance similar to Vox Populi.

This should increase product quality and would let us focus on the main program instead of supporting existing modules and creating new ones.

@paulfantom paulfantom changed the title [RFC] Decouple plugins to separate repo [RFC] Move "less popular" modules to separate repo Sep 7, 2018
@paulfantom (Contributor, Author)

To clarify, I am talking only about app modules (like mysql.chart.py, postgres.chart.py, etc.), not about plugins (like the Python one, the Node.js one, etc.).

@paulfantom (Contributor, Author)

We can keep the more popular modules here and measure "popularity" by the number of issues reported about a module.

@Ferroin (Member) commented Sep 7, 2018

At a minimum, it would be nice to have better-defined infrastructure for handling third-party modules. Something along the lines of how Python handles third-party libraries comes to mind (all the shipped modules go in the main directory, then there's a sub-directory for third-party and user-created stuff). Aside from making it a bit easier for users to sanely manage their own local modules, it would also make development a bit smoother (instead of having to reinstall a patched version to test things, you could just drop the module into the third-party directory, and then remove it once you know it works).
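A minimal sketch of that layout, assuming the usual python.d install path; the third_party/ directory name and the idea of loading it alongside the shipped modules are hypothetical:

```
/usr/libexec/netdata/python.d/              # shipped modules, managed by the installer
/usr/libexec/netdata/python.d/third_party/  # user/third-party modules, loaded the same way
```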

I'm not sure about measuring popularity based on issues though. Just because a plugin is popular doesn't mean it will have a lot of open issues (especially if it's well written), and just because it has a lot of issues on GitHub doesn't mean it's popular (it could be very poorly written, but used only by a rather vocal minority).

@paulfantom (Contributor, Author)

> I'm not sure about measuring popularity based on issues though.

I am open to other methods of measuring it; any suggestions?

@paulfantom (Contributor, Author)

As for the management of modules, I am OK with that as long as we don't come up with some new "content/package/repository management system".

@ccremer (Contributor) commented Sep 7, 2018

Who would have access to the repos? Only the original authors? Or would they go into the github.com/firehol namespace?
If the modules go into author-owned repos, quality could just as well move in the opposite direction. Let's imagine a simple method rename in the Python UrlService. Every "official" Python module that uses this service could be adapted to that refactor, even if the original author is unavailable; in other words, limited support is still possible. This is today's state.

If the modules go into completely author-owned repos, they will probably not adapt to such changes in time, meaning they stop working. If the author is unavailable, they may even go stale. Unless there is a ton of forking...

Naturally, custom, non-contributed modules stop working either way.

I am not entirely against it; it's just that I feel the number of problems stays the same, they just move to a different location.
Do you have more concrete ideas?

@paulfantom (Contributor, Author) commented Sep 7, 2018

> Who would have access to the repos? Only the original authors? Or would they go into the github.com/firehol namespace?

It depends on how fine-grained we would like it to be. If we want one repo per module, then I think creating a github.com/netdata-community organization with community governance would be a good way to do it, as GitHub allows a sufficient level of control over projects (write access to one repo, not all of them).
Another way would be to move the modules into a few monorepos, like netdata-python-modules, with a similar style of governance as above.

> If the modules go into author-owned repos, quality could just as well move in the opposite direction.

I was thinking of giving the governance board full admin access to all module repos and authors only write access. This way someone could create a module somewhere in their own space, and it could later be promoted to a community-supported module.

The current model isn't scaling well, and we need something else, or @l2isbad will burn out from being the only person supporting most modules.
This will also limit drive-by module contributions - currently, when someone requests a module/feature, we implement it, and then there is no one to support it.

> Let's imagine a simple method rename in the Python UrlService.

I follow the kernel's way: https://lkml.org/lkml/2012/12/23/75 - so that is unimaginable 😄

> It's just that I feel the number of problems stays the same, they just move to a different location.

That might be the case, but moving them to a different location gives us better project management and could lead to better prioritization of issues.

Vox Populi calls for 5 people on the governance board. I was thinking of having 2 (or even just 1) from netdata and 3 from the community, with yearly elections.

@Ferroin (Member) commented Sep 7, 2018

As much as I hate the thought of adding data collection here, is there some reason we couldn't do anonymized collection of which modules are both configured and collecting data on a given netdata installation? That seems like the only realistic way to determine what constitutes 'popular' usage.

The other possibility is to split out all the modules except the ones which really are nearly universal (I'm thinking cpufreq, cpuidle, sensors, and possibly the web_log module; just about everything else that's 'universal' is written in C). This gets potentially very complicated for Go modules though, because those have to be present at build time and can't be added later (at least, I'm pretty sure they work that way), so if we do go this way, having something similar to how nginx handles third-party modules might make more sense. There, you pass the source locations of third-party modules to the configure script, and they end up in the standard package.
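For reference, nginx's mechanism is the first command below; the netdata analogue on the second line is purely hypothetical, invented here to illustrate the idea:

```sh
# nginx: compile a third-party module's source into the standard binary
./configure --add-module=/path/to/third-party/module

# hypothetical netdata equivalent for build-time (e.g. Go) modules
./netdata-installer.sh --add-module=/path/to/third-party/module
```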

> I was thinking of giving the governance board full admin access to all module repos and authors only write access. This way someone could create a module somewhere in their own space, and it could later be promoted to a community-supported module.

I assume you're talking about something similar to how Ansible used to handle things? There, they had the core ansible repository containing the main framework and the core code (in our case, the native C stuff, possibly excluding apps.plugin), the ansible-modules-core repository which provided support for the official modules, and the ansible-modules-extras repository which held all the community-supported stuff. I'd say a similar approach should (in theory) work reasonably well. In practice, we're technically near that point already, just with only one repository and only one person responsible for pretty much all the merges.

@paulfantom (Contributor, Author)

> Is there some reason we couldn't do anonymized collection of which modules are both configured and collecting data on a given netdata installation?

That is something which would need to be implemented first; then we would need to wait for the data to flow in (probably a couple of months), and we need something now.

> This gets potentially very complicated for Go modules though

We don't have Go... yet. And I am still a bit reluctant to introduce it into the mix, as we don't have any experienced Go devs (I might be wrong about that). But you are right, this model is more suitable for plugins written in interpreted languages.

Let's use YAML and write our own DSL for modules </ joke> 😄

> I assume you're talking about something similar to how Ansible used to handle things?

Something similar. Have a minimal netdata package in the "core" repo and the modules in a separate place.

> In practice, we're technically near that point already, just with only one repository and only one person responsible for pretty much all the merges.

That's one of the reasons for creating this issue. This model is neither scalable nor very community-friendly.

@paulfantom (Contributor, Author) commented Sep 7, 2018

OK, full disclosure. The motivations behind this issue:

  • the number of "support" issues regarding modules is too high and creates a lot of noise
  • providing a minimal netdata package (useful if you want only the base functionality)
  • moving modules introduced to the core project by drive-by contributors out of it, and letting them die (if unused) instead of forcing us to support them
  • encouraging the community to write and maintain modules
  • focusing on more important things, like providing packages, working on the registry, authentication...
  • giving more people merge access to the project while still limiting access to the core framework

I have no intention of inventing some new package management system.

@Ferroin (Member) commented Sep 7, 2018

> That is something which would need to be implemented first; then we would need to wait for the data to flow in (probably a couple of months), and we need something now.

True, but it could help long-term with figuring out where resources are actually needed.

> We don't have Go... yet. And I am still a bit reluctant to introduce it into the mix, as we don't have any experienced Go devs (I might be wrong about that). But you are right, this model is more suitable for plugins written in interpreted languages.

It's fine for actual plugins too; the issue is module-oriented plugins written in compiled languages, like what has been proposed for Go support.

> Something similar. Have a minimal netdata package in the "core" repo and the modules in a separate place.

If we do go that way, having a setup like Ansible's, where you can trivially get a full package with everything (essentially equivalent to what our current build produces), would still be useful.

I do think having a bare-bones core makes sense here though. Other possibilities that come to mind for reducing the footprint include:

  • Having an option to install without the Web UI. Maybe call it a 'headless' mode? The idea here is that some use cases don't need the Web UI at all. A lot of streaming clients out there probably never have their dashboard used directly, and less code means less attack surface (and less disk usage).
  • Having an option to turn off some of the core code at build time. BTRFS and ZFS come to mind as prime candidates, as do FreeIPMI, NFACCT, and TC/QoS. All of these are cases where it's very easy to see in advance whether you will need them, so not having to build them when you know you won't seems practical (a sketch of what such flags could look like follows after this list).
  • Having an option to completely turn off the plugins.d infrastructure. Similar argument to the above point about some of the core plugins.
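Such build-time switches might look like the following; the flag names are assumptions for illustration, not netdata's actual installer options:

```sh
# hypothetical feature flags for a slimmed-down build
./netdata-installer.sh --disable-web-ui \
    --disable-plugin-freeipmi --disable-plugin-nfacct \
    --disable-plugins-d
```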

@ktsaou (Member) commented Sep 7, 2018

This is a nice discussion! I am glad to see it.

I think it correlates with my intention to create something a lot more generic: something people can collaborate on for any issue related to monitoring, performance, health troubleshooting, etc.

For sure, a few of the problems could be solved technically. For example, we could use some kind of versioning on internal and external APIs to ensure that different modules really work together; we could require something in each module repo so that the installer will check out the right version, etc.

We will have to solve installation issues too. For example, requiring users to install module by module will not work.

Still, I really see the need for per-module, or even per-alarm, collaboration. People get alarms and they are lost. They try to configure modules and they fail. So having a community of MySQL or PostgreSQL users collaborating is crucial. I really like this.

hm... we have to think a bit about it...

@paulfantom (Contributor, Author)

> Having an option to turn off some of the core code at build time. BTRFS and ZFS come to mind as prime candidates, as do FreeIPMI, NFACCT, and TC/QoS. All of these are cases where it's very easy to see in advance whether you will need them, so not having to build them when you know you won't seems practical.

So basically, feature flags. They are nice to have, but personally I think we should focus more on a pre-packaged netdata binary than on the installation script. That should increase netdata adoption, ease the installation process, and reduce the number of issues like #4118 and similar ones.


> we could use some kind of versioning on internal and external APIs

I think almost every "API design best practices" guidebook mentions versioning your APIs :)

> we could require something in each module repo so that the installer will check out the right version

If we don't break core interfaces and stick to semantic versioning, then I think requiring only a release version will be enough. There is no need to require anything from module authors when most breaking changes are most likely to be introduced by us, not them. Let's just quote Torvalds: "Do not break userspace" 😄

> We will have to solve installation issues too. For example, requiring users to install module by module will not work.

I was thinking of installing all "community" modules by default and/or packaging them as separate deb/rpm/other bundle packages (all Python community modules in one package, etc.). We already have a mechanism to automatically switch off modules when they are not needed, so I don't see any reason to provide a module-by-module installation method.
In the current repo state, I think the easiest way would be to use git submodules, but that is not a good idea in the long run.

> Still, I really see the need for per-module, or even per-alarm, collaboration. People get alarms and they are lost. They try to configure modules and they fail. So having a community of MySQL or PostgreSQL users collaborating is crucial. I really like this.

That's why the folks from Prometheus and Grafana are collaborating on "monitoring mixins". And they are pushing it even further, so that application authors can include "mixins" in their own repos, since who knows better what and how to monitor in an app than the creator of that app?
Example: https://github.com/etcd-io/etcd/tree/master/Documentation/etcd-mixin
Mixins design doc: https://docs.google.com/document/d/1A9xvzwqnFVSOZ5fD3blKODXfsat5fg6ZhnKu9LK3lB4/edit#
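For context, a mixin in the Prometheus/Grafana style is a jsonnet object bundling dashboards and alert rules; this minimal sketch follows the convention from the linked design doc, with the myapp names invented for illustration:

```jsonnet
{
  grafanaDashboards+:: {
    'myapp.json': { title: 'MyApp overview' /* dashboard definition */ },
  },
  prometheusAlerts+:: {
    groups+: [{
      name: 'myapp',
      rules: [{
        alert: 'MyAppDown',
        expr: 'up{job="myapp"} == 0',
        'for': '5m',
      }],
    }],
  },
}
```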

@ktsaou (Member) commented Sep 9, 2018

Interesting!

Well, we could develop a PrometheusImporter (or OpenMetricsImporter) service in the Python plugin to import remote Prometheus application metrics. Then we could allow netdata mixins, like this:

  1. A Python module that uses PrometheusImporter and adds some structure to the metrics (groups them into charts and adds titles, units, etc.).
  2. A JS file that would provide dashboard metadata (a template for the top of the app, info boxes above the charts, colors, etc.).
  3. A health.conf file with the alarms.

We could then maintain a registry of such repos, and we could provide an installer (much like npm) which would just check out and install the ones users need. With version checks, we could ensure this always stays stable, or the installation fails.

The above would allow users to maintain plugins, dashboard metadata, and alarm configs in separate repos.
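A minimal sketch of part 1, following the usual python.d module conventions (ORDER, CHARTS, a Service class); the PrometheusImporter base class is the proposed, not-yet-existing service, so its name and behavior are assumptions:

```python
# Hypothetical mixin module: PrometheusImporter does not exist yet.
from bases.FrameworkServices.PrometheusImporter import PrometheusImporter

ORDER = ['requests']
CHARTS = {
    'requests': {
        # [name, title, units, family, context, chart_type]
        'options': [None, 'MyApp Requests', 'requests/s', 'requests',
                    'myapp.requests', 'line'],
        'lines': [
            # map the raw Prometheus counter to an incremental dimension
            ['http_requests_total', 'requests', 'incremental'],
        ],
    },
}


class Service(PrometheusImporter):
    def __init__(self, configuration=None, name=None):
        PrometheusImporter.__init__(self, configuration=configuration, name=name)
        self.order = ORDER
        self.definitions = CHARTS
        # endpoint serving the Prometheus/OpenMetrics text format
        self.url = self.configuration.get('url', 'http://127.0.0.1:8080/metrics')
```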

@ktsaou (Member) commented Sep 9, 2018

And actually, it could be made generic enough that plugins and modules in any language could be defined.

@paulfantom (Contributor, Author) commented Sep 9, 2018

> A JS file that would provide dashboard metadata

There is no need to use a general-purpose language for storing config/metadata. It would be better to use JSON/YAML/jsonnet.
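The dashboard metadata from point 2 could then be fully declarative; this YAML schema is invented for illustration and is not an existing netdata format:

```yaml
dashboard:
  head: "MyApp overview"          # template for the top of the app
  charts:
    myapp.requests:
      colors: ["#00ab44"]
      info: "Rate of HTTP requests handled by MyApp."
```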

> we could provide an installer

OK, as long as it is something NOT developed by us. There are plenty of installers out there, and there is no need to reinvent the wheel. To quote myself from a comment above:

> I have no intention of inventing some new package management system.

@paulfantom (Contributor, Author)

Also, this is off topic :)

@ktsaou (Member) commented Sep 9, 2018

OK, let's flirt a bit with this idea.

Let's suppose we split most application-specific packages into separate repos: how would it work? How would you do it so that people are still able to use a coherent package? How will distros package it? Which installer would you use for these repos, etc.?

@paulfantom (Contributor, Author)

> How would you do it so that people are still able to use a coherent package?

The whole point is not to ship everything to everyone, and to ship only "proven/graduated" modules: modules which are proven in production by many people and which are watched, used, and supported by someone.
But if we want to keep supporting everything, we can just use git submodules and include them in the core package.

> How will distros package it?

I am in favor of doing the packaging on our side and using packagecloud.io to distribute it. Official channels are too slow for our update model. So packaging won't be a problem; we can provide netdata-extras-python as an additional package alongside netdata-core, plus a metapackage netdata which would install everything.
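In debian/control terms, that split could look like the sketch below; the package names come from the comment above, and the strict version pins are an assumption about how we would keep the pieces in lock-step:

```
Package: netdata-core
Architecture: any
Description: netdata daemon and core collectors

Package: netdata-extras-python
Architecture: all
Depends: netdata-core (= ${source:Version}), python3
Description: community-supported python.d modules for netdata

Package: netdata
Architecture: all
Depends: netdata-core (= ${source:Version}),
         netdata-extras-python (= ${source:Version})
Description: metapackage installing the full netdata stack
```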

> Which installer would you use for these repos, etc.?

I haven't researched that topic yet; probably embedding git would be the simplest way. But I know that developing everything by ourselves is a sure path to support hell - relevant XKCD: https://xkcd.com/927/

I think I need to "make CI great again" and create the release pipeline first, before coming back to this idea, so it will be easier to explain.

@ktsaou (Member) commented Sep 9, 2018

I like the idea of distributing our own packages for the most common distros.
We will need to support a lot of them though. For sure, we need a maintainer for this.

@Ferroin (Member) commented Sep 10, 2018

> So basically, feature flags. They are nice to have, but personally I think we should focus more on a pre-packaged netdata binary than on the installation script. That should increase netdata adoption, ease the installation process, and reduce the number of issues like #4118 and similar ones.

Pretty much. I agree that focusing on packaging things well is a much better ROI short-term, but it might be useful to split some of this stuff out into separate plugins so that it can be packaged similarly to how we're talking about handling the other plugins.

> If we don't break core interfaces and stick to semantic versioning, then I think requiring only a release version will be enough. There is no need to require anything from module authors when most breaking changes are most likely to be introduced by us, not them. Let's just quote Torvalds: "Do not break userspace"

Having the modules embed a bit of metadata about which API version they expect to run against would be good though. That way, netdata can check any user-provided modules to see whether it can run them, and log a user-friendly error if it can't (as opposed to trying to run them blindly and failing with a potentially very ugly and incomprehensible error).
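A small sketch of such a check, assuming a (major, minor) scheme where the major versions must match and the framework's minor version must be at least the module's; both the names and the scheme are hypothetical:

```python
# Hypothetical compatibility gate the plugin framework could run
# before loading a user-provided module.
FRAMEWORK_API = (1, 3)  # API version the running framework implements


def is_compatible(module_api, framework_api=FRAMEWORK_API):
    major, minor = module_api
    return major == framework_api[0] and minor <= framework_api[1]


# A module declaring API_VERSION = (1, 0) would load; one declaring
# (2, 0) would be rejected with a friendly log line instead of a traceback.
```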

> I was thinking of installing all "community" modules by default and/or packaging them as separate deb/rpm/other bundle packages (all Python community modules in one package, etc.). We already have a mechanism to automatically switch off modules when they are not needed, so I don't see any reason to provide a module-by-module installation method.

I mostly agree here, with the caveat that I would probably split the Python modules into two packages: stuff that's likely to be useful to almost everybody (cpuidle, cpufreq, sensors, etc.) and stuff that's not likely to be needed on all installations (the various database modules, for example).

> In the current repo state, I think the easiest way would be to use git submodules, but that is not a good idea in the long run.

Agreed, git submodules are a pain in the arse to work with, even if they would be an easy way to handle this.

> I am in favor of doing the packaging on our side and using packagecloud.io to distribute it. Official channels are too slow for our update model. So packaging won't be a problem; we can provide netdata-extras-python as an additional package alongside netdata-core, plus a metapackage netdata which would install everything.

Packagecloud.io sounds like a great idea for the big distros (although the bandwidth limits may be an issue). It doesn't work for everything though.

Gentoo immediately comes to mind, though if things get fleshed out so that it's easy to build individual 'packages' locally on a system (preferably continuing with a methodology similar to the current one), I can put together a Portage overlay for Gentoo-based distributions that mirrors how we structure things for other distributions.

Other platforms that aren't supported by Packagecloud.io but should ideally continue to be supported by netdata include:

  • Alpine Linux.
  • Arch Linux and derivatives (this should be easy though; just create AUR packages).
  • Slackware and derivatives.
  • FreeBSD (ideally we should have stuff in the ports tree, which should get us easy support for most FreeBSD-derived systems like pfSense and FreeNAS).
  • macOS (Homebrew support via Packagecloud.io may be a possibility if we ask nicely, and unlike the other items this should work reasonably easily).
  • The various embedded Linux platforms we currently support.

Also, any decisions made here regarding packaging need to be relayed to the various maintainers.

@paulfantom paulfantom changed the title [RFC] Move "less popular" modules to separate repo [RFC] Move modules to separate repo Nov 18, 2018
@paulfantom paulfantom changed the title [RFC] Move modules to separate repo [RFC] Move modules to separate repos Nov 18, 2018
@paulfantom (Contributor, Author)

| Pros | Cons |
| --- | --- |
| easier development | documentation synchronization |
| separate release cycles | a bit harder to create a release; we will probably need to decouple release cycles |
| we can use the build system native to each language | |
| a better directory structure, closer to the ones used by each language | |
| separate test frameworks | |
| a faster CI feedback loop, since fewer things need to be checked | |

@paulfantom paulfantom self-assigned this Nov 18, 2018
@paulfantom paulfantom added the priority/low A 'nice to have' not critical issue label Nov 19, 2018
@Ferroin (Member) commented Nov 19, 2018

> easier development

Possibly, but that's not necessarily the case.

> separate release cycles

This can be a con too. Think of what things will be like for users having to keep packages in sync across changes like #4562. Requiring complex dependency constraints is a pain in the arse for users.

> we can use the build system native to each language

Only an advantage for some languages. This would be great for Go. It will be useless for whatever charts.d plugins we keep, and it's going to be a drastic, sweeping change to how everything related to the Python and Node.js plugins is handled, which I'm not sure is a good thing here (personally, I see zero reason to use setuptools for the python.d.plugin stuff or npm for the Node stuff; it may actually make integration harder).

> a better directory structure, closer to the ones used by each language

I'd argue that for most languages we're likely to use, this isn't much of a benefit.

> separate test frameworks

While this is a good thing, I don't think we need separate repos for it.

> a faster CI feedback loop, since fewer things need to be checked

Given what I've seen, we could speed things up significantly just by dropping Codacy from the checks; that alone would probably do far more than subdividing the repo.

> documentation synchronization

As an alternative to tightly integrated documentation, we could do a better job of encapsulating the docs properly so that we don't need to keep everything in tight sync. For example, there's no reason for the individual python.d modules to actively reference anything but themselves and the python.d.plugin docs in their documentation.

> a bit harder to create a release; we will probably need to decouple release cycles

This ties in with my comment above about separate release cycles. While I wasn't too worried about it originally, recent experience, both here and in dealing with updates for other software, has led me to the conclusion that you should avoid splitting packaging in a way that creates hard breaks in version compatibility unless you are going to have lock-step dependencies.

Put a bit differently, this could go three ways:

  • We keep a single netdata package, like we're doing now. This requires the minimum of work from end-users and downstream maintainers, but the most from us.
  • We do independent versioning. That is, version numbers don't imply a strict dependency on equivalent version numbers. This is the easiest for us, but the most painful for end-users, because of having to deal with things like "Move cpufreq python module to proc plugin" (#4562).
  • We do lock-step versioning of all the packages. IOW, netdata-python 1.12 depends strictly on netdata-core 1.12. This sort of thing is pretty commonplace in packaging and is reasonably well accepted. It nets us the split arrangement we want, lets users pick the specific things they want, and isn't as painful for end-users as independent versions. It's a non-trivial amount of work for us though, as we have to extend the netdata plugin protocol with a version handshake so we can properly enforce it (a sketch of such a handshake follows below).
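That handshake could be a small extension to the plugins.d text protocol; the VERSION keyword below is an assumption and not part of the current protocol, while the CHART/DIMENSION lines follow the existing format:

```
# the plugin announces the API version it was built against as its
# first line; the daemon rejects the stream on a major-version mismatch
VERSION 1.12
CHART myapp.requests '' 'MyApp Requests' 'requests/s' requests '' line
DIMENSION requests '' incremental
```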

@paulfantom paulfantom changed the title [RFC] Move modules to separate repos RFC: Move modules to separate repos Nov 24, 2018
@stale (bot) commented Jan 8, 2019

Currently, the netdata team doesn't have enough capacity to work on this issue. We will be more than glad to accept a pull request with a solution to the problem described here. This issue will be closed after another 60 days of inactivity.

@stale stale bot added the stale label Jan 8, 2019
@paulfantom paulfantom added the feature request New features label Jan 8, 2019
@stale stale bot removed the stale label Jan 8, 2019