[Bug]: Kickstart script doesn't check for default release configuration on Debian-based Linux #12837

Open
PatTheMav opened this issue May 6, 2022 · 7 comments
Assignees
Labels
area/packaging Packaging and operating systems support bug priority/high Super important issue

Comments

@PatTheMav

Bug description

The kickstart script will add a custom apt repository on compatible systems to install netdata. By default, this script will prefer nightly builds of netdata, which are given priority level 500 (check via apt-cache policy netdata).

If a system has a "default release" configured, e.g. via APT::Default-Release "stable", then packages from that default release get priority level 900.

This results in the kickstart script calling apt to install netdata, but instead of the nightly version the stable version from the main system repository will be installed (e.g. "buster-stable").

As those versions are usually quite old, they won't meet expectations of the kickstart script, which will then fail and leave a half-installed netdata installation on the system, requiring users to manually clean up and run the kickstart script again (as it will detect an "unknown" installation otherwise).

Expected behavior

One possible expected behaviour would be for the kickstart script to check if:

a) Installation via apt is the selected/preferred method
b) What the current "default" priority is or check for a "Default-Release" setting
c) Inform users about a current apt system setting that prohibits the script from installing netdata successfully
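Checks (b) and (c) could be sketched roughly as follows. This is not code from the kickstart script: the function name `default_release_from_dump` and the warning text are hypothetical, and the sketch assumes `apt-config dump` prints entries in the form `APT::Default-Release "stable";`.

```shell
#!/bin/sh
# Hypothetical helper: reads `apt-config dump` output on stdin and prints
# the value of APT::Default-Release if it is set to a non-empty string.
default_release_from_dump() {
    sed -n 's/^APT::Default-Release "\(..*\)";$/\1/p' | head -n 1
}

# Usage sketch: warn before attempting a native package install.
if command -v apt-config >/dev/null 2>&1; then
    release="$(apt-config dump | default_release_from_dump)"
    if [ -n "$release" ]; then
        echo "APT::Default-Release is set to '$release'; packages from that release may take priority over the Netdata repository." >&2
    fi
fi
```

Reading the value from `apt-config` rather than grepping files under /etc/apt means the check sees the merged configuration, wherever the setting was defined.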

Steps to reproduce

  1. Add a file called "99defaultrelease" in /etc/apt/apt.conf.d with the content APT::Default-Release "stable"; (apt.conf entries end with a semicolon)
  2. Run netdata kickstart script per installation instructions
  3. Observe kickstart script installing an old version of netdata (e.g. 1.29.3-4).
  4. Observe kickstart script failing because it cannot correctly identify the installed version

Installation method

kickstart.sh

System info

Linux patthemav.com 5.4.137-xen #54137 SMP Sat Jul 31 13:26:05 CEST 2021 x86_64 GNU/Linux
/etc/os-release:PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
/etc/os-release:NAME="Debian GNU/Linux"
/etc/os-release:VERSION_ID="11"
/etc/os-release:VERSION="11 (bullseye)"
/etc/os-release:VERSION_CODENAME=bullseye
/etc/os-release:ID=debian

Netdata build info

Version: netdata v1.34.0-115-nightly
Configure options:  '--build=x86_64-linux-gnu' '--includedir=${prefix}/include' '--mandir=${prefix}/share/man' '--infodir=${prefix}/share/info' '--disable-option-checking' '--disable-silent-rules' '--libdir=${prefix}/lib/x86_64-linux-gnu' '--libexecdir=${prefix}/lib/x86_64-linux-gnu' '--disable-maintainer-mode' '--prefix=/usr' '--sysconfdir=/etc' '--localstatedir=/var' '--libdir=/usr/lib' '--libexecdir=/usr/libexec' '--with-user=netdata' '--with-math' '--with-zlib' '--with-webdir=/var/lib/netdata/www' '--disable-dependency-tracking' 'build_alias=x86_64-linux-gnu' 'CFLAGS=-g -O2 -ffile-prefix-map=/usr/src/netdata=. -fstack-protector-strong -Wformat -Werror=format-security' 'LDFLAGS=-Wl,-z,relro' 'CPPFLAGS=-Wdate-time -D_FORTIFY_SOURCE=2' 'CXXFLAGS=-g -O2 -ffile-prefix-map=/usr/src/netdata=. -fstack-protector-strong -Wformat -Werror=format-security'
Install type: binpkg-deb
    Binary architecture: x86_64
    Packaging distro:  
Features:
    dbengine:                   YES
    Native HTTPS:               YES
    Netdata Cloud:              YES 
    ACLK Next Generation:       YES
    ACLK-NG New Cloud Protocol: YES
    ACLK Legacy:                NO
    TLS Host Verification:      YES
    Machine Learning:           YES
    Stream Compression:         YES
Libraries:
    protobuf:                YES (system)
    jemalloc:                NO
    JSON-C:                  YES
    libcap:                  NO
    libcrypto:               YES
    libm:                    YES
    tcalloc:                 NO
    zlib:                    YES
Plugins:
    apps:                    YES
    cgroup Network Tracking: YES
    CUPS:                    YES
    EBPF:                    YES
    IPMI:                    YES
    NFACCT:                  YES
    perf:                    YES
    slabinfo:                YES
    Xen:                     NO
    Xen VBD Error Tracking:  NO
Exporters:
    AWS Kinesis:             NO
    GCP PubSub:              NO
    MongoDB:                 NO
    Prometheus Remote Write: YES

Additional info

No response

@PatTheMav PatTheMav added bug needs triage Issues which need to be manually labelled labels May 6, 2022
@ilyam8 ilyam8 added area/packaging Packaging and operating systems support and removed needs triage Issues which need to be manually labelled labels May 6, 2022
@Ferroin
Member

Ferroin commented May 12, 2022

OK, this appears to actually be two bugs:

  1. We’re not handling priorities correctly. I was under the impression that APT does not support per-repository priorities like DNF (or Zypper/YaST2, or Portage, or pretty much any other sane package manager) does, and thus that there was no clean way for us to handle this. If there is some way we can set a priority per-repo and not resort to package pinning, then I’m all for fixing that sanely so we don’t have to worry about things like this.
  2. The kickstart script is not actually detecting correctly that apt-get install netdata will pull from a source other than our repos. We actually do have a check in the script for such cases, and should fall back to a static build instead of a native package if the check does not see the copy of Netdata from our repos as the primary installation candidate. On APT-based systems, this is done by parsing the output of apt-cache policy, but I’ve never quite been certain that the check is actually correct (I just couldn’t find a case where it wasn’t correct when I was testing it).

@iigorkarpov
Contributor

  1. It does not. As I said, using priorities is a bad idea. But apt -t <repository-name> forces (at least in theory) apt to use specified repository:
    -t, --target-release, --default-release This option controls the default input to the policy engine. It creates a default pin at priority 990 using the specified release string. The preferences file may further override this setting. In short, this option lets you have simple control over which distribution packages will be retrieved from. Some common examples might be -t '2.1*' or -t unstable.

@Ferroin
Member

Ferroin commented May 16, 2022

@iigorkarpov The problem with that is the same as what we run into trying to use equivalent options for DNF/YUM. Namely, it only applies for that specific invocation of the package manager, and then never again unless users specify the option again. That would allow the install to work correctly, but the moment the user runs a system upgrade in the scenario described in the issue description, the package will be downgraded to the one provided by the distribution repository.

@iigorkarpov
Contributor

I see. Though as a matter of fact, downgrades require confirmation, so at least it won't be downgraded silently.

@Ferroin
Member

Ferroin commented May 16, 2022

Shorter term though, we need to figure out why the apt-cache policy check we’re doing (currently located on line 1137 of the kickstart script) isn’t working in this particular case. While not ideal, fixing that would at least mean that users could get the latest version of the agent, even if it would be a static build instead of a native package.

@PatTheMav
Author

PatTheMav commented May 16, 2022

That's because the check will never fail - the command will generate output similar to this:

     [...]
     1.31.0-204-nightly 500
        500 https://packagecloud.io/netdata/netdata-edge/debian bullseye/main amd64 Packages
     1.29.3-4 900
        900 http://some.debian.mirror/debian bullseye/main amd64 Packages
     1.12.0-1+deb10u1 900
        900 http://some-debian.mirror/debian buster/main amd64 Packages

The nightly versions will all appear, so the grep for "packagecloud.io" will always succeed. However, 1.29.3-4 will be installed, because the APT::Default-Release "stable" setting has bumped its priority to 900.

@Ferroin
Member

Ferroin commented May 16, 2022

Ah, good catch. Looking at it again I have no idea why I thought that would work when I first wrote it. So effectively what we need to be looking for then is that the entry associated with our repo has the highest priority of any of the entries, or alternatively some option that only lists the entry for the version that would be installed if apt install netdata was run with no other options.
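One candidate for the second option is the `Candidate:` line that `apt-cache policy` prints, which names the version APT would install. The following is a minimal sketch, not the kickstart script's actual check: `candidate_from_our_repo` is a hypothetical helper, the version table layout is assumed to match the output quoted earlier in this thread, and matching on the substring "packagecloud.io" is carried over from the existing grep.

```shell
#!/bin/sh
# Hypothetical helper: reads `apt-cache policy <package>` output on stdin.
# Succeeds only if the candidate version is offered by a source whose
# location contains the substring given as $1.
candidate_from_our_repo() {
    awk -v repo="$1" '
        $1 == "Candidate:" { cand = $2; next }
        /Version table:/   { in_table = 1; next }
        !in_table          { next }
        # Source line: "<priority> <url-or-path> ..."
        $1 ~ /^-?[0-9]+$/ && index($2, "/") > 0 {
            if (cand != "" && ver == cand && index($2, repo) > 0) found = 1
            next
        }
        # Version line: "[***] <version> <priority>"
        { ver = ($1 == "***") ? $2 : $1 }
        END { exit (found ? 0 : 1) }
    '
}

# Usage sketch: decide between a native package and a static build.
if command -v apt-cache >/dev/null 2>&1; then
    if apt-cache policy netdata | candidate_from_our_repo "packagecloud.io"; then
        echo "candidate comes from the Netdata repository"
    else
        echo "candidate comes from elsewhere; consider a static build"
    fi
fi
```

Unlike a plain grep over the whole output, this only accepts the repository when it provides the version APT has actually selected, so a Default-Release pin that promotes the distribution package makes the check fail as intended.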
