datadog-agent: Update to v7 #105221

Closed
2 tasks done
schneefux opened this issue Nov 28, 2020 · 21 comments

Comments

@schneefux
Contributor

datadog-agent v7 is available (https://github.com/DataDog/datadog-agent) and uses python3 by default. There are many new integrations (https://github.com/DataDog/integrations-core) which are incompatible with v6.
The build process seems to have changed: the upgrade steps (https://github.com/NixOS/nixpkgs/blob/master/pkgs/tools/networking/dd-agent/README.md) no longer work because there is no Gopkg.lock in the datadog-agent repository. I do not know Go, so I am not sure how to proceed.

Checklist
Project name

nix search name: datadog-agent

current version: 6.11.2
desired version: 7.24

Notify maintainers

maintainers: @thoughtpolice @domenkozar @rvl

Note for maintainers

Please tag this issue in your PR.

@stale

stale bot commented Jun 3, 2021

I marked this as stale due to inactivity.

@stale stale bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Jun 3, 2021
@schneefux
Contributor Author

schneefux commented Jun 4, 2021

still relevant, not stale

@stale stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Jun 4, 2021
@john-consumable
Contributor

john-consumable commented Aug 29, 2021

I took a look at this and I came up with a minimum patch that builds and runs (#136109), but I am totally unsure if the correct integrations are built. It looks like there is a lot of new build machinery using invoke (see the datadog-agent tasks directory).

@john-consumable
Contributor

@schneefux does this work for you?

@xvello
Contributor

xvello commented Nov 27, 2021

Hello @john-consumable, I am currently testing an upgrade to release-21.11, and the datadog-agent package lost python support: the core golang features (core checks, log collection, process-agent) work OK, but python checks don't load.
This is because the build tag for python support has changed from cpython to just python. Building the package with that tag fails, though, because upstream added an rt-loader helper library to interface with python. My guess is that we'll need to package it first, as a dependency of the agent.

Happy to help on this one, although I'm not that good at packaging.

Here is the derivation build log

@bojanrajkovic

I'm having this problem as well — the base agent works OK, but anything that requires a Python package doesn't work at all. @xvello, did you ever figure out a fix for the rt-loader helper library?

@xvello
Contributor

xvello commented Feb 20, 2022

Heya @bojanrajkovic, I stopped my investigation at the rt-loader dependency and did not have the patience to try packaging it.

Instead, I worked around the issue by running the official docker image in podman containers. I managed to get journald log tailing and network metrics (with --network=host) working and moved on. If you want, I can clean up my derivation and post it as a gist.

@bojanrajkovic

bojanrajkovic commented Feb 20, 2022 via email

@xvello
Contributor

xvello commented Feb 21, 2022

@bojanrajkovic here you go: https://gist.github.com/xvello/09d8a2c5b5e1b7ee48dd744658298817

The agent is the only container I run, so I didn't set up the podman integration, but it should work OK by mounting the right socket, as described in the documentation.

@Sohalt
Contributor

Sohalt commented Aug 9, 2022

#185805 should provide the necessary fixes to run this without podman.

@domenkozar
Member

domenkozar commented Sep 19, 2022

With the latest fixes I'm getting

Could not initialize Python: could not load runtime python for version 3: Unable to open three library: libdatadog-agent-three.so: cannot open shared object file: No such file or directory

Example:

/nix/store/6l6zyqy27vq1d1k075jzfv0dd8wslx7z-datadog-agent-7.38.1/bin/agent -c /etc/datadog-agent/datadog.yaml check process_agent
error: Could not initialize Python: could not load runtime python for version 3: Unable to open three library: libdatadog-agent-three.so: cannot open shared object file: No such file or directory
error: Unable to load a check from instance of config 'nginx': Python Check Loader: python is not initialized; Core Check Loader: Check nginx not found in Catalog
error: Unable to load the check: unable to load any check from config 'nginx'
error: Unable to load a check from instance of config 'apm': Python Check Loader: python is not initialized; Core Check Loader: Check apm not found in Catalog
error: Unable to load the check: unable to load any check from config 'apm'
error: Unable to load a check from instance of config 'process_agent': Python Check Loader: python is not initialized; Core Check Loader: Check process_agent not found in Catalog
error: Unable to load the check: unable to load any check from config 'process_agent'
error: Unable to load a check from instance of config 'winproc': Python Check Loader: python is not initialized; Core Check Loader: Check winproc not found in Catalog
error: Unable to load the check: unable to load any check from config 'winproc'

Error: could not load process_agent:
* Core Check Loader: Check process_agent not found in Catalog
* Python Check Loader: python is not initialized
Error: no valid check found
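
The first error above says the loader could not open libdatadog-agent-three.so. A minimal Python sketch for checking whether that file exists anywhere under a set of candidate directories follows. The function name `find_shared_library` and the choice of directories are assumptions made for illustration; neither the agent nor nixpkgs provides this helper.

```python
from pathlib import Path


def find_shared_library(name, search_dirs):
    """Return every file under search_dirs whose file name equals `name`.

    `search_dirs` is an assumption of this sketch: pass whichever
    directories you expect the agent's runtime loader to look in.
    """
    matches = []
    for directory in search_dirs:
        root = Path(directory)
        if not root.is_dir():
            continue  # skip directories that do not exist
        # rglob walks the tree recursively, so nested lib/ directories are covered
        matches.extend(sorted(p for p in root.rglob(name) if p.is_file()))
    return matches
```

For example, `find_shared_library("libdatadog-agent-three.so", ["/nix/store/6l6zyqy27vq1d1k075jzfv0dd8wslx7z-datadog-agent-7.38.1"])` would show whether the library is part of the agent's store path at all. An empty list suggests the library was never built or installed into that output, while a non-empty list suggests the agent is simply not being pointed at it.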

@domenkozar
Member

@Sohalt any ideas?

@Sohalt
Contributor

Sohalt commented Oct 13, 2022

Nope, sorry. Don't have time to look into it rn. I can take a look next week if you haven't figured it out by then.

actionshrimp added a commit to actionshrimp/nixpkgs that referenced this issue Dec 9, 2022
This fixes the python init error mentioned here:
NixOS#105221 (comment)

However, there are still issues with the derived python environment - for some
reason datadog_checks.base is not present in the env's site-packages, which all
the other checks depend on, so python loading still isn't working fully (but I
believe this is an improvement over what's there already at least).
@actionshrimp
Contributor

actionshrimp commented Dec 9, 2022

I fixed the first part of the error around libdatadog-agent-three.so with this commit from this PR, but it looks like there are still a few issues with the python environment that the agent is given:

ERROR | (pkg/collector/embed_python.go:19 in pySetup) | Could not initialize Python: could not initialize rtloader: could not import base class: No module named 'datadog_checks.checks'
ERROR | (pkg/collector/scheduler.go:201 in getChecks) | Unable to load a check from instance of config 'apm': Python Check Loader: unable to import module 'apm': No module named 'apm'; Core Check Loader: Check apm not found in Catalog
ERROR | (pkg/collector/scheduler.go:248 in GetChecksFromConfigs) | Unable to load the check: unable to load any check from config 'apm'
ERROR | (pkg/collector/scheduler.go:201 in getChecks) | Unable to load a check from instance of config 'process_agent': Python Check Loader: unable to import module 'process_agent': No module named 'process_agent'; Core Check Loader: Check process_agent not found in Catalog
ERROR | (pkg/collector/scheduler.go:248 in GetChecksFromConfigs) | Unable to load the check: unable to load any check from config 'process_agent'
ERROR | (pkg/collector/scheduler.go:201 in getChecks) | Unable to load a check from instance of config 'winproc': Python Check Loader: unable to import module 'winproc': No module named 'winproc'; Core Check Loader: Check winproc not found in Catalog
ERROR | (pkg/collector/scheduler.go:248 in GetChecksFromConfigs) | Unable to load the check: unable to load any check from config 'winproc'

Inspecting the site-packages of the python env built by integrations-core.nix, it seems the required python modules haven't made it in there, but I'm not sure why.
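
One way to confirm which modules an environment is missing is to ask its interpreter directly. The sketch below is a hedged example, not part of the agent or of nixpkgs: `missing_modules` is a hypothetical helper, and it only reports whether Python can locate a module, not whether the module works once imported. It has to be run with the interpreter from the python env that the agent uses.

```python
import importlib.util


def missing_modules(module_names):
    """Return the names from module_names that cannot be located for import."""
    missing = []
    for name in module_names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # ImportError: a parent package of a dotted name is itself absent.
            # ValueError: an already-loaded module has no usable spec.
            spec = None
        if spec is None:
            missing.append(name)
    return missing
```

Calling `missing_modules(["datadog_checks.base", "datadog_checks.checks"])` with that interpreter returns whichever of the two names cannot be found; the second name is the one from the pySetup error above.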

@domenkozar
Member

@actionshrimp it looks good to me:

❯ ls /nix/store/kh9r0w0pjjax7zr90yhi4avb4l3sshr5-python3-3.10.9-env/lib/python3.10/site-packages/datadog_checks/
config.py  disk  errors.py  __init__.py  log.py  mongo  network  nginx  postgres  process  __pycache__

github-actions bot pushed a commit that referenced this issue Dec 31, 2022
This fixes the python init error mentioned here:
#105221 (comment)

However, there are still issues with the derived python environment - for some
reason datadog_checks.base is not present in the env's site-packages, which all
the other checks depend on, so python loading still isn't working fully (but I
believe this is an improvement over what's there already at least).

(cherry picked from commit 845e54e)
@actionshrimp
Contributor

@domenkozar - yeah, some of them make it in OK, but some modules are missing, notably checks and base, which leads to errors like the one I pasted above. I guess these should come from here: https://github.com/DataDog/integrations-core/tree/master/datadog_checks_base/datadog_checks ?
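
The directory listing quoted earlier in the thread can be compared against the module names mentioned here. The following is a small illustrative sketch: `absent_modules` is a hypothetical helper, and it assumes a module shows up in the listing either as a directory of the same name or as a `.py` file.

```python
def absent_modules(listing, expected):
    """Return names in `expected` that appear neither as NAME nor NAME.py."""
    present = set(listing.split())
    return [
        name
        for name in expected
        if name not in present and name + ".py" not in present
    ]


# Output of the `ls` command on site-packages/datadog_checks quoted in this thread.
LISTING = (
    "config.py  disk  errors.py  __init__.py  log.py  mongo  "
    "network  nginx  postgres  process  __pycache__"
)

print(absent_modules(LISTING, ["base", "checks", "nginx", "config"]))
# → ['base', 'checks']
```

The listing contains individual integrations such as nginx and postgres, but neither base nor checks, which matches the import error reported above.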

@ketzacoatl
Contributor

At this point, we can install the datadog-agent (v7) with NixOS config. It installs and runs as a systemd service. I can even see the agent register itself with datadoghq and send metadata.

However, it seems as though the agent's plugins are not all installed or available, and a lot of integrations don't work. In my testing, the docker integration is an easy example. The process checks seem to be another.

Question: do we close this issue and start a new one about the plugins not installing correctly, or do we resolve that in this issue? What do we need to do to include the agent's plugin/integration files with the agent install so they can be used by the agent?

@domenkozar
Member

domenkozar commented Mar 23, 2023 via email

@ketzacoatl
Contributor

@domenkozar should I create that new issue, or defer to you?

@domenkozar
Member

domenkozar commented Mar 23, 2023 via email

@ketzacoatl
Contributor

Ok, is there any particular description of the problem or recommended solution that you would suggest I include? My explanation is more or less "the agent's plugins should be included in the installation".

8 participants