remote config with http(s) provider #1143
I think this is a good start, but it's worth discussing how quickly we're going to want to expand the usage here.
If we suspect that we might also support remote_urls for #868, then it might be worthwhile having RemoteConfig implement fs.FS sooner rather than later. Otherwise, if we don't think we're going to expand the remote URL concept any further, then we probably don't need a broader interface yet.
(fs.FS would be useful for #868 in particular because conf.d would walk over directories)
This makes sense to me. I intentionally kept this simple / extendable so it shouldn't be a problem to support conf.d style configs if we decide to in the future. Instead of having future requirements creep into the scope of this feature, I think we should implement local conf.d configuration support first, and then add it for remote as a separate unit of work.
I'm not opposed to putting it off, but I do want to say that I think implementing fs.FS at any point would be a pretty significant rewrite.
I would think it should be possible to retrieve |
Code looks reasonable. Small and lightweight. Would love to see some docs, i.e. examples and the configuration options in our config block, with some comments on this being beta?
Looking good, just nits.
I'd also like to see this documented and added to the changelog before merge, though.
Maybe also add a command-line flag to enable "experimental remote URL support"?
@rfratto @mattdurham I addressed some of the minor nits if you guys want to take another pass at some point. I'm still leaving this as a draft for now as I wanted to implement a couple more providers before trying to get this merged in.
@rlankfo IMO http/s is a good starting point for a first PR; other providers can be done in future PRs.
Ok, sounds good, I can switch gears to finalizing this for merge. Do you think we need the experimental URL flag? Also, do you think we should support token authorization as well as basic before merge?
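For readers unfamiliar with the two schemes mentioned here, the following is a minimal sketch of how a client could attach either basic or token (bearer) credentials to the request that fetches the remote config. The `auth` struct and `newConfigRequest` are hypothetical and are not taken from the PR; only the `net/http` calls are real.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// auth (hypothetical) carries the credentials for fetching a remote config.
// At most one of basic auth or a bearer token should be set.
type auth struct {
	Username    string
	Password    string
	BearerToken string
}

// newConfigRequest (hypothetical) builds a GET request for the config URL
// and attaches whichever credential type was configured.
func newConfigRequest(url string, a auth) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	switch {
	case a.BearerToken != "" && a.Username != "":
		return nil, errors.New("basic auth and bearer token are mutually exclusive")
	case a.BearerToken != "":
		// Token authorization: a static Authorization header.
		req.Header.Set("Authorization", "Bearer "+a.BearerToken)
	case a.Username != "":
		// Basic authorization: net/http encodes user:password for us.
		req.SetBasicAuth(a.Username, a.Password)
	}
	return req, nil
}

func main() {
	req, err := newConfigRequest("https://example.com/agent.yaml", auth{Username: "user", Password: "pass"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	_, _, ok := req.BasicAuth()
	fmt.Println("basic auth attached:", ok)
}
```

Rejecting the case where both are set keeps the behaviour unambiguous; a real implementation might instead define a precedence order.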
I'm not sure if we need an experimental flag. Experimental flags aren't something we've done historically, but I'm starting to think about when they might be useful to signal to the user that something is subject to change given #1140.
It's not strictly necessary, I don't think. We still don't know long-term if plain URLs are strictly the format we want to use over something like
Haha, now that I said this, maybe an experimental flag is the way to go. Let's chat offline about this.
It looks like it's going to be difficult to find a nice, fast way to do experimental flags that error when configuring a flag for an experiment that is not enabled. For now, I think we should have an
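The behaviour described in this comment, erroring when a flag that belongs to an experiment is set while that experiment is not enabled, can be sketched with the standard `flag` package. This is only an illustration under assumptions: the `experimental` table, the `config.url` flag name, and the `remote-configs` feature name are hypothetical, and `validate` and `parse` are not the project's functions.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// experimental (hypothetical) maps a flag name to the feature that must be
// enabled before that flag may be set.
var experimental = map[string]string{
	"config.url": "remote-configs",
}

// validate returns an error if any flag that was explicitly set depends on
// a feature that is not in the enabled list.
func validate(fset *flag.FlagSet, enabled []string) error {
	on := make(map[string]bool, len(enabled))
	for _, f := range enabled {
		on[strings.TrimSpace(f)] = true
	}
	var err error
	// Visit only calls the function for flags that were actually set.
	fset.Visit(func(f *flag.Flag) {
		feature, gated := experimental[f.Name]
		if gated && !on[feature] && err == nil {
			err = fmt.Errorf("flag %q requires feature %q to be enabled", f.Name, feature)
		}
	})
	return err
}

// parse wires the two flags together for a given argument list.
func parse(args []string) error {
	fset := flag.NewFlagSet("agent", flag.ContinueOnError)
	features := fset.String("enable-features", "", "comma-separated list of experimental features")
	fset.String("config.url", "", "remote config URL (experimental, hypothetical name)")
	if err := fset.Parse(args); err != nil {
		return err
	}
	var enabled []string
	if *features != "" {
		enabled = strings.Split(*features, ",")
	}
	return validate(fset, enabled)
}

func main() {
	fmt.Println(parse([]string{"-config.url=https://example.com/agent.yaml"}))
	fmt.Println(parse([]string{"-enable-features=remote-configs", "-config.url=https://example.com/agent.yaml"}))
}
```

Because the check runs after parsing, the order in which the flags appear on the command line does not matter.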
LGTM minus final nits
great work, thank you!
* Register missing metric (#860) * register missing metric for configstore consul list duration * rename metric * rename metric (operation type is a label) * Add caller information to logs (#861) * Rename prometheus to metrics (#882) * alias `prometheus` as `metrics` and note deprecation * changelog/upgrade guide * defensively ensure defaults are loaded in to deprecated prometheus struct * Update metrics in mixin (#878) * Add namespace to some queries which were missing it * Move samples queries to samples row, update to current metric names, and allow some graphs to work with multiple instance groups * Update example dashboards with changes * Rename panel to match metric * Honor config refresh timeout, remove timeout for lifecycle and instead use background context (#876) * Honor config reshard timeout, remove timeout for lifecycle and instead use background check. * Added timeout and background refresh * add debugging information around deadline * add debugging information around deadline, add new setting for timeouts, and use that timeout * Fix cancel and refresh->reshard naming * Catch cancel if not joining. * Catch cancel if not joining. * Make the linter happy * Change comments. * Changes recommended from PR, most importantly is moving the goroutine to the reshard instead of the caller to reshard. * Fix changelog * Match err with log. * Move mutex inside goroutine * Update changelog with 877 fix * Revert to previous behavior * Scraping service: use buffered channel for requesting refreshes (#886) * refresh queue, remove mut * fix logs * more logs! * how many logs can i add? * how many logs can i change? 
* more logging * changelog * Automated smoke tests for metrics (#825) * poc smoke test automation * finish smoke test environment * debug logs * fix typo, sync immediately * add crow, add crow-related alerts * have v0 and v1 libs use v2 lib * basic cpu, memory tests * fix chaos_loop to stop generating replica -3 which will never exist * Add initial documentation for crow. * update comments * Add uml * Cleanup documentation * Reformate/move docs * Fix some verbiage * Fix doc checking * Update cmd/grafana-agent-crow/README.md Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Small nits * Fix pathing in readme Co-authored-by: Matt Durham <mattdurham@ppog.org> * Move port to load_balancing config as receiver_port (#881) * Move port to load_balancing config as receiver_port When extracting load balancing from tail sampling, only the the load_balancing block was moved into its own block, but the receiver's port was left in tail_sampling. This means that the receiver port for load_balancing could not be configured without tail_sampling. Now its moved to load_balancing as receiver_port. * Load balance without tail sampling * Move port up * agentctl: allow specifying Grafana Cloud's API url in cloud-config cmd (#898) * add docs sync (#897) * prepare for v0.18.3 release (#899) * rename prom_instance -> metrics_instance (#905) * Containerize merges to development branches (#893) * Containerize merges to development branches * Update drone.yml Updated drone signature Co-authored-by: mattdurham <mattdurham@ppog.org> * Ensure reshards are scheduled every reshard_interval (#912) * ensure reshards are scheduled every reshard_interval * use single metric for config changed events * logs * Prepare for 0.18.4 release (#913) * prepare for 0.18.4 release * include #831 * simplify logic a little for calculating next reshard (#917) * tempo: if relabel config drops target, don't attempt to process (#904) (#906) This avoids e.g. 
logging a warning when a target is intentionally dropped * Update for `## Deploy GrafanaAgent` clusterrole (#923) Update `## Deploy GrafanaAgent` section to include additional kubelet metrics. Reason: When checking the metrics sent to Grafana Cloud, we noticed that some of kubelet metrics (cpu/memory usage) was missing. On the agent, we are using the backwards compatibility with the Prometheus Operator CRDs ServiceMonitor and PodMonitor , so that metric instance discovery is exactly as it is with our hosted prometheus. * Rename Tempo -> Traces (#909) * Initial run through on changing tempo -> traces * Second run through of changing tempo->traces * Update changelog * Fix dead link * Clean up readme * Clean up changelog * Rename to fit pattern * Fix k8s build * Change to deprecate instead of enhancement, make tests and implement backwards compatibility * Rename traces config until aliasing is known * update to tempo-config * PR feedback * Update doc * Convert in k8s RBAC api calls version from /v1beta1 to stable /v1 (#925) * Convert RBAC api version from /v1beta1 to stable /v1 From the documentation: RBAC resources The rbac.authorization.k8s.io/v1beta1 API version of ClusterRole, ClusterRoleBinding, Role, and RoleBinding is no longer served as of v1.22. Migrate manifests and API clients to use the rbac.authorization.k8s.io/v1 API version, available since v1.8. 
All existing persisted objects are accessible via the new APIs * Update jsonnet lib alpha-1.14 to stable-1.21 * Update changelog * Sync of vendor files * update .gitignore to be explicit about root paths (#932) * Mixin Dashboard Updates (#931) * Show targets by pod * Switch active series back to appended samples * Add a new samples rate panel * 1hour time range and 30s refresh for dashboards * Update example dashboards with changes * Add the agent-operational dashboard to the mixin * docs & tanka changes for s/loki/logs rename (#739) * docs & tanka changes for s/loki/logs rename * add alias for logs-config * remove aliasing * Standardize default scrape interval to 1m (#939) * Set default scrape_interval to 1m * Add envsubst to drone * Update changelog * Making configurable what method is used when adding k/v to spans (#920) * Making configurable what method is used when adding k/v to spans * Remove weird diff * Remove new nesting * Add operationType validation * CHANGELOG * Rename Prometheus to Metrics in docs / jsonnet (#940) * rename prometheus to metrics in docs / jsonnet * oops * Add note about using backslashes and backticks (#942) * Add note about using backslashes and backticks We need to note users about using backslashes in agent config file, and about not supported backticks too. Also I've added error messages text for quicker finding the solution by error text. * More detailed info about backslashes in regex * re-prefix metrics-subsystem metrics (#943) * Release v0.19.0 (#952) * Update changelog, versioning and some errors that slipped through * update path correctly * Updated the memory test with @rfratto suggestion to up the avalanche parameters to not get preemptive errors on memory. * Release v0.19.0 (#954) * Update changelog, versioning and some errors that slipped through * update path correctly * Updated the memory test with @rfratto suggestion to up the avalanche parameters to not get preemptive errors on memory. 
* Use correct package name for envsubst * Use correct package name for envsubst * Operator: Add support for managing a kubelet service (#907) * add support for managing a kubelet service * write test for kubelet reconciler * add example ServiceMonitor for the kubelet service * add instructions for kubelet ServiceMonitor * Update documentation with better release note and contributions (#956) * Update documentation * PR feedback changes * update docs link in deb/rpm service files (#945) * Upgrading Integration Dependencies (#944) * upgrading integrations depencies: elasticsearch, redis, postgres, mysql, memcached, statsd * Updating changelog * Updating changelog * Fix changelog typo * updating postgres exporter repo address on documentation * CheckKeyGroupsBatchSize directed to CheckKeysBatchSize configuration property * solving broken link * Add note about ports in tempo remote_write enpoint (#953) * Note about ports in tempo remote_write enpoint Added comment with examples how to fill port of `remote_write` tempo endpoint for local and on-premises instances. For new users it's unclear which port to choose, and in most cases they try to fill the default `3200` Tempo port, instead of gRPC receiver port. 
* Update docs/configuration/traces-config.md Co-authored-by: mattdurham <mattdurham@ppog.org> * Replace dots for underscores in autologged labels (#951) * Replace dots for underscores in autologged labels * Use SanitizeLabel function * Update changelog with some bad merges that added notes to v0.19.0 (#959) * Update changelog with some bad merges * Flatten out dependencies and give shoutout to (@gaantunes) * Add silent install to windows (#969) * Add silent install to windows * Add silent install to windows * remove extra colon * Update mongodb_exporter to release-0.27.0-grafana (#965) * update mongodb_exporter to release-0.27.0-grafana * fix lint errors * Update pkg/integrations/mongodb_exporter/mongodb_exporter.go Co-authored-by: gaantunes <g.amaral.antunes@gmail.com> * add note for compatiblemode Co-authored-by: gaantunes <g.amaral.antunes@gmail.com> * bump version of dnsmasq_exporter used (#973) * docs/operator: stop using FQDN for kube-apiserver __address__ (#972) This example only works when cluster domain is set to cluster.local. When the cluster domain is changed, scraping silently fails (hidden behind the "debug" log level): grafana-agent ts=2021-10-08T15:10:06.054149178Z level=debug agent=prometheus instance=320d908d14feeac9f916f752b6276792 component="scrape manager" scrape_pool=integrations/kubernetes/cadvisor target=https://kubernetes.default.svc.cluster.local:443/api/v1/nodes/ns100566-865ef3ad/proxy/metrics/cadvisor msg="Scrape failed" err="Get "https://kubernetes.default.svc.cluster.local:443/api/v1/nodes/ns100566-865ef3ad/proxy/metrics/cadvisor\": dial tcp: lookup kubernetes.default.svc.cluster.local on 10.43.0.10:53: no such host" Luckily, kubernetes also populates /etc/resolv.conf with search domains, so we can simply specify kubernetes.default.svc, and it works with all cluster domains. 
* Upgrade OTel to v0.36 (#971) * Upgrade OTel to v0.36 * Add replace for jaeger * Remove conversion * changelog * lint * Remove g1.17 annotations * Lint again * Left annotation change * Correct minor spelling mistake in comments (#977) * Update examples, docs for name changes, deprecations. (#978) * Add Hanif to Governance (#979) * traces: remove extra line feed in password file (#976) cf. 975. This is to match loki and prometheus behavior * Update statsd to latest release and change a few things around the cache (#985) * Update statsd to latest release and change a few things around the cache * standardize verbiage * remove unneeded mappercache * Replace perflib_exporter to use forked version that remove prometheus/log (used implicitly by windows_exporter) (#989) * Replace directive for perflab_exporter that windows_exporter uses so we can remove reliance on prometheus logging * update changelog * Update CHANGELOG.md Clean up changelog * [Documentation] Adding DandyDev public Grafana Agent chart (#987) * Adding DandyDev public Grafana Agent chart * Updating per reccomendations in review. 
* List the Grafana Agent charmed operator in the community projects (#994) * Do not immediately cancel promServiceDiscoProcessor's discoveryMgrCtx (#998) * Do not immediately cancel promServiceDiscoProcessor's discoveryMgrCtx * Add more information to changelog * Attempting to get or delete a config which does not exist 404's (#995) * [Tracing] Service graphs (#988) * Service graphs processor (#756) * Bare implementation of the service graph processor * Comment fixes * Collect unpaired spans metric * Add unpaired metric * Implementation improvements * Some more improvements * No need to close * Fix CI * Improve test stability * Some documentation * Fix tests * Truly fix it * Check span error (#901) * Check span's error * Also check http status code * Update tests * Inspect http and grpc status too * Use map for performance * Cleanup * Count number of dropped spans and remove untagged metric (#967) * Fix CHANGELOG * Extend docs * Fix pipelines * Add note for metrics' namespace * Scrape only local endpoint docker-compose * Rename metrics: tempo -> traces * Gaantunes/docs security recommendations (#1003) * Including security recommendations on docs, to restrict user privileges were secrets are necessary * Updatintg CHANGELOG.md * Fixing typos * Update windows_exporter to v0.16.0 (#990) * update windows_exporter to 0.16.0 * add go.mod note to block merging * disable textfile collector by default * go mod tidy, use latest release branch commit * upgrade to github.com/drone/envsubst/v2 (#1004) * upgrade to github.com/drone/envsubst/v2 * update changelog * add github.com/drone/envsubst/v2 to depcheck * re-organize changelog * update out of place BUGFIX in changelog * update depcheck * Document changes due to OTel upgrade (#1011) * Document changes due to OTel upgrade * Fix doc link * It breaks things! 
* add grafana agent community call (#1016) * pkg/operator: only delete managed resources (#1027) * pkg/operator: only delete managed resources Also ensures that all managed resources have the managed-by label * optimize isManagedResource * Fully deprecate push_config (#1010) * Fully deprecate push_config * Document change in upgrade guide * prepare for v0.20.0 release (#1041) * Update _index.md (#1032) * Update _index.md * Update _index.md * fix upgrade guide (#1043) * Show remote write latency on dashboard (#1051) If samples are getting backlogged, it may be due to latency between the agent and the backend it is sending to. Replaced the differential panel which doesn't add much to the timestamp panel to the left, and essentially duplicates the differential samples rate panel below. Signed-off-by: Bryan Boreham <bjboreham@gmail.com> * Log config watcher reshard timer at debug level (#1050) Since this message is logged periodically and an implementation detail we can log it at a debug level. Resolves #1048 * Use correct user/group env variables (#1053) * Use correct user/group env variables The env variables in add_to_logging_group() did not match the names set at the start of the post install script. 
* Update CHANGELOG.md * Validate logs config when using logs_instance (#1058) * Change integrations to have a unique instance key (#1033) * wip: generate instance key from integration configs * remove accidentally committed file * refactor manager to support instance keys * docs changes * fix typo Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> * Operator: Fix metrics service port (#1060) * Fix metrics service port * Create a govern metrics service per Grafana Agent (#1065) * Add metrics service per Grafana Agent * Update changelog * Remove dead code * Operator: Add relabel_config.libsonnet to logs templates dir (#1064) * Add relabel_config.libsonnet to logs templates dir * Update CHANGELOG.md * Service graphs performance improvements (#1022) * Ditch cache library for map + mutex The implementation with the cache library had a ver bad performance, specially when calling `Items()`, since it would create a new map and copy all items to it. This commit changes the implementation to use a map with a mutex. 
* Clean up and comments * Correctly set expiration time * Notify of ready edges instead of looping through all * Use list to store edges * Minor fixes and comments * Properly check when head edge should be evicted * Remove old logic from previous implementations * Extract store to its own struct * Remove ugly copy * Fix shutdown * Correct log level * Correctly measure dropped spans * Minor fixes * Move collection logic outside store * Store tests * Fix dropped items logic * Lint fixes * Fix flaky test * Unnest for clarity * Skip empty service name spans * Update changelog * Fmt * add more detail to migrating to new CRD names in v0.19.0 (#1028) * promtail update (#1020) * upgrade loki version * update kubernetes sd config in test * fix windows logger * update prometheus * update prometheus dependency to grafana/prometheus * remove commented replace * add changelog entry for primary dependency updates * comment on klog-gokit replace * remove weaveworks/common replace * get rid of forks with merged PRs * make crds * update changelog entry for prometheus * Remove duplicate wrapping of logger with component key (#1077) Signed-off-by: Christian Haudum <christian.haudum@gmail.com> * pkg/integrations: Add Ryan Geyer and Gabriel Antunes as code owners (#1075) * pkg/integrations: Add Ryan Geyer and Gabriel Antunes as code owners * update MAINTAINERS list * [Operator] Add Helm quickstart guide (#1079) * Add Helm quickstart guide * Update CHANGELOG.md * Remove trailing newlines * Update docs/operator/helm-getting-started.md Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Update docs/operator/helm-getting-started.md Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Update docs/operator/helm-getting-started.md Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Update docs/operator/helm-getting-started.md Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Update docs/operator/helm-getting-started.md Co-authored-by: Robert Fratto 
<robert.fratto@grafana.com> * Update docs/operator/helm-getting-started.md Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Update docs/operator/helm-getting-started.md Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Update docs/operator/helm-getting-started.md Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Review edits and link from production/README.md Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Exempt dev branches issues/PRs from staleness (#1083) * handle retrieve_config failures (#1085) * hack around dash not having pipefail * fix typo 🤦♂️ * remove cat * add comment * fix shellcheck issues * trap and send TERM sig in fatal * hide error message * Update go-kit logger package to remove debug logs (#1094) * Update github.com/go-kit/kit to v0.12.0 (#1090) * update github.com/go-kit/kit to v0.12.0 * node_exporter integration: fix build error on non-linux GOOS Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Reorg agent operator docs (#1100) * Reorg agent operator docs * Update CHANGELOG.md * Fix broken links in agent operator docs (#1101) * Fix `success_codes` logic and document new config options (#1076) * Document new options * Add tests and fix logic * Changelog * lint * Moar fixes * Remove noisy log * Ups, clean up println * Better naming * prepare for v0.21.0 release (#1106) * fix upgrade guide (#1107) * Fix parsePostgresURL issue (#1111) * Fix parsePostgresURL issue * Update Implementation and fix unit test * Add nolint and fix unit test * Update changelog * move changelog entry to unreleased Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Update version of temporary dnsmasq_exporter fork used (#1114) Fixes #1113 * Fix small issues in stats_exporter integration (#1092) * adding the logger to statsd exporter mapper * Updating changelog * Fixing typo in changelog * update statsd_exporter to include registry fix Fixes #1002 Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Default to 
using latest agent release version in operator (#1117) Fixes #1029 * docs: stop using the term "fully deprecated" (#1118) Prefer "removed" over "fully deprecated" Fixes #1000 * Update Promtail links to link to grafana.com/docs (#1115) * Don't create a WAL cleaner when there's no configured WAL directory. (#1119) This avoids noisy logs when someone uses the agent only for logs/traces. Fixes #735 * v0.21.1 release prep (#1120) * Setup CD for agent (#1103) * Setup CD for agent This deploys agent to dev on a merge to main. Depends on grafana/deployment_tools#19936 Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com> * Update config and drone hash Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com> * Hopefully final changes Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com> * Finally make it work! Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com> * fix relref link (#1123) * fix relref link * Update docs/scraping-service/_index.md Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Mario <mariorvinas@gmail.com> * Fix typo (#1141) * Traces: Improved pod association in PromSD processor (#1137) * Improve k8s pod association * Add tests * Changelog * typo * Add prom_sd_pod_association * Extend tests for pod associations * Docs for pod association config * Lint fixes * Move to unreleased * Add instrumentation recommendations * Remove uncessary constants * Improve tests * remote config with http(s) provider (#1143) * sample remote config code with http provider * use t.TempDir() in unit test * no need to clean up after T.TempDir() * use NewClientFromConfig and make caller responsible for calling SetDirectory * handle nil HTTPClientConfig * remove blank identifier assignment * pass basic auth command line flags for remote config * address pr nits * add expiremental flag * set loader inline * update changelog * add remote config section in docs * pr comment updates * announce patch releases for cve-2021-41090 (#1152) * Merge patch 
release to main (#1153) * Add secret type to sensitive values * Break out config tests to their own implementation. Also remove username has a sensitive value. * Update changelog * Fix failing test * Scrub secrets when marshaling instance configs * update for v0.21 * Updated changes from the merge. * Remove changelog * Scrub out receivers has ***receivers_scrubber***:null * obscure etcd/consul credentials * Update pkg/traces/config_test.go Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Update pkg/config/config.go * go fmt * Change to using custom object and return <secret> * Fix bad merge * [v0.21.2] toggle config endpoint (#19) * disable /-/config endpoint by default * disable scraping api get endpoint as well * fix new test * add test and rename flag Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Update version to v0.21.2 * Update defaults.go * fix /-/config endpoint * also fix non-pointer config bug * temporarily disable linting for release * fix lint errors Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> * Fix POSTGRES_EXPORTER_DATA_SOURCE_NAME usage for postgres_exporter (#1162) * Fix POSTGRES_EXPORTER_DATA_SOURCE_NAME usage for postgres_exporter A recent change broke the usage of POSTGRES_EXPORTER_DATA_SOURCE_NAME for the postgres_exporter. As the incorrect variable was checked in the if clause, it always raises an error. 
* changelog: keep feature -> enhancement -> bugfix order * postgres_exporter: add regression test Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Robert Fratto <robertfratto@gmail.com> Co-authored-by: kgeckhart <kgeckhart@users.noreply.github.com> Co-authored-by: mattdurham <mattdurham@ppog.org> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Cristian Greco <cristian@regolo.cc> Co-authored-by: Robby Milo <robby.milo@grafana.com> Co-authored-by: James Callahan <35791147+james-callahan@users.noreply.github.com> Co-authored-by: aengusrooneygrafana <54800912+aengusrooneygrafana@users.noreply.github.com> Co-authored-by: Alexey Murz Korepov <murznn@gmail.com> Co-authored-by: gaantunes <g.amaral.antunes@gmail.com> Co-authored-by: Florian Klink <flokli@flokli.de> Co-authored-by: Joseph Woodward <josephwoodward@xeuse.com> Co-authored-by: Bruce Mitchener <bruce.mitchener@gmail.com> Co-authored-by: nicoche <78445450+nicoche@users.noreply.github.com> Co-authored-by: Aaron Layfield <Aaron.Layfield@gmail.com> Co-authored-by: Michele Mancioppi <michele.mancioppi@canonical.com> Co-authored-by: Lucas Heinlen <lucas.heinlen@gmail.com> Co-authored-by: Robert Lankford <rlankfo@gmail.com> Co-authored-by: Bryan Boreham <bjboreham@gmail.com> Co-authored-by: Ifeanyi Ubah <ify1992@yahoo.com> Co-authored-by: Simon Crute <simonc6372@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: hanif <hjet@users.noreply.github.com> Co-authored-by: shturman <s.s.koshel@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: Christian Haudum <christian.haudum@gmail.com> Co-authored-by: Mario <mario.rodriguez@grafana.com> Co-authored-by: Dharma Saputra <mail.dharma.saputra@gmail.com> Co-authored-by: Goutham Veeramachaneni <gouthamve@gmail.com> Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r 
<f11r@users.noreply.github.com>
* Fix typo (#1141) * Traces: Improved pod association in PromSD processor (#1137) * Improve k8s pod association * Add tests * Changelog * typo * Add prom_sd_pod_association * Extend tests for pod associations * Docs for pod association config * Lint fixes * Move to unreleased * Add instrumentation recommendations * Remove uncessary constants * Improve tests * remote config with http(s) provider (#1143) * sample remote config code with http provider * use t.TempDir() in unit test * no need to clean up after T.TempDir() * use NewClientFromConfig and make caller responsible for calling SetDirectory * handle nil HTTPClientConfig * remove blank identifier assignment * pass basic auth command line flags for remote config * address pr nits * add expiremental flag * set loader inline * update changelog * add remote config section in docs * pr comment updates * announce patch releases for cve-2021-41090 (#1152) * Merge patch release to main (#1153) * Add secret type to sensitive values * Break out config tests to their own implementation. Also remove username has a sensitive value. * Update changelog * Fix failing test * Scrub secrets when marshaling instance configs * update for v0.21 * Updated changes from the merge. 
* Remove changelog * Scrub out receivers has ***receivers_scrubber***:null * obscure etcd/consul credentials * Update pkg/traces/config_test.go Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Update pkg/config/config.go * go fmt * Change to using custom object and return <secret> * Fix bad merge * [v0.21.2] toggle config endpoint (#19) * disable /-/config endpoint by default * disable scraping api get endpoint as well * fix new test * add test and rename flag Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Update version to v0.21.2 * Update defaults.go * fix /-/config endpoint * also fix non-pointer config bug * temporarily disable linting for release * fix lint errors Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> * Fix POSTGRES_EXPORTER_DATA_SOURCE_NAME usage for postgres_exporter (#1162) * Fix POSTGRES_EXPORTER_DATA_SOURCE_NAME usage for postgres_exporter A recent change broke the usage of POSTGRES_EXPORTER_DATA_SOURCE_NAME for the postgres_exporter. As the incorrect variable was checked in the if clause, it always raises an error. 
* changelog: keep feature -> enhancement -> bugfix order * postgres_exporter: add regression test Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Fix syntax error in Jsonnet logs helper method (#1174) Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com> * cAdvisor Integration (#1081) * Add cadvisor module * Begin creating common config for cadvisor * Don't export internal state * Finish config options for cadvisor * Set config options, and implement cAdvisor collectors * Linting * Buildflags for cadvisor only in linux * I R LEArN Build Tags * Don't zero value the zero value * Offload sketchy global var manipulation to the integrations Run func * Remove unused collectors * Lint * Create generic stub integration and use it for cadvisor * Lint * Final refactor of cAdvisor config for unsupported platforms. Pared down stub integrations. * Lint * Docs for cadvisor config * Update changelog * Update pkg/integrations/stub_integration.go Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Reorder changelog * Instance key clarity * Inclusive naming * Finish name changes Keep default disable metric list in sync with upstream Idiomatic golang * Hardcode disabled metrics for cadvisor Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Remove log-level flag from systemd unit file (#1177) * Upgrade to OTel v0.40.0 (#1176) * Upgrade to OTel v0.40.0 * Changelog * Add factories check * go mod tidy * config/features: create package to standardize experimental features (#1170) * config/features: create package to standardize experiemental features This commit introduces a new package, pkg/config/features, which allows defining a set of features and validating whether flags associated with those features are allowed to be set. 
Closes #1163 * update documentation (also s/enabled-features/enable-features) * Fix typo * Update pkg/config/features/features.go Co-authored-by: Robert Lankford <rlankfo@gmail.com> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * enable cadvisor by default * switch to using real feature flag * fix postgres_exporter Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com>
* Fix typo (#1141)
* Traces: Improved pod association in PromSD processor (#1137)
* Improve k8s pod association
* Add tests
* Changelog
* typo
* Add prom_sd_pod_association
* Extend tests for pod associations
* Docs for pod association config
* Lint fixes
* Move to unreleased
* Add instrumentation recommendations
* Remove unnecessary constants
* Improve tests
* remote config with http(s) provider (#1143)
* sample remote config code with http provider
* use t.TempDir() in unit test
* no need to clean up after T.TempDir()
* use NewClientFromConfig and make caller responsible for calling SetDirectory
* handle nil HTTPClientConfig
* remove blank identifier assignment
* pass basic auth command line flags for remote config
* address pr nits
* add experimental flag
* set loader inline
* update changelog
* add remote config section in docs
* pr comment updates
* announce patch releases for cve-2021-41090 (#1152)
* Merge patch release to main (#1153)
* Add secret type to sensitive values
* Break out config tests to their own implementation. Also remove username as a sensitive value.
* Update changelog
* Fix failing test
* Scrub secrets when marshaling instance configs
* update for v0.21
* Updated changes from the merge.
* [dev.multiple-integrations] Enable present integrations by default, deprecate enabled field (#1062) * integrations: default to enabled by default * document deprecation of enabled * pkg/integrations: support *_configs field for integrations (#1130) Creates the basic code to unmarshal integrations from a YAML field called <integration name>_configs, which is a slice of that integration. Note that this is NOT wired up to the integrations manager yet, and trying to run the agent with more than one integration of the same type will likely cause problems. * [dev.multiple-integrations] Prototype new integrations subsystem (#1142) * wip: prototype new integrations subsystem * implement Controller with basic logic for Integration and UpdateIntegration * Implement HTTPIntegration for Controller * decouple controller and subsystem * don't have controller implement integration slightly less smelly now * multiplexer integration * rely on boilerplate for multiplexing for now generics would be nice here * remove multiplex_integration.go Also a little code smelly. Instead of having integrations that run other integrations, I'm going to fall back to having only one controller. 
* introduce Subsystem, unexport Controller start wiring up things to Subsystem * introduce v2 agent integration to use for testing * start wiring metrics integrations * rename Options to Globals call a spade a spade * add subsystem options to globals * remove dead code * metricsutils: calculate self-scraping based on globals * complete HTTP target API * working example with agent integration * appease the linter * don't return an error when context to cancel an integration is closed * once again i am asking the linter to forgive my typos * fix bug where labels from individual targets were getting dropped at the API endpoint * pkg/config: fix broken test * finish unit tests for integrations v2 controller * metricsutil/metricshandler_integration: make job name unique Before this change, the job name would have collided when using multiple instances of the same integration. * ensure that global subsystem labels are injected into targets * integrations/v2: Infer target hostname from SD API host (#1175) * [dev.multiple-integrations] integrations/v2: allow shimming between v1 and v2 integrations. (#1179) * integrations/v2: allow shimming between v1 and v2 integrations. Shimming is done by changing how the integration registration works; a new RegisterDynamic was added that allows for creating Configs at runtime. Here be dragons; this should be removed whenever we no longer have a need for it. * fix lint * pkg/integrations/v2: use "RegisterLegacy" instead of a generic mechanism * fine, I won't add the deprecation notice if it will make the linter sad * pkg/integrations: re-align (#1181) This commit reverts 69ba2dd in favor of allowing the new subsystem to handle multiple instances of integrations. This commit also removes the wal_truncate_frequency field from integrations as it is the only field from old integrations that does not have a current counterpart. 
* [dev.multiple-integrations] Hide integrations/v2 behind a feature flag (#1185)
* feature flag wip
* dynamically switch between integrations v1 and v2 default to v1.
* pkg/integrations/versionselector to file in pkg/config
* pkg/config: fix defaults for Integrations
* pkg/config: use more generic way to unmarshal differently based on flag
* add missing godoc comment
* more comments
* switch to deferred unmarshaling
* remove unused Config field
* simplify completeUnmarshal
* do not perform lazy deferred unmarshaling
* enable cadvisor by default
* switch to using real feature flag
* fix postgres_exporter
* Merge main into dev.multiple-integrations (#1184)
* Fix typo (#1141)
* Traces: Improved pod association in PromSD processor (#1137)
* Improve k8s pod association
* Add tests
* Changelog
* typo
* Add prom_sd_pod_association
* Extend tests for pod associations
* Docs for pod association config
* Lint fixes
* Move to unreleased
* Add instrumentation recommendations
* Remove unnecessary constants
* Improve tests
* remote config with http(s) provider (#1143)
* sample remote config code with http provider
* use t.TempDir() in unit test
* no need to clean up after T.TempDir()
* use NewClientFromConfig and make caller responsible for calling SetDirectory
* handle nil HTTPClientConfig
* remove blank identifier assignment
* pass basic auth command line flags for remote config
* address pr nits
* add experimental flag
* set loader inline
* update changelog
* add remote config section in docs
* pr comment updates
* announce patch releases for cve-2021-41090 (#1152)
* Merge patch release to main (#1153)
* Add secret type to sensitive values
* Break out config tests to their own implementation. Also remove username as a sensitive value.
* Update changelog
* Fix failing test
* Scrub secrets when marshaling instance configs
* update for v0.21
* Updated changes from the merge.
Closes #1163 * update documentation (also s/enabled-features/enable-features) * Fix typo * Update pkg/config/features/features.go Co-authored-by: Robert Lankford <rlankfo@gmail.com> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * enable cadvisor by default * switch to using real feature flag * fix postgres_exporter Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * Revert "Merge main into dev.multiple-integrations (#1184)" (#1189) This reverts commit ad76ec5. * [dev.multiple-integrations] Revert breaking changes to existing integrations (#1191) * revert breaking changes to integrations v1 This commit reverts #1062 in favor of making breaking changes directly in integrations-next instead. The part of #1181 to remove `wal_truncate_frequency` has also been reverted. As part of this change, the enabled field is removed from the v2 common metrics configs, and v2 integrations can no longer be disabled. v2 integrations can only be disabled by removing them from the YAML. 
* integrations/v2: remove stale reference to ErrDisabled (fix typo too) * integrations/v2: bring in common config decoupling * [dev.multiple-integrations] Introduce autoscraper (#1195) * pkg/integrations/v2: introduce self-scraping * linting * [dev.multiple-integrations] Multiple instances of integrations (#1196) * multiple instances of integrations opt in relevant v1 integrations into supporting multiple instances * shims should check for instance key override * Document integrations-next (#1197) * document integrations-next * remove json tags since they make markdown unhappy * changelog * s/Run/RunIntegration * remove stale comment about integrations.controller purpose * create dedicated run method for instanceScraper * s/expoter/exporter/g * Document why an autoscrape.Scraper manages a set of per-instance scrapers * spell out prerequisite instead of pre-req * use go.uber.org/atomic to make the code a little easier to follow * remove started callback for running integration * use smaller interface for autoscrape Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com>
* Update node_exporter dependency to v1.3.1 (#1228) * Add node_exporter to depcheck * update weaveworks/common dependency * map current release flags and changed defaults * documentation * revert accidental checkin * print out flags when node_exporter test fails to assist debugging * oops, i introduced some flags from master by mistake * Introduce experimental integrations revamp (#1198) * [dev.multiple-integrations] Enable present integrations by default, deprecate enabled field (#1062) * integrations: default to enabled by default * document deprecation of enabled * pkg/integrations: support *_configs field for integrations (#1130) Creates the basic code to unmarshal integrations from a YAML field called <integration name>_configs, which is a slice of that integration. Note that this is NOT wired up to the integrations manager yet, and trying to run the agent with more than one integration of the same type will likely cause problems. * [dev.multiple-integrations] Prototype new integrations subsystem (#1142) * wip: prototype new integrations subsystem * implement Controller with basic logic for Integration and UpdateIntegration * Implement HTTPIntegration for Controller * decouple controller and subsystem * don't have controller implement integration slightly less smelly now * multiplexer integration * rely on boilerplate for multiplexing for now generics would be nice here * remove multiplex_integration.go Also a little code smelly. Instead of having integrations that run other integrations, I'm going to fall back to having only one controller. 
* integrations/v2: remove stale reference to ErrDisabled (fix typo too) * integrations/v2: bring in common config decoupling * [dev.multiple-integrations] Introduce autoscraper (#1195) * pkg/integrations/v2: introduce self-scraping * linting * [dev.multiple-integrations] Multiple instances of integrations (#1196) * multiple instances of integrations opt in relevant v1 integrations into supporting multiple instances * shims should check for instance key override * Document integrations-next (#1197) * document integrations-next * remove json tags since they make markdown unhappy * changelog * s/Run/RunIntegration * remove stale comment about integrations.controller purpose * create dedicated run method for instanceScraper * s/expoter/exporter/g * Document why an autoscrape.Scraper manages a set of per-instance scrapers * spell out prerequisite instead of pre-req * use go.uber.org/atomic to make the code a little easier to follow * remove started callback for running integration * use smaller interface for autoscrape Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * Fix panic when using 'stdout' in automatic logging (#1233) * integrations-next: fix bug where v2 integrations were not being strictly unmarshaled (#1235) * Remove jsonnet vendor folders (#1222) * remove jsonnet vendor This adds all vendor folders into .gitignore and removes cached vendor files from the repository. 
Closes #1221
* Update scripts and instructions for jsonnet vendor removal
* `make example-dashboards` will now also run `jb install`
* k3d environment instructions now include `jb install`
* smoke-test.bash will now run `jb install` prior to `tk apply`
* Fix link to k3d example in DEVELOPERS.md (#1242)
Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com>
* Fix node_exporter upgrade docs (#1239)
* Fix panic in automatic logging with stdout backend (#1243)
* pkg/util: support custom yaml.Unmarshaler implementations for util.UnmarshalYAMLMerged (#1244)
It's common for config types to implement yaml.Unmarshaler for:
* Applying defaults
* Applying extra logic post-unmarshal
If these config types were unmarshaled through util.UnmarshalYAMLMerged, the yaml.Unmarshaler implementation would never complete successfully, preventing the post-unmarshal logic from running. This issue was introduced in #1192, but went unnoticed until #1228 implemented yaml.Unmarshaler to perform field migrations. #1240 reported the issue. This commit fixes the bug by performing a second non-strict unmarshal to ensure that all input values unmarshal successfully, with the exception of unmarshal errors unrelated to unrecognized field names. This is hacky, but it's worth noting that util.UnmarshalYAMLMerged is a temporary workaround needed for the integrations-next migration, and will eventually be removed.
* Update k3d example grafana/grafonnet-lib version (#1246) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Create an e2e framework with support for running tests against k8s (#1234) * e2e: create an e2e framework with support for running tests against a k3d cluster * add new E2E drone job * E2E tests should pass when doing a release * sign drone.yml again * move e2e lint to different step that has golangci-lint installed * upgrade golangci-lint and go for e2e test * e2e: add gcc * E2E: install build-essential to get a working full gcc env * :( * e2e: support running from inside of docker * fix lint error * address review feedback * Operator: fix bug where /-/ready and /-/healthy always returned 404 (#1252) * operator: fix bug where /-/ready and /-/healthy always returned 404 controller-runtime must have at least one ready/healthy check for the endpoints to exist * fix lint error, use healthz.Ping * Make scraping-svc use the new `metrics:` key (#1259) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * update prometheus dependency (#1260) * corrected typo (#1265) * Use RELEASE_TAG to choose between `:main` and `:latest` docker tags (#1264) * Use RELEASE_TAG to choose between `:main` and `:latest` docker tags Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Use :main tag for images in smoke test Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Set IMAGE_BRANCH_TAG env var in drone and actions pipelines Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Remove quotes from Makefile variable Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Remove force_release action Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * prepare for v0.22.0 release (#1266) * prepare for v0.22.0 release * remove E2E pipeline Co-authored-by: Robert Fratto <robert.fratto@grafana.com> Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: 
Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> Co-authored-by: Paschalis Tsilias <tpaschalis@users.noreply.github.com> Co-authored-by: Patrick Koenig <pkoenig10@gmail.com> Co-authored-by: DataPoints <langer.markus@gmail.com>
* Update node_exporter dependency to v1.3.1 (#1228) * Add node_exporter to depcheck * update weaveworks/common dependency * map current release flags and changed defaults * documentation * revert accidental checkin * print out flags when node_exporter test fails to assist debugging * oops, i introduced some flags from master by mistake * Introduce experimental integrations revamp (#1198) * [dev.multiple-integrations] Enable present integrations by default, deprecate enabled field (#1062) * integrations: default to enabled by default * document deprecation of enabled * pkg/integrations: support *_configs field for integrations (#1130) Creates the basic code to unmarshal integrations from a YAML field called <integration name>_configs, which is a slice of that integration. Note that this is NOT wired up to the integrations manager yet, and trying to run the agent with more than one integration of the same type will likely cause problems. * [dev.multiple-integrations] Prototype new integrations subsystem (#1142) * wip: prototype new integrations subsystem * implement Controller with basic logic for Integration and UpdateIntegration * Implement HTTPIntegration for Controller * decouple controller and subsystem * don't have controller implement integration slightly less smelly now * multiplexer integration * rely on boilerplate for multiplexing for now generics would be nice here * remove multiplex_integration.go Also a little code smelly. Instead of having integrations that run other integrations, I'm going to fall back to having only one controller. 
* introduce Subsystem, unexport Controller start wiring up things to Subsystem * introduce v2 agent integration to use for testing * start wiring metrics integrations * rename Options to Globals call a spade a spade * add subsystem options to globals * remove dead code * metricsutils: calculate self-scraping based on globals * complete HTTP target API * working example with agent integration * appease the linter * don't return an error when context to cancel an integration is closed * once again i am asking the linter to forgive my typos * fix bug where labels from individual targets were getting dropped at the API endpoint * pkg/config: fix broken test * finish unit tests for integrations v2 controller * metricsutil/metricshandler_integration: make job name unique Before this change, the job name would have collided when using multiple instances of the same integration. * ensure that global subsystem labels are injected into targets * integrations/v2: Infer target hostname from SD API host (#1175) * [dev.multiple-integrations] integrations/v2: allow shimming between v1 and v2 integrations. (#1179) * integrations/v2: allow shimming between v1 and v2 integrations. Shimming is done by changing how the integration registration works; a new RegisterDynamic was added that allows for creating Configs at runtime. Here be dragons; this should be removed whenever we no longer have a need for it. * fix lint * pkg/integrations/v2: use "RegisterLegacy" instead of a generic mechanism * fine, I won't add the deprecation notice if it will make the linter sad * pkg/integrations: re-align (#1181) This commit reverts 69ba2dd in favor of allowing the new subsystem to handle multiple instances of integrations. This commit also removes the wal_truncate_frequency field from integrations as it is the only field from old integrations that does not have a current counterpart. 
* [dev.multiple-integrations] Hide integrations/v2 behind a feature flag (#1185) * feature flag wip * dynamically switch between integrations v1 and v2 default to v1. * pkg/integrations/versionselector to file in pkg/config * pkg/config: fix defaults for Integrations * pkg/config: use more generic way to unmarshal differently based on flag * add missing godoc comment * more comments * switch to deferred unmarshaling * remove unused Config field * simplify completeUnmarshal * do not perform lazy deferred unmarshaling * enable cadvisor by default * switch to using real feature flag * fix postgres_exporter * Merge main into dev.multiple-integrations (#1184) * Fix typo (#1141) * Traces: Improved pod association in PromSD processor (#1137) * Improve k8s pod association * Add tests * Changelog * typo * Add prom_sd_pod_association * Extend tests for pod associations * Docs for pod association config * Lint fixes * Move to unreleased * Add instrumentation recommendations * Remove unnecessary constants * Improve tests * remote config with http(s) provider (#1143) * sample remote config code with http provider * use t.TempDir() in unit test * no need to clean up after T.TempDir() * use NewClientFromConfig and make caller responsible for calling SetDirectory * handle nil HTTPClientConfig * remove blank identifier assignment * pass basic auth command line flags for remote config * address pr nits * add experimental flag * set loader inline * update changelog * add remote config section in docs * pr comment updates * announce patch releases for cve-2021-41090 (#1152) * Merge patch release to main (#1153) * Add secret type to sensitive values * Break out config tests to their own implementation. Also remove username as a sensitive value. * Update changelog * Fix failing test * Scrub secrets when marshaling instance configs * update for v0.21 * Updated changes from the merge. 
* Remove changelog * Scrub out receivers as ***receivers_scrubber***:null * obscure etcd/consul credentials * Update pkg/traces/config_test.go Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Update pkg/config/config.go * go fmt * Change to using custom object and return <secret> * Fix bad merge * [v0.21.2] toggle config endpoint (#19) * disable /-/config endpoint by default * disable scraping api get endpoint as well * fix new test * add test and rename flag Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Update version to v0.21.2 * Update defaults.go * fix /-/config endpoint * also fix non-pointer config bug * temporarily disable linting for release * fix lint errors Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> * Fix POSTGRES_EXPORTER_DATA_SOURCE_NAME usage for postgres_exporter (#1162) * Fix POSTGRES_EXPORTER_DATA_SOURCE_NAME usage for postgres_exporter A recent change broke the usage of POSTGRES_EXPORTER_DATA_SOURCE_NAME for the postgres_exporter. Because the wrong variable was checked in the if clause, an error was always raised. 
* changelog: keep feature -> enhancement -> bugfix order * postgres_exporter: add regression test Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Fix syntax error in Jsonnet logs helper method (#1174) Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com> * cAdvisor Integration (#1081) * Add cadvisor module * Begin creating common config for cadvisor * Don't export internal state * Finish config options for cadvisor * Set config options, and implement cAdvisor collectors * Linting * Buildflags for cadvisor only in linux * I R LEArN Build Tags * Don't zero value the zero value * Offload sketchy global var manipulation to the integrations Run func * Remove unused collectors * Lint * Create generic stub integration and use it for cadvisor * Lint * Final refactor of cAdvisor config for unsupported platforms. Pared down stub integrations. * Lint * Docs for cadvisor config * Update changelog * Update pkg/integrations/stub_integration.go Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Reorder changelog * Instance key clarity * Inclusive naming * Finish name changes Keep default disable metric list in sync with upstream Idiomatic golang * Hardcode disabled metrics for cadvisor Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Remove log-level flag from systemd unit file (#1177) * Upgrade to OTel v0.40.0 (#1176) * Upgrade to OTel v0.40.0 * Changelog * Add factories check * go mod tidy * config/features: create package to standardize experimental features (#1170) * config/features: create package to standardize experimental features This commit introduces a new package, pkg/config/features, which allows defining a set of features and validating whether flags associated with those features are allowed to be set. 
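As a rough illustration of the idea behind pkg/config/features (define a set of features, then reject flags whose feature is not enabled), here is a minimal sketch built only on the standard `flag` package. It does not reproduce the package's real API: the `Feature` type, the `dependencies` map, `validate`, `parse`, and the `remote-configs` / `config.url` names are hypothetical. Only the `enable-features` flag name is taken from the commit message.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// Feature names an experimental feature (hypothetical type for this sketch).
type Feature string

// dependencies maps a flag name to the feature that must be enabled before
// that flag may be set. Both names are made up for illustration.
var dependencies = map[string]Feature{
	"config.url": "remote-configs",
}

// validate returns an error if a flag that was explicitly set on fs depends
// on a feature that is not in enabled.
func validate(fs *flag.FlagSet, enabled []Feature) error {
	on := make(map[Feature]bool, len(enabled))
	for _, f := range enabled {
		on[f] = true
	}
	var err error
	// Visit only walks flags that were actually set on the command line.
	fs.Visit(func(f *flag.Flag) {
		feat, ok := dependencies[f.Name]
		if ok && !on[feat] && err == nil {
			err = fmt.Errorf("flag %q requires feature %q to be enabled", f.Name, feat)
		}
	})
	return err
}

// parse parses args and then checks feature-gated flags.
func parse(args []string) error {
	fs := flag.NewFlagSet("agent", flag.ContinueOnError)
	features := fs.String("enable-features", "", "comma-separated list of experimental features")
	fs.String("config.url", "", "hypothetical flag gated behind a feature")
	if err := fs.Parse(args); err != nil {
		return err
	}
	var enabled []Feature
	for _, name := range strings.Split(*features, ",") {
		if name != "" {
			enabled = append(enabled, Feature(name))
		}
	}
	return validate(fs, enabled)
}

func main() {
	// Gated flag without its feature: rejected.
	fmt.Println(parse([]string{"-config.url=http://example.com/agent.yaml"}))
	// Same flag with the feature enabled: accepted.
	fmt.Println(parse([]string{"-enable-features=remote-configs", "-config.url=http://example.com/agent.yaml"}))
}
```

Validating after parsing, rather than inside each flag's setter, keeps the result independent of the order in which flags appear on the command line.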
Closes #1163 * update documentation (also s/enabled-features/enable-features) * Fix typo * Update pkg/config/features/features.go Co-authored-by: Robert Lankford <rlankfo@gmail.com> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * enable cadvisor by default * switch to using real feature flag * fix postgres_exporter Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * Revert "Merge main into dev.multiple-integrations (#1184)" (#1189) This reverts commit ad76ec5. * [dev.multiple-integrations] Revert breaking changes to existing integrations (#1191) * revert breaking changes to integrations v1 This commit reverts #1062 in favor of making breaking changes directly in integrations-next instead. The part of #1181 to remove `wal_truncate_frequency` has also been reverted. As part of this change, the enabled field is removed from the v2 common metrics configs, and v2 integrations can no longer be disabled. v2 integrations can only be disabled by removing them from the YAML. 
* integrations/v2: remove stale reference to ErrDisabled (fix typo too) * integrations/v2: bring in common config decoupling * [dev.multiple-integrations] Introduce autoscraper (#1195) * pkg/integrations/v2: introduce self-scraping * linting * [dev.multiple-integrations] Multiple instances of integrations (#1196) * multiple instances of integrations opt in relevant v1 integrations into supporting multiple instances * shims should check for instance key override * Document integrations-next (#1197) * document integrations-next * remove json tags since they make markdown unhappy * changelog * s/Run/RunIntegration * remove stale comment about integrations.controller purpose * create dedicated run method for instanceScraper * s/expoter/exporter/g * Document why an autoscrape.Scraper manages a set of per-instance scrapers * spell out prerequisite instead of pre-req * use go.uber.org/atomic to make the code a little easier to follow * remove started callback for running integration * use smaller interface for autoscrape Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * Fix panic when using 'stdout' in automatic logging (#1233) * integrations-next: fix bug where v2 integrations were not being strictly unmarshaled (#1235) * Remove jsonnet vendor folders (#1222) * remove jsonnet vendor This adds all vendor folders into .gitignore and removes cached vendor files from the repository. 
Closes #1221 * Update scripts and instructions for jsonnet vendor removal * `make example-dashboards` will now also run `jb install` * k3d environment instructions now include `jb install` * smoke-test.bash will now run `jb install` prior to `tk apply` * Fix link to k3d example in DEVELOPERS.md (#1242) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Fix node_exporter upgrade docs (#1239) * Fix panic in automatic logging with stdout backend (#1243) * pkg/util: support custom yaml.Unmarshaler implementations for util.UnmarshalYAMLMerged (#1244) It's common for config types to implement yaml.Unmarshaler for: * Applying defaults * Applying extra logic post-unmarshal If these config types were unmarshaled through util.UnmarshalYAMLMerged, the yaml.Unmarshaler implementation would never complete successfully, preventing the post-unmarshal logic from running. This issue was introduced in #1192, but went unnoticed until #1228 implemented yaml.Unmarshaler to perform field migrations. #1240 reported the issue. This commit fixes the bug by performing a second non-strict unmarshal to ensure that all input values unmarshal successfully, with the exception of unmarshal errors unrelated to unrecognized field names. This is hacky, but it's worth noting that util.UnmarshalYAMLMerged is a temporary workaround needed for the integrations-next migration, and will eventually be removed. 
* Update k3d example grafana/grafonnet-lib version (#1246) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Create an e2e framework with support for running tests against k8s (#1234) * e2e: create an e2e framework with support for running tests against a k3d cluster * add new E2E drone job * E2E tests should pass when doing a release * sign drone.yml again * move e2e lint to different step that has golangci-lint installed * upgrade golangci-lint and go for e2e test * e2e: add gcc * E2E: install build-essential to get a working full gcc env * :( * e2e: support running from inside of docker * fix lint error * address review feedback * Operator: fix bug where /-/ready and /-/healthy always returned 404 (#1252) * operator: fix bug where /-/ready and /-/healthy always returned 404 controller-runtime must have at least one ready/healthy check for the endpoints to exist * fix lint error, use healthz.Ping * Make scraping-svc use the new `metrics:` key (#1259) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * update prometheus dependency (#1260) * corrected typo (#1265) * Use RELEASE_TAG to choose between `:main` and `:latest` docker tags (#1264) * Use RELEASE_TAG to choose between `:main` and `:latest` docker tags Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Use :main tag for images in smoke test Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Set IMAGE_BRANCH_TAG env var in drone and actions pipelines Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Remove quotes from Makefile variable Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Remove force_release action Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * prepare for v0.22.0 release (#1266) * prepare for v0.22.0 release * remove E2E pipeline * Add basic testing framework for operator (#1268) * remove dedicated go.mod for e2e/ * move e2e/k8s to pkg/util/k8s * Migrate operator tests to 
pkg/util/k8s * remove dedicated e2e tests * allow skipping TestCluster in pkg/util/k8s * remove e2e/ * fix bad merge * fix order of make env args for windows * actually declare referenced docker volume * introduce pkg/util/subset for asserting subset of objects * refactor operator so it's testable * define basic integration test for operator * fix lint errors * fix invalid address in operator test config * Update release-note.md (#1267) * Set scrape User-Agent header during init (#1274) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Upgrade to Go 1.17 (#1278) * Upgrade to 1.17.6 in go.mod and Dockerfiles * Update CHANGELOG.md to mention the update * Update Go version in drone/actions pipelines * Update go.mod, go.sum files via * Re-sign drone.yml * Remove leading newline causing drone build to fail * Bump golangci-lint image to a version using Go 1.17 * Re-attempt to solve linter issue with new golangci-lint image * Remove suffix of exclude rules * Clean previous Go version before unpacking Go 1.17 * Also clean up previous Go versions in other steps * fix typo (#1284) * Use custom Go version in agent-operator Dockerfile (#1286) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * pkg/operator: refactor resource hierarchy discovery (#1271) * pkg/operator: refactor resource hierarchy discovery This commit moves common logic related to discovering the resource hierarchy to pkg/operator/hierarchy. This new package requires less boilerplate, which the reconciler is updated to take advantage of. 
* remove unused code * test construction of resource hierarchy * add missing build constraints * small extra cleanup to use pointer package * review feedback * update agent-build-image for go 1.17 (#1287) (also use a consistent base image tag instead of latest) * Skip non-ready entries when listing instances (#1289) * Skip non-ready instances in LoadInstances() Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add changelog entry Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Fix panic in prom_sd_processor when address is empty (#1279) * Fix panic in prom_sd_processor when address is empty * Fix panic in prom_sd_processor when address is empty * Fix docs * Add test case * Lint * Move to unreleased * Operator: generate proxy_url for remote_write (#1298) * operator: generate proxy_url for remote_write * fix weird indentation in test Co-authored-by: Robert Fratto <robert.fratto@grafana.com> Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> Co-authored-by: Paschalis Tsilias <tpaschalis@users.noreply.github.com> Co-authored-by: Patrick Koenig <pkoenig10@gmail.com> Co-authored-by: DataPoints <langer.markus@gmail.com> Co-authored-by: Alex <52292902+alexrudd2@users.noreply.github.com> Co-authored-by: Robert Fratto <robertfratto@gmail.com>
* remove unused code * test construction of resource hierarchy * add missing build constraints * small extra cleanup to use pointer package * review feedback * update agent-build-image for go 1.17 (#1287) (also use a consistent base image tag instead of latest) * Skip non-ready entries when listing instances (#1289) * Skip non-ready instances in LoadInstances() Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add changelog entry Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Fix panic in prom_sd_processor when address is empty (#1279) * Fix panic in prom_sd_processor when address is empty * Fix panic in prom_sd_processor when address is empty * Fix docs * Add test case * Lint * Move to unreleased * Operator: generate proxy_url for remote_write (#1298) * operator: generate proxy_url for remote_write * fix weird indentation in test Co-authored-by: Robert Fratto <robert.fratto@grafana.com> Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> Co-authored-by: Paschalis Tsilias <tpaschalis@users.noreply.github.com> Co-authored-by: Patrick Koenig <pkoenig10@gmail.com> Co-authored-by: DataPoints <langer.markus@gmail.com> Co-authored-by: Alex <52292902+alexrudd2@users.noreply.github.com> Co-authored-by: Robert Fratto <robertfratto@gmail.com>
* sample remote config code with http provider
* use t.TempDir() in unit test
* no need to clean up after T.TempDir()
* use NewClientFromConfig and make caller responsible for calling SetDirectory
* handle nil HTTPClientConfig
* remove blank identifier assignment
* pass basic auth command line flags for remote config
* address pr nits
* add expiremental flag
* set loader inline
* update changelog
* add remote config section in docs
* pr comment updates
* Update node_exporter dependency to v1.3.1 (#1228) * Add node_exporter to depcheck * update weaveworks/common dependency * map current release flags and changed defaults * documentation * revert accidental checkin * print out flags when node_exporter test fails to assist debugging * oops, i introduced some flags from master by mistake * Introduce experimental integrations revamp (#1198) * [dev.multiple-integrations] Enable present integrations by default, deprecate enabled field (#1062) * integrations: default to enabled by default * document deprecation of enabled * pkg/integrations: support *_configs field for integrations (#1130) Creates the basic code to unmarshal integrations from a YAML field called <integration name>_configs, which is a slice of that integration. Note that this is NOT wired up to the integrations manager yet, and trying to run the agent with more than one integration of the same type will likely cause problems. * [dev.multiple-integrations] Prototype new integrations subsystem (#1142) * wip: prototype new integrations subsystem * implement Controller with basic logic for Integration and UpdateIntegration * Implement HTTPIntegration for Controller * decouple controller and subsystem * don't have controller implement integration slightly less smelly now * multiplexer integration * rely on boilerplate for multiplexing for now generics would be nice here * remove multiplex_integration.go Also a little code smelly. Instead of having integrations that run other integrations, I'm going to fall back to having only one controller. 
* introduce Subsystem, unexport Controller start wiring up things to Subsystem * introduce v2 agent integration to use for testing * start wiring metrics integrations * rename Options to Globals call a spade a spade * add subsystem options to globals * remove dead code * metricsutils: calculate self-scraping based on globals * complete HTTP target API * working example with agent integration * appease the linter * don't return an error when context to cancel an integration is closed * once again i am asking the linter to forgive my typos * fix bug where labels from individual targets were getting dropped at the API endpoint * pkg/config: fix broken test * finish unit tests for integrations v2 controller * metricsutil/metricshandler_integration: make job name unique Before this change, the job name would have collided when using multiple instances of the same integration. * ensure that global subsystem labels are injected into targets * integrations/v2: Infer target hostname from SD API host (#1175) * [dev.multiple-integrations] integrations/v2: allow shimming between v1 and v2 integrations. (#1179) * integrations/v2: allow shimming between v1 and v2 integrations. Shimming is done by changing how the integration registration works; a new RegisterDynamic was added that allows for creating Configs at runtime. Here be dragons; this should be removed whenever we no longer have a need for it. * fix lint * pkg/integrations/v2: use "RegisterLegacy" instead of a generic mechanism * fine, I won't add the deprecation notice if it will make the linter sad * pkg/integrations: re-align (#1181) This commit reverts 69ba2dd in favor of allowing the new subsystem to handle multiple instances of integrations. This commit also removes the wal_truncate_frequency field from integrations as it is the only field from old integrations that does not have a current counterpart. 
* [dev.multiple-integrations] Hide integrations/v2 behind a feature flag (#1185) * feature flag wip * dynamically switch between integrations v1 and v2 default to v1. * pkg/integrations/versionselector to file in pkg/config * pkg/config: fix defaults for Integrations * pkg/config: use more generic way to unmarshal differently based on flag * add missing godoc comment * more comments * switch to deferred unmarshaling * remove unused Config field * simplify completeUnmarshal * do not perform lazy deferred unmarshaling * enable cadvisor by default * switch to using real feature flag * fix postgres_exporter * Merge main into dev.multiple-integrations (#1184) * Fix typo (#1141) * Traces: Improved pod association in PromSD processor (#1137) * Improve k8s pod association * Add tests * Changelog * typo * Add prom_sd_pod_association * Extend tests for pod associations * Docs for pod association config * Lint fixes * Move to unreleased * Add instrumentation recommendations * Remove uncessary constants * Improve tests * remote config with http(s) provider (#1143) * sample remote config code with http provider * use t.TempDir() in unit test * no need to clean up after T.TempDir() * use NewClientFromConfig and make caller responsible for calling SetDirectory * handle nil HTTPClientConfig * remove blank identifier assignment * pass basic auth command line flags for remote config * address pr nits * add expiremental flag * set loader inline * update changelog * add remote config section in docs * pr comment updates * announce patch releases for cve-2021-41090 (#1152) * Merge patch release to main (#1153) * Add secret type to sensitive values * Break out config tests to their own implementation. Also remove username has a sensitive value. * Update changelog * Fix failing test * Scrub secrets when marshaling instance configs * update for v0.21 * Updated changes from the merge. 
* Remove changelog * Scrub out receivers has ***receivers_scrubber***:null * obscure etcd/consul credentials * Update pkg/traces/config_test.go Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Update pkg/config/config.go * go fmt * Change to using custom object and return <secret> * Fix bad merge * [v0.21.2] toggle config endpoint (#19) * disable /-/config endpoint by default * disable scraping api get endpoint as well * fix new test * add test and rename flag Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Update version to v0.21.2 * Update defaults.go * fix /-/config endpoint * also fix non-pointer config bug * temporarily disable linting for release * fix lint errors Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> * Fix POSTGRES_EXPORTER_DATA_SOURCE_NAME usage for postgres_exporter (#1162) * Fix POSTGRES_EXPORTER_DATA_SOURCE_NAME usage for postgres_exporter A recent change broke the usage of POSTGRES_EXPORTER_DATA_SOURCE_NAME for the postgres_exporter. As the incorrect variable was checked in the if clause, it always raises an error. 
* changelog: keep feature -> enhancement -> bugfix order * postgres_exporter: add regression test Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Fix syntax error in Jsonnet logs helper method (#1174) Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com> * cAdvisor Integration (#1081) * Add cadvisor module * Begin creating common config for cadvisor * Don't export internal state * Finish config options for cadvisor * Set config options, and implement cAdvisor collectors * Linting * Buildflags for cadvisor only in linux * I R LEArN Build Tags * Don't zero value the zero value * Offload sketchy global var manipulation to the integrations Run func * Remove unused collectors * Lint * Create generic stub integration and use it for cadvisor * Lint * Final refactor of cAdvisor config for unsupported platforms. Pared down stub integrations. * Lint * Docs for cadvisor config * Update changelog * Update pkg/integrations/stub_integration.go Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Reorder changelog * Instance key clarity * Inclusive naming * Finish name changes Keep default disable metric list in sync with upstream Idiomatic golang * Hardcode disabled metrics for cadvisor Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Remove log-level flag from systemd unit file (#1177) * Upgrade to OTel v0.40.0 (#1176) * Upgrade to OTel v0.40.0 * Changelog * Add factories check * go mod tidy * config/features: create package to standardize experimental features (#1170) * config/features: create package to standardize experiemental features This commit introduces a new package, pkg/config/features, which allows defining a set of features and validating whether flags associated with those features are allowed to be set. 
Closes #1163 * update documentation (also s/enabled-features/enable-features) * Fix typo * Update pkg/config/features/features.go Co-authored-by: Robert Lankford <rlankfo@gmail.com> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * enable cadvisor by default * switch to using real feature flag * fix postgres_exporter Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * Revert "Merge main into dev.multiple-integrations (#1184)" (#1189) This reverts commit ad76ec5. * [dev.multiple-integrations] Revert breaking changes to existing integrations (#1191) * revert breaking changes to integrations v1 This commit reverts #1062 in favor of making breaking changes directly in integrations-next instead. The part of #1181 to remove `wal_truncate_frequency` has also been reverted. As part of this change, the enabled field is removed from the v2 common metrics configs, and v2 integrations can no longer be disabled. v2 integrations can only be disabled by removing them from the YAML. 
* integrations/v2: remove stale reference to ErrDisabled (fix typo too) * integrations/v2: bring in common config decoupling * [dev.multiple-integrations] Introduce autoscraper (#1195) * pkg/integrations/v2: introduce self-scraping * linting * [dev.multiple-integrations] Multiple instances of integrations (#1196) * multiple instances of integrations opt in relevant v1 integrations into supporting multiple instances * shims should check for instance key override * Document integrations-next (#1197) * document integrations-next * remove json tags since they make markdown unhappy * changelog * s/Run/RunIntegration * remove stale comment about integrations.controller purpose * create dedicated run method for instanceScraper * s/expoter/exporter/g * Document why an autoscrape.Scraper manages a set of per-instance scrapers * spell out prerequisite instead of pre-req * use go.uber.org/atomic to make the code a little easier to follow * remove started callback for running integration * use smaller interface for autoscrape Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * Fix panic when using 'stdout' in automatic logging (#1233) * integrations-next: fix bug where v2 integrations were not being strictly unmarshaled (#1235) * Remove jsonnet vendor folders (#1222) * remove jsonnet vendor This adds all vendor folders into .gitignore and removes cached vendor files from the repository. 
Closes #1221 * Update scripts and instructions for jsonnet vendor removal * `make example-dashboards` will now also run `jb install` * k3d environment instructions now include `jb install` * smoke-test.bash will now run `jb install` prior to `tk apply` * Fix link to k3d example in DEVELOPERS.md (#1242) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Fix node_exporter upgrade docs (#1239) * Fix panic in automatic logging with stdout backend (#1243) * pkg/util: support custom yaml.Unmarshaler implementations for util.UnmarshalYAMLMerged (#1244) It's common for config types to have implement yaml.Unmarshaler for: * Applying defaults * Applying extra logic post-unmarshal If these config types were unmarshaled through util.UnmarshalYAMLMerged, the yaml.Unmarshaler implementation would never complete successfully, preventing the post-unmarshal logic from running. This issue was introduced in #1192, but went unnoticed until #1228 implemented yaml.Unmarshaler to perform field migrations. #1240 reported the issue. This commit fixes the bug by performing a second non-strict unmarshal to ensure that all input values unmarshal successfully, with the exception of unmarshal errors unrelated to unrecognized field names. This is hacky, but it's worthwhile noting that util.UnmarshalYAMLMerged is a temporary workaround needed for the integrations-next migration, and will eventually be removed. 
* Update k3d example grafana/grafonnet-lib version (#1246) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Create an e2e framework with support for running tests against k8s (#1234) * e2e: create an e2e framework with support for running tests against a k3d cluster * add new E2E drone job * E2E tests should pass when doing a release * sign drone.yml again * move e2e lint to different step that has golangci-lint installed * upgrade golangci-lint and go for e2e test * e2e: add gcc * E2E: install build-essential to get a working full gcc env * :( * e2e: support running from inside of docker * fix lint error * address review feedback * Operator: fix bug where /-/ready and /-/healthy always returned 404 (#1252) * operator: fix bug where /-/ready and /-/healthy always returned 404 controller-runtime must have at least one ready/healthy check for the endpoints to exist * fix lint error, use healthz.Ping * Make scraping-svc use the new `metrics:` key (#1259) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * update prometheus dependency (#1260) * corrected typo (#1265) * Use RELEASE_TAG to choose between `:main` and `:latest` docker tags (#1264) * Use RELEASE_TAG to choose between `:main` and `:latest` docker tags Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Use :main tag for images in smoke test Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Set IMAGE_BRANCH_TAG env var in drone and actions pipelines Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Remove quotes from Makefile variable Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Remove force_release action Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * prepare for v0.22.0 release (#1266) * prepare for v0.22.0 release * remove E2E pipeline * Add basic testing framework for operator (#1268) * remove dedicated go.mod for e2e/ * move e2e/k8s to pkg/util/k8s * Migrate operator tests to 
pkg/util/k8s * remove dedicated e2e tests * allow skipping TestCluster in pkg/util/k8s * remove e2e/ * fix bad merge * fix order of make env args for windows * actually declare referenced docker volume * introduce pkg/util/subset for asserting subset of objects * refactor operator so it's testable * define basic integration test for operator * fix lint errors * fix invalid address in operator test config * Update release-note.md (#1267) * Set scrape User-Agent header during init (#1274) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Upgrade to Go 1.17 (#1278) * Upgrade to 1.17.6 in go.mod and Dockerfiles * Update CHANGELOG.md to mention the update * Update Go version in drone/actions pipelines * Update go.mod, go.sum files via * Re-sign drone.yml * Remove leading newline causing drone build to fail * Bump golangci-lint image to a version using Go 1.17 * Re-attempt to solve linter issue with new golangci-lint image * Remove suffix of exclude rules * Clean previous Go version before unpacking Go 1.17 * Also clean up previous Go versions in other steps * fix typo (#1284) * Use custom Go version in agent-operator Dockerfile (#1286) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * pkg/operator: refactor resource hierarchy discovery (#1271) * pkg/operator: refactor resource hierarchy discovery This commit moves common logic related to discovering the resource hierarchy to pkg/operator/hierarchy. This new package requires less boilerplate, which the reconciler is updated to take advantage of. 
* remove unused code * test construction of resource hierarchy * add missing build constraints * small extra cleanup to use pointer package * review feedback * update agent-build-image for go 1.17 (#1287) (also use a consistent base image tag instead of latest) * Skip non-ready entries when listing instances (#1289) * Skip non-ready instances in LoadInstances() Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add changelog entry Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Fix panic in prom_sd_processor when address is empty (#1279) * Fix panic in prom_sd_processor when address is empty * Fix panic in prom_sd_processor when address is empty * Fix docs * Add test case * Lint * Move to unreleased * Operator: generate proxy_url for remote_write (#1298) * operator: generate proxy_url for remote_write * fix weird indentation in test * Use log format in traces subsystem (#1272) * Use log format in traces subsystem * Changelog * Undo unwanted change * Fix changelog entry * integrations-next: Add extra_labels to inject extra labels for an integration (#1312) * integrations-next: Add extra_labels to inject extra labels for an integration. 
* separate tests * Fix anchor link on operator docs (#1302) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * updated config URL (#1304) The existing URL returns a 404: https://grafana.com/docs/agent/latest/getting-started/configuration/_index.md Updated to https://grafana.com/docs/agent/latest/configuration/ * Fix typo in node_exporter (#1325) * Allow remote_write URL credentials (#1329) * Bypass Prometheus password redaction Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add inline secret in existing test Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add changelog entry * Add to scrubbed testcase as well Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Stop appending duplicate exemplars (#1316) * Add memExemplar in stripeSeries as first iteration Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add test for skipped duplicate exemplars; Simplify conditional Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add changelog entry; discard test errors * Move changelog entry Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add Benchmark for AppendExemplar Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Discard error on added benchmark Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Use original exemplar struct instead of custom memExemplar Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Surround benchmark loop with start/stop timers and close test storage Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add comment about prepopulating exemplars on WAL startup Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Wire in the totalAppendedExemplars metric Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Make comment more discoverable Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Make sure we're recording exemplars for 
non-nil series ref only Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * integrations-next: wait for integrations to exit after stopping them (#1318) * integrations-next: wait for integrations to exit after stopping them * fix lint errors * minor refactor * integrations-next: stop holding config mutex for entire reload * make controller.run authoritative over running integrations * fix log line * move running integrations into a dedicated worker pool * operator/hierarchy: stop using field selector when listing Secrets & ConfigMaps (#1340) The initial implementation of hierarchy.KeySelector injected a FieldSelector when listing Secrets and ConfigMaps to immediately return the single object being queried for. This causes a problem with the client generated by the controller-runtime framework, where the client is wrapped in a cache and field indexer (where only the namespace is indexed by default). This commit avoids using the field selector and the index lookup. The resulting behavior aligns more closely with discovering other resources in the hierarchy (i.e., ServiceMonitors), where the List call is also insufficient and needs post-processing via Matches to find the final list of resources. Given the controller-runtime client uses an informer for reads, all relevant Secrets and ConfigMaps are already in-memory anyway, and using the index for a faster List is a bit of an over-optimization at the moment. * Add dependabot to update go modules and github actions. 
(#1217) Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com> * smoke framework refactor (#1326) * Agent smoke test (#1291) * convert smoke script to go program * update build for agent-smoke * fix pr comments * use existing log helper package * refactor context cancel * update exit codes * use ticker * prefer oklog/run instead of errgroup * use nop logger * refactor task interface * remove functional options * log.With for task loggers * move smoke to tools * build smoke image, push to internal registry * move crow to tools * add gcr_admin secret * fix link to crow * add smoke libsonnet and use in local k3d smoke test * add deletePodBySelectorTask * scale smoke-test replica down after local test * refactor smoke Options to Config * update duration usage message * add some basic unit tests * newlines * pass mutation frequency and chaos frequency from smoke script * pull crow image from gcr * update smoke script * move monitoring to smoke libsonnet * move additional smoke resources needed in deployment tools * reference libsonnet files from grafana-agent dep * make drone * fix images in smoke script * get rid of extVars * update k3d example environment to reference etcd from new location * update smoke docker builds to use go1.17 * use pointer.Int64 * refactor smoke jsonnet (#1296) * add policy rule for list and delete pods (#1319) * refactor smoke.new function to take config object (#1327) * Apply suggestions from code review * Update production/tanka/grafana-agent/smoke/crow/main.libsonnet * Update production/tanka/grafana-agent/smoke/main.libsonnet * Update example/k3d/scripts/smoke-test.bash Co-authored-by: Robert Lankford <robert.lankford@grafana.com> * readme update (#1338) Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Correct link to the configuration (#1036) Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Add stale check Github Action (#1345) * Add a stale check GH action to run every 24 hours * remove old stale.yml file * add 
permissions to action * update the stale message to clarify when the stale label will get removed * Update .github/workflows/stale.yml * stale action: fix missing indent (#1346) * Fix mssql issue (#1351) * Add K8s Events integration (#1330) * Add K8s eventhandler integration (#1310) * Add docs and sample manifests to eventhandler integration (#1328) * Wait for cache to flush before returning * Clarify eventhandler docs (#1334) * Clarify docs * Update CHANGELOG.md * Review changes (#1349) * stale action: fix typo in label exemptions (#1347) * update withVolumesMixin for agent jsonnet (#1358) Signed-off-by: Robbie Lankford <robert.lankford@grafana.com> * Configure cluster label using logs client external_labels param (#1357) * Configure cluster label using logs client external_labels param * Update CHANGELOG.md * add password file and basic auth round tripper in crow (#1361) * add password file and basic auth round tripper in crow * add ca-certificates in crow image * add orgID flag * update help text * default send_exemplars to true in remote_write (#1352) Signed-off-by: Robbie Lankford <robert.lankford@grafana.com> * Update eventhandler labels (#1368) * Update eventhandler integration labels * Update CHANGELOG * Remove unnecessary kind label * update changelog (#1374) Remove BUGFIX entries that fix a bug introduced by main (i.e., bugs which were never part of a release) * Prepare for release of v0.23.0 (#1377) * Update version references * Fix fat-fingered delete; Remove mention of upgrade Go * RFC: Design in the open (#1055) * rfc: first draft of RFC0001 * add placeholder for PR * update PR link * Update docs/rfcs/0001-designing-in-the-open.md Co-authored-by: Robert Lankford <rlankfo@gmail.com> * Update docs/rfcs/0001-designing-in-the-open.md Co-authored-by: Robert Lankford <rlankfo@gmail.com> * clarify "designing in the open" is best-effort * update 0001 * fix dead link in production/README.md * add recommended sections for RFC proposals * describe the process 
for approving a proposal * ignore RFC template in link checker * Update docs/rfcs/0001-designing-in-the-open.md Co-authored-by: Richard Hartmann <RichiH@users.noreply.github.com> * Update docs/rfcs/0001-designing-in-the-open.md Co-authored-by: Richard Hartmann <RichiH@users.noreply.github.com> * do my nitty 80-char line length limit change * indent pros/cons to a single section * document process for superseding RFCs * remove RFC mutability requirement * add extra flavor around not recommending google docs * require Google Doc -> RFC conversion * move new files Co-authored-by: Robert Lankford <rlankfo@gmail.com> Co-authored-by: Richard Hartmann <RichiH@users.noreply.github.com> * Add Grafana Labs SECURITY.md (#1356) Signed-off-by: Richard Hartmann <richih@richih.org> * Add readiness check to metrics component (#1369) * PR Base Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Fix autoscrape's mockInstance Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Wire in atomic readiness check Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Add CHANGELOG.md entry Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Reference page to download windows installer (#1372) Fixes #1366 * fix typo in node_exporter_config (#1389) which should be `privileged` instead of `priviliged` * Add option for Operator to pass arguments to GrafanaAgent #1227 (#1248) * 1250 oauth2 tracing (#1386) * Add oauth support for trace Otel trace exporter via opentelemetry-collector-contrib oauth2clientauthextension * start extensions on collector instance startup fix decoding to otelconfig build extensions add oauth extension to service map * Update traces config documentation * lint fixes * fix godoc comments * pass exporter index directly to exporter name generator * PR feedback; Update Changelog * sort extensions when sorting pipelines for testing determinism * README: Fix link to agent logo 
(#1396) * update MAINTAINERS.md (#1402) * add smoke alerts to mixin; move local alerts into examples dir (#1397) * add smoke alerts to mixin; move local alerts into examples dir * add podPrefix for smoke test * podPrefix in libsonnet config * [RFC] Integrations in Grafana Agent Operator (#1224) * rfc: integrations in grafana agent operator Supersedes #883 * add missing links * Apply suggestions from code review Co-authored-by: Florian Klink <flokli@flokli.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * clarify how many daemonsets/deployments/service/secrets are created * add example of defining secrets * try defining integrations * s/IntegrationsMonitor/IntegrationMonitor/g * simplify proposal * add alternatives * remove old reference to `hasMetrics` field * document example generated agent configuration file * assign ID RFC-0002 * add missing PR link Co-authored-by: Florian Klink <flokli@flokli.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * add fake rw endpoint to smoke program (#1405) * fix alerts typo (#1407) * continuous delivery for smoke images (#1408) Signed-off-by: Robbie Lankford <robert.lankford@grafana.com> * fix continuous delivery job errors (#1409) Signed-off-by: Robbie Lankford <robert.lankford@grafana.com> * [operator] - Use _file variants for basic auth credentials. 
(#1411) * use password_file alternatives in operator config * update tests * reduce smoke alert noise (#1412) * reduce smoke alert noise Signed-off-by: Robbie Lankford <robert.lankford@grafana.com> * Update production/grafana-agent-mixin/alerts.libsonnet Co-authored-by: Robert Fratto <robertfratto@gmail.com> * update cpu check comment Signed-off-by: Robbie Lankford <robert.lankford@grafana.com> * add minimum load threshold to cpu alert Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Clarify usage of instanceNamespaceSelector (#1413) * RFC-0001: Add status to RFC (#1391) * rfc-0001: add rules for when RFC PRs should be merged * use status field instead of merge to indicate state * Parametrize logs DaemonSet K8s manifests (#1420) * Parametrize logs daemonset K8s manifests * Update CHANGELOG.md * Extend linting configuration file (#1421) * Add depguard linter to reject packages we tend to avoid * Replace golint with revive, since golint is deprecated * Remove interfacer, which is deprecated with no replacement * Add makezero linter to detect misuse of make with append * Add tenv to prefer t.Setenv over os.Setenv in tests * Add whitespace to report unnecessary blank lines * Ignore test files for errcheck In addition to the above, the following changes were made: * Remove settings that just re-set default values, instead pointing to the website to retrieve defaults. * Simplify the errcheck rule to only include functions we actually need to ignore. 
* Main merge changes Co-authored-by: Robert Fratto <robert.fratto@grafana.com> Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> Co-authored-by: Paschalis Tsilias <tpaschalis@users.noreply.github.com> Co-authored-by: Patrick Koenig <pkoenig10@gmail.com> Co-authored-by: DataPoints <langer.markus@gmail.com> Co-authored-by: Alex <52292902+alexrudd2@users.noreply.github.com> Co-authored-by: Robert Fratto <robertfratto@gmail.com> Co-authored-by: melGL <81323402+melgl@users.noreply.github.com> Co-authored-by: Tom Wilkie <tomwilkie@users.noreply.github.com> Co-authored-by: Joseph Woodward <josephwoodward@xeuse.com> Co-authored-by: hanif <hjet@users.noreply.github.com> Co-authored-by: Richard Hartmann <RichiH@users.noreply.github.com> Co-authored-by: laiwei <laiwei.ustc@gmail.com> Co-authored-by: Sam <shamsalmon@users.noreply.github.com> Co-authored-by: Chris Knutson <christopher.knutson@gmail.com> Co-authored-by: Florian Klink <flokli@flokli.de> Co-authored-by: Craig Peterson <192540+captncraig@users.noreply.github.com>
* Update node_exporter dependency to v1.3.1 (#1228) * Add node_exporter to depcheck * update weaveworks/common dependency * map current release flags and changed defaults * documentation * revert accidental checkin * print out flags when node_exporter test fails to assist debugging * oops, i introduced some flags from master by mistake * Introduce experimental integrations revamp (#1198) * [dev.multiple-integrations] Enable present integrations by default, deprecate enabled field (#1062) * integrations: default to enabled by default * document deprecation of enabled * pkg/integrations: support *_configs field for integrations (#1130) Creates the basic code to unmarshal integrations from a YAML field called <integration name>_configs, which is a slice of that integration. Note that this is NOT wired up to the integrations manager yet, and trying to run the agent with more than one integration of the same type will likely cause problems. * [dev.multiple-integrations] Prototype new integrations subsystem (#1142) * wip: prototype new integrations subsystem * implement Controller with basic logic for Integration and UpdateIntegration * Implement HTTPIntegration for Controller * decouple controller and subsystem * don't have controller implement integration slightly less smelly now * multiplexer integration * rely on boilerplate for multiplexing for now generics would be nice here * remove multiplex_integration.go Also a little code smelly. Instead of having integrations that run other integrations, I'm going to fall back to having only one controller. 
* introduce Subsystem, unexport Controller start wiring up things to Subsystem * introduce v2 agent integration to use for testing * start wiring metrics integrations * rename Options to Globals call a spade a spade * add subsystem options to globals * remove dead code * metricsutils: calculate self-scraping based on globals * complete HTTP target API * working example with agent integration * appease the linter * don't return an error when context to cancel an integration is closed * once again i am asking the linter to forgive my typos * fix bug where labels from individual targets were getting dropped at the API endpoint * pkg/config: fix broken test * finish unit tests for integrations v2 controller * metricsutil/metricshandler_integration: make job name unique Before this change, the job name would have collided when using multiple instances of the same integration. * ensure that global subsystem labels are injected into targets * integrations/v2: Infer target hostname from SD API host (#1175) * [dev.multiple-integrations] integrations/v2: allow shimming between v1 and v2 integrations. (#1179) * integrations/v2: allow shimming between v1 and v2 integrations. Shimming is done by changing how the integration registration works; a new RegisterDynamic was added that allows for creating Configs at runtime. Here be dragons; this should be removed whenever we no longer have a need for it. * fix lint * pkg/integrations/v2: use "RegisterLegacy" instead of a generic mechanism * fine, I won't add the deprecation notice if it will make the linter sad * pkg/integrations: re-align (#1181) This commit reverts 69ba2dd in favor of allowing the new subsystem to handle multiple instances of integrations. This commit also removes the wal_truncate_frequency field from integrations as it is the only field from old integrations that does not have a current counterpart. 
* [dev.multiple-integrations] Hide integrations/v2 behind a feature flag (#1185) * feature flag wip * dynamically switch between integrations v1 and v2 default to v1. * pkg/integrations/versionselector to file in pkg/config * pkg/config: fix defaults for Integrations * pkg/config: use more generic way to unmarshal differently based on flag * add missing godoc comment * more comments * switch to deferred unmarshaling * remove unused Config field * simplify completeUnmarshal * do not perform lazy deferred unmarshaling * enable cadvisor by default * switch to using real feature flag * fix postgres_exporter * Merge main into dev.multiple-integrations (#1184) * Fix typo (#1141) * Traces: Improved pod association in PromSD processor (#1137) * Improve k8s pod association * Add tests * Changelog * typo * Add prom_sd_pod_association * Extend tests for pod associations * Docs for pod association config * Lint fixes * Move to unreleased * Add instrumentation recommendations * Remove uncessary constants * Improve tests * remote config with http(s) provider (#1143) * sample remote config code with http provider * use t.TempDir() in unit test * no need to clean up after T.TempDir() * use NewClientFromConfig and make caller responsible for calling SetDirectory * handle nil HTTPClientConfig * remove blank identifier assignment * pass basic auth command line flags for remote config * address pr nits * add expiremental flag * set loader inline * update changelog * add remote config section in docs * pr comment updates * announce patch releases for cve-2021-41090 (#1152) * Merge patch release to main (#1153) * Add secret type to sensitive values * Break out config tests to their own implementation. Also remove username has a sensitive value. * Update changelog * Fix failing test * Scrub secrets when marshaling instance configs * update for v0.21 * Updated changes from the merge. 
* Remove changelog * Scrub out receivers has ***receivers_scrubber***:null * obscure etcd/consul credentials * Update pkg/traces/config_test.go Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Update pkg/config/config.go * go fmt * Change to using custom object and return <secret> * Fix bad merge * [v0.21.2] toggle config endpoint (#19) * disable /-/config endpoint by default * disable scraping api get endpoint as well * fix new test * add test and rename flag Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Update version to v0.21.2 * Update defaults.go * fix /-/config endpoint * also fix non-pointer config bug * temporarily disable linting for release * fix lint errors Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> * Fix POSTGRES_EXPORTER_DATA_SOURCE_NAME usage for postgres_exporter (#1162) * Fix POSTGRES_EXPORTER_DATA_SOURCE_NAME usage for postgres_exporter A recent change broke the usage of POSTGRES_EXPORTER_DATA_SOURCE_NAME for the postgres_exporter. As the incorrect variable was checked in the if clause, it always raises an error. 
* changelog: keep feature -> enhancement -> bugfix order * postgres_exporter: add regression test Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Fix syntax error in Jsonnet logs helper method (#1174) Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com> * cAdvisor Integration (#1081) * Add cadvisor module * Begin creating common config for cadvisor * Don't export internal state * Finish config options for cadvisor * Set config options, and implement cAdvisor collectors * Linting * Buildflags for cadvisor only in linux * I R LEArN Build Tags * Don't zero value the zero value * Offload sketchy global var manipulation to the integrations Run func * Remove unused collectors * Lint * Create generic stub integration and use it for cadvisor * Lint * Final refactor of cAdvisor config for unsupported platforms. Pared down stub integrations. * Lint * Docs for cadvisor config * Update changelog * Update pkg/integrations/stub_integration.go Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Reorder changelog * Instance key clarity * Inclusive naming * Finish name changes Keep default disable metric list in sync with upstream Idiomatic golang * Hardcode disabled metrics for cadvisor Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Remove log-level flag from systemd unit file (#1177) * Upgrade to OTel v0.40.0 (#1176) * Upgrade to OTel v0.40.0 * Changelog * Add factories check * go mod tidy * config/features: create package to standardize experimental features (#1170) * config/features: create package to standardize experimental features This commit introduces a new package, pkg/config/features, which allows defining a set of features and validating whether flags associated with those features are allowed to be set. 
Closes #1163 * update documentation (also s/enabled-features/enable-features) * Fix typo * Update pkg/config/features/features.go Co-authored-by: Robert Lankford <rlankfo@gmail.com> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * enable cadvisor by default * switch to using real feature flag * fix postgres_exporter Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * Revert "Merge main into dev.multiple-integrations (#1184)" (#1189) This reverts commit ad76ec5. * [dev.multiple-integrations] Revert breaking changes to existing integrations (#1191) * revert breaking changes to integrations v1 This commit reverts #1062 in favor of making breaking changes directly in integrations-next instead. The part of #1181 to remove `wal_truncate_frequency` has also been reverted. As part of this change, the enabled field is removed from the v2 common metrics configs, and v2 integrations can no longer be disabled. v2 integrations can only be disabled by removing them from the YAML. 
* integrations/v2: remove stale reference to ErrDisabled (fix typo too) * integrations/v2: bring in common config decoupling * [dev.multiple-integrations] Introduce autoscraper (#1195) * pkg/integrations/v2: introduce self-scraping * linting * [dev.multiple-integrations] Multiple instances of integrations (#1196) * multiple instances of integrations opt in relevant v1 integrations into supporting multiple instances * shims should check for instance key override * Document integrations-next (#1197) * document integrations-next * remove json tags since they make markdown unhappy * changelog * s/Run/RunIntegration * remove stale comment about integrations.controller purpose * create dedicated run method for instanceScraper * s/expoter/exporter/g * Document why an autoscrape.Scraper manages a set of per-instance scrapers * spell out prerequisite instead of pre-req * use go.uber.org/atomic to make the code a little easier to follow * remove started callback for running integration * use smaller interface for autoscrape Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * Fix panic when using 'stdout' in automatic logging (#1233) * integrations-next: fix bug where v2 integrations were not being strictly unmarshaled (#1235) * Remove jsonnet vendor folders (#1222) * remove jsonnet vendor This adds all vendor folders into .gitignore and removes cached vendor files from the repository. 
Closes #1221 * Update scripts and instructions for jsonnet vendor removal * `make example-dashboards` will now also run `jb install` * k3d environment instructions now include `jb install` * smoke-test.bash will now run `jb install` prior to `tk apply` * Fix link to k3d example in DEVELOPERS.md (#1242) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Fix node_exporter upgrade docs (#1239) * Fix panic in automatic logging with stdout backend (#1243) * pkg/util: support custom yaml.Unmarshaler implementations for util.UnmarshalYAMLMerged (#1244) It's common for config types to implement yaml.Unmarshaler for: * Applying defaults * Applying extra logic post-unmarshal If these config types were unmarshaled through util.UnmarshalYAMLMerged, the yaml.Unmarshaler implementation would never complete successfully, preventing the post-unmarshal logic from running. This issue was introduced in #1192, but went unnoticed until #1228 implemented yaml.Unmarshaler to perform field migrations. #1240 reported the issue. This commit fixes the bug by performing a second non-strict unmarshal to ensure that all input values unmarshal successfully, with the exception of unmarshal errors unrelated to unrecognized field names. This is hacky, but it's worth noting that util.UnmarshalYAMLMerged is a temporary workaround needed for the integrations-next migration, and will eventually be removed. 
* Update k3d example grafana/grafonnet-lib version (#1246) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Create an e2e framework with support for running tests against k8s (#1234) * e2e: create an e2e framework with support for running tests against a k3d cluster * add new E2E drone job * E2E tests should pass when doing a release * sign drone.yml again * move e2e lint to different step that has golangci-lint installed * upgrade golangci-lint and go for e2e test * e2e: add gcc * E2E: install build-essential to get a working full gcc env * :( * e2e: support running from inside of docker * fix lint error * address review feedback * Operator: fix bug where /-/ready and /-/healthy always returned 404 (#1252) * operator: fix bug where /-/ready and /-/healthy always returned 404 controller-runtime must have at least one ready/healthy check for the endpoints to exist * fix lint error, use healthz.Ping * Make scraping-svc use the new `metrics:` key (#1259) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * update prometheus dependency (#1260) * corrected typo (#1265) * Use RELEASE_TAG to choose between `:main` and `:latest` docker tags (#1264) * Use RELEASE_TAG to choose between `:main` and `:latest` docker tags Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Use :main tag for images in smoke test Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Set IMAGE_BRANCH_TAG env var in drone and actions pipelines Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Remove quotes from Makefile variable Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Remove force_release action Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * prepare for v0.22.0 release (#1266) * prepare for v0.22.0 release * remove E2E pipeline * Add basic testing framework for operator (#1268) * remove dedicated go.mod for e2e/ * move e2e/k8s to pkg/util/k8s * Migrate operator tests to 
pkg/util/k8s * remove dedicated e2e tests * allow skipping TestCluster in pkg/util/k8s * remove e2e/ * fix bad merge * fix order of make env args for windows * actually declare referenced docker volume * introduce pkg/util/subset for asserting subset of objects * refactor operator so it's testable * define basic integration test for operator * fix lint errors * fix invalid address in operator test config * Update release-note.md (#1267) * Set scrape User-Agent header during init (#1274) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Upgrade to Go 1.17 (#1278) * Upgrade to 1.17.6 in go.mod and Dockerfiles * Update CHANGELOG.md to mention the update * Update Go version in drone/actions pipelines * Update go.mod, go.sum files via * Re-sign drone.yml * Remove leading newline causing drone build to fail * Bump golangci-lint image to a version using Go 1.17 * Re-attempt to solve linter issue with new golangci-lint image * Remove suffix of exclude rules * Clean previous Go version before unpacking Go 1.17 * Also clean up previous Go versions in other steps * fix typo (#1284) * Use custom Go version in agent-operator Dockerfile (#1286) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * pkg/operator: refactor resource hierarchy discovery (#1271) * pkg/operator: refactor resource hierarchy discovery This commit moves common logic related to discovering the resource hierarchy to pkg/operator/hierarchy. This new package requires less boilerplate, which the reconciler is updated to take advantage of. 
* remove unused code * test construction of resource hierarchy * add missing build constraints * small extra cleanup to use pointer package * review feedback * update agent-build-image for go 1.17 (#1287) (also use a consistent base image tag instead of latest) * Skip non-ready entries when listing instances (#1289) * Skip non-ready instances in LoadInstances() Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add changelog entry Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Fix panic in prom_sd_processor when address is empty (#1279) * Fix panic in prom_sd_processor when address is empty * Fix panic in prom_sd_processor when address is empty * Fix docs * Add test case * Lint * Move to unreleased * Operator: generate proxy_url for remote_write (#1298) * operator: generate proxy_url for remote_write * fix weird indentation in test * Use log format in traces subsystem (#1272) * Use log format in traces subsystem * Changelog * Undo unwanted change * Fix changelog entry * integrations-next: Add extra_labels to inject extra labels for an integration (#1312) * integrations-next: Add extra_labels to inject extra labels for an integration. 
* separate tests * Fix anchor link on operator docs (#1302) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * updated config URL (#1304) The existing URL returns a 404: https://grafana.com/docs/agent/latest/getting-started/configuration/_index.md Updated to https://grafana.com/docs/agent/latest/configuration/ * Fix typo in node_exporter (#1325) * Allow remote_write URL credentials (#1329) * Bypass Prometheus password redaction Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add inline secret in existing test Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add changelog entry * Add to scrubbed testcase as well Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Stop appending duplicate exemplars (#1316) * Add memExemplar in stripeSeries as first iteration Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add test for skipped duplicate exemplars; Simplify conditional Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add changelog entry; discard test errors * Move changelog entry Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add Benchmark for AppendExemplar Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Discard error on added benchmark Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Use original exemplar struct instead of custom memExemplar Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Surround benchmark loop with start/stop timers and close test storage Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add comment about prepopulating exemplars on WAL startup Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Wire in the totalAppendedExemplars metric Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Make comment more discoverable Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Make sure we're recording exemplars for 
non-nil series ref only Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * integrations-next: wait for integrations to exit after stopping them (#1318) * integrations-next: wait for integrations to exit after stopping them * fix lint errors * minor refactor * integrations-next: stop holding config mutex for entire reload * make controller.run authoritative over running integrations * fix log line * move running integrations into a dedicated worker pool * operator/hierarchy: stop using field selector when listing Secrets & ConfigMaps (#1340) The initial implementation of hierarchy.KeySelector injected a FieldSelector when listing Secrets and ConfigMaps to immediately return the single object being queried for. This causes a problem with the client generated by the controller-runtime framework, where the client is wrapped in a cache and field indexer (where only the namespace is indexed by default). This commit avoids using the field selector and the index lookup. The resulting behavior aligns more closely with discovering other resources in the hierarchy (i.e., ServiceMonitors), where the List call is also insufficient and needs post-processing via Matches to find the final list of resources. Given the controller-runtime client uses an informer for reads, all relevant Secrets and ConfigMaps are already in-memory anyway, and using the index for a faster List is a bit of an over-optimization at the moment. * Add dependabot to update go modules and github actions. 
(#1217) Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com> * smoke framework refactor (#1326) * Agent smoke test (#1291) * convert smoke script to go program * update build for agent-smoke * fix pr comments * use existing log helper package * refactor context cancel * update exit codes * use ticker * prefer oklog/run instead of errgroup * use nop logger * refactor task interface * remove functional options * log.With for task loggers * move smoke to tools * build smoke image, push to internal registry * move crow to tools * add gcr_admin secret * fix link to crow * add smoke libsonnet and use in local k3d smoke test * add deletePodBySelectorTask * scale smoke-test replica down after local test * refactor smoke Options to Config * update duration usage message * add some basic unit tests * newlines * pass mutation frequency and chaos frequency from smoke script * pull crow image from gcr * update smoke script * move monitoring to smoke libsonnet * move additional smoke resources needed in deployment tools * reference libsonnet files from grafana-agent dep * make drone * fix images in smoke script * get rid of extVars * update k3d example environment to reference etcd from new location * update smoke docker builds to use go1.17 * use pointer.Int64 * refactor smoke jsonnet (#1296) * add policy rule for list and delete pods (#1319) * refactor smoke.new function to take config object (#1327) * Apply suggestions from code review * Update production/tanka/grafana-agent/smoke/crow/main.libsonnet * Update production/tanka/grafana-agent/smoke/main.libsonnet * Update example/k3d/scripts/smoke-test.bash Co-authored-by: Robert Lankford <robert.lankford@grafana.com> * readme update (#1338) Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Correct link to the configuration (#1036) Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Add stale check Github Action (#1345) * Add a stale check GH action to run every 24 hours * remove old stale.yml file * add 
permissions to action * update the stale message to clarify when the stale label will get removed * Update .github/workflows/stale.yml * stale action: fix missing indent (#1346) * Fix mssql issue (#1351) * Add K8s Events integration (#1330) * Add K8s eventhandler integration (#1310) * Add docs and sample manifests to eventhandler integration (#1328) * Wait for cache to flush before returning * Clarify eventhandler docs (#1334) * Clarify docs * Update CHANGELOG.md * Review changes (#1349) * stale action: fix typo in label exemptions (#1347) * update withVolumesMixin for agent jsonnet (#1358) Signed-off-by: Robbie Lankford <robert.lankford@grafana.com> * Configure cluster label using logs client external_labels param (#1357) * Configure cluster label using logs client external_labels param * Update CHANGELOG.md * add password file and basic auth round tripper in crow (#1361) * add password file and basic auth round tripper in crow * add ca-certificates in crow image * add orgID flag * update help text * default send_exemplars to true in remote_write (#1352) Signed-off-by: Robbie Lankford <robert.lankford@grafana.com> * Update eventhandler labels (#1368) * Update eventhandler integration labels * Update CHANGELOG * Remove unnecessary kind label * update changelog (#1374) Remove BUGFIX entries that fix a bug introduced by main (i.e., bugs which were never part of a release) * Prepare for release of v0.23.0 (#1377) * Update version references * Fix fat-fingered delete; Remove mention of upgrade Go * RFC: Design in the open (#1055) * rfc: first draft of RFC0001 * add placeholder for PR * update PR link * Update docs/rfcs/0001-designing-in-the-open.md Co-authored-by: Robert Lankford <rlankfo@gmail.com> * Update docs/rfcs/0001-designing-in-the-open.md Co-authored-by: Robert Lankford <rlankfo@gmail.com> * clarify "designing in the open" is best-effort * update 0001 * fix dead link in production/README.md * add recommended sections for RFC proposals * describe the process 
for approving a proposal * ignore RFC template in link checker * Update docs/rfcs/0001-designing-in-the-open.md Co-authored-by: Richard Hartmann <RichiH@users.noreply.github.com> * Update docs/rfcs/0001-designing-in-the-open.md Co-authored-by: Richard Hartmann <RichiH@users.noreply.github.com> * do my nitty 80-char line length limit change * indent pros/cons to a single section * document process for superseding RFCs * remove RFC mutability requirement * add extra flavor around not recommending google docs * require Google Doc -> RFC conversion * move new files Co-authored-by: Robert Lankford <rlankfo@gmail.com> Co-authored-by: Richard Hartmann <RichiH@users.noreply.github.com> * Add Grafana Labs SECURITY.md (#1356) Signed-off-by: Richard Hartmann <richih@richih.org> * Add readiness check to metrics component (#1369) * PR Base Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Fix autoscrape's mockInstance Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Wire in atomic readiness check Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Add CHANGELOG.md entry Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Reference page to download windows installer (#1372) Fixes #1366 * fix typo in node_exporter_config (#1389) which should be `privileged` instead of `priviliged` * Add option for Operator to pass arguments to GrafanaAgent #1227 (#1248) * 1250 oauth2 tracing (#1386) * Add oauth support for trace Otel trace exporter via opentelemetry-collector-contrib oauth2clientauthextension * start extensions on collector instance startup fix decoding to otelconfig build extensions add oauth extension to service map * Update traces config documentation * lint fixes * fix godoc comments * pass exporter index directly to exporter name generator * PR feedback; Update Changelog * sort extensions when sorting pipelines for testing determinism * README: Fix link to agent logo 
(#1396) * update MAINTAINERS.md (#1402) * add smoke alerts to mixin; move local alerts into examples dir (#1397) * add smoke alerts to mixin; move local alerts into examples dir * add podPrefix for smoke test * podPrefix in libsonnet config * [RFC] Integrations in Grafana Agent Operator (#1224) * rfc: integrations in grafana agent operator Supersedes #883 * add missing links * Apply suggestions from code review Co-authored-by: Florian Klink <flokli@flokli.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * clarify how many daemonsets/deployments/service/secrets are created * add example of defining secrets * try defining integrations * s/IntegrationsMonitor/IntegrationMonitor/g * simplify proposal * add alternatives * remove old reference to `hasMetrics` field * document example generated agent configuration file * assign ID RFC-0002 * add missing PR link Co-authored-by: Florian Klink <flokli@flokli.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * add fake rw endpoint to smoke program (#1405) * fix alerts typo (#1407) * continuous delivery for smoke images (#1408) Signed-off-by: Robbie Lankford <robert.lankford@grafana.com> * fix continuous delivery job errors (#1409) Signed-off-by: Robbie Lankford <robert.lankford@grafana.com> * [operator] - Use _file variants for basic auth credentials. 
(#1411) * use password_file alternatives in operator config * update tests * reduce smoke alert noise (#1412) * reduce smoke alert noise Signed-off-by: Robbie Lankford <robert.lankford@grafana.com> * Update production/grafana-agent-mixin/alerts.libsonnet Co-authored-by: Robert Fratto <robertfratto@gmail.com> * update cpu check comment Signed-off-by: Robbie Lankford <robert.lankford@grafana.com> * add minimum load threshold to cpu alert Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Clarify usage of instanceNamespaceSelector (#1413) * RFC-0001: Add status to RFC (#1391) * rfc-0001: add rules for when RFC PRs should be merged * use status field instead of merge to indicate state * Parametrize logs DaemonSet K8s manifests (#1420) * Parametrize logs daemonset K8s manifests * Update CHANGELOG.md * Extend linting configuration file (#1421) * Add depguard linter to reject packages we tend to avoid * Replace golint with revive, since golint is deprecated * Remove interfacer, which is deprecated with no replacement * Add makezero linter to detect misuse of make with append * Add tenv to prefer t.Setenv over os.Setenv in tests * Add whitespace to report unnecessary blank lines * Ignore test files for errcheck In addition to the above, the following changes were made: * Remove settings that just re-set default values, instead pointing to the website to retrieve defaults. * Simplify the errcheck rule to only include functions we actually need to ignore. * Merging again! 
Co-authored-by: Robert Fratto <robert.fratto@grafana.com> Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> Co-authored-by: Paschalis Tsilias <tpaschalis@users.noreply.github.com> Co-authored-by: Patrick Koenig <pkoenig10@gmail.com> Co-authored-by: DataPoints <langer.markus@gmail.com> Co-authored-by: Alex <52292902+alexrudd2@users.noreply.github.com> Co-authored-by: Robert Fratto <robertfratto@gmail.com> Co-authored-by: melGL <81323402+melgl@users.noreply.github.com> Co-authored-by: Tom Wilkie <tomwilkie@users.noreply.github.com> Co-authored-by: Joseph Woodward <josephwoodward@xeuse.com> Co-authored-by: hanif <hjet@users.noreply.github.com> Co-authored-by: Richard Hartmann <RichiH@users.noreply.github.com> Co-authored-by: laiwei <laiwei.ustc@gmail.com> Co-authored-by: Sam <shamsalmon@users.noreply.github.com> Co-authored-by: Chris Knutson <christopher.knutson@gmail.com> Co-authored-by: Florian Klink <flokli@flokli.de> Co-authored-by: Craig Peterson <192540+captncraig@users.noreply.github.com>
* Update node_exporter dependency to v1.3.1 (#1228) * Add node_exporter to depcheck * update weaveworks/common dependency * map current release flags and changed defaults * documentation * revert accidental checkin * print out flags when node_exporter test fails to assist debugging * oops, i introduced some flags from master by mistake * Introduce experimental integrations revamp (#1198) * [dev.multiple-integrations] Enable present integrations by default, deprecate enabled field (#1062) * integrations: default to enabled by default * document deprecation of enabled * pkg/integrations: support *_configs field for integrations (#1130) Creates the basic code to unmarshal integrations from a YAML field called <integration name>_configs, which is a slice of that integration. Note that this is NOT wired up to the integrations manager yet, and trying to run the agent with more than one integration of the same type will likely cause problems. * [dev.multiple-integrations] Prototype new integrations subsystem (#1142) * wip: prototype new integrations subsystem * implement Controller with basic logic for Integration and UpdateIntegration * Implement HTTPIntegration for Controller * decouple controller and subsystem * don't have controller implement integration slightly less smelly now * multiplexer integration * rely on boilerplate for multiplexing for now generics would be nice here * remove multiplex_integration.go Also a little code smelly. Instead of having integrations that run other integrations, I'm going to fall back to having only one controller. 
* introduce Subsystem, unexport Controller start wiring up things to Subsystem * introduce v2 agent integration to use for testing * start wiring metrics integrations * rename Options to Globals call a spade a spade * add subsystem options to globals * remove dead code * metricsutils: calculate self-scraping based on globals * complete HTTP target API * working example with agent integration * appease the linter * don't return an error when context to cancel an integration is closed * once again i am asking the linter to forgive my typos * fix bug where labels from individual targets were getting dropped at the API endpoint * pkg/config: fix broken test * finish unit tests for integrations v2 controller * metricsutil/metricshandler_integration: make job name unique Before this change, the job name would have collided when using multiple instances of the same integration. * ensure that global subsystem labels are injected into targets * integrations/v2: Infer target hostname from SD API host (#1175) * [dev.multiple-integrations] integrations/v2: allow shimming between v1 and v2 integrations. (#1179) * integrations/v2: allow shimming between v1 and v2 integrations. Shimming is done by changing how the integration registration works; a new RegisterDynamic was added that allows for creating Configs at runtime. Here be dragons; this should be removed whenever we no longer have a need for it. * fix lint * pkg/integrations/v2: use "RegisterLegacy" instead of a generic mechanism * fine, I won't add the deprecation notice if it will make the linter sad * pkg/integrations: re-align (#1181) This commit reverts 69ba2dd in favor of allowing the new subsystem to handle multiple instances of integrations. This commit also removes the wal_truncate_frequency field from integrations as it is the only field from old integrations that does not have a current counterpart. 
* [dev.multiple-integrations] Hide integrations/v2 behind a feature flag (#1185) * feature flag wip * dynamically switch between integrations v1 and v2 default to v1. * pkg/integrations/versionselector to file in pkg/config * pkg/config: fix defaults for Integrations * pkg/config: use more generic way to unmarshal differently based on flag * add missing godoc comment * more comments * switch to deferred unmarshaling * remove unused Config field * simplify completeUnmarshal * do not perform lazy deferred unmarshaling * enable cadvisor by default * switch to using real feature flag * fix postgres_exporter * Merge main into dev.multiple-integrations (#1184) * Fix typo (#1141) * Traces: Improved pod association in PromSD processor (#1137) * Improve k8s pod association * Add tests * Changelog * typo * Add prom_sd_pod_association * Extend tests for pod associations * Docs for pod association config * Lint fixes * Move to unreleased * Add instrumentation recommendations * Remove uncessary constants * Improve tests * remote config with http(s) provider (#1143) * sample remote config code with http provider * use t.TempDir() in unit test * no need to clean up after T.TempDir() * use NewClientFromConfig and make caller responsible for calling SetDirectory * handle nil HTTPClientConfig * remove blank identifier assignment * pass basic auth command line flags for remote config * address pr nits * add expiremental flag * set loader inline * update changelog * add remote config section in docs * pr comment updates * announce patch releases for cve-2021-41090 (#1152) * Merge patch release to main (#1153) * Add secret type to sensitive values * Break out config tests to their own implementation. Also remove username has a sensitive value. * Update changelog * Fix failing test * Scrub secrets when marshaling instance configs * update for v0.21 * Updated changes from the merge. 
* Remove changelog * Scrub out receivers has ***receivers_scrubber***:null * obscure etcd/consul credentials * Update pkg/traces/config_test.go Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Update pkg/config/config.go * go fmt * Change to using custom object and return <secret> * Fix bad merge * [v0.21.2] toggle config endpoint (#19) * disable /-/config endpoint by default * disable scraping api get endpoint as well * fix new test * add test and rename flag Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Update version to v0.21.2 * Update defaults.go * fix /-/config endpoint * also fix non-pointer config bug * temporarily disable linting for release * fix lint errors Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> * Fix POSTGRES_EXPORTER_DATA_SOURCE_NAME usage for postgres_exporter (#1162) * Fix POSTGRES_EXPORTER_DATA_SOURCE_NAME usage for postgres_exporter A recent change broke the usage of POSTGRES_EXPORTER_DATA_SOURCE_NAME for the postgres_exporter. As the incorrect variable was checked in the if clause, it always raises an error. 
* changelog: keep feature -> enhancement -> bugfix order * postgres_exporter: add regression test Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Fix syntax error in Jsonnet logs helper method (#1174) Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com> * cAdvisor Integration (#1081) * Add cadvisor module * Begin creating common config for cadvisor * Don't export internal state * Finish config options for cadvisor * Set config options, and implement cAdvisor collectors * Linting * Buildflags for cadvisor only in linux * I R LEArN Build Tags * Don't zero value the zero value * Offload sketchy global var manipulation to the integrations Run func * Remove unused collectors * Lint * Create generic stub integration and use it for cadvisor * Lint * Final refactor of cAdvisor config for unsupported platforms. Pared down stub integrations. * Lint * Docs for cadvisor config * Update changelog * Update pkg/integrations/stub_integration.go Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Reorder changelog * Instance key clarity * Inclusive naming * Finish name changes Keep default disable metric list in sync with upstream Idiomatic golang * Hardcode disabled metrics for cadvisor Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Remove log-level flag from systemd unit file (#1177) * Upgrade to OTel v0.40.0 (#1176) * Upgrade to OTel v0.40.0 * Changelog * Add factories check * go mod tidy * config/features: create package to standardize experimental features (#1170) * config/features: create package to standardize experimental features This commit introduces a new package, pkg/config/features, which allows defining a set of features and validating whether flags associated with those features are allowed to be set. 
Closes #1163 * update documentation (also s/enabled-features/enable-features) * Fix typo * Update pkg/config/features/features.go Co-authored-by: Robert Lankford <rlankfo@gmail.com> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * enable cadvisor by default * switch to using real feature flag * fix postgres_exporter Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * Revert "Merge main into dev.multiple-integrations (#1184)" (#1189) This reverts commit ad76ec5. * [dev.multiple-integrations] Revert breaking changes to existing integrations (#1191) * revert breaking changes to integrations v1 This commit reverts #1062 in favor of making breaking changes directly in integrations-next instead. The part of #1181 to remove `wal_truncate_frequency` has also been reverted. As part of this change, the enabled field is removed from the v2 common metrics configs, and v2 integrations can no longer be disabled. v2 integrations can only be disabled by removing them from the YAML. 
* integrations/v2: remove stale reference to ErrDisabled (fix typo too) * integrations/v2: bring in common config decoupling * [dev.multiple-integrations] Introduce autoscraper (#1195) * pkg/integrations/v2: introduce self-scraping * linting * [dev.multiple-integrations] Multiple instances of integrations (#1196) * multiple instances of integrations opt in relevant v1 integrations into supporting multiple instances * shims should check for instance key override * Document integrations-next (#1197) * document integrations-next * remove json tags since they make markdown unhappy * changelog * s/Run/RunIntegration * remove stale comment about integrations.controller purpose * create dedicated run method for instanceScraper * s/expoter/exporter/g * Document why an autoscrape.Scraper manages a set of per-instance scrapers * spell out prerequisite instead of pre-req * use go.uber.org/atomic to make the code a little easier to follow * remove started callback for running integration * use smaller interface for autoscrape Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * Fix panic when using 'stdout' in automatic logging (#1233) * integrations-next: fix bug where v2 integrations were not being strictly unmarshaled (#1235) * Remove jsonnet vendor folders (#1222) * remove jsonnet vendor This adds all vendor folders into .gitignore and removes cached vendor files from the repository. 
Closes #1221 * Update scripts and instructions for jsonnet vendor removal * `make example-dashboards` will now also run `jb install` * k3d environment instructions now include `jb install` * smoke-test.bash will now run `jb install` prior to `tk apply` * Fix link to k3d example in DEVELOPERS.md (#1242) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Fix node_exporter upgrade docs (#1239) * Fix panic in automatic logging with stdout backend (#1243) * pkg/util: support custom yaml.Unmarshaler implementations for util.UnmarshalYAMLMerged (#1244) It's common for config types to implement yaml.Unmarshaler for: * Applying defaults * Applying extra logic post-unmarshal If these config types were unmarshaled through util.UnmarshalYAMLMerged, the yaml.Unmarshaler implementation would never complete successfully, preventing the post-unmarshal logic from running. This issue was introduced in #1192, but went unnoticed until #1228 implemented yaml.Unmarshaler to perform field migrations. #1240 reported the issue. This commit fixes the bug by performing a second non-strict unmarshal to ensure that all input values unmarshal successfully, with the exception of unmarshal errors unrelated to unrecognized field names. This is hacky, but it's worthwhile noting that util.UnmarshalYAMLMerged is a temporary workaround needed for the integrations-next migration, and will eventually be removed. 
* Update k3d example grafana/grafonnet-lib version (#1246) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Create an e2e framework with support for running tests against k8s (#1234) * e2e: create an e2e framework with support for running tests against a k3d cluster * add new E2E drone job * E2E tests should pass when doing a release * sign drone.yml again * move e2e lint to different step that has golangci-lint installed * upgrade golangci-lint and go for e2e test * e2e: add gcc * E2E: install build-essential to get a working full gcc env * :( * e2e: support running from inside of docker * fix lint error * address review feedback * Operator: fix bug where /-/ready and /-/healthy always returned 404 (#1252) * operator: fix bug where /-/ready and /-/healthy always returned 404 controller-runtime must have at least one ready/healthy check for the endpoints to exist * fix lint error, use healthz.Ping * Make scraping-svc use the new `metrics:` key (#1259) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * update prometheus dependency (#1260) * corrected typo (#1265) * Use RELEASE_TAG to choose between `:main` and `:latest` docker tags (#1264) * Use RELEASE_TAG to choose between `:main` and `:latest` docker tags Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Use :main tag for images in smoke test Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Set IMAGE_BRANCH_TAG env var in drone and actions pipelines Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Remove quotes from Makefile variable Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Remove force_release action Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * prepare for v0.22.0 release (#1266) * prepare for v0.22.0 release * remove E2E pipeline * Add basic testing framework for operator (#1268) * remove dedicated go.mod for e2e/ * move e2e/k8s to pkg/util/k8s * Migrate operator tests to 
pkg/util/k8s * remove dedicated e2e tests * allow skipping TestCluster in pkg/util/k8s * remove e2e/ * fix bad merge * fix order of make env args for windows * actually declare referenced docker volume * introduce pkg/util/subset for asserting subset of objects * refactor operator so it's testable * define basic integration test for operator * fix lint errors * fix invalid address in operator test config * Update release-note.md (#1267) * Set scrape User-Agent header during init (#1274) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Upgrade to Go 1.17 (#1278) * Upgrade to 1.17.6 in go.mod and Dockerfiles * Update CHANGELOG.md to mention the update * Update Go version in drone/actions pipelines * Update go.mod, go.sum files via * Re-sign drone.yml * Remove leading newline causing drone build to fail * Bump golangci-lint image to a version using Go 1.17 * Re-attempt to solve linter issue with new golangci-lint image * Remove suffix of exclude rules * Clean previous Go version before unpacking Go 1.17 * Also clean up previous Go versions in other steps * fix typo (#1284) * Use custom Go version in agent-operator Dockerfile (#1286) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * pkg/operator: refactor resource hierarchy discovery (#1271) * pkg/operator: refactor resource hierarchy discovery This commit moves common logic related to discovering the resource hierarchy to pkg/operator/hierarchy. This new package requires less boilerplate, which the reconciler is updated to take advantage of. 
* remove unused code * test construction of resource hierarchy * add missing build constraints * small extra cleanup to use pointer package * review feedback * update agent-build-image for go 1.17 (#1287) (also use a consistent base image tag instead of latest) * Skip non-ready entries when listing instances (#1289) * Skip non-ready instances in LoadInstances() Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add changelog entry Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Fix panic in prom_sd_processor when address is empty (#1279) * Fix panic in prom_sd_processor when address is empty * Fix panic in prom_sd_processor when address is empty * Fix docs * Add test case * Lint * Move to unreleased * Operator: generate proxy_url for remote_write (#1298) * operator: generate proxy_url for remote_write * fix weird indentation in test * Use log format in traces subsystem (#1272) * Use log format in traces subsystem * Changelog * Undo unwanted change * Fix changelog entry * integrations-next: Add extra_labels to inject extra labels for an integration (#1312) * integrations-next: Add extra_labels to inject extra labels for an integration. 
* separate tests * Fix anchor link on operator docs (#1302) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * updated config URL (#1304) The existing URL returns a 404: https://grafana.com/docs/agent/latest/getting-started/configuration/_index.md Updated to https://grafana.com/docs/agent/latest/configuration/ * Fix typo in node_exporter (#1325) * Allow remote_write URL credentials (#1329) * Bypass Prometheus password redaction Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add inline secret in existing test Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add changelog entry * Add to scrubbed testcase as well Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Stop appending duplicate exemplars (#1316) * Add memExemplar in stripeSeries as first iteration Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add test for skipped duplicate exemplars; Simplify conditional Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add changelog entry; discard test errors * Move changelog entry Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add Benchmark for AppendExemplar Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Discard error on added benchmark Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Use original exemplar struct instead of custom memExemplar Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Surround benchmark loop with start/stop timers and close test storage Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add comment about prepopulating exemplars on WAL startup Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Wire in the totalAppendedExemplars metric Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Make comment more discoverable Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Make sure we're recording exemplars for 
non-nil series ref only Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * integrations-next: wait for integrations to exit after stopping them (#1318) * integrations-next: wait for integrations to exit after stopping them * fix lint errors * minor refactor * integrations-next: stop holding config mutex for entire reload * make controller.run authoritative over running integrations * fix log line * move running integrations into a dedicated worker pool * operator/hierarchy: stop using field selector when listing Secrets & ConfigMaps (#1340) The initial implementation of hierarchy.KeySelector injected a FieldSelector when listing Secrets and ConfigMaps to immediately return the single object being queried for. This causes a problem with the client generated by the controller-runtime framework, where the client is wrapped in a cache and field indexer (where only the namespace is indexed by default). This commit avoids using the field selector and the index lookup. The resulting behavior aligns more closely with discovering other resources in the hierarchy (i.e., ServiceMonitors), where the List call is also insufficient and needs post-processing via Matches to find the final list of resources. Given the controller-runtime client uses an informer for reads, all relevant Secrets and ConfigMaps are already in-memory anyway, and using the index for a faster List is a bit of an over-optimization at the moment. * Add dependabot to update go modules and github actions. 
(#1217) Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com> * smoke framework refactor (#1326) * Agent smoke test (#1291) * convert smoke script to go program * update build for agent-smoke * fix pr comments * use existing log helper package * refactor context cancel * update exit codes * use ticker * prefer oklog/run instead of errgroup * use nop logger * refactor task interface * remove functional options * log.With for task loggers * move smoke to tools * build smoke image, push to internal registry * move crow to tools * add gcr_admin secret * fix link to crow * add smoke libsonnet and use in local k3d smoke test * add deletePodBySelectorTask * scale smoke-test replica down after local test * refactor smoke Options to Config * update duration usage message * add some basic unit tests * newlines * pass mutation frequency and chaos frequency from smoke script * pull crow image from gcr * update smoke script * move monitoring to smoke libsonnet * move additional smoke resources needed in deployment tools * reference libsonnet files from grafana-agent dep * make drone * fix images in smoke script * get rid of extVars * update k3d example environment to reference etcd from new location * update smoke docker builds to use go1.17 * use pointer.Int64 * refactor smoke jsonnet (#1296) * add policy rule for list and delete pods (#1319) * refactor smoke.new function to take config object (#1327) * Apply suggestions from code review * Update production/tanka/grafana-agent/smoke/crow/main.libsonnet * Update production/tanka/grafana-agent/smoke/main.libsonnet * Update example/k3d/scripts/smoke-test.bash Co-authored-by: Robert Lankford <robert.lankford@grafana.com> * readme update (#1338) Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Correct link to the configuration (#1036) Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Add stale check Github Action (#1345) * Add a stale check GH action to run every 24 hours * remove old stale.yml file * add 
permissions to action * update the stale message to clarify when the stale label will get removed * Update .github/workflows/stale.yml * stale action: fix missing indent (#1346) * Fix mssql issue (#1351) * Add K8s Events integration (#1330) * Add K8s eventhandler integration (#1310) * Add docs and sample manifests to eventhandler integration (#1328) * Wait for cache to flush before returning * Clarify eventhandler docs (#1334) * Clarify docs * Update CHANGELOG.md * Review changes (#1349) * stale action: fix typo in label exemptions (#1347) * update withVolumesMixin for agent jsonnet (#1358) Signed-off-by: Robbie Lankford <robert.lankford@grafana.com> * Configure cluster label using logs client external_labels param (#1357) * Configure cluster label using logs client external_labels param * Update CHANGELOG.md * add password file and basic auth round tripper in crow (#1361) * add password file and basic auth round tripper in crow * add ca-certificates in crow image * add orgID flag * update help text * default send_exemplars to true in remote_write (#1352) Signed-off-by: Robbie Lankford <robert.lankford@grafana.com> * Update eventhandler labels (#1368) * Update eventhandler integration labels * Update CHANGELOG * Remove unnecessary kind label * update changelog (#1374) Remove BUGFIX entries that fix a bug introduced by main (i.e., bugs which were never part of a release) * Prepare for release of v0.23.0 (#1377) * Update version references * Fix fat-fingered delete; Remove mention of upgrade Go * RFC: Design in the open (#1055) * rfc: first draft of RFC0001 * add placeholder for PR * update PR link * Update docs/rfcs/0001-designing-in-the-open.md Co-authored-by: Robert Lankford <rlankfo@gmail.com> * Update docs/rfcs/0001-designing-in-the-open.md Co-authored-by: Robert Lankford <rlankfo@gmail.com> * clarify "designing in the open" is best-effort * update 0001 * fix dead link in production/README.md * add recommended sections for RFC proposals * describe the process 
for approving a proposal * ignore RFC template in link checker * Update docs/rfcs/0001-designing-in-the-open.md Co-authored-by: Richard Hartmann <RichiH@users.noreply.github.com> * Update docs/rfcs/0001-designing-in-the-open.md Co-authored-by: Richard Hartmann <RichiH@users.noreply.github.com> * do my nitty 80-char line length limit change * indent pros/cons to a single section * document process for superseding RFCs * remove RFC mutability requirement * add extra flavor around not recommending google docs * require Google Doc -> RFC conversion * move new files Co-authored-by: Robert Lankford <rlankfo@gmail.com> Co-authored-by: Richard Hartmann <RichiH@users.noreply.github.com> * Add Grafana Labs SECURITY.md (#1356) Signed-off-by: Richard Hartmann <richih@richih.org> * Add readiness check to metrics component (#1369) * PR Base Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Fix autoscrape's mockInstance Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Wire in atomic readiness check Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Add CHANGELOG.md entry Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Reference page to download windows installer (#1372) Fixes #1366 * fix typo in node_exporter_config (#1389) which should be `privileged` instead of `priviliged` * Add option for Operator to pass arguments to GrafanaAgent #1227 (#1248) * 1250 oauth2 tracing (#1386) * Add oauth support for trace Otel trace exporter via opentelemetry-collector-contrib oauth2clientauthextension * start extensions on collector instance startup fix decoding to otelconfig build extensions add oauth extension to service map * Update traces config documentation * lint fixes * fix godoc comments * pass exporter index directly to exporter name generator * PR feedback; Update Changelog * sort extensions when sorting pipelines for testing determinism * README: Fix link to agent logo 
(#1396) * update MAINTAINERS.md (#1402) * add smoke alerts to mixin; move local alerts into examples dir (#1397) * add smoke alerts to mixin; move local alerts into examples dir * add podPrefix for smoke test * podPrefix in libsonnet config * [RFC] Integrations in Grafana Agent Operator (#1224) * rfc: integrations in grafana agent operator Supersedes #883 * add missing links * Apply suggestions from code review Co-authored-by: Florian Klink <flokli@flokli.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * clarify how many daemonsets/deployments/service/secrets are created * add example of defining secrets * try defining integrations * s/IntegrationsMonitor/IntegrationMonitor/g * simplify proposal * add alternatives * remove old reference to `hasMetrics` field * document example generated agent configuration file * assign ID RFC-0002 * add missing PR link Co-authored-by: Florian Klink <flokli@flokli.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * add fake rw endpoint to smoke program (#1405) * fix alerts typo (#1407) * continuous delivery for smoke images (#1408) Signed-off-by: Robbie Lankford <robert.lankford@grafana.com> * fix continuous delivery job errors (#1409) Signed-off-by: Robbie Lankford <robert.lankford@grafana.com> * [operator] - Use _file variants for basic auth credentials. 
(#1411) * use password_file alternatives in operator config * update tests * reduce smoke alert noise (#1412) * reduce smoke alert noise Signed-off-by: Robbie Lankford <robert.lankford@grafana.com> * Update production/grafana-agent-mixin/alerts.libsonnet Co-authored-by: Robert Fratto <robertfratto@gmail.com> * update cpu check comment Signed-off-by: Robbie Lankford <robert.lankford@grafana.com> * add minimum load threshold to cpu alert Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Clarify usage of instanceNamespaceSelector (#1413) * RFC-0001: Add status to RFC (#1391) * rfc-0001: add rules for when RFC PRs should be merged * use status field instead of merge to indicating state * Parametrize logs DaemonSet K8s manifests (#1420) * Parametrize logs daemonset K8s manifests * Update CHANGELOG.md * Extend linting configuration file (#1421) * Add depguard linter to reject packages we tend to avoid * Replace golint with revive, since golint is deprecated * Remove interfacer, which is deprecated with no replacement * Add makezero linter to detect misuse of make with append * Add tenv to prefer t.Setenv over os.Setenv in tests * Add whitespace to report unnecessary blank lines * Ignore test files for errcheck In addition to the above, the following changes were made: * Remove settings that just re-set default values, instead pointing to the website to retrieve defaults. * Simplify the errcheck rule to only include functions we actually need to ignore. * Merging again! 
Co-authored-by: Robert Fratto <robert.fratto@grafana.com> Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> Co-authored-by: Paschalis Tsilias <tpaschalis@users.noreply.github.com> Co-authored-by: Patrick Koenig <pkoenig10@gmail.com> Co-authored-by: DataPoints <langer.markus@gmail.com> Co-authored-by: Alex <52292902+alexrudd2@users.noreply.github.com> Co-authored-by: Robert Fratto <robertfratto@gmail.com> Co-authored-by: melGL <81323402+melgl@users.noreply.github.com> Co-authored-by: Tom Wilkie <tomwilkie@users.noreply.github.com> Co-authored-by: Joseph Woodward <josephwoodward@xeuse.com> Co-authored-by: hanif <hjet@users.noreply.github.com> Co-authored-by: Richard Hartmann <RichiH@users.noreply.github.com> Co-authored-by: laiwei <laiwei.ustc@gmail.com> Co-authored-by: Sam <shamsalmon@users.noreply.github.com> Co-authored-by: Chris Knutson <christopher.knutson@gmail.com> Co-authored-by: Florian Klink <flokli@flokli.de> Co-authored-by: Craig Peterson <192540+captncraig@users.noreply.github.com>
* Update node_exporter dependency to v1.3.1 (#1228) * Add node_exporter to depcheck * update weaveworks/common dependency * map current release flags and changed defaults * documentation * revert accidental checkin * print out flags when node_exporter test fails to assist debugging * oops, i introduced some flags from master by mistake * Introduce experimental integrations revamp (#1198) * [dev.multiple-integrations] Enable present integrations by default, deprecate enabled field (#1062) * integrations: default to enabled by default * document deprecation of enabled * pkg/integrations: support *_configs field for integrations (#1130) Creates the basic code to unmarshal integrations from a YAML field called <integration name>_configs, which is a slice of that integration. Note that this is NOT wired up to the integrations manager yet, and trying to run the agent with more than one integration of the same type will likely cause problems. * [dev.multiple-integrations] Prototype new integrations subsystem (#1142) * wip: prototype new integrations subsystem * implement Controller with basic logic for Integration and UpdateIntegration * Implement HTTPIntegration for Controller * decouple controller and subsystem * don't have controller implement integration slightly less smelly now * multiplexer integration * rely on boilerplate for multiplexing for now generics would be nice here * remove multiplex_integration.go Also a little code smelly. Instead of having integrations that run other integrations, I'm going to fall back to having only one controller. 
* introduce Subsystem, unexport Controller start wiring up things to Subsystem * introduce v2 agent integration to use for testing * start wiring metrics integrations * rename Options to Globals call a spade a spade * add subsystem options to globals * remove dead code * metricsutils: calculate self-scraping based on globals * complete HTTP target API * working example with agent integration * appease the linter * don't return an error when context to cancel an integration is closed * once again i am asking the linter to forgive my typos * fix bug where labels from individual targets were getting dropped at the API endpoint * pkg/config: fix broken test * finish unit tests for integrations v2 controller * metricsutil/metricshandler_integration: make job name unique Before this change, the job name would have collided when using multiple instances of the same integration. * ensure that global subsystem labels are injected into targets * integrations/v2: Infer target hostname from SD API host (#1175) * [dev.multiple-integrations] integrations/v2: allow shimming between v1 and v2 integrations. (#1179) * integrations/v2: allow shimming between v1 and v2 integrations. Shimming is done by changing how the integration registration works; a new RegisterDynamic was added that allows for creating Configs at runtime. Here be dragons; this should be removed whenever we no longer have a need for it. * fix lint * pkg/integrations/v2: use "RegisterLegacy" instead of a generic mechanism * fine, I won't add the deprecation notice if it will make the linter sad * pkg/integrations: re-align (#1181) This commit reverts 69ba2dd in favor of allowing the new subsystem to handle multiple instances of integrations. This commit also removes the wal_truncate_frequency field from integrations as it is the only field from old integrations that does not have a current counterpart. 
* [dev.multiple-integrations] Hide integrations/v2 behind a feature flag (#1185) * feature flag wip * dynamically switch between integrations v1 and v2 default to v1. * pkg/integrations/versionselector to file in pkg/config * pkg/config: fix defaults for Integrations * pkg/config: use more generic way to unmarshal differently based on flag * add missing godoc comment * more comments * switch to deferred unmarshaling * remove unused Config field * simplify completeUnmarshal * do not perform lazy deferred unmarshaling * enable cadvisor by default * switch to using real feature flag * fix postgres_exporter * Merge main into dev.multiple-integrations (#1184) * Fix typo (#1141) * Traces: Improved pod association in PromSD processor (#1137) * Improve k8s pod association * Add tests * Changelog * typo * Add prom_sd_pod_association * Extend tests for pod associations * Docs for pod association config * Lint fixes * Move to unreleased * Add instrumentation recommendations * Remove unnecessary constants * Improve tests * remote config with http(s) provider (#1143) * sample remote config code with http provider * use t.TempDir() in unit test * no need to clean up after T.TempDir() * use NewClientFromConfig and make caller responsible for calling SetDirectory * handle nil HTTPClientConfig * remove blank identifier assignment * pass basic auth command line flags for remote config * address pr nits * add experimental flag * set loader inline * update changelog * add remote config section in docs * pr comment updates * announce patch releases for cve-2021-41090 (#1152) * Merge patch release to main (#1153) * Add secret type to sensitive values * Break out config tests to their own implementation. Also remove username as a sensitive value. * Update changelog * Fix failing test * Scrub secrets when marshaling instance configs * update for v0.21 * Updated changes from the merge.
* Remove changelog * Scrub out receivers has ***receivers_scrubber***:null * obscure etcd/consul credentials * Update pkg/traces/config_test.go Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Update pkg/config/config.go * go fmt * Change to using custom object and return <secret> * Fix bad merge * [v0.21.2] toggle config endpoint (#19) * disable /-/config endpoint by default * disable scraping api get endpoint as well * fix new test * add test and rename flag Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Update version to v0.21.2 * Update defaults.go * fix /-/config endpoint * also fix non-pointer config bug * temporarily disable linting for release * fix lint errors Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> * Fix POSTGRES_EXPORTER_DATA_SOURCE_NAME usage for postgres_exporter (#1162) * Fix POSTGRES_EXPORTER_DATA_SOURCE_NAME usage for postgres_exporter A recent change broke the usage of POSTGRES_EXPORTER_DATA_SOURCE_NAME for the postgres_exporter. As the incorrect variable was checked in the if clause, it always raises an error. 
* changelog: keep feature -> enhancement -> bugfix order * postgres_exporter: add regression test Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Fix syntax error in Jsonnet logs helper method (#1174) Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com> * cAdvisor Integration (#1081) * Add cadvisor module * Begin creating common config for cadvisor * Don't export internal state * Finish config options for cadvisor * Set config options, and implement cAdvisor collectors * Linting * Buildflags for cadvisor only in linux * I R LEArN Build Tags * Don't zero value the zero value * Offload sketchy global var manipulation to the integrations Run func * Remove unused collectors * Lint * Create generic stub integration and use it for cadvisor * Lint * Final refactor of cAdvisor config for unsupported platforms. Pared down stub integrations. * Lint * Docs for cadvisor config * Update changelog * Update pkg/integrations/stub_integration.go Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Reorder changelog * Instance key clarity * Inclusive naming * Finish name changes Keep default disable metric list in sync with upstream Idiomatic golang * Hardcode disabled metrics for cadvisor Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Remove log-level flag from systemd unit file (#1177) * Upgrade to OTel v0.40.0 (#1176) * Upgrade to OTel v0.40.0 * Changelog * Add factories check * go mod tidy * config/features: create package to standardize experimental features (#1170) * config/features: create package to standardize experimental features This commit introduces a new package, pkg/config/features, which allows defining a set of features and validating whether flags associated with those features are allowed to be set.
Closes #1163 * update documentation (also s/enabled-features/enable-features) * Fix typo * Update pkg/config/features/features.go Co-authored-by: Robert Lankford <rlankfo@gmail.com> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * enable cadvisor by default * switch to using real feature flag * fix postgres_exporter Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * Revert "Merge main into dev.multiple-integrations (#1184)" (#1189) This reverts commit ad76ec5. * [dev.multiple-integrations] Revert breaking changes to existing integrations (#1191) * revert breaking changes to integrations v1 This commit reverts #1062 in favor of making breaking changes directly in integrations-next instead. The part of #1181 to remove `wal_truncate_frequency` has also been reverted. As part of this change, the enabled field is removed from the v2 common metrics configs, and v2 integrations can no longer be disabled. v2 integrations can only be disabled by removing them from the YAML. 
* integrations/v2: remove stale reference to ErrDisabled (fix typo too) * integrations/v2: bring in common config decoupling * [dev.multiple-integrations] Introduce autoscraper (#1195) * pkg/integrations/v2: introduce self-scraping * linting * [dev.multiple-integrations] Multiple instances of integrations (#1196) * multiple instances of integrations opt in relevant v1 integrations into supporting multiple instances * shims should check for instance key override * Document integrations-next (#1197) * document integrations-next * remove json tags since they make markdown unhappy * changelog * s/Run/RunIntegration * remove stale comment about integrations.controller purpose * create dedicated run method for instanceScraper * s/expoter/exporter/g * Document why an autoscrape.Scraper manages a set of per-instance scrapers * spell out prerequisite instead of pre-req * use go.uber.org/atomic to make the code a little easier to follow * remove started callback for running integration * use smaller interface for autoscrape Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * Fix panic when using 'stdout' in automatic logging (#1233) * integrations-next: fix bug where v2 integrations were not being strictly unmarshaled (#1235) * Remove jsonnet vendor folders (#1222) * remove jsonnet vendor This adds all vendor folders into .gitignore and removes cached vendor files from the repository. 
Closes #1221 * Update scripts and instructions for jsonnet vendor removal * `make example-dashboards` will now also run `jb install` * k3d environment instructions now include `jb install` * smoke-test.bash will now run `jb install` prior to `tk apply` * Fix link to k3d example in DEVELOPERS.md (#1242) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Fix node_exporter upgrade docs (#1239) * Fix panic in automatic logging with stdout backend (#1243) * pkg/util: support custom yaml.Unmarshaler implementations for util.UnmarshalYAMLMerged (#1244) It's common for config types to have implement yaml.Unmarshaler for: * Applying defaults * Applying extra logic post-unmarshal If these config types were unmarshaled through util.UnmarshalYAMLMerged, the yaml.Unmarshaler implementation would never complete successfully, preventing the post-unmarshal logic from running. This issue was introduced in #1192, but went unnoticed until #1228 implemented yaml.Unmarshaler to perform field migrations. #1240 reported the issue. This commit fixes the bug by performing a second non-strict unmarshal to ensure that all input values unmarshal successfully, with the exception of unmarshal errors unrelated to unrecognized field names. This is hacky, but it's worthwhile noting that util.UnmarshalYAMLMerged is a temporary workaround needed for the integrations-next migration, and will eventually be removed. 
* Update k3d example grafana/grafonnet-lib version (#1246) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Create an e2e framework with support for running tests against k8s (#1234) * e2e: create an e2e framework with support for running tests against a k3d cluster * add new E2E drone job * E2E tests should pass when doing a release * sign drone.yml again * move e2e lint to different step that has golangci-lint installed * upgrade golangci-lint and go for e2e test * e2e: add gcc * E2E: install build-essential to get a working full gcc env * :( * e2e: support running from inside of docker * fix lint error * address review feedback * Operator: fix bug where /-/ready and /-/healthy always returned 404 (#1252) * operator: fix bug where /-/ready and /-/healthy always returned 404 controller-runtime must have at least one ready/healthy check for the endpoints to exist * fix lint error, use healthz.Ping * Make scraping-svc use the new `metrics:` key (#1259) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * update prometheus dependency (#1260) * corrected typo (#1265) * Use RELEASE_TAG to choose between `:main` and `:latest` docker tags (#1264) * Use RELEASE_TAG to choose between `:main` and `:latest` docker tags Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Use :main tag for images in smoke test Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Set IMAGE_BRANCH_TAG env var in drone and actions pipelines Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Remove quotes from Makefile variable Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Remove force_release action Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * prepare for v0.22.0 release (#1266) * prepare for v0.22.0 release * remove E2E pipeline * Add basic testing framework for operator (#1268) * remove dedicated go.mod for e2e/ * move e2e/k8s to pkg/util/k8s * Migrate operator tests to 
pkg/util/k8s * remove dedicated e2e tests * allow skipping TestCluster in pkg/util/k8s * remove e2e/ * fix bad merge * fix order of make env args for windows * actually declare referenced docker volume * introduce pkg/util/subset for asserting subset of objects * refactor operator so it's testable * define basic integration test for operator * fix lint errors * fix invalid address in operator test config * Update release-note.md (#1267) * Set scrape User-Agent header during init (#1274) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Upgrade to Go 1.17 (#1278) * Upgrade to 1.17.6 in go.mod and Dockerfiles * Update CHANGELOG.md to mention the update * Update Go version in drone/actions pipelines * Update go.mod, go.sum files via * Re-sign drone.yml * Remove leading newline causing drone build to fail * Bump golangci-lint image to a version using Go 1.17 * Re-attempt to solve linter issue with new golangci-lint image * Remove suffix of exclude rules * Clean previous Go version before unpacking Go 1.17 * Also clean up previous Go versions in other steps * fix typo (#1284) * Use custom Go version in agent-operator Dockerfile (#1286) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * pkg/operator: refactor resource hierarchy discovery (#1271) * pkg/operator: refactor resource hierarchy discovery This commit moves common logic related to discovering the resource hierarchy to pkg/operator/hierarchy. This new package requires less boilerplate, which the reconciler is updated to take advantage of. 
* remove unused code * test construction of resource hierarchy * add missing build constraints * small extra cleanup to use pointer package * review feedback * update agent-build-image for go 1.17 (#1287) (also use a consistent base image tag instead of latest) * Skip non-ready entries when listing instances (#1289) * Skip non-ready instances in LoadInstances() Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add changelog entry Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Fix panic in prom_sd_processor when address is empty (#1279) * Fix panic in prom_sd_processor when address is empty * Fix panic in prom_sd_processor when address is empty * Fix docs * Add test case * Lint * Move to unreleased * Operator: generate proxy_url for remote_write (#1298) * operator: generate proxy_url for remote_write * fix weird indentation in test * Use log format in traces subsystem (#1272) * Use log format in traces subsystem * Changelog * Undo unwanted change * Fix changelog entry * integrations-next: Add extra_labels to inject extra labels for an integration (#1312) * integrations-next: Add extra_labels to inject extra labels for an integration. 
* separate tests * Fix anchor link on operator docs (#1302) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * updated config URL (#1304) The existing URL returns a 404: https://grafana.com/docs/agent/latest/getting-started/configuration/_index.md Updated to https://grafana.com/docs/agent/latest/configuration/ * Fix typo in node_exporter (#1325) * Allow remote_write URL credentials (#1329) * Bypass Prometheus password redaction Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add inline secret in existing test Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add changelog entry * Add to scrubbed testcase as well Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Stop appending duplicate exemplars (#1316) * Add memExemplar in stripeSeries as first iteration Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add test for skipped duplicate exemplars; Simplify conditional Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add changelog entry; discard test errors * Move changelog entry Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add Benchmark for AppendExemplar Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Discard error on added benchmark Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Use original exemplar struct instead of custom memExemplar Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Surround benchmark loop with start/stop timers and close test storage Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add comment about prepopulating exemplars on WAL startup Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Wire in the totalAppendedExemplars metric Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Make comment more discoverable Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Make sure we're recording exemplars for 
non-nil series ref only Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * integrations-next: wait for integrations to exit after stopping them (#1318) * integrations-next: wait for integrations to exit after stopping them * fix lint errors * minor refactor * integrations-next: stop holding config mutex for entire reload * make controller.run authoritative over running integrations * fix log line * move running integrations into a dedicated worker pool * operator/hierarchy: stop using field selector when listing Secrets & ConfigMaps (#1340) The initial implementation of hierarchy.KeySelector injected a FieldSelector when listing Secrets and ConfigMaps to immediately return the single object being queried for. This causes a problem with the client generated by the controller-runtime framework, where the client is wrapped in a cache and field indexer (where only the namespace is indexed by default). This commit avoids using the field selector and the index lookup. The resulting behavior aligns more closely with discovering other resources in the hierarchy (i.e., ServiceMonitors), where the List call is also insufficient and needs post-processing via Matches to find the final list of resources. Given the controller-runtime client uses an informer for reads, all relevant Secrets and ConfigMaps are already in-memory anyway, and using the index for a faster List is a bit of an over-optimization at the moment. * Add dependabot to update go modules and github actions. 
(#1217) Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com> * smoke framework refactor (#1326) * Agent smoke test (#1291) * convert smoke script to go program * update build for agent-smoke * fix pr comments * use existing log helper package * refactor context cancel * update exit codes * use ticker * prefer oklog/run instead of errgroup * use nop logger * refactor task interface * remove functional options * log.With for task loggers * move smoke to tools * build smoke image, push to internal registry * move crow to tools * add gcr_admin secret * fix link to crow * add smoke libsonnet and use in local k3d smoke test * add deletePodBySelectorTask * scale smoke-test replica down after local test * refactor smoke Options to Config * update duration usage message * add some basic unit tests * newlines * pass mutation frequency and chaos frequency from smoke script * pull crow image from gcr * update smoke script * move monitoring to smoke libsonnet * move additional smoke resources needed in deployment tools * reference libsonnet files from grafana-agent dep * make drone * fix images in smoke script * get rid of extVars * update k3d example environment to reference etcd from new location * update smoke docker builds to use go1.17 * use pointer.Int64 * refactor smoke jsonnet (#1296) * add policy rule for list and delete pods (#1319) * refactor smoke.new function to take config object (#1327) * Apply suggestions from code review * Update production/tanka/grafana-agent/smoke/crow/main.libsonnet * Update production/tanka/grafana-agent/smoke/main.libsonnet * Update example/k3d/scripts/smoke-test.bash Co-authored-by: Robert Lankford <robert.lankford@grafana.com> * readme update (#1338) Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Correct link to the configuration (#1036) Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Add stale check Github Action (#1345) * Add a stale check GH action to run every 24 hours * remove old stale.yml file * add 
permissions to action * update the stale message to clarify when the stale label will get removed * Update .github/workflows/stale.yml * stale action: fix missing indent (#1346) * Fix mssql issue (#1351) * Add K8s Events integration (#1330) * Add K8s eventhandler integration (#1310) * Add docs and sample manifests to eventhandler integration (#1328) * Wait for cache to flush before returning * Clarify eventhandler docs (#1334) * Clarify docs * Update CHANGELOG.md * Review changes (#1349) * stale action: fix typo in label exemptions (#1347) * update withVolumesMixin for agent jsonnet (#1358) Signed-off-by: Robbie Lankford <robert.lankford@grafana.com> * Configure cluster label using logs client external_labels param (#1357) * Configure cluster label using logs client external_labels param * Update CHANGELOG.md * add password file and basic auth round tripper in crow (#1361) * add password file and basic auth round tripper in crow * add ca-certificates in crow image * add orgID flag * update help text * default send_exemplars to true in remote_write (#1352) Signed-off-by: Robbie Lankford <robert.lankford@grafana.com> * Update eventhandler labels (#1368) * Update eventhandler integration labels * Update CHANGELOG * Remove unnecessary kind label * update changelog (#1374) Remove BUGFIX entries that fix a bug introduced by main (i.e., bugs which were never part of a release) * Prepare for release of v0.23.0 (#1377) * Update version references * Fix fat-fingered delete; Remove mention of upgrade Go * RFC: Design in the open (#1055) * rfc: first draft of RFC0001 * add placeholder for PR * update PR link * Update docs/rfcs/0001-designing-in-the-open.md Co-authored-by: Robert Lankford <rlankfo@gmail.com> * Update docs/rfcs/0001-designing-in-the-open.md Co-authored-by: Robert Lankford <rlankfo@gmail.com> * clarify "designing in the open" is best-effort * update 0001 * fix dead link in production/README.md * add recommended sections for RFC proposals * describe the process 
for approving a proposal * ignore RFC template in link checker * Update docs/rfcs/0001-designing-in-the-open.md Co-authored-by: Richard Hartmann <RichiH@users.noreply.github.com> * Update docs/rfcs/0001-designing-in-the-open.md Co-authored-by: Richard Hartmann <RichiH@users.noreply.github.com> * do my nitty 80-char line length limit change * indent pros/cons to a single section * document process for superseding RFCs * remove RFC mutability requirement * add extra flavor around not recommending google docs * require Google Doc -> RFC conversion * move new files Co-authored-by: Robert Lankford <rlankfo@gmail.com> Co-authored-by: Richard Hartmann <RichiH@users.noreply.github.com> * Add Grafana Labs SECURITY.md (#1356) Signed-off-by: Richard Hartmann <richih@richih.org> * Add readiness check to metrics component (#1369) * PR Base Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Fix autoscrape's mockInstance Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Wire in atomic readiness check Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Add CHANGELOG.md entry Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Reference page to download windows installer (#1372) Fixes #1366 * fix typo in node_exporter_config (#1389) which should be `privileged` instead of `priviliged` * Add option for Operator to pass arguments to GrafanaAgent #1227 (#1248) * 1250 oauth2 tracing (#1386) * Add oauth support for trace Otel trace exporter via opentelemetry-collector-contrib oauth2clientauthextension * start extensions on collector instance startup fix decoding to otelconfig build extensions add oauth extension to service map * Update traces config documentation * lint fixes * fix godoc comments * pass exporter index directly to exporter name generator * PR feedback; Update Changelog * sort extensions when sorting pipelines for testing determinism * README: Fix link to agent logo 
(#1396) * update MAINTAINERS.md (#1402) * add smoke alerts to mixin; move local alerts into examples dir (#1397) * add smoke alerts to mixin; move local alerts into examples dir * add podPrefix for smoke test * podPrefix in libsonnet config * [RFC] Integrations in Grafana Agent Operator (#1224) * rfc: integrations in grafana agent operator Supersedes #883 * add missing links * Apply suggestions from code review Co-authored-by: Florian Klink <flokli@flokli.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * clarify how many daemonsets/deployments/service/secrets are created * add example of defining secrets * try defining integrations * s/IntegrationsMonitor/IntegrationMonitor/g * simplify proposal * add alternatives * remove old reference to `hasMetrics` field * document example generated agent configuration file * assign ID RFC-0002 * add missing PR link Co-authored-by: Florian Klink <flokli@flokli.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * add fake rw endpoint to smoke program (#1405) * fix alerts typo (#1407) * continuous delivery for smoke images (#1408) Signed-off-by: Robbie Lankford <robert.lankford@grafana.com> * fix continuous delivery job errors (#1409) Signed-off-by: Robbie Lankford <robert.lankford@grafana.com> * [operator] - Use _file variants for basic auth credentials. 
(#1411) * use password_file alternatives in operator config * update tests * reduce smoke alert noise (#1412) * reduce smoke alert noise Signed-off-by: Robbie Lankford <robert.lankford@grafana.com> * Update production/grafana-agent-mixin/alerts.libsonnet Co-authored-by: Robert Fratto <robertfratto@gmail.com> * update cpu check comment Signed-off-by: Robbie Lankford <robert.lankford@grafana.com> * add minimum load threshold to cpu alert Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Clarify usage of instanceNamespaceSelector (#1413) * RFC-0001: Add status to RFC (#1391) * rfc-0001: add rules for when RFC PRs should be merged * use status field instead of merge to indicating state * Parametrize logs DaemonSet K8s manifests (#1420) * Parametrize logs daemonset K8s manifests * Update CHANGELOG.md * Extend linting configuration file (#1421) * Add depguard linter to reject packages we tend to avoid * Replace golint with revive, since golint is deprecated * Remove interfacer, which is deprecated with no replacement * Add makezero linter to detect misuse of make with append * Add tenv to prefer t.Setenv over os.Setenv in tests * Add whitespace to report unnecessary blank lines * Ignore test files for errcheck In addition to the above, the following changes were made: * Remove settings that just re-set default values, instead pointing to the website to retrieve defaults. * Simplify the errcheck rule to only include functions we actually need to ignore. 
Co-authored-by: Robert Fratto <robert.fratto@grafana.com> Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> Co-authored-by: Paschalis Tsilias <tpaschalis@users.noreply.github.com> Co-authored-by: Patrick Koenig <pkoenig10@gmail.com> Co-authored-by: DataPoints <langer.markus@gmail.com> Co-authored-by: Alex <52292902+alexrudd2@users.noreply.github.com> Co-authored-by: Robert Fratto <robertfratto@gmail.com> Co-authored-by: melGL <81323402+melgl@users.noreply.github.com> Co-authored-by: Tom Wilkie <tomwilkie@users.noreply.github.com> Co-authored-by: Joseph Woodward <josephwoodward@xeuse.com> Co-authored-by: hanif <hjet@users.noreply.github.com> Co-authored-by: Richard Hartmann <RichiH@users.noreply.github.com> Co-authored-by: laiwei <laiwei.ustc@gmail.com> Co-authored-by: Sam <shamsalmon@users.noreply.github.com> Co-authored-by: Chris Knutson <christopher.knutson@gmail.com> Co-authored-by: Florian Klink <flokli@flokli.de> Co-authored-by: Craig Peterson <192540+captncraig@users.noreply.github.com>
* Merge main -> dev.dynamic_configuration (#1270) * Update node_exporter dependency to v1.3.1 (#1228) * Add node_exporter to depcheck * update weaveworks/common dependency * map current release flags and changed defaults * documentation * revert accidental checkin * print out flags when node_exporter test fails to assist debugging * oops, i introduced some flags from master by mistake * Introduce experimental integrations revamp (#1198) * [dev.multiple-integrations] Enable present integrations by default, deprecate enabled field (#1062) * integrations: default to enabled by default * document deprecation of enabled * pkg/integrations: support *_configs field for integrations (#1130) Creates the basic code to unmarshal integrations from a YAML field called <integration name>_configs, which is a slice of that integration. Note that this is NOT wired up to the integrations manager yet, and trying to run the agent with more than one integration of the same type will likely cause problems. * [dev.multiple-integrations] Prototype new integrations subsystem (#1142) * wip: prototype new integrations subsystem * implement Controller with basic logic for Integration and UpdateIntegration * Implement HTTPIntegration for Controller * decouple controller and subsystem * don't have controller implement integration slightly less smelly now * multiplexer integration * rely on boilerplate for multiplexing for now generics would be nice here * remove multiplex_integration.go Also a little code smelly. Instead of having integrations that run other integrations, I'm going to fall back to having only one controller. 
* introduce Subsystem, unexport Controller start wiring up things to Subsystem * introduce v2 agent integration to use for testing * start wiring metrics integrations * rename Options to Globals call a spade a spade * add subsystem options to globals * remove dead code * metricsutils: calculate self-scraping based on globals * complete HTTP target API * working example with agent integration * appease the linter * don't return an error when context to cancel an integration is closed * once again i am asking the linter to forgive my typos * fix bug where labels from individual targets were getting dropped at the API endpoint * pkg/config: fix broken test * finish unit tests for integrations v2 controller * metricsutil/metricshandler_integration: make job name unique Before this change, the job name would have collided when using multiple instances of the same integration. * ensure that global subsystem labels are injected into targets * integrations/v2: Infer target hostname from SD API host (#1175) * [dev.multiple-integrations] integrations/v2: allow shimming between v1 and v2 integrations. (#1179) * integrations/v2: allow shimming between v1 and v2 integrations. Shimming is done by changing how the integration registration works; a new RegisterDynamic was added that allows for creating Configs at runtime. Here be dragons; this should be removed whenever we no longer have a need for it. * fix lint * pkg/integrations/v2: use "RegisterLegacy" instead of a generic mechanism * fine, I won't add the deprecation notice if it will make the linter sad * pkg/integrations: re-align (#1181) This commit reverts 69ba2ddfa9483cc8ac6e010dd7abccd319580c80 in favor of allowing the new subsystem to handle multiple instances of integrations. This commit also removes the wal_truncate_frequency field from integrations as it is the only field from old integrations that does not have a current counterpart. 
* [dev.multiple-integrations] Hide integrations/v2 behind a feature flag (#1185) * feature flag wip * dynamically switch between integrations v1 and v2 default to v1. * pkg/integrations/versionselector to file in pkg/config * pkg/config: fix defaults for Integrations * pkg/config: use more generic way to unmarshal differently based on flag * add missing godoc comment * more comments * switch to deferred unmarshaling * remove unused Config field * simplify completeUnmarshal * do not perform lazy deferred unmarshaling * enable cadvisor by default * switch to using real feature flag * fix postgres_exporter * Merge main into dev.multiple-integrations (#1184) * Fix typo (#1141) * Traces: Improved pod association in PromSD processor (#1137) * Improve k8s pod association * Add tests * Changelog * typo * Add prom_sd_pod_association * Extend tests for pod associations * Docs for pod association config * Lint fixes * Move to unreleased * Add instrumentation recommendations * Remove unnecessary constants * Improve tests * remote config with http(s) provider (#1143) * sample remote config code with http provider * use t.TempDir() in unit test * no need to clean up after T.TempDir() * use NewClientFromConfig and make caller responsible for calling SetDirectory * handle nil HTTPClientConfig * remove blank identifier assignment * pass basic auth command line flags for remote config * address pr nits * add experimental flag * set loader inline * update changelog * add remote config section in docs * pr comment updates * announce patch releases for cve-2021-41090 (#1152) * Merge patch release to main (#1153) * Add secret type to sensitive values * Break out config tests to their own implementation. Also remove username has a sensitive value. * Update changelog * Fix failing test * Scrub secrets when marshaling instance configs * update for v0.21 * Updated changes from the merge. 
* Remove changelog * Scrub out receivers has ***receivers_scrubber***:null * obscure etcd/consul credentials * Update pkg/traces/config_test.go Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Update pkg/config/config.go * go fmt * Change to using custom object and return <secret> * Fix bad merge * [v0.21.2] toggle config endpoint (#19) * disable /-/config endpoint by default * disable scraping api get endpoint as well * fix new test * add test and rename flag Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Update version to v0.21.2 * Update defaults.go * fix /-/config endpoint * also fix non-pointer config bug * temporarily disable linting for release * fix lint errors Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> * Fix POSTGRES_EXPORTER_DATA_SOURCE_NAME usage for postgres_exporter (#1162) * Fix POSTGRES_EXPORTER_DATA_SOURCE_NAME usage for postgres_exporter A recent change broke the usage of POSTGRES_EXPORTER_DATA_SOURCE_NAME for the postgres_exporter. As the incorrect variable was checked in the if clause, it always raises an error. 
* changelog: keep feature -> enhancement -> bugfix order * postgres_exporter: add regression test Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Fix syntax error in Jsonnet logs helper method (#1174) Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com> * cAdvisor Integration (#1081) * Add cadvisor module * Begin creating common config for cadvisor * Don't export internal state * Finish config options for cadvisor * Set config options, and implement cAdvisor collectors * Linting * Buildflags for cadvisor only in linux * I R LEArN Build Tags * Don't zero value the zero value * Offload sketchy global var manipulation to the integrations Run func * Remove unused collectors * Lint * Create generic stub integration and use it for cadvisor * Lint * Final refactor of cAdvisor config for unsupported platforms. Pared down stub integrations. * Lint * Docs for cadvisor config * Update changelog * Update pkg/integrations/stub_integration.go Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Reorder changelog * Instance key clarity * Inclusive naming * Finish name changes Keep default disable metric list in sync with upstream Idiomatic golang * Hardcode disabled metrics for cadvisor Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Remove log-level flag from systemd unit file (#1177) * Upgrade to OTel v0.40.0 (#1176) * Upgrade to OTel v0.40.0 * Changelog * Add factories check * go mod tidy * config/features: create package to standardize experimental features (#1170) * config/features: create package to standardize experimental features This commit introduces a new package, pkg/config/features, which allows defining a set of features and validating whether flags associated with those features are allowed to be set. 
Closes #1163 * update documentation (also s/enabled-features/enable-features) * Fix typo * Update pkg/config/features/features.go Co-authored-by: Robert Lankford <rlankfo@gmail.com> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * enable cadvisor by default * switch to using real feature flag * fix postgres_exporter Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * Revert "Merge main into dev.multiple-integrations (#1184)" (#1189) This reverts commit ad76ec54f9a781fc83d3e7001808c887f37833ff. * [dev.multiple-integrations] Revert breaking changes to existing integrations (#1191) * revert breaking changes to integrations v1 This commit reverts #1062 in favor of making breaking changes directly in integrations-next instead. The part of #1181 to remove `wal_truncate_frequency` has also been reverted. As part of this change, the enabled field is removed from the v2 common metrics configs, and v2 integrations can no longer be disabled. v2 integrations can only be disabled by removing them from the YAML. 
* integrations/v2: remove stale reference to ErrDisabled (fix typo too) * integrations/v2: bring in common config decoupling * [dev.multiple-integrations] Introduce autoscraper (#1195) * pkg/integrations/v2: introduce self-scraping * linting * [dev.multiple-integrations] Multiple instances of integrations (#1196) * multiple instances of integrations opt in relevant v1 integrations into supporting multiple instances * shims should check for instance key override * Document integrations-next (#1197) * document integrations-next * remove json tags since they make markdown unhappy * changelog * s/Run/RunIntegration * remove stale comment about integrations.controller purpose * create dedicated run method for instanceScraper * s/expoter/exporter/g * Document why an autoscrape.Scraper manages a set of per-instance scrapers * spell out prerequisite instead of pre-req * use go.uber.org/atomic to make the code a little easier to follow * remove started callback for running integration * use smaller interface for autoscrape Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * Fix panic when using 'stdout' in automatic logging (#1233) * integrations-next: fix bug where v2 integrations were not being strictly unmarshaled (#1235) * Remove jsonnet vendor folders (#1222) * remove jsonnet vendor This adds all vendor folders into .gitignore and removes cached vendor files from the repository. 
Closes #1221 * Update scripts and instructions for jsonnet vendor removal * `make example-dashboards` will now also run `jb install` * k3d environment instructions now include `jb install` * smoke-test.bash will now run `jb install` prior to `tk apply` * Fix link to k3d example in DEVELOPERS.md (#1242) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Fix node_exporter upgrade docs (#1239) * Fix panic in automatic logging with stdout backend (#1243) * pkg/util: support custom yaml.Unmarshaler implementations for util.UnmarshalYAMLMerged (#1244) It's common for config types to have implement yaml.Unmarshaler for: * Applying defaults * Applying extra logic post-unmarshal If these config types were unmarshaled through util.UnmarshalYAMLMerged, the yaml.Unmarshaler implementation would never complete successfully, preventing the post-unmarshal logic from running. This issue was introduced in #1192, but went unnoticed until #1228 implemented yaml.Unmarshaler to perform field migrations. #1240 reported the issue. This commit fixes the bug by performing a second non-strict unmarshal to ensure that all input values unmarshal successfully, with the exception of unmarshal errors unrelated to unrecognized field names. This is hacky, but it's worthwhile noting that util.UnmarshalYAMLMerged is a temporary workaround needed for the integrations-next migration, and will eventually be removed. 
* Update k3d example grafana/grafonnet-lib version (#1246) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Create an e2e framework with support for running tests against k8s (#1234) * e2e: create an e2e framework with support for running tests against a k3d cluster * add new E2E drone job * E2E tests should pass when doing a release * sign drone.yml again * move e2e lint to different step that has golangci-lint installed * upgrade golangci-lint and go for e2e test * e2e: add gcc * E2E: install build-essential to get a working full gcc env * :( * e2e: support running from inside of docker * fix lint error * address review feedback * Operator: fix bug where /-/ready and /-/healthy always returned 404 (#1252) * operator: fix bug where /-/ready and /-/healthy always returned 404 controller-runtime must have at least one ready/healthy check for the endpoints to exist * fix lint error, use healthz.Ping * Make scraping-svc use the new `metrics:` key (#1259) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * update prometheus dependency (#1260) * corrected typo (#1265) * Use RELEASE_TAG to choose between `:main` and `:latest` docker tags (#1264) * Use RELEASE_TAG to choose between `:main` and `:latest` docker tags Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Use :main tag for images in smoke test Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Set IMAGE_BRANCH_TAG env var in drone and actions pipelines Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Remove quotes from Makefile variable Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Remove force_release action Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * prepare for v0.22.0 release (#1266) * prepare for v0.22.0 release * remove E2E pipeline Co-authored-by: Robert Fratto <robert.fratto@grafana.com> Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: 
Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> Co-authored-by: Paschalis Tsilias <tpaschalis@users.noreply.github.com> Co-authored-by: Patrick Koenig <pkoenig10@gmail.com> Co-authored-by: DataPoints <langer.markus@gmail.com> * Main merge dynamic (#1305) * Update node_exporter dependency to v1.3.1 (#1228) * Add node_exporter to depcheck * update weaveworks/common dependency * map current release flags and changed defaults * documentation * revert accidental checkin * print out flags when node_exporter test fails to assist debugging * oops, i introduced some flags from master by mistake * Introduce experimental integrations revamp (#1198) * [dev.multiple-integrations] Enable present integrations by default, deprecate enabled field (#1062) * integrations: default to enabled by default * document deprecation of enabled * pkg/integrations: support *_configs field for integrations (#1130) Creates the basic code to unmarshal integrations from a YAML field called <integration name>_configs, which is a slice of that integration. Note that this is NOT wired up to the integrations manager yet, and trying to run the agent with more than one integration of the same type will likely cause problems. 
* [dev.multiple-integrations] Prototype new integrations subsystem (#1142) * wip: prototype new integrations subsystem * implement Controller with basic logic for Integration and UpdateIntegration * Implement HTTPIntegration for Controller * decouple controller and subsystem * don't have controller implement integration slightly less smelly now * multiplexer integration * rely on boilerplate for multiplexing for now generics would be nice here * remove multiplex_integration.go Also a little code smelly. Instead of having integrations that run other integrations, I'm going to fall back to having only one controller. * introduce Subsystem, unexport Controller start wiring up things to Subsystem * introduce v2 agent integration to use for testing * start wiring metrics integrations * rename Options to Globals call a spade a spade * add subsystem options to globals * remove dead code * metricsutils: calculate self-scraping based on globals * complete HTTP target API * working example with agent integration * appease the linter * don't return an error when context to cancel an integration is closed * once again i am asking the linter to forgive my typos * fix bug where labels from individual targets were getting dropped at the API endpoint * pkg/config: fix broken test * finish unit tests for integrations v2 controller * metricsutil/metricshandler_integration: make job name unique Before this change, the job name would have collided when using multiple instances of the same integration. * ensure that global subsystem labels are injected into targets * integrations/v2: Infer target hostname from SD API host (#1175) * [dev.multiple-integrations] integrations/v2: allow shimming between v1 and v2 integrations. (#1179) * integrations/v2: allow shimming between v1 and v2 integrations. Shimming is done by changing how the integration registration works; a new RegisterDynamic was added that allows for creating Configs at runtime. 
Here be dragons; this should be removed whenever we no longer have a need for it. * fix lint * pkg/integrations/v2: use "RegisterLegacy" instead of a generic mechanism * fine, I won't add the deprecation notice if it will make the linter sad * pkg/integrations: re-align (#1181) This commit reverts 69ba2ddfa9483cc8ac6e010dd7abccd319580c80 in favor of allowing the new subsystem to handle multiple instances of integrations. This commit also removes the wal_truncate_frequency field from integrations as it is the only field from old integrations that does not have a current counterpart. * [dev.multiple-integrations] Hide integrations/v2 behind a feature flag (#1185) * feature flag wip * dynamically switch between integrations v1 and v2 default to v1. * pkg/integrations/versionselector to file in pkg/config * pkg/config: fix defaults for Integrations * pkg/config: use more generic way to unmarshal differently based on flag * add missing godoc comment * more comments * switch to deferred unmarshaling * remove unused Config field * simplify completeUnmarshal * do not perform lazy deferred unmarshaling * enable cadvisor by default * switch to using real feature flag * fix postgres_exporter * Merge main into dev.multiple-integrations (#1184) * Fix typo (#1141) * Traces: Improved pod association in PromSD processor (#1137) * Improve k8s pod association * Add tests * Changelog * typo * Add prom_sd_pod_association * Extend tests for pod associations * Docs for pod association config * Lint fixes * Move to unreleased * Add instrumentation recommendations * Remove unnecessary constants * Improve tests * remote config with http(s) provider (#1143) * sample remote config code with http provider * use t.TempDir() in unit test * no need to clean up after T.TempDir() * use NewClientFromConfig and make caller responsible for calling SetDirectory * handle nil HTTPClientConfig * remove blank identifier assignment * pass basic auth command line flags for remote config * address pr nits * 
add expiremental flag * set loader inline * update changelog * add remote config section in docs * pr comment updates * announce patch releases for cve-2021-41090 (#1152) * Merge patch release to main (#1153) * Add secret type to sensitive values * Break out config tests to their own implementation. Also remove username has a sensitive value. * Update changelog * Fix failing test * Scrub secrets when marshaling instance configs * update for v0.21 * Updated changes from the merge. * Remove changelog * Scrub out receivers has ***receivers_scrubber***:null * obscure etcd/consul credentials * Update pkg/traces/config_test.go Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Update pkg/config/config.go * go fmt * Change to using custom object and return <secret> * Fix bad merge * [v0.21.2] toggle config endpoint (#19) * disable /-/config endpoint by default * disable scraping api get endpoint as well * fix new test * add test and rename flag Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Update version to v0.21.2 * Update defaults.go * fix /-/config endpoint * also fix non-pointer config bug * temporarily disable linting for release * fix lint errors Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> * Fix POSTGRES_EXPORTER_DATA_SOURCE_NAME usage for postgres_exporter (#1162) * Fix POSTGRES_EXPORTER_DATA_SOURCE_NAME usage for postgres_exporter A recent change broke the usage of POSTGRES_EXPORTER_DATA_SOURCE_NAME for the postgres_exporter. As the incorrect variable was checked in the if clause, it always raises an error. 
* changelog: keep feature -> enhancement -> bugfix order * postgres_exporter: add regression test Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Fix syntax error in Jsonnet logs helper method (#1174) Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com> * cAdvisor Integration (#1081) * Add cadvisor module * Begin creating common config for cadvisor * Don't export internal state * Finish config options for cadvisor * Set config options, and implement cAdvisor collectors * Linting * Buildflags for cadvisor only in linux * I R LEArN Build Tags * Don't zero value the zero value * Offload sketchy global var manipulation to the integrations Run func * Remove unused collectors * Lint * Create generic stub integration and use it for cadvisor * Lint * Final refactor of cAdvisor config for unsupported platforms. Pared down stub integrations. * Lint * Docs for cadvisor config * Update changelog * Update pkg/integrations/stub_integration.go Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Reorder changelog * Instance key clarity * Inclusive naming * Finish name changes Keep default disable metric list in sync with upstream Idiomatic golang * Hardcode disabled metrics for cadvisor Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Remove log-level flag from systemd unit file (#1177) * Upgrade to OTel v0.40.0 (#1176) * Upgrade to OTel v0.40.0 * Changelog * Add factories check * go mod tidy * config/features: create package to standardize experimental features (#1170) * config/features: create package to standardize experiemental features This commit introduces a new package, pkg/config/features, which allows defining a set of features and validating whether flags associated with those features are allowed to be set. 
Closes #1163 * update documentation (also s/enabled-features/enable-features) * Fix typo * Update pkg/config/features/features.go Co-authored-by: Robert Lankford <rlankfo@gmail.com> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * enable cadvisor by default * switch to using real feature flag * fix postgres_exporter Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * Revert "Merge main into dev.multiple-integrations (#1184)" (#1189) This reverts commit ad76ec54f9a781fc83d3e7001808c887f37833ff. * [dev.multiple-integrations] Revert breaking changes to existing integrations (#1191) * revert breaking changes to integrations v1 This commit reverts #1062 in favor of making breaking changes directly in integrations-next instead. The part of #1181 to remove `wal_truncate_frequency` has also been reverted. As part of this change, the enabled field is removed from the v2 common metrics configs, and v2 integrations can no longer be disabled. v2 integrations can only be disabled by removing them from the YAML. 
* integrations/v2: remove stale reference to ErrDisabled (fix typo too) * integrations/v2: bring in common config decoupling * [dev.multiple-integrations] Introduce autoscraper (#1195) * pkg/integrations/v2: introduce self-scraping * linting * [dev.multiple-integrations] Multiple instances of integrations (#1196) * multiple instances of integrations opt in relevant v1 integrations into supporting multiple instances * shims should check for instance key override * Document integrations-next (#1197) * document integrations-next * remove json tags since they make markdown unhappy * changelog * s/Run/RunIntegration * remove stale comment about integrations.controller purpose * create dedicated run method for instanceScraper * s/expoter/exporter/g * Document why an autoscrape.Scraper manages a set of per-instance scrapers * spell out prerequisite instead of pre-req * use go.uber.org/atomic to make the code a little easier to follow * remove started callback for running integration * use smaller interface for autoscrape Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * Fix panic when using 'stdout' in automatic logging (#1233) * integrations-next: fix bug where v2 integrations were not being strictly unmarshaled (#1235) * Remove jsonnet vendor folders (#1222) * remove jsonnet vendor This adds all vendor folders into .gitignore and removes cached vendor files from the repository. 
Closes #1221 * Update scripts and instructions for jsonnet vendor removal * `make example-dashboards` will now also run `jb install` * k3d environment instructions now include `jb install` * smoke-test.bash will now run `jb install` prior to `tk apply` * Fix link to k3d example in DEVELOPERS.md (#1242) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Fix node_exporter upgrade docs (#1239) * Fix panic in automatic logging with stdout backend (#1243) * pkg/util: support custom yaml.Unmarshaler implementations for util.UnmarshalYAMLMerged (#1244) It's common for config types to implement yaml.Unmarshaler for: * Applying defaults * Applying extra logic post-unmarshal If these config types were unmarshaled through util.UnmarshalYAMLMerged, the yaml.Unmarshaler implementation would never complete successfully, preventing the post-unmarshal logic from running. This issue was introduced in #1192, but went unnoticed until #1228 implemented yaml.Unmarshaler to perform field migrations. #1240 reported the issue. This commit fixes the bug by performing a second non-strict unmarshal to ensure that all input values unmarshal successfully, with the exception of unmarshal errors unrelated to unrecognized field names. This is hacky, but it's worth noting that util.UnmarshalYAMLMerged is a temporary workaround needed for the integrations-next migration, and will eventually be removed. 
* Update k3d example grafana/grafonnet-lib version (#1246) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Create an e2e framework with support for running tests against k8s (#1234) * e2e: create an e2e framework with support for running tests against a k3d cluster * add new E2E drone job * E2E tests should pass when doing a release * sign drone.yml again * move e2e lint to different step that has golangci-lint installed * upgrade golangci-lint and go for e2e test * e2e: add gcc * E2E: install build-essential to get a working full gcc env * :( * e2e: support running from inside of docker * fix lint error * address review feedback * Operator: fix bug where /-/ready and /-/healthy always returned 404 (#1252) * operator: fix bug where /-/ready and /-/healthy always returned 404 controller-runtime must have at least one ready/healthy check for the endpoints to exist * fix lint error, use healthz.Ping * Make scraping-svc use the new `metrics:` key (#1259) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * update prometheus dependency (#1260) * corrected typo (#1265) * Use RELEASE_TAG to choose between `:main` and `:latest` docker tags (#1264) * Use RELEASE_TAG to choose between `:main` and `:latest` docker tags Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Use :main tag for images in smoke test Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Set IMAGE_BRANCH_TAG env var in drone and actions pipelines Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Remove quotes from Makefile variable Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Remove force_release action Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * prepare for v0.22.0 release (#1266) * prepare for v0.22.0 release * remove E2E pipeline * Add basic testing framework for operator (#1268) * remove dedicated go.mod for e2e/ * move e2e/k8s to pkg/util/k8s * Migrate operator tests to 
pkg/util/k8s * remove dedicated e2e tests * allow skipping TestCluster in pkg/util/k8s * remove e2e/ * fix bad merge * fix order of make env args for windows * actually declare referenced docker volume * introduce pkg/util/subset for asserting subset of objects * refactor operator so it's testable * define basic integration test for operator * fix lint errors * fix invalid address in operator test config * Update release-note.md (#1267) * Set scrape User-Agent header during init (#1274) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Upgrade to Go 1.17 (#1278) * Upgrade to 1.17.6 in go.mod and Dockerfiles * Update CHANGELOG.md to mention the update * Update Go version in drone/actions pipelines * Update go.mod, go.sum files via * Re-sign drone.yml * Remove leading newline causing drone build to fail * Bump golangci-lint image to a version using Go 1.17 * Re-attempt to solve linter issue with new golangci-lint image * Remove suffix of exclude rules * Clean previous Go version before unpacking Go 1.17 * Also clean up previous Go versions in other steps * fix typo (#1284) * Use custom Go version in agent-operator Dockerfile (#1286) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * pkg/operator: refactor resource hierarchy discovery (#1271) * pkg/operator: refactor resource hierarchy discovery This commit moves common logic related to discovering the resource hierarchy to pkg/operator/hierarchy. This new package requires less boilerplate, which the reconciler is updated to take advantage of. 
* remove unused code * test construction of resource hierarchy * add missing build constraints * small extra cleanup to use pointer package * review feedback * update agent-build-image for go 1.17 (#1287) (also use a consistent base image tag instead of latest) * Skip non-ready entries when listing instances (#1289) * Skip non-ready instances in LoadInstances() Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add changelog entry Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Fix panic in prom_sd_processor when address is empty (#1279) * Fix panic in prom_sd_processor when address is empty * Fix panic in prom_sd_processor when address is empty * Fix docs * Add test case * Lint * Move to unreleased * Operator: generate proxy_url for remote_write (#1298) * operator: generate proxy_url for remote_write * fix weird indentation in test Co-authored-by: Robert Fratto <robert.fratto@grafana.com> Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> Co-authored-by: Paschalis Tsilias <tpaschalis@users.noreply.github.com> Co-authored-by: Patrick Koenig <pkoenig10@gmail.com> Co-authored-by: DataPoints <langer.markus@gmail.com> Co-authored-by: Alex <52292902+alexrudd2@users.noreply.github.com> Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Main merge dynamic (#1307) * Update node_exporter dependency to v1.3.1 (#1228) * Add node_exporter to depcheck * update weaveworks/common dependency * map current release flags and changed defaults * documentation * revert accidental checkin 
* print out flags when node_exporter test fails to assist debugging * oops, i introduced some flags from master by mistake * Introduce experimental integrations revamp (#1198) * [dev.multiple-integrations] Enable present integrations by default, deprecate enabled field (#1062) * integrations: default to enabled by default * document deprecation of enabled * pkg/integrations: support *_configs field for integrations (#1130) Creates the basic code to unmarshal integrations from a YAML field called <integration name>_configs, which is a slice of that integration. Note that this is NOT wired up to the integrations manager yet, and trying to run the agent with more than one integration of the same type will likely cause problems. * [dev.multiple-integrations] Prototype new integrations subsystem (#1142) * wip: prototype new integrations subsystem * implement Controller with basic logic for Integration and UpdateIntegration * Implement HTTPIntegration for Controller * decouple controller and subsystem * don't have controller implement integration slightly less smelly now * multiplexer integration * rely on boilerplate for multiplexing for now generics would be nice here * remove multiplex_integration.go Also a little code smelly. Instead of having integrations that run other integrations, I'm going to fall back to having only one controller. 
* introduce Subsystem, unexport Controller start wiring up things to Subsystem * introduce v2 agent integration to use for testing * start wiring metrics integrations * rename Options to Globals call a spade a spade * add subsystem options to globals * remove dead code * metricsutils: calculate self-scraping based on globals * complete HTTP target API * working example with agent integration * appease the linter * don't return an error when context to cancel an integration is closed * once again i am asking the linter to forgive my typos * fix bug where labels from individual targets were getting dropped at the API endpoint * pkg/config: fix broken test * finish unit tests for integrations v2 controller * metricsutil/metricshandler_integration: make job name unique Before this change, the job name would have collided when using multiple instances of the same integration. * ensure that global subsystem labels are injected into targets * integrations/v2: Infer target hostname from SD API host (#1175) * [dev.multiple-integrations] integrations/v2: allow shimming between v1 and v2 integrations. (#1179) * integrations/v2: allow shimming between v1 and v2 integrations. Shimming is done by changing how the integration registration works; a new RegisterDynamic was added that allows for creating Configs at runtime. Here be dragons; this should be removed whenever we no longer have a need for it. * fix lint * pkg/integrations/v2: use "RegisterLegacy" instead of a generic mechanism * fine, I won't add the deprecation notice if it will make the linter sad * pkg/integrations: re-align (#1181) This commit reverts 69ba2ddfa9483cc8ac6e010dd7abccd319580c80 in favor of allowing the new subsystem to handle multiple instances of integrations. This commit also removes the wal_truncate_frequency field from integrations as it is the only field from old integrations that does not have a current counterpart. 
* [dev.multiple-integrations] Hide integrations/v2 behind a feature flag (#1185) * feature flag wip * dynamically switch between integrations v1 and v2 default to v1. * pkg/integrations/versionselector to file in pkg/config * pkg/config: fix defaults for Integrations * pkg/config: use more generic way to unmarshal differently based on flag * add missing godoc comment * more comments * switch to deferred unmarshaling * remove unused Config field * simplify completeUnmarshal * do not perform lazy deferred unmarshaling * enable cadvisor by default * switch to using real feature flag * fix postgres_exporter * Merge main into dev.multiple-integrations (#1184) * Fix typo (#1141) * Traces: Improved pod association in PromSD processor (#1137) * Improve k8s pod association * Add tests * Changelog * typo * Add prom_sd_pod_association * Extend tests for pod associations * Docs for pod association config * Lint fixes * Move to unreleased * Add instrumentation recommendations * Remove uncessary constants * Improve tests * remote config with http(s) provider (#1143) * sample remote config code with http provider * use t.TempDir() in unit test * no need to clean up after T.TempDir() * use NewClientFromConfig and make caller responsible for calling SetDirectory * handle nil HTTPClientConfig * remove blank identifier assignment * pass basic auth command line flags for remote config * address pr nits * add expiremental flag * set loader inline * update changelog * add remote config section in docs * pr comment updates * announce patch releases for cve-2021-41090 (#1152) * Merge patch release to main (#1153) * Add secret type to sensitive values * Break out config tests to their own implementation. Also remove username has a sensitive value. * Update changelog * Fix failing test * Scrub secrets when marshaling instance configs * update for v0.21 * Updated changes from the merge. 
* Remove changelog * Scrub out receivers has ***receivers_scrubber***:null * obscure etcd/consul credentials * Update pkg/traces/config_test.go Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Update pkg/config/config.go * go fmt * Change to using custom object and return <secret> * Fix bad merge * [v0.21.2] toggle config endpoint (#19) * disable /-/config endpoint by default * disable scraping api get endpoint as well * fix new test * add test and rename flag Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Update version to v0.21.2 * Update defaults.go * fix /-/config endpoint * also fix non-pointer config bug * temporarily disable linting for release * fix lint errors Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> * Fix POSTGRES_EXPORTER_DATA_SOURCE_NAME usage for postgres_exporter (#1162) * Fix POSTGRES_EXPORTER_DATA_SOURCE_NAME usage for postgres_exporter A recent change broke the usage of POSTGRES_EXPORTER_DATA_SOURCE_NAME for the postgres_exporter. As the incorrect variable was checked in the if clause, it always raises an error. 
* changelog: keep feature -> enhancement -> bugfix order * postgres_exporter: add regression test Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Fix syntax error in Jsonnet logs helper method (#1174) Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com> * cAdvisor Integration (#1081) * Add cadvisor module * Begin creating common config for cadvisor * Don't export internal state * Finish config options for cadvisor * Set config options, and implement cAdvisor collectors * Linting * Buildflags for cadvisor only in linux * I R LEArN Build Tags * Don't zero value the zero value * Offload sketchy global var manipulation to the integrations Run func * Remove unused collectors * Lint * Create generic stub integration and use it for cadvisor * Lint * Final refactor of cAdvisor config for unsupported platforms. Pared down stub integrations. * Lint * Docs for cadvisor config * Update changelog * Update pkg/integrations/stub_integration.go Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Reorder changelog * Instance key clarity * Inclusive naming * Finish name changes Keep default disable metric list in sync with upstream Idiomatic golang * Hardcode disabled metrics for cadvisor Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Remove log-level flag from systemd unit file (#1177) * Upgrade to OTel v0.40.0 (#1176) * Upgrade to OTel v0.40.0 * Changelog * Add factories check * go mod tidy * config/features: create package to standardize experimental features (#1170) * config/features: create package to standardize experiemental features This commit introduces a new package, pkg/config/features, which allows defining a set of features and validating whether flags associated with those features are allowed to be set. 
Closes #1163 * update documentation (also s/enabled-features/enable-features) * Fix typo * Update pkg/config/features/features.go Co-authored-by: Robert Lankford <rlankfo@gmail.com> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * enable cadvisor by default * switch to using real feature flag * fix postgres_exporter Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * Revert "Merge main into dev.multiple-integrations (#1184)" (#1189) This reverts commit ad76ec54f9a781fc83d3e7001808c887f37833ff. * [dev.multiple-integrations] Revert breaking changes to existing integrations (#1191) * revert breaking changes to integrations v1 This commit reverts #1062 in favor of making breaking changes directly in integrations-next instead. The part of #1181 to remove `wal_truncate_frequency` has also been reverted. As part of this change, the enabled field is removed from the v2 common metrics configs, and v2 integrations can no longer be disabled. v2 integrations can only be disabled by removing them from the YAML. 
* integrations/v2: remove stale reference to ErrDisabled (fix typo too) * integrations/v2: bring in common config decoupling * [dev.multiple-integrations] Introduce autoscraper (#1195) * pkg/integrations/v2: introduce self-scraping * linting * [dev.multiple-integrations] Multiple instances of integrations (#1196) * multiple instances of integrations opt in relevant v1 integrations into supporting multiple instances * shims should check for instance key override * Document integrations-next (#1197) * document integrations-next * remove json tags since they make markdown unhappy * changelog * s/Run/RunIntegration * remove stale comment about integrations.controller purpose * create dedicated run method for instanceScraper * s/expoter/exporter/g * Document why an autoscrape.Scraper manages a set of per-instance scrapers * spell out prerequisite instead of pre-req * use go.uber.org/atomic to make the code a little easier to follow * remove started callback for running integration * use smaller interface for autoscrape Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * Fix panic when using 'stdout' in automatic logging (#1233) * integrations-next: fix bug where v2 integrations were not being strictly unmarshaled (#1235) * Remove jsonnet vendor folders (#1222) * remove jsonnet vendor This adds all vendor folders into .gitignore and removes cached vendor files from the repository. 
Closes #1221 * Update scripts and instructions for jsonnet vendor removal * `make example-dashboards` will now also run `jb install` * k3d environment instructions now include `jb install` * smoke-test.bash will now run `jb install` prior to `tk apply` * Fix link to k3d example in DEVELOPERS.md (#1242) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Fix node_exporter upgrade docs (#1239) * Fix panic in automatic logging with stdout backend (#1243) * pkg/util: support custom yaml.Unmarshaler implementations for util.UnmarshalYAMLMerged (#1244) It's common for config types to implement yaml.Unmarshaler for: * Applying defaults * Applying extra logic post-unmarshal If these config types were unmarshaled through util.UnmarshalYAMLMerged, the yaml.Unmarshaler implementation would never complete successfully, preventing the post-unmarshal logic from running. This issue was introduced in #1192, but went unnoticed until #1228 implemented yaml.Unmarshaler to perform field migrations. #1240 reported the issue. This commit fixes the bug by performing a second non-strict unmarshal to ensure that all input values unmarshal successfully, with the exception of unmarshal errors unrelated to unrecognized field names. This is hacky, but it's worth noting that util.UnmarshalYAMLMerged is a temporary workaround needed for the integrations-next migration, and will eventually be removed. 
* Update k3d example grafana/grafonnet-lib version (#1246) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Create an e2e framework with support for running tests against k8s (#1234) * e2e: create an e2e framework with support for running tests against a k3d cluster * add new E2E drone job * E2E tests should pass when doing a release * sign drone.yml again * move e2e lint to different step that has golangci-lint installed * upgrade golangci-lint and go for e2e test * e2e: add gcc * E2E: install build-essential to get a working full gcc env * :( * e2e: support running from inside of docker * fix lint error * address review feedback * Operator: fix bug where /-/ready and /-/healthy always returned 404 (#1252) * operator: fix bug where /-/ready and /-/healthy always returned 404 controller-runtime must have at least one ready/healthy check for the endpoints to exist * fix lint error, use healthz.Ping * Make scraping-svc use the new `metrics:` key (#1259) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * update prometheus dependency (#1260) * corrected typo (#1265) * Use RELEASE_TAG to choose between `:main` and `:latest` docker tags (#1264) * Use RELEASE_TAG to choose between `:main` and `:latest` docker tags Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Use :main tag for images in smoke test Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Set IMAGE_BRANCH_TAG env var in drone and actions pipelines Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Remove quotes from Makefile variable Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Remove force_release action Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * prepare for v0.22.0 release (#1266) * prepare for v0.22.0 release * remove E2E pipeline * Add basic testing framework for operator (#1268) * remove dedicated go.mod for e2e/ * move e2e/k8s to pkg/util/k8s * Migrate operator tests to 
pkg/util/k8s * remove dedicated e2e tests * allow skipping TestCluster in pkg/util/k8s * remove e2e/ * fix bad merge * fix order of make env args for windows * actually declare referenced docker volume * introduce pkg/util/subset for asserting subset of objects * refactor operator so it's testable * define basic integration test for operator * fix lint errors * fix invalid address in operator test config * Update release-note.md (#1267) * Set scrape User-Agent header during init (#1274) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Upgrade to Go 1.17 (#1278) * Upgrade to 1.17.6 in go.mod and Dockerfiles * Update CHANGELOG.md to mention the update * Update Go version in drone/actions pipelines * Update go.mod, go.sum files via * Re-sign drone.yml * Remove leading newline causing drone build to fail * Bump golangci-lint image to a version using Go 1.17 * Re-attempt to solve linter issue with new golangci-lint image * Remove suffix of exclude rules * Clean previous Go version before unpacking Go 1.17 * Also clean up previous Go versions in other steps * fix typo (#1284) * Use custom Go version in agent-operator Dockerfile (#1286) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * pkg/operator: refactor resource hierarchy discovery (#1271) * pkg/operator: refactor resource hierarchy discovery This commit moves common logic related to discovering the resource hierarchy to pkg/operator/hierarchy. This new package requires less boilerplate, which the reconciler is updated to take advantage of. 
* remove unused code * test construction of resource hierarchy * add missing build constraints * small extra cleanup to use pointer package * review feedback * update agent-build-image for go 1.17 (#1287) (also use a consistent base image tag instead of latest) * Skip non-ready entries when listing instances (#1289) * Skip non-ready instances in LoadInstances() Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add changelog entry Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Fix panic in prom_sd_processor when address is empty (#1279) * Fix panic in prom_sd_processor when address is empty * Fix panic in prom_sd_processor when address is empty * Fix docs * Add test case * Lint * Move to unreleased * Operator: generate proxy_url for remote_write (#1298) * operator: generate proxy_url for remote_write * fix weird indentation in test Co-authored-by: Robert Fratto <robert.fratto@grafana.com> Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> Co-authored-by: Paschalis Tsilias <tpaschalis@users.noreply.github.com> Co-authored-by: Patrick Koenig <pkoenig10@gmail.com> Co-authored-by: DataPoints <langer.markus@gmail.com> Co-authored-by: Alex <52292902+alexrudd2@users.noreply.github.com> Co-authored-by: Robert Fratto <robertfratto@gmail.com> * [RFC] Dynamic Documentation (#1308) * Documentation and feature flag support. Part 1 of many. 
* Fix linting * Documentation * More documentation * MOAR documentation * Update overall readme * fix typo * Update docs/configuration/dynamic-config.md Co-authored-by: Robert Lankford <robert.lankford@grafana.com> * fix typos and add additional comments * Feedback from PR * Allow overrides of documentation Co-authored-by: Robert Lankford <robert.lankford@grafana.com> * Code to support dynamic configuration. (#1360) * Code to support dynamic configuration. * Fix linting errors * Fix issue with examples and add template parse * Fix windows issues * Simplify tests * Respond to PR feedback * Simplify tests and remove setting default * Move to unexported configshim * Export configshims members for testing * Simplify singleton checking * Add error around expand var * sever -> server * Better error code * Fix error when unmarshalling yaml which allowed unexporting fields * Switch to using require.noerror instead of assert. * Split tests into two sections and remove redundant tests * Verbiage cleanup and var renaming * PR feedback * PR feedback * Move singleton check to the controller instead * Cleanup the read file code to MUCH more compact code. 
* Remove EOF * PR feedback from session with rfratto, lots of changes to simplify the code * Fix misc feedback from PR * Dyn main merge (#1423) * Update node_exporter dependency to v1.3.1 (#1228) * Add node_exporter to depcheck * update weaveworks/common dependency * map current release flags and changed defaults * documentation * revert accidental checkin * print out flags when node_exporter test fails to assist debugging * oops, i introduced some flags from master by mistake * Introduce experimental integrations revamp (#1198) * [dev.multiple-integrations] Enable present integrations by default, deprecate enabled field (#1062) * integrations: default to enabled by default * document deprecation of enabled * pkg/integrations: support *_configs field for integrations (#1130) Creates the basic code to unmarshal integrations from a YAML field called <integration name>_configs, which is a slice of that integration. Note that this is NOT wired up to the integrations manager yet, and trying to run the agent with more than one integration of the same type will likely cause problems. * [dev.multiple-integrations] Prototype new integrations subsystem (#1142) * wip: prototype new integrations subsystem * implement Controller with basic logic for Integration and UpdateIntegration * Implement HTTPIntegration for Controller * decouple controller and subsystem * don't have controller implement integration slightly less smelly now * multiplexer integration * rely on boilerplate for multiplexing for now generics would be nice here * remove multiplex_integration.go Also a little code smelly. Instead of having integrations that run other integrations, I'm going to fall back to having only one controller. 
* introduce Subsystem, unexport Controller start wiring up things to Subsystem * introduce v2 agent integration to use for testing * start wiring metrics integrations * rename Options to Globals call a spade a spade * add subsystem options to globals * remove dead code * metricsutils: calculate self-scraping based on globals * complete HTTP target API * working example with agent integration * appease the linter * don't return an error when context to cancel an integration is closed * once again i am asking the linter to forgive my typos * fix bug where labels from individual targets were getting dropped at the API endpoint * pkg/config: fix broken test * finish unit tests for integrations v2 controller * metricsutil/metricshandler_integration: make job name unique Before this change, the job name would have collided when using multiple instances of the same integration. * ensure that global subsystem labels are injected into targets * integrations/v2: Infer target hostname from SD API host (#1175) * [dev.multiple-integrations] integrations/v2: allow shimming between v1 and v2 integrations. (#1179) * integrations/v2: allow shimming between v1 and v2 integrations. Shimming is done by changing how the integration registration works; a new RegisterDynamic was added that allows for creating Configs at runtime. Here be dragons; this should be removed whenever we no longer have a need for it. * fix lint * pkg/integrations/v2: use "RegisterLegacy" instead of a generic mechanism * fine, I won't add the deprecation notice if it will make the linter sad * pkg/integrations: re-align (#1181) This commit reverts 69ba2ddfa9483cc8ac6e010dd7abccd319580c80 in favor of allowing the new subsystem to handle multiple instances of integrations. This commit also removes the wal_truncate_frequency field from integrations as it is the only field from old integrations that does not have a current counterpart. 
* [dev.multiple-integrations] Hide integrations/v2 behind a feature flag (#1185) * feature flag wip * dynamically switch between integrations v1 and v2 default to v1. * pkg/integrations/versionselector to file in pkg/config * pkg/config: fix defaults for Integrations * pkg/config: use more generic way to unmarshal differently based on flag * add missing godoc comment * more comments * switch to deferred unmarshaling * remove unused Config field * simplify completeUnmarshal * do not perform lazy deferred unmarshaling * enable cadvisor by default * switch to using real feature flag * fix postgres_exporter * Merge main into dev.multiple-integrations (#1184) * Fix typo (#1141) * Traces: Improved pod association in PromSD processor (#1137) * Improve k8s pod association * Add tests * Changelog * typo * Add prom_sd_pod_association * Extend tests for pod associations * Docs for pod association config * Lint fixes * Move to unreleased * Add instrumentation recommendations * Remove uncessary constants * Improve tests * remote config with http(s) provider (#1143) * sample remote config code with http provider * use t.TempDir() in unit test * no need to clean up after T.TempDir() * use NewClientFromConfig and make caller responsible for calling SetDirectory * handle nil HTTPClientConfig * remove blank identifier assignment * pass basic auth command line flags for remote config * address pr nits * add expiremental flag * set loader inline * update changelog * add remote config section in docs * pr comment updates * announce patch releases for cve-2021-41090 (#1152) * Merge patch release to main (#1153) * Add secret type to sensitive values * Break out config tests to their own implementation. Also remove username has a sensitive value. * Update changelog * Fix failing test * Scrub secrets when marshaling instance configs * update for v0.21 * Updated changes from the merge. 
* Remove changelog * Scrub out receivers has ***receivers_scrubber***:null * obscure etcd/consul credentials * Update pkg/traces/config_test.go Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Update pkg/config/config.go * go fmt * Change to using custom object and return <secret> * Fix bad merge * [v0.21.2] toggle config endpoint (#19) * disable /-/config endpoint by default * disable scraping api get endpoint as well * fix new test * add test and rename flag Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Update version to v0.21.2 * Update defaults.go * fix /-/config endpoint * also fix non-pointer config bug * temporarily disable linting for release * fix lint errors Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> * Fix POSTGRES_EXPORTER_DATA_SOURCE_NAME usage for postgres_exporter (#1162) * Fix POSTGRES_EXPORTER_DATA_SOURCE_NAME usage for postgres_exporter A recent change broke the usage of POSTGRES_EXPORTER_DATA_SOURCE_NAME for the postgres_exporter. As the incorrect variable was checked in the if clause, it always raises an error. 
* changelog: keep feature -> enhancement -> bugfix order * postgres_exporter: add regression test Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Fix syntax error in Jsonnet logs helper method (#1174) Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com> * cAdvisor Integration (#1081) * Add cadvisor module * Begin creating common config for cadvisor * Don't export internal state * Finish config options for cadvisor * Set config options, and implement cAdvisor collectors * Linting * Buildflags for cadvisor only in linux * I R LEArN Build Tags * Don't zero value the zero value * Offload sketchy global var manipulation to the integrations Run func * Remove unused collectors * Lint * Create generic stub integration and use it for cadvisor * Lint * Final refactor of cAdvisor config for unsupported platforms. Pared down stub integrations. * L…
…#1500) * RFC-0001: Add status to RFC (#1391) * rfc-0001: add rules for when RFC PRs should be merged * use status field instead of merge to indicating state * Parametrize logs DaemonSet K8s manifests (#1420) * Parametrize logs daemonset K8s manifests * Update CHANGELOG.md * Extend linting configuration file (#1421) * Add depguard linter to reject packages we tend to avoid * Replace golint with revive, since golint is deprecated * Remove interfacer, which is deprecated with no replacement * Add makezero linter to detect misuse of make with append * Add tenv to prefer t.Setenv over os.Setenv in tests * Add whitespace to report unnecessary blank lines * Ignore test files for errcheck In addition to the above, the following changes were made: * Remove settings that just re-set default values, instead pointing to the website to retrieve defaults. * Simplify the errcheck rule to only include functions we actually need to ignore. * update smoke alerts (#1432) * update smoke alerts Signed-off-by: Robbie Lankford <robert.lankford@grafana.com> * filter pod variable with namespace Signed-off-by: Robbie Lankford <robert.lankford@grafana.com> * update cpu high alert duration to 1h Signed-off-by: Robbie Lankford <robert.lankford@grafana.com> * upgrade to loki v2.4.2 (#1422) * upgrade to loki v2.4.2 Signed-off-by: Robbie Lankford <robert.lankford@grafana.com> * update CHANGELOG.md * rename cortex_ metrics to agent_dskit_ * update changelog * cadvisor config Fixes #1281 (#1293) * Fixes #1281 * Update pkg/integrations/cadvisor/common.go Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * gofmt * Add tests for cadvisor config * Lint * simplify test, enhance comments, return nil on run * Default config slices to values expected by upstream cadvisor * Build flags to only run cadvisor tests when docker and network are present Co-authored-by: Robert Fratto <robert.fratto@grafana.com> Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Update install-agent-on-windows.md 
(#1438) * Update install-agent-on-windows.md add quotes to URL silent install doc * Update install-agent-on-windows.md * Update install-agent-on-windows.md * Update install-agent-on-windows.md Fix linting error * add bump-formula-pr workflow jobs (#1382) * add bump-formula-pr workflow jobs Signed-off-by: Robbie Lankford <robert.lankford@grafana.com> * Update .github/workflows/bump-formula-pr.yml * Update .github/workflows/bump-formula-pr.yml * Update .github/workflows/bump-formula-pr.yml * Update .github/workflows/bump-formula-pr.yml * Update .github/workflows/bump-formula-pr.yml * Consulagent sd (#1439) * add consulagent_sd * add consulagent_sd * Dev.dynamic configuration (#1429) * Merge main -> dev.dynamic_configuration (#1270) * Update node_exporter dependency to v1.3.1 (#1228) * Add node_exporter to depcheck * update weaveworks/common dependency * map current release flags and changed defaults * documentation * revert accidental checkin * print out flags when node_exporter test fails to assist debugging * oops, i introduced some flags from master by mistake * Introduce experimental integrations revamp (#1198) * [dev.multiple-integrations] Enable present integrations by default, deprecate enabled field (#1062) * integrations: default to enabled by default * document deprecation of enabled * pkg/integrations: support *_configs field for integrations (#1130) Creates the basic code to unmarshal integrations from a YAML field called <integration name>_configs, which is a slice of that integration. Note that this is NOT wired up to the integrations manager yet, and trying to run the agent with more than one integration of the same type will likely cause problems. 
* [dev.multiple-integrations] Prototype new integrations subsystem (#1142) * wip: prototype new integrations subsystem * implement Controller with basic logic for Integration and UpdateIntegration * Implement HTTPIntegration for Controller * decouple controller and subsystem * don't have controller implement integration slightly less smelly now * multiplexer integration * rely on boilerplate for multiplexing for now generics would be nice here * remove multiplex_integration.go Also a little code smelly. Instead of having integrations that run other integrations, I'm going to fall back to having only one controller. * introduce Subsystem, unexport Controller start wiring up things to Subsystem * introduce v2 agent integration to use for testing * start wiring metrics integrations * rename Options to Globals call a spade a spade * add subsystem options to globals * remove dead code * metricsutils: calculate self-scraping based on globals * complete HTTP target API * working example with agent integration * appease the linter * don't return an error when context to cancel an integration is closed * once again i am asking the linter to forgive my typos * fix bug where labels from individual targets were getting dropped at the API endpoint * pkg/config: fix broken test * finish unit tests for integrations v2 controller * metricsutil/metricshandler_integration: make job name unique Before this change, the job name would have collided when using multiple instances of the same integration. * ensure that global subsystem labels are injected into targets * integrations/v2: Infer target hostname from SD API host (#1175) * [dev.multiple-integrations] integrations/v2: allow shimming between v1 and v2 integrations. (#1179) * integrations/v2: allow shimming between v1 and v2 integrations. Shimming is done by changing how the integration registration works; a new RegisterDynamic was added that allows for creating Configs at runtime. 
Here be dragons; this should be removed whenever we no longer have a need for it. * fix lint * pkg/integrations/v2: use "RegisterLegacy" instead of a generic mechanism * fine, I won't add the deprecation notice if it will make the linter sad * pkg/integrations: re-align (#1181) This commit reverts 69ba2ddfa9483cc8ac6e010dd7abccd319580c80 in favor of allowing the new subsystem to handle multiple instances of integrations. This commit also removes the wal_truncate_frequency field from integrations as it is the only field from old integrations that does not have a current counterpart. * [dev.multiple-integrations] Hide integrations/v2 behind a feature flag (#1185) * feature flag wip * dynamically switch between integrations v1 and v2 default to v1. * pkg/integrations/versionselector to file in pkg/config * pkg/config: fix defaults for Integrations * pkg/config: use more generic way to unmarshal differently based on flag * add missing godoc comment * more comments * switch to deferred unmarshaling * remove unused Config field * simplify completeUnmarshal * do not perform lazy deferred unmarshaling * enable cadvisor by default * switch to using real feature flag * fix postgres_exporter * Merge main into dev.multiple-integrations (#1184) * Fix typo (#1141) * Traces: Improved pod association in PromSD processor (#1137) * Improve k8s pod association * Add tests * Changelog * typo * Add prom_sd_pod_association * Extend tests for pod associations * Docs for pod association config * Lint fixes * Move to unreleased * Add instrumentation recommendations * Remove uncessary constants * Improve tests * remote config with http(s) provider (#1143) * sample remote config code with http provider * use t.TempDir() in unit test * no need to clean up after T.TempDir() * use NewClientFromConfig and make caller responsible for calling SetDirectory * handle nil HTTPClientConfig * remove blank identifier assignment * pass basic auth command line flags for remote config * address pr nits * 
add expiremental flag * set loader inline * update changelog * add remote config section in docs * pr comment updates * announce patch releases for cve-2021-41090 (#1152) * Merge patch release to main (#1153) * Add secret type to sensitive values * Break out config tests to their own implementation. Also remove username has a sensitive value. * Update changelog * Fix failing test * Scrub secrets when marshaling instance configs * update for v0.21 * Updated changes from the merge. * Remove changelog * Scrub out receivers has ***receivers_scrubber***:null * obscure etcd/consul credentials * Update pkg/traces/config_test.go Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Update pkg/config/config.go * go fmt * Change to using custom object and return <secret> * Fix bad merge * [v0.21.2] toggle config endpoint (#19) * disable /-/config endpoint by default * disable scraping api get endpoint as well * fix new test * add test and rename flag Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Update version to v0.21.2 * Update defaults.go * fix /-/config endpoint * also fix non-pointer config bug * temporarily disable linting for release * fix lint errors Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> * Fix POSTGRES_EXPORTER_DATA_SOURCE_NAME usage for postgres_exporter (#1162) * Fix POSTGRES_EXPORTER_DATA_SOURCE_NAME usage for postgres_exporter A recent change broke the usage of POSTGRES_EXPORTER_DATA_SOURCE_NAME for the postgres_exporter. As the incorrect variable was checked in the if clause, it always raises an error. 
* changelog: keep feature -> enhancement -> bugfix order * postgres_exporter: add regression test Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Fix syntax error in Jsonnet logs helper method (#1174) Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com> * cAdvisor Integration (#1081) * Add cadvisor module * Begin creating common config for cadvisor * Don't export internal state * Finish config options for cadvisor * Set config options, and implement cAdvisor collectors * Linting * Buildflags for cadvisor only in linux * I R LEArN Build Tags * Don't zero value the zero value * Offload sketchy global var manipulation to the integrations Run func * Remove unused collectors * Lint * Create generic stub integration and use it for cadvisor * Lint * Final refactor of cAdvisor config for unsupported platforms. Pared down stub integrations. * Lint * Docs for cadvisor config * Update changelog * Update pkg/integrations/stub_integration.go Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Reorder changelog * Instance key clarity * Inclusive naming * Finish name changes Keep default disable metric list in sync with upstream Idiomatic golang * Hardcode disabled metrics for cadvisor Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Remove log-level flag from systemd unit file (#1177) * Upgrade to OTel v0.40.0 (#1176) * Upgrade to OTel v0.40.0 * Changelog * Add factories check * go mod tidy * config/features: create package to standardize experimental features (#1170) * config/features: create package to standardize experiemental features This commit introduces a new package, pkg/config/features, which allows defining a set of features and validating whether flags associated with those features are allowed to be set. 
Closes #1163 * update documentation (also s/enabled-features/enable-features) * Fix typo * Update pkg/config/features/features.go Co-authored-by: Robert Lankford <rlankfo@gmail.com> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * enable cadvisor by default * switch to using real feature flag * fix postgres_exporter Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * Revert "Merge main into dev.multiple-integrations (#1184)" (#1189) This reverts commit ad76ec54f9a781fc83d3e7001808c887f37833ff. * [dev.multiple-integrations] Revert breaking changes to existing integrations (#1191) * revert breaking changes to integrations v1 This commit reverts #1062 in favor of making breaking changes directly in integrations-next instead. The part of #1181 to remove `wal_truncate_frequency` has also been reverted. As part of this change, the enabled field is removed from the v2 common metrics configs, and v2 integrations can no longer be disabled. v2 integrations can only be disabled by removing them from the YAML. 
* integrations/v2: remove stale reference to ErrDisabled (fix typo too) * integrations/v2: bring in common config decoupling * [dev.multiple-integrations] Introduce autoscraper (#1195) * pkg/integrations/v2: introduce self-scraping * linting * [dev.multiple-integrations] Multiple instances of integrations (#1196) * multiple instances of integrations opt in relevant v1 integrations into supporting multiple instances * shims should check for instance key override * Document integrations-next (#1197) * document integrations-next * remove json tags since they make markdown unhappy * changelog * s/Run/RunIntegration * remove stale comment about integrations.controller purpose * create dedicated run method for instanceScraper * s/expoter/exporter/g * Document why an autoscrape.Scraper manages a set of per-instance scrapers * spell out prerequisite instead of pre-req * use go.uber.org/atomic to make the code a little easier to follow * remove started callback for running integration * use smaller interface for autoscrape Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * Fix panic when using 'stdout' in automatic logging (#1233) * integrations-next: fix bug where v2 integrations were not being strictly unmarshaled (#1235) * Remove jsonnet vendor folders (#1222) * remove jsonnet vendor This adds all vendor folders into .gitignore and removes cached vendor files from the repository. 
Closes #1221 * Update scripts and instructions for jsonnet vendor removal * `make example-dashboards` will now also run `jb install` * k3d environment instructions now include `jb install` * smoke-test.bash will now run `jb install` prior to `tk apply` * Fix link to k3d example in DEVELOPERS.md (#1242) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Fix node_exporter upgrade docs (#1239) * Fix panic in automatic logging with stdout backend (#1243) * pkg/util: support custom yaml.Unmarshaler implementations for util.UnmarshalYAMLMerged (#1244) It's common for config types to implement yaml.Unmarshaler for: * Applying defaults * Applying extra logic post-unmarshal If these config types were unmarshaled through util.UnmarshalYAMLMerged, the yaml.Unmarshaler implementation would never complete successfully, preventing the post-unmarshal logic from running. This issue was introduced in #1192, but went unnoticed until #1228 implemented yaml.Unmarshaler to perform field migrations. #1240 reported the issue. This commit fixes the bug by performing a second non-strict unmarshal to ensure that all input values unmarshal successfully, with the exception of unmarshal errors unrelated to unrecognized field names. This is hacky, but it's worth noting that util.UnmarshalYAMLMerged is a temporary workaround needed for the integrations-next migration, and will eventually be removed. 
* Update k3d example grafana/grafonnet-lib version (#1246) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Create an e2e framework with support for running tests against k8s (#1234) * e2e: create an e2e framework with support for running tests against a k3d cluster * add new E2E drone job * E2E tests should pass when doing a release * sign drone.yml again * move e2e lint to different step that has golangci-lint installed * upgrade golangci-lint and go for e2e test * e2e: add gcc * E2E: install build-essential to get a working full gcc env * :( * e2e: support running from inside of docker * fix lint error * address review feedback * Operator: fix bug where /-/ready and /-/healthy always returned 404 (#1252) * operator: fix bug where /-/ready and /-/healthy always returned 404 controller-runtime must have at least one ready/healthy check for the endpoints to exist * fix lint error, use healthz.Ping * Make scraping-svc use the new `metrics:` key (#1259) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * update prometheus dependency (#1260) * corrected typo (#1265) * Use RELEASE_TAG to choose between `:main` and `:latest` docker tags (#1264) * Use RELEASE_TAG to choose between `:main` and `:latest` docker tags Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Use :main tag for images in smoke test Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Set IMAGE_BRANCH_TAG env var in drone and actions pipelines Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Remove quotes from Makefile variable Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Remove force_release action Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * prepare for v0.22.0 release (#1266) * prepare for v0.22.0 release * remove E2E pipeline Co-authored-by: Robert Fratto <robert.fratto@grafana.com> Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: 
Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> Co-authored-by: Paschalis Tsilias <tpaschalis@users.noreply.github.com> Co-authored-by: Patrick Koenig <pkoenig10@gmail.com> Co-authored-by: DataPoints <langer.markus@gmail.com> * Main merge dynamic (#1305) * Update node_exporter dependency to v1.3.1 (#1228) * Add node_exporter to depcheck * update weaveworks/common dependency * map current release flags and changed defaults * documentation * revert accidental checkin * print out flags when node_exporter test fails to assist debugging * oops, i introduced some flags from master by mistake * Introduce experimental integrations revamp (#1198) * [dev.multiple-integrations] Enable present integrations by default, deprecate enabled field (#1062) * integrations: default to enabled by default * document deprecation of enabled * pkg/integrations: support *_configs field for integrations (#1130) Creates the basic code to unmarshal integrations from a YAML field called <integration name>_configs, which is a slice of that integration. Note that this is NOT wired up to the integrations manager yet, and trying to run the agent with more than one integration of the same type will likely cause problems. 
* [dev.multiple-integrations] Prototype new integrations subsystem (#1142) * wip: prototype new integrations subsystem * implement Controller with basic logic for Integration and UpdateIntegration * Implement HTTPIntegration for Controller * decouple controller and subsystem * don't have controller implement integration slightly less smelly now * multiplexer integration * rely on boilerplate for multiplexing for now generics would be nice here * remove multiplex_integration.go Also a little code smelly. Instead of having integrations that run other integrations, I'm going to fall back to having only one controller. * introduce Subsystem, unexport Controller start wiring up things to Subsystem * introduce v2 agent integration to use for testing * start wiring metrics integrations * rename Options to Globals call a spade a spade * add subsystem options to globals * remove dead code * metricsutils: calculate self-scraping based on globals * complete HTTP target API * working example with agent integration * appease the linter * don't return an error when context to cancel an integration is closed * once again i am asking the linter to forgive my typos * fix bug where labels from individual targets were getting dropped at the API endpoint * pkg/config: fix broken test * finish unit tests for integrations v2 controller * metricsutil/metricshandler_integration: make job name unique Before this change, the job name would have collided when using multiple instances of the same integration. * ensure that global subsystem labels are injected into targets * integrations/v2: Infer target hostname from SD API host (#1175) * [dev.multiple-integrations] integrations/v2: allow shimming between v1 and v2 integrations. (#1179) * integrations/v2: allow shimming between v1 and v2 integrations. Shimming is done by changing how the integration registration works; a new RegisterDynamic was added that allows for creating Configs at runtime. 
Here be dragons; this should be removed whenever we no longer have a need for it. * fix lint * pkg/integrations/v2: use "RegisterLegacy" instead of a generic mechanism * fine, I won't add the deprecation notice if it will make the linter sad * pkg/integrations: re-align (#1181) This commit reverts 69ba2ddfa9483cc8ac6e010dd7abccd319580c80 in favor of allowing the new subsystem to handle multiple instances of integrations. This commit also removes the wal_truncate_frequency field from integrations as it is the only field from old integrations that does not have a current counterpart. * [dev.multiple-integrations] Hide integrations/v2 behind a feature flag (#1185) * feature flag wip * dynamically switch between integrations v1 and v2 default to v1. * pkg/integrations/versionselector to file in pkg/config * pkg/config: fix defaults for Integrations * pkg/config: use more generic way to unmarshal differently based on flag * add missing godoc comment * more comments * switch to deferred unmarshaling * remove unused Config field * simplify completeUnmarshal * do not perform lazy deferred unmarshaling * enable cadvisor by default * switch to using real feature flag * fix postgres_exporter * Merge main into dev.multiple-integrations (#1184) * Fix typo (#1141) * Traces: Improved pod association in PromSD processor (#1137) * Improve k8s pod association * Add tests * Changelog * typo * Add prom_sd_pod_association * Extend tests for pod associations * Docs for pod association config * Lint fixes * Move to unreleased * Add instrumentation recommendations * Remove unnecessary constants * Improve tests * remote config with http(s) provider (#1143) * sample remote config code with http provider * use t.TempDir() in unit test * no need to clean up after T.TempDir() * use NewClientFromConfig and make caller responsible for calling SetDirectory * handle nil HTTPClientConfig * remove blank identifier assignment * pass basic auth command line flags for remote config * address pr nits * 
add experimental flag * set loader inline * update changelog * add remote config section in docs * pr comment updates * announce patch releases for cve-2021-41090 (#1152) * Merge patch release to main (#1153) * Add secret type to sensitive values * Break out config tests to their own implementation. Also remove username as a sensitive value. * Update changelog * Fix failing test * Scrub secrets when marshaling instance configs * update for v0.21 * Updated changes from the merge. * Remove changelog * Scrub out receivers as ***receivers_scrubber***:null * obscure etcd/consul credentials * Update pkg/traces/config_test.go Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Update pkg/config/config.go * go fmt * Change to using custom object and return <secret> * Fix bad merge * [v0.21.2] toggle config endpoint (#19) * disable /-/config endpoint by default * disable scraping api get endpoint as well * fix new test * add test and rename flag Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Update version to v0.21.2 * Update defaults.go * fix /-/config endpoint * also fix non-pointer config bug * temporarily disable linting for release * fix lint errors Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> * Fix POSTGRES_EXPORTER_DATA_SOURCE_NAME usage for postgres_exporter (#1162) * Fix POSTGRES_EXPORTER_DATA_SOURCE_NAME usage for postgres_exporter A recent change broke the usage of POSTGRES_EXPORTER_DATA_SOURCE_NAME for the postgres_exporter. As the incorrect variable was checked in the if clause, it always raises an error. 
* changelog: keep feature -> enhancement -> bugfix order * postgres_exporter: add regression test Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Fix syntax error in Jsonnet logs helper method (#1174) Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com> * cAdvisor Integration (#1081) * Add cadvisor module * Begin creating common config for cadvisor * Don't export internal state * Finish config options for cadvisor * Set config options, and implement cAdvisor collectors * Linting * Buildflags for cadvisor only in linux * I R LEArN Build Tags * Don't zero value the zero value * Offload sketchy global var manipulation to the integrations Run func * Remove unused collectors * Lint * Create generic stub integration and use it for cadvisor * Lint * Final refactor of cAdvisor config for unsupported platforms. Pared down stub integrations. * Lint * Docs for cadvisor config * Update changelog * Update pkg/integrations/stub_integration.go Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Reorder changelog * Instance key clarity * Inclusive naming * Finish name changes Keep default disable metric list in sync with upstream Idiomatic golang * Hardcode disabled metrics for cadvisor Co-authored-by: Robert Fratto <robert.fratto@grafana.com> * Remove log-level flag from systemd unit file (#1177) * Upgrade to OTel v0.40.0 (#1176) * Upgrade to OTel v0.40.0 * Changelog * Add factories check * go mod tidy * config/features: create package to standardize experimental features (#1170) * config/features: create package to standardize experimental features This commit introduces a new package, pkg/config/features, which allows defining a set of features and validating whether flags associated with those features are allowed to be set. 
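The idea described for pkg/config/features (define a set of features, then check whether flags tied to those features may be set) can be illustrated with a minimal sketch that uses only the standard library `flag` package. This is not the package's actual API: the `Feature` and `Dependency` types, the `Validate` and `check` helpers, and the `example-feature` and `example.flag` names are all hypothetical. Only the `enable-features` flag name comes from the commit message.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// Feature names an experimental feature (hypothetical type for this sketch).
type Feature string

// Dependency ties a flag name to the feature that must be enabled
// before that flag may be set (hypothetical type for this sketch).
type Dependency struct {
	Flag    string
	Feature Feature
}

// enabledSet turns a comma-separated feature list into a set.
func enabledSet(csv string) map[Feature]bool {
	set := map[Feature]bool{}
	for _, name := range strings.Split(csv, ",") {
		if name = strings.TrimSpace(name); name != "" {
			set[Feature(name)] = true
		}
	}
	return set
}

// Validate returns an error for the first explicitly set flag whose
// feature is not in the enabled set.
func Validate(fs *flag.FlagSet, enabled map[Feature]bool, deps []Dependency) error {
	var err error
	// Visit only walks flags that were actually set on the command line.
	fs.Visit(func(f *flag.Flag) {
		for _, d := range deps {
			if err == nil && d.Flag == f.Name && !enabled[d.Feature] {
				err = fmt.Errorf("flag %q requires feature %q to be enabled", f.Name, d.Feature)
			}
		}
	})
	return err
}

// check parses args and validates them against one hypothetical dependency.
func check(args []string) error {
	fs := flag.NewFlagSet("agent", flag.ContinueOnError)
	features := fs.String("enable-features", "", "comma-separated list of experimental features")
	fs.String("example.flag", "", "hypothetical flag gated behind example-feature")
	if err := fs.Parse(args); err != nil {
		return err
	}
	deps := []Dependency{{Flag: "example.flag", Feature: "example-feature"}}
	return Validate(fs, enabledSet(*features), deps)
}

func main() {
	// Gated flag set without enabling its feature: rejected.
	fmt.Println(check([]string{"-example.flag=on"}))
	// Same flag with the feature enabled: accepted.
	fmt.Println(check([]string{"-enable-features=example-feature", "-example.flag=on"}))
}
```

Because `Visit` skips flags left at their defaults, a gated flag only triggers an error when a user sets it explicitly.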
Closes #1163 * update documentation (also s/enabled-features/enable-features) * Fix typo * Update pkg/config/features/features.go Co-authored-by: Robert Lankford <rlankfo@gmail.com> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * enable cadvisor by default * switch to using real feature flag * fix postgres_exporter Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * Revert "Merge main into dev.multiple-integrations (#1184)" (#1189) This reverts commit ad76ec54f9a781fc83d3e7001808c887f37833ff. * [dev.multiple-integrations] Revert breaking changes to existing integrations (#1191) * revert breaking changes to integrations v1 This commit reverts #1062 in favor of making breaking changes directly in integrations-next instead. The part of #1181 to remove `wal_truncate_frequency` has also been reverted. As part of this change, the enabled field is removed from the v2 common metrics configs, and v2 integrations can no longer be disabled. v2 integrations can only be disabled by removing them from the YAML. 
* integrations/v2: remove stale reference to ErrDisabled (fix typo too) * integrations/v2: bring in common config decoupling * [dev.multiple-integrations] Introduce autoscraper (#1195) * pkg/integrations/v2: introduce self-scraping * linting * [dev.multiple-integrations] Multiple instances of integrations (#1196) * multiple instances of integrations opt in relevant v1 integrations into supporting multiple instances * shims should check for instance key override * Document integrations-next (#1197) * document integrations-next * remove json tags since they make markdown unhappy * changelog * s/Run/RunIntegration * remove stale comment about integrations.controller purpose * create dedicated run method for instanceScraper * s/expoter/exporter/g * Document why an autoscrape.Scraper manages a set of per-instance scrapers * spell out prerequisite instead of pre-req * use go.uber.org/atomic to make the code a little easier to follow * remove started callback for running integration * use smaller interface for autoscrape Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: Matt Durham <mattdurham@ppog.org> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> * Fix panic when using 'stdout' in automatic logging (#1233) * integrations-next: fix bug where v2 integrations were not being strictly unmarshaled (#1235) * Remove jsonnet vendor folders (#1222) * remove jsonnet vendor This adds all vendor folders into .gitignore and removes cached vendor files from the repository. 
Closes #1221 * Update scripts and instructions for jsonnet vendor removal * `make example-dashboards` will now also run `jb install` * k3d environment instructions now include `jb install` * smoke-test.bash will now run `jb install` prior to `tk apply` * Fix link to k3d example in DEVELOPERS.md (#1242) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Fix node_exporter upgrade docs (#1239) * Fix panic in automatic logging with stdout backend (#1243) * pkg/util: support custom yaml.Unmarshaler implementations for util.UnmarshalYAMLMerged (#1244) It's common for config types to implement yaml.Unmarshaler for: * Applying defaults * Applying extra logic post-unmarshal If these config types were unmarshaled through util.UnmarshalYAMLMerged, the yaml.Unmarshaler implementation would never complete successfully, preventing the post-unmarshal logic from running. This issue was introduced in #1192, but went unnoticed until #1228 implemented yaml.Unmarshaler to perform field migrations. #1240 reported the issue. This commit fixes the bug by performing a second non-strict unmarshal to ensure that all input values unmarshal successfully, with the exception of unmarshal errors unrelated to unrecognized field names. This is hacky, but it's worth noting that util.UnmarshalYAMLMerged is a temporary workaround needed for the integrations-next migration, and will eventually be removed. 
* Update k3d example grafana/grafonnet-lib version (#1246) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Create an e2e framework with support for running tests against k8s (#1234) * e2e: create an e2e framework with support for running tests against a k3d cluster * add new E2E drone job * E2E tests should pass when doing a release * sign drone.yml again * move e2e lint to different step that has golangci-lint installed * upgrade golangci-lint and go for e2e test * e2e: add gcc * E2E: install build-essential to get a working full gcc env * :( * e2e: support running from inside of docker * fix lint error * address review feedback * Operator: fix bug where /-/ready and /-/healthy always returned 404 (#1252) * operator: fix bug where /-/ready and /-/healthy always returned 404 controller-runtime must have at least one ready/healthy check for the endpoints to exist * fix lint error, use healthz.Ping * Make scraping-svc use the new `metrics:` key (#1259) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * update prometheus dependency (#1260) * corrected typo (#1265) * Use RELEASE_TAG to choose between `:main` and `:latest` docker tags (#1264) * Use RELEASE_TAG to choose between `:main` and `:latest` docker tags Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Use :main tag for images in smoke test Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Set IMAGE_BRANCH_TAG env var in drone and actions pipelines Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Remove quotes from Makefile variable Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Remove force_release action Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * prepare for v0.22.0 release (#1266) * prepare for v0.22.0 release * remove E2E pipeline * Add basic testing framework for operator (#1268) * remove dedicated go.mod for e2e/ * move e2e/k8s to pkg/util/k8s * Migrate operator tests to 
pkg/util/k8s * remove dedicated e2e tests * allow skipping TestCluster in pkg/util/k8s * remove e2e/ * fix bad merge * fix order of make env args for windows * actually declare referenced docker volume * introduce pkg/util/subset for asserting subset of objects * refactor operator so it's testable * define basic integration test for operator * fix lint errors * fix invalid address in operator test config * Update release-note.md (#1267) * Set scrape User-Agent header during init (#1274) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Upgrade to Go 1.17 (#1278) * Upgrade to 1.17.6 in go.mod and Dockerfiles * Update CHANGELOG.md to mention the update * Update Go version in drone/actions pipelines * Update go.mod, go.sum files via * Re-sign drone.yml * Remove leading newline causing drone build to fail * Bump golangci-lint image to a version using Go 1.17 * Re-attempt to solve linter issue with new golangci-lint image * Remove suffix of exclude rules * Clean previous Go version before unpacking Go 1.17 * Also clean up previous Go versions in other steps * fix typo (#1284) * Use custom Go version in agent-operator Dockerfile (#1286) Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * pkg/operator: refactor resource hierarchy discovery (#1271) * pkg/operator: refactor resource hierarchy discovery This commit moves common logic related to discovering the resource hierarchy to pkg/operator/hierarchy. This new package requires less boilerplate, which the reconciler is updated to take advantage of. 
* remove unused code * test construction of resource hierarchy * add missing build constraints * small extra cleanup to use pointer package * review feedback * update agent-build-image for go 1.17 (#1287) (also use a consistent base image tag instead of latest) * Skip non-ready entries when listing instances (#1289) * Skip non-ready instances in LoadInstances() Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add changelog entry Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Fix panic in prom_sd_processor when address is empty (#1279) * Fix panic in prom_sd_processor when address is empty * Fix panic in prom_sd_processor when address is empty * Fix docs * Add test case * Lint * Move to unreleased * Operator: generate proxy_url for remote_write (#1298) * operator: generate proxy_url for remote_write * fix weird indentation in test Co-authored-by: Robert Fratto <robert.fratto@grafana.com> Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> Co-authored-by: Paschalis Tsilias <tpaschalis@users.noreply.github.com> Co-authored-by: Patrick Koenig <pkoenig10@gmail.com> Co-authored-by: DataPoints <langer.markus@gmail.com> Co-authored-by: Alex <52292902+alexrudd2@users.noreply.github.com> Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Main merge dynamic (#1307) * Update node_exporter dependency to v1.3.1 (#1228) * Add node_exporter to depcheck * update weaveworks/common dependency * map current release flags and changed defaults * documentation * revert accidental checkin 
* print out flags when node_exporter test fails to assist debugging * oops, i introduced some flags from master by mistake * Introduce experimental integrations revamp (#1198) * [dev.multiple-integrations] Enable present integrations by default, deprecate enabled field (#1062) * integrations: default to enabled by default * document deprecation of enabled * pkg/integrations: support *_configs field for integrations (#1130) Creates the basic code to unmarshal integrations from a YAML field called <integration name>_configs, which is a slice of that integration. Note that this is NOT wired up to the integrations manager yet, and trying to run the agent with more than one integration of the same type will likely cause problems. * [dev.multiple-integrations] Prototype new integrations subsystem (#1142) * wip: prototype new integrations subsystem * implement Controller with basic logic for Integration and UpdateIntegration * Implement HTTPIntegration for Controller * decouple controller and subsystem * don't have controller implement integration slightly less smelly now * multiplexer integration * rely on boilerplate for multiplexing for now generics would be nice here * remove multiplex_integration.go Also a little code smelly. Instead of having integrations that run other integrations, I'm going to fall back to having only one controller. 
* remove unused code * test construction of resource hierarchy * add missing build constraints * small extra cleanup to use pointer package * review feedback * update agent-build-image for go 1.17 (#1287) (also use a consistent base image tag instead of latest) * Skip non-ready entries when listing instances (#1289) * Skip non-ready instances in LoadInstances() Signed-off-by: Paschalis Tsilias <paschalis.tsilias@grafana.com> * Add changelog entry Co-authored-by: Robert Fratto <robertfratto@gmail.com> * Fix panic in prom_sd_processor when address is empty (#1279) * Fix panic in prom_sd_processor when address is empty * Fix panic in prom_sd_processor when address is empty * Fix docs * Add test case * Lint * Move to unreleased * Operator: generate proxy_url for remote_write (#1298) * operator: generate proxy_url for remote_write * fix weird indentation in test Co-authored-by: Robert Fratto <robert.fratto@grafana.com> Co-authored-by: Ursula Kallio <73951760+osg-grafana@users.noreply.github.com> Co-authored-by: Mario <mariorvinas@gmail.com> Co-authored-by: Robert Lankford <robert.lankford@grafana.com> Co-authored-by: f11r <fiete.gruenter@rwth-aachen.de> Co-authored-by: f11r <f11r@users.noreply.github.com> Co-authored-by: Nick Pillitteri <56quarters@users.noreply.github.com> Co-authored-by: Ryan Geyer <me@ryangeyer.com> Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de> Co-authored-by: Robert Lankford <rlankfo@gmail.com> Co-authored-by: Paschalis Tsilias <tpaschalis@users.noreply.github.com> Co-authored-by: Patrick Koenig <pkoenig10@gmail.com> Co-authored-by: DataPoints <langer.markus@gmail.com> Co-authored-by: Alex <52292902+alexrudd2@users.noreply.github.com> Co-authored-by: Robert Fratto <robertfratto@gmail.com> * [RFC] Dynamic Documentation (#1308) * Documentation and feature flag support. Part 1 of many. 
* Fix linting * Documentation * More documentation * MOAR documentation * Update overall readme * fix typo * Update docs/configuration/dynamic-config.md Co-authored-by: Robert Lankford <robert.lankford@grafana.com> * fix typos and add additional comments * Feedback from PR * Allow overrides of documentation Co-authored-by: Robert Lankford <robert.lankford@grafana.com> * Code to support dynamic configuration. (#1360) * Code to support dynamic configuration. * Fix linting errors * Fix issue with examples and add template parse * Fix windows issues * Simplify tests * Respond to PR feedback * Simplify tests and remove setting default * Move to unexported configshim * Export configshims members for testing * Simplify singleton checking * Add error around expand var * sever -> server * Better error code * Fix error when unmarshalling yaml which allowed unexporting fields * Switch to using require.noerror instead of assert. * Split tests into two sections and remove redundant tests * Verbiage cleanup and var renaming * PR feedback * PR feedback * Move singleton check to the controller instead * Cleanup the read file code to MUCH more compact code. 
* Remove EOF * PR feedback from session with rfratto, lots of changes to simplify the code * Fix misc feedback from PR * Dyn main merge (#1423) * Update node_exporter dependency to v1.3.1 (#1228) * Add node_exporter to depcheck * update weaveworks/common dependency * map current release flags and changed defaults * documentation * revert accidental checkin * print out flags when node_exporter test fails to assist debugging * oops, i introduced some flags from master by mistake * Introduce experimental integrations revamp (#1198) * [dev.multiple-integrations] Enable present integrations by default, deprecate enabled field (#1062) * integrations: default to enabled by default * document deprecation of enabled * pkg/integrations: support *_configs field for integrations (#1130) Creates the basic code to unmarshal integrations from a YAML field called <integration name>_configs, which is a slice of that integration. Note that this is NOT wired up to the integrations manager yet, and trying to run the agent with more than one integration of the same type will likely cause problems. * [dev.multiple-integrations] Prototype new integrations subsystem (#1142) * wip: prototype new integrations subsystem * implement Controller with basic logic for Integration and UpdateIntegration * Implement HTTPIntegration for Controller * decouple controller and subsystem * don't have controller implement integration slightly less smelly now * multiplexer integration * rely on boilerplate for multiplexing for now generics would be nice here * remove multiplex_integration.go Also a little code smelly. Instead of having integrations that run other integrations, I'm going to fall back to having only one controller. 
* introduce Subsystem, unexport Controller start wiring up things to Subsystem * introduce v2 agent integration to use for testing * start wiring metrics integrations * rename Options to Globals call a spade a spade * add subsystem options to globals * remove dead code * metricsutils: calculate self-scraping based on globals * complete HTTP target API * working example with agent integration * appease the linter * don't return an error when context to cancel an integration is closed * once again i am asking the linter to forgive my typos * fix bug where labels from individual targets were getting dropped at the API endpoint * pkg/config: fix broken test * finish unit tests for integrations v2 controller * metricsutil/metricshandler_integration: make job name unique Before this change, the job name would have collided when using multiple instances of the same integration. * ensure that global subsystem labels are injected into targets * integrations/v2: Infer target hostname from SD API host (#1175) * [dev.multiple-integrations] integrations/v2: allow shimming between v1 and v2 integrations. (#1179) * integrations/v2: allow shimming between v1 and v2 integrations. Shimming is done by changing how the integration registration works; a new RegisterDynamic was added that allows for creating Configs at runtime. Here be dragons; this should be removed whenever we no longer have a need for it. * fix lint * pkg/integrations/v2: use "RegisterLegacy" instead of a generic mechanism * fine, I won't add the deprecation notice if it will make the linter sad * pkg/integrations: re-align (#1181) This commit reverts 69ba2ddfa9483cc8ac6e010dd7abccd319580c80 in favor of allowing the new subsystem to handle multiple instances of integrations. This commit also removes the wal_truncate_frequency field from integrations as it is the only field from old integrations that does not have a current counterpart. 
* [dev.multiple-integrations] Hide integrations/v2 behind a feature flag (#1185) * feature flag wip * dynamically switch between integrations v1 and v2 default to v1. * pkg/integrations/versionselector to file in pkg/config * pkg/config: fix defaults for Integrations * pkg/config: use more generic way to unmarshal differently based on flag * add missing godoc comment * more comments * switch to deferred unmarshaling * remove unused Config field * simplify completeUnmarshal * do not perform lazy deferred unmarshaling * enable cadvisor by default * switch to using real feature flag * fix postgres_exporter * Merge main into dev.multiple-integrations (#1184) * Fix typo (#1141) * Traces: Improved pod association in PromSD processor (#1137) * Improve k8s pod association * Add tests * Changelog * typo * Add prom_sd_pod_association * Extend tests for pod associations * Docs for pod association config * Lint fixes * Move to unreleased * Add instrumentation recommendations * Remove uncessary constants * Improve tests * remote config with http(s) provider (#1143) * sample remote config code with http provider * use t.TempDir() in unit test * no need to clean up after T.TempDir() * use NewClientFromConfig and make caller responsible for calling SetDi…
PR Description
This PR adds a simple remote configuration framework and implements an http(s) provider. More providers to come in following PRs.
Which issue(s) this PR fixes
#1121 (partially)
Notes to the Reviewer
Currently the http remote provider supports basic auth, using either a password or a password file. To enable the feature, pass the -experiment.config-urls.enable flag when starting the agent.
The following command-line flags have been added for auth for the http/https provider:
-config.url.basic-auth-user
-config.url.basic-auth-password-file
PR Checklist