Component fails to retrieve collector rules for logging >= release-5.5 #69

Closed
anothertobi opened this issue Nov 9, 2022 · 3 comments · Fixed by #72
Labels
bug Something isn't working

Comments

@anothertobi
Contributor

Compiling the component fails when it is configured to retrieve alert rules for release-5.5 or later.

This happens because upstream moved the rules file to a different folder for release-5.5 and later: from https://raw.githubusercontent.com/openshift/cluster-logging-operator/${openshift4_logging:alerts}/files/fluentd/fluentd_prometheus_alerts.yaml to https://raw.githubusercontent.com/openshift/cluster-logging-operator/${openshift4_logging:alerts}/files/collector/fluentd_prometheus_alerts.yaml.
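
To illustrate the path change, here is a minimal Python sketch that builds the expected URL for a given release branch. It is not the component's actual code: the function name `alerts_url` and the assumption that the ref always has the form `release-X.Y` are hypothetical; only the two folder names and the 5.5 cutoff come from this issue.

```python
import re

# Base URL taken from the issue description.
BASE = "https://raw.githubusercontent.com/openshift/cluster-logging-operator"


def alerts_url(ref: str) -> str:
    """Return the raw URL of fluentd_prometheus_alerts.yaml for a release branch.

    Hypothetical helper: only refs of the form "release-X.Y" are handled.
    """
    match = re.fullmatch(r"release-(\d+)\.(\d+)", ref)
    if match is None:
        # Commit IDs and branches such as master carry no version information.
        raise ValueError(f"not a release branch: {ref!r}")
    version = (int(match.group(1)), int(match.group(2)))
    # Per this issue, the file lives in files/collector/ from release-5.5 on.
    folder = "collector" if version >= (5, 5) else "fluentd"
    return f"{BASE}/{ref}/files/{folder}/fluentd_prometheus_alerts.yaml"
```

Comparing the version as a tuple of integers rather than as a string keeps a ref such as `release-5.10` on the newer path.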

Steps to Reproduce the Problem

  1. Set openshift4_logging.alerts: "release-5.4"
  2. Compile -> works
  3. Set openshift4_logging.alerts: "release-5.5"
  4. Compile -> fails

Actual Behavior

Unknown (Non-Kapitan) Error occurred
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
                    ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/kapitan/dependency_manager/base.py", line 159, in fetch_http_dependency
    content_type = fetch_http_source(source, cached_source_path, item_type)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/kapitan/dependency_manager/base.py", line 207, in fetch_http_source
    content, content_type = make_request(source)
                            ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/kapitan/utils.py", line 478, in make_request
    r.raise_for_status()
  File "/usr/local/lib/python3.11/site-packages/requests/models.py", line 953, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://raw.githubusercontent.com/openshift/cluster-logging-operator/master/files/fluentd/fluentd_prometheus_alerts.yaml
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/kapitan/targets.py", line 113, in compile_targets
    fetch_dependencies(
  File "/usr/local/lib/python3.11/site-packages/kapitan/dependency_manager/base.py", line 88, in fetch_dependencies
    [p.get() for p in pool.imap_unordered(http_worker, http_deps.items()) if p]
  File "/usr/local/lib/python3.11/site-packages/kapitan/dependency_manager/base.py", line 88, in <listcomp>
    [p.get() for p in pool.imap_unordered(http_worker, http_deps.items()) if p]
  File "/usr/local/lib/python3.11/multiprocessing/pool.py", line 873, in next
    raise value
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://raw.githubusercontent.com/openshift/cluster-logging-operator/master/files/fluentd/fluentd_prometheus_alerts.yaml
404 Client Error: Not Found for url: https://raw.githubusercontent.com/openshift/cluster-logging-operator/master/files/fluentd/fluentd_prometheus_alerts.yaml

Expected Behavior

Successful compile

@anothertobi anothertobi added the bug Something isn't working label Nov 9, 2022
@anothertobi
Contributor Author

Upstream is actively changing how it manages the Prometheus rules.

On the release-5.5 branch, the file was moved from files/fluentd/fluentd_prometheus_alerts.yaml to files/collector/fluentd_prometheus_alerts.yaml as part of openshift/cluster-logging-operator#1726. On master, the rules have been moved again, this time into a Go constant: openshift/cluster-logging-operator#1732

@simu
Member

simu commented Nov 9, 2022

Additionally, since we switched to using a Git commit ID as the default value for the alerts parameter, we can't implement the usual lookup map indexed by release as a quick fix. As I noted in #47, we'll have to completely reconsider how we want to manage logging collector alerts going forward.
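
Because a commit ID carries no release number, one conceivable alternative to a lookup map is to probe the known locations in order. The sketch below only illustrates that idea and is not what the maintainers implemented: `find_alerts_url` and the injected `exists` callable are hypothetical, and in practice `exists` would be something like an HTTP HEAD request.

```python
from typing import Callable, Optional

# Base URL and folder names taken from this issue.
BASE = "https://raw.githubusercontent.com/openshift/cluster-logging-operator"
CANDIDATE_FOLDERS = ("collector", "fluentd")  # newest location first


def find_alerts_url(ref: str, exists: Callable[[str], bool]) -> Optional[str]:
    """Return the first candidate URL for which exists(url) is true, else None.

    ref may be a branch name or a Git commit ID; exists is a caller-supplied
    check (hypothetical here) that reports whether a URL can be fetched.
    """
    for folder in CANDIDATE_FOLDERS:
        url = f"{BASE}/{ref}/files/{folder}/fluentd_prometheus_alerts.yaml"
        if exists(url):
            return url
    # Neither location has the file, e.g. once the rules live in a Go constant.
    return None
```

Passing the existence check in as a parameter keeps the lookup logic testable without network access.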

@simu
Member

simu commented Nov 11, 2022

We'll handle the change to having the alert rules as a Go constant in #74; it's not immediately relevant for OpenShift logging release 5.5.
