OSDOCS-3917:Installing and Configuring the Network Observability operator #53263

Merged: 1 commit, Jan 5, 2023

Conversation

skrthomas
Contributor

@skrthomas skrthomas commented Nov 28, 2022

Contains documentation for the following Network Observability topic areas:

- Installing Loki for NOO, installing NOO, and uninstalling
- Configuring
- API Reference

Version(s):
4.12 only

Issue:
https://issues.redhat.com/browse/OSDOCS-3917

Link to docs preview:
https://53263--docspreview.netlify.app/openshift-enterprise/latest/networking/network_observability/installing-operators.html

QE review:
  • QE has approved this change.

Additional information:

@openshift-ci openshift-ci bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Nov 28, 2022
@openshift-ci openshift-ci bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Nov 29, 2022
@skrthomas skrthomas force-pushed the OSDOCS-3917 branch 4 times, most recently from b26040e to 7998664 Compare November 29, 2022 20:00
@ocpdocs-previewbot

ocpdocs-previewbot commented Nov 29, 2022

🤖 Updated build preview is available at:
https://53263--docspreview.netlify.app

Build log: https://circleci.com/gh/ocpdocs-previewbot/openshift-docs/6245

@skrthomas skrthomas added this to the Planned for 4.12 GA milestone Nov 29, 2022
@skrthomas skrthomas added branch/enterprise-4.12 do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Nov 29, 2022
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 29, 2022
@skrthomas skrthomas force-pushed the OSDOCS-3917 branch 9 times, most recently from 004c14c to 9b5a08d Compare November 29, 2022 23:44
@skrthomas skrthomas changed the title OSDOCS-3917:Installing the Loki operator OSDOCS-3917:Installing the Network Observability operator Nov 29, 2022
@skrthomas skrthomas force-pushed the OSDOCS-3917 branch 6 times, most recently from 174181c to b20cd5a Compare November 30, 2022 21:55
@skrthomas skrthomas force-pushed the OSDOCS-3917 branch 3 times, most recently from 414fb41 to 1992a46 Compare January 3, 2023 21:59
@@ -0,0 +1,18 @@
// Module included in the following assemblies:

// * configuring-operators.adoc
Contributor

Add the full directory path.

Contributor Author

ACK, addressed globally.

// * configuring-operators.adoc

:_content-type: PROCEDURE
[id="network-observability-config-FLP-sampling{context}"]
Contributor

Add an underscore before {context}, and remove the carriage return between the anchor tag and title.

[id="network-observability-config-FLP-sampling{context}"]

= Updating the Flow Collector resource
As an alternative to editing YAML in the {product-title} web console, you can do configure specifications, such as eBPF sampling, by patching the `flowcollector` custom resource (CR):
Contributor

Consider s/you can do configure/you can configure/

@@ -0,0 +1,79 @@
// Module included in the following assemblies:

// * configuring-operators.adoc
Contributor

Add the directory path.


:_content-type: PROCEDURE
[id="network-observability-config-quick-filters_{context}"]

Contributor

Remove the extra carriage return between the anchor tag and title.

----
<1> The Agent specification, `spec.agent.type`, must be `EBPF`. eBPF is the only {product-title} supported option.
<2> You can set the Sampling specification, `spec.agent.ebpf.sampling`, to manage resources. Lower sampling values might consume a large amount of computational, memory and storage resources. You can mitigate this by setting a sampling ratio. A value of 100 means one flow every 100 is sampled. A value of 0 or 1 means all flows are captured. The lower the value, the increase in returned flows and the accuracy of derived metrics. By default, eBPF sampling is set to a value of 50, for example 1:50. Note that more sampled flows also means more storage needed. It is recommend to start with default values and refine empirically, to determine which setting your cluster can manage.
<3> The Loki specification, `spec.loki` the Loki client. The default values match the Loki install paths mentioned in the Installing the Loki Operator section, but you might have to configure differently if you used another installation method.
Contributor

configure it differently?

Contributor Author

Ah, yeah, I do think this can be reworked so it seems less vague. However, it will still be vague because we are not providing any installation guidelines if users choose to install Loki from a third party like Helm or Grafana. This documentation only supports installing Loki via the Red Hat Loki Operator.
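
For anyone following the sampling wording in callout <2>, here is a minimal Python sketch of the ratio semantics as the callout describes them: a value of N keeps roughly one flow in N, and 0 or 1 keep every flow. The function names are hypothetical and not part of the operator, and the integer division is only a rough estimate of what the agent would actually retain.

```python
def effective_sampling_ratio(sampling: int) -> int:
    """Return N such that roughly one flow in N is kept (hypothetical helper)."""
    if sampling < 0:
        raise ValueError("sampling must be non-negative")
    # Per the callout, 0 and 1 both mean "capture every flow".
    return 1 if sampling in (0, 1) else sampling


def expected_sampled_flows(total_flows: int, sampling: int = 50) -> int:
    """Estimate how many of total_flows are kept; 50 is the default named in the callout."""
    return total_flows // effective_sampling_ratio(sampling)
```

Under these assumptions, lowering the value from 100 to 50 doubles the estimated number of stored flows, which is the storage trade-off the callout warns about.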

<4> The `spec.quickFilters` specification defines filters that show up in the web console. The `Application` filter keys,`src_namespace` and `dst_namespace`, are negated `!`, so the `Application` filter shows all traffic that _do not_ originate from, nor have a destination of, any `openshift-` or `netobserv` namespaces. For more information see Configuring quick filters below.
Contributor

Should this be "does not ... nor has a destination of"? Also add a comma: "For more information, ..."

Contributor Author

Ooof, I am thinking "does not originate from, or have a destination to, ..."
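
To make the negation in callout <4> concrete, here is a small Python sketch of how the `Application` filter is described as behaving: a flow is shown only when neither its source nor its destination namespace is an `openshift-` or `netobserv` namespace. The helper names are hypothetical, and substring matching is an assumption made for illustration; the console's actual matching rules may differ.

```python
# Tokens taken from the callout text; everything else here is illustrative.
EXCLUDED_TOKENS = ("openshift-", "netobserv")


def is_excluded(namespace: str) -> bool:
    """True if the namespace matches one of the excluded tokens (substring match assumed)."""
    return any(token in namespace for token in EXCLUDED_TOKENS)


def matches_application_filter(src_namespace: str, dst_namespace: str) -> bool:
    """Both keys are negated, so both namespaces must fall outside the excluded set."""
    return not is_excluded(src_namespace) and not is_excluded(dst_namespace)
```

Read this way, traffic between two user namespaces is shown, while traffic that touches an excluded namespace on either side is hidden.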

@@ -0,0 +1,14 @@
// Module included in the following assemblies:

// * installing-operators.adoc
Contributor

Add the full directory path.


:_content-type: CONCEPT
[id="network-observability-kafka-option_{context}"]

Contributor

Omit the carriage return.

@@ -0,0 +1,56 @@
// Module included in the following assemblies:

// * installing-operators.adoc
Contributor

Add the full directory path.

@johnwilkins
Contributor

johnwilkins commented Jan 3, 2023

/remove-label peer-review-in-progress
/label peer-review-done

@openshift-ci openshift-ci bot added the peer-review-done Signifies that the peer review team has reviewed this PR label Jan 3, 2023
@johnwilkins
Contributor

/remove-label peer-review-in-progress

@openshift-ci openshift-ci bot removed the peer-review-in-progress Signifies that the peer review team is reviewing this PR label Jan 3, 2023
@memodi memodi left a comment

@skrthomas - LGTM! I'll let @nathan-weinberg do final QE Approval on this, thanks!


= Create roles for authentication and authorization
Specify authentication and authorization configurations by defining `ClusterRole` and `ClusterRoleBinding`. You can create a YAML to define these roles.
.Procedure

new line before this?

@nathan-weinberg

/label qe-approved

@openshift-ci openshift-ci bot added the qe-approved Signifies that QE has signed off on this PR label Jan 4, 2023
@jotak jotak left a comment

LGTM!

@skrthomas skrthomas force-pushed the OSDOCS-3917 branch 2 times, most recently from a9947ac to 7007ab0 Compare January 4, 2023 21:51
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 4, 2023
@skrthomas skrthomas force-pushed the OSDOCS-3917 branch 2 times, most recently from 8bed042 to 3f90f24 Compare January 4, 2023 22:07
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 4, 2023
:_content-type: PROCEDURE
[id="network-observability-lokistack-configuring-ingestion{context}"]

= Configuring LokiStack ingestion
Contributor Author

@jotak Configuring ingestion topic

Comment on lines 31 to 51
----
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
name: loki-alerts
namespace: openshift-operators-redhat
spec:
groups:
- name: LokiRateLimitAlerts
rules:
- alert: LokiTenantRateLimit
annotations:
message: |-
{{ $labels.job }} {{ $labels.route }} is experiencing 429 errors.
summary: "At any number of requests are responded with the rate limit error code."
expr: sum(irate(loki_request_duration_seconds_count{status_code="429"}[1m])) by (job, namespace, route) / sum(irate(loki_request_duration_seconds_count[1m])) by (job, namespace, route) * 100 > 0
for: 10s
labels:
severity: warning
----

It seems the indentation was lost during the copy. Cf. the original: https://github.com/netobserv/documents/blob/main/examples/distributed-loki/alerting/loki-ratelimit-alert.yaml

Suggested change
----
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: loki-alerts
  namespace: openshift-operators-redhat
spec:
  groups:
  - name: LokiRateLimitAlerts
    rules:
    - alert: LokiTenantRateLimit
      annotations:
        message: |-
          {{ $labels.job }} {{ $labels.route }} is experiencing 429 errors.
        summary: "At any number of requests are responded with the rate limit error code."
      expr: sum(irate(loki_request_duration_seconds_count{status_code="429"}[1m])) by (job, namespace, route) / sum(irate(loki_request_duration_seconds_count[1m])) by (job, namespace, route) * 100 > 0
      for: 10s
      labels:
        severity: warning
----
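
As a reading aid for the `expr` line in this rule, here is a minimal Python sketch of the arithmetic it performs for a single `(job, namespace, route)` group: the per-second rate of 429 responses divided by the rate of all responses, times 100, compared against 0. The function names are hypothetical, the inputs stand in for the two `irate(...)` sums, and the `for: 10s` hold period is not modeled.

```python
def rate_limited_percent(rate_429: float, rate_total: float) -> float:
    """Percentage of requests answered with 429 for one group of series."""
    if rate_total <= 0:
        # Simplification: with no traffic there is nothing to compare,
        # so treat the group as having no rate-limited requests.
        return 0.0
    return rate_429 / rate_total * 100


def alert_condition_met(rate_429: float, rate_total: float) -> bool:
    """Mirror the `> 0` threshold: any nonzero share of 429 responses qualifies."""
    return rate_limited_percent(rate_429, rate_total) > 0
```

Because the threshold is 0, even a single rate-limited request in the window satisfies the condition in this sketch.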

@skrthomas skrthomas merged commit 065d0e9 into openshift:main Jan 5, 2023
@skrthomas
Contributor Author

/cherrypick enterprise-4.12

@openshift-cherrypick-robot

@skrthomas: new pull request created: #54263

In response to this:

/cherrypick enterprise-4.12


Labels
branch/enterprise-4.12 peer-review-done Signifies that the peer review team has reviewed this PR qe-approved Signifies that QE has signed off on this PR size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.