NETOBSERV-764: Loki v1beta2 model follow-up #474

Merged
merged 6 commits into netobserv:main from the loki-followup branch on Nov 1, 2023

Conversation

jotak
Member

@jotak jotak commented Oct 23, 2023

Description

API changes:

  • Improve default values in monolithic and microservices modes to stick with our guides (e.g. http://loki-distributor:3100/ as the ingester URL)
  • Rename modes so that they match upstream Loki naming
  • In microservices mode, the querier URL isn't optional (?)
  • Add ability to configure tenant in these modes
  • Add documentation in places where it was missing; fix some typos
  • Strong-type LokiMode; set the enum values in CamelCase, as this is the recommended convention; set it as a unionDiscriminator

Implementation changes:

  • Use a "Hub" struct to convert any Loki mode into something akin to a Manual config; this avoids having to define many getter helpers, each doing a switch/case.
  • Adapt some tests to better cover nominal use case, e.g. certificates_test now uses LokiStack mode

Also, when a LokiStack is referenced, the corresponding role YAML is now automatically created/updated. It sets the correct FLP service account depending on whether Kafka is used.
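
To illustrate the "Hub" idea described above, here is a minimal sketch, not the code from this PR. All type and field names (`LokiSpec`, `ManualConfig`, `MonolithicConfig`, `MicroservicesConfig`, `Hub`, `NewHub`) are hypothetical simplifications, LokiStack mode is left out to keep it short, and the querier URL and tenant values in `main` are assumed for the example. Only the `http://loki-distributor:3100/` ingester URL comes from the description.

```go
package main

import "fmt"

// LokiMode discriminates between the supported Loki configurations.
type LokiMode string

const (
	LokiModeManual        LokiMode = "Manual"
	LokiModeMonolithic    LokiMode = "Monolithic"
	LokiModeMicroservices LokiMode = "Microservices"
)

// Hypothetical, simplified per-mode settings; the real API types have more fields.
type ManualConfig struct{ IngesterURL, QuerierURL, TenantID string }
type MonolithicConfig struct{ URL, TenantID string }
type MicroservicesConfig struct{ IngesterURL, QuerierURL, TenantID string }

// LokiSpec is a stand-in for the user-facing Loki section of the spec.
type LokiSpec struct {
	Mode          LokiMode
	Manual        ManualConfig
	Monolithic    MonolithicConfig
	Microservices MicroservicesConfig
}

// Hub is the normalized, Manual-like view that the rest of the code reads.
type Hub struct{ IngesterURL, QuerierURL, TenantID string }

// NewHub resolves the mode once, so callers never switch on it again.
func NewHub(spec LokiSpec) Hub {
	switch spec.Mode {
	case LokiModeMonolithic:
		// A single endpoint serves both writes and reads.
		m := spec.Monolithic
		return Hub{IngesterURL: m.URL, QuerierURL: m.URL, TenantID: m.TenantID}
	case LokiModeMicroservices:
		m := spec.Microservices
		return Hub{IngesterURL: m.IngesterURL, QuerierURL: m.QuerierURL, TenantID: m.TenantID}
	default:
		// Manual mode (and anything unrecognized in this sketch) is copied as-is.
		m := spec.Manual
		return Hub{IngesterURL: m.IngesterURL, QuerierURL: m.QuerierURL, TenantID: m.TenantID}
	}
}

func main() {
	hub := NewHub(LokiSpec{
		Mode: LokiModeMicroservices,
		Microservices: MicroservicesConfig{
			IngesterURL: "http://loki-distributor:3100/",
			QuerierURL:  "http://loki-query-frontend:3100/", // assumed value
			TenantID:    "netobserv",                        // assumed value
		},
	})
	fmt.Println(hub.IngesterURL, hub.QuerierURL, hub.TenantID)
}
```

The point of the design is that the switch on the mode lives in one constructor; consumers such as the FLP or console plugin builders would only read fields from the normalized struct.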

Dependencies

n/a

Checklist

If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.

  • Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
  • Does this PR require product documentation?
    • If so, make sure the JIRA epic is labelled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
  • Does this PR require a product release notes entry?
    • If so, fill in "Release Note Text" in the JIRA.
  • Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
    • If so, make sure it is described in the JIRA ticket.
  • QE requirements (check 1 from the list):
    • Standard QE validation, with pre-merge tests unless stated otherwise.
    • Regression tests only (e.g. refactoring with no user-facing change).
    • No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).

@openshift-ci-robot
Collaborator

openshift-ci-robot commented Oct 23, 2023

@jotak: This pull request references NETOBSERV-764 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.15.0" version, but no target version was set.

@codecov

codecov bot commented Oct 23, 2023

Codecov Report

Attention: 61 lines in your changes are missing coverage. Please review.

Comparison is base (3238982) 55.00% compared to head (b136d06) 55.20%.
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #474      +/-   ##
==========================================
+ Coverage   55.00%   55.20%   +0.20%     
==========================================
  Files          47       49       +2     
  Lines        6460     6441      -19     
==========================================
+ Hits         3553     3556       +3     
+ Misses       2667     2645      -22     
  Partials      240      240              
Flag Coverage Δ
unittests 55.20% <73.59%> (+0.20%) ⬆️

Files Coverage Δ
api/v1beta2/flowcollector_types.go 100.00% <ø> (ø)
...trollers/consoleplugin/consoleplugin_reconciler.go 64.44% <100.00%> (+1.25%) ⬆️
controllers/reconcilers/common.go 56.30% <ø> (ø)
pkg/helper/flowcollector.go 67.44% <ø> (-14.55%) ⬇️
pkg/loki/roles.go 100.00% <100.00%> (ø)
controllers/flowlogspipeline/flp_common_objects.go 80.13% <92.30%> (ø)
...ollers/flowlogspipeline/flp_monolith_reconciler.go 62.50% <50.00%> (ø)
api/v1beta1/zz_generated.conversion.go 0.23% <0.00%> (ø)
controllers/flowcollector_controller.go 55.13% <50.00%> (+0.17%) ⬆️
pkg/helper/loki_config.go 96.00% <96.00%> (ø)
... and 5 more

... and 2 files with indirect coverage changes

@jotak jotak added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Oct 23, 2023
@github-actions

New images:

  • quay.io/netobserv/network-observability-operator:aec9c8d
  • quay.io/netobserv/network-observability-operator-bundle:v0.0.0-aec9c8d
  • quay.io/netobserv/network-observability-operator-catalog:v0.0.0-aec9c8d

They will expire after two weeks.

To deploy this build:

# Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:aec9c8d make deploy

# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-aec9c8d

Or as a Catalog Source:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: netobserv-dev
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-aec9c8d
  displayName: NetObserv development catalog
  publisher: Me
  updateStrategy:
    registryPoll:
      interval: 1m

@github-actions github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Oct 23, 2023
jpinsonneau
jpinsonneau previously approved these changes Oct 23, 2023
Contributor

@jpinsonneau jpinsonneau left a comment

Code looks good at first look, some suggestions

Comment on lines +614 to +617
LokiModeManual LokiMode = "Manual"
LokiModeLokiStack LokiMode = "LokiStack"
LokiModeMonolithic LokiMode = "Monolithic"
LokiModeMicroservices LokiMode = "Microservices"
Contributor

Why should these be camel case when we use uppercase for agent type, deployment mode, HPA status, Loki auth, SASL, and Exporter type?

Member Author

cf #394 (comment)
We did it wrong on the others.
Now we have a mix. We should eventually make them all consistent, but from now on, like we did on Features, we should adopt the correct convention.

Contributor

So, to avoid having a mix, I would suggest fixing everything at once for v1beta2, since it's already an API change. WDYT?

Member Author

I agree we should do it before releasing v1beta2, but in a separate PR so that we don't mix things up.

Member Author

I created a task: https://issues.redhat.com/browse/NETOBSERV-1374
We should add there everything that needs to be cleaned up before we do the first release, i.e. any breaking change that we might still find before 1.5.
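
For readers following this thread, a strongly typed, CamelCase enum used as a union discriminator could look roughly like the sketch below. The four constants are the ones quoted in the review; the struct name `LokiConfig`, the JSON tag, and the `IsValid` helper are hypothetical and only there to make the example runnable. The two marker comments are the usual kubebuilder/Kubernetes API annotations; whether the PR places them exactly this way is an assumption.

```go
package main

import "fmt"

// LokiMode is a dedicated string type, so a plain string cannot be passed by accident.
// +kubebuilder:validation:Enum=Manual;LokiStack;Monolithic;Microservices
type LokiMode string

const (
	LokiModeManual        LokiMode = "Manual"
	LokiModeLokiStack     LokiMode = "LokiStack"
	LokiModeMonolithic    LokiMode = "Monolithic"
	LokiModeMicroservices LokiMode = "Microservices"
)

// LokiConfig is a hypothetical stand-in for the Loki section of the spec.
type LokiConfig struct {
	// Mode selects which of the per-mode sections is taken into account.
	// +unionDiscriminator
	Mode LokiMode `json:"mode,omitempty"`
}

// IsValid is a hypothetical helper: values are case-sensitive, so "LOKISTACK" is rejected.
func (m LokiMode) IsValid() bool {
	switch m {
	case LokiModeManual, LokiModeLokiStack, LokiModeMonolithic, LokiModeMicroservices:
		return true
	}
	return false
}

func main() {
	for _, m := range []LokiMode{"LokiStack", "LOKISTACK"} {
		fmt.Printf("%q valid: %v\n", m, m.IsValid())
	}
}
```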

Comment on lines +35 to +38
if len(spec.LokiStack.Namespace) > 0 {
	dotNamespace = "." + spec.LokiStack.Namespace
}
gatewayURL := fmt.Sprintf("https://%s-gateway-http%s.svc:8080/api/logs/v1/network/", spec.LokiStack.Name, dotNamespace)
Contributor

We could fall back on FlowCollectorSpec.namespace here to have something easier to read and consistent.

Member Author

this would add a dependency on spec.namespace in the function input, which would break the logic in Convert_v1beta2_FlowCollectorLoki_To_v1alpha1_FlowCollectorLoki (https://github.com/netobserv/network-observability-operator/pull/474/files#diff-23e9a27bd609a9414ced41fdaf8063ade03a51b3154d85ad4edfdae228dd5531R139-R150) where we need to run this conversion only with v1beta2.FlowCollectorLoki as an input, not the whole spec. So removing the dependency on spec.namespace actually helps there.
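
As a self-contained sketch of the snippet under discussion: the function below builds the gateway URL from the LokiStack reference alone, which is the property the reply above relies on for the conversion code. The format string is copied from the reviewed lines; the `LokiStackRef` type and the `gatewayURL` function name are hypothetical wrappers.

```go
package main

import "fmt"

// LokiStackRef is a hypothetical stand-in for the LokiStack reference in the spec.
type LokiStackRef struct {
	Name      string
	Namespace string
}

// gatewayURL depends only on the reference itself, not on the surrounding spec.
func gatewayURL(ref LokiStackRef) string {
	dotNamespace := ""
	if len(ref.Namespace) > 0 {
		// Qualify the service name only when a namespace is set.
		dotNamespace = "." + ref.Namespace
	}
	return fmt.Sprintf("https://%s-gateway-http%s.svc:8080/api/logs/v1/network/", ref.Name, dotNamespace)
}

func main() {
	fmt.Println(gatewayURL(LokiStackRef{Name: "loki", Namespace: "netobserv"}))
	// https://loki-gateway-http.netobserv.svc:8080/api/logs/v1/network/
	fmt.Println(gatewayURL(LokiStackRef{Name: "loki"}))
	// https://loki-gateway-http.svc:8080/api/logs/v1/network/
}
```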

@jotak
Member Author

jotak commented Oct 24, 2023

I added the changes to automatically create the roles/binding for reading/writing loki logs (these things: https://github.com/netobserv/documents/blob/main/examples/loki-stack/role.yaml )
FYI @nathan-weinberg @Amoghrd that's surely something you do in automation scripts, that can be removed once this PR is merged

Comment on lines 791 to 801
{
	ObjectMeta: metav1.ObjectMeta{
		Name: constants.LokiCRReader,
	},
	Rules: []rbacv1.PolicyRule{{
		APIGroups:     []string{"loki.grafana.com"},
		Resources:     []string{"network"},
		ResourceNames: []string{"logs"},
		Verbs:         []string{"get"},
	}},
},
Contributor

Should we put this one on the console plugin side when it is enabled?
Also, the role binding only mentions the writer, not the reader 🤔

Member Author

@jotak jotak Oct 24, 2023

It's actually not related to the console plugin; there is no binding that we use for reading (see https://github.com/netobserv/documents/blob/main/examples/loki-stack/role.yaml, it's the same: no reader binding). The reader binding would have been useful in the HOST mode, but since that mode is deprecated we don't worry about it.
The reader role is really just provided for convenience, so that users can just run something like:

oc adm policy add-cluster-role-to-user netobserv-reader user1

for multi-tenancy.

I've put it in code next to the writer role so that they're grouped together, although I agree that it doesn't relate to flp. As an alternative I can create a new loki package for that

Member Author

(done creating a loki package via 4d5bc3e)

Contributor

thanks !
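
The writer binding mentioned in this thread targets a different FLP service account depending on whether Kafka is used. A minimal sketch of that selection follows. The two service account names are assumptions for illustration (check the operator's constants for the real ones), and `subject` is a local type that only mirrors the `Kind`, `Name` and `Namespace` fields of `rbacv1.Subject` so the example has no external dependency.

```go
package main

import "fmt"

// Assumed service account names, for illustration only.
const (
	flpServiceAccount            = "flowlogs-pipeline"
	flpTransformerServiceAccount = "flowlogs-pipeline-transformer"
)

// subject mirrors the Kind/Name/Namespace fields of rbacv1.Subject.
type subject struct{ Kind, Name, Namespace string }

// writerSubject picks the account that writes to Loki: with Kafka, the
// transformer consumes flows and writes them; without Kafka, the single
// FLP deployment does (assumed deployment model).
func writerSubject(useKafka bool, namespace string) subject {
	name := flpServiceAccount
	if useKafka {
		name = flpTransformerServiceAccount
	}
	return subject{Kind: "ServiceAccount", Name: name, Namespace: namespace}
}

func main() {
	fmt.Printf("%+v\n", writerSubject(false, "netobserv"))
	fmt.Printf("%+v\n", writerSubject(true, "netobserv"))
}
```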

@jotak
Member Author

jotak commented Oct 24, 2023

@jpinsonneau the update-bundle check fails apparently because I'm using a different version of yq than what the CI has.
In your previous PR, I didn't see how you installed yq on CI... or is it already present in the runner image? We should find a way to pin the version.

[edit] no worries, fixed in last commit

jpinsonneau
jpinsonneau previously approved these changes Oct 25, 2023
Contributor

@jpinsonneau jpinsonneau left a comment

LGTM, thanks @jotak

@openshift-ci openshift-ci bot added the lgtm label Oct 25, 2023
@jpinsonneau
Contributor

Can you please remove flows_v1beta2_flowcollector_lokistack.yaml? It doesn't seem relevant anymore.

@jotak
Member Author

jotak commented Oct 31, 2023

Can you please remove flows_v1beta2_flowcollector_lokistack.yaml? It doesn't seem relevant anymore.

@jpinsonneau done
(and rebased)

@openshift-ci openshift-ci bot added the lgtm label Oct 31, 2023
@nathan-weinberg
Contributor

/ok-to-test

@openshift-ci openshift-ci bot added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Oct 31, 2023

New images:

  • quay.io/netobserv/network-observability-operator:3196567
  • quay.io/netobserv/network-observability-operator-bundle:v0.0.0-3196567
  • quay.io/netobserv/network-observability-operator-catalog:v0.0.0-3196567

They will expire after two weeks.

To deploy this build:

# Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:3196567 make deploy

# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-3196567

Or as a Catalog Source:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: netobserv-dev
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-3196567
  displayName: NetObserv development catalog
  publisher: Me
  updateStrategy:
    registryPoll:
      interval: 1m

@jotak
Member Author

jotak commented Nov 1, 2023

/approve


openshift-ci bot commented Nov 1, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jotak

@openshift-ci openshift-ci bot added the approved label Nov 1, 2023
@openshift-ci openshift-ci bot merged commit 066cd98 into netobserv:main Nov 1, 2023
10 checks passed
@jotak jotak deleted the loki-followup branch November 1, 2023 08:32
Labels
approved jira/valid-reference lgtm ok-to-test To set manually when a PR is safe to test. Triggers image build on PR.
4 participants