Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor Grafana and ClickHouse deployment configuration instructions #3525

Merged
merged 1 commit into from Mar 28, 2022

Conversation

heanlan
Copy link
Contributor

@heanlan heanlan commented Mar 25, 2022

Signed-off-by: heanlan hanlan@vmware.com

ClickHouse credentials are also specified in [flow-aggregator.yml][flow_aggregator_manifest_yaml]
as a resource of kind: a Secret. Please also make the corresponding changes.
ClickHouse credentials are also specified in `flow-aggregator.yml` as a resource
of name: `clickhouse-secret`. Please also make the corresponding changes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be nice to show these lines of code to help users locate where to change these secrets in flow-aggregator.yml
https://github.com/antrea-io/antrea/blob/main/build/yamls/flow-aggregator.yml#L240-L251

Grafana login credentials are specified in [grafana.yml][grafana_manifest_yaml] as
resource of kind: a Secret.
Grafana login credentials are specified in `flow-visibility.yml` as a resource of
name: `grafana-secret`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same suggestion for grafana-secret.

@heanlan
Copy link
Contributor Author

heanlan commented Mar 25, 2022

Changed grafana Service type from LoadBalancer to NodePort in the latest commit, as currently there is no production level requirement asking for a LoadBalancer Service, and it requires cluster networking support. Would like to hear from the team for opinions.

@codecov-commenter
Copy link

codecov-commenter commented Mar 25, 2022

Codecov Report

Merging #3525 (478d75d) into main (a539b78) will decrease coverage by 1.52%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #3525      +/-   ##
==========================================
- Coverage   65.46%   63.93%   -1.53%     
==========================================
  Files         278      278              
  Lines       27771    27825      +54     
==========================================
- Hits        18179    17789     -390     
- Misses       7666     8124     +458     
+ Partials     1926     1912      -14     
Flag Coverage Δ
kind-e2e-tests 50.93% <ø> (-4.26%) ⬇️
unit-tests 43.39% <ø> (+0.05%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/apis/controlplane/types.go 0.00% <0.00%> (-100.00%) ⬇️
pkg/controller/networkpolicy/store/group.go 5.00% <0.00%> (-70.00%) ⬇️
pkg/apis/controlplane/helper.go 33.33% <0.00%> (-66.67%) ⬇️
pkg/controller/networkpolicy/convert.go 0.00% <0.00%> (-64.52%) ⬇️
pkg/controller/egress/store/egressgroup.go 1.38% <0.00%> (-58.34%) ⬇️
pkg/controller/networkpolicy/tier.go 12.50% <0.00%> (-40.00%) ⬇️
pkg/apiserver/handlers/webhook/convert_crd.go 2.56% <0.00%> (-32.48%) ⬇️
pkg/controller/types/group.go 53.57% <0.00%> (-32.15%) ⬇️
pkg/controller/networkpolicy/validate.go 48.07% <0.00%> (-32.14%) ⬇️
pkg/controller/networkpolicy/status_controller.go 69.94% <0.00%> (-13.30%) ⬇️
... and 32 more

@@ -193,4 +193,4 @@ spec:
selector:
app: grafana
sessionAffinity: None
type: LoadBalancer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this change, but I think we should also mention in the documentation that users can change this Service type to LoadBalancer in the downloaded YAML, if their K8s cluster supports LoadBalancer Services

When we provide a Helm chart, this can be parameterized

@@ -659,8 +659,22 @@ kubectl delete -f https://raw.githubusercontent.com/Altinity/clickhouse-operator

##### Credentials Configuration

ClickHouse credentials are specified in [clickhouse.yml][clickhouse_manifest_yaml] as
a resource of kind: a Secret. If the username `clickhouse_operator` has changed, please
ClickHouse credentials are specified in `flow-visibility.yml` as a resource of name:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as a Secret named clickhouse-secret

same below

Comment on lines 750 to 753
- If you have changed the HTTP port, please update the `url` of a resource of name `grafana-datasource-provider` in `flow-visibility.yml`.

- If you have changed the TCP port, please update the `databaseURL` following [Flow Aggregator Configuration](#configuration-1),
and also update the `jsonData.port` of the `grafana-datasource-provider` resource.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line wrapping

Comment on lines 888 to 890
for dashboard JSON files export and manual import.

To generate a deployment manifest with the changes, please follow the following steps:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe emphasize that there are 2 ways of adding dashboards:

  • in the UI or by importing a JSON at runtime (I don't know how these are persisted to Grafana, they may be lost in case of a Pod restart?)
  • by generating a new manifest and (re)deploying, as described below

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, the changes will be lost after a Pod restart. They won't be automatically saved to the source.
I have added the 2 ways, could you help check if the wordings make sense to you? Thanks

specified in [clickhouse.yml][clickhouse_manifest_yaml] as `serviceTemplates`.
To use other ports, please update the following section accordingly.
specified in `flow-visibility.yml` as `serviceTemplates` of a resource of kind:
a `ClickHouseInstallation`. To use other ports, please update the following section.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kind: a ClickHouseInstallation -> kind: ClickHouseInstallation?

and `databaseURL` in the [Flow Aggregator Configuration](#configuration-1).
This service is used by the Flow Aggregator and Grafana.

- If you have changed the HTTP port, please update the `url` of a resource of name `grafana-datasource-provider` in `flow-visibility.yml`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we need to keep consistency in whether we need a colon after "a resource of name"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will also join the chorus and ask to change 'resource of name' to 'ConfigMap named' at lines 750 and 753


The ClickHouse throughput depends on two factors - the storage size of the ClickHouse
and the time interval between the batch commits to the ClickHouse. Larger storage
size and longer commit interval provide higher throughput.

Grafana flow collector supports the ClickHouse in-memory deployment with limited
storage size. This is specified in [clickhouse.yml][clickhouse_manifest_yaml].
The default value of storage size for the ClickHouse server is 8 GiB. Users
storage size. This is specified in `flow-visibility.yml` under the `ClickHouseInstallation`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line wrap

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This a bit pedant, but maybe you can consider writing:

[...] under the clickhouse resource of kind ClickHouseInstallation. The default value [...]

storage size. This is specified in [clickhouse.yml][clickhouse_manifest_yaml].
The default value of storage size for the ClickHouse server is 8 GiB. Users
storage size. This is specified in `flow-visibility.yml` under the `ClickHouseInstallation`
resource. The default value of storage size for the ClickHouse server is 8 GiB. Users
can expect a linear growth in the ClickHouse throughput when they enlarge the
storage size. For development or testing environment, you can decrease the storage
size to 2GB. To deploy the ClickHouse with a different storage size, please
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
size to 2GB. To deploy the ClickHouse with a different storage size, please
size to 2GiB. To deploy the ClickHouse with a different storage size, please

@@ -602,6 +602,27 @@ use the checked-in [deployment yaml](/build/yamls/flow-visibility.yml):
kubectl apply -f https://raw.githubusercontent.com/antrea-io/antrea/main/build/yamls/flow-visibility.yml
```

To be noticed, Grafana is exposed as a NodePort Service by default in `flow-visibility.yml`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you mean "To be accessible"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess just remove "To be noticed"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exposed through a NodePort Service

@@ -602,6 +602,27 @@ use the checked-in [deployment yaml](/build/yamls/flow-visibility.yml):
kubectl apply -f https://raw.githubusercontent.com/antrea-io/antrea/main/build/yamls/flow-visibility.yml
```

To be noticed, Grafana is exposed as a NodePort Service by default in `flow-visibility.yml`.
If the given K8s cluster support LoadBalancer Services, Grafana can be changed to a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/support/supports

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Grafana can be exposed through a LoadBalancer Service..."?

a new dashboard, please follow this [doc](https://grafana.com/docs/grafana/latest/dashboards/)
on how to build a dashboard.

By saving dashboards in the Grafana UI, these changes will be persisted in Grafana
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe "By configuring dashboards"?

a new dashboard, please follow this [doc](https://grafana.com/docs/grafana/latest/dashboards/)
on how to build a dashboard.

By saving dashboards in the Grafana UI, these changes will be persisted in Grafana
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the Grafana database


By saving dashboards in the Grafana UI, these changes will be persisted in Grafana
database at runtime, but they will be lost after restarting the Grafana deployment.
To keep those changes for a restart, as the first step, you will need to export the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To restore those changes after a restart

By saving dashboards in the Grafana UI, these changes will be persisted in Grafana
database at runtime, but they will be lost after restarting the Grafana deployment.
To keep those changes for a restart, as the first step, you will need to export the
dashboard JSON file following the [doc](https://grafana.com/docs/grafana/latest/dashboards/export-import/), then there are two ways to import the dashboard depending on your needs.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line wrap

also end the sentence with a : and not a .

dashboard JSON file following the [doc](https://grafana.com/docs/grafana/latest/dashboards/export-import/), then there are two ways to import the dashboard depending on your needs.

- In the running Grafana UI, manually import the dashboard JSON files.
- If you want the changed dashboards to be automatically provisioned to Grafana
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/provisioned to Grafana/provisioned in Grafana

- In the running Grafana UI, manually import the dashboard JSON files.
- If you want the changed dashboards to be automatically provisioned to Grafana
like our pre-built dashboards, generate a deployment manifest with the changes by
following the steps as below:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the steps below

@@ -602,6 +602,27 @@ use the checked-in [deployment yaml](/build/yamls/flow-visibility.yml):
kubectl apply -f https://raw.githubusercontent.com/antrea-io/antrea/main/build/yamls/flow-visibility.yml
```

To be noticed, Grafana is exposed as a NodePort Service by default in `flow-visibility.yml`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess just remove "To be noticed"

@@ -602,6 +602,27 @@ use the checked-in [deployment yaml](/build/yamls/flow-visibility.yml):
kubectl apply -f https://raw.githubusercontent.com/antrea-io/antrea/main/build/yamls/flow-visibility.yml
```

To be noticed, Grafana is exposed as a NodePort Service by default in `flow-visibility.yml`.
If the given K8s cluster support LoadBalancer Services, Grafana can be changed to a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Grafana can be exposed through a LoadBalancer Service..."?

@@ -602,6 +602,27 @@ use the checked-in [deployment yaml](/build/yamls/flow-visibility.yml):
kubectl apply -f https://raw.githubusercontent.com/antrea-io/antrea/main/build/yamls/flow-visibility.yml
```

To be noticed, Grafana is exposed as a NodePort Service by default in `flow-visibility.yml`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exposed through a NodePort Service

```

We recommend changing all the credentials above if you are going to run the Flow
Collector in production.

##### ClickHouse Configuration

The ClickHouse database can be accessed through the service `clickhouse-clickhouse`.
The pod exposes HTTP port at 8123 and TCP port at 9000 by default. The ports are
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pod -> Pod

```

We recommend changing all the credentials above if you are going to run the Flow
Collector in production.

##### ClickHouse Configuration

The ClickHouse database can be accessed through the service `clickhouse-clickhouse`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"service" refers to K8 Service? If so, use "Service" and change other occurrences.

and `databaseURL` in the [Flow Aggregator Configuration](#configuration-1).
This service is used by the Flow Aggregator and Grafana.

- If you have changed the HTTP port, please update the `url` of a resource of name `grafana-datasource-provider` in `flow-visibility.yml`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will also join the chorus and ask to change 'resource of name' to 'ConfigMap named' at lines 750 and 753


The ClickHouse throughput depends on two factors - the storage size of the ClickHouse
and the time interval between the batch commits to the ClickHouse. Larger storage
size and longer commit interval provide higher throughput.

Grafana flow collector supports the ClickHouse in-memory deployment with limited
storage size. This is specified in [clickhouse.yml][clickhouse_manifest_yaml].
The default value of storage size for the ClickHouse server is 8 GiB. Users
storage size. This is specified in `flow-visibility.yml` under the `ClickHouseInstallation`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This a bit pedant, but maybe you can consider writing:

[...] under the clickhouse resource of kind ClickHouseInstallation. The default value [...]

docs/network-flow-visibility.md Show resolved Hide resolved
@heanlan
Copy link
Contributor Author

heanlan commented Mar 28, 2022

Resolved all the comments and squashed the the commits. May I ask for another round of review? @antoninbas @jianjuns @salv-orlando

Copy link
Contributor

@jianjuns jianjuns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a nit.

To use other ports, please update the following section accordingly.
The ClickHouse database can be accessed through the Service `clickhouse-clickhouse`.
The Pod exposes HTTP port at 8123 and TCP port at 9000 by default. The ports are
specified in `flow-visibility.yml` as `serviceTemplates` of a resource of kind:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could not understand this sentence. I apologize if it is that I missed something obvious.

Do you mean "as serviceTemplates of a ClickHouseInstallation resource"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, that's what I was trying to say. Rephrased as suggested. Thanks.

Copy link
Contributor

@salv-orlando salv-orlando left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, pending feedback from @jianjuns

Signed-off-by: heanlan <hanlan@vmware.com>

Change grafana Service to NodePort

Signed-off-by: heanlan <hanlan@vmware.com>
@antoninbas
Copy link
Contributor

/skip-all

@heanlan
Copy link
Contributor Author

heanlan commented Mar 28, 2022

@antoninbas shall we merge it now?

@antoninbas antoninbas merged commit 245ac54 into antrea-io:main Mar 28, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants