Support ClickHouse deployment with Persistent Volume

Signed-off-by: Yanjun Zhou <zhouya@vmware.com>
yanjunz97 committed Apr 8, 2022
1 parent 8ef6808 commit 93372b5
Showing 10 changed files with 402 additions and 45 deletions.
45 changes: 24 additions & 21 deletions build/yamls/flow-visibility.yml
@@ -86,26 +86,27 @@ data:
UInt64,\n reverseThroughputFromDestinationNode UInt64,\n trusted
UInt8 DEFAULT 0\n ) engine=MergeTree\n ORDER BY (timeInserted, flowEndSeconds)\n
\ TTL timeInserted + INTERVAL 1 HOUR\n SETTINGS merge_with_ttl_timeout =
3600;\n\n CREATE MATERIALIZED VIEW flows_pod_view\n ENGINE = SummingMergeTree\n
\ ORDER BY (\n timeInserted,\n flowEndSeconds,\n flowEndSecondsFromSourceNode,\n
\ flowEndSecondsFromDestinationNode,\n sourcePodName,\n destinationPodName,\n
\ destinationIP,\n destinationServicePortName,\n flowType,\n
\ sourcePodNamespace,\n destinationPodNamespace)\n TTL timeInserted
+ INTERVAL 1 HOUR\n SETTINGS merge_with_ttl_timeout = 3600\n POPULATE\n
\ AS SELECT\n timeInserted,\n flowEndSeconds,\n flowEndSecondsFromSourceNode,\n
\ flowEndSecondsFromDestinationNode,\n sourcePodName,\n destinationPodName,\n
\ destinationIP,\n destinationServicePortName,\n flowType,\n
\ sourcePodNamespace,\n destinationPodNamespace,\n sum(octetDeltaCount)
AS octetDeltaCount,\n sum(reverseOctetDeltaCount) AS reverseOctetDeltaCount,\n
\ sum(throughput) AS throughput,\n sum(reverseThroughput) AS reverseThroughput,\n
\ sum(throughputFromSourceNode) AS throughputFromSourceNode,\n sum(throughputFromDestinationNode)
AS throughputFromDestinationNode\n FROM flows\n GROUP BY\n timeInserted,\n
3600;\n\n CREATE MATERIALIZED VIEW IF NOT EXISTS flows_pod_view\n ENGINE
= SummingMergeTree\n ORDER BY (\n timeInserted,\n flowEndSeconds,\n
\ flowEndSecondsFromSourceNode,\n flowEndSecondsFromDestinationNode,\n
\ sourcePodName,\n destinationPodName,\n destinationIP,\n
\ destinationServicePortName,\n flowType,\n sourcePodNamespace,\n
\ destinationPodNamespace)\n TTL timeInserted + INTERVAL 1 HOUR\n SETTINGS
merge_with_ttl_timeout = 3600\n POPULATE\n AS SELECT\n timeInserted,\n
\ flowEndSeconds,\n flowEndSecondsFromSourceNode,\n flowEndSecondsFromDestinationNode,\n
\ sourcePodName,\n destinationPodName,\n destinationIP,\n
\ destinationServicePortName,\n flowType,\n sourcePodNamespace,\n
\ destinationPodNamespace;\n\n CREATE MATERIALIZED VIEW flows_node_view\n
\ ENGINE = SummingMergeTree\n ORDER BY (\n timeInserted,\n flowEndSeconds,\n
\ destinationPodNamespace,\n sum(octetDeltaCount) AS octetDeltaCount,\n
\ sum(reverseOctetDeltaCount) AS reverseOctetDeltaCount,\n sum(throughput)
AS throughput,\n sum(reverseThroughput) AS reverseThroughput,\n sum(throughputFromSourceNode)
AS throughputFromSourceNode,\n sum(throughputFromDestinationNode) AS throughputFromDestinationNode\n
\ FROM flows\n GROUP BY\n timeInserted,\n flowEndSeconds,\n
\ flowEndSecondsFromSourceNode,\n flowEndSecondsFromDestinationNode,\n
\ sourcePodName,\n destinationPodName,\n destinationIP,\n
\ destinationServicePortName,\n flowType,\n sourcePodNamespace,\n
\ destinationPodNamespace;\n\n CREATE MATERIALIZED VIEW IF NOT EXISTS
flows_node_view\n ENGINE = SummingMergeTree\n ORDER BY (\n timeInserted,\n
\ flowEndSeconds,\n flowEndSecondsFromSourceNode,\n flowEndSecondsFromDestinationNode,\n
\ sourceNodeName,\n destinationNodeName,\n sourcePodNamespace,\n
\ destinationPodNamespace)\n TTL timeInserted + INTERVAL 1 HOUR\n SETTINGS
merge_with_ttl_timeout = 3600\n POPULATE\n AS SELECT\n timeInserted,\n
@@ -120,9 +121,9 @@ data:
AS reverseThroughputFromDestinationNode\n FROM flows\n GROUP BY\n timeInserted,\n
\ flowEndSeconds,\n flowEndSecondsFromSourceNode,\n flowEndSecondsFromDestinationNode,\n
\ sourceNodeName,\n destinationNodeName,\n sourcePodNamespace,\n
\ destinationPodNamespace;\n\n CREATE MATERIALIZED VIEW flows_policy_view\n
\ ENGINE = SummingMergeTree\n ORDER BY (\n timeInserted,\n flowEndSeconds,\n
\ flowEndSecondsFromSourceNode,\n flowEndSecondsFromDestinationNode,\n
\ destinationPodNamespace;\n\n CREATE MATERIALIZED VIEW IF NOT EXISTS
flows_policy_view\n ENGINE = SummingMergeTree\n ORDER BY (\n timeInserted,\n
\ flowEndSeconds,\n flowEndSecondsFromSourceNode,\n flowEndSecondsFromDestinationNode,\n
\ egressNetworkPolicyName,\n egressNetworkPolicyRuleAction,\n ingressNetworkPolicyName,\n
\ ingressNetworkPolicyRuleAction,\n sourcePodNamespace,\n destinationPodNamespace)\n
\ TTL timeInserted + INTERVAL 1 HOUR\n SETTINGS merge_with_ttl_timeout =
@@ -145,7 +146,7 @@ data:
\ ORDER BY (timeCreated);\n \nEOSQL\n"
kind: ConfigMap
metadata:
name: clickhouse-mounted-configmap-dkbmg82ctg
name: clickhouse-mounted-configmap-58fkkt9b56
namespace: flow-visibility
---
apiVersion: v1
@@ -4934,12 +4935,14 @@ spec:
value: default.flows
- name: MV_NAMES
value: default.flows_pod_view default.flows_node_view default.flows_policy_view
- name: STORAGE_SIZE
value: 8Gi
image: projects.registry.vmware.com/antrea/flow-visibility-clickhouse-monitor:latest
imagePullPolicy: IfNotPresent
name: clickhouse-monitor
volumes:
- configMap:
name: clickhouse-mounted-configmap-dkbmg82ctg
name: clickhouse-mounted-configmap-58fkkt9b56
name: clickhouse-configmap-volume
- emptyDir:
medium: Memory
8 changes: 2 additions & 6 deletions build/yamls/flow-visibility/base/clickhouse.yml
@@ -45,8 +45,6 @@ spec:
volumeMounts:
- name: clickhouse-configmap-volume
mountPath: /docker-entrypoint-initdb.d
- name: clickhouse-storage-volume
mountPath: /var/lib/clickhouse
- name: clickhouse-monitor
image: flow-visibility-clickhouse-monitor
env:
@@ -66,11 +64,9 @@ spec:
value: "default.flows"
- name: MV_NAMES
value: "default.flows_pod_view default.flows_node_view default.flows_policy_view"
- name: STORAGE_SIZE
value: "8Gi"
volumes:
- name: clickhouse-configmap-volume
configMap:
name: $(CLICKHOUSE_CONFIG_MAP_NAME)
- name: clickhouse-storage-volume
emptyDir:
medium: Memory
sizeLimit: 8Gi
@@ -72,7 +72,7 @@ clickhouse client -n -h 127.0.0.1 <<-EOSQL
TTL timeInserted + INTERVAL 1 HOUR
SETTINGS merge_with_ttl_timeout = 3600;
CREATE MATERIALIZED VIEW flows_pod_view
CREATE MATERIALIZED VIEW IF NOT EXISTS flows_pod_view
ENGINE = SummingMergeTree
ORDER BY (
timeInserted,
@@ -121,7 +121,7 @@ clickhouse client -n -h 127.0.0.1 <<-EOSQL
sourcePodNamespace,
destinationPodNamespace;
CREATE MATERIALIZED VIEW flows_node_view
CREATE MATERIALIZED VIEW IF NOT EXISTS flows_node_view
ENGINE = SummingMergeTree
ORDER BY (
timeInserted,
@@ -163,7 +163,7 @@ clickhouse client -n -h 127.0.0.1 <<-EOSQL
sourcePodNamespace,
destinationPodNamespace;
CREATE MATERIALIZED VIEW flows_policy_view
CREATE MATERIALIZED VIEW IF NOT EXISTS flows_policy_view
ENGINE = SummingMergeTree
ORDER BY (
timeInserted,
30 changes: 30 additions & 0 deletions build/yamls/flow-visibility/patches/pv/createLocalPv.yml
@@ -0,0 +1,30 @@
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: clickhouse-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
allowVolumeExpansion: True
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: clickhouse-pv
spec:
storageClassName: clickhouse-storage
capacity:
storage: 8Gi
accessModes:
- ReadWriteOnce
volumeMode: Filesystem
local:
path: LOCAL_PATH
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: clickhouse/instance
operator: In
values:
- data
23 changes: 23 additions & 0 deletions build/yamls/flow-visibility/patches/pv/createNfsPv.yml
@@ -0,0 +1,23 @@
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: clickhouse-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
allowVolumeExpansion: True
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: clickhouse-pv
spec:
storageClassName: clickhouse-storage
capacity:
storage: 8Gi
accessModes:
- ReadWriteOnce
volumeMode: Filesystem
nfs:
path: NFS_SERVER_PATH
server: NFS_SERVER_ADDRESS
14 changes: 14 additions & 0 deletions build/yamls/flow-visibility/patches/pv/mountPv.yml
@@ -0,0 +1,14 @@
- op: add
path: /spec/defaults/templates/dataVolumeClaimTemplate
value: clickhouse-storage-template
- op: add
path: /spec/templates/volumeClaimTemplates
value:
- name: clickhouse-storage-template
spec:
storageClassName: STORAGECLASS_NAME
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 8Gi
12 changes: 12 additions & 0 deletions build/yamls/flow-visibility/patches/ram/mountRam.yml
@@ -0,0 +1,12 @@
- op: add
path: /spec/templates/podTemplates/0/spec/volumes/-
value:
name: clickhouse-storage-volume
emptyDir:
medium: Memory
sizeLimit: 8Gi
- op: add
path: /spec/templates/podTemplates/0/spec/containers/0/volumeMounts/-
value:
name: clickhouse-storage-volume
mountPath: /var/lib/clickhouse
116 changes: 114 additions & 2 deletions docs/network-flow-visibility.md
@@ -33,7 +33,10 @@
- [About Grafana and ClickHouse](#about-grafana-and-clickhouse)
- [Deployment Steps](#deployment-steps-1)
- [Credentials Configuration](#credentials-configuration)
- [ClickHouse Configuration](#clickhouse-configuration)
- [ClickHouse Configuration](#clickhouse-configuration)
- [Service Customization](#service-customization)
- [Performance Configuration](#performance-configuration)
- [Persistent Volumes](#persistent-volumes)
- [Pre-built Dashboards](#pre-built-dashboards)
- [Flow Records Dashboard](#flow-records-dashboard)
- [Pod-to-Pod Flows Dashboard](#pod-to-pod-flows-dashboard)
@@ -750,7 +753,9 @@ type: Opaque
We recommend changing all the credentials above if you are going to run the Flow
Collector in production.

##### ClickHouse Configuration
#### ClickHouse Configuration

##### Service Customization

The ClickHouse database can be accessed through the Service `clickhouse-clickhouse`.
The Pod exposes HTTP port 8123 and TCP port 9000 by default. The ports are
@@ -800,6 +805,8 @@ metadata:
namespace: flow-visibility
```

##### Performance Configuration

The ClickHouse throughput depends on two factors: the ClickHouse storage size and
the time interval between batch commits to ClickHouse. A larger storage size and a
longer commit interval provide higher throughput.
@@ -819,11 +826,114 @@ storage size, please modify the `sizeLimit` in the following section.
name: clickhouse-storage-volume
```

To deploy ClickHouse with Persistent Volumes and limited storage size, please refer
to [Persistent Volumes](#persistent-volumes).

The time interval between batch commits to ClickHouse is specified in the
[Flow Aggregator Configuration](#configuration-1) as `commitInterval`. The
ClickHouse throughput grows slightly when the commit interval grows from 1s to 8s.
A commit interval larger than 8s provides little further improvement in throughput.
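
As a rough illustration, assuming the `clickHouse` section layout of
`flow-aggregator.conf` (the surrounding structure here is an assumption; refer to
the linked Flow Aggregator Configuration for the authoritative format), the commit
interval could be tuned as follows:

```yaml
# Hypothetical excerpt of flow-aggregator.conf; only commitInterval is the
# parameter discussed above, and the nesting may differ between Antrea versions.
clickHouse:
  # Interval between batch commits to ClickHouse; throughput gains flatten
  # out beyond 8s.
  commitInterval: "8s"
```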

##### Persistent Volumes

By default, ClickHouse is deployed in memory. Starting with Antrea v1.7, ClickHouse
supports deployment with Persistent Volumes.

A [`PersistentVolume`](https://kubernetes.io/docs/concepts/storage/persistent-volumes/)
(PV) is a piece of storage in the K8s cluster, which must be provisioned manually
by an administrator or dynamically using Storage Classes. A `PersistentVolumeClaim`
(PVC) is a request for storage which consumes a PV. As ClickHouse is deployed as a
StatefulSet, the volume can be claimed using a `volumeClaimTemplate`.

To deploy ClickHouse with Persistent Volumes, please follow the steps below:

1. Provision the `PersistentVolume`. K8s supports a great number of
[`PersistentVolume` types](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#types-of-persistent-volumes).
You can provision your own `PersistentVolume` per your requirements. Here are
two simple samples for your reference.

- Local PV allows you to store the ClickHouse data at a pre-defined path on
a specific node. Refer to [createLocalPv.yml][local_pv_yaml] to create the
PV. Please replace `LOCAL_PATH` with the path to store the ClickHouse data
and label the node used to store the ClickHouse data with
`clickhouse/instance=data`.

- NFS PV allows you to store the ClickHouse data on an existing NFS server.
Refer to [createNfsPv.yml][nfs_pv_yaml] to create the PV. Please replace
`NFS_SERVER_ADDRESS` with the host name of the NFS server and `NFS_SERVER_PATH`
with the exported path on the NFS server.

In both examples, you can set `.spec.capacity.storage` in the `PersistentVolume`
to your storage size. This value is for informative purposes only, as K8s does not
enforce the capacity of PVs. If you want to limit the storage usage, you need to
ask your storage system to enforce that. For example, you can create a Local PV on
a partition with limited size. We recommend using dedicated storage space for
ClickHouse if you are going to run the Flow Collector in production.

1. Request the PV for ClickHouse. Please add a `volumeClaimTemplates` section
under `.spec.templates` to the `ClickHouseInstallation` resource in
`flow-visibility.yml`, as in the sample below. `storageClassName` should be set
to your own `StorageClass` name, and `.resources.requests.storage` should be set
to your storage size.

```yaml
volumeClaimTemplates:
- name: clickhouse-storage-template
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 8Gi
storageClassName: clickhouse-storage
```

Then reference this template as the `dataVolumeClaimTemplate` in the `defaults.templates` section, as shown below.

```yaml
defaults:
templates:
dataVolumeClaimTemplate: clickhouse-storage-template
podTemplate: pod-template
serviceTemplate: service-template
```

1. Remove the content related to the in-memory deployment, which is defined as a
`volumeMounts` entry and a `volumes` entry in the `ClickHouseInstallation`
resource in `flow-visibility.yml`.

The `volumeMounts` entry to be removed is shown below.

```yaml
- mountPath: /var/lib/clickhouse
name: clickhouse-storage-volume
```

The `volumes` entry to be removed is shown below.

```yaml
- emptyDir:
medium: Memory
sizeLimit: 8Gi
name: clickhouse-storage-volume
```
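
For reference, after the three steps above, the storage-related portion of the
`ClickHouseInstallation` spec could look like the minimal sketch below (the
`clickhouse-storage` class name and the 8Gi request are illustrative; adjust them
to your own `StorageClass` and storage size):

```yaml
spec:
  defaults:
    templates:
      # Claim ClickHouse data storage from the template defined further down.
      dataVolumeClaimTemplate: clickhouse-storage-template
      podTemplate: pod-template
      serviceTemplate: service-template
  templates:
    volumeClaimTemplates:
    - name: clickhouse-storage-template
      spec:
        storageClassName: clickhouse-storage
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 8Gi
```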

If you prefer to generate the manifest automatically with default settings,
please clone the repository and run one of the following commands:

```bash
# To generate a manifest with Local PV for the ClickHouse
./hack/generate-manifest-flow-visibility.sh --volume pv --local <local_path> > flow-visibility.yml

# To generate a manifest with NFS PV for the ClickHouse
./hack/generate-manifest-flow-visibility.sh --volume pv --nfs <nfs_server_address>:/<nfs_server_path> > flow-visibility.yml

# To generate a manifest with a customized StorageClass for the ClickHouse
./hack/generate-manifest-flow-visibility.sh --volume pv --storageclass <storageclass_name> > flow-visibility.yml
```
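
The `StorageClass` passed to `--storageclass` is the same kind of object defined in
the PV samples above. As a purely illustrative sketch (the provisioner and other
fields depend entirely on your storage backend):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: clickhouse-storage          # pass this name to --storageclass
# Static provisioning as in the PV samples above; replace with your
# dynamic provisioner if you rely on one.
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
allowVolumeExpansion: true
```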

#### Pre-built Dashboards

The following dashboards are pre-built and are recommended for Antrea flow
@@ -1108,3 +1218,5 @@ With filters applied:
Visualization Network Policy Dashboard">

[flow_visibility_kustomization_yaml]: ../build/yamls/flow-visibility/base/kustomization.yml
[local_pv_yaml]: ../build/yamls/flow-visibility/patches/pv/createLocalPv.yml
[nfs_pv_yaml]: ../build/yamls/flow-visibility/patches/pv/createNfsPv.yml
