From 31e6763416b4eae8da872dfa5905842661ffa6a7 Mon Sep 17 00:00:00 2001 From: Felix Hennig Date: Mon, 23 Jan 2023 11:58:14 +0100 Subject: [PATCH 01/13] docs: reorganized usage guide and 'implementation.adoc' --- docs/modules/ROOT/nav.adoc | 8 +- docs/modules/ROOT/pages/implementation.adoc | 32 --- docs/modules/ROOT/pages/index.adoc | 28 +- .../configuration-environment-overrides.adoc | 75 +++++ .../modules/ROOT/pages/usage-guide/index.adoc | 1 + .../usage-guide/logging-log-aggregation.adoc | 25 ++ .../ROOT/pages/usage-guide/monitoring.adoc | 10 + .../ROOT/pages/usage-guide/resources.adoc | 87 ++++++ .../ROOT/pages/usage-guide/scaling.adoc | 3 + docs/modules/ROOT/pages/usage.adoc | 264 ------------------ 10 files changed, 232 insertions(+), 301 deletions(-) delete mode 100644 docs/modules/ROOT/pages/implementation.adoc create mode 100644 docs/modules/ROOT/pages/usage-guide/configuration-environment-overrides.adoc create mode 100644 docs/modules/ROOT/pages/usage-guide/index.adoc create mode 100644 docs/modules/ROOT/pages/usage-guide/logging-log-aggregation.adoc create mode 100644 docs/modules/ROOT/pages/usage-guide/monitoring.adoc create mode 100644 docs/modules/ROOT/pages/usage-guide/resources.adoc create mode 100644 docs/modules/ROOT/pages/usage-guide/scaling.adoc delete mode 100644 docs/modules/ROOT/pages/usage.adoc diff --git a/docs/modules/ROOT/nav.adoc b/docs/modules/ROOT/nav.adoc index 2a7fca20..7ce4a64a 100644 --- a/docs/modules/ROOT/nav.adoc +++ b/docs/modules/ROOT/nav.adoc @@ -1,5 +1,9 @@ * xref:configuration.adoc[] -* xref:usage.adoc[] -* xref:implementation.adoc[] * Concepts ** xref:discovery.adoc[] +* xref:usage-guide/index.adoc[] +** xref:usage-guide/logging-log-aggregation.adoc[] +** xref:usage-guide/monitoring.adoc[] +** xref:usage-guide/resources.adoc[] +** xref:usage-guide/scaling.adoc[] +** xref:usage-guide/configuration-environment-overrides.adoc[] diff --git a/docs/modules/ROOT/pages/implementation.adoc b/docs/modules/ROOT/pages/implementation.adoc deleted file mode 100644 index a6eedeb3..00000000 --- a/docs/modules/ROOT/pages/implementation.adoc +++ /dev/null @@ -1,32 +0,0 @@ -= Implementation - -== Kubernetes objects - -This operator can be used to set up a highly available HDFS cluster. It implements three roles of the HDFS cluster: - -* Data Node - responsible for holding the actual data. *IMPORTANT* Currently the data is kept in `hostPath` volumes of 1GB and no configuration options are exposed to the user. Each data node has its own volume. -* Journal Node - responsible for keeping track of HDFS blocks and used to perform failovers in case the active name node fails. For details see: https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html -* Name Node - responsible for keeping track of HDFS blocks and providing access to the data. - -The operator creates the following K8S objects per role group defined in the custom resource. - -* ClusterIP - used for intra-cluster communication. -* ConfigMap - HDFS configuration files like `core-site.xml` and `hdfs-site.xml` are defined here and mounted in the pods. -* StatefulSet - where the replica count of each role group is defined. By default, a cluster will have 2 name nodes, 3 journal nodes and 3 data nodes. - -In addition, a `NodePort` service is created for each pod that exposes all container ports to the outside world (from the perspective of K8S). 
- -== HDFS - -In the custom resource you can specify the number of replicas per role group (name node, data node or journal node) but the operator will make sure that: -* at least two name nodes are started -* at least one journal node is started -* no datanodes are started unless the number of replicas is greater than zero. - -== Monitoring - -The cluster can be monitored with Prometheus from inside or outside the K8S cluster. - -All services (with the exception of the Zookeeper daemon on the node names) run with the JMX exporter agent enabled and expose metrics on the `metrics` port. This port is available form the container level up to the NodePort services. - -The metrics endpoints are also used as liveliness probes by K8S. diff --git a/docs/modules/ROOT/pages/index.adoc b/docs/modules/ROOT/pages/index.adoc index 08729bc3..40fbf701 100644 --- a/docs/modules/ROOT/pages/index.adoc +++ b/docs/modules/ROOT/pages/index.adoc @@ -1,18 +1,40 @@ = Stackable Operator for Apache HDFS -This is an operator for Kubernetes that can manage https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html[Apache HDFS] clusters. +The Stackable Operator for https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html[Apache HDFS] is used to set up HFDS in high-availability mode. It depends on the xref:zookeeper:ROOT:index.adoc[] to operate a ZooKeeper cluster to coordinate the active/passive name node. WARNING: This operator only works with images from the https://repo.stackable.tech/#browse/browse:docker:v2%2Fstackable%2Fhadoop[Stackable] repository +== Kubernetes objects + +This operator can be used to set up a highly available HDFS cluster. It implements three roles of the HDFS cluster: + +* Data Node - responsible for holding the actual data. *IMPORTANT* Currently the data is kept in `hostPath` volumes of 1GB and no configuration options are exposed to the user. Each data node has its own volume. +* Journal Node - responsible for keeping track of HDFS blocks and used to perform failovers in case the active name node fails. For details see: https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html +* Name Node - responsible for keeping track of HDFS blocks and providing access to the data. + +The operator creates the following K8S objects per role group defined in the custom resource. + +* ClusterIP - used for intra-cluster communication. +* ConfigMap - HDFS configuration files like `core-site.xml` and `hdfs-site.xml` are defined here and mounted in the pods. +* StatefulSet - where the replica count of each role group is defined. By default, a cluster will have 2 name nodes, 3 journal nodes and 3 data nodes. + +In addition, a `NodePort` service is created for each pod that exposes all container ports to the outside world (from the perspective of K8S). + +In the custom resource you can specify the number of replicas per role group (name node, data node or journal node) but the operator will make sure that: + +* at least two name nodes are started +* at least one journal node is started +* no data nodes are started unless the number of replicas is greater than zero. 
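To make these constraints concrete, a minimal cluster definition could look like the following sketch. The role group layout, `zookeeperConfigMapName` and `dfsReplication` follow the examples used elsewhere in this documentation; the metadata name, the ZooKeeper discovery ConfigMap name and the version numbers are illustrative assumptions, so consult the getting started guide and the CRD reference of your operator release for the exact schema.

[source,yaml]
----
apiVersion: hdfs.stackable.tech/v1alpha1
kind: HdfsCluster
metadata:
  name: simple-hdfs                          # placeholder name
spec:
  image:
    productVersion: 3.3.4                    # assumption: pick a supported HDFS version
    stackableVersion: 23.1.0                 # assumption: match your platform release
  zookeeperConfigMapName: simple-hdfs-znode  # discovery ConfigMap of the ZooKeeper ZNode (assumed name)
  dfsReplication: 1
  nameNodes:
    roleGroups:
      default:
        replicas: 2                          # at least two name nodes for HA
  journalNodes:
    roleGroups:
      default:
        replicas: 1                          # at least one journal node
  dataNodes:
    roleGroups:
      default:
        replicas: 1                          # should cover the dfsReplication factor
----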
+ == Supported Versions The Stackable Operator for Apache HDFS currently supports the following versions of HDFS: include::partial$supported-versions.adoc[] -== Docker +== Docker image [source] ---- docker pull docker.stackable.tech/stackable/hadoop: ----- \ No newline at end of file +---- diff --git a/docs/modules/ROOT/pages/usage-guide/configuration-environment-overrides.adoc b/docs/modules/ROOT/pages/usage-guide/configuration-environment-overrides.adoc new file mode 100644 index 00000000..4cb705e7 --- /dev/null +++ b/docs/modules/ROOT/pages/usage-guide/configuration-environment-overrides.adoc @@ -0,0 +1,75 @@ + += Configuration & Environment Overrides + +The cluster definition also supports overriding configuration properties and environment variables, either per role or per role group, where the more specific override (role group) has precedence over the less specific one (role). + +IMPORTANT: Overriding certain properties can lead to faulty clusters. In general this means, do not change ports, hostnames or properties related to data dirs, high-availability or security. + +== Configuration Properties + +For a role or role group, at the same level of `config`, you can specify `configOverrides` for the `hdfs-site.xml` and `core-site.xml`. For example, if you want to set additional properties on the namenode servers, adapt the `nameNodes` section of the cluster resource like so: + +[source,yaml] +---- +nameNodes: + roleGroups: + default: + config: [...] + configOverrides: + core-site.xml: + fs.trash.interval: "5" + hdfs-site.xml: + dfs.namenode.num.checkpoints.retained: "3" + replicas: 2 +---- + +Just as for the `config`, it is possible to specify this at role level as well: + +[source,yaml] +---- +nameNodes: + configOverrides: + core-site.xml: + fs.trash.interval: "5" + hdfs-site.xml: + dfs.namenode.num.checkpoints.retained: "3" + roleGroups: + default: + config: [...] + replicas: 2 +---- + +All override property values must be strings. The properties will be formatted and escaped correctly into the XML file. + +For a full list of configuration options we refer to the Apache Hdfs documentation for https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml[hdfs-site.xml] and https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/core-default.xml[core-site.xml] + + +== Environment Variables + +In a similar fashion, environment variables can be (over)written. For example per role group: + +[source,yaml] +---- +nameNodes: + roleGroups: + default: + config: {} + envOverrides: + MY_ENV_VAR: "MY_VALUE" + replicas: 1 +---- + +or per role: + +[source,yaml] +---- +nameNodes: + envOverrides: + MY_ENV_VAR: "MY_VALUE" + roleGroups: + default: + config: {} + replicas: 1 +---- + +IMPORTANT: Some environment variables will be overriden by the operator and cannot be set manually by the user. These are `HADOOP_HOME`, `HADOOP_CONF_DIR`, `POD_NAME` and `ZOOKEEPER`. 
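Returning to the configuration properties section above: override values are plain strings in the custom resource and are rendered into standard Hadoop property elements in the generated XML files. As a rough sketch (standard Hadoop configuration syntax, not necessarily the operator's exact output formatting), the `fs.trash.interval` override from the example would end up in `core-site.xml` like this:

[source,xml]
----
<configuration>
  <!-- other generated properties ... -->
  <property>
    <name>fs.trash.interval</name>
    <value>5</value>
  </property>
</configuration>
----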
diff --git a/docs/modules/ROOT/pages/usage-guide/index.adoc b/docs/modules/ROOT/pages/usage-guide/index.adoc new file mode 100644 index 00000000..146d37bc --- /dev/null +++ b/docs/modules/ROOT/pages/usage-guide/index.adoc @@ -0,0 +1 @@ += Usage Guide diff --git a/docs/modules/ROOT/pages/usage-guide/logging-log-aggregation.adoc b/docs/modules/ROOT/pages/usage-guide/logging-log-aggregation.adoc new file mode 100644 index 00000000..f150e806 --- /dev/null +++ b/docs/modules/ROOT/pages/usage-guide/logging-log-aggregation.adoc @@ -0,0 +1,25 @@ += Logging & log aggregation + +The logs can be forwarded to a Vector log aggregator by providing a discovery +ConfigMap for the aggregator and by enabling the log agent: + +[source,yaml] +---- +spec: + vectorAggregatorConfigMapName: vector-aggregator-discovery + nameNodes: + config: + logging: + enableVectorAgent: true + dataNodes: + config: + logging: + enableVectorAgent: true + journalNodes: + config: + logging: + enableVectorAgent: true +---- + +Further information on how to configure logging, can be found in +xref:home:concepts:logging.adoc[]. diff --git a/docs/modules/ROOT/pages/usage-guide/monitoring.adoc b/docs/modules/ROOT/pages/usage-guide/monitoring.adoc new file mode 100644 index 00000000..72cac8f5 --- /dev/null +++ b/docs/modules/ROOT/pages/usage-guide/monitoring.adoc @@ -0,0 +1,10 @@ += Monitoring + +The cluster can be monitored with Prometheus from inside or outside the K8S cluster. + +All services (with the exception of the Zookeeper daemon on the node names) run with the JMX exporter agent enabled and expose metrics on the `metrics` port. This port is available form the container level up to the NodePort services. + +The metrics endpoints are also used as liveliness probes by K8S. + +The managed HDFS instances are automatically configured to export Prometheus metrics. See +xref:home:operators:monitoring.adoc[] for more details. diff --git a/docs/modules/ROOT/pages/usage-guide/resources.adoc b/docs/modules/ROOT/pages/usage-guide/resources.adoc new file mode 100644 index 00000000..a0d08ee0 --- /dev/null +++ b/docs/modules/ROOT/pages/usage-guide/resources.adoc @@ -0,0 +1,87 @@ += Resources + +== Storage for data volumes + +You can mount volumes where data is stored by specifying https://kubernetes.io/docs/concepts/storage/persistent-volumes[PersistentVolumeClaims] for each individual role group: + +[source,yaml] +---- +dataNodes: + roleGroups: + default: + config: + resources: + storage: + data: + capacity: 128Gi +---- + +In the above example, all data nodes in the default group will store data (the location of `dfs.datanode.name.dir`) on a `128Gi` volume. + +By default, in case nothing is configured in the custom resource for a certain role group, each Pod will have a `5Gi` large volume mount for the data location. + +=== Multiple storage volumes + +Datanodes can have multiple disks attached to increase the storage size as well as speed. +They can be of different type, e.g. HDDs or SSDs. + +You can configure multiple [PersistentVolumeClaims](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#persistentvolumeclaims) (PVCs) for the datanodes as follows: + +[source,yaml] +---- +dataNodes: + roleGroups: + default: + config: + resources: + storage: + data: # We need to overwrite the data pvcs coming from the default value + count: 0 + my-disks: + count: 3 + capacity: 12Ti + hdfsStorageType: Disk + my-ssds: + count: 2 + capacity: 5Ti + storageClass: premium-ssd + hdfsStorageType: SSD +---- + +This will create the following PVCs: + +1. 
`my-disks-hdfs-datanode-default-0` (12Ti) +2. `my-disks-1-hdfs-datanode-default-0` (12Ti) +3. `my-disks-2-hdfs-datanode-default-0` (12Ti) +4. `my-ssds-hdfs-datanode-default-0` (5Ti) +5. `my-ssds-1-hdfs-datanode-default-0` (5Ti) + +By configuring and using a dedicated https://kubernetes.io/docs/concepts/storage/storage-classes/[StorageClass] you can configure your HDFS to use local disks attached to Kubernetes nodes. + +[NOTE] +==== +You might need to re-create the StatefulSet to apply the new PVC configuration because of https://github.com/kubernetes/kubernetes/issues/68737[this Kubernetes issue]. +You can delete the StatefulSet using `kubectl delete sts --cascade=false `. +The hdfs-operator will re-create the StatefulSet automatically. +==== + +== Resource Requests + +include::home:concepts:stackable_resource_requests.adoc[] + +If no resource requests are configured explicitly, the HDFS operator uses the following defaults: + +[source,yaml] +---- +dataNodes: + roleGroups: + default: + config: + resources: + cpu: + max: '4' + min: '100m' + storage: + data: + capacity: 2Gi +---- diff --git a/docs/modules/ROOT/pages/usage-guide/scaling.adoc b/docs/modules/ROOT/pages/usage-guide/scaling.adoc new file mode 100644 index 00000000..62ed0972 --- /dev/null +++ b/docs/modules/ROOT/pages/usage-guide/scaling.adoc @@ -0,0 +1,3 @@ += Scaling + +When scaling namenodes up, make sure to increase the replica count only by one and not more nodes at once. diff --git a/docs/modules/ROOT/pages/usage.adoc b/docs/modules/ROOT/pages/usage.adoc deleted file mode 100644 index e9ecce8d..00000000 --- a/docs/modules/ROOT/pages/usage.adoc +++ /dev/null @@ -1,264 +0,0 @@ -= Usage - -Since Apache Hdfs is installed in high-availability mode, an Apache Zookeeper cluster is required to coordinate the active/passive namenode. - -Install the Stackable Zookeeper operator and an Apache Zookeeper cluster like this: - -[source,bash] ----- -helm install zookeeper-operator stackable/zookeeper-operator -cat <`. -The hdfs-operator will re-create the StatefulSet automatically. -==== - -=== Resource Requests - -// The "nightly" version is needed because the "include" directive searches for -// files in the "stable" version by default. 
-// TODO: remove the "nightly" version after the next platform release (current: 22.09) -include::nightly@home:concepts:stackable_resource_requests.adoc[] - -If no resource requests are configured explicitly, the HDFS operator uses the following defaults: - -[source,yaml] ----- -dataNodes: - roleGroups: - default: - config: - resources: - cpu: - max: '4' - min: '100m' - storage: - data: - capacity: 2Gi ----- From a44fd0675f4c68f107e5077fdd89a39b1569c71d Mon Sep 17 00:00:00 2001 From: Felix Hennig Date: Mon, 23 Jan 2023 12:14:54 +0100 Subject: [PATCH 02/13] more minor changes --- docs/modules/ROOT/nav.adoc | 2 +- docs/modules/ROOT/pages/index.adoc | 8 +++++--- docs/modules/ROOT/pages/usage-guide/index.adoc | 2 ++ docs/modules/ROOT/pages/usage-guide/monitoring.adoc | 3 +-- docs/modules/ROOT/pages/usage-guide/resources.adoc | 2 +- 5 files changed, 10 insertions(+), 7 deletions(-) diff --git a/docs/modules/ROOT/nav.adoc b/docs/modules/ROOT/nav.adoc index 7ce4a64a..0c7df2fd 100644 --- a/docs/modules/ROOT/nav.adoc +++ b/docs/modules/ROOT/nav.adoc @@ -2,8 +2,8 @@ * Concepts ** xref:discovery.adoc[] * xref:usage-guide/index.adoc[] +** xref:usage-guide/resources.adoc[] ** xref:usage-guide/logging-log-aggregation.adoc[] ** xref:usage-guide/monitoring.adoc[] -** xref:usage-guide/resources.adoc[] ** xref:usage-guide/scaling.adoc[] ** xref:usage-guide/configuration-environment-overrides.adoc[] diff --git a/docs/modules/ROOT/pages/index.adoc b/docs/modules/ROOT/pages/index.adoc index 40fbf701..416aacd7 100644 --- a/docs/modules/ROOT/pages/index.adoc +++ b/docs/modules/ROOT/pages/index.adoc @@ -2,16 +2,18 @@ The Stackable Operator for https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html[Apache HDFS] is used to set up HFDS in high-availability mode. It depends on the xref:zookeeper:ROOT:index.adoc[] to operate a ZooKeeper cluster to coordinate the active/passive name node. -WARNING: This operator only works with images from the https://repo.stackable.tech/#browse/browse:docker:v2%2Fstackable%2Fhadoop[Stackable] repository +NOTE: This operator only works with images from the https://repo.stackable.tech/#browse/browse:docker:v2%2Fstackable%2Fhadoop[Stackable] repository -== Kubernetes objects +== Roles -This operator can be used to set up a highly available HDFS cluster. It implements three roles of the HDFS cluster: +Three xref:home:concepts:roles-and-role-groups.adoc[roles] of the HDFS cluster are implemented: * Data Node - responsible for holding the actual data. *IMPORTANT* Currently the data is kept in `hostPath` volumes of 1GB and no configuration options are exposed to the user. Each data node has its own volume. * Journal Node - responsible for keeping track of HDFS blocks and used to perform failovers in case the active name node fails. For details see: https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html * Name Node - responsible for keeping track of HDFS blocks and providing access to the data. +== Kubernetes objects + The operator creates the following K8S objects per role group defined in the custom resource. * ClusterIP - used for intra-cluster communication. 
diff --git a/docs/modules/ROOT/pages/usage-guide/index.adoc b/docs/modules/ROOT/pages/usage-guide/index.adoc index 146d37bc..e58bdf11 100644 --- a/docs/modules/ROOT/pages/usage-guide/index.adoc +++ b/docs/modules/ROOT/pages/usage-guide/index.adoc @@ -1 +1,3 @@ = Usage Guide + +This Section will help you to use and configure the Stackable Operator for Apache HDFS in various ways. You should already be familiar with how to set up a basic instance. Follow the xref:getting_started:index.adoc[] guide to learn how to set up a basic instance with all the required dependencies (for example ZooKeeper). diff --git a/docs/modules/ROOT/pages/usage-guide/monitoring.adoc b/docs/modules/ROOT/pages/usage-guide/monitoring.adoc index 72cac8f5..6a72f164 100644 --- a/docs/modules/ROOT/pages/usage-guide/monitoring.adoc +++ b/docs/modules/ROOT/pages/usage-guide/monitoring.adoc @@ -6,5 +6,4 @@ All services (with the exception of the Zookeeper daemon on the node names) run The metrics endpoints are also used as liveliness probes by K8S. -The managed HDFS instances are automatically configured to export Prometheus metrics. See -xref:home:operators:monitoring.adoc[] for more details. +See xref:home:operators:monitoring.adoc[] for more details. diff --git a/docs/modules/ROOT/pages/usage-guide/resources.adoc b/docs/modules/ROOT/pages/usage-guide/resources.adoc index a0d08ee0..275d94e8 100644 --- a/docs/modules/ROOT/pages/usage-guide/resources.adoc +++ b/docs/modules/ROOT/pages/usage-guide/resources.adoc @@ -25,7 +25,7 @@ By default, in case nothing is configured in the custom resource for a certain r Datanodes can have multiple disks attached to increase the storage size as well as speed. They can be of different type, e.g. HDDs or SSDs. -You can configure multiple [PersistentVolumeClaims](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#persistentvolumeclaims) (PVCs) for the datanodes as follows: +You can configure multiple https://kubernetes.io/docs/concepts/storage/persistent-volumes/#persistentvolumeclaims[PersistentVolumeClaims] (PVCs) for the datanodes as follows: [source,yaml] ---- From 66b522dadcb26801473275e99026fa59850297bb Mon Sep 17 00:00:00 2001 From: Felix Hennig Date: Mon, 23 Jan 2023 14:49:28 +0100 Subject: [PATCH 03/13] Update docs/modules/ROOT/pages/index.adoc Co-authored-by: Malte Sander --- docs/modules/ROOT/pages/index.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/modules/ROOT/pages/index.adoc b/docs/modules/ROOT/pages/index.adoc index 416aacd7..413d10d5 100644 --- a/docs/modules/ROOT/pages/index.adoc +++ b/docs/modules/ROOT/pages/index.adoc @@ -1,6 +1,6 @@ = Stackable Operator for Apache HDFS -The Stackable Operator for https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html[Apache HDFS] is used to set up HFDS in high-availability mode. It depends on the xref:zookeeper:ROOT:index.adoc[] to operate a ZooKeeper cluster to coordinate the active/passive name node. +The Stackable Operator for https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html[Apache HDFS] is used to set up HFDS in high-availability mode. It depends on the xref:zookeeper:ROOT:index.adoc[] to operate a ZooKeeper cluster to coordinate the active and standby name nodes. 
NOTE: This operator only works with images from the https://repo.stackable.tech/#browse/browse:docker:v2%2Fstackable%2Fhadoop[Stackable] repository From 5fa2d78e891bce4bf2678dd28db2c9eff615c931 Mon Sep 17 00:00:00 2001 From: Felix Hennig Date: Mon, 23 Jan 2023 14:49:40 +0100 Subject: [PATCH 04/13] Update docs/modules/ROOT/pages/index.adoc Co-authored-by: Malte Sander --- docs/modules/ROOT/pages/index.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/modules/ROOT/pages/index.adoc b/docs/modules/ROOT/pages/index.adoc index 413d10d5..e64276fc 100644 --- a/docs/modules/ROOT/pages/index.adoc +++ b/docs/modules/ROOT/pages/index.adoc @@ -8,7 +8,7 @@ NOTE: This operator only works with images from the https://repo.stackable.tech/ Three xref:home:concepts:roles-and-role-groups.adoc[roles] of the HDFS cluster are implemented: -* Data Node - responsible for holding the actual data. *IMPORTANT* Currently the data is kept in `hostPath` volumes of 1GB and no configuration options are exposed to the user. Each data node has its own volume. +* Data Node - responsible for storing the actual data. *IMPORTANT* Currently the data is kept in `hostPath` volumes of 1GB and no configuration options are exposed to the user. Each data node has its own volume. * Journal Node - responsible for keeping track of HDFS blocks and used to perform failovers in case the active name node fails. For details see: https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html * Name Node - responsible for keeping track of HDFS blocks and providing access to the data. From 68ab22f9f9ccb49c5d76e068417b119152d4ecf6 Mon Sep 17 00:00:00 2001 From: Felix Hennig Date: Mon, 23 Jan 2023 14:49:57 +0100 Subject: [PATCH 05/13] Update docs/modules/ROOT/pages/index.adoc Co-authored-by: Malte Sander --- docs/modules/ROOT/pages/index.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/modules/ROOT/pages/index.adoc b/docs/modules/ROOT/pages/index.adoc index e64276fc..dbe8e624 100644 --- a/docs/modules/ROOT/pages/index.adoc +++ b/docs/modules/ROOT/pages/index.adoc @@ -16,7 +16,7 @@ Three xref:home:concepts:roles-and-role-groups.adoc[roles] of the HDFS cluster a The operator creates the following K8S objects per role group defined in the custom resource. -* ClusterIP - used for intra-cluster communication. +* Service - ClusterIP used for intra-cluster communication. * ConfigMap - HDFS configuration files like `core-site.xml` and `hdfs-site.xml` are defined here and mounted in the pods. * StatefulSet - where the replica count of each role group is defined. By default, a cluster will have 2 name nodes, 3 journal nodes and 3 data nodes. From 0c7a5dc6ecc8623eb65f1b4821988a838a7e0c62 Mon Sep 17 00:00:00 2001 From: Felix Hennig Date: Mon, 23 Jan 2023 14:51:43 +0100 Subject: [PATCH 06/13] Update docs/modules/ROOT/pages/index.adoc Co-authored-by: Malte Sander --- docs/modules/ROOT/pages/index.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/modules/ROOT/pages/index.adoc b/docs/modules/ROOT/pages/index.adoc index dbe8e624..d3324982 100644 --- a/docs/modules/ROOT/pages/index.adoc +++ b/docs/modules/ROOT/pages/index.adoc @@ -20,7 +20,7 @@ The operator creates the following K8S objects per role group defined in the cus * ConfigMap - HDFS configuration files like `core-site.xml` and `hdfs-site.xml` are defined here and mounted in the pods. * StatefulSet - where the replica count of each role group is defined. 
By default, a cluster will have 2 name nodes, 3 journal nodes and 3 data nodes. -In addition, a `NodePort` service is created for each pod that exposes all container ports to the outside world (from the perspective of K8S). +In addition, a `NodePort` service is created for each pod labeled with `hdfs.stackable.tech/pod-service=true` that exposes all container ports to the outside world (from the perspective of K8S). In the custom resource you can specify the number of replicas per role group (name node, data node or journal node) but the operator will make sure that: From c8b4eea21c5b98da2f0f6a608d54e529260fabbc Mon Sep 17 00:00:00 2001 From: Felix Hennig Date: Mon, 23 Jan 2023 14:52:02 +0100 Subject: [PATCH 07/13] Update docs/modules/ROOT/pages/usage-guide/monitoring.adoc Co-authored-by: Malte Sander --- docs/modules/ROOT/pages/usage-guide/monitoring.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/modules/ROOT/pages/usage-guide/monitoring.adoc b/docs/modules/ROOT/pages/usage-guide/monitoring.adoc index 6a72f164..1ae7cca2 100644 --- a/docs/modules/ROOT/pages/usage-guide/monitoring.adoc +++ b/docs/modules/ROOT/pages/usage-guide/monitoring.adoc @@ -2,7 +2,7 @@ The cluster can be monitored with Prometheus from inside or outside the K8S cluster. -All services (with the exception of the Zookeeper daemon on the node names) run with the JMX exporter agent enabled and expose metrics on the `metrics` port. This port is available form the container level up to the NodePort services. +All services (with the exception of the Zookeeper daemon on the node names) run with the JMX exporter agent enabled and expose metrics on the `metrics` port. This port is available from the container level up to the NodePort services. The metrics endpoints are also used as liveliness probes by K8S. From de6a8bd9d50ff2fd3445a522faf79360f0b16409 Mon Sep 17 00:00:00 2001 From: Felix Hennig Date: Mon, 23 Jan 2023 15:02:16 +0100 Subject: [PATCH 08/13] fix: removed incorrect statement about data nodes --- docs/modules/ROOT/pages/index.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/modules/ROOT/pages/index.adoc b/docs/modules/ROOT/pages/index.adoc index d3324982..b04b0924 100644 --- a/docs/modules/ROOT/pages/index.adoc +++ b/docs/modules/ROOT/pages/index.adoc @@ -8,7 +8,7 @@ NOTE: This operator only works with images from the https://repo.stackable.tech/ Three xref:home:concepts:roles-and-role-groups.adoc[roles] of the HDFS cluster are implemented: -* Data Node - responsible for storing the actual data. *IMPORTANT* Currently the data is kept in `hostPath` volumes of 1GB and no configuration options are exposed to the user. Each data node has its own volume. +* Data Node - responsible for storing the actual data. * Journal Node - responsible for keeping track of HDFS blocks and used to perform failovers in case the active name node fails. For details see: https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html * Name Node - responsible for keeping track of HDFS blocks and providing access to the data. 
From 9957d54d6b97e2ba7aa1f0f571e0ce3dbf5475b6 Mon Sep 17 00:00:00 2001 From: Felix Hennig Date: Mon, 23 Jan 2023 15:06:30 +0100 Subject: [PATCH 09/13] added log4j.properties --- docs/modules/ROOT/pages/index.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/modules/ROOT/pages/index.adoc b/docs/modules/ROOT/pages/index.adoc index b04b0924..a77a4e30 100644 --- a/docs/modules/ROOT/pages/index.adoc +++ b/docs/modules/ROOT/pages/index.adoc @@ -17,7 +17,7 @@ Three xref:home:concepts:roles-and-role-groups.adoc[roles] of the HDFS cluster a The operator creates the following K8S objects per role group defined in the custom resource. * Service - ClusterIP used for intra-cluster communication. -* ConfigMap - HDFS configuration files like `core-site.xml` and `hdfs-site.xml` are defined here and mounted in the pods. +* ConfigMap - HDFS configuration files like `core-site.xml`, `hdfs-site.xml` and `log4j.properties` are defined here and mounted in the pods. * StatefulSet - where the replica count of each role group is defined. By default, a cluster will have 2 name nodes, 3 journal nodes and 3 data nodes. In addition, a `NodePort` service is created for each pod labeled with `hdfs.stackable.tech/pod-service=true` that exposes all container ports to the outside world (from the perspective of K8S). From c2c24bcc0b18a53e022f0208165446e371e5bb34 Mon Sep 17 00:00:00 2001 From: Felix Hennig Date: Mon, 23 Jan 2023 15:07:50 +0100 Subject: [PATCH 10/13] Removed incorrect statement about defaults --- docs/modules/ROOT/pages/index.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/modules/ROOT/pages/index.adoc b/docs/modules/ROOT/pages/index.adoc index a77a4e30..9dd344c5 100644 --- a/docs/modules/ROOT/pages/index.adoc +++ b/docs/modules/ROOT/pages/index.adoc @@ -18,7 +18,7 @@ The operator creates the following K8S objects per role group defined in the cus * Service - ClusterIP used for intra-cluster communication. * ConfigMap - HDFS configuration files like `core-site.xml`, `hdfs-site.xml` and `log4j.properties` are defined here and mounted in the pods. -* StatefulSet - where the replica count of each role group is defined. By default, a cluster will have 2 name nodes, 3 journal nodes and 3 data nodes. +* StatefulSet - where the replica count, volume mounts and more for each role group is defined. In addition, a `NodePort` service is created for each pod labeled with `hdfs.stackable.tech/pod-service=true` that exposes all container ports to the outside world (from the perspective of K8S). From d7863fa9dd89593993f59026e2298a0eaa0ac02e Mon Sep 17 00:00:00 2001 From: Felix Hennig Date: Mon, 23 Jan 2023 15:13:20 +0100 Subject: [PATCH 11/13] Removed more outdated stuff --- docs/modules/ROOT/pages/index.adoc | 6 ------ 1 file changed, 6 deletions(-) diff --git a/docs/modules/ROOT/pages/index.adoc b/docs/modules/ROOT/pages/index.adoc index 9dd344c5..e5c7a41e 100644 --- a/docs/modules/ROOT/pages/index.adoc +++ b/docs/modules/ROOT/pages/index.adoc @@ -22,12 +22,6 @@ The operator creates the following K8S objects per role group defined in the cus In addition, a `NodePort` service is created for each pod labeled with `hdfs.stackable.tech/pod-service=true` that exposes all container ports to the outside world (from the perspective of K8S). 
-In the custom resource you can specify the number of replicas per role group (name node, data node or journal node) but the operator will make sure that: - -* at least two name nodes are started -* at least one journal node is started -* no data nodes are started unless the number of replicas is greater than zero. - == Supported Versions The Stackable Operator for Apache HDFS currently supports the following versions of HDFS: From 6dacd9b95f23295dfe96652e30f47e03b98c787b Mon Sep 17 00:00:00 2001 From: Felix Hennig Date: Mon, 23 Jan 2023 15:18:28 +0100 Subject: [PATCH 12/13] ~ --- docs/modules/ROOT/pages/index.adoc | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/docs/modules/ROOT/pages/index.adoc b/docs/modules/ROOT/pages/index.adoc index e5c7a41e..7a6d79d6 100644 --- a/docs/modules/ROOT/pages/index.adoc +++ b/docs/modules/ROOT/pages/index.adoc @@ -22,6 +22,12 @@ The operator creates the following K8S objects per role group defined in the cus In addition, a `NodePort` service is created for each pod labeled with `hdfs.stackable.tech/pod-service=true` that exposes all container ports to the outside world (from the perspective of K8S). +In the custom resource you can specify the number of replicas per role group (name node, data node or journal node). A minimal working configuration requires: + +* 2 name nodes (HA) +* 1 journal node +* 1 data node (should match at least the `dfsReplication` factor) + == Supported Versions The Stackable Operator for Apache HDFS currently supports the following versions of HDFS: From 146d33d42ad24e9f85bf03a4f3379692cc6a4264 Mon Sep 17 00:00:00 2001 From: Felix Hennig Date: Mon, 23 Jan 2023 15:23:21 +0100 Subject: [PATCH 13/13] DataNode instead of data node (and similar for the other node types) --- docs/modules/ROOT/pages/index.adoc | 16 ++++++++-------- .../ROOT/pages/usage-guide/resources.adoc | 2 +- 2 files changed, 9 insertions(+), 9 deletions(-) diff --git a/docs/modules/ROOT/pages/index.adoc b/docs/modules/ROOT/pages/index.adoc index 7a6d79d6..88155d92 100644 --- a/docs/modules/ROOT/pages/index.adoc +++ b/docs/modules/ROOT/pages/index.adoc @@ -1,6 +1,6 @@ = Stackable Operator for Apache HDFS -The Stackable Operator for https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html[Apache HDFS] is used to set up HFDS in high-availability mode. It depends on the xref:zookeeper:ROOT:index.adoc[] to operate a ZooKeeper cluster to coordinate the active and standby name nodes. +The Stackable Operator for https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html[Apache HDFS] is used to set up HFDS in high-availability mode. It depends on the xref:zookeeper:ROOT:index.adoc[] to operate a ZooKeeper cluster to coordinate the active and standby NameNodes. NOTE: This operator only works with images from the https://repo.stackable.tech/#browse/browse:docker:v2%2Fstackable%2Fhadoop[Stackable] repository @@ -8,9 +8,9 @@ NOTE: This operator only works with images from the https://repo.stackable.tech/ Three xref:home:concepts:roles-and-role-groups.adoc[roles] of the HDFS cluster are implemented: -* Data Node - responsible for storing the actual data. -* Journal Node - responsible for keeping track of HDFS blocks and used to perform failovers in case the active name node fails. For details see: https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html -* Name Node - responsible for keeping track of HDFS blocks and providing access to the data. 
+* DataNode - responsible for storing the actual data. +* JournalNode - responsible for keeping track of HDFS blocks and used to perform failovers in case the active NameNode fails. For details see: https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html +* NameNode - responsible for keeping track of HDFS blocks and providing access to the data. == Kubernetes objects @@ -22,11 +22,11 @@ The operator creates the following K8S objects per role group defined in the cus In addition, a `NodePort` service is created for each pod labeled with `hdfs.stackable.tech/pod-service=true` that exposes all container ports to the outside world (from the perspective of K8S). -In the custom resource you can specify the number of replicas per role group (name node, data node or journal node). A minimal working configuration requires: +In the custom resource you can specify the number of replicas per role group (NameNode, DataNode or JournalNode). A minimal working configuration requires: -* 2 name nodes (HA) -* 1 journal node -* 1 data node (should match at least the `dfsReplication` factor) +* 2 NameNodes (HA) +* 1 JournalNode +* 1 DataNode (should match at least the `dfsReplication` factor) == Supported Versions diff --git a/docs/modules/ROOT/pages/usage-guide/resources.adoc b/docs/modules/ROOT/pages/usage-guide/resources.adoc index 275d94e8..09488e0d 100644 --- a/docs/modules/ROOT/pages/usage-guide/resources.adoc +++ b/docs/modules/ROOT/pages/usage-guide/resources.adoc @@ -16,7 +16,7 @@ dataNodes: capacity: 128Gi ---- -In the above example, all data nodes in the default group will store data (the location of `dfs.datanode.name.dir`) on a `128Gi` volume. +In the above example, all DataNodes in the default group will store data (the location of `dfs.datanode.name.dir`) on a `128Gi` volume. By default, in case nothing is configured in the custom resource for a certain role group, each Pod will have a `5Gi` large volume mount for the data location.