[Merged by Bors] - Allow enabling secure mode with Kerberos #334

Commits
6f6ec49
WIP: rekerberize
nightkr Feb 14, 2023
7c18711
Merge branch 'main' into spike/security2
nightkr Feb 14, 2023
1914ebc
Workingish but incredibly hacky Kerberized HDFS
nightkr Feb 17, 2023
277fc61
Merge branch 'main' into spike/security2
sbernauer Mar 15, 2023
bd090e8
Use SecretOperatorVolumeSourceBuilder to build kerberos volume
sbernauer Mar 15, 2023
941ff54
Add kdc and SecretClass to example
sbernauer Mar 15, 2023
eb6ad9e
Move ssl-server.xml and ssl-client.xml into constant
sbernauer Mar 15, 2023
33d6cb3
WIP
sbernauer Mar 15, 2023
8d63204
Only add volumes conditionally
sbernauer Mar 15, 2023
b9e7e72
Use actual SecretClasses instead of hardcoding
sbernauer Mar 15, 2023
2aadadf
Remove HADOOP_JAAS_DEBUG=true
sbernauer Mar 15, 2023
b77b418
Use add instead of extend
sbernauer Mar 15, 2023
1453819
Only add volume mounts when needed
sbernauer Mar 15, 2023
35c6a90
Only write either http or https nn address into hdfs-site
sbernauer Mar 15, 2023
3a3f744
Only mount tls certs to main containers and fix other stuff
sbernauer Mar 15, 2023
5044baf
Use dedicated init container to create tls bundles
sbernauer Mar 16, 2023
668dec4
Move kerberos service name into separate fn
sbernauer Mar 16, 2023
29fa168
Dynamically determine principal name
sbernauer Mar 16, 2023
21b9233
Read KERBEROS_REALM from krb5.conf rather than hardcoding
sbernauer Mar 16, 2023
001394d
WIP checkpoint, not working. Tries to use fqdn, nn fails to connect t…
sbernauer Mar 17, 2023
aa02167
Swtich back to using service names as principals. Works again, only t…
sbernauer Mar 17, 2023
0edee66
Only use hdfs_name within principal. It works!
sbernauer Mar 17, 2023
b750417
Test with more replicas
sbernauer Mar 17, 2023
320ec00
Switch back to use fqdn principals
sbernauer Mar 17, 2023
129dc95
fix warnings
sbernauer Mar 17, 2023
8cf7a3d
Only specify nn principal in wait-for-namenodes when kerberos is enabled
sbernauer Mar 17, 2023
119f29a
Use operator-rs main branch
sbernauer Mar 17, 2023
e029dfc
Use new image
sbernauer Mar 17, 2023
7ca3b03
Changed namespace of keytab secret to default so the example is direc…
soenkeliebau Mar 17, 2023
fb1a5fb
Specify concrete principal when connecting to NN from init container
sbernauer Mar 17, 2023
acd49d3
Add first basic test
sbernauer Mar 17, 2023
bfa6699
fix typo
sbernauer Mar 17, 2023
c58b877
Increase test number
sbernauer Mar 17, 2023
d3c7a15
Add hadoop.security.authentication=kerberos to discovery CM
sbernauer Mar 20, 2023
38f406b
Add a teststep to access hdfs. Get it running by adding stuff to disc…
sbernauer Mar 20, 2023
3f0cb66
linter
sbernauer Mar 20, 2023
a89ecd8
Set hdfs log level to DEBUG in test
sbernauer Mar 20, 2023
6cb6bd6
Increase log level and double number of test runs
sbernauer Mar 20, 2023
5ca4c23
Disable logging again
sbernauer Mar 21, 2023
8b5503c
Test PROD.MYCORP realm
sbernauer Mar 21, 2023
d0cb73c
Fix misc stuff
sbernauer Mar 22, 2023
864b4b5
Refactor stuff into kerberos.rs
sbernauer Mar 22, 2023
0d81588
Create create_tls_cert_bundle_init_container_and_volumes fn
sbernauer Mar 22, 2023
54f14bc
Revert "Create create_tls_cert_bundle_init_container_and_volumes fn"
sbernauer Mar 22, 2023
8d43afa
Add AD tests
nightkr Mar 22, 2023
0a8bc3c
Disable activeDirectory tests
sbernauer Mar 22, 2023
e9ae3e9
Improve comment
sbernauer Mar 22, 2023
ff60fae
Don't hard-code realm when ktiniting
nightkr Mar 22, 2023
3236028
Write hadoop.kerberos.keytab.login.autorenewal.enabled to config and …
sbernauer Mar 27, 2023
97f5dae
Remove uneeded volume in tests
sbernauer Mar 28, 2023
f9eff39
Disable node principals by default
sbernauer Mar 28, 2023
f13a0bb
Add wire encryption setting
sbernauer Mar 30, 2023
877fccc
Merge branch 'main' into spike/security2
sbernauer Mar 30, 2023
ee66453
Adopt to new secret-op crd
sbernauer Mar 30, 2023
a2a4759
Merge remote-tracking branch 'origin/main' into spike/security2
sbernauer Apr 25, 2023
aa15c3c
Fix merge mistake
sbernauer Apr 26, 2023
4202cbc
Bump tests to 23.4 and respect listenerClass
sbernauer Apr 26, 2023
4836e06
Increase assert timeout
sbernauer Apr 26, 2023
037eb1a
Only run kerberos tests
sbernauer Apr 26, 2023
74a648c
Merge branch 'main' into spike/security2
sbernauer May 16, 2023
cfb645c
Revert example replicas to 1
sbernauer May 16, 2023
8e548d2
Update rust/operator/src/container.rs
sbernauer May 16, 2023
87e8b26
Update rust/operator/src/container.rs
sbernauer May 16, 2023
06ca693
Remove commented out code regarding HADOOP_POLICY_XML
sbernauer May 16, 2023
5f9d600
fix format
sbernauer May 16, 2023
0856bc5
Address Arch discussion feedback
sbernauer May 19, 2023
5011c22
Add docs
sbernauer May 19, 2023
3cd0372
Update docs/modules/hdfs/pages/usage-guide/security.adoc
sbernauer May 22, 2023
6eddc49
Update docs/modules/hdfs/pages/usage-guide/security.adoc
sbernauer May 22, 2023
58738de
Update docs/modules/hdfs/pages/usage-guide/security.adoc
sbernauer May 22, 2023
dbb11bd
Apply suggestions from code review
sbernauer May 22, 2023
083d02d
Apply review comment
sbernauer May 22, 2023
61e08d0
Remove sentence
sbernauer May 22, 2023
693ceb2
Re-enable tests
sbernauer May 22, 2023
4a2afb9
Rework CRD accoring to review feedback
sbernauer May 22, 2023
5396e0a
Rename kerberos.kerberosSecretClass to kerberos.secretClass
sbernauer May 22, 2023
3a50250
Adress Arch meeting feedback
sbernauer May 24, 2023
cdefa59
charts
sbernauer May 24, 2023
8006564
Re-enable all test cases
sbernauer May 24, 2023
dd333a6
Re-add wire encryption privacy settings
sbernauer May 24, 2023
3bd3170
fix: Only add truststore settings when https is enabled
sbernauer May 26, 2023
0842ec7
test: Switch image to nightly
sbernauer May 26, 2023
aab00c7
Also add ssl.server.truststore.location setting
sbernauer May 30, 2023
2765015
Add comment
sbernauer May 31, 2023
d171093
Remove redundant set -e
sbernauer May 31, 2023
effae5a
We only support Kerberos for HDFS >= 3.3.x
sbernauer May 31, 2023
d69b2dd
Add chaos monkey to smoke test
sbernauer May 31, 2023
d51f97d
Merge remote-tracking branch 'origin/main' into spike/security2
sbernauer May 31, 2023
42e2dec
docs
sbernauer May 31, 2023
93e0a7c
Use constants consistently
sbernauer May 31, 2023
9e484cd
Simplify
sbernauer May 31, 2023
1ffa879
Minor improvements
sbernauer May 31, 2023
4a2ff34
Fix kerberos hdfs version check
sbernauer May 31, 2023
7817bc6
cleanup
sbernauer May 31, 2023
0e425bc
intendation
sbernauer May 31, 2023
701d97f
Sett hadoop.rpc.protection to privacy
sbernauer May 31, 2023
3b6d52d
Check exit code of zk format
sbernauer May 31, 2023
2b5405a
Reduce test duration by restricting chaosmonkey in smoke test
sbernauer Jun 1, 2023
d15620b
Force delte pods
sbernauer Jun 1, 2023
bfaafa4
Link to journalnodes bug ticket
sbernauer Jun 5, 2023
01d4a5a
Don't share /tmp ticket cache between containers
sbernauer Jun 7, 2023
8aadcde
Removed uneeded -D dfs.namenode.kerberos.principal=$PRINCIPAL
sbernauer Jun 7, 2023
277c550
Move stacktrace into dedicated file
sbernauer Jun 12, 2023
b38c7c7
Only kinit in init(!) contains that need a ticket
sbernauer Jun 12, 2023
ae2fec4
fix: Set KERBEROS_REALM in every container, not only the ones with kinit
sbernauer Jun 12, 2023
3368d1f
fix: Only kinit when Kerberos is enabled ;)
sbernauer Jun 12, 2023
9291fa3
changelog
sbernauer Jun 13, 2023
60 changes: 33 additions & 27 deletions Cargo.lock

5 changes: 3 additions & 2 deletions Cargo.toml
@@ -4,5 +4,6 @@ members = [
"rust/crd", "rust/operator", "rust/operator-binary"
]

#[patch."https://github.com/stackabletech/operator-rs.git"]
#stackable-operator = { git = "https://github.com/stackabletech//operator-rs.git", branch = "main" }
# [patch."https://github.com/stackabletech/operator-rs.git"]
# stackable-operator = { path = "/home/sbernauer/stackabletech/operator-rs" }
# stackable-operator = { git = "https://github.com/stackabletech//operator-rs.git", branch = "main" }
20 changes: 20 additions & 0 deletions deploy/helm/hdfs-operator/crds/crds.yaml
@@ -26,6 +26,26 @@ spec:
properties:
  clusterConfig:
    properties:
      authentication:
        description: Configuration to set up a cluster secured using Kerberos.
        nullable: true
        properties:
          kerberos:
            description: Kerberos configuration
            properties:
              secretClass:
                description: Name of the SecretClass providing the keytab for the HDFS services.
                type: string
            required:
              - secretClass
            type: object
          tlsSecretClass:
            default: tls
            description: Name of the SecretClass providing the tls certificates for the WebUIs.
            type: string
        required:
          - kerberos
        type: object
      autoFormatFs:
        nullable: true
        type: boolean
Binary file added docs/modules/hdfs/images/hdfs_webui_kerberos.png
78 changes: 78 additions & 0 deletions docs/modules/hdfs/pages/usage-guide/security.adoc
@@ -0,0 +1,78 @@
= Security

== Authentication
Currently the only supported authentication mechanism is Kerberos, which is disabled by default.
For Kerberos to work, a Kerberos KDC is needed, which the user needs to provide.
The xref:home:secret-operator:secretclass.adoc#backend-kerberoskeytab[secret-operator documentation] states which kind of Kerberos servers are supported and how they can be configured.

IMPORTANT: Kerberos is supported starting from HDFS version 3.3.x.

=== 1. Prepare Kerberos server
To configure HDFS to use Kerberos, you first need to collect information about your Kerberos server, e.g. hostname and port.
Additionally, you need a service user, which the secret-operator uses to create principals for the HDFS services.

=== 2. Create Kerberos SecretClass
Afterwards you need to enter all the needed information into a SecretClass, as described in the xref:home:secret-operator:secretclass.adoc#backend-kerberoskeytab[secret-operator documentation].
The following guide assumes you have named your SecretClass `kerberos-hdfs`.

=== 3. Configure HDFS to use SecretClass
The last step is to configure your HdfsCluster to use the newly created SecretClass.

[source,yaml]
----
spec:
  clusterConfig:
    authentication:
      tlsSecretClass: tls # Optional, defaults to "tls"
      kerberos:
        secretClass: kerberos-hdfs # Put your SecretClass name in here
----

The `kerberos.secretClass` lets HDFS request keytabs from the secret-operator.

The `tlsSecretClass` is needed to request TLS certificates, used e.g. for the Web UIs.
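
The defaulting of `tlsSecretClass` described above can be pictured with a small standalone sketch. The type and function names below are hypothetical and chosen for illustration only; they are not the operator's actual CRD structs, which this page does not show.

```rust
// Hypothetical stand-ins for the `authentication` section of the CRD.
// Only the field names and the "tls" default come from the CRD shown in this PR.
const DEFAULT_TLS_SECRET_CLASS: &str = "tls";

#[derive(Debug, PartialEq)]
struct KerberosConfig {
    /// SecretClass providing the keytabs (required).
    secret_class: String,
}

#[derive(Debug, PartialEq)]
struct AuthenticationConfig {
    /// SecretClass providing the TLS certificates (optional, defaults to "tls").
    tls_secret_class: String,
    kerberos: KerberosConfig,
}

impl AuthenticationConfig {
    /// Builds the config, falling back to the default TLS SecretClass
    /// when the user did not specify one.
    fn new(kerberos_secret_class: &str, tls_secret_class: Option<&str>) -> Self {
        AuthenticationConfig {
            tls_secret_class: tls_secret_class
                .unwrap_or(DEFAULT_TLS_SECRET_CLASS)
                .to_string(),
            kerberos: KerberosConfig {
                secret_class: kerberos_secret_class.to_string(),
            },
        }
    }
}
```

Calling `AuthenticationConfig::new("kerberos-hdfs", None)` corresponds to a manifest that sets only `kerberos.secretClass` and leaves `tlsSecretClass` out.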


=== 4. Verify that Kerberos is used
Use `stackablectl services list --all-namespaces` to get the endpoints where the HDFS namenodes are reachable.
Open the link (note that the namenode is now using https).
You should see a Web UI similar to the following:

image:hdfs_webui_kerberos.png[]

The important part is

> Security is on.

You can also shell into the namenode and try to access the file system:
`kubectl exec -it hdfs-namenode-default-0 -c namenode -- bash -c 'kdestroy && bin/hdfs dfs -ls /'`

You should get the error message `org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]`.

=== 5. Access HDFS
In case you want to access your HDFS it is recommended to start up a client Pod that connects to HDFS, rather than shelling into the namenode.
We have an https://github.com/stackabletech/hdfs-operator/blob/main/tests/templates/kuttl/kerberos/20-access-hdfs.yaml.j2[integration test] for this exact purpose, where you can see how to connect and get a valid keytab.

== Authorization
Authorization is not supported yet.
In the future support will be added by writing an opa-authorizer to match our general xref:home:concepts:opa.adoc[] mechanisms.

In the meantime a very basic level of authorization can be reached by using `configOverrides` to set the `hadoop.user.group.static.mapping.overrides` property.
In the following example the `dr.who=;nn=;nm=;jn=;` part is needed for HDFS internal operations and the user `testuser` is granted admin permissions.

[source,yaml]
----
spec:
  nameNodes:
    configOverrides: &configOverrides
      core-site.xml:
        hadoop.user.group.static.mapping.overrides: "dr.who=;nn=;nm=;jn=;testuser=supergroup;"
  dataNodes:
    configOverrides: *configOverrides
  journalNodes:
    configOverrides: *configOverrides
----
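
The value of `hadoop.user.group.static.mapping.overrides` is a list of `user=group1,group2` entries, each terminated by `;`. As a rough illustration of that format, the sketch below assembles such a string. The helper name `static_mapping_overrides` is hypothetical and not part of the operator.

```rust
/// Renders a value for `hadoop.user.group.static.mapping.overrides`.
/// Each entry becomes `user=group1,group2;`. A user with no groups becomes `user=;`.
/// Hypothetical helper for illustration only.
fn static_mapping_overrides(entries: &[(&str, Vec<&str>)]) -> String {
    let mut out = String::new();
    for (user, groups) in entries {
        out.push_str(user);
        out.push('=');
        out.push_str(&groups.join(","));
        out.push(';');
    }
    out
}
```

Passing the internal users `dr.who`, `nn`, `nm` and `jn` with empty group lists, followed by `testuser` with the group `supergroup`, yields the string used in the example above.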

== Wire encryption
In case Kerberos is enabled, `Privacy` mode is used for best security.
Wire encryption without Kerberos as well as https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SecureMode.html#Data_confidentiality[other wire encryption modes] are *not* supported.
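
To make the wire encryption rule concrete, here is a minimal sketch that models the three values Hadoop accepts for `hadoop.rpc.protection` and emits `privacy` only when Kerberos is enabled. The enum and function names are hypothetical. The commit list of this PR mentions setting `hadoop.rpc.protection` to privacy, but which further properties the operator writes is not shown here, so the sketch covers only that one.

```rust
/// The protection levels Hadoop accepts for `hadoop.rpc.protection`.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum RpcProtection {
    Authentication,
    Integrity,
    Privacy,
}

impl RpcProtection {
    /// The literal written into the Hadoop configuration.
    fn as_str(self) -> &'static str {
        match self {
            RpcProtection::Authentication => "authentication",
            RpcProtection::Integrity => "integrity",
            RpcProtection::Privacy => "privacy",
        }
    }
}

/// Hypothetical helper: returns the wire encryption settings to write.
/// With Kerberos enabled only `privacy` is used. Without Kerberos nothing is set.
fn wire_encryption_settings(kerberos_enabled: bool) -> Vec<(&'static str, &'static str)> {
    if kerberos_enabled {
        vec![("hadoop.rpc.protection", RpcProtection::Privacy.as_str())]
    } else {
        Vec::new()
    }
}
```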
1 change: 1 addition & 0 deletions docs/modules/hdfs/partials/nav.adoc
@@ -7,6 +7,7 @@
* xref:hdfs:usage-guide/index.adoc[]
** xref:hdfs:usage-guide/resources.adoc[]
** xref:hdfs:usage-guide/logging-log-aggregation.adoc[]
** xref:hdfs:usage-guide/security.adoc[]
** xref:hdfs:usage-guide/monitoring.adoc[]
** xref:hdfs:usage-guide/scaling.adoc[]
** xref:hdfs:usage-guide/configuration-environment-overrides.adoc[]
6 changes: 6 additions & 0 deletions rust/crd/src/constants.rs
@@ -12,6 +12,9 @@ pub const LABEL_STS_POD_NAME: &str = "statefulset.kubernetes.io/pod-name";

pub const HDFS_SITE_XML: &str = "hdfs-site.xml";
pub const CORE_SITE_XML: &str = "core-site.xml";
pub const HADOOP_POLICY_XML: &str = "hadoop-policy.xml";
pub const SSL_SERVER_XML: &str = "ssl-server.xml";
pub const SSL_CLIENT_XML: &str = "ssl-client.xml";
pub const LOG4J_PROPERTIES: &str = "log4j.properties";

pub const SERVICE_PORT_NAME_RPC: &str = "rpc";
@@ -23,10 +26,12 @@ pub const SERVICE_PORT_NAME_METRICS: &str = "metrics";

pub const DEFAULT_NAME_NODE_METRICS_PORT: u16 = 8183;
pub const DEFAULT_NAME_NODE_HTTP_PORT: u16 = 9870;
pub const DEFAULT_NAME_NODE_HTTPS_PORT: u16 = 9871;
pub const DEFAULT_NAME_NODE_RPC_PORT: u16 = 8020;

pub const DEFAULT_DATA_NODE_METRICS_PORT: u16 = 8082;
pub const DEFAULT_DATA_NODE_HTTP_PORT: u16 = 9864;
pub const DEFAULT_DATA_NODE_HTTPS_PORT: u16 = 9865;
pub const DEFAULT_DATA_NODE_DATA_PORT: u16 = 9866;
pub const DEFAULT_DATA_NODE_IPC_PORT: u16 = 9867;

@@ -40,6 +45,7 @@ pub const DFS_NAMENODE_NAME_DIR: &str = "dfs.namenode.name.dir";
pub const DFS_NAMENODE_SHARED_EDITS_DIR: &str = "dfs.namenode.shared.edits.dir";
pub const DFS_NAMENODE_RPC_ADDRESS: &str = "dfs.namenode.rpc-address";
pub const DFS_NAMENODE_HTTP_ADDRESS: &str = "dfs.namenode.http-address";
pub const DFS_NAMENODE_HTTPS_ADDRESS: &str = "dfs.namenode.https-address";
pub const DFS_DATANODE_DATA_DIR: &str = "dfs.datanode.data.dir";
pub const DFS_JOURNALNODE_EDITS_DIR: &str = "dfs.journalnode.edits.dir";
pub const DFS_JOURNALNODE_RPC_ADDRESS: &str = "dfs.journalnode.rpc-address";
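
One of the commits above is titled "Only write either http or https nn address into hdfs-site". A minimal sketch of how the new HTTPS constants could drive that choice follows. The function `namenode_web_address` is hypothetical, and the real property keys in `hdfs-site.xml` carry a nameservice and namenode suffix that is left out here for brevity.

```rust
// Constants copied from the diff above.
const DFS_NAMENODE_HTTP_ADDRESS: &str = "dfs.namenode.http-address";
const DFS_NAMENODE_HTTPS_ADDRESS: &str = "dfs.namenode.https-address";
const DEFAULT_NAME_NODE_HTTP_PORT: u16 = 9870;
const DEFAULT_NAME_NODE_HTTPS_PORT: u16 = 9871;

/// Hypothetical helper: picks exactly one of the HTTP or HTTPS
/// property/port pairs, depending on whether HTTPS is enabled.
fn namenode_web_address(https_enabled: bool, host: &str) -> (&'static str, String) {
    let (key, port) = if https_enabled {
        (DFS_NAMENODE_HTTPS_ADDRESS, DEFAULT_NAME_NODE_HTTPS_PORT)
    } else {
        (DFS_NAMENODE_HTTP_ADDRESS, DEFAULT_NAME_NODE_HTTP_PORT)
    };
    (key, format!("{}:{}", host, port))
}
```

Returning a single pair makes it impossible to write both addresses at once, which is the behaviour the commit title describes.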