
Conversation

@iwasakims (Member)

No description provided.

@iwasakims (Member Author)

I manually tested that HDFS transparent encryption works with the following config.yaml::

docker:
        memory_limit: "8g"
        image: "bigtop/puppet:trunk-centos-7"
distro: centos
components: [hdfs, yarn, kms]
enable_local_repo: true
smoke_test_components: [hdfs, yarn]

test steps::

$ cd provisioner/docker
$ ./docker-hadoop.sh -c 3
$ ./docker-hadoop.sh --exec 3 /bin/bash

 # hdfs dfs -mkdir /user/root/zone1
 
 # hadoop key create key1
 key1 has been successfully created with options Options{cipher='AES/CTR/NoPadding', bitLength=128, description='null', attributes=null}.
 org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider@1dde4cb2 has been updated.
 
 # sudo -u hdfs hdfs crypto -createZone -keyName key1 -path /user/root/zone1
 Added encryption zone /user/root/zone1
 
 # hdfs dfs -put /etc/hosts /user/root/zone1/
 # hdfs dfs -get /user/root/zone1/hosts /tmp/
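
As a quick extra check (not in the steps above; reading /.reserved/raw requires the HDFS superuser):

 # sudo -u hdfs hdfs crypto -listZones
 # diff /etc/hosts /tmp/hosts
 # sudo -u hdfs hdfs dfs -cat /.reserved/raw/user/root/zone1/hosts | od -c | head

-listZones should report /user/root/zone1 backed by key1, the diff should be empty, and the raw read should show ciphertext rather than the plaintext of /etc/hosts.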

@iwasakims (Member Author)

I also tested this on a Kerberos-enabled cluster with the following config.yaml::

docker:
        memory_limit: "8g"
        image: "bigtop/puppet:trunk-centos-7"
distro: centos
components: [kerberos, hdfs, yarn, kms]
enable_local_repo: true
smoke_test_components: [hdfs]

with the following configs in bigtop-deploy/puppet/hieradata/bigtop/cluster.yaml::

# Kerberos
hadoop::hadoop_security_authentication: "kerberos"
kerberos::krb_site::domain: "bigtop.apache.org"
kerberos::krb_site::realm: "BIGTOP.APACHE.ORG"
kerberos::krb_site::kdc_server: "%{hiera('bigtop::hadoop_head_node')}"
kerberos::krb_site::kdc_port: "88"
kerberos::krb_site::admin_port: "749"
kerberos::krb_site::keytab_export_dir: "/var/lib/bigtop_keytabs"
hadoop::kerberos_realm: "%{hiera('kerberos::krb_site::realm')}"

test steps::

$ cd provisioner/docker
$ ./docker-hadoop.sh -c 3
$ ./docker-hadoop.sh --exec 3 /bin/bash

 # kinit -kt /etc/hdfs.keytab hdfs/$(hostname --fqdn)
 # hadoop key create key1
 # hdfs dfs -mkdir /zone1
 # hdfs crypto -createZone -keyName key1 -path /zone1
 # hdfs dfs -put /etc/hosts /zone1/
 # hdfs dfs -get /zone1/hosts /tmp/
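
A couple of optional sanity checks (not in the steps above; the kinit as hdfs makes this shell the HDFS superuser):

 # klist
 # hdfs crypto -listZones
 # diff /etc/hosts /tmp/hosts

klist should show a ticket for hdfs/<fqdn>@BIGTOP.APACHE.ORG, -listZones should report /zone1 with key1, and the diff should be empty.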

@iwasakims (Member Author)

TODOs to be addressed in follow-up JIRAs:

  • HA deployment with multiple KMS instances
  • HTTPS-enabled configuration

#kerberos::krb_site::kdc_server: "%{hiera('bigtop::hadoop_head_node')}"
#kerberos::krb_site::kdc_port: "88"
#kerberos::krb_site::admin_port: "749"
#kerberos::krb_site::keytab_export_dir: "/var/lib/bigtop_keytabs"
Contributor

May I know why the name was changed from site to krb_site? I think this may break compatibility.

Member Author

The Hiera variables are not injected unless the namespace of the properties matches the class name of the Puppet class. The relevant class name was changed from site to krb_site in 3386a9d. The commented-out configs (kept as examples) were apparently left unchanged.
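
A simplified sketch of the lookup (not the actual Bigtop code): Hiera's automatic parameter lookup binds a key to a class parameter only when the key's prefix matches the class name exactly.

 # Parameters of class kerberos::krb_site are resolved from Hiera keys
 # named kerberos::krb_site::<parameter>.
 class kerberos::krb_site (
   $realm      = 'EXAMPLE.COM',
   $kdc_server = 'localhost',
 ) {
   notify { "realm=${realm} kdc=${kdc_server}": }
 }

 # cluster.yaml: this key matches the class name, so it is injected ...
 kerberos::krb_site::realm: "BIGTOP.APACHE.ORG"
 # ... while the old namespace no longer matches any class and is ignored.
 kerberos::site::realm: "BIGTOP.APACHE.ORG"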

Contributor

OK. That was me ;) Glad you caught this. Thanks!

<% if @kms_host %>
<property>
<name>hadoop.security.key.provider.path</name>
<value>kms://http@<%= @kms_host %>:<%= @kms_port %>/kms</value>
Contributor

I have zero knowledge about this, just wondering: can this be https?

Member Author

Yes. We need additional configuration in files such as ssl-client.xml, ssl-server.xml, and server.xml (of Tomcat) for that. I would like to address it in another JIRA, since all HDFS and YARN services need to be taken care of when we enable HTTPS on the Web UI and REST API.
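
For reference, a rough sketch of the pieces an HTTPS setup would touch (the keystore path and password below are placeholders, not part of this patch):

 <!-- the key provider URI switches its scheme from http to https -->
 <property>
   <name>hadoop.security.key.provider.path</name>
   <value>kms://https@<%= @kms_host %>:<%= @kms_port %>/kms</value>
 </property>

 <!-- ssl-client.xml: truststore used by clients talking to the KMS -->
 <property>
   <name>ssl.client.truststore.location</name>
   <value>/etc/hadoop/conf/truststore.jks</value>
 </property>
 <property>
   <name>ssl.client.truststore.password</name>
   <value>changeit</value>
 </property>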

Contributor

Sure. No problem and thanks for the explanation.

default => [ ],
true => [ 'HTTP' ],
'enabled' => [ 'HTTP' ],
default => [ ],
Contributor

The original code won't work, hence the refactoring here?

Member Author

Yes. The credential of HTTP/host@REALM was not written to the keytab file because the boolean true was not covered by the conditional in the previous code.
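
A minimal illustration of the pitfall (variable names are hypothetical, not taken from this patch): a selector that only matches the string 'enabled' falls through to the default when the value is a bare boolean, so the HTTP principal is silently dropped.

 # Covering both forms keeps the HTTP/host@REALM entry in the keytab.
 $http_principals = $kerberos_http_spnego ? {
   true      => [ 'HTTP' ],  # bare boolean value
   'enabled' => [ 'HTTP' ],  # legacy string form
   default   => [ ],         # anything else: no SPNEGO principal
 }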

@evans-ye (Contributor)

This is a super awesome feature! I've left some comments. Thanks!

@evans-ye (Contributor)

Looks nice. +1.

<% if @hadoop_security_authentication == "kerberos" -%>
<property>
<name>hadoop.kms.authentication.kerberos.keytab</name>
<value>/etc/kms.keytab</value>
@evans-ye (Contributor) commented on Mar 15, 2020

I'm not aware of a convention to put keytabs under /etc. Is there one?

Member Author

I used the keytab created by the kerberos::host_keytab resource. As far as I can see, other modules that use the host_keytab resource, such as HDFS and YARN, follow the same convention.

Contributor

Got it. Pretty comprehensive. Thanks!


<property>
<name>hadoop.kms.authentication.signer.secret.provider.zookeeper.kerberos.principal</name>
<value>kms/#HOSTNAME#</value>
@evans-ye (Contributor) commented on Mar 15, 2020

Is this placeholder intentionally left for replacement?

Member Author

This part came from the original kms-site.xml bundled with Hadoop. The value is used only if hadoop.kms.authentication.signer.secret.provider is changed to zookeeper. ZKSignerSecretProvider is a feature for HA setups in which multiple KMS instances share the same signer secret via ZooKeeper. Since KMS HA is not supported in this patch, it is left as is.
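
For reference, a rough sketch of what a future KMS HA setup might set in kms-site.xml (the ZooKeeper quorum below is a placeholder, not part of this patch):

 <property>
   <name>hadoop.kms.authentication.signer.secret.provider</name>
   <value>zookeeper</value>
 </property>
 <property>
   <name>hadoop.kms.authentication.signer.secret.provider.zookeeper.connection.string</name>
   <value>zk1:2181,zk2:2181,zk3:2181</value>
 </property>
 <property>
   <name>hadoop.kms.authentication.signer.secret.provider.zookeeper.path</name>
   <value>/hadoop-kms/hadoop-auth-signature-secret</value>
 </property>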

Contributor

Thanks for the detailed explanation!

@evans-ye merged commit 6307ef3 into apache:master on Mar 15, 2020
@iwasakims (Member Author)

Thanks, @evans-ye.

evans-ye pushed a commit to evans-ye/bigtop that referenced this pull request May 10, 2020
* BIGTOP-3300. Add puppet manifests for hadoop-kms.

* fixed role assignment, kms kerberos configs and bugs in kerberos module.

* tightened permission of kms-env.sh containing keystore password.