[#3489] improvement(hive-catalog): Add user authentication e2e test for Hive catalog #3525

Merged · 63 commits · May 27, 2024
Changes from 23 commits

Commits:
fd390d0  Add kerberos IT (yuqi1129, May 17, 2024)
780e375  Fix (yuqi1129, May 18, 2024)
9c0a2ed  Fix test error (yuqi1129, May 18, 2024)
3a6a31b  Merge branch 'main' of github.com:datastrato/graviton into issue_3432 (yuqi1129, May 20, 2024)
3f06416  Add fix. (yuqi1129, May 20, 2024)
85e6b73  fix (yuqi1129, May 20, 2024)
66640fc  fix (yuqi1129, May 20, 2024)
01dbf60  Fix (yuqi1129, May 20, 2024)
d6960c8  Merge branch 'main' into issue_3432 (yuqi1129, May 20, 2024)
e71dde3  Remove unused code. (yuqi1129, May 21, 2024)
5a26aa1  Merge remote-tracking branch 'me/issue_3432' into issue_3432 (yuqi1129, May 21, 2024)
6488d03  Merge branch 'main' into issue_3432 (yuqi1129, May 21, 2024)
8aa9b4d  optimize code. (yuqi1129, May 21, 2024)
4609efb  Merge remote-tracking branch 'me/issue_3432' into issue_3432 (yuqi1129, May 21, 2024)
30d4c60  Fix mistake (yuqi1129, May 21, 2024)
965f63a  Revert the code that check status of `show databases` for Hive contai… (yuqi1129, May 21, 2024)
0f8051c  Merge branch 'main' into issue_3432 (yuqi1129, May 21, 2024)
6919203  Merge main and rebase the code. (yuqi1129, May 22, 2024)
ac82116  Merge remote-tracking branch 'me/issue_3432' into issue_3432 (yuqi1129, May 22, 2024)
b30294c  Merge branch 'issue_3432' into issue_3489 (yuqi1129, May 22, 2024)
cf212b2  Merge branch 'issue_3432' of github.com:yuqi1129/gravitino into issue… (yuqi1129, May 22, 2024)
fee42e5  Merge main and resolve conflicts (yuqi1129, May 23, 2024)
3875934  Add user e2e test for Hive catalog (yuqi1129, May 23, 2024)
a5d07cb  Fix (yuqi1129, May 23, 2024)
56af48f  Fix (yuqi1129, May 23, 2024)
05a40ac  Fix test error. (yuqi1129, May 23, 2024)
d3a4e5f  fix (yuqi1129, May 23, 2024)
291907d  Merge branch 'main' into issue_3489 (yuqi1129, May 23, 2024)
f94f118  fix (yuqi1129, May 24, 2024)
7f50dba  Merge remote-tracking branch 'me/issue_3489' into issue_3489 (yuqi1129, May 24, 2024)
a231e49  fix (yuqi1129, May 24, 2024)
b3cc997  fix (yuqi1129, May 24, 2024)
fb29d97  Fix style (yuqi1129, May 24, 2024)
845f393  Fix test error. (yuqi1129, May 24, 2024)
7c571e4  Fix test error (yuqi1129, May 24, 2024)
c73635b  fix ut again (yuqi1129, May 24, 2024)
538f91c  Merge branch 'main' into issue_3489 (yuqi1129, May 24, 2024)
832719c  Fix compile error. (yuqi1129, May 24, 2024)
e164532  Merge remote-tracking branch 'me/issue_3489' into issue_3489 (yuqi1129, May 24, 2024)
84648c0  Fix test (yuqi1129, May 24, 2024)
8014be8  Fix test (yuqi1129, May 24, 2024)
603bb54  Fix (yuqi1129, May 24, 2024)
5bce6c1  Add debug info (yuqi1129, May 25, 2024)
c71cbc6  fix (yuqi1129, May 25, 2024)
9c200ba  fix (yuqi1129, May 25, 2024)
5c11d79  fix (yuqi1129, May 25, 2024)
3761b1a  fix (yuqi1129, May 25, 2024)
d26e139  fix (yuqi1129, May 25, 2024)
a60e6ff  fix (yuqi1129, May 25, 2024)
1eab58f  fix again (yuqi1129, May 25, 2024)
6524f00  Fix again (yuqi1129, May 25, 2024)
05a810a  fix (yuqi1129, May 25, 2024)
bb8d5f7  Fix (yuqi1129, May 25, 2024)
e8d6a8c  Fix (yuqi1129, May 26, 2024)
e6afd7f  Merge branch 'main' into issue_3489 (yuqi1129, May 26, 2024)
74d1b87  Revert some code. (yuqi1129, May 26, 2024)
618e1ca  Merge remote-tracking branch 'me/issue_3489' into issue_3489 (yuqi1129, May 26, 2024)
f913b68  fix (yuqi1129, May 26, 2024)
f63d1cf  Remove some unnecessary log. (yuqi1129, May 26, 2024)
7cb5b8b  Revert some code again (yuqi1129, May 26, 2024)
b1e04b6  Optimize code. (yuqi1129, May 27, 2024)
e004750  Add more assertions (yuqi1129, May 27, 2024)
9d9766b  Merge branch 'main' into issue_3489 (yuqi1129, May 27, 2024)
@@ -10,6 +10,8 @@
/** Helper methods to create SortOrders to pass into Gravitino. */
public class SortOrders {

/** NONE is used to indicate that there is no sort order. */
public static final SortOrder[] NONE = new SortOrder[0];
/**
* Create a sort order by the given expression with the ascending sort direction and nulls first
* ordering.
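The new constant simply gives call sites an explicit name for "no sort order". A hypothetical call site, mirroring the createTable call that appears later in this PR (the surrounding identifiers here are illustrative):

    // Hypothetical usage sketch: SortOrders.NONE instead of new SortOrder[0].
    tableCatalog.createTable(
        NameIdentifier.of("metalake", "catalog", "schema", "table"),
        columns,                      // assumed to be defined elsewhere
        "a table with no sort order",
        ImmutableMap.of(),
        Transforms.EMPTY_TRANSFORM,
        Distributions.NONE,
        SortOrders.NONE);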
1 change: 1 addition & 0 deletions catalogs/catalog-hive/build.gradle.kts
@@ -165,6 +165,7 @@ tasks.test {

doFirst {
environment("GRAVITINO_CI_HIVE_DOCKER_IMAGE", "datastrato/gravitino-ci-hive:0.1.12")
environment("GRAVITINO_CI_KERBEROS_HIVE_DOCKER_IMAGE", "datastrato/gravitino-ci-kerberos-hive:0.1.1")
}

val init = project.extra.get("initIntegrationTest") as (Test) -> Unit
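For what it's worth, a local run of just this suite would look something like the following; the module path is inferred from the build file's location and `--tests` is a standard Gradle flag, so treat the exact invocation as an assumption:

    ./gradlew :catalogs:catalog-hive:test --tests "*HiveUserAuthenticationIT*"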
@@ -58,7 +58,7 @@ public Object doAs(
ops.getClientPool()
.run(
client -> {
-                  return client.getDelegationToken(realUser.getUserName(), principal.getName());
+                  return client.getDelegationToken(principal.getName(), realUser.getUserName());
Contributor Author (yuqi1129):
This is a bug; I just fixed it in passing.

});

Token<DelegationTokenIdentifier> delegationToken = new Token<DelegationTokenIdentifier>();
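For context: Hive's `IMetaStoreClient` declares `getDelegationToken(String owner, String renewerKerberosPrincipalName)`, so the old call had the owner and renewer swapped. A sketch of the corrected call, with names taken from the surrounding code:

    // owner = the principal being impersonated; renewer = the real (server-side) user.
    String tokenStr =
        client.getDelegationToken(principal.getName(), realUser.getUserName());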
@@ -0,0 +1,221 @@
/*
* Copyright 2024 Datastrato Pvt Ltd.
* This software is licensed under the Apache License version 2.
*/

package com.datastrato.gravitino.catalog.hive.integration.test;

import static com.datastrato.gravitino.catalog.hive.HiveCatalogPropertiesMeta.IMPERSONATION_ENABLE;
import static com.datastrato.gravitino.catalog.hive.HiveCatalogPropertiesMeta.KET_TAB_URI;
import static com.datastrato.gravitino.catalog.hive.HiveCatalogPropertiesMeta.METASTORE_URIS;
import static com.datastrato.gravitino.catalog.hive.HiveCatalogPropertiesMeta.PRINCIPAL;
import static com.datastrato.gravitino.connector.BaseCatalog.CATALOG_BYPASS_PREFIX;
import static org.apache.hadoop.fs.CommonConfigurationKeysPublic.HADOOP_SECURITY_AUTHENTICATION;

import com.datastrato.gravitino.Catalog;
import com.datastrato.gravitino.NameIdentifier;
import com.datastrato.gravitino.client.GravitinoAdminClient;
import com.datastrato.gravitino.client.GravitinoMetalake;
import com.datastrato.gravitino.client.KerberosTokenProvider;
import com.datastrato.gravitino.integration.test.container.ContainerSuite;
import com.datastrato.gravitino.integration.test.container.HiveContainer;
import com.datastrato.gravitino.integration.test.util.AbstractIT;
import com.datastrato.gravitino.rel.Column;
import com.datastrato.gravitino.rel.expressions.distributions.Distributions;
import com.datastrato.gravitino.rel.expressions.sorts.SortOrders;
import com.datastrato.gravitino.rel.expressions.transforms.Transforms;
import com.datastrato.gravitino.rel.types.Types;
import com.google.common.base.Throwables;
import com.google.common.collect.ImmutableMap;
import com.google.common.collect.Maps;
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Map;
import org.apache.commons.io.FileUtils;
import org.apache.hadoop.security.UserGroupInformation;
import org.junit.jupiter.api.AfterAll;
import org.junit.jupiter.api.Assertions;
import org.junit.jupiter.api.BeforeAll;
import org.junit.jupiter.api.Tag;
import org.junit.jupiter.api.Test;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

@Tag("gravitino-docker-it")
public class HiveUserAuthenticationIT extends AbstractIT {
private static final Logger LOG = LoggerFactory.getLogger(HiveUserAuthenticationIT.class);

private static final ContainerSuite containerSuite = ContainerSuite.getInstance();

private static final String GRAVITINO_CLIENT_PRINCIPAL = "gravitino_client@HADOOPKRB";
private static final String GRAVITINO_CLIENT_KEYTAB = "/gravitino_client.keytab";

private static final String GRAVITINO_SERVER_PRINCIPAL = "HTTP/localhost@HADOOPKRB";
private static final String GRAVITINO_SERVER_KEYTAB = "/gravitino_server.keytab";

private static final String HIVE_METASTORE_CLIENT_PRINCIPAL = "cli@HADOOPKRB";
private static final String HIVE_METASTORE_CLIENT_KEYTAB = "/client.keytab";

private static String TMP_DIR;

private static String HIVE_METASTORE_URI;

private static GravitinoAdminClient adminClient;

private static HiveContainer kerberosHiveContainer;

private static final String METALAKE_NAME = "test_metalake";
private static final String CATALOG_NAME = "test_catalog";
private static final String SCHEMA_NAME = "test_schema";
private static final String TABLE_NAME = "test_table";

private static final String HIVE_COL_NAME1 = "col1";
private static final String HIVE_COL_NAME2 = "col2";
private static final String HIVE_COL_NAME3 = "col3";

@BeforeAll
public static void startIntegrationTest() throws Exception {
containerSuite.startKerberosHiveContainer();
kerberosHiveContainer = containerSuite.getKerberosHiveContainer();

File baseDir = new File(System.getProperty("java.io.tmpdir"));
File file = Files.createTempDirectory(baseDir.toPath(), "test").toFile();
file.deleteOnExit();
TMP_DIR = file.getAbsolutePath();

HIVE_METASTORE_URI =
String.format(
"thrift://%s:%d",
kerberosHiveContainer.getContainerIpAddress(), HiveContainer.HIVE_METASTORE_PORT);

// Prepare the Kerberos-related config
prepareKerberosConfig();

// Configure Kerberos for the Gravitino server
addKerberosConfig();

// Start Gravitino server
AbstractIT.startIntegrationTest();
}

@AfterAll
public static void stopIntegrationTest() {
// Reset the UGI
UserGroupInformation.reset();

// Clean up the kerberos configuration
System.clearProperty("java.security.krb5.conf");
System.clearProperty("sun.security.krb5.debug");
}

private static void prepareKerberosConfig() throws IOException {
// Keytab of the Gravitino SDK client
kerberosHiveContainer
.getContainer()
.copyFileFromContainer("/gravitino_client.keytab", TMP_DIR + GRAVITINO_CLIENT_KEYTAB);

// Keytab of the Gravitino server
kerberosHiveContainer
.getContainer()
.copyFileFromContainer("/gravitino_server.keytab", TMP_DIR + GRAVITINO_SERVER_KEYTAB);

// Keytab used by the Gravitino server to connect to Hive
kerberosHiveContainer
.getContainer()
.copyFileFromContainer("/etc/admin.keytab", TMP_DIR + HIVE_METASTORE_CLIENT_KEYTAB);

String tmpKrb5Path = TMP_DIR + "krb5.conf_tmp";
String krb5Path = TMP_DIR + "krb5.conf";
kerberosHiveContainer.getContainer().copyFileFromContainer("/etc/krb5.conf", tmpKrb5Path);

// Modify the krb5.conf and change the kdc and admin_server to the container IP
String ip = containerSuite.getKerberosHiveContainer().getContainerIpAddress();
String content = FileUtils.readFileToString(new File(tmpKrb5Path), StandardCharsets.UTF_8);
content = content.replace("kdc = localhost:88", "kdc = " + ip + ":88");
content = content.replace("admin_server = localhost", "admin_server = " + ip + ":749");
FileUtils.write(new File(krb5Path), content, StandardCharsets.UTF_8);

LOG.info("Kerberos kdc config:\n{}", content);
System.setProperty("java.security.krb5.conf", krb5Path);
System.setProperty("sun.security.krb5.debug", "true");
}

private static void addKerberosConfig() {
AbstractIT.customConfigs.put("gravitino.authenticator", "kerberos");
AbstractIT.customConfigs.put(
"gravitino.authenticator.kerberos.principal", GRAVITINO_SERVER_PRINCIPAL);
AbstractIT.customConfigs.put(
"gravitino.authenticator.kerberos.keytab", TMP_DIR + GRAVITINO_SERVER_KEYTAB);
}

@Test
public void testUserAuthentication() {
KerberosTokenProvider provider =
KerberosTokenProvider.builder()
.withClientPrincipal(GRAVITINO_CLIENT_PRINCIPAL)
.withKeyTabFile(new File(TMP_DIR + GRAVITINO_CLIENT_KEYTAB))
.build();
adminClient = GravitinoAdminClient.builder(serverUri).withKerberosAuth(provider).build();

GravitinoMetalake[] metalakes = adminClient.listMetalakes();
Assertions.assertEquals(0, metalakes.length);

GravitinoMetalake gravitinoMetalake =
adminClient.createMetalake(METALAKE_NAME, null, ImmutableMap.of());

Map<String, String> properties = Maps.newHashMap();
properties.put(METASTORE_URIS, HIVE_METASTORE_URI);
properties.put(IMPERSONATION_ENABLE, "true");
properties.put(KET_TAB_URI, TMP_DIR + HIVE_METASTORE_CLIENT_KEYTAB);
properties.put(PRINCIPAL, HIVE_METASTORE_CLIENT_PRINCIPAL);

properties.put(CATALOG_BYPASS_PREFIX + HADOOP_SECURITY_AUTHENTICATION, "kerberos");
properties.put(
CATALOG_BYPASS_PREFIX + "hive.metastore.kerberos.principal",
"hive/_HOST@HADOOPKRB"
.replace("_HOST", containerSuite.getKerberosHiveContainer().getHostName()));
properties.put(CATALOG_BYPASS_PREFIX + "hive.metastore.sasl.enabled", "true");
Contributor:
Why do we need to enable SASL here?

Contributor (jerryshao):
I am asking why it needs to be enabled, not what you did to enable it.

Contributor Author (yuqi1129):
If we enable Kerberos for Hive, the transport layer must be secured with either SSL or SASL. SASL is considerably simpler to configure than SSL, so we chose SASL; that is why it has to be enabled for both the Hive metastore and the Hive client.
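    // With the bypass prefix stripped, the three entries above should reach the underlying
    // Hive client configuration roughly as:
    //   hadoop.security.authentication = kerberos
    //   hive.metastore.kerberos.principal = hive/<kerberos-hive-hostname>@HADOOPKRB
    //   hive.metastore.sasl.enabled = true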


Catalog catalog =
gravitinoMetalake.createCatalog(
CATALOG_NAME, Catalog.Type.RELATIONAL, "hive", "comment", properties);

Exception exception =
Assertions.assertThrows(
Exception.class,
() -> catalog.asSchemas().createSchema(SCHEMA_NAME, "comment", ImmutableMap.of()));
String exceptionMessage = Throwables.getStackTraceAsString(exception);
// Make sure the real user is 'gravitino_client'
Assertions.assertTrue(
exceptionMessage.contains("Permission denied: user=gravitino_client, access=WRITE"));

// Now grant write permission on the warehouse and try to create the schema again
kerberosHiveContainer.executeInContainer(
"hadoop", "fs", "-chmod", "-R", "777", "/user/hive/warehouse");
Assertions.assertDoesNotThrow(
() -> catalog.asSchemas().createSchema(SCHEMA_NAME, "comment", ImmutableMap.of()));

// Create table
NameIdentifier tableNameIdentifier =
NameIdentifier.of(METALAKE_NAME, CATALOG_NAME, SCHEMA_NAME, TABLE_NAME);
catalog
.asTableCatalog()
.createTable(
tableNameIdentifier,
createColumns(),
"",
ImmutableMap.of(),
Transforms.EMPTY_TRANSFORM,
Distributions.NONE,
SortOrders.NONE);
}

private static Column[] createColumns() {
Column col1 = Column.of(HIVE_COL_NAME1, Types.ByteType.get(), "col_1_comment");
Column col2 = Column.of(HIVE_COL_NAME2, Types.DateType.get(), "col_2_comment");
Column col3 = Column.of(HIVE_COL_NAME3, Types.StringType.get(), "col_3_comment");
return new Column[] {col1, col2, col3};
Contributor:
I think you'd better add more tests, such as drop table and create/drop schema.

Contributor Author (yuqi1129):
Added.

}
}
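The "Added" above lands in commits newer than this 23-commit view, so the extra coverage is not visible here. A sketch of what the requested drop assertions might look like, reusing the identifiers from the test above (the string-based schema API mirrors the createSchema call in this file; the merged version may differ):

    // Hypothetical follow-up assertions: clean-up paths must also work as the Kerberos user.
    Assertions.assertTrue(catalog.asTableCatalog().dropTable(tableNameIdentifier));
    Assertions.assertTrue(catalog.asSchemas().dropSchema(SCHEMA_NAME, true));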
40 changes: 40 additions & 0 deletions dev/docker/kerberos-hive/core-site.xml
Original file line number Diff line number Diff line change
@@ -24,6 +24,41 @@
<value>*</value>
</property>

<property>
<name>hadoop.proxyuser.hive.users</name>
<value>*</value>
</property>

<property>
<name>hadoop.proxyuser.cli.hosts</name>
<value>*</value>
</property>

<property>
<name>hadoop.proxyuser.cli.groups</name>
<value>*</value>
</property>

<property>
<name>hadoop.proxyuser.cli.users</name>
<value>*</value>
</property>

<property>
<name>hadoop.proxyuser.hadoop.hosts</name>
<value>*</value>
</property>

<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value>*</value>
</property>

<property>
<name>hadoop.proxyuser.hadoop.users</name>
<value>*</value>
</property>

<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
@@ -34,6 +69,11 @@
<value>*</value>
</property>

<property>
<name>hadoop.proxyuser.root.users</name>
<value>*</value>
</property>

<property>
<name>hadoop.security.auth_to_local</name>
<value>
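These blocks all follow Hadoop's standard proxy-user pattern: `hadoop.proxyuser.<user>.hosts`, `.groups`, and `.users` control where the named service user may proxy from and whom it may impersonate. The wildcards keep the CI image permissive; a locked-down deployment would enumerate concrete values instead, for example (values illustrative):

  <property>
    <name>hadoop.proxyuser.hive.hosts</name>
    <value>gravitino-server-host</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hive.users</name>
    <value>gravitino_client</value>
  </property>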
7 changes: 7 additions & 0 deletions dev/docker/kerberos-hive/start.sh
@@ -42,6 +42,13 @@ kadmin.local -q "ktadd -norandkey -k ${KRB5_KTNAME} hive/${HOSTNAME}@${FQDN}"
kadmin.local -q "xst -k /hive.keytab -norandkey hive/${HOSTNAME}@${FQDN}"
kadmin.local -q "xst -k /cli.keytab -norandkey cli@${FQDN}"

# For Gravitino web server
echo -e "${PASS}\n${PASS}" | kadmin.local -q "addprinc gravitino_client@${FQDN}"
kadmin.local -q "ktadd -norandkey -k /gravitino_client.keytab gravitino_client@${FQDN}"

echo -e "${PASS}\n${PASS}" | kadmin.local -q "addprinc HTTP/localhost@${FQDN}"
kadmin.local -q "ktadd -norandkey -k /gravitino_server.keytab HTTP/localhost@${FQDN}"

echo -e "${PASS}\n" | kinit hive/${HOSTNAME}

# Update the configuration file
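To sanity-check the generated keytabs inside the container, standard MIT Kerberos tooling can list their entries (paths taken from the script above):

    klist -k -t /gravitino_client.keytab
    klist -k -t /gravitino_server.keytab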
@@ -11,11 +11,8 @@
import com.datastrato.gravitino.Configs;
import com.datastrato.gravitino.auth.AuthenticatorType;
import com.datastrato.gravitino.auxiliary.AuxiliaryServiceManager;
import com.datastrato.gravitino.client.ErrorHandlers;
import com.datastrato.gravitino.client.HTTPClient;
import com.datastrato.gravitino.client.RESTClient;
import com.datastrato.gravitino.dto.responses.VersionResponse;
import com.datastrato.gravitino.exceptions.RESTException;
import com.datastrato.gravitino.integration.test.util.ITUtils;
import com.datastrato.gravitino.integration.test.util.KerberosProviderHelper;
import com.datastrato.gravitino.integration.test.util.OAuthMockDataProvider;
@@ -28,7 +25,6 @@
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
@@ -242,26 +238,35 @@ private void customizeConfigFile(String configTempFileName, String configFileNam
}

private boolean checkIfServerIsRunning() {
    // The previous HTTP probe of /api/version is commented out in this revision, presumably
    // because an unauthenticated GET no longer succeeds once the server starts with the
    // Kerberos authenticator enabled; the check now simply waits and assumes the server is up.
    // String URI = String.format("http://%s:%d", host, port);
    // LOG.info("checkIfServerIsRunning() URI: {}", URI);
    //
    // VersionResponse response = null;
    // try {
    //   response =
    //       restClient.get(
    //           "api/version",
    //           VersionResponse.class,
    //           Collections.emptyMap(),
    //           ErrorHandlers.restErrorHandler());
    // } catch (RESTException e) {
    //   LOG.warn("checkIfServerIsRunning() fails, GravitinoServer is not running {}",
    //       e.getMessage());
    //   return false;
    // }
    // if (response != null && response.getCode() == 0) {
    //   return true;
    // } else {
    //   LOG.warn("checkIfServerIsRunning() fails, GravitinoServer is not running");
    //   return false;
    // }

    try {
      Thread.sleep(5000);
    } catch (Exception e) {
      return false;
    }

    return true;
  }
}
@@ -110,7 +110,6 @@ public void startKerberosHiveContainer() {
HiveContainer.Builder hiveBuilder =
HiveContainer.builder()
.withHostName("gravitino-ci-kerberos-hive")
-            .withEnvVars(ImmutableMap.<String, String>builder().build())
.withKerberosEnabled(true)
.withNetwork(network);
HiveContainer container = closer.register(hiveBuilder.build());