From 8c688bb887180d269effe183bffd9dc576c5b42e Mon Sep 17 00:00:00 2001
From: K0K0V0K
Date: Sat, 22 Nov 2025 10:23:56 +0100
Subject: [PATCH 1/4] MAPREDUCE-7523. MapReduce Task-Level Security Enforcement

The goal of this feature is to provide a configurable mechanism to control which users are allowed to execute specific MapReduce jobs.
This feature aims to prevent unauthorized or potentially harmful mapper/reducer implementations from running within the Hadoop cluster.

In the standard Hadoop MapReduce execution flow:
1) A MapReduce job is submitted by a user.
2) The job is registered with the Resource Manager (RM).
3) The RM assigns the job to a Node Manager (NM), where the Application Master (AM) for the job is launched.
4) The AM requests additional containers from the cluster so that it can start tasks.
5) The NM launches those containers, and the containers execute the mapper/reducer tasks defined by the job.

The proposed feature introduces a security filtering mechanism inside the Application Master.
Before mapper or reducer tasks are launched, the AM verifies that the classes referenced in the job configuration comply with a cluster-defined security policy.
This prevents classes or packages on the denied list from being executed inside the containers.
The goal is to protect the cluster from unwanted or unsafe task implementations, such as custom code that may introduce performance, stability, or security risks.

Upon receiving job metadata, the Application Master will:
1) Check whether the feature is enabled.
2) Check whether the user who submitted the job is allowed to bypass the security check.
3) Compare the classes in the job config against the denied task list.
4) If the job is not authorized, throw an exception, which causes the AM to fail.

New Configs

Enables MapReduce Task-Level Security Enforcement
When enabled, the Application Master performs validation of user-submitted mapper, reducer, and other task-related classes before launching containers.
This mechanism protects the cluster from running disallowed or unsafe task implementations as defined by administrator-controlled policies.
- Property name: mapreduce.security.enabled
- Property type: boolean
- Default: false (security disabled)

MapReduce Task-Level Security Enforcement: Property Domain
Defines the set of MapReduce configuration keys that represent user-supplied class names involved in task execution (e.g., mapper, reducer, partitioner).
The Application Master examines the values of these properties and checks whether any referenced class matches an entry in denied tasks.
Administrators may override this list to expand or restrict the validation domain.
- Property name: mapreduce.security.property-domain
- Property type: list of configuration keys
- Default:
  map.sort.class
  mapreduce.job.classloader.system.classes
  mapreduce.job.combine.class
  mapreduce.job.combiner.group.comparator.class
  mapreduce.job.end-notification.custom-notifier-class
  mapreduce.job.inputformat.class
  mapreduce.job.map.class
  mapreduce.job.map.output.collector.class
  mapreduce.job.output.group.comparator.class
  mapreduce.job.output.key.class
  mapreduce.job.output.key.comparator.class
  mapreduce.job.output.value.class
  mapreduce.job.outputformat.class
  mapreduce.job.partitioner.class
  mapreduce.job.reduce.class
  mapreduce.map.output.key.class
  mapreduce.map.output.value.class

MapReduce Task-Level Security Enforcement: Denied Tasks
Specifies the list of disallowed task implementation classes or packages.
If a user submits a job whose mapper, reducer, or other task-related classes match any entry in this blacklist, the job is rejected.
An entry matches when the configured class name starts with it.
- Property name: mapreduce.security.denied-tasks
- Property type: list of class name or package patterns
- Default: empty
- Example: org.apache.hadoop.streaming,org.apache.hadoop.examples.QuasiMonteCarlo

MapReduce Task-Level Security Enforcement: Allowed Users
Specifies users who may bypass the blacklist defined in denied tasks.
This whitelist is intended for trusted or system-level workflows that may legitimately require the use of restricted task implementations.
If the submitting user is listed here, blacklist enforcement is skipped, although standard Hadoop authentication and ACL checks still apply.
- Property name: mapreduce.security.allowed-users
- Property type: list of usernames
- Default: empty
- Example: alice,bob
---
 .../hadoop/mapreduce/v2/app/MRAppMaster.java  |   2 +
 .../authorize/TaskLevelSecurityEnforcer.java  |  97 ++++++++++++++
 .../authorize/TaskLevelSecurityException.java |  43 ++++++
 .../TestTaskLevelSecurityEnforcer.java        | 124 ++++++++++++++++++
 .../org/apache/hadoop/mapreduce/MRConfig.java |  77 +++++++++++
 .../src/main/resources/mapred-default.xml     |  40 ++++++
 .../markdown/TaskLevelSecurityEnforcement.md  |  92 +++++++++++++
 hadoop-project/src/site/site.xml              |   1 +
 8 files changed, 476 insertions(+)
 create mode 100644 hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/security/authorize/TaskLevelSecurityEnforcer.java
 create mode 100644 hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/security/authorize/TaskLevelSecurityException.java
 create mode 100644 hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/security/authorize/TestTaskLevelSecurityEnforcer.java
 create mode 100644 hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/TaskLevelSecurityEnforcement.md

diff --git a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java
index 703f0b1f58778..20c52b093616a 100644
--- a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java
+++ b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java
@@ -114,6 +114,7 @@
 import org.apache.hadoop.mapreduce.v2.app.rm.RMHeartbeatHandler;
 import org.apache.hadoop.mapreduce.v2.app.rm.preemption.AMPreemptionPolicy;
 import org.apache.hadoop.mapreduce.v2.app.rm.preemption.NoopAMPreemptionPolicy;
+import org.apache.hadoop.mapreduce.v2.app.security.authorize.TaskLevelSecurityEnforcer;
 import org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator;
 import org.apache.hadoop.mapreduce.v2.app.speculate.Speculator;
 import org.apache.hadoop.mapreduce.v2.app.speculate.SpeculatorEvent;
@@ -1683,6 +1684,7 @@ public static void main(String[] args) {
       String jobUserName = System
           .getenv(ApplicationConstants.Environment.USER.name());
       conf.set(MRJobConfig.USER_NAME, jobUserName);
+      TaskLevelSecurityEnforcer.validate(conf);
       initAndStartAppMaster(appMaster, conf, jobUserName);
     } catch (Throwable t) {
       LOG.error("Error starting MRAppMaster", t);
diff --git a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/security/authorize/TaskLevelSecurityEnforcer.java b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/security/authorize/TaskLevelSecurityEnforcer.java
new file mode 100644
index 0000000000000..0d469b05241ad
--- /dev/null
+++ b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/security/authorize/TaskLevelSecurityEnforcer.java
@@ -0,0 +1,97 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.mapreduce.v2.app.security.authorize;
+
+import java.util.Arrays;
+import java.util.List;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapreduce.MRConfig;
+import org.apache.hadoop.mapreduce.MRJobConfig;
+
+/**
+ * Enforces task-level security rules for MapReduce jobs.
+ *
+ * <p>This security enforcement mechanism validates whether the user who submitted
+ * a job is allowed to execute the mapper/reducer/task classes defined in the job
+ * configuration. The check is performed inside the Application Master before
+ * task containers are launched.</p>
+ * <p>If the user is not on the allowed list and any job property within the configured
+ * security property domain references a denied class/prefix, a
+ * {@link TaskLevelSecurityException} is thrown and the job is rejected.</p>
+ * <p>This prevents unauthorized or unsafe custom code from running inside
+ * cluster containers.</p>
+ */
+public class TaskLevelSecurityEnforcer {
+  private static final Logger LOG = LoggerFactory.getLogger(TaskLevelSecurityEnforcer.class);
+
+  /**
+   * Validates a MapReduce job's configuration against the cluster's task-level
+   * security policy.
+   *
+   * <p>The method performs the following steps:</p>
+   * <ol>
+   *   <li>Check whether task-level security is enabled.</li>
+   *   <li>Allow the job immediately if the user is on the configured allowed-users list.</li>
+   *   <li>Retrieve the security property domain (list of job configuration keys to inspect).</li>
+   *   <li>Retrieve the list of denied task class prefixes.</li>
+   *   <li>For each property in the domain, check whether its value begins with any denied prefix.</li>
+   *   <li>If a match is found, reject the job by throwing {@link TaskLevelSecurityException}.</li>
+   * </ol>
+   *
+   * @param conf the job configuration to validate
+   * @throws TaskLevelSecurityException if the user is not authorized to use one of the task classes
+   */
+  public static void validate(JobConf conf) throws TaskLevelSecurityException {
+    if (!conf.getBoolean(MRConfig.SECURITY_ENABLED, MRConfig.DEFAULT_SECURITY_ENABLED)) {
+      LOG.debug("The {} is disabled", MRConfig.SECURITY_ENABLED);
+      return;
+    }
+
+    String currentUser = conf.get(MRJobConfig.USER_NAME);
+    List<String> allowedUsers = Arrays.asList(conf.getTrimmedStrings(
+        MRConfig.SECURITY_ALLOWED_USERS,
+        MRConfig.DEFAULT_SECURITY_ALLOWED_USERS
+    ));
+    if (allowedUsers.contains(currentUser)) {
+      LOG.debug("The {} is allowed to execute every task", currentUser);
+      return;
+    }
+
+    String[] propertyDomain = conf.getTrimmedStrings(
+        MRConfig.SECURITY_PROPERTY_DOMAIN,
+        MRConfig.DEFAULT_SECURITY_PROPERTY_DOMAIN
+    );
+    String[] deniedTasks = conf.getTrimmedStrings(
+        MRConfig.SECURITY_DENIED_TASKS,
+        MRConfig.DEFAULT_SECURITY_DENIED_TASKS
+    );
+    for (String property : propertyDomain) {
+      String propertyValue = conf.get(property, "");
+      for (String deniedTask : deniedTasks) {
+        if (propertyValue.startsWith(deniedTask)) {
+          throw new TaskLevelSecurityException(currentUser, property, propertyValue, deniedTask);
+        }
+      }
+    }
+    LOG.debug("The {} is allowed to execute the submitted job", currentUser);
+  }
+}
diff --git a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/security/authorize/TaskLevelSecurityException.java b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/security/authorize/TaskLevelSecurityException.java
new file mode 100644
index 0000000000000..8b0c21d5bed2a
--- /dev/null
+++ b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/security/authorize/TaskLevelSecurityException.java
@@ -0,0 +1,43 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.mapreduce.v2.app.security.authorize;
+
+import org.apache.hadoop.security.AccessControlException;
+
+/**
+ * Exception thrown when a MapReduce job violates the Task-Level Security policy.
+ */
+public class TaskLevelSecurityException extends AccessControlException {
+
+  /**
+   * Constructs a new TaskLevelSecurityException describing the specific policy violation.
+   *
+   * @param user the submitting user
+   * @param property the MapReduce configuration key that was checked
+   * @param propertyValue the value provided for that configuration property
+   * @param deniedTask the blacklist entry that the value matched
+   */
+  public TaskLevelSecurityException(
+      String user, String property, String propertyValue, String deniedTask
+  ) {
+    super(String.format(
+        "User %s is not allowed to use %s = %s because it matches the denied task entry %s",
+        user, property, propertyValue, deniedTask
+    ));
+  }
+}
diff --git a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/security/authorize/TestTaskLevelSecurityEnforcer.java b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/security/authorize/TestTaskLevelSecurityEnforcer.java
new file mode 100644
index 0000000000000..675ebe81d8589
--- /dev/null
+++ b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/security/authorize/TestTaskLevelSecurityEnforcer.java
@@ -0,0 +1,124 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.mapreduce.v2.app.security.authorize;
+
+import org.junit.jupiter.api.Test;
+
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapreduce.MRConfig;
+import org.apache.hadoop.mapreduce.MRJobConfig;
+
+import static org.junit.jupiter.api.Assertions.assertDoesNotThrow;
+import static org.junit.jupiter.api.Assertions.assertThrows;
+
+public class TestTaskLevelSecurityEnforcer {
+
+  @Test
+  public void testServiceDisabled() {
+    JobConf conf = new JobConf();
+    assertPass(conf);
+  }
+
+  @Test
+  public void testServiceEnabled() {
+    JobConf conf = new JobConf();
+    conf.setBoolean(MRConfig.SECURITY_ENABLED, true);
+    assertPass(conf);
+  }
+
+  @Test
+  public void testDeniedPackage() {
+    JobConf conf = new JobConf();
+    conf.setBoolean(MRConfig.SECURITY_ENABLED, true);
+    conf.setStrings(MRConfig.SECURITY_DENIED_TASKS, "org.apache.hadoop.streaming");
+    conf.set(MRJobConfig.MAP_CLASS_ATTR, "org.apache.hadoop.streaming.PipeMapper");
+    assertDenied(conf);
+  }
+
+  @Test
+  public void testDeniedClass() {
+    JobConf conf = new JobConf();
+    conf.setBoolean(MRConfig.SECURITY_ENABLED, true);
+    conf.setStrings(MRConfig.SECURITY_DENIED_TASKS,
+        "org.apache.hadoop.streaming",
+        "org.apache.hadoop.examples.QuasiMonteCarlo$QmcReducer");
+    conf.set(MRJobConfig.REDUCE_CLASS_ATTR,
+        "org.apache.hadoop.examples.QuasiMonteCarlo$QmcReducer");
+    assertDenied(conf);
+  }
+
+  @Test
+  public void testIgnoreReducer() {
+    JobConf conf = new JobConf();
+    conf.setBoolean(MRConfig.SECURITY_ENABLED, true);
+    conf.setStrings(MRConfig.SECURITY_PROPERTY_DOMAIN,
+        MRJobConfig.MAP_CLASS_ATTR,
+        MRJobConfig.COMBINE_CLASS_ATTR);
+    conf.setStrings(MRConfig.SECURITY_DENIED_TASKS,
+        "org.apache.hadoop.streaming",
+        "org.apache.hadoop.examples.QuasiMonteCarlo$QmcReducer");
+    conf.set(MRJobConfig.REDUCE_CLASS_ATTR,
+        "org.apache.hadoop.examples.QuasiMonteCarlo$QmcReducer");
+    assertPass(conf);
+  }
+
+  @Test
+  public void testDeniedUser() {
+    JobConf conf = new JobConf();
+    conf.setBoolean(MRConfig.SECURITY_ENABLED, true);
+    conf.setStrings(MRConfig.SECURITY_DENIED_TASKS, "org.apache.hadoop.streaming");
+    conf.setStrings(MRConfig.SECURITY_ALLOWED_USERS, "alice");
+    conf.set(MRJobConfig.MAP_CLASS_ATTR, "org.apache.hadoop.streaming.PipeMapper");
+    conf.set(MRJobConfig.USER_NAME, "bob");
+    assertDenied(conf);
+  }
+
+  @Test
+  public void testAllowedUser() {
+    JobConf conf = new JobConf();
+    conf.setBoolean(MRConfig.SECURITY_ENABLED, true);
+    conf.setStrings(MRConfig.SECURITY_DENIED_TASKS, "org.apache.hadoop.streaming");
+    conf.setStrings(MRConfig.SECURITY_ALLOWED_USERS, "alice", "bob");
+    conf.set(MRJobConfig.MAP_CLASS_ATTR, "org.apache.hadoop.streaming.PipeMapper");
+    conf.set(MRJobConfig.USER_NAME, "bob");
+    assertPass(conf);
+  }
+
+  @Test
+  public void testTurnOff() {
+    JobConf conf = new JobConf();
+    conf.setBoolean(MRConfig.SECURITY_ENABLED, false);
+    conf.setStrings(MRConfig.SECURITY_DENIED_TASKS, "org.apache.hadoop.streaming");
+    conf.setStrings(MRConfig.SECURITY_ALLOWED_USERS, "alice");
+    conf.set(MRJobConfig.MAP_CLASS_ATTR, "org.apache.hadoop.streaming.PipeMapper");
+    conf.set(MRJobConfig.USER_NAME, "bob");
+    assertPass(conf);
+  }
+
+  private void assertPass(JobConf conf) {
+    assertDoesNotThrow(
+        () -> TaskLevelSecurityEnforcer.validate(conf),
+        "Config denied but validation pass was expected");
+  }
+
+  private void assertDenied(JobConf conf) {
+    assertThrows(TaskLevelSecurityException.class,
+        () -> TaskLevelSecurityEnforcer.validate(conf),
+        "Config validation pass but denied was expected");
+  }
+}
diff --git a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRConfig.java b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRConfig.java
index 8671eb30b993a..167f0453ac792 100644
--- a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRConfig.java
+++ b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRConfig.java
@@ -133,5 +133,82 @@ public interface MRConfig {
   boolean DEFAULT_MASTER_WEBAPP_UI_ACTIONS_ENABLED = true;
   String MULTIPLE_OUTPUTS_CLOSE_THREAD_COUNT = "mapreduce.multiple-outputs-close-threads";
   int DEFAULT_MULTIPLE_OUTPUTS_CLOSE_THREAD_COUNT = 10;
+
+  /**
+   * Enables MapReduce Task-Level Security Enforcement.
+   *
+   * When enabled, the Application Master performs validation of user-submitted
+   * mapper, reducer, and other task-related classes before launching containers.
+   * This mechanism protects the cluster from running disallowed or unsafe task
+   * implementations as defined by administrator-controlled policies.
+   *
+   * Property type: boolean
+   * Default: false (security disabled)
+   */
+  String SECURITY_ENABLED = "mapreduce.security.enabled";
+  boolean DEFAULT_SECURITY_ENABLED = false;
+
+  /**
+   * MapReduce Task-Level Security Enforcement: Property Domain
+   *
+   * Defines the set of MapReduce configuration keys that represent user-supplied
+   * class names involved in task execution (e.g., mapper, reducer, partitioner).
+   * The Application Master examines the values of these properties and checks
+   * whether any referenced class is listed in {@link #SECURITY_DENIED_TASKS}.
+   * Administrators may override this list to expand or restrict the validation
+   * domain.
+   *
+   * Property type: list of configuration keys
+   * Default: all known task-level class properties (see list below)
+   */
+  String SECURITY_PROPERTY_DOMAIN = "mapreduce.security.property-domain";
+  String[] DEFAULT_SECURITY_PROPERTY_DOMAIN = {
+      "mapreduce.job.combine.class",
+      "mapreduce.job.combiner.group.comparator.class",
+      "mapreduce.job.end-notification.custom-notifier-class",
+      "mapreduce.job.inputformat.class",
+      "mapreduce.job.map.class",
+      "mapreduce.job.map.output.collector.class",
+      "mapreduce.job.output.group.comparator.class",
+      "mapreduce.job.output.key.class",
+      "mapreduce.job.output.key.comparator.class",
+      "mapreduce.job.output.value.class",
+      "mapreduce.job.outputformat.class",
+      "mapreduce.job.partitioner.class",
+      "mapreduce.job.reduce.class",
+      "mapreduce.map.output.key.class",
+      "mapreduce.map.output.value.class"
+  };
+
+  /**
+   * MapReduce Task-Level Security Enforcement: Denied Tasks
+   *
+   * Specifies the list of disallowed task implementation classes or packages.
+   * If a user submits a job whose mapper, reducer, or other task-related classes
+   * match any entry in this blacklist, the job is rejected.
+   *
+   * Property type: list of class name or package patterns
+   * Default: empty (no restrictions)
+   * Example: org.apache.hadoop.streaming,org.apache.hadoop.examples.QuasiMonteCarlo
+   */
+  String SECURITY_DENIED_TASKS = "mapreduce.security.denied-tasks";
+  String[] DEFAULT_SECURITY_DENIED_TASKS = {};
+
+  /**
+   * MapReduce Task-Level Security Enforcement: Allowed Users
+   *
+   * Specifies users who may bypass the blacklist defined in
+   * {@link #SECURITY_DENIED_TASKS}.
+   * This whitelist is intended for trusted or system-level workflows that may
+   * legitimately require the use of restricted task implementations.
+   * If the submitting user is listed here, blacklist enforcement is skipped,
+   * although standard Hadoop authentication and ACL checks still apply.
+   *
+   * Property type: list of usernames
+   * Default: empty (no bypass users)
+   * Example: hue,hive
+   */
+  String SECURITY_ALLOWED_USERS = "mapreduce.security.allowed-users";
+  String[] DEFAULT_SECURITY_ALLOWED_USERS = {};
 }
diff --git a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
index 066d80a89c4eb..9c8cdb670899d 100644
--- a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
+++ b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
@@ -2282,4 +2282,44 @@
+  <property>
+    <name>mapreduce.security.enabled</name>
+    <value>false</value>
+    <description>
+      Enables MapReduce Task-Level Security Enforcement
+      When enabled, the Application Master performs validation of user-submitted mapper, reducer, and other task-related classes before launching containers.
+      This mechanism protects the cluster from running disallowed or unsafe task implementations as defined by administrator-controlled policies.
+    </description>
+  </property>
+  <property>
+    <name>mapreduce.security.property-domain</name>
+    <value>mapreduce.job.combine.class,mapreduce.job.combiner.group.comparator.class,mapreduce.job.end-notification.custom-notifier-class,mapreduce.job.inputformat.class,mapreduce.job.map.class,mapreduce.job.map.output.collector.class,mapreduce.job.output.group.comparator.class,mapreduce.job.output.key.class,mapreduce.job.output.key.comparator.class,mapreduce.job.output.value.class,mapreduce.job.outputformat.class,mapreduce.job.partitioner.class,mapreduce.job.reduce.class,mapreduce.map.output.key.class,mapreduce.map.output.value.class</value>
+    <description>
+      MapReduce Task-Level Security Enforcement: Property Domain
+      Defines the set of MapReduce configuration keys that represent user-supplied class names involved in task execution (e.g., mapper, reducer, partitioner).
+      The Application Master examines the values of these properties and checks whether any referenced class is listed in denied tasks.
+      Administrators may override this list to expand or restrict the validation domain.
+    </description>
+  </property>
+  <property>
+    <name>mapreduce.security.denied-tasks</name>
+    <value></value>
+    <description>
+      Specifies the list of disallowed task implementation classes or packages.
+      If a user submits a job whose mapper, reducer, or other task-related classes match any entry in this blacklist, the job is rejected.
+    </description>
+  </property>
+  <property>
+    <name>mapreduce.security.allowed-users</name>
+    <value></value>
+    <description>
+      Specifies users who may bypass the blacklist defined in denied tasks.
+      This whitelist is intended for trusted or system-level workflows that may legitimately require the use of restricted task implementations.
+      If the submitting user is listed here, blacklist enforcement is skipped, although standard Hadoop authentication and ACL checks still apply.
+    </description>
+  </property>
+
diff --git a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/TaskLevelSecurityEnforcement.md b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/TaskLevelSecurityEnforcement.md
new file mode 100644
index 0000000000000..c3e3a26073167
--- /dev/null
+++ b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/TaskLevelSecurityEnforcement.md
@@ -0,0 +1,92 @@
+
+
+MR Task-Level Security Enforcement
+==================
+
+
+
+Overview
+-------
+The goal of this feature is to provide a configurable mechanism to control which users are allowed to execute specific MapReduce jobs.
+This feature aims to prevent unauthorized or potentially harmful mapper/reducer implementations from running within the Hadoop cluster.
+
+In the standard Hadoop MapReduce execution flow:
+1) A MapReduce job is submitted by a user.
+2) The job is registered with the Resource Manager (RM).
+3) The RM assigns the job to a Node Manager (NM), where the Application Master (AM) for the job is launched.
+4) The AM requests additional containers from the cluster so that it can start tasks.
+5) The NM launches those containers, and the containers execute the mapper/reducer tasks defined by the job.
+
+This feature introduces a security filtering mechanism inside the Application Master.
+Before mapper or reducer tasks are launched, the AM verifies that the classes referenced in the job configuration comply with a cluster-defined security policy.
+This prevents classes or packages on the denied list from being executed inside the containers.
+The goal is to protect the cluster from unwanted or unsafe task implementations, such as custom code that may introduce performance, stability, or security risks.
+
+Upon receiving job metadata, the Application Master will:
+1) Check whether the feature is enabled.
+2) Check whether the user who submitted the job is allowed to bypass the security check.
+3) Compare the classes in the job config against the denied task list.
+4) If the job is not authorized, throw an exception, which causes the AM to fail.
+
+Configurations
+-------
+
+#### Enables MapReduce Task-Level Security Enforcement
+When enabled, the Application Master performs validation of user-submitted mapper, reducer, and other task-related classes before launching containers.
+This mechanism protects the cluster from running disallowed or unsafe task implementations as defined by administrator-controlled policies.
+- Property name: mapreduce.security.enabled
+- Property type: boolean
+- Default: false (security disabled)
+
+
+#### MapReduce Task-Level Security Enforcement: Property Domain
+Defines the set of MapReduce configuration keys that represent user-supplied class names involved in task execution (e.g., mapper, reducer, partitioner).
+The Application Master examines the values of these properties and checks whether any referenced class is listed in denied tasks.
+Administrators may override this list to expand or restrict the validation domain.
+- Property name: mapreduce.security.property-domain
+- Property type: list of configuration keys
+- Default:
+    - mapreduce.job.combine.class
+    - mapreduce.job.combiner.group.comparator.class
+    - mapreduce.job.end-notification.custom-notifier-class
+    - mapreduce.job.inputformat.class
+    - mapreduce.job.map.class
+    - mapreduce.job.map.output.collector.class
+    - mapreduce.job.output.group.comparator.class
+    - mapreduce.job.output.key.class
+    - mapreduce.job.output.key.comparator.class
+    - mapreduce.job.output.value.class
+    - mapreduce.job.outputformat.class
+    - mapreduce.job.partitioner.class
+    - mapreduce.job.reduce.class
+    - mapreduce.map.output.key.class
+    - mapreduce.map.output.value.class
+
+#### MapReduce Task-Level Security Enforcement: Denied Tasks
+Specifies the list of disallowed task implementation classes or packages.
+If a user submits a job whose mapper, reducer, or other task-related classes match any entry in this blacklist, the job is rejected.
+- Property name: mapreduce.security.denied-tasks
+- Property type: list of class name or package patterns
+- Default: empty
+- Example: org.apache.hadoop.streaming,org.apache.hadoop.examples.QuasiMonteCarlo
+
+#### MapReduce Task-Level Security Enforcement: Allowed Users
+Specifies users who may bypass the blacklist defined in denied tasks.
+This whitelist is intended for trusted or system-level workflows that may legitimately require the use of restricted task implementations.
+If the submitting user is listed here, blacklist enforcement is skipped, although standard Hadoop authentication and ACL checks still apply.
+- Property name: mapreduce.security.allowed-users
+- Property type: list of usernames
+- Default: empty
+- Example: alice,bob
\ No newline at end of file
diff --git a/hadoop-project/src/site/site.xml b/hadoop-project/src/site/site.xml
index 6cc69a082679a..094b4aee141cd 100644
--- a/hadoop-project/src/site/site.xml
+++ b/hadoop-project/src/site/site.xml
@@ -116,6 +116,7 @@
+

From 40457a3359ccdd39ade9412eed077d166c960efa Mon Sep 17 00:00:00 2001
From: K0K0V0K
Date: Sat, 22 Nov 2025 18:09:25 +0100
Subject: [PATCH 2/4] - add test to verify job config can not overwrite mapred-site.xml

---
 .../authorize/TestTaskLevelSecurityEnforcer.java | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/security/authorize/TestTaskLevelSecurityEnforcer.java b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/security/authorize/TestTaskLevelSecurityEnforcer.java
index 675ebe81d8589..400f56b3a608e 100644
--- a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/security/authorize/TestTaskLevelSecurityEnforcer.java
+++ b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/security/authorize/TestTaskLevelSecurityEnforcer.java
@@ -110,6 +110,22 @@ public void testTurnOff() {
     assertPass(conf);
   }
 
+  @Test
+  public void testJobConfigCanNotOverwriteMapreduceConfig() {
+    JobConf mapreduceConf = new JobConf();
+    mapreduceConf.setBoolean(MRConfig.SECURITY_ENABLED, true);
+    mapreduceConf.setStrings(MRConfig.SECURITY_DENIED_TASKS, "org.apache.hadoop.streaming");
+    mapreduceConf.setStrings(MRConfig.SECURITY_ALLOWED_USERS, "alice");
+
+    JobConf jobConf = new JobConf();
+    jobConf.setStrings(MRConfig.SECURITY_ALLOWED_USERS, "bob");
+    jobConf.set(MRJobConfig.MAP_CLASS_ATTR, "org.apache.hadoop.streaming.PipeMapper");
+    jobConf.set(MRJobConfig.USER_NAME, "bob");
+
+    mapreduceConf.addResource(jobConf);
+    assertDenied(mapreduceConf);
+  }
+
   private void assertPass(JobConf conf) {
     assertDoesNotThrow(
         () -> TaskLevelSecurityEnforcer.validate(conf),

From 09ca94bee28f52378a2679bdbf35e2226c7bc7a5 Mon Sep 17 00:00:00 2001
From: K0K0V0K
Date: Sat, 22 Nov 2025 18:58:05 +0100
Subject: [PATCH 3/4] - fix style

---
 .../app/security/authorize/TaskLevelSecurityEnforcer.java | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/security/authorize/TaskLevelSecurityEnforcer.java b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/security/authorize/TaskLevelSecurityEnforcer.java
index 0d469b05241ad..37d9284242521 100644
--- a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/security/authorize/TaskLevelSecurityEnforcer.java
+++ b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/security/authorize/TaskLevelSecurityEnforcer.java
@@ -43,6 +43,12 @@
 public class TaskLevelSecurityEnforcer {
   private static final Logger LOG = LoggerFactory.getLogger(TaskLevelSecurityEnforcer.class);
 
+  /**
+   * Default constructor
+   */
+  private TaskLevelSecurityEnforcer() {
+  }
+
   /**
    * Validates a MapReduce job's configuration against the cluster's task-level
    * security policy.
@@ -53,7 +59,7 @@ public class TaskLevelSecurityEnforcer {
    *   <li>Allow the job immediately if the user is on the configured allowed-users list.</li>
    *   <li>Retrieve the security property domain (list of job configuration keys to inspect).</li>
    *   <li>Retrieve the list of denied task class prefixes.</li>
-   *
  • For each property in the domain, check whether its value begins with any denied prefix.
  • + *
  • For each domain property, check whether its value begins with any denied prefix.
  • *
  • If a match is found, reject the job by throwing {@link TaskLevelSecurityException}.
  • * * From b7172f1a78243b1ae28248dc03520108be4d08fa Mon Sep 17 00:00:00 2001 From: K0K0V0K Date: Sat, 22 Nov 2025 21:04:03 +0100 Subject: [PATCH 4/4] - fix style 2 --- .../v2/app/security/authorize/TaskLevelSecurityEnforcer.java | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/security/authorize/TaskLevelSecurityEnforcer.java b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/security/authorize/TaskLevelSecurityEnforcer.java index 37d9284242521..887bffb40f372 100644 --- a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/security/authorize/TaskLevelSecurityEnforcer.java +++ b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/security/authorize/TaskLevelSecurityEnforcer.java @@ -40,11 +40,11 @@ *

    This prevents unauthorized or unsafe custom code from running inside * cluster containers.

    */ -public class TaskLevelSecurityEnforcer { +public final class TaskLevelSecurityEnforcer { private static final Logger LOG = LoggerFactory.getLogger(TaskLevelSecurityEnforcer.class); /** - * Default constructor + * Default constructor. */ private TaskLevelSecurityEnforcer() { }
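
The check order described in the `validate` Javadoc (allowed users first, then a prefix match over the property domain) can be sketched in plain Java with no Hadoop dependencies. This is only an illustration under stated assumptions: `DeniedPrefixSketch`, `firstViolation`, and all of its parameters are hypothetical names, the job configuration is modeled as a `Map`, and the sketch returns the offending key where the patch's enforcer throws `TaskLevelSecurityException`.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for illustration; not the Hadoop TaskLevelSecurityEnforcer. */
final class DeniedPrefixSketch {

  private DeniedPrefixSketch() {
  }

  /**
   * Returns the first property-domain key whose value starts with a denied
   * prefix, or null when the job would be allowed to run.
   */
  static String firstViolation(
      boolean enabled,
      String user,
      Set<String> allowedUsers,
      List<String> propertyDomain,
      List<String> deniedPrefixes,
      Map<String, String> jobConf) {
    // Nothing is enforced when the feature is switched off.
    if (!enabled) {
      return null;
    }
    // Users on the allowed list bypass the denied-task check.
    if (allowedUsers.contains(user)) {
      return null;
    }
    for (String key : propertyDomain) {
      String value = jobConf.get(key);
      if (value == null) {
        continue; // the job does not set this property
      }
      for (String prefix : deniedPrefixes) {
        if (value.startsWith(prefix)) {
          return key; // the real enforcer would reject the job here
        }
      }
    }
    return null;
  }

  public static void main(String[] args) {
    Map<String, String> job = Map.of(
        "mapreduce.job.map.class", "org.apache.hadoop.streaming.PipeMapper");
    List<String> domain = List.of("mapreduce.job.map.class", "mapreduce.job.reduce.class");
    List<String> denied = List.of("org.apache.hadoop.streaming");
    Set<String> allowed = Set.of("alice");

    System.out.println("bob:   " + firstViolation(true, "bob", allowed, domain, denied, job));
    System.out.println("alice: " + firstViolation(true, "alice", allowed, domain, denied, job));
  }
}
```

Because the match in this sketch is a plain `String.startsWith`, a denied entry such as `org.apache.hadoop.streaming` would also match a class in a sibling package whose name merely begins with the same characters. Whether the enforcer in this patch guards against that case is not visible in the hunks shown here.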