Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KNOX-2983 - Combine the functionality of different identity assertion providers #817

Merged
merged 4 commits into from
Nov 16, 2023

Conversation

zeroflag
Copy link
Contributor

@zeroflag zeroflag commented Nov 8, 2023

What changes were proposed in this pull request?

By incorporating the language which was introduced for the virtual group mapping, identity providers now support a flexible way to combine the functionality of different kind of principal mappings (e.g.: RegEx, Concat, Switch Case, ..).

How it works

The indentity-assertion provider supports a expression.principal.mapping config parameter that can contain an arbitrary expression. This expression is expected to return a String. This string will be the new, mapped principal.

For example, every user will be mapped to jozsi

<param>
   <name>expression.principal.mapping</name>
   <value>'jozsi'</value>
</param>

Function calls can be used to make transformations on the original principal

<param>
   <name>expression.principal.mapping</name>
   <value>(uppercase username)</value>
</param>

This will convert all users to their uppercase variant.

<param>
   <name>expression.principal.mapping</name>
   <value>(concat 'prefix_' username)</value>
</param>

This will prepend prefix_ in front of the username.

The combination of both:

<param>
   <name>expression.principal.mapping</name>
   <value>(uppercase (concat 'prefix_' username))</value>
</param>

E.g.: admin will become PREFIX_ADMIN

The functionality of the Regex identity assertion provider is exposed via a regex-template function.

E.g.:

(regex-template 'prefix_user-1_suffix' 'prefix_(\w+)\-(\d)_suffix' '{1}.{2}')

This will map prefix_user-1_suffix to user.1

Few other functions were added, like index-of, strlen, substr, starts-with, ends-with.

A new special form was also introduced to support if and if else.

How was this patch tested?

Constant user:

<provider>
         <role>identity-assertion</role>
         <name>Default</name>
         <enabled>true</enabled>
         <param>
              <name>expression.principal.mapping</name>
              <value>'jozsi'</value>
         </param>
</provider>
$ curl -u admin:admin-password -k https://localhost:8443/gateway/sandbox/hive
$ tail -n1 logs/gateway-audit.log 
23/11/09 11:29:25 ||ba8f6eb6-aa40-4a06-8677-c6c0b1195208|audit|[0:0:0:0:0:0:0:1]|HIVE|admin|jozsi||access|uri|/gateway/sandbox/hive|failure|

Invalid result:

(strlen username)
2023-11-09 11:30:57,089 3d0c2c25-9b90-4469-bf95-11d7e7473011 WARN  knox.gateway (CommonIdentityAssertionFilter.java:evalAdvancedPrincipalMapping(270)) - Invalid result: 5. Expected String when evaluating mapping: [strlen, username] for user: admin

This mapping is invalid because it returned an integer, instead of a string.

Concat

(concat 'prefix_' username '_suffix')
23/11/09 11:34:58 ||a7097a1f-07a8-4199-8bd5-44d2b1466d26|audit|[0:0:0:0:0:0:0:1]|HIVE|admin|prefix_admin_suffix||access|uri|/gateway/sandbox/hive|failure|

Uppercase

(uppercase username)
23/11/09 11:36:31 ||3f65cf9b-b2b7-44ad-a897-e083dc8e0386|audit|[0:0:0:0:0:0:0:1]|HIVE|admin|ADMIN||access|uri|/gateway/sandbox/hive|failure|

If

(if (= (strlen username) 3)
    (concat username '_suffix')
    (substr username 0 3))
$ curl -u sam:sam-password -k https://localhost:8443/gateway/sandbox/hive
23/11/09 11:44:09 ||a33cffaf-909e-478a-8355-bc1245b1705c|audit|[0:0:0:0:0:0:0:1]|HIVE|sam|sam_suffix||access|uri|/gateway/sandbox/hive|failure|
$ curl -u admin:admin-password -k https://localhost:8443/gateway/sandbox/hive
23/11/09 11:43:04 ||afbea750-8140-4ff6-97c9-325d8774c2f0|audit|[0:0:0:0:0:0:0:1]|HIVE|admin|adm||access|uri|/gateway/sandbox/hive|failure|

Capitalize first character

(concat
    (uppercase (substr username 0 1))
    (lowercase (substr username 1)))
23/11/09 12:32:42 ||e9205cae-0678-4005-8716-b976d64051c3|audit|[0:0:0:0:0:0:0:1]|HIVE|admin|Admin||access|uri|/gateway/sandbox/hive|failure|

Cut first and last characters

(substr username 1 (- (strlen username) 1))                      
23/11/09 12:54:39 ||150521f9-8515-46ca-8b8a-dbb0a321b54a|audit|[0:0:0:0:0:0:0:1]|HIVE|admin|dmi||access|uri|/gateway/sandbox/hive|failure|

Regex Tempalte

This works the same way as the Regex Identity Assertion Provider

(regex-template  username 'ad(\w+)' 'ad{[1]}' (hash 'min' 'max') true)
23/11/09 14:10:41 ||fce5098e-108a-46b7-952c-c7ac63f51d79|audit|[0:0:0:0:0:0:0:1]|HIVE|admin|admax||access|uri|/gateway/sandbox/hive|success|Response status: 301

Map only one specific user

(if (= username 'sam') 'SAM')

admin stays the same:

23/11/09 14:15:36 ||e972d7ee-0ac8-4aa8-a6ae-6cd3e498dd89|audit|[0:0:0:0:0:0:0:1]|HIVE|admin|||access|uri|/gateway/sandbox/hive|success|Response status: 301

sam is mapped to SAM

23/11/09 14:16:36 ||7b76cf61-ca2f-40dc-abcf-8efef55d84fa|audit|[0:0:0:0:0:0:0:1]|HIVE|sam|SAM||access|uri|/gateway/sandbox/hive|success|Response status: 301

Find a substring and remove everything after the substring

(if (contains 'dm' username)
    (substr username (+ (strlen 'dm') 
                        (index-of 'dm' username))))

admin becomes in

23/11/09 14:21:20 ||8ef3a0c6-3336-4c3b-8f36-28a99d636ada|audit|[0:0:0:0:0:0:0:1]|HIVE|admin|in||access|uri|/gateway/sandbox/hive|success|Response status: 301

Hash lookup

(at username (hash 
                   'admin' 'Administrator' 
                   'tom'   'Thomas' 
                   'sam'   'Samuel'))
23/11/09 14:45:29 ||47caf063-abb1-40fa-a0cd-d2e985291912|audit|[0:0:0:0:0:0:0:1]|HIVE|admin|Administrator||access|uri|/gateway/sandbox/hive|failure|
23/11/09 14:45:59 ||fb5e4a45-4780-473a-9f6b-35a8a76cadec|audit|[0:0:0:0:0:0:0:1]|HIVE|tom|Thomas||access|uri|/gateway/sandbox/hive|failure|

@zeroflag zeroflag marked this pull request as draft November 8, 2023 17:00
@zeroflag zeroflag self-assigned this Nov 8, 2023
@zeroflag zeroflag force-pushed the KNOX-2983-comb branch 2 times, most recently from 0354758 to 07a0ad4 Compare November 8, 2023 18:25
@zeroflag zeroflag marked this pull request as ready for review November 9, 2023 15:03
Copy link
Contributor

@smolnar82 smolnar82 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work, @zeroflag ! Ship it!

Copy link
Contributor

@lmccay lmccay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zeroflag - this looks amazing! I have a couple questions/nits:

  1. Can we think of a more descriptive prefix than 'advanced'? This is only advanced until your next great innovation. :) Maybe 'expression.principal.mapping'? Something like that?
  2. One of the most common issues that this addresses is the fact that our HadoopGroupProvider is mutually exclusive with other identity assertion providers. Have you tested explicitly with that provider?
  3. Management tooling like Cloudera Manager and Ambari will need to be able to support this via saftey valves and or properties syntax within the UIs. Do we know that this will work without having to do some funky encoding, etc?

@zeroflag
Copy link
Contributor Author

@lmccay

  1. Thanks, I renamed it to expression.principal.mapping.

  2. The HadoopGroupProvider extends from CommonIdentityProvider, so it inherits all this functionality. I verified it with this config:

       <provider>
            <role>identity-assertion</role>
            <name>HadoopGroupProvider</name>
            <enabled>true</enabled>

            <param>
                <name>expression.principal.mapping</name>
                <value>(concat username '_SUFFIX')</value>
            </param>
            <param>
                <name>group.mapping.vgrp1</name>
                <value>(starts-with username 'sam')</value>
            </param>

            <param>
                <name>hadoop.security.group.mapping</name>
                <value>org.apache.hadoop.security.LdapGroupsMapping</value>
            </param>
            <param>
                <name>hadoop.security.group.mapping.ldap.url</name>
                <value>ldap://localhost:33389</value>
            </param>
            [....]
      </provider>
$ curl -v -k -u sam:sam-password https://localhost:8443/gateway/sandbox/hive
23/11/16 16:02:14 ||4887dab7-ffa5-4d94-9343-9eedc140fd73|audit|[0:0:0:0:0:0:0:1]|HIVE|sam|sam_SUFFIX||identity-mapping|principal|sam_SUFFIX|success|Groups: [vgrp1]
  1. The parentheses won't cause any encoding problem. The only thing I can think of as a problem is the greater than / less than signs. But those are rarely used and they can be encoded.

For example:

(> (strlen username) 10)

Should be encoded as:

(&gt; (strlen username) 10)

Copy link
Contributor

@lmccay lmccay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ship it - I love this!

@zeroflag zeroflag merged commit 083dc89 into apache:master Nov 16, 2023
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants