NIFI-8258: Add support for Service Principal authentication in ADLS p… by turcsanyip · Pull Request #4843 · apache/nifi

turcsanyip · 2021-02-25T11:14:15Z

…rocessors

Also fixed EL handling in ADLSCredentialsControllerService.

https://issues.apache.org/jira/browse/NIFI-8258

Thank you for submitting a contribution to Apache NiFi.

Please provide a short description of the PR here:

Description of PR

Enables X functionality; fixes bug NIFI-YYYY.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

For all changes:

Is there a JIRA ticket associated with this PR? Is it referenced
in the commit message?
Does your PR title start with NIFI-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
Has your PR been rebased against the latest commit within the target branch (typically main)?
Is your initial contribution a single, squashed commit? Additional commits in response to PR reviewer feedback should be made on this branch and pushed to allow change tracking. Do not squash or use --force when pushing to allow for clean monitoring of changes.

For code changes:

Have you ensured that the full suite of tests is executed via mvn -Pcontrib-check clean install at the root nifi folder?
Have you written or updated unit tests to verify your changes?
Have you verified that the full build is successful on JDK 8?
Have you verified that the full build is successful on JDK 11?
If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
If applicable, have you updated the LICENSE file, including the main LICENSE file under nifi-assembly?
If applicable, have you updated the NOTICE file, including the main NOTICE file found under nifi-assembly?
If adding new Properties, have you added .displayName in addition to .name (programmatic access) for each of the new properties?

For documentation related changes:

Have you ensured that format looks appropriate for the output in which it is rendered?

Note:

Please ensure that once the PR is submitted, you check GitHub Actions CI for build issues and submit an update to your PR as soon as possible.

…rocessors Also fixed EL handling in ADLSCredentialsControllerService.

exceptionfactory · 2021-02-25T16:00:43Z

...rs/src/main/java/org/apache/nifi/processors/azure/AbstractAzureDataLakeStorageProcessor.java

+                    .endpoint(endpoint)
+                    .credential(credential)
+                    .buildClient();
+        } else if (servicePrincipalTenantId != null && servicePrincipalClientId != null && servicePrincipalClientCertificatePath != null && servicePrincipalClientCertificatePassword != null) {


Using StringUtils.isNoneBlank() would make this a bit more concise:

Suggested change

} else if (servicePrincipalTenantId != null && servicePrincipalClientId != null && servicePrincipalClientCertificatePath != null && servicePrincipalClientCertificatePassword != null) {

} else if (StringUtils.isNoneBlank(servicePrincipalTenantId, servicePrincipalClientId, servicePrincipalClientCertificatePath, servicePrincipalClientCertificatePassword)) {

exceptionfactory · 2021-02-25T16:08:16Z

...s/src/main/java/org/apache/nifi/services/azure/storage/ADLSCredentialsControllerService.java

+            .sensitive(true)
+            .required(false)
+            .addValidator(StandardValidators.NON_BLANK_VALIDATOR)
+            .expressionLanguageSupported(ExpressionLanguageScope.FLOWFILE_ATTRIBUTES)


If this property is considered sensitive, should it support expression language? Retrieving a sensitive property from flow file attributes would expose the value in provenance events. This same question applies to the other properties marked as sensitive.

That is definitely an issue, but there is an important use case for it -- when you need to shard the data across a lot of storage accounts for performance reasons or you're using storage accounts that may only be known at runtime. When I've seen this, though, the volatile provenance repository was being used so the surface area was small.

It is called out in the property documentation but maybe it could be louder ("certain risks" doesn't sound very scary) or include a more detailed explanation in additional details. What's there now:

There are certain risks in allowing the account name to be stored as a flowfile attribute. While it does provide for a more flexible flow by allowing the account name to be fetched dynamically from a flowfile attribute, care must be taken to restrict access to the event provenance data (e.g. by strictly controlling the policies governing provenance for this Processor). In addition, the provenance repositories may be put on encrypted disk partitions.

@jfrazee Thanks for providing that background reference to the existing documentation. Understanding that these new properties fall in the same category of concerns, is it possible to implement the use cases described using Parameter Contexts? It seems like that would work for retrieving the account information at runtime, but it could make flows a bit more complicated when it is necessary to shard data across storage accounts. With the goal of moving in a more secure direction, would it be better to avoid introducing new properties supporting expression language here to encourage moving to Parameter Contexts?

I don't think the parameter contexts gets at the original desire since you'd have to know in advance which and how many parameters to create, but maybe since it's a new new feature, there's no harm in not allowing EL. We can change our mind later without creating any upgrade issues. The opposite isn't true if we try to remove it later.

The "right" thing to do here could be to say that the use case should use managed identities, or maybe SAS tokens.

Question: how much pain do you think would be induced by rolling this back on Access Keys?

The concept of supporting EL for the sensitive Account Name / Key came from the Blob credential service (the Blob counterpart of this CS) and from the use case described by @jfrazee. I followed this "pattern" with the new Service Principal properties.

However, there was a bug in this ADLS controller service and ELs were not evaluated at all (this PR would fix it). So at the moment nobody can use the ADLS service with EL and therefore removing EL support for Account Name / Key would not arise backward compatibility questions here.

I think we can safely go ahead with removing EL support from all (old and new) sensitive properties if that is preferred. We can add it later if needed.
Implementing a lookup service (like AzureStorageCredentialsControllerServiceLookup) is also an option. It can support multiple credentials within the same flow (though it cannot be so dynamic as EL).

Thanks for pointing that out @turcsanyip. I am in favor of removing EL support since it is not currently functional.

So it sounds like the only hard decision is whether to remove EL support from the SAS Token property? If created properly they provide fairly fine-grained security so the risk is significantly less than with Access Keys or SPs.

That said, maybe the user should have to work harder to do it. I think a CS could do this though since the client is created on every onTrigger()?

@jfrazee EL was dysfunctional at all in this CS so SAS Token with EL did not work either (sorry, I mentioned only Account Name / Key because usually I use those).
ADLS processors were released in 1.12 and I think they are not widely adapted yet. I guess this is the reason why the EL problem has not been noticed so far.

Ok. So seems consensus is to not support EL? And we can entertain an alternate CS, lookup service, or change if there's specific demand?

Yes, I believe this is the consensus and I'll remove EL support.

exceptionfactory · 2021-02-25T16:12:32Z

...s/src/main/java/org/apache/nifi/services/azure/storage/ADLSCredentialsControllerService.java

+        boolean servicePrincipalClientSecretSet = StringUtils.isNotBlank(validationContext.getProperty(SERVICE_PRINCIPAL_CLIENT_SECRET).getValue());
+        boolean servicePrincipalClientCertificateSet = validationContext.getProperty(SERVICE_PRINCIPAL_CLIENT_CERTIFICATE).isSet();
+
+        boolean servicePrincipalSet = servicePrincipalTenantIdSet || servicePrincipalClientIdSet || servicePrincipalClientSecretSet || servicePrincipalClientCertificateSet;


Instead of using the presence of any of these properties to imply Service Principal Authentication, what about introducing one more property named something like Authentication Type? The value of that property could take one of an enumerated list of values. With that property in place, the remaining Service Principal properties could use the dependsOn feature of Property Descriptors. This would provide a cleaner user experience and should also make the determination of authentication type easier to follow.

Using Authentication Type with dependsOn would be great here. I just don't know how we could initialize this new field in existing flows. We cannot use a default value there. If Account Key property is filled in, then the Authentication Type should be initialized as "Account Key", if SAS Token is filled in, then "SAS Token", etc.
Managed Identity is more problematic because if Authentication Type is "Managed Identity", then no more property needed. The existing Use Managed Identity property with true/false would be redundant / inconsistent but property deletion is not really possible.

That makes sense, what about having a default value of AUTO that would preserve existing behavior, and having other values that would make the Authentication Type explicit?

I like this, but for an existing flow the UX will be a bit weird at first because all the fields will initially be hidden I think.

I was trying to think through whether there's a way to repurpose "Use Managed Identity" for this without being excessively dirty? I think there is. It's possible to add allowable values without breaking anything, display name can of course change, if you change to AllowableValue the existing values will look natural in the UI, and everything except the name() value in the code might look pretty much like this is how it originally was done.

Can we get past having this property still being identified as "storage-use-managed-identity"?

"the UX will be a bit weird at first because all the fields will initially be hidden" : I don't think so because the default AUTO would mean all properties displayed
if I understand @exceptionfactory 's idea correctly

Regarding repurposing "Use Managed Identity": we could rename the property on the UI and also the existing true to Managed Identity but false covers Account Key and SAS Token and we don;t know which one, so it is not clear for me how to handle it

As @turcsanyip described, the default value of a new Authentication Type property would be AUTO, indicating that currently visible properties would be displayed and the existing logic to infer the desired authentication type would be followed. Selecting a more specific Authentication Type value would hide properties that are not applicable using the depend on feature.

That makes sense. I think it'll be easier to see how it feels with it in action.

We don't have an answer re: the redundancy with the value of Auth Type = Managed Identity and Use Managed Identity = t/f, right? Is the idea for them to co-exist and then validation ensures it's consistent?

My idea would be:

in case of Auth Type = AUTO: all props seen, the current validation running (eg. Account Key cannot be filled in and Use Managed Identity cannot be true at the same time)

in case of Auth Type = Account Key: only Account Key prop seen, only this property validated (must be filled in), the other invisible properties can have any values (it would be weird to show warnings for those properties in this case)

in case of Auth Type = Managed Identity: no other property can be seen, neither Use Managed Identity, no further validation needed, this Auth Type simply turns on Managed Identity authentication regardless of the value of the invisible Use Managed Identity property

With these rules, the visible properties are always consistent. The invisible ones may not be but that seems to be acceptable for me.

What do you think of it?

@turcsanyip That sounds like a good approach.

exceptionfactory · 2021-02-25T16:17:12Z

...s/src/main/java/org/apache/nifi/services/azure/storage/ADLSCredentialsControllerService.java

+            .displayName("Service Principal Client Certificate")
+            .description("SSL Context Service referencing the keystore with the client certificate of the Client/Application. Only PKCS12 (.pfx) keystore type is supported. " +
+                    "The keystore must contain a single key and the password of the keystore and the key must be the same.")
+            .identifiesControllerService(SSLContextService.class)


The property name is somewhat confusing in light of this property referencing the SSLContextService. Given that SSLContextService is not really used to provide an SSLContext object, what about changing this property to be just the file path? That would also avoid the need for introducing the additional dependency on nifi-ssl-context-service-api. In that case, the File Path Validator could be used to ensure that the file exists.

I believe we use SSLContextService just to provide the bare keystore attributes (like path and password) at other places in the code too.
I would prefer to reuse the existing way of configuring a keystore via a CS. Otherwise 2 property would be needed that already exist in SSLContextService.

It is understandable that SSLContextService is used in other components where both key store and trust store are necessary. In this case, however, only the key store and associated password are necessary, so using SSLContextService seems to imply that more properties are necessary to make this work. Also given that only a PKCS12 key store is supported, it seems better to have the two specific properties for key store and key store password. In addition to the unused trust store properties, the SSLContextService also has the TLS Protocol property, which is would not apply to this service.

exceptionfactory · 2021-02-25T16:18:33Z

...s/src/main/java/org/apache/nifi/services/azure/storage/ADLSCredentialsControllerService.java

+            .sensitive(true)
+            .required(false)
+            .addValidator(StandardValidators.NON_BLANK_VALIDATOR)
+            .expressionLanguageSupported(ExpressionLanguageScope.FLOWFILE_ATTRIBUTES)


Based on the approach of some other components, it seems best to avoid supporting expression language for password properties. This can still be parameterized using parameter contexts, which preserves the ability to use sensitive property encryption, and also supports passwords that may look like expression language strings.

exceptionfactory · 2021-02-25T16:21:37Z

...s/src/main/java/org/apache/nifi/services/azure/storage/ADLSCredentialsControllerService.java

+    public static final PropertyDescriptor SERVICE_PRINCIPAL_CLIENT_CERTIFICATE = new PropertyDescriptor.Builder()
+            .name("service-principal-client-certificate")
+            .displayName("Service Principal Client Certificate")
+            .description("SSL Context Service referencing the keystore with the client certificate of the Client/Application. Only PKCS12 (.pfx) keystore type is supported. " +


In light of the description and the Azure SDK allowing only PKCS12 certificates, it would be helpful to add a check in the customValidate method to ensure that the file provided is actually a PKCS12. Leveraging the Azure ClientCertificateCredentialBuilder to load the certificate with the password provided would be a good way to ensure that the configured properties meet the requirements described.

NIFI-8258: Removed Service Principal Client Certificate authentication

turcsanyip · 2021-02-28T22:10:08Z

I removed the EL support from the sensitive properties.

I also removed SSLContextService and implemented the keystore handling locally but in the end I felt it is something that should rather go into a separate KeyStoreService CS. To leave this question open, I removed the Client Certificate way from this PR and I would handle it in a follow-up ticket.
You can check the implementation here: https://github.com/turcsanyip/nifi/compare/NIFI-8258...turcsanyip:NIFI-8258_Client_Certificate?expand=1
I can open another PR from it or we can implement KeyStoreService CS.

Regarding Authentication Type: I ran into a bug with dependsOn(). Properties depending on the default value of another property (like AUTO in our case) do not show up on the UI initially (when I just created the CS and open the properties tab).
I filed a Jira: https://issues.apache.org/jira/browse/NIFI-8270
I would not wait for this to be fixed and would also implement this Authentication Type in a follow-up ticket.

@exceptionfactory, @jfrazee Would it be fine with you to reduce the scope of this PR to Client Secret only? Service Principal with Client Secret is a feature alone. It can be extended with the Client Certificate option and Authentication Type convenience property later.

exceptionfactory · 2021-03-01T14:19:50Z

I removed the EL support from the sensitive properties.

I also removed SSLContextService and implemented the keystore handling locally but in the end I felt it is something that should rather go into a separate KeyStoreService CS. To leave this question open, I removed the Client Certificate way from this PR and I would handle it in a follow-up ticket.
You can check the implementation here: https://github.com/turcsanyip/nifi/compare/NIFI-8258...turcsanyip:NIFI-8258_Client_Certificate?expand=1
I can open another PR from it or we can implement KeyStoreService CS.

Regarding Authentication Type: I ran into a bug with dependsOn(). Properties depending on the default value of another property (like AUTO in our case) do not show up on the UI initially (when I just created the CS and open the properties tab).
I filed a Jira: https://issues.apache.org/jira/browse/NIFI-8270
I would not wait for this to be fixed and would also implement this Authentication Type in a follow-up ticket.

@exceptionfactory, @jfrazee Would it be fine with you to reduce the scope of this PR to Client Secret only? Service Principal with Client Secret is a feature alone. It can be extended with the Client Certificate option and Authentication Type convenience property later.

@turcsanyip Reducing the scope of this PR to just the Service Principal with Client Secret sounds good. Based on the issue described in NIFI-8270, waiting on resolution there to implement the Authentication Type property and Client Certificate properties seems like a good way forward.

exceptionfactory

The updates and current implementation to support Service Principal authentication with Client Secret, and introducing Variable Registry EL support for Endpoint Suffix look good. Any additional comments @jfrazee?

jfrazee

+1

I think waiting and sorting out the dependsOn and SSL context separately makes sense. I have a few minor comments but nothing I'd call blocking -- can do in the follow ups maybe.

I ran manual tests for service principals, access keys, and SAS tokens and ran the ITs.

@turcsanyip Were you planning to push any additional changes?

jfrazee · 2021-03-01T19:05:58Z

...s/src/main/java/org/apache/nifi/services/azure/storage/ADLSCredentialsControllerService.java

            results.add(new ValidationResult.Builder().subject(this.getClass().getSimpleName())
                .valid(false)
-                .explanation("one and only one of [" + options + "] should be set")
+                .explanation("one and only one authentication method of [Account Key, SAS Token, Managed Identity, Service Principal] should be used")


Would it make sense to use displayName() here?

Suggested change

.explanation("one and only one authentication method of [Account Key, SAS Token, Managed Identity, Service Principal] should be used")

.explanation(String.format("one and only one authentication method of [%s, %s, %s, Service Principal] should be used",

ACCOUNT_KEY.displayName(), SAS_TOKEN.displayName(), USE_MANAGED_IDENTITY.displayName()))

Account Key and SAS Token would be fine but I would simply use Managed Identity (without the "Use" prefix from the displayname).
I think it is something we should rather fix when the new Authentication Type property is added (which will have the same AllowableValue-s that would be needed here too).

jfrazee · 2021-03-01T19:06:47Z

...s/src/main/java/org/apache/nifi/services/azure/storage/ADLSCredentialsControllerService.java

+                .explanation("one and only one authentication method of [Account Key, SAS Token, Managed Identity, Service Principal] should be used")
                .build());
+        } else if (servicePrincipalSet) {
+            String template = "'%s' must be set when Service Principal authentication is being configured";


Suggested change

String template = "'%s' must be set when Service Principal authentication is being configured";

final String template = "'%s' must be set when Service Principal authentication is being configured";

Just for this, I would not add a new commit. Will fix it in the follow-up jira (where customValidate() will be modified).

turcsanyip · 2021-03-01T22:51:40Z

Follow-up jiras:

Client Certificate: https://issues.apache.org/jira/browse/NIFI-8277
Authentication Type: https://issues.apache.org/jira/browse/NIFI-8278

exceptionfactory · 2021-03-02T04:30:30Z

Thanks for your work on this @turcsanyip and for the additional issues referenced. Thanks for your input as well @jfrazee. +1 Merging.

…rocessors - Removed Expression Language support indicators from sensitive properties This closes apache#4843 Signed-off-by: David Handermann <exceptionfactory@apache.org>

NIFI-8258: Add support for Service Principal authentication in ADLS p…

a494ad1

…rocessors Also fixed EL handling in ADLSCredentialsControllerService.

exceptionfactory reviewed Feb 25, 2021

View reviewed changes

NIFI-8258: Removed EL support from sensitive properties

2ff2de3

NIFI-8258: Removed Service Principal Client Certificate authentication

exceptionfactory approved these changes Mar 1, 2021

View reviewed changes

jfrazee approved these changes Mar 1, 2021

View reviewed changes

asfgit closed this in 7876f41 Mar 2, 2021

exceptionfactory mentioned this pull request Mar 9, 2021

NIFI-8279: Adds secure controller service using keyvault secrets and tests #4863

Closed

13 tasks

jfrazee mentioned this pull request Jun 3, 2021

NIFI-8656: Support expression language for SAS Token in the ADLS Gen2… #5119

Closed

13 tasks

	} else if (servicePrincipalTenantId != null && servicePrincipalClientId != null && servicePrincipalClientCertificatePath != null && servicePrincipalClientCertificatePassword != null) {
	} else if (StringUtils.isNoneBlank(servicePrincipalTenantId, servicePrincipalClientId, servicePrincipalClientCertificatePath, servicePrincipalClientCertificatePassword)) {

	.explanation("one and only one authentication method of [Account Key, SAS Token, Managed Identity, Service Principal] should be used")
	.explanation(String.format("one and only one authentication method of [%s, %s, %s, Service Principal] should be used",
	ACCOUNT_KEY.displayName(), SAS_TOKEN.displayName(), USE_MANAGED_IDENTITY.displayName()))

	String template = "'%s' must be set when Service Principal authentication is being configured";
	final String template = "'%s' must be set when Service Principal authentication is being configured";

Conversation

turcsanyip commented Feb 25, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description of PR

For all changes:

For code changes:

For documentation related changes:

Note:

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jfrazee Feb 26, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

turcsanyip commented Feb 28, 2021

Uh oh!

exceptionfactory commented Mar 1, 2021

Uh oh!

exceptionfactory left a comment

Choose a reason for hiding this comment

Uh oh!

jfrazee left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

turcsanyip commented Mar 1, 2021

turcsanyip commented Feb 25, 2021 •

edited

Loading

jfrazee Feb 26, 2021 •

edited

Loading

exceptionfactory commented Mar 2, 2021 •

edited

Loading