NIFI-8258: Add support for Service Principal authentication in ADLS p…#4843
NIFI-8258: Add support for Service Principal authentication in ADLS p…#4843turcsanyip wants to merge 2 commits intoapache:mainfrom
Conversation
…rocessors Also fixed EL handling in ADLSCredentialsControllerService.
| .endpoint(endpoint) | ||
| .credential(credential) | ||
| .buildClient(); | ||
| } else if (servicePrincipalTenantId != null && servicePrincipalClientId != null && servicePrincipalClientCertificatePath != null && servicePrincipalClientCertificatePassword != null) { |
There was a problem hiding this comment.
Using StringUtils.isNoneBlank() would make this a bit more concise:
| } else if (servicePrincipalTenantId != null && servicePrincipalClientId != null && servicePrincipalClientCertificatePath != null && servicePrincipalClientCertificatePassword != null) { | |
| } else if (StringUtils.isNoneBlank(servicePrincipalTenantId, servicePrincipalClientId, servicePrincipalClientCertificatePath, servicePrincipalClientCertificatePassword)) { |
| .sensitive(true) | ||
| .required(false) | ||
| .addValidator(StandardValidators.NON_BLANK_VALIDATOR) | ||
| .expressionLanguageSupported(ExpressionLanguageScope.FLOWFILE_ATTRIBUTES) |
There was a problem hiding this comment.
If this property is considered sensitive, should it support expression language? Retrieving a sensitive property from flow file attributes would expose the value in provenance events. This same question applies to the other properties marked as sensitive.
There was a problem hiding this comment.
That is definitely an issue, but there is an important use case for it -- when you need to shard the data across a lot of storage accounts for performance reasons or you're using storage accounts that may only be known at runtime. When I've seen this, though, the volatile provenance repository was being used so the surface area was small.
It is called out in the property documentation but maybe it could be louder ("certain risks" doesn't sound very scary) or include a more detailed explanation in additional details. What's there now:
There are certain risks in allowing the account name to be stored as a flowfile attribute. While it does provide for a more flexible flow by allowing the account name to be fetched dynamically from a flowfile attribute, care must be taken to restrict access to the event provenance data (e.g. by strictly controlling the policies governing provenance for this Processor). In addition, the provenance repositories may be put on encrypted disk partitions.
There was a problem hiding this comment.
@jfrazee Thanks for providing that background reference to the existing documentation. Understanding that these new properties fall in the same category of concerns, is it possible to implement the use cases described using Parameter Contexts? It seems like that would work for retrieving the account information at runtime, but it could make flows a bit more complicated when it is necessary to shard data across storage accounts. With the goal of moving in a more secure direction, would it be better to avoid introducing new properties supporting expression language here to encourage moving to Parameter Contexts?
There was a problem hiding this comment.
I don't think the parameter contexts gets at the original desire since you'd have to know in advance which and how many parameters to create, but maybe since it's a new new feature, there's no harm in not allowing EL. We can change our mind later without creating any upgrade issues. The opposite isn't true if we try to remove it later.
The "right" thing to do here could be to say that the use case should use managed identities, or maybe SAS tokens.
Question: how much pain do you think would be induced by rolling this back on Access Keys?
There was a problem hiding this comment.
The concept of supporting EL for the sensitive Account Name / Key came from the Blob credential service (the Blob counterpart of this CS) and from the use case described by @jfrazee. I followed this "pattern" with the new Service Principal properties.
However, there was a bug in this ADLS controller service and ELs were not evaluated at all (this PR would fix it). So at the moment nobody can use the ADLS service with EL and therefore removing EL support for Account Name / Key would not arise backward compatibility questions here.
I think we can safely go ahead with removing EL support from all (old and new) sensitive properties if that is preferred. We can add it later if needed.
Implementing a lookup service (like AzureStorageCredentialsControllerServiceLookup) is also an option. It can support multiple credentials within the same flow (though it cannot be so dynamic as EL).
There was a problem hiding this comment.
Thanks for pointing that out @turcsanyip. I am in favor of removing EL support since it is not currently functional.
There was a problem hiding this comment.
So it sounds like the only hard decision is whether to remove EL support from the SAS Token property? If created properly they provide fairly fine-grained security so the risk is significantly less than with Access Keys or SPs.
That said, maybe the user should have to work harder to do it. I think a CS could do this though since the client is created on every onTrigger()?
There was a problem hiding this comment.
@jfrazee EL was dysfunctional at all in this CS so SAS Token with EL did not work either (sorry, I mentioned only Account Name / Key because usually I use those).
ADLS processors were released in 1.12 and I think they are not widely adapted yet. I guess this is the reason why the EL problem has not been noticed so far.
There was a problem hiding this comment.
Ok. So seems consensus is to not support EL? And we can entertain an alternate CS, lookup service, or change if there's specific demand?
There was a problem hiding this comment.
Yes, I believe this is the consensus and I'll remove EL support.
| boolean servicePrincipalClientSecretSet = StringUtils.isNotBlank(validationContext.getProperty(SERVICE_PRINCIPAL_CLIENT_SECRET).getValue()); | ||
| boolean servicePrincipalClientCertificateSet = validationContext.getProperty(SERVICE_PRINCIPAL_CLIENT_CERTIFICATE).isSet(); | ||
|
|
||
| boolean servicePrincipalSet = servicePrincipalTenantIdSet || servicePrincipalClientIdSet || servicePrincipalClientSecretSet || servicePrincipalClientCertificateSet; |
There was a problem hiding this comment.
Instead of using the presence of any of these properties to imply Service Principal Authentication, what about introducing one more property named something like Authentication Type? The value of that property could take one of an enumerated list of values. With that property in place, the remaining Service Principal properties could use the dependsOn feature of Property Descriptors. This would provide a cleaner user experience and should also make the determination of authentication type easier to follow.
There was a problem hiding this comment.
Using Authentication Type with dependsOn would be great here. I just don't know how we could initialize this new field in existing flows. We cannot use a default value there. If Account Key property is filled in, then the Authentication Type should be initialized as "Account Key", if SAS Token is filled in, then "SAS Token", etc.
Managed Identity is more problematic because if Authentication Type is "Managed Identity", then no more property needed. The existing Use Managed Identity property with true/false would be redundant / inconsistent but property deletion is not really possible.
There was a problem hiding this comment.
That makes sense, what about having a default value of AUTO that would preserve existing behavior, and having other values that would make the Authentication Type explicit?
There was a problem hiding this comment.
I like this, but for an existing flow the UX will be a bit weird at first because all the fields will initially be hidden I think.
I was trying to think through whether there's a way to repurpose "Use Managed Identity" for this without being excessively dirty? I think there is. It's possible to add allowable values without breaking anything, display name can of course change, if you change to AllowableValue the existing values will look natural in the UI, and everything except the name() value in the code might look pretty much like this is how it originally was done.
Can we get past having this property still being identified as "storage-use-managed-identity"?
There was a problem hiding this comment.
"the UX will be a bit weird at first because all the fields will initially be hidden" : I don't think so because the default AUTO would mean all properties displayed
if I understand @exceptionfactory 's idea correctly
Regarding repurposing "Use Managed Identity": we could rename the property on the UI and also the existing true to Managed Identity but false covers Account Key and SAS Token and we don;t know which one, so it is not clear for me how to handle it
There was a problem hiding this comment.
As @turcsanyip described, the default value of a new Authentication Type property would be AUTO, indicating that currently visible properties would be displayed and the existing logic to infer the desired authentication type would be followed. Selecting a more specific Authentication Type value would hide properties that are not applicable using the depend on feature.
There was a problem hiding this comment.
That makes sense. I think it'll be easier to see how it feels with it in action.
We don't have an answer re: the redundancy with the value of Auth Type = Managed Identity and Use Managed Identity = t/f, right? Is the idea for them to co-exist and then validation ensures it's consistent?
There was a problem hiding this comment.
My idea would be:
- in case of
Auth Type = AUTO: all props seen, the current validation running (eg.Account Keycannot be filled in andUse Managed Identitycannot be true at the same time) - in case of
Auth Type = Account Key: onlyAccount Keyprop seen, only this property validated (must be filled in), the other invisible properties can have any values (it would be weird to show warnings for those properties in this case) - in case of
Auth Type = Managed Identity: no other property can be seen, neitherUse Managed Identity, no further validation needed, this Auth Type simply turns on Managed Identity authentication regardless of the value of the invisibleUse Managed Identityproperty
With these rules, the visible properties are always consistent. The invisible ones may not be but that seems to be acceptable for me.
What do you think of it?
| .displayName("Service Principal Client Certificate") | ||
| .description("SSL Context Service referencing the keystore with the client certificate of the Client/Application. Only PKCS12 (.pfx) keystore type is supported. " + | ||
| "The keystore must contain a single key and the password of the keystore and the key must be the same.") | ||
| .identifiesControllerService(SSLContextService.class) |
There was a problem hiding this comment.
The property name is somewhat confusing in light of this property referencing the SSLContextService. Given that SSLContextService is not really used to provide an SSLContext object, what about changing this property to be just the file path? That would also avoid the need for introducing the additional dependency on nifi-ssl-context-service-api. In that case, the File Path Validator could be used to ensure that the file exists.
There was a problem hiding this comment.
I believe we use SSLContextService just to provide the bare keystore attributes (like path and password) at other places in the code too.
I would prefer to reuse the existing way of configuring a keystore via a CS. Otherwise 2 property would be needed that already exist in SSLContextService.
There was a problem hiding this comment.
It is understandable that SSLContextService is used in other components where both key store and trust store are necessary. In this case, however, only the key store and associated password are necessary, so using SSLContextService seems to imply that more properties are necessary to make this work. Also given that only a PKCS12 key store is supported, it seems better to have the two specific properties for key store and key store password. In addition to the unused trust store properties, the SSLContextService also has the TLS Protocol property, which is would not apply to this service.
| .sensitive(true) | ||
| .required(false) | ||
| .addValidator(StandardValidators.NON_BLANK_VALIDATOR) | ||
| .expressionLanguageSupported(ExpressionLanguageScope.FLOWFILE_ATTRIBUTES) |
There was a problem hiding this comment.
Based on the approach of some other components, it seems best to avoid supporting expression language for password properties. This can still be parameterized using parameter contexts, which preserves the ability to use sensitive property encryption, and also supports passwords that may look like expression language strings.
| public static final PropertyDescriptor SERVICE_PRINCIPAL_CLIENT_CERTIFICATE = new PropertyDescriptor.Builder() | ||
| .name("service-principal-client-certificate") | ||
| .displayName("Service Principal Client Certificate") | ||
| .description("SSL Context Service referencing the keystore with the client certificate of the Client/Application. Only PKCS12 (.pfx) keystore type is supported. " + |
There was a problem hiding this comment.
In light of the description and the Azure SDK allowing only PKCS12 certificates, it would be helpful to add a check in the customValidate method to ensure that the file provided is actually a PKCS12. Leveraging the Azure ClientCertificateCredentialBuilder to load the certificate with the password provided would be a good way to ensure that the configured properties meet the requirements described.
NIFI-8258: Removed Service Principal Client Certificate authentication
|
I removed the EL support from the sensitive properties. I also removed Regarding @exceptionfactory, @jfrazee Would it be fine with you to reduce the scope of this PR to Client Secret only? Service Principal with Client Secret is a feature alone. It can be extended with the Client Certificate option and Authentication Type convenience property later. |
@turcsanyip Reducing the scope of this PR to just the Service Principal with Client Secret sounds good. Based on the issue described in NIFI-8270, waiting on resolution there to implement the |
exceptionfactory
left a comment
There was a problem hiding this comment.
The updates and current implementation to support Service Principal authentication with Client Secret, and introducing Variable Registry EL support for Endpoint Suffix look good. Any additional comments @jfrazee?
jfrazee
left a comment
There was a problem hiding this comment.
+1
I think waiting and sorting out the dependsOn and SSL context separately makes sense. I have a few minor comments but nothing I'd call blocking -- can do in the follow ups maybe.
I ran manual tests for service principals, access keys, and SAS tokens and ran the ITs.
@turcsanyip Were you planning to push any additional changes?
| results.add(new ValidationResult.Builder().subject(this.getClass().getSimpleName()) | ||
| .valid(false) | ||
| .explanation("one and only one of [" + options + "] should be set") | ||
| .explanation("one and only one authentication method of [Account Key, SAS Token, Managed Identity, Service Principal] should be used") |
There was a problem hiding this comment.
Would it make sense to use displayName() here?
| .explanation("one and only one authentication method of [Account Key, SAS Token, Managed Identity, Service Principal] should be used") | |
| .explanation(String.format("one and only one authentication method of [%s, %s, %s, Service Principal] should be used", | |
| ACCOUNT_KEY.displayName(), SAS_TOKEN.displayName(), USE_MANAGED_IDENTITY.displayName())) |
There was a problem hiding this comment.
Account Key and SAS Token would be fine but I would simply use Managed Identity (without the "Use" prefix from the displayname).
I think it is something we should rather fix when the new Authentication Type property is added (which will have the same AllowableValue-s that would be needed here too).
| .explanation("one and only one authentication method of [Account Key, SAS Token, Managed Identity, Service Principal] should be used") | ||
| .build()); | ||
| } else if (servicePrincipalSet) { | ||
| String template = "'%s' must be set when Service Principal authentication is being configured"; |
There was a problem hiding this comment.
| String template = "'%s' must be set when Service Principal authentication is being configured"; | |
| final String template = "'%s' must be set when Service Principal authentication is being configured"; |
There was a problem hiding this comment.
Just for this, I would not add a new commit. Will fix it in the follow-up jira (where customValidate() will be modified).
|
Follow-up jiras:
|
|
Thanks for your work on this @turcsanyip and for the additional issues referenced. Thanks for your input as well @jfrazee. +1 Merging. |
…rocessors - Removed Expression Language support indicators from sensitive properties This closes apache#4843 Signed-off-by: David Handermann <exceptionfactory@apache.org>
…rocessors - Removed Expression Language support indicators from sensitive properties This closes apache#4843 Signed-off-by: David Handermann <exceptionfactory@apache.org>
…rocessors
Also fixed EL handling in ADLSCredentialsControllerService.
https://issues.apache.org/jira/browse/NIFI-8258
Thank you for submitting a contribution to Apache NiFi.
Please provide a short description of the PR here:
Description of PR
Enables X functionality; fixes bug NIFI-YYYY.
In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:
For all changes:
Is there a JIRA ticket associated with this PR? Is it referenced
in the commit message?
Does your PR title start with NIFI-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
Has your PR been rebased against the latest commit within the target branch (typically
main)?Is your initial contribution a single, squashed commit? Additional commits in response to PR reviewer feedback should be made on this branch and pushed to allow change tracking. Do not
squashor use--forcewhen pushing to allow for clean monitoring of changes.For code changes:
mvn -Pcontrib-check clean installat the rootnififolder?LICENSEfile, including the mainLICENSEfile undernifi-assembly?NOTICEfile, including the mainNOTICEfile found undernifi-assembly?.displayNamein addition to .name (programmatic access) for each of the new properties?For documentation related changes:
Note:
Please ensure that once the PR is submitted, you check GitHub Actions CI for build issues and submit an update to your PR as soon as possible.