NIFI-8258: Add support for Service Principal authentication in ADLS p…#4843

Closed
turcsanyip wants to merge 2 commits into apache:main from turcsanyip:NIFI-8258

Contributor

@turcsanyip commented Feb 25, 2021

…rocessors

Also fixed EL handling in ADLSCredentialsControllerService.

https://issues.apache.org/jira/browse/NIFI-8258

.endpoint(endpoint)
.credential(credential)
.buildClient();
} else if (servicePrincipalTenantId != null && servicePrincipalClientId != null && servicePrincipalClientCertificatePath != null && servicePrincipalClientCertificatePassword != null) {
Contributor

Using StringUtils.isNoneBlank() would make this a bit more concise:

Suggested change
- } else if (servicePrincipalTenantId != null && servicePrincipalClientId != null && servicePrincipalClientCertificatePath != null && servicePrincipalClientCertificatePassword != null) {
+ } else if (StringUtils.isNoneBlank(servicePrincipalTenantId, servicePrincipalClientId, servicePrincipalClientCertificatePath, servicePrincipalClientCertificatePassword)) {
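Editorial note: the suggested change is not only shorter, it also tightens the condition, because empty and whitespace-only values are rejected in addition to null. The sketch below illustrates that difference with a local stand-in helper. `BlankCheckSketch` and its methods are hypothetical names written for this note; they approximate the behavior of `StringUtils.isNoneBlank` for ordinary inputs and are not the Commons Lang implementation.

```java
import java.util.Arrays;

/** Illustrative stand-in; not the Commons Lang implementation. */
final class BlankCheckSketch {

    private BlankCheckSketch() {
    }

    /** True when the value is null, empty, or consists only of whitespace. */
    static boolean isBlank(final String value) {
        return value == null || value.chars().allMatch(Character::isWhitespace);
    }

    /** Hypothetical local helper: true only when none of the given values is blank. */
    static boolean isNoneBlank(final String... values) {
        return Arrays.stream(values).noneMatch(BlankCheckSketch::isBlank);
    }

    /** The style of check used in the original line: only null is rejected. */
    static boolean allNonNull(final String... values) {
        return Arrays.stream(values).allMatch(value -> value != null);
    }
}
```

With the inputs `"tenant", "client", " ", ""`, `allNonNull` accepts the configuration while `isNoneBlank` rejects it, which is the case the null-only check would let through.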

.sensitive(true)
.required(false)
.addValidator(StandardValidators.NON_BLANK_VALIDATOR)
.expressionLanguageSupported(ExpressionLanguageScope.FLOWFILE_ATTRIBUTES)
Contributor

If this property is considered sensitive, should it support expression language? Retrieving a sensitive property from flow file attributes would expose the value in provenance events. This same question applies to the other properties marked as sensitive.

Member

That is definitely an issue, but there is an important use case for it -- when you need to shard the data across a lot of storage accounts for performance reasons or you're using storage accounts that may only be known at runtime. When I've seen this, though, the volatile provenance repository was being used so the surface area was small.

It is called out in the property documentation but maybe it could be louder ("certain risks" doesn't sound very scary) or include a more detailed explanation in additional details. What's there now:

There are certain risks in allowing the account name to be stored as a flowfile attribute. While it does provide for a more flexible flow by allowing the account name to be fetched dynamically from a flowfile attribute, care must be taken to restrict access to the event provenance data (e.g. by strictly controlling the policies governing provenance for this Processor). In addition, the provenance repositories may be put on encrypted disk partitions.

Contributor

@jfrazee Thanks for providing that background reference to the existing documentation. Understanding that these new properties fall in the same category of concerns, is it possible to implement the use cases described using Parameter Contexts? It seems like that would work for retrieving the account information at runtime, but it could make flows a bit more complicated when it is necessary to shard data across storage accounts. With the goal of moving in a more secure direction, would it be better to avoid introducing new properties supporting expression language here to encourage moving to Parameter Contexts?

Member

I don't think parameter contexts get at the original desire since you'd have to know in advance which and how many parameters to create, but maybe since it's a new feature, there's no harm in not allowing EL. We can change our mind later without creating any upgrade issues. The opposite isn't true if we try to remove it later.

The "right" thing to do here could be to say that the use case should use managed identities, or maybe SAS tokens.

Question: how much pain do you think would be induced by rolling this back on Access Keys?

Contributor Author

The concept of supporting EL for the sensitive Account Name / Key came from the Blob credential service (the Blob counterpart of this CS) and from the use case described by @jfrazee. I followed this "pattern" with the new Service Principal properties.

However, there was a bug in this ADLS controller service and ELs were not evaluated at all (this PR would fix it). So at the moment nobody can use the ADLS service with EL, and therefore removing EL support for Account Name / Key would not raise backward compatibility questions here.

I think we can safely go ahead with removing EL support from all (old and new) sensitive properties if that is preferred. We can add it later if needed.
Implementing a lookup service (like AzureStorageCredentialsControllerServiceLookup) is also an option. It can support multiple credentials within the same flow (though it cannot be as dynamic as EL).
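Editorial note: the trade-off of the lookup approach can be sketched in plain Java. The flow file carries only the name of a pre-registered credential, never the secret itself, so nothing sensitive ends up in provenance; the cost is that every credential has to be registered in advance. `CredentialsLookupSketch`, the `credentials.name` attribute, and the string-valued secrets are all hypothetical and chosen for illustration; this is not how AzureStorageCredentialsControllerServiceLookup is implemented.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of a lookup keyed by a non-sensitive flow file attribute. */
final class CredentialsLookupSketch {

    /** Assumed attribute name; the real lookup service may use a different one. */
    static final String LOOKUP_ATTRIBUTE = "credentials.name";

    private final Map<String, String> secretsByName;

    CredentialsLookupSketch(final Map<String, String> secretsByName) {
        // Defensive copy: the registered set is fixed when the service is configured.
        this.secretsByName = Collections.unmodifiableMap(new HashMap<>(secretsByName));
    }

    /** Resolves the secret for a flow file; empty when the name is missing or unknown. */
    Optional<String> resolve(final Map<String, String> flowFileAttributes) {
        final String name = flowFileAttributes.get(LOOKUP_ATTRIBUTE);
        return Optional.ofNullable(name).map(secretsByName::get);
    }
}
```

A flow that shards data across accounts would set the lookup attribute per flow file, while the secrets stay in the service configuration.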

Contributor

Thanks for pointing that out @turcsanyip. I am in favor of removing EL support since it is not currently functional.

Member

So it sounds like the only hard decision is whether to remove EL support from the SAS Token property? If created properly they provide fairly fine-grained security so the risk is significantly less than with Access Keys or SPs.

That said, maybe the user should have to work harder to do it. I think a CS could do this though since the client is created on every onTrigger()?

Contributor Author

@jfrazee EL did not work at all in this CS, so SAS Token with EL did not work either (sorry, I mentioned only Account Name / Key because usually I use those).
ADLS processors were released in 1.12 and I think they are not widely adopted yet. I guess this is the reason why the EL problem has not been noticed so far.

Member

@jfrazee Feb 26, 2021

Ok. So seems consensus is to not support EL? And we can entertain an alternate CS, lookup service, or change if there's specific demand?

Contributor Author

Yes, I believe this is the consensus and I'll remove EL support.

boolean servicePrincipalClientSecretSet = StringUtils.isNotBlank(validationContext.getProperty(SERVICE_PRINCIPAL_CLIENT_SECRET).getValue());
boolean servicePrincipalClientCertificateSet = validationContext.getProperty(SERVICE_PRINCIPAL_CLIENT_CERTIFICATE).isSet();

boolean servicePrincipalSet = servicePrincipalTenantIdSet || servicePrincipalClientIdSet || servicePrincipalClientSecretSet || servicePrincipalClientCertificateSet;
Contributor

Instead of using the presence of any of these properties to imply Service Principal Authentication, what about introducing one more property named something like Authentication Type? The value of that property could take one of an enumerated list of values. With that property in place, the remaining Service Principal properties could use the dependsOn feature of Property Descriptors. This would provide a cleaner user experience and should also make the determination of authentication type easier to follow.

Contributor Author

Using Authentication Type with dependsOn would be great here. I just don't know how we could initialize this new field in existing flows. We cannot use a default value there. If Account Key property is filled in, then the Authentication Type should be initialized as "Account Key", if SAS Token is filled in, then "SAS Token", etc.
Managed Identity is more problematic because if Authentication Type is "Managed Identity", then no further properties are needed. The existing Use Managed Identity property with true/false would be redundant / inconsistent, but deleting a property is not really possible.

Contributor

That makes sense, what about having a default value of AUTO that would preserve existing behavior, and having other values that would make the Authentication Type explicit?

Member

I like this, but for an existing flow the UX will be a bit weird at first because all the fields will initially be hidden I think.

I was trying to think through whether there's a way to repurpose "Use Managed Identity" for this without being excessively dirty? I think there is. It's possible to add allowable values without breaking anything, display name can of course change, if you change to AllowableValue the existing values will look natural in the UI, and everything except the name() value in the code might look pretty much like this is how it originally was done.

Can we get past having this property still being identified as "storage-use-managed-identity"?

Contributor Author

"the UX will be a bit weird at first because all the fields will initially be hidden": I don't think so, because the default AUTO would mean all properties are displayed, if I understand @exceptionfactory's idea correctly.

Regarding repurposing "Use Managed Identity": we could rename the property in the UI and also rename the existing true value to Managed Identity, but false covers both Account Key and SAS Token and we don't know which one, so it is not clear to me how to handle it.

Contributor

As @turcsanyip described, the default value of a new Authentication Type property would be AUTO, indicating that currently visible properties would be displayed and the existing logic to infer the desired authentication type would be followed. Selecting a more specific Authentication Type value would hide properties that are not applicable, using the dependsOn feature.

Member

That makes sense. I think it'll be easier to see how it feels with it in action.

We don't have an answer re: the redundancy with the value of Auth Type = Managed Identity and Use Managed Identity = t/f, right? Is the idea for them to co-exist and then validation ensures it's consistent?

Contributor Author

My idea would be:

  • in case of Auth Type = AUTO: all props are shown and the current validation runs (e.g. Account Key cannot be filled in and Use Managed Identity cannot be true at the same time)
  • in case of Auth Type = Account Key: only the Account Key prop is shown and only this property is validated (it must be filled in); the other, invisible properties can have any values (it would be weird to show warnings for those properties in this case)
  • in case of Auth Type = Managed Identity: no other property is shown, not even Use Managed Identity, and no further validation is needed; this Auth Type simply turns on Managed Identity authentication regardless of the value of the invisible Use Managed Identity property

With these rules, the visible properties are always consistent. The invisible ones may not be, but that seems acceptable to me.
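Editorial note: the rules above can be sketched as a small, self-contained validation routine. This is only an illustration of the proposal, not code from the PR (the Authentication Type property was deferred to a follow-up ticket). `AuthTypeValidationSketch`, the `AuthType` enum, and the message strings are hypothetical, and the SAS Token case is an assumption modeled on the Account Key rule.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the proposed per-type validation rules. */
final class AuthTypeValidationSketch {

    enum AuthType { AUTO, ACCOUNT_KEY, SAS_TOKEN, MANAGED_IDENTITY }

    private AuthTypeValidationSketch() {
    }

    private static boolean isSet(final String value) {
        return value != null && !value.trim().isEmpty();
    }

    /** Returns the validation problems; an empty list means the configuration is valid. */
    static List<String> validate(final AuthType authType, final String accountKey,
                                 final String sasToken, final boolean useManagedIdentity) {
        final List<String> problems = new ArrayList<>();
        switch (authType) {
            case AUTO: {
                // Existing behavior: infer the method, so exactly one must be configured.
                int configured = 0;
                if (isSet(accountKey)) {
                    configured++;
                }
                if (isSet(sasToken)) {
                    configured++;
                }
                if (useManagedIdentity) {
                    configured++;
                }
                if (configured != 1) {
                    problems.add("one and only one authentication method should be used");
                }
                break;
            }
            case ACCOUNT_KEY:
                // Only the visible property is validated; hidden ones are ignored.
                if (!isSet(accountKey)) {
                    problems.add("Account Key must be set");
                }
                break;
            case SAS_TOKEN:
                // Assumption: SAS Token follows the same rule as Account Key.
                if (!isSet(sasToken)) {
                    problems.add("SAS Token must be set");
                }
                break;
            case MANAGED_IDENTITY:
                // No further properties needed; the hidden flag is ignored.
                break;
            default:
                throw new IllegalArgumentException("Unknown authentication type: " + authType);
        }
        return problems;
    }
}
```

Under these rules, `ACCOUNT_KEY` with a stale SAS Token and `useManagedIdentity = true` still validates, because only the visible property is checked, whereas the same values under `AUTO` are rejected.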

What do you think of it?

Contributor

@turcsanyip That sounds like a good approach.

.displayName("Service Principal Client Certificate")
.description("SSL Context Service referencing the keystore with the client certificate of the Client/Application. Only PKCS12 (.pfx) keystore type is supported. " +
"The keystore must contain a single key and the password of the keystore and the key must be the same.")
.identifiesControllerService(SSLContextService.class)
Contributor

The property name is somewhat confusing in light of this property referencing the SSLContextService. Given that SSLContextService is not really used to provide an SSLContext object, what about changing this property to be just the file path? That would also avoid the need for introducing the additional dependency on nifi-ssl-context-service-api. In that case, the File Path Validator could be used to ensure that the file exists.

Contributor Author

I believe we use SSLContextService just to provide the bare keystore attributes (like path and password) at other places in the code too.
I would prefer to reuse the existing way of configuring a keystore via a CS. Otherwise two properties would be needed that already exist in SSLContextService.

Contributor

It is understandable that SSLContextService is used in other components where both key store and trust store are necessary. In this case, however, only the key store and associated password are necessary, so using SSLContextService seems to imply that more properties are necessary to make this work. Also given that only a PKCS12 key store is supported, it seems better to have the two specific properties for key store and key store password. In addition to the unused trust store properties, the SSLContextService also has the TLS Protocol property, which would not apply to this service.

.sensitive(true)
.required(false)
.addValidator(StandardValidators.NON_BLANK_VALIDATOR)
.expressionLanguageSupported(ExpressionLanguageScope.FLOWFILE_ATTRIBUTES)
Contributor

Based on the approach of some other components, it seems best to avoid supporting expression language for password properties. This can still be parameterized using parameter contexts, which preserves the ability to use sensitive property encryption, and also supports passwords that may look like expression language strings.

public static final PropertyDescriptor SERVICE_PRINCIPAL_CLIENT_CERTIFICATE = new PropertyDescriptor.Builder()
.name("service-principal-client-certificate")
.displayName("Service Principal Client Certificate")
.description("SSL Context Service referencing the keystore with the client certificate of the Client/Application. Only PKCS12 (.pfx) keystore type is supported. " +
Contributor

In light of the description and the Azure SDK allowing only PKCS12 certificates, it would be helpful to add a check in the customValidate method to ensure that the file provided is actually a PKCS12. Leveraging the Azure ClientCertificateCredentialBuilder to load the certificate with the password provided would be a good way to ensure that the configured properties meet the requirements described.
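Editorial note: the comment above proposes using the Azure ClientCertificateCredentialBuilder for this check. As a dependency-free alternative, the sketch below uses the JDK's own `java.security.KeyStore` to test whether a file loads as PKCS12 with the given password. This is a swapped-in technique, not what the reviewer suggested or what the PR implements. `Pkcs12CheckSketch` and its methods are hypothetical names, and `writeEmptyPkcs12` exists only to make the example self-contained. Some JDK configurations are lenient about keystore formats, so treat the result as a plausibility check rather than proof of the file type.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.GeneralSecurityException;
import java.security.KeyStore;

/** Hypothetical sketch: checks whether a file can be loaded as a PKCS12 keystore. */
final class Pkcs12CheckSketch {

    private Pkcs12CheckSketch() {
    }

    /** Returns true when the file exists and loads as PKCS12 with the given password. */
    static boolean isLoadablePkcs12(final Path file, final char[] password) {
        try (InputStream in = Files.newInputStream(file)) {
            final KeyStore keyStore = KeyStore.getInstance("PKCS12");
            keyStore.load(in, password);
            return true;
        } catch (IOException | GeneralSecurityException e) {
            // Missing file, unparsable content, or a wrong password all end up here.
            return false;
        }
    }

    /** Demo helper only: writes an empty PKCS12 keystore to a temporary file. */
    static Path writeEmptyPkcs12(final char[] password) {
        try {
            final KeyStore keyStore = KeyStore.getInstance("PKCS12");
            keyStore.load(null, password); // initializes an empty keystore
            final Path file = Files.createTempFile("pkcs12-sketch", ".p12");
            try (OutputStream out = Files.newOutputStream(file)) {
                keyStore.store(out, password);
            }
            return file;
        } catch (IOException | GeneralSecurityException e) {
            throw new IllegalStateException("Could not create demo keystore", e);
        }
    }
}
```

A customValidate-style check could call `isLoadablePkcs12` with the configured path and password and report an invalid result when it returns false. Checking that the keystore holds exactly one key, as the property description requires, would be an additional step on the loaded `KeyStore`.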

NIFI-8258: Removed Service Principal Client Certificate authentication
@turcsanyip
Contributor Author

I removed the EL support from the sensitive properties.

I also removed SSLContextService and implemented the keystore handling locally but in the end I felt it is something that should rather go into a separate KeyStoreService CS. To leave this question open, I removed the Client Certificate way from this PR and I would handle it in a follow-up ticket.
You can check the implementation here: https://github.com/turcsanyip/nifi/compare/NIFI-8258...turcsanyip:NIFI-8258_Client_Certificate?expand=1
I can open another PR from it or we can implement KeyStoreService CS.

Regarding Authentication Type: I ran into a bug with dependsOn(). Properties depending on the default value of another property (like AUTO in our case) do not show up on the UI initially (when I just created the CS and open the properties tab).
I filed a Jira: https://issues.apache.org/jira/browse/NIFI-8270
I would not wait for this to be fixed and would also implement this Authentication Type in a follow-up ticket.

@exceptionfactory, @jfrazee Would it be fine with you to reduce the scope of this PR to Client Secret only? Service Principal with Client Secret is a feature on its own. It can be extended with the Client Certificate option and the Authentication Type convenience property later.

@exceptionfactory
Contributor

@turcsanyip Reducing the scope of this PR to just the Service Principal with Client Secret sounds good. Based on the issue described in NIFI-8270, waiting on resolution there to implement the Authentication Type property and Client Certificate properties seems like a good way forward.

Contributor

@exceptionfactory left a comment

The updates and current implementation to support Service Principal authentication with Client Secret, and introducing Variable Registry EL support for Endpoint Suffix look good. Any additional comments @jfrazee?

Member

@jfrazee left a comment

+1

I think waiting and sorting out the dependsOn and SSL context separately makes sense. I have a few minor comments but nothing I'd call blocking -- can do in the follow ups maybe.

I ran manual tests for service principals, access keys, and SAS tokens and ran the ITs.

@turcsanyip Were you planning to push any additional changes?

results.add(new ValidationResult.Builder().subject(this.getClass().getSimpleName())
.valid(false)
- .explanation("one and only one of [" + options + "] should be set")
+ .explanation("one and only one authentication method of [Account Key, SAS Token, Managed Identity, Service Principal] should be used")
Member

Would it make sense to use displayName() here?

Suggested change
- .explanation("one and only one authentication method of [Account Key, SAS Token, Managed Identity, Service Principal] should be used")
+ .explanation(String.format("one and only one authentication method of [%s, %s, %s, Service Principal] should be used",
+ ACCOUNT_KEY.displayName(), SAS_TOKEN.displayName(), USE_MANAGED_IDENTITY.displayName()))

Contributor Author

Account Key and SAS Token would be fine, but I would simply use Managed Identity (without the "Use" prefix from the display name).
I think it is something we should rather fix when the new Authentication Type property is added (which will have the same AllowableValue-s that would be needed here too).

.explanation("one and only one authentication method of [Account Key, SAS Token, Managed Identity, Service Principal] should be used")
.build());
} else if (servicePrincipalSet) {
String template = "'%s' must be set when Service Principal authentication is being configured";
Member

Suggested change
- String template = "'%s' must be set when Service Principal authentication is being configured";
+ final String template = "'%s' must be set when Service Principal authentication is being configured";

Contributor Author

Just for this, I would not add a new commit. Will fix it in the follow-up jira (where customValidate() will be modified).

@turcsanyip
Contributor Author

Follow-up jiras:

Contributor

@exceptionfactory commented Mar 2, 2021

Thanks for your work on this @turcsanyip and for the additional issues referenced. Thanks for your input as well @jfrazee. +1 Merging.

@asfgit asfgit closed this in 7876f41 Mar 2, 2021
driesva pushed a commit to driesva/nifi that referenced this pull request Mar 19, 2021
…rocessors

- Removed Expression Language support indicators from sensitive properties

This closes apache#4843

Signed-off-by: David Handermann <exceptionfactory@apache.org>