Key vault trusted service list should mention that Data Factory is supported only for Azure-Hosted IR and Dataflows #119969

Open
jikuja opened this issue Feb 18, 2024 · 5 comments

Comments

@jikuja

jikuja commented Feb 18, 2024

The table in the linked documentation states the following:

Azure Data Factory | Fetch data store credentials in Key Vault from Data Factory

That is no longer fully true. In spring 2023 there were changes to Data Factory token generation and self-hosted integration runtimes.

The response I got from support (TrackingID#2306260050002520):

As discussed in our meeting, the connection issue from the Self-Hosted IR towards the Azure Key Vault is being caused by an improvement made to our Data Factory product for security reasons.

Our core engineering team identified that the permissions of the Managed Identity tokens used within an untrusted environment (SHIR) were too high.
The high-privilege token can bypass the AKV firewall as a trusted service Managed Identity token. However, the Self-Hosted IR isn't a trusted service, as it is installed on a machine/environment managed outside Data Factory/Azure. Therefore, the high-privilege Managed Identity token could leak and cause unexpected results (e.g. the network isolation would be broken).

There is zero tolerance in terms of security, and in order to make sure that Data Factory follows best practices, the Self-Hosted IR was restricted to only low-privilege Managed Identity tokens. This change caused the Self-Hosted IR machine to be blocked by the AKV firewall.
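
The behaviour described in the support response can be sketched as a small decision model. This is only a toy illustration, not Azure code and not how the Key Vault firewall is actually implemented: `Runtime`, `VaultFirewall`, `token_is_high_privilege` and `request_allowed` are hypothetical names, and the IP allow-list branch is an assumption added to show a request path that does not depend on the trusted-service bypass.

```python
from dataclasses import dataclass
from enum import Enum


class Runtime(Enum):
    """Where a Data Factory activity runs (hypothetical labels)."""
    AZURE_IR = "Azure-hosted integration runtime"
    DATA_FLOW = "Data flow"
    SELF_HOSTED_IR = "Self-hosted integration runtime"


@dataclass(frozen=True)
class VaultFirewall:
    """Toy stand-in for a Key Vault firewall configuration."""
    deny_by_default: bool = True
    bypass_trusted_services: bool = True
    allowed_ips: frozenset = frozenset()  # assumption: explicit IP allow-list


def token_is_high_privilege(runtime: Runtime) -> bool:
    # Per the support response: after the change, the self-hosted IR only
    # receives low-privilege Managed Identity tokens.
    return runtime is not Runtime.SELF_HOSTED_IR


def request_allowed(firewall: VaultFirewall, runtime: Runtime, source_ip: str) -> bool:
    """Return True if this toy firewall would let the request through."""
    if not firewall.deny_by_default:
        return True  # firewall is open to all networks
    if source_ip in firewall.allowed_ips:
        return True  # caller is explicitly allowed by IP
    # Otherwise only a high-privilege token can use the trusted-service bypass.
    return firewall.bypass_trusted_services and token_is_high_privilege(runtime)
```

In this model, a request from `Runtime.AZURE_IR` or `Runtime.DATA_FLOW` passes a locked-down `VaultFirewall()` through the trusted-service bypass, while a request from `Runtime.SELF_HOSTED_IR` is rejected unless its source IP is in `allowed_ips`.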

The documentation should clearly state which Data Factory components are trusted by the service. Linking to a now-outdated blog post as extra documentation is not a good solution in the long run. The Data Factory documentation team should create their own documentation page if more information than a single line in a table is needed.

The blog post that "Fetch data store credentials in Key Vault from Data Factory" links to is outdated.



@AjayBathini-MSFT
Contributor

@jikuja
Thanks for your feedback! We will investigate and update as appropriate.

@ManoharLakkoju-MSFT
Contributor

@jikuja
Thank you for bringing this to our attention.
I've delegated this to content author @msmbaldwin, who will review it and offer their insightful opinions.

@jlichwa
Contributor

jlichwa commented Feb 27, 2024

@jikuja is there a public document saying that Data Factory is supported only for Azure-Hosted IR and Dataflows?

@jikuja
Author

jikuja commented Feb 28, 2024

@jikuja is there public document saying that Data Factory is supported only for Azure-Hosted IR and Dataflows?

No. The only public documentation on this topic is the table on the linked content page and the blog post.

I know about the new MSI token limitations on SHIR because the change that support described broke our system.

The Data Factory PG/PM should be in the loop to get this properly documented in the ADF documentation: a single blog post is not good enough. For this I could try posting something on the Q&A site. (I will not try asking @AzureSupport on Twitter. I did that earlier, and they refused to read GitHub issues or to ping SMEs/PG/PM about a GitHub issue.)

@jlichwa
Contributor

jlichwa commented Mar 26, 2024

@jikuja an issue should be created against the Data Factory documentation for this specific use case. Otherwise the Data Factory PMs will never see it.
