[CoE Starter Kit - QUESTION] [BYODL] when to delete old files from datalake / storage account? #6550
Comments
Hello, great question. Here are some thoughts: there are two "types" of files that are exported to the data lake
Hope that helps.
Let's consider adding this to the BYODL FAQ documentation.
Thanks a lot for your answer! My understanding is that the files in the data lake are reflected in the Dataverse instance of the CoE, and that if files are removed from the data lake, the data in Dataverse that is based on those files is removed as well.
For inventory, that's correct. If a file is deleted in the data lake, we assume that the environment/app/flow/etc. does not exist anymore and remove it from the CoE inventory (or mark it as deleted there). For usage, we no longer write the usage information for apps or flows to Dataverse. We only consume it into the Power BI dashboard via Power BI Dataflows, so once you delete those files, that usage data is lost. But yes, overall there is no need to remove data from Dataverse yourself when you remove it from the data lake.
@MSFT-klpinhac This has been fixed in the latest release. Please install the latest version of the toolkit following the instructions for installing updates. Note that if you do not remove the unmanaged layers as described there, you will not receive updates from us.
Does this question already exist in our backlog?
What is your question?
The data export feature creates a large number of JSON files in the storage account / data lake, and my customer needs to implement a clean-up for several reasons, e.g. compliance (i.e. old files need to be deleted from the storage account / data lake).
Is there any recommendation regarding such a clean-up?
I would assume that a scheduled task/cron job/etc. that runs e.g. once per month and deletes all files in the storage account / data lake that were created more than 6 months ago should work fine as a clean-up.
Is there anything my customer should look out for, e.g. are there any dependencies that could cause trouble if files created more than 6 months ago are deleted from the storage account / data lake?
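To make the idea concrete, here is a minimal sketch of the selection step such a scheduled job might run. It is only an illustration under stated assumptions, not something the CoE Starter Kit provides. `ExportedFile`, `select_expired`, the 180-day approximation of "6 months", and the `inventory/` and `usage/` prefixes are all hypothetical names. The actual listing and deletion against the storage account is deliberately left out.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, NamedTuple, Sequence


class ExportedFile(NamedTuple):
    """Hypothetical record for one exported JSON file in the data lake."""
    path: str
    created_on: datetime  # assumed to be timezone-aware


def select_expired(
    files: Iterable[ExportedFile],
    now: datetime,
    max_age_days: int = 180,          # assumption: "6 months" taken as 180 days
    keep_prefixes: Sequence[str] = (),  # hypothetical folders to never delete
) -> List[str]:
    """Return the paths of files created before the cutoff.

    Files whose path starts with one of keep_prefixes are skipped, so
    that some folders can be excluded from an age-based clean-up.
    """
    cutoff = now - timedelta(days=max_age_days)
    protected = tuple(keep_prefixes)
    return [
        f.path
        for f in files
        if f.created_on < cutoff and not f.path.startswith(protected)
    ]


if __name__ == "__main__":
    # Made-up sample data; the folder layout is an assumption.
    sample = [
        ExportedFile("usage/a.json", datetime(2023, 12, 1, tzinfo=timezone.utc)),
        ExportedFile("inventory/c.json", datetime(2023, 1, 1, tzinfo=timezone.utc)),
    ]
    today = datetime(2024, 7, 1, tzinfo=timezone.utc)
    # Dry run: print the candidates instead of deleting anything.
    for path in select_expired(sample, today, keep_prefixes=("inventory/",)):
        print("would delete:", path)
```

Running it as a dry run first (printing candidates rather than deleting) seems a sensible way to check which files an age rule would actually touch before anything is removed.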
What solution are you experiencing the issue with?
Core
What solution version are you using?
No response
What app or flow are you having the issue with?
No response
What method are you using to get inventory and telemetry?
None