Azure Data Assets

kedro-azureml adds support for two new datasets that can be used in the Kedro catalog. Right now we support both Azure ML v1 SDK (direct Python) and Azure ML v2 SDK (fsspec-based) APIs.

For the v2 API (fsspec-based), use AzureMLAssetDataset, which lets you use Azure ML v2 SDK Folder/File datasets in both remote and local runs. Currently only the uri_file and uri_folder types are supported. Because of limitations of the Azure ML SDK, the uri_file type can only be used for pipeline inputs, not for outputs. The uri_folder type can be used for both inputs and outputs.
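The input/output rule above can be captured in a small lookup. This is only an illustrative sketch: `SUPPORTED_USAGE` and `is_usage_supported` are hypothetical names introduced here and are not part of kedro-azureml.

```python
# Illustrative sketch only: these names are hypothetical and are not
# provided by kedro-azureml.

# Asset types mapped to the pipeline roles in which they may be used,
# as stated in the paragraph above.
SUPPORTED_USAGE = {
    "uri_file": {"input"},
    "uri_folder": {"input", "output"},
}


def is_usage_supported(azureml_type: str, role: str) -> bool:
    """Return True if the asset type may be used in the given role.

    `role` is expected to be "input" or "output"; unknown asset types
    are treated as unsupported.
    """
    return role in SUPPORTED_USAGE.get(azureml_type, set())


for asset_type in ("uri_file", "uri_folder"):
    for role in ("input", "output"):
        print(asset_type, role, is_usage_supported(asset_type, role))
```

A check like this could run before a pipeline is submitted, so that a uri_file output is caught early instead of failing on Azure ML.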

For the v1 API (deprecated ⚠️), use AzureMLFileDataset and AzureMLPandasDataset, which translate to a File/Folder dataset and a Tabular dataset, respectively, in Azure Machine Learning. Both fully support the Azure versioning mechanism and can be used in the same way as any other dataset in Kedro.

Apart from these, kedro-azureml also adds the AzureMLPipelineDataset, which is used to pass data between pipeline nodes when the pipeline is run on Azure ML and the pipeline data passing feature is enabled. By default, data is then saved and loaded using the PickleDataset as the underlying dataset. Any other underlying dataset can be used instead by adding an AzureMLPipelineDataset to the catalog.
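As a rough illustration of swapping the underlying dataset, the entry below is written as a Python dictionary that mirrors the shape of a catalog entry. The entry name `processed_data`, the `root_dir` and `dataset` keys, the paths, and the choice of `pandas.CSVDataset` are all assumptions made for this sketch; check them against the API Reference before relying on them.

```python
# Hypothetical catalog entry, expressed as a Python dict for illustration.
# The key names ("root_dir", "dataset") and all values are assumptions and
# are not confirmed by this page; consult the API Reference for the actual
# arguments accepted by AzureMLPipelineDataset.
catalog_entry = {
    "processed_data": {
        "type": "kedro_azureml.datasets.AzureMLPipelineDataset",
        "root_dir": "data/02_intermediate",
        # Underlying dataset used instead of the default PickleDataset.
        "dataset": {
            "type": "pandas.CSVDataset",
            "filepath": "processed.csv",
        },
    }
}

underlying = catalog_entry["processed_data"]["dataset"]["type"]
print(underlying)
```

In a real project the same structure would normally live in the catalog file rather than in Python code.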

All of these can be found under the kedro_azureml.datasets module.

For details on usage, see the API Reference below.

API Reference

Pipeline data passing

⚠️ Cannot be used when run locally.

kedro_azureml.datasets.AzureMLPipelineDataset

V2 SDK

Use the dataset below when you're using Azure ML SDK v2 (fsspec-based).

✅ Can be used for both remote and local runs.

kedro_azureml.datasets.asset_dataset.AzureMLAssetDataset

V1 SDK

Use the datasets below when you're using Azure ML SDK v1 (direct Python).

⚠️ Deprecated - will be removed in a future version of kedro-azureml.

kedro_azureml.datasets.AzureMLPandasDataset

kedro_azureml.datasets.AzureMLFileDataset
