Change default to no code upload and update docs
tomasvanpottelbergh committed Nov 6, 2022
1 parent 5bf7801 commit 65e7841
Showing 2 changed files with 39 additions and 44 deletions.
81 changes: 38 additions & 43 deletions docs/source/03_quickstart.rst
Expand Up @@ -51,66 +51,61 @@ created in Azure and have their **names** ready to input to the plugin:
or set appropriate settings
(`https://github.com/kedro-org/kedro-plugins/tree/main/kedro-telemetry <https://github.com/kedro-org/kedro-plugins/tree/main/kedro-telemetry>`__).
6. Install the requirements ``pip install -r src/requirements.txt``
7. Create an Azure ML Environment for the project:

For the project's code to run on Azure ML it needs to have an environment
with the necessary dependencies. This section shows how to do this from a
local Docker build context. Please refer to the
`Azure ML CLI documentation <https://learn.microsoft.com/en-us/azure/machine-learning/how-to-manage-environments-v2#create-an-environment>`__
for more options.

Start by executing the following command:
7. Initialize the Kedro Azure ML plugin. It requires the Azure resource
names as stated above. Experiment name can be anything you like (as
long as it's allowed by Azure ML). The environment name is the name
of the Azure ML Environment to be created in the next step. You can
use the syntax ``<environment-name>@latest`` for the latest version or
``<environment-name>:<version>`` for a specific version.

.. code:: console
kedro docker init
#Usage: kedro azureml init [OPTIONS] RESOURCE_GROUP WORKSPACE_NAME
# EXPERIMENT_NAME CLUSTER_NAME STORAGE_ACCOUNT_NAME
# STORAGE_CONTAINER ENVIRONMENT_NAME
kedro azureml init <resource-group-name> <workspace-name> <experiment-name> <compute-cluster-name> <storage-account-name> <storage-container-name> <environment-name>
This command creates several files, including ``Dockerfile`` and
``.dockerignore``. These can be adjusted to match the workflow for
your project.
Depending on whether you want to use code upload when submitting an
experiment or not, you would need to add the code and any possible input
data to the Docker image.
8. Create an Azure ML Environment for the project:

- If using code upload:
Everything apart from the section "install project requirements"
can be removed from the ``Dockerfile``. You can add a
``.amlignore`` file to specify which files should be uploaded.
- If not using code upload:
Keep the sections in the ``Dockerfile`` and adjust the ``.dockerignore``
file to add any other files to be added to the Docker image,
such as ``!data/01_raw`` for the raw data files.

Set ``code_directory: null`` in the ``azureml.yml`` config file.
For the project's code to run on Azure ML it needs to have an environment
with the necessary dependencies. This section shows how to do this from a
local Docker build context. Please refer to the
`Azure ML CLI documentation <https://learn.microsoft.com/en-us/azure/machine-learning/how-to-manage-environments-v2#create-an-environment>`__
for more options.

Create or update an Azure ML Environment by running the following command:
Start by executing the following command:

.. code:: console
az ml environment create --name <environment-name> --version <version> --build-context . --dockerfile-path Dockerfile
kedro docker init
This command creates several files, including ``Dockerfile`` and
``.dockerignore``. These can be adjusted to match the workflow for
your project.
8. Initialize the Kedro Azure ML plugin. It requires the Azure resource
names as stated above. Experiment name can be anything you like (as
long as it's allowed by Azure ML). The environment name is the name
of the Azure ML Environment created in the previous step. You can
use the syntax ``<environment-name>@latest`` for the latest version or
``<environment-name>:<version>`` for a specific version.
Depending on whether you want to use code upload when submitting an
experiment or not, you would need to add the code and any possible input
data to the Docker image.
.. code:: console
- If using code upload:
Everything apart from the section "install project requirements"
can be removed from the ``Dockerfile``. You can add a
``.amlignore`` file to specify which files should be uploaded.
#Usage: kedro azureml init [OPTIONS] RESOURCE_GROUP WORKSPACE_NAME
# EXPERIMENT_NAME CLUSTER_NAME STORAGE_ACCOUNT_NAME
# STORAGE_CONTAINER ENVIRONMENT_NAME
kedro azureml init <resource-group-name> <workspace-name> <experiment-name> <compute-cluster-name> <storage-account-name> <storage-container-name> <environment-name>
Set ``code_directory: "."`` (or a subdirectory containing the code to upload)
in the ``azureml.yml`` config file.
- If not using code upload:
Keep the sections in the ``Dockerfile`` and adjust the ``.dockerignore``
file to add any other files to be added to the Docker image,
such as ``!data/01_raw`` for the raw data files.
Create or update an Azure ML Environment by running the following command:
.. code:: console
Configuration generated in /Users/marcin/Dev/tmp/kedro-azureml-demo/conf/base/azureml.yml
It's recommended to set Lifecycle management rule for storage container kedro-azure-storage to avoid costs of long-term storage of the temporary data.
Temporary data will be stored under abfs://kedro-azure-storage/kedro-azureml-temp path
See https://docs.microsoft.com/en-us/azure/storage/blobs/lifecycle-management-policy-configure?tabs=azure-portal
az ml environment create --name <environment-name> --version <version> --build-context . --dockerfile-path Dockerfile
9. Adjust the Data Catalog - the default one stores all data locally,
whereas the plugin will automatically use Azure Blob Storage. Only
2 changes: 1 addition & 1 deletion kedro_azureml/config.py
Expand Up @@ -67,7 +67,7 @@ class KedroAzureRunnerConfig(BaseModel):
# Azure ML Environment to use during pipeline execution
environment_name: "{environment_name}"
# Path to directory to upload, or null to disable code upload
code_directory: "."
code_directory: null
# Path to the directory in the Docker image to run the code from
# Ignored when code_directory is set
working_directory: /home/kedro
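
A minimal sketch of how the two settings in the hunk above relate, based only on the template comments: ``code_directory`` set to a path enables code upload, ``null`` disables it, and ``working_directory`` is ignored whenever ``code_directory`` is set. The helper names ``uses_code_upload`` and ``describe_run`` are hypothetical and are not part of kedro-azureml; the plugin's actual logic may differ.

```python
from typing import Optional


def uses_code_upload(code_directory: Optional[str]) -> bool:
    """Hypothetical helper: code upload is enabled only when a directory is set."""
    return code_directory is not None


def describe_run(code_directory: Optional[str],
                 working_directory: str = "/home/kedro") -> str:
    """Summarise where the code comes from for a given config (illustration only)."""
    if uses_code_upload(code_directory):
        # Per the template comment, working_directory is ignored in this case.
        return f"upload code from {code_directory!r}; working_directory ignored"
    # New default after this commit: code_directory is null, so the code
    # must already be present in the Docker image.
    return f"run code baked into the image from {working_directory!r}"


if __name__ == "__main__":
    print(describe_run(None))  # new default: no code upload
    print(describe_run("."))   # previous default: upload the project root
```

With the new default (``null``), the project code has to be included in the Docker image, which is why the updated docs tell users who do want code upload to set ``code_directory: "."`` explicitly.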