Merge pull request #103 from getindata/release-0.8.0
Release 0.8.0
em-pe committed May 9, 2024
2 parents e7199a9 + e8b8eb7 commit 474ade0
Showing 16 changed files with 1,736 additions and 1,651 deletions.
2 changes: 1 addition & 1 deletion .bumpversion.cfg
@@ -1,5 +1,5 @@
 [bumpversion]
-current_version = 0.7.0
+current_version = 0.8.0

 [bumpversion:file:pyproject.toml]
2 changes: 1 addition & 1 deletion .copier-answers.yml
@@ -7,7 +7,7 @@ description: Kedro plugin with Azure ML Pipelines support
 docs_url: https://kedro-azureml.readthedocs.io/
 full_name: Kedro Azure ML Pipelines plugin
 github_url: https://github.com/getindata/kedro-azureml
-initial_version: 0.7.0
+initial_version: 0.8.0
 keywords:
 - kedro
 - mlops
8 changes: 4 additions & 4 deletions .github/workflows/prepare-release.yml
@@ -13,13 +13,13 @@ jobs:
     runs-on: ubuntu-latest
     strategy:
       matrix:
-        python-version: [3.8]
+        python-version: [3.9]
     env:
       PYTHON_PACKAGE: kedro_azureml
     steps:
-      - uses: actions/checkout@v2
+      - uses: actions/checkout@v4
       - name: Set up Python ${{ matrix.python-version }}
-        uses: actions/setup-python@v1
+        uses: actions/setup-python@v5
         with:
           python-version: ${{ matrix.python-version }}
       - name: Validate inputs
@@ -44,7 +44,7 @@ jobs:
           git push -u origin release-${{ steps.bump_version.outputs.package_version }}
       - name: Open a PR to merge the release to master
         id: open_pr
-        uses: vsoch/pull-request-action@1.0.12
+        uses: vsoch/pull-request-action@1.1.1
         env:
           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
           PULL_REQUEST_BRANCH: master
2 changes: 1 addition & 1 deletion .github/workflows/spellcheck.yml
@@ -16,5 +16,5 @@ jobs:
     steps:
       # Spellcheck
       - uses: actions/checkout@v2
-      - uses: rojopolis/spellcheck-github-actions@0.25.0
+      - uses: rojopolis/spellcheck-github-actions@0.35.0
         name: Spellcheck
36 changes: 22 additions & 14 deletions .github/workflows/tests_and_publish.yml
@@ -7,25 +7,32 @@ on:
       - develop
     paths-ignore:
       - "docs/**"
+      - CHANGELOG.md
+      - README.md
+      - CONTRIBUTING.md
   pull_request:
     branches:
       - master
       - develop
     paths-ignore:
       - "docs/**"
-
+      - CHANGELOG.md
+      - README.md
+      - CONTRIBUTING.md
+  # Allows you to run this workflow manually from the Actions tab
+  workflow_dispatch:
 jobs:
   unit_tests:
     runs-on: ubuntu-latest
     strategy:
       matrix:
-        python-version: [ '3.8', '3.9', '3.10']
+        python-version: [ '3.8', '3.9', '3.10', '3.11']

     steps:
-      - uses: actions/checkout@v2
+      - uses: actions/checkout@v4

       - name: Setup python ${{ matrix.python-version }}
-        uses: actions/setup-python@v2.2.1
+        uses: actions/setup-python@v5
         with:
           python-version: ${{ matrix.python-version }}

@@ -45,7 +52,7 @@ jobs:
           tox -v
       - name: Store coverage reports
-        uses: actions/upload-artifact@v3
+        uses: actions/upload-artifact@v4
         with:
           name: coverage-${{ matrix.python-version }}
           path: coverage.xml
@@ -56,11 +63,11 @@ jobs:
     needs: unit_tests
     steps:

-      - uses: actions/checkout@v2
+      - uses: actions/checkout@v4
         with:
           fetch-depth: 0

-      - uses: actions/download-artifact@v3
+      - uses: actions/download-artifact@v4
         with:
           name: coverage-3.9
           path: .
@@ -83,19 +90,19 @@ jobs:

     steps:
       - name: Checkout repository
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4

       # Initializes the CodeQL tools for scanning.
       - name: Initialize CodeQL
-        uses: github/codeql-action/init@v2
+        uses: github/codeql-action/init@v3
         with:
           languages: python

       - name: Autobuild
-        uses: github/codeql-action/autobuild@v2
+        uses: github/codeql-action/autobuild@v3

       - name: Perform CodeQL Analysis
-        uses: github/codeql-action/analyze@v2
+        uses: github/codeql-action/analyze@v3

   ### PROJECT SPECIFIC CONFIGURATION HERE

@@ -107,10 +114,10 @@ jobs:
       matrix:
         e2e_config: ["e2e", "e2e_pipeline_data_passing"]
     steps:
-      - uses: actions/checkout@v2
+      - uses: actions/checkout@v4

       - name: Setup python
-        uses: actions/setup-python@v2.2.1
+        uses: actions/setup-python@v5
         with:
           python-version: "3.10"

@@ -133,9 +140,10 @@ jobs:
         run: |
           find "../dist" -name "*.tar.gz" | xargs -I@ cp @ kedro-azureml.tar.gz
           echo -e "\n./kedro-azureml.tar.gz\n" >> src/requirements.txt
-          echo -e "kedro-docker\n" >> src/requirements.txt
+          echo -e "kedro-docker<0.5.0\n" >> src/requirements.txt
           echo -e "openpyxl\n" >> src/requirements.txt # temp fix for kedro-datasets issues with optional packages
           sed -i '/kedro-telemetry/d' src/requirements.txt
+          sed -i '/kedro-viz/d' src/requirements.txt # starter version requirements make tests fail
           echo $(cat src/requirements.txt)
           pip install -r src/requirements.txt
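
The net effect of this step on `src/requirements.txt` can be sketched in Python. This is only an illustration of what the `echo ... >>` and `sed -i '/pattern/d'` lines do: the function name `patch_requirements` and the sample starter requirements are hypothetical, the blank lines that `echo -e "...\n"` appends are left out, and the real workflow runs the shell commands shown in the hunk.

```python
from typing import List


def patch_requirements(lines: List[str]) -> List[str]:
    """Mimic the shell step: append extra requirements, then drop unwanted ones."""
    patched = list(lines)
    # `echo ... >> src/requirements.txt` appends new requirement lines
    patched += ["./kedro-azureml.tar.gz", "kedro-docker<0.5.0", "openpyxl"]
    # `sed -i '/pattern/d'` deletes every line that contains the pattern
    for pattern in ("kedro-telemetry", "kedro-viz"):
        patched = [line for line in patched if pattern not in line]
    return patched


# Hypothetical starter requirements, for illustration only
starter = ["kedro~=0.19.0", "kedro-telemetry>=0.3.1", "kedro-viz>=6.7.0"]
result = patch_requirements(starter)
print(result)
```

Because the deletion matches on a substring, any requirement line mentioning `kedro-viz` or `kedro-telemetry` is removed regardless of its version specifier.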
11 changes: 10 additions & 1 deletion CHANGELOG.md
@@ -2,6 +2,13 @@

 ## [Unreleased]

+## [0.8.0] - 2024-05-09
+
+- Added support for python 3.11 by [@em-pe](https://github.com/em-pe)
+- Added support for pydantic v2 and bumped minimal required pydantic version to `2.6.4` by [@froessler](https://github.com/fdroessler)
+- Added support for `metadata` argument in Azure ML datasets by [@tomasvanpottelbergh](https://github.com/tomasvanpottelbergh)
+- Fixed azureml-fsspec version update changed return type of `_infer_storage_options` and pinned fsspec version to patch only [@froessler](https://github.com/fdroessler)
+
 ## [0.7.0] - 2023-11-15

 - [💔 Breaking change] Renamed all `*DataSet` classes to `*Dataset` to follow Kedro's naming convention which will be introduced in 0.19.
@@ -86,7 +93,9 @@

 - Initial plugin release

-[Unreleased]: https://github.com/getindata/kedro-azureml/compare/0.7.0...HEAD
+[Unreleased]: https://github.com/getindata/kedro-azureml/compare/0.8.0...HEAD
+
+[0.8.0]: https://github.com/getindata/kedro-azureml/compare/0.7.0...0.8.0

 [0.7.0]: https://github.com/getindata/kedro-azureml/compare/0.6.0...0.7.0

2 changes: 1 addition & 1 deletion docs/source/03_quickstart.rst
@@ -110,7 +110,7 @@ For **code upload flow** (2.), use the following ``init`` command:
 If you want to pass data between nodes using the built-in Azure ML pipeline data passing, specify
 option ``--use-pipeline-data-passing`` instead of `-a` and `-c` options.

-Note that pipeline data passing feature is experimental 🧑‍🔬 See :doc:`04_data_assets` for more information about this.
+Note that pipeline data passing feature is experimental 🧑‍🔬 See :doc:`05_data_assets` for more information about this.

 Adjusting the Data Catalog
 --------------------------
2 changes: 1 addition & 1 deletion kedro_azureml/__init__.py
@@ -1,4 +1,4 @@
-__version__ = "0.7.0"
+__version__ = "0.8.0"

 import warnings

18 changes: 11 additions & 7 deletions kedro_azureml/config.py
@@ -2,7 +2,8 @@
 from typing import Dict, Optional, Type

 import yaml
-from pydantic import BaseModel, validator
+from pydantic import BaseModel, Field, field_validator
+from typing_extensions import Annotated

 from kedro_azureml.utils import update_dict

@@ -39,7 +40,8 @@ def _create_default_dict_with(
         default_value = (value := value or {}).get("__default__", default)
         return dict_cls(lambda: default_value, value)

-    @validator("compute", always=True)
+    @field_validator("compute")
+    @classmethod
     def _validate_compute(cls, value):
         return AzureMLConfig._create_default_dict_with(
             value, ComputeConfig(cluster_name="{cluster_name}")
@@ -49,11 +51,13 @@ def _validate_compute(cls, value):
     resource_group: str
     workspace_name: str
     experiment_name: str
-    compute: Optional[Dict[str, ComputeConfig]]
-    temporary_storage: Optional[AzureTempStorageConfig]
-    environment_name: Optional[str]
-    code_directory: Optional[str]
-    working_directory: Optional[str]
+    compute: Annotated[
+        Optional[Dict[str, ComputeConfig]], Field(validate_default=True)
+    ] = None
+    temporary_storage: Optional[AzureTempStorageConfig] = None
+    environment_name: Optional[str] = None
+    code_directory: Optional[str] = None
+    working_directory: Optional[str] = None
     pipeline_data_passing: Optional[PipelineDataPassingConfig] = None
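
For readers less familiar with the pydantic v1 to v2 migration in this file, here is a minimal, self-contained sketch of the same pattern. It assumes pydantic 2.x is installed; `ExampleConfig` and the simplified `ComputeConfig` are stand-ins, not the plugin's real classes. In v2, `Optional[...]` fields no longer default to `None` implicitly, and validators skip default values unless `validate_default=True` is set, which is presumably why the diff adds explicit `= None` defaults and wraps `compute` in `Annotated[..., Field(validate_default=True)]`.

```python
from collections import defaultdict
from typing import Annotated, Dict, Optional

from pydantic import BaseModel, Field, field_validator


class ComputeConfig(BaseModel):
    # Simplified stand-in for the plugin's ComputeConfig
    cluster_name: str


class ExampleConfig(BaseModel):
    # Hypothetical model mirroring the shape of AzureMLConfig
    experiment_name: str
    # validate_default=True makes the validator below run even when
    # `compute` is omitted and falls back to None
    compute: Annotated[
        Optional[Dict[str, ComputeConfig]], Field(validate_default=True)
    ] = None
    environment_name: Optional[str] = None

    @field_validator("compute")
    @classmethod
    def _fill_default_compute(cls, value):
        # Replace a missing mapping with one that falls back to "__default__"
        value = value or {}
        default = value.get(
            "__default__", ComputeConfig(cluster_name="{cluster_name}")
        )
        return defaultdict(lambda: default, value)


config = ExampleConfig(experiment_name="demo")
print(config.compute["any_node"].cluster_name)

custom = ExampleConfig(
    experiment_name="demo",
    compute={"__default__": {"cluster_name": "cpu"}},
)
print(custom.compute["train"].cluster_name)
```

Without `validate_default=True`, the first instance would keep `compute` as `None` and the lookup would fail, which is the behaviour that `always=True` guarded against in pydantic v1.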


14 changes: 11 additions & 3 deletions kedro_azureml/datasets/asset_dataset.py
@@ -81,6 +81,7 @@ def __init__(
         filepath_arg: str = "filepath",
         azureml_type: AzureMLDataAssetType = "uri_folder",
         version: Optional[Version] = None,
+        metadata: Dict[str, Any] = None,
     ):
         """
         azureml_dataset: Name of the AzureML file azureml_dataset.
@@ -90,8 +91,15 @@ def __init__(
         filepath_arg: Filepath arg on the wrapped dataset, defaults to `filepath`
         azureml_type: Either `uri_folder` or `uri_file`
         version: Version of the AzureML dataset to be used in kedro format.
+        metadata: Any arbitrary metadata.
+            This is ignored by Kedro, but may be consumed by users or external plugins.
         """
-        super().__init__(dataset=dataset, root_dir=root_dir, filepath_arg=filepath_arg)
+        super().__init__(
+            dataset=dataset,
+            root_dir=root_dir,
+            filepath_arg=filepath_arg,
+            metadata=metadata,
+        )

         self._azureml_dataset = azureml_dataset
         self._version = version
@@ -187,10 +195,10 @@ def _load(self) -> Any:
                 # relative (to storage account root) path of the file dataset on azure
                 # Note that path is converted to str for compatibility reasons with
                 # fsspec AbstractFileSystem expand_path function
-                path_on_azure = str(fs._infer_storage_options(azureml_ds.path)[-1])
+                path_on_azure = str(fs._infer_storage_options(azureml_ds.path)[1])
             elif azureml_ds.type == "uri_folder":
                 # relative (to storage account root) path of the folder dataset on azure
-                dataset_root_on_azure = fs._infer_storage_options(azureml_ds.path)[-1]
+                dataset_root_on_azure = fs._infer_storage_options(azureml_ds.path)[1]
                 # relative (to storage account root) path of the dataset in the folder on azure
                 path_on_azure = str(
                     Path(dataset_root_on_azure)
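
The switch from `[-1]` to `[1]` only matters if the returned sequence grows. The sketch below uses made-up tuples to show the difference; the actual return value of `_infer_storage_options` in azureml-fsspec is not shown in this diff, so the shapes and element names here are assumptions for illustration only.

```python
# Hypothetical return shapes; the real azureml-fsspec values may differ.
old_result = ("datastore-name", "folder/data.csv")
new_result = ("datastore-name", "folder/data.csv", "extra-element")

# With two elements, index -1 and index 1 pick the same item
assert old_result[-1] == old_result[1] == "folder/data.csv"

# Once a third element is appended, -1 silently picks the wrong item,
# while the explicit index 1 keeps pointing at the path
path_with_negative_index = new_result[-1]
path_with_positive_index = new_result[1]
print(path_with_negative_index)
print(path_with_positive_index)
```

An explicit positive index is still tied to a private method's return layout, which is consistent with the changelog entry about pinning the dependency to patch releases only.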
4 changes: 4 additions & 0 deletions kedro_azureml/datasets/pipeline_dataset.py
@@ -60,6 +60,7 @@ def __init__(
         dataset: Union[str, Type[AbstractDataset], Dict[str, Any]],
         root_dir: str = "data",
         filepath_arg: str = "filepath",
+        metadata: Dict[str, Any] = None,
     ):
         """Creates a new instance of ``AzureMLPipelineDataset``.
@@ -73,6 +74,8 @@ def __init__(
             filepath_arg: Underlying dataset initializer argument that will
                 set the filepath.
                 If unspecified, defaults to "filepath".
+            metadata: Any arbitrary metadata.
+                This is ignored by Kedro, but may be consumed by users or external plugins.
         Raises:
             DatasetError: If versioning is enabled for the underlying dataset.
@@ -83,6 +86,7 @@ def __init__(

         self.root_dir = root_dir
         self._filepath_arg = filepath_arg
+        self.metadata = metadata
         try:
             # Convert filepath to relative path
             self._dataset_config[self._filepath_arg] = str(
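
As a rough illustration of what the new `metadata` argument enables, the sketch below stores metadata on a dataset-like object and lets external code read it. `SketchDataset`, `describe` and the `owner` key are hypothetical; the real `AzureMLPipelineDataset` takes more arguments and wraps an actual Kedro dataset. The sketch annotates the parameter as `Optional[Dict[str, Any]]`, which describes a `None` default a little more precisely than the `Dict[str, Any] = None` used in the diff.

```python
from typing import Any, Dict, Optional


class SketchDataset:
    """Hypothetical stand-in for a dataset that accepts `metadata`."""

    def __init__(
        self,
        filepath: str,
        metadata: Optional[Dict[str, Any]] = None,
    ):
        self.filepath = filepath
        # Stored as-is; the dataset itself never interprets it
        self.metadata = metadata


def describe(dataset: SketchDataset) -> str:
    # An external consumer (for example a plugin) may read the metadata
    owner = (dataset.metadata or {}).get("owner", "unknown")
    return f"{dataset.filepath} (owner: {owner})"


print(describe(SketchDataset("data/model.pkl", metadata={"owner": "ml-team"})))
print(describe(SketchDataset("data/raw.csv")))
```

Consumers need to handle the `None` case themselves, as `describe` does here, because the attribute is simply passed through.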