Add download_files method for ArtifactVersion #2434

Merged
strickvl merged 17 commits into develop from feature/OSSK-420-pull-binary-artifact on Feb 15, 2024

Conversation

strickvl (Contributor) commented Feb 9, 2024

This pull request adds a new method called save_binary to the ArtifactVersionResponse class. The save_binary method allows users to save the binary data associated with an artifact version as a zip file. It also includes tests to ensure the functionality is working as expected.
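
As a rough illustration of the behavior described above (this is not the code from this PR), a helper along these lines could zip a directory and refuse to replace an existing file unless asked. The name `save_dir_as_zip` and its parameters are hypothetical; only the standard-library `zipfile` and `pathlib` calls are real.

```python
import zipfile
from pathlib import Path


def save_dir_as_zip(source_dir: str, zip_path: str, overwrite: bool = False) -> Path:
    """Zip every file under source_dir into zip_path (hypothetical helper)."""
    src = Path(source_dir)
    target = Path(zip_path)

    if target.suffix != ".zip":
        raise ValueError(f"Expected a path ending in .zip, got: {target}")
    if target.exists() and not overwrite:
        raise FileExistsError(
            f"{target} already exists; pass overwrite=True to replace it."
        )

    # Collect the files first so the archive itself is never picked up
    # if it happens to live inside the source directory.
    files = sorted(
        p for p in src.rglob("*")
        if p.is_file() and p.resolve() != target.resolve()
    )

    # Mode "w" truncates an existing archive, so overwriting needs no cleanup.
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
        for file in files:
            # Store member names relative to the source directory.
            archive.write(file, arcname=file.relative_to(src).as_posix())
    return target
```

Raising `FileExistsError` by default keeps an accidental second call from silently destroying an earlier download; whether the merged method behaves exactly this way should be checked against the ZenML source.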

Summary by CodeRabbit

  • New Features
    • Introduced a method to bypass materialization and directly download binary data associated with artifact versions as zip files.
    • Added documentation on how to use the .save_binary method for downloading binary data.
  • Documentation
    • Updated the user guide with a new section on managing artifacts, specifically on downloading binary data.
  • Tests
    • Implemented new integration tests for verifying the functionality of saving and loading binary artifacts.

@strickvl strickvl added the enhancement New feature or request label Feb 9, 2024
@strickvl strickvl requested a review from bcdurak February 9, 2024 17:19
@strickvl strickvl added the internal To filter out internal PRs and issues label Feb 9, 2024
@strickvl strickvl requested a review from htahir1 February 9, 2024 17:19
coderabbitai bot (Contributor) commented Feb 9, 2024

Important

Auto Review Skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository.

To trigger a single review, invoke the @coderabbitai review command.

Walkthrough

The recent updates introduce a feature for managing artifacts more efficiently by allowing users to bypass materialization and directly download binary data associated with specific artifact versions. This is facilitated through the addition of a new method and utility function, which handle the saving of artifacts as zip files. The changes also include necessary imports to support this functionality and integration tests to ensure its reliability, focusing on handling zip files, including overwriting existing files.

Changes

| File Path | Change Summary |
| --- | --- |
| docs/.../manage-artifacts.md | Added section on using .save_binary method to download artifact binary data as zip. |
| src/zenml/.../utils.py | Added save_artifact_binary_from_response function; imports for contextlib, zipfile, Path. |
| src/zenml/.../artifact_version.py | Added save_binary method for saving artifact binary data as a zip file with overwrite option. |
| tests/.../test_utils.py | Added tests for saving/loading binary artifacts, handling zip files, and overwrite functionality. |
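
The kind of check the test row describes can be sketched with the standard library alone. The helpers `write_zip`, `read_zip_contents`, and `check_round_trip_and_overwrite` below are hypothetical names for illustration and are not the tests added in this PR; they only show a round trip through a zip file and what replacing an existing archive looks like.

```python
import tempfile
import zipfile
from pathlib import Path
from typing import Dict


def write_zip(zip_path: Path, members: Dict[str, bytes]) -> None:
    """Create (or replace) an archive holding the given members."""
    # Mode "w" starts a fresh archive, discarding any previous contents.
    with zipfile.ZipFile(zip_path, "w") as archive:
        for name, data in members.items():
            archive.writestr(name, data)


def read_zip_contents(zip_path: Path) -> Dict[str, bytes]:
    """Return {member name: raw bytes} for every file in the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return {name: archive.read(name) for name in archive.namelist()}


def check_round_trip_and_overwrite(workdir: Path) -> None:
    """Write, read back, then overwrite and confirm the old data is gone."""
    zip_path = workdir / "artifact.zip"

    first = {"data.pkl": b"first payload"}
    write_zip(zip_path, first)
    assert read_zip_contents(zip_path) == first

    second = {"model.bin": b"second payload"}
    write_zip(zip_path, second)
    # Only the new member remains after the overwrite.
    assert read_zip_contents(zip_path) == second


with tempfile.TemporaryDirectory() as tmp:
    check_round_trip_and_overwrite(Path(tmp))
```

Running the checks inside a temporary directory keeps them from leaving files behind, which matters for integration tests that run repeatedly in CI.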


strickvl (Contributor, Author) commented Feb 9, 2024

@coderabbitai review

coderabbitai bot (Contributor) left a comment

Review Status

Actionable comments generated: 6

Configuration used: .coderabbit.yaml

Commits: Files that changed from the base of the PR and between 5c785c1 and 06bfacf.
Files selected for processing (4)
  • docs/book/user-guide/starter-guide/manage-artifacts.md (1 hunks)
  • src/zenml/artifacts/utils.py (2 hunks)
  • src/zenml/models/v2/core/artifact_version.py (1 hunks)
  • tests/integration/functional/artifacts/test_utils.py (3 hunks)
Additional comments: 2
src/zenml/artifacts/utils.py (2)
  • 20-20: Import zipfile is correctly used for handling zip file operations.
  • 21-21: Import Path is correctly used for handling file paths.

Review threads:
  • src/zenml/artifacts/utils.py (resolved)
  • src/zenml/artifacts/utils.py (outdated, resolved)
  • tests/integration/functional/artifacts/test_utils.py (outdated, resolved)
  • tests/integration/functional/artifacts/test_utils.py (outdated, resolved)
  • src/zenml/models/v2/core/artifact_version.py (outdated, resolved)
  • docs/book/user-guide/starter-guide/manage-artifacts.md (outdated, resolved)
@strickvl strickvl changed the title Add save_binary method Add save_binary method for ArtifactVersion` Feb 12, 2024
@strickvl strickvl changed the title Add save_binary method for ArtifactVersion` Add save_binary method for ArtifactVersion Feb 13, 2024
bcdurak (Contributor) left a comment

Left a few comments, otherwise, it looks good.

Review threads:
  • src/zenml/artifacts/utils.py (outdated, resolved)
  • tests/integration/functional/artifacts/test_utils.py (outdated, resolved)
@strickvl strickvl changed the title Add save_binary method for ArtifactVersion Add download_files method for ArtifactVersion Feb 15, 2024
@strickvl strickvl merged commit 72dbb31 into develop Feb 15, 2024
56 of 57 checks passed
@strickvl strickvl deleted the feature/OSSK-420-pull-binary-artifact branch February 15, 2024 22:03
avishniakov added a commit that referenced this pull request Feb 21, 2024
* Fix conditional statements in GitHub workflows (#2404)

* Fix conditional statements in GitHub workflows

* rename core CI flows

* slow CI check doesn't happen when draft

* Auto-update of Starter template

* fix double conditional

---------

Co-authored-by: GitHub Actions <actions@github.com>

* Ensure proper spacing in error messages (#2399)

* Ensure proper spacing in error messages

* update TOC (#2406)

---------

Co-authored-by: Alex Strick van Linschoten <strickvl@users.noreply.github.com>

* Fix hyperai markdown table (#2426)

* build: ⬆️ Upgrade min required google-cloud-aiplatform to 1.34.0 (#2428)

* Close code block left open in the docs (#2432)

* Fix docs

* wrong ticks!

* Simplify HF example and notify when cache is down (#2300)

* starter files for the new CI paradigm

* disable fast/slow ci on base branch

* disable core workflow

* Fast/slow CI core scaffold (#2274)

* give darglint check its own job

* fastCI

* add slowCI

* reenable fast CI

* remove comment

* add integration tests

* fix spellcheck context

* enable slow CI for testing

* remove unit test dependency

* fix dependency installations

* yamlfixed

* Comment-driven CI (#2275)

* test comment-driven approach

* delete unused test file

* slow CI is comment-driven

* restore CI

* conditionally respond to comments depending on team status

* add the whole team

* Update .github/workflows/ci-slow.yml

Co-authored-by: Andrei Vishniakov <31008759+avishniakov@users.noreply.github.com>

---------

Co-authored-by: Andrei Vishniakov <31008759+avishniakov@users.noreply.github.com>

* delete old CI

---------

Co-authored-by: Andrei Vishniakov <31008759+avishniakov@users.noreply.github.com>

* remove spellcheck from slow CI

* update spellcheck run conditions

* Add GitHub issue creation on cache miss

* test failing cache

* Update Minio endpoint in setup_environment action.yml

* Update minio-service endpoint in setup_environment action.yml

* Sharded integration tests for Ubuntu (#2286)

* add pytest-shard dev dependency

* update script for sharded testing

* add ubuntu sharding

* fix naming

* Use `pytest-split` to shard CI (#2296)

* add split test to action

* Update user authentication logic

* Fix bug in login functionality

* Refactor test coverage script

* Update excluded directories in pyproject.toml

* Update integration test script to include shard number

* Update integration test script to use matrix.shard

* Update caching key in setup_environment action.yml

* Update durations path in test-coverage-xml.sh

* Update cache key in setup_environment action.yml

* Auto-update of Starter template

* Fix formatting issue in setup_environment action.yml

* Refactor code to improve performance and readability

---------

Co-authored-by: GitHub Actions <actions@github.com>

* Update pyproject.toml

Co-authored-by: Alex Strick van Linschoten <strickvl@users.noreply.github.com>

* add extra final line

* make workflows use normal ubuntu)

* fix durations path and update docstring

---------

Co-authored-by: Safoine El Khabich <34200873+safoinme@users.noreply.github.com>
Co-authored-by: GitHub Actions <actions@github.com>

* Auto-update of Starter template

* Auto-update of NLP template

* Auto-update of E2E template

* add docker testing back in

* Auto-update of E2E template

* temporarily trigger slow CI

* revert to comment-driven CI

* run full slow CI

* CI as it should be

* pyyaml fix

* fix docker compose installation

* test

* test

* update templates test

* ubuntu-unit tests

* restore unit tests back to normal

* fix matrix for slow CI

* uncomment the conditional checks

* add input variable

* remove mac and windows for testing

* split out slow and fast integration testing

* naming fix

* confirm mac and windows ok"

* improve hf and neuralprofet example

* update the issue

* update TOC (#2406)

* Correct docstring in integration init file (#2408)

* Fixed precedence

* adding the new version to the migration tests (#2411)

* update js code for github cache miss

* update context to github

* add discord webhooks

* Add Discord webhook support for notifications

* allow fallback of cache failure

* ignore if weebhock fails to to many request

* Add PYTORCH integration to DockerSettings

---------

Co-authored-by: Alex Strick van Linschoten <stricksubscriptions@fastmail.fm>
Co-authored-by: Alex Strick van Linschoten <strickvl@users.noreply.github.com>
Co-authored-by: Andrei Vishniakov <31008759+avishniakov@users.noreply.github.com>
Co-authored-by: GitHub Actions <actions@github.com>
Co-authored-by: Christian Versloot <c.versloot@infoplaza.nl>
Co-authored-by: Hamza Tahir <htahir111@gmail.com>
Co-authored-by: Barış Can Durak <36421093+bcdurak@users.noreply.github.com>

* Adding the latest version id and name to the artifact response (#2430)

* update TOC (#2406)

* Correct docstring in integration init file (#2408)

* Fixed precedence

* adding the new version to the migration tests (#2411)

* adding latest version name and id to artifact response

* removed optional column from the conftest

---------

Co-authored-by: Alex Strick van Linschoten <strickvl@users.noreply.github.com>
Co-authored-by: Christian Versloot <c.versloot@infoplaza.nl>
Co-authored-by: Hamza Tahir <htahir111@gmail.com>

* Adding the ID of the producer pipeline run to artifact versions (#2431)

* adding producer pipeline run id to artifact versions

* reverting one of the changes

* fixing type

* Add vulnerability notice to README (#2437)

* Add security vulnerability notice to README

* add CVE ID

* Allow more recent `adlfs` and `s3fs` versions (#2402)

* bump azure integration

* bump s3

* Add new property for filtering service account events (#2405)

* add new property for filtering service account activities

* Auto-update of Starter template

---------

Co-authored-by: GitHub Actions <actions@github.com>

* Add `download_files` method for `ArtifactVersion` (#2434)

* add save_binary method

* Fix file overwrite issue in save_artifact_binary_from_response() and improve error handling

* refactor

* tests ofc

* add docs

* linting

* mypy fixes

* ruff fix

* coderabbit suggestions

* missing docstring

* docstring fix

* Update artifact method name from save_binary to download_binary

* more renaming (save -> download)

* final rename (binary -> files)

* update settings syntax

* Fixing `update_model`s and revert #2402 (#2440)

* fixing update models

* reverting the update model changes

* linting

* linting

* revert #2402

* revert adlfs changes

---------

Co-authored-by: Andrei Vishniakov <31008759+avishniakov@users.noreply.github.com>
Co-authored-by: Alex Strick van Linschoten <stricksubscriptions@fastmail.fm>

* Prepare release 0.55.3 (#2445)

* alembic migration and bump version

* release notes

* add `save_models_to_registry` to CLI

---------

Co-authored-by: Alex Strick van Linschoten <strickvl@users.noreply.github.com>
Co-authored-by: GitHub Actions <actions@github.com>
Co-authored-by: Christian Versloot <c.versloot@infoplaza.nl>
Co-authored-by: François SERRA <francois.serra@adeo.com>
Co-authored-by: jlopezpena <jlopezpena@users.noreply.github.com>
Co-authored-by: Safoine El Khabich <34200873+safoinme@users.noreply.github.com>
Co-authored-by: Alex Strick van Linschoten <stricksubscriptions@fastmail.fm>
Co-authored-by: Hamza Tahir <htahir111@gmail.com>
Co-authored-by: Barış Can Durak <36421093+bcdurak@users.noreply.github.com>
Co-authored-by: Jayesh Sharma <wjayesh@outlook.com>
adtygan pushed a commit to adtygan/zenml that referenced this pull request Mar 21, 2024
adtygan pushed a commit to adtygan/zenml that referenced this pull request Mar 21, 2024
…l-io#2447)

Labels: enhancement (New feature or request), internal (To filter out internal PRs and issues), run-slow-ci