New core architecture #305

schustmi · 2022-01-10T10:25:34Z

Pre-requisites

Please ensure you have done the following:

I have read the CONTRIBUTING.md document.
If my change requires a change to the documentation, I have updated the documentation accordingly.
I have added tests to cover my changes.

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)

Describe changes

This PR implements fundamental changes to the core architecture to solve some of the issues we previously had and hopefully provides a more extensible design to support quicker implementations of different stack components/integrations.

This PR modifies lots of files, but the main changes are the following:

Refactor the Repository/Stack/StackComponent architecture (see src/zenml/repository.py and src/zenml/stack/*.py)
Update the existing components (e.g. GCPArtifactStore) to work with these new classes
Update the relevant CLI code to work with these new classes (this includes combining the duplicated CLI code for all previous stack components into src/zenml/cli/stack_components.py)
Write tests for all of the above
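
As a rough illustration of the kind of Stack/StackComponent relationship described in the list above, here is a minimal sketch. It is not the code from this PR: the names `ComponentType`, `Component` and `Stack`, and the rule that a stack needs exactly one component of each type, are assumptions made only for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class ComponentType(Enum):
    """Hypothetical set of component types a stack is built from."""

    ORCHESTRATOR = "orchestrator"
    METADATA_STORE = "metadata_store"
    ARTIFACT_STORE = "artifact_store"


@dataclass(frozen=True)
class Component:
    """Hypothetical stand-in for a stack component."""

    name: str
    type: ComponentType


class Stack:
    """Hypothetical stack that requires one component per type."""

    def __init__(
        self, name: str, components: Dict[ComponentType, Component]
    ) -> None:
        # Reject stacks that do not provide every component type.
        missing = [t.value for t in ComponentType if t not in components]
        if missing:
            raise ValueError(
                f"Stack '{name}' is missing components: {missing}"
            )
        self.name = name
        self.components = dict(components)


local_stack = Stack(
    "local",
    {t: Component(name=f"local_{t.value}", type=t) for t in ComponentType},
)
```

Keeping the components in a dictionary keyed by type is one way a new integration could plug in without touching the stack class itself.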

Most of the remaining changes are:

small updates to the imports to use the new repository and stack classes
deleting old and unused files/code

src/zenml/enums.py

bcdurak · 2022-01-10T14:12:43Z

Looking great so far. So many big and significant changes. Really looking forward to the final version.

htahir1

OK, I tried my best but could only come up with two comments really. Therefore, LGTM!!! What an amazing branch and effort!

htahir1 · 2022-01-19T14:12:40Z

src/zenml/config/global_config.py

+            # set back the old value as we don't want to permanently store
+            # the environment variable value here
+            super().__setattr__(key, value)
+            return return_value


OK, that's interesting how you did it here. I hope there are tests for this in the PR below, but it's a nice workaround for now!

There is now :P
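
For readers who only see the four diff lines above, the pattern under discussion can be sketched roughly as follows: on attribute read, an environment variable value is temporarily assigned so that the normal assignment validation converts it, the converted value is read back, and the stored value is then restored. This is a simplified stand-in, not the actual `GlobalConfig` code; `ENV_PREFIX`, `CoercingBase` and `EnvAwareConfig` are hypothetical names, and the type coercion in `__setattr__` only imitates validation on assignment.

```python
import os

ENV_PREFIX = "DEMO_CONFIG_"  # hypothetical prefix, chosen for this sketch


class CoercingBase:
    """Stand-in for a base class that validates values on assignment."""

    def __setattr__(self, key, value):
        current = self.__dict__.get(key)
        if current is not None:
            # Coerce to the type of the existing value; may raise ValueError.
            value = type(current)(value)
        super().__setattr__(key, value)


class EnvAwareConfig(CoercingBase):
    """Reads attributes from environment variables without storing them."""

    def __init__(self, retries: int = 3) -> None:
        self.retries = retries

    def __getattribute__(self, key):
        value = super().__getattribute__(key)
        if key.startswith("_"):
            return value
        env_value = os.environ.get(ENV_PREFIX + key.upper())
        if env_value is None:
            return value
        try:
            # Assign the raw string so the base class converts it.
            super().__setattr__(key, env_value)
        except (TypeError, ValueError):
            return value
        return_value = super().__getattribute__(key)
        # Set back the old value so the override is never stored permanently.
        super().__setattr__(key, value)
        return return_value


config = EnvAwareConfig(retries=3)
os.environ[ENV_PREFIX + "RETRIES"] = "7"
overridden = config.retries           # converted from the environment variable
stored = config.__dict__["retries"]   # the stored value is untouched
del os.environ[ENV_PREFIX + "RETRIES"]
```

The temporary assignment is what makes the stored configuration safe to write back to disk: the environment value is only visible while the variable is set.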

htahir1 · 2022-01-19T14:13:05Z

src/zenml/config/global_config.py

+        options from a legacy config file or returns an empty dictionary.
+        """
+        legacy_config_file = os.path.join(
+            GlobalConfig.config_directory(), ".zenglobal.json"


".zenglobal.json" -> probably best as a constant somewhere
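
A minimal sketch of what that suggestion could look like, assuming a module-level constant and a small loader function. The names `LEGACY_CONFIG_FILE_NAME` and `load_legacy_config` are hypothetical and not taken from the PR; only the file name and the "empty dictionary" fallback come from the diff above.

```python
import json
import os
from typing import Any, Dict

# Hypothetical constant name; the file name itself is the one from the diff.
LEGACY_CONFIG_FILE_NAME = ".zenglobal.json"


def load_legacy_config(config_directory: str) -> Dict[str, Any]:
    """Returns options from a legacy config file or an empty dictionary."""
    legacy_config_file = os.path.join(
        config_directory, LEGACY_CONFIG_FILE_NAME
    )
    if not os.path.exists(legacy_config_file):
        return {}
    with open(legacy_config_file, encoding="utf-8") as f:
        return json.load(f)
```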

alex-zenml · 2022-01-19T15:28:53Z

@schustmi I guess it's a breaking change so people will realise, but I think people will have to delete their .zen folder and re-initialise their repositories. Am I mistaken in that? Is there any way they can hook up their old metadata stores with this new paradigm?

(When I was trying some CLI commands, it said that there was no orchestrator registered etc, which makes sense but might be unexpected for anyone upgrading.)

alex-zenml

No real comments of substance from me (aside from some tiny nits here and there). I appreciated the ZenShare last week which gave me some context for reading around in the code.

I love the extensive tests, too, and yeah, in general seems like this PR will set us up really well for the future.

alex-zenml · 2022-01-19T15:41:04Z

src/zenml/repository.py

+                recursively searching in the parent directories of the current
+                working directory.
+
+        Raises:


Mainly for me to understand this: is it normal for a method's docstring to state that it could raise an error if that error is raised in a separate function/method? In this case it's Repository.find_repository() that's going to raise the error if it needs raising.

My personal opinion, but I feel that, especially for a user-facing API, it's important to mention as many potential exceptions as possible (even ones coming from calls to other functions) so users know what to catch in their code.
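
To make the convention concrete, here is a small sketch in which the public function lists an exception in its `Raises:` section even though the exception is raised by a helper it calls. The functions and the `RepositoryNotFoundError` class are hypothetical stand-ins for this example, not the actual `Repository` implementation.

```python
import os


class RepositoryNotFoundError(Exception):
    """Raised when no repository directory can be located."""


def find_repository(start: str) -> str:
    """Walks up from `start` until a directory containing `.zen` is found.

    Raises:
        RepositoryNotFoundError: If no parent directory contains `.zen`.
    """
    current = os.path.abspath(start)
    while True:
        if os.path.isdir(os.path.join(current, ".zen")):
            return current
        parent = os.path.dirname(current)
        if parent == current:
            # Reached the filesystem root without finding a repository.
            raise RepositoryNotFoundError(
                f"No .zen directory found above {start}"
            )
        current = parent


def open_repository(start: str) -> str:
    """Returns the repository root for the given starting directory.

    Raises:
        RepositoryNotFoundError: Propagated from `find_repository` if no
            repository exists; documented here so callers of this
            function know what to catch.
    """
    return find_repository(start)
```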

src/zenml/stack/stack_validator.py

src/zenml/cli/stack_components.py

alex-zenml · 2022-01-19T16:03:17Z

tests/unit/test_repository.py

+    """Creates a local stack with components with the given names. If the
+    names are not given, a random string is used instead."""
+
+    def _random_name():


Elsewhere in the repository we use hypothesis for doing things like this. It's potentially worth the extra effort to integrate that in since it might raise some weirdnesses that we'd otherwise not expect.

How would you use hypothesis in this case when it is not an actual test function but a helper function?

You'd get rid of the helper function and instead use the hypothesis @given decorator (I think) on the test and pass whatever random string(s) were generated by hypothesis. You'd need to make sure to pass in a min_size=1 along with the text() strategy.

I tried, but it doesn't work in combination with our clean repo fixture:

E   Function-scoped fixtures are not reset between examples generated by
E   `@given(...)`, which is often surprising and can cause subtle test bugs.
E
E   If you were expecting the fixture to run separately for each generated example,
E   then unfortunately you will need to find a different way to achieve your goal
E   (e.g. using a similar context manager instead of a fixture).

Ah yes, I'd forgotten about that. Do you hit a test strategy scope error?

The error was called FailedHealthCheck

Not sure if this applies in this case, but you can change the scope with @pytest.fixture(scope="module")

Unfortunately not, this fixture creates a clean repo for one specific test so has to stay with function scope. Anyway I think the random strings in this case don't really matter as it's just for the names, but I agree we should definitely use this in cases where the strings are more relevant!
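
For reference, the alternative hinted at in the error message (a context manager instead of a function-scoped fixture) could look roughly like the sketch below. It uses only the standard library; `clean_directory` and `store_and_read_name` are hypothetical helpers, not the clean repo fixture from this PR.

```python
import contextlib
import os
import tempfile
from typing import Iterator


@contextlib.contextmanager
def clean_directory() -> Iterator[str]:
    """Runs the enclosed block inside a fresh temporary working directory."""
    original = os.getcwd()
    with tempfile.TemporaryDirectory() as tmp:
        os.chdir(tmp)
        try:
            yield tmp
        finally:
            # Leave the directory before it is deleted.
            os.chdir(original)


def store_and_read_name(name: str) -> str:
    """Writes a component name into a clean directory and reads it back."""
    with clean_directory() as directory:
        path = os.path.join(directory, "component_name.txt")
        with open(path, "w", encoding="utf-8", newline="") as f:
            f.write(name)
        with open(path, encoding="utf-8", newline="") as f:
            return f.read()
```

Because the setup happens inside the function body, a test decorated with hypothesis's `@given` that calls such a helper would get a fresh directory for every generated example, which a function-scoped fixture does not provide.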

schustmi · 2022-01-19T16:47:28Z

@alex-zenml Maybe we could implement some logic for migrating this specific release, but I feel like the effort isn't worth it. We've asked users to delete and recreate their .zen folder before and I feel like that's the way to go with this release as well

alex-zenml · 2022-01-19T16:51:10Z

@schustmi agreed. I guess I'm only wondering about the people with several hundred runs under their belt. As long as we're clear that this is a breaking change in the release notes it's probably fine.

bcdurak · 2022-01-19T17:02:16Z

Such an amazing effort! Thanks to the ZenShare, it was relatively easy to comprehend what was going on with the PR and it looks solid from almost all angles.

schustmi added 3 commits January 10, 2022 11:18

First draft of new core architecture implementation

270a177

Use safe loading when reading yaml files

a4eee1f

Refactor stack component CLI

9f50587

schustmi requested review from strickvl, stefannica, htahir1, jwwwb, bcdurak and AlexejPenner January 10, 2022 10:25

github-actions bot added the "internal" label (To filter out internal PRs and issues) Jan 10, 2022

schustmi added 3 commits January 10, 2022 11:28

Remove some old files

22b2cde

Convert existing stack components to new api

e562df6

Use new repository in kubeflow orchestrator

ebffe85

AlexejPenner reviewed Jan 10, 2022

src/zenml/enums.py

[ci skip] Update some methods of stack CLI

3fb787f

schustmi added 14 commits January 11, 2022 09:24

Improved error message if stack component class is not registered

d5c5eb6

BaseMetadataStore is MLMD Store for now

af4724a

Use new repository in source utils

43da81d

Hide repository config and expose immutable values in public api

54ea8aa

Update names for local deployment methods

129f344

Update orchestrators for new stack component API

c684c40

Add post execution entrypoint method to new repository

9ebc957

Add missing quotes in type annotations

e9d51f5

[ci skip] Fix import cycle, add nicer represenation for stack components

37ddda3

Implement new up/down logic in kubeflow orchestrator

f9251bc

Implement new up/down logic in airflow orchestrator

5de609a

Update orchestrator up/down logic

f0d5d09

Fix missing logger in base metadata store

19a748e

Implement pipeline deployment using the new stack

e2eb153

schustmi added 15 commits January 14, 2022 15:06

Fix stack deprovisioning logic

964f374

[ci skip] Delete old core tests

6785dad

Fix mypy issue

9090b33

Implement new version of global config

892c8fd

Remove old core files

fa410cc

Remove duplicate code

16df836

Move some old test files

8a7b8a1

Fix cli init test

4d02582

Fix CLI analytics tests

0ada066

Update example/doc import of repository

ba8ea62

Merge branch 'develop' into michael/ENG-23-core-architecture

468f1d4

Bump pydantic version

302597c

Make sure kubeflow is installed for tests

f468576

Mock global config directory

252fb75

Update integration tests for new repo api

4b7d909

schustmi marked this pull request as ready for review January 17, 2022 18:13

Update global config superclass

7f3c533

schustmi requested a review from AlexejPenner January 18, 2022 09:13

Merge branch 'develop' into michael/ENG-23-core-architecture

7691c25

htahir1 approved these changes Jan 19, 2022

alex-zenml reviewed Jan 19, 2022

Some formatting fixes from PR

115f441

Test for global config environment variable overwriting

8d3d597

bcdurak approved these changes Jan 19, 2022

schustmi merged commit ea7c613 into develop Jan 20, 2022

schustmi deleted the michael/ENG-23-core-architecture branch January 20, 2022 10:01
