Navigation Menu

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New core architecture #305

Merged
merged 85 commits into from Jan 20, 2022
Merged

New core architecture #305

merged 85 commits into from Jan 20, 2022

Conversation

schustmi
Copy link
Collaborator

@schustmi schustmi commented Jan 10, 2022

Pre-requisites

Please ensure you have done the following:

  • I have read the CONTRIBUTING.md document.
  • If my change requires a change to the documentation, I have updated the documentation accordingly.
  • I have added tests to cover my changes.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Describe changes

This PR implements fundamental changes to the core architecture to solve some of the issues we previously had and hopefully provides a more extensible design to support quicker implementations of different stack components/integrations.

This PR modifies lots of files, but the main changes are the following:

  • Refactor the Repository/Stack/StackComponent architecture (see src/zenml/repository.py and src/zenml/stack/*.py)
  • Update the existing components (e.g. GCPArtifactStore) to work with these new classes
  • Update the relevant CLI code to work with these new classes (this includes combining the duplicated CLI code for all previous stack components into src/zenml/cli/stack_components.py)
  • Write tests for all of the above

Most of the remaining changes are:

  • small updates to the imports to use the new repository and stack classes
  • deleting old and unused files/code

@github-actions github-actions bot added the internal To filter out internal PRs and issues label Jan 10, 2022
@bcdurak
Copy link
Contributor

bcdurak commented Jan 10, 2022

Looking great so far. So many big and significant changes. Really looking forward to the final version.

@schustmi schustmi marked this pull request as ready for review January 17, 2022 18:13
Copy link
Contributor

@htahir1 htahir1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I tried my best but could only come with two comments really. Therefore, LGTM!!! What an amazing branch and effort!

# set back the old value as we don't want to permanently store
# the environment variable value here
super().__setattr__(key, value)
return return_value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok thats interesting how you did it here. I hope there are tests for this in the PR below but its a nice workaround for now!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is now :P

options from a legacy config file or returns an empty dictionary.
"""
legacy_config_file = os.path.join(
GlobalConfig.config_directory(), ".zenglobal.json"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

".zenglobal.json" -> probably best as a constant somewhere

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@alex-zenml
Copy link
Contributor

@schustmi I guess it's a breaking change so people will realise, but I think people will have to delete their .zen folder and re-initialise their repositories. Am I mistaken in that? Is there any way they can hook up their old metadata stores with this new paradigm?

(When I was trying some CLI commands, it said that there was no orchestrator registered etc, which makes sense but might be unexpected for anyone upgrading.)

Copy link
Contributor

@alex-zenml alex-zenml left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No real comments of substance from me (aside from some tiny nits here and there). I appreciated the ZenShare last week which gave me some context for reading around in the code.

I love the extensive tests, too, and yeah, in general seems like this PR will set us up really well for the future.

recursively searching in the parent directories of the current
working directory.

Raises:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mainly for me to understand this: is it normal for a method's docstring to state that it could raise an error if that error is raised in a separate function/method? In this case it's Repository.find_repository() that's going to raise the error if it needs raising.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My personal option, but I feel especially for user-facing API it's important to mention as many potential exceptions (even coming from calls to other functions) as possible so they now what to catch in their code.

src/zenml/stack/stack_validator.py Outdated Show resolved Hide resolved
src/zenml/cli/stack_components.py Outdated Show resolved Hide resolved
"""Creates a local stack with components with the given names. If the
names are not given, a random string is used instead."""

def _random_name():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Elsewhere in the repository we use hypothesis for doing things like this. It's potentially worth the extra effort to integrate that in since it might raise some weirdnesses that we'd otherwise not expect.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would you use hypothesis in this case when it is not an actual test function but a helper function?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'd get rid of the helper function and instead use the hypothesis @given decorator (I think) on the test and pass whatever random string(s) were generated by hypothesis. You'd need to make sure to pass in a min_size=1 along with the text() strategy.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried but it doesn't work in combination with our clean repo fixture

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E                               
E                               Function-scoped fixtures are not reset between examples generated by
E                               `@given(...)`, which is often surprising and can cause subtle test bugs.
E                               
E                               If you were expecting the fixture to run separately for each generated example,
E                               then unfortunately you will need to find a different way to achieve your goal
E                               (e.g. using a similar context manager instead of a fixture).```

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, I'd forgotten about that. Do you hit a test strategy scope error?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error was called FailedHealthCheck

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this applies in this case, but you can change the scope with @pytest.fixture(scope="module")

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately not, this fixture creates a clean repo for one specific test so has to stay with function scope. Anyway I think the random strings in this case don't really matter as it's just for the names, but I agree we should definitely use this in cases where the strings are more relevant!

@schustmi
Copy link
Collaborator Author

@alex-zenml Maybe we could implement some logic for migrating this specific release, but I feel like the effort isn't worth it. We've asked users to delete and recreate their .zen folder before and I feel like that's the way to go with this release as well

@alex-zenml
Copy link
Contributor

@schustmi agreed. I guess I'm only wondering about the people with several hundred runs in their belt. As long as we're clear that this is a breaking change in the release notes it's probably fine.

@bcdurak
Copy link
Contributor

bcdurak commented Jan 19, 2022

Such an amazing effort! Thanks to the ZenShare, it was relatively easy to comprehend what was going on with the PR and it looks solid almost from all angles.

@schustmi schustmi merged commit ea7c613 into develop Jan 20, 2022
@schustmi schustmi deleted the michael/ENG-23-core-architecture branch January 20, 2022 10:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
internal To filter out internal PRs and issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants