New core architecture #305

Merged — 85 commits merged into develop from michael/ENG-23-core-architecture on Jan 20, 2022
270a177
First draft of new core architecture implementation
schustmi Jan 10, 2022
a4eee1f
Use safe loading when reading yaml files
schustmi Jan 10, 2022
9f50587
Refactor stack component CLI
schustmi Jan 10, 2022
22b2cde
Remove some old files
schustmi Jan 10, 2022
e562df6
Convert existing stack components to new api
schustmi Jan 10, 2022
ebffe85
Use new repository in kubeflow orchestrator
schustmi Jan 10, 2022
3fb787f
[ci skip] Update some methods of stack CLI
schustmi Jan 10, 2022
d5c5eb6
Improved error message if stack component class is not registered
schustmi Jan 11, 2022
af4724a
BaseMetadataStore is MLMD Store for now
schustmi Jan 11, 2022
43da81d
Use new repository in source utils
schustmi Jan 11, 2022
54ea8aa
Hide repository config and expose immutable values in public api
schustmi Jan 11, 2022
129f344
Update names for local deployment methods
schustmi Jan 11, 2022
c684c40
Update orchestrators for new stack component API
schustmi Jan 11, 2022
9ebc957
Add post execution entrypoint method to new repository
schustmi Jan 11, 2022
e9d51f5
Add missing quotes in type annotations
schustmi Jan 11, 2022
37ddda3
[ci skip] Fix import cycle, add nicer representation for stack components
schustmi Jan 11, 2022
f9251bc
Implement new up/down logic in kubeflow orchestrator
schustmi Jan 11, 2022
5de609a
Implement new up/down logic in airflow orchestrator
schustmi Jan 11, 2022
f0d5d09
Update orchestrator up/down logic
schustmi Jan 11, 2022
19a748e
Fix missing logger in base metadata store
schustmi Jan 11, 2022
e2eb153
Implement pipeline deployment using the new stack
schustmi Jan 11, 2022
1fd31eb
[ci skip] Improve stack component down CLI command
schustmi Jan 11, 2022
8f58378
Implement provisioning methods in stack
schustmi Jan 11, 2022
a6c9ec4
Updating stack cli commands
schustmi Jan 11, 2022
01f5583
[ci skip] Fix typo in error message
schustmi Jan 11, 2022
0a02c9a
[ci skip] Fix most mypy issues
schustmi Jan 11, 2022
60ae816
Merge branch 'develop' into michael/ENG-23-core-architecture
schustmi Jan 12, 2022
df64b06
Write missing docstrings
schustmi Jan 12, 2022
85dbfc7
[ci skip] Fix remaining mypy issues
schustmi Jan 12, 2022
008beba
Use new repository for conftest fixtures
schustmi Jan 12, 2022
7d69685
Implement metadata store tests
schustmi Jan 12, 2022
b96e504
Implement local artifact store tests
schustmi Jan 12, 2022
15dfd97
Implement local orchestrator tests
schustmi Jan 12, 2022
4d559d8
Implement runtime configuration tests
schustmi Jan 12, 2022
8205485
Remove empty base stack test files
schustmi Jan 12, 2022
37efb23
Implement stack component tests
schustmi Jan 12, 2022
ade09b7
[ci skip] Implement stack validator tests
schustmi Jan 12, 2022
81ab4c4
Improve todo description
schustmi Jan 12, 2022
1b7e1ce
[ci skip] Implement stack component class registry tests
schustmi Jan 12, 2022
c9e71c3
[ci skip] Implement repository tests
schustmi Jan 13, 2022
17b2af9
[ci skip] Prevent deregistering active stack
schustmi Jan 13, 2022
c86c73c
Update tests for GCP artifact store
schustmi Jan 13, 2022
eb8a56b
Add tests for airflow orchestrator
schustmi Jan 13, 2022
d815313
Add tests for kubeflow metadata store
schustmi Jan 13, 2022
5225f27
Add tests for kubeflow orchestrator
schustmi Jan 13, 2022
905267d
[ci skip] Add tests for base container registry
schustmi Jan 13, 2022
fb37606
Remove old cli tests
schustmi Jan 13, 2022
c654c2c
Some cleanup
schustmi Jan 13, 2022
7b44801
Improve imports
schustmi Jan 13, 2022
e3b38f6
Delete old stack files
schustmi Jan 13, 2022
a58a67f
Move repo and runtime configuration to root directory
schustmi Jan 13, 2022
d8f2ef7
Only import files when necessary
schustmi Jan 13, 2022
5d5cd2b
Move stack files into correct directory
schustmi Jan 13, 2022
4376c39
Move test files to correct paths
schustmi Jan 13, 2022
84028cf
Merge branch 'develop' into michael/ENG-23-core-architecture
schustmi Jan 13, 2022
0f323c5
[ci skip] Replace broken license link
schustmi Jan 13, 2022
3daa58b
[ci skip] Implement some stack tests
schustmi Jan 13, 2022
94fde1a
Move import of stack component class registry inside CLI method to im…
schustmi Jan 13, 2022
06b7701
Fix import order
schustmi Jan 14, 2022
ea77acf
[ci skip] Prevent deregistering components that are part of a registe…
schustmi Jan 14, 2022
b7572a4
Fix kubeflow container entrypoint
schustmi Jan 14, 2022
3239c08
Revert change to pipeline run name
schustmi Jan 14, 2022
b5aa7f7
Make stack requirements a set
schustmi Jan 14, 2022
1734444
[ci skip] Implement more stack tests, improve existing stack componen…
schustmi Jan 14, 2022
0f1b4e7
Fix CLI init test
schustmi Jan 14, 2022
7b686a3
Implement remaining stack tests
schustmi Jan 14, 2022
964f374
Fix stack deprovisioning logic
schustmi Jan 14, 2022
6785dad
[ci skip] Delete old core tests
schustmi Jan 14, 2022
9090b33
Fix mypy issue
schustmi Jan 14, 2022
892c8fd
Implement new version of global config
schustmi Jan 17, 2022
fa410cc
Remove old core files
schustmi Jan 17, 2022
16df836
Remove duplicate code
schustmi Jan 17, 2022
8a7b8a1
Move some old test files
schustmi Jan 17, 2022
4d02582
Fix cli init test
schustmi Jan 17, 2022
0ada066
Fix CLI analytics tests
schustmi Jan 17, 2022
ba8ea62
Update example/doc import of repository
schustmi Jan 17, 2022
468f1d4
Merge branch 'develop' into michael/ENG-23-core-architecture
schustmi Jan 17, 2022
302597c
Bump pydantic version
schustmi Jan 17, 2022
f468576
Make sure kubeflow is installed for tests
schustmi Jan 17, 2022
252fb75
Mock global config directory
schustmi Jan 17, 2022
4b7d909
Update integration tests for new repo api
schustmi Jan 17, 2022
7f3c533
Update global config superclass
schustmi Jan 18, 2022
7691c25
Merge branch 'develop' into michael/ENG-23-core-architecture
schustmi Jan 19, 2022
115f441
Some formatting fixes from PR
schustmi Jan 19, 2022
8d3d597
Test for global config environment variable overwriting
schustmi Jan 19, 2022
1 change: 1 addition & 0 deletions .github/workflows/main.yml
@@ -67,6 +67,7 @@ jobs:
python -m poetry run zenml integration install pytorch -f
python -m poetry run zenml integration install mlflow -f
python -m poetry run zenml integration install gcp -f
python -m poetry run zenml integration install kubeflow -f
python -m poetry run pip install click~=8.0.3

- name: Lint
1 change: 1 addition & 0 deletions .github/workflows/pull_request.yml
@@ -66,6 +66,7 @@ jobs:
python -m poetry run zenml integration install pytorch -f
python -m poetry run zenml integration install mlflow -f
python -m poetry run zenml integration install gcp -f
python -m poetry run zenml integration install kubeflow -f
python -m poetry run pip install click~=8.0.3

- name: Lint
2 changes: 1 addition & 1 deletion docs/book/guides/class-based-api/create-a-step.md
@@ -98,7 +98,7 @@ Step `PandasDatasource` has finished in 0.016s.
You can add the following code to fetch the pipeline:

```python
from zenml.core.repo import Repository
from zenml.repository import Repository

repo = Repository()
p = repo.get_pipeline(pipeline_name="Chapter1Pipeline")
2 changes: 1 addition & 1 deletion docs/book/guides/class-based-api/split-and-preprocess.md
@@ -233,7 +233,7 @@ Step `SklearnStandardScaler` has finished in 0.151s.
You can add the following code to fetch the pipeline:

```python
from zenml.core.repo import Repository
from zenml.repository import Repository

repo = Repository()
p = repo.get_pipeline(pipeline_name="Chapter2Pipeline")
2 changes: 1 addition & 1 deletion docs/book/guides/class-based-api/train-and-evaluate.md
@@ -228,7 +228,7 @@ Step `SklearnEvaluator` has finished in 0.289s.
If you add the following code to fetch the pipeline:

```python
from zenml.core.repo import Repository
from zenml.repository import Repository

repo = Repository()
p = repo.get_pipeline(pipeline_name="mnist_pipeline")
2 changes: 1 addition & 1 deletion docs/book/guides/functional-api/caching.md
@@ -104,7 +104,7 @@ straight to the new trainer and evaluator.
If you add the following code to fetch the pipeline:

```python
from zenml.core.repo import Repository
from zenml.repository import Repository

repo = Repository()
p = repo.get_pipeline(pipeline_name="mnist_pipeline")
2 changes: 1 addition & 1 deletion docs/book/guides/functional-api/create-a-step.md
@@ -77,7 +77,7 @@ Step `importer_mnist` has finished in 1.726s.
You can add the following code to fetch the pipeline:

```python
from zenml.core.repo import Repository
from zenml.repository import Repository

repo = Repository()
p = repo.get_pipeline(pipeline_name="load_mnist_pipeline")
2 changes: 1 addition & 1 deletion docs/book/guides/functional-api/import-dynamic-data.md
@@ -88,7 +88,7 @@ Even if our data originally lives in an external API, we have now downloaded it
this pipeline. So we can fetch it and inspect it:

```python
from zenml.core.repo import Repository
from zenml.repository import Repository

repo = Repository()
p = repo.get_pipeline(pipeline_name="mnist_pipeline")
2 changes: 1 addition & 1 deletion docs/book/guides/functional-api/normalize-data.md
@@ -64,7 +64,7 @@ Step `normalize_mnist` has finished in 1.848s.
You can add the following code to fetch the pipeline:

```python
from zenml.core.repo import Repository
from zenml.repository import Repository

repo = Repository()
p = repo.get_pipeline(pipeline_name="load_and_normalize_pipeline")
2 changes: 1 addition & 1 deletion docs/book/guides/functional-api/train-and-evaluate.md
@@ -151,7 +151,7 @@ Step `tf_evaluator` has started.
If you add the following code to fetch the pipeline:

```python
from zenml.core.repo import Repository
from zenml.repository import Repository

repo = Repository()
p = repo.get_pipeline(pipeline_name="mnist_pipeline")
@@ -19,7 +19,7 @@ times.
Once the pipeline run is finished we can easily access this specific run during our post-execution workflow:

```python
from zenml.core.repo import Repository
from zenml.repository import Repository

repo = Repository()
pipeline = repo.get_pipeline(pipeline_name="my_pipeline")
@@ -20,7 +20,7 @@ repository -> pipelines -> runs -> steps -> outputs
The highest level `repository` object is where to start from.

```python
from zenml.core.repo import Repository
from zenml.repository import Repository

repo = Repository()
```
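The traversal can be continued all the way down the hierarchy described above. A pseudocode sketch (not runnable outside an initialized ZenML repository; method names are taken from other snippets in this diff, such as `get_pipeline`, `runs`, `get_step`, and `output.read`):

```python
pipeline = repo.get_pipeline(pipeline_name="my_pipeline")  # repository -> pipelines
run = pipeline.runs[-1]                                    # pipelines -> runs
step = run.get_step(name="trainer")                        # runs -> steps
result = step.output.read()                                # steps -> outputs
```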
2 changes: 1 addition & 1 deletion examples/class_based_api/chapter_1.py
@@ -18,9 +18,9 @@

import pandas as pd

from zenml.core.repo import Repository
from zenml.logger import get_logger
from zenml.pipelines import BasePipeline
from zenml.repository import Repository
from zenml.steps.step_interfaces.base_datasource_step import (
BaseDatasourceConfig,
BaseDatasourceStep,
2 changes: 1 addition & 1 deletion examples/class_based_api/chapter_2.py
@@ -14,10 +14,10 @@
import os
from urllib.request import urlopen

from zenml.core.repo import Repository
from zenml.integrations.sklearn import steps as sklearn_steps
from zenml.logger import get_logger
from zenml.pipelines import BasePipeline
from zenml.repository import Repository
from zenml.steps import builtin_steps, step_interfaces

logger = get_logger(__name__)
2 changes: 1 addition & 1 deletion examples/class_based_api/chapter_3.py
@@ -15,11 +15,11 @@
import os
from urllib.request import urlopen

from zenml.core.repo import Repository
from zenml.integrations.sklearn import steps as sklearn_steps
from zenml.integrations.tensorflow import steps as tf_steps
from zenml.logger import get_logger
from zenml.pipelines.builtin_pipelines import TrainingPipeline
from zenml.repository import Repository
from zenml.steps import builtin_steps

logger = get_logger(__name__)
2 changes: 1 addition & 1 deletion examples/dag_visualizer/README.md
@@ -14,7 +14,7 @@ the post-execution workflow we then plug in the visualization class that visuali
This visualization is produced with the following code:

```python
from zenml.core.repo import Repository
from zenml.repository import Repository
from zenml.integrations.graphviz.visualizers.pipeline_run_dag_visualizer import (
PipelineRunDagVisualizer,
)
2 changes: 1 addition & 1 deletion examples/dag_visualizer/run.py
@@ -16,11 +16,11 @@
import pandas as pd
import tensorflow as tf

from zenml.core.repo import Repository
from zenml.integrations.graphviz.visualizers.pipeline_run_dag_visualizer import (
PipelineRunDagVisualizer,
)
from zenml.pipelines import pipeline
from zenml.repository import Repository
from zenml.steps import Output, step

FEATURE_COLS = [
2 changes: 1 addition & 1 deletion examples/drift_detection/evidently.ipynb
@@ -392,7 +392,7 @@
"metadata": {},
"outputs": [],
"source": [
"from zenml.core.repo import Repository\n",
"from zenml.repository import Repository\n",
"from zenml.integrations.evidently.visualizers import EvidentlyVisualizer\n",
"\n",
"repo = Repository()\n",
15 changes: 6 additions & 9 deletions examples/drift_detection/run.py
@@ -12,18 +12,19 @@
# permissions and limitations under the License.

import json

import pandas as pd
from rich import print
from sklearn import datasets

from zenml.core.repo import Repository
from zenml.integrations.evidently.steps import (
EvidentlyProfileConfig,
EvidentlyProfileStep,
)
from zenml.integrations.evidently.visualizers import EvidentlyVisualizer
from zenml.logger import get_logger
from zenml.pipelines import pipeline
from zenml.repository import Repository
from zenml.steps import step

logger = get_logger(__name__)
@@ -112,14 +113,10 @@ def visualize_statistics():
repo = Repository()
pipeline = repo.get_pipelines()[0]
last_run = pipeline.runs[-1]
drift_analysis_step = last_run.get_step(
name="drift_analyzer"
)
print(f'Data drift detected: {drift_analysis_step.output.read()}')
drift_analysis_step = last_run.get_step(name="drift_analyzer")
print(f"Data drift detected: {drift_analysis_step.output.read()}")

drift_detection_step = last_run.get_step(
name="drift_detector"
)
print(json.dumps(drift_detection_step.outputs['profile'].read(), indent=2))
drift_detection_step = last_run.get_step(name="drift_detector")
print(json.dumps(drift_detection_step.outputs["profile"].read(), indent=2))

visualize_statistics()
2 changes: 1 addition & 1 deletion examples/functional_api/chapter_1.py
@@ -15,8 +15,8 @@
import numpy as np
import tensorflow as tf

from zenml.core.repo import Repository
from zenml.pipelines import pipeline
from zenml.repository import Repository
from zenml.steps import Output, step


2 changes: 1 addition & 1 deletion examples/functional_api/chapter_2.py
@@ -15,8 +15,8 @@
import numpy as np
import tensorflow as tf

from zenml.core.repo import Repository
from zenml.pipelines import pipeline
from zenml.repository import Repository
from zenml.steps import Output, step


2 changes: 1 addition & 1 deletion examples/functional_api/chapter_3.py
@@ -15,8 +15,8 @@
import numpy as np
import tensorflow as tf

from zenml.core.repo import Repository
from zenml.pipelines import pipeline
from zenml.repository import Repository
from zenml.steps import BaseStepConfig, Output, step


2 changes: 1 addition & 1 deletion examples/functional_api/chapter_4.py
@@ -17,8 +17,8 @@
from sklearn.base import ClassifierMixin
from sklearn.linear_model import LogisticRegression

from zenml.core.repo import Repository
from zenml.pipelines import pipeline
from zenml.repository import Repository
from zenml.steps import BaseStepConfig, Output, step


2 changes: 1 addition & 1 deletion examples/functional_api/chapter_5.py
@@ -23,9 +23,9 @@
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

from zenml.core.repo import Repository
from zenml.materializers.base_materializer import BaseMaterializer
from zenml.pipelines import pipeline
from zenml.repository import Repository
from zenml.steps import BaseStepConfig, Output, step

Base = declarative_base()
2 changes: 1 addition & 1 deletion examples/functional_api/chapter_6.py
@@ -18,8 +18,8 @@
from sklearn.base import ClassifierMixin
from sklearn.linear_model import LogisticRegression

from zenml.core.repo import Repository
from zenml.pipelines import pipeline
from zenml.repository import Repository
from zenml.steps import BaseStepConfig, Output, step


@@ -603,7 +603,7 @@
},
"outputs": [],
"source": [
"from zenml.core.repo import Repository"
"from zenml.repository import Repository"
]
},
{
2 changes: 1 addition & 1 deletion examples/lineage/README.md
@@ -15,7 +15,7 @@ dataframes for us.
This visualization is produced with the following code:

```python
from zenml.core.repo import Repository
from zenml.repository import Repository
from zenml.integrations.dash.visualizers.pipeline_run_lineage_visualizer import (
PipelineRunLineageVisualizer,
)
2 changes: 1 addition & 1 deletion examples/lineage/run.py
@@ -16,11 +16,11 @@
import pandas as pd
import tensorflow as tf

from zenml.core.repo import Repository
from zenml.integrations.dash.visualizers.pipeline_run_lineage_visualizer import (
PipelineRunLineageVisualizer,
)
from zenml.pipelines import pipeline
from zenml.repository import Repository
from zenml.steps import Output, step

FEATURE_COLS = [
2 changes: 1 addition & 1 deletion examples/not_so_quickstart/run.py
@@ -20,8 +20,8 @@
from steps.tf_steps import tf_evaluator, tf_trainer
from steps.torch_steps import torch_evaluator, torch_trainer

from zenml.core.repo import Repository
from zenml.pipelines import pipeline
from zenml.repository import Repository
from zenml.steps import Output, step


2 changes: 1 addition & 1 deletion examples/quickstart/quickstart.ipynb
@@ -452,7 +452,7 @@
"metadata": {},
"outputs": [],
"source": [
"from zenml.core.repo import Repository\n",
"from zenml.repository import Repository\n",
"\n",
"repo = Repository()"
]
2 changes: 1 addition & 1 deletion examples/statistics/README.md
@@ -17,7 +17,7 @@ dataframes for us.
This visualization is produced with the following code:

```python
from zenml.core.repo import Repository
from zenml.repository import Repository
from zenml.integrations.facets.visualizers.facet_statistics_visualizer import (
FacetStatisticsVisualizer,
)
2 changes: 1 addition & 1 deletion examples/statistics/run.py
@@ -16,11 +16,11 @@
import pandas as pd
import tensorflow as tf

from zenml.core.repo import Repository
from zenml.integrations.facets.visualizers.facet_statistics_visualizer import (
FacetStatisticsVisualizer,
)
from zenml.pipelines import pipeline
from zenml.repository import Repository
from zenml.steps import Output, step

FEATURE_COLS = [
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -56,7 +56,7 @@ pyyaml = "^5.4.1"
python-dateutil = "^2.8.1"
gitpython = "^3.1.18"
click = "^8.0.1"
pydantic = "<=1.8.2"
pydantic = "^1.9.0"
analytics-python = "^1.4.0"
distro = "^1.6.0"
tabulate = "^0.8.9"
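The pydantic pin above moves from an exact upper bound (`<=1.8.2`) to a caret range (`^1.9.0`). In Poetry, `^1.9.0` means `>=1.9.0,<2.0.0`, so future 1.x releases are accepted while a breaking 2.0 is excluded. A minimal sketch of that rule (simplified; ignores pre-release and build metadata):

```python
def caret_allows(spec: str, version: str) -> bool:
    """Check whether `version` satisfies a Poetry caret requirement like '^1.9.0'.

    Simplified sketch for major >= 1: allowed range is >= base and < next major.
    """
    base = tuple(int(part) for part in spec.lstrip("^").split("."))
    candidate = tuple(int(part) for part in version.split("."))
    upper = (base[0] + 1, 0, 0)  # caret caps at the next major version
    return base <= candidate < upper


print(caret_allows("^1.9.0", "1.9.2"))  # True  (patch release, allowed)
print(caret_allows("^1.9.0", "2.0.0"))  # False (next major, excluded)
print(caret_allows("^1.9.0", "1.8.2"))  # False (below the pinned base)
```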