Update demo project to use OmegaConfigLoader #1590

Merged
4 changes: 2 additions & 2 deletions demo-project/README.md
Original file line number Diff line number Diff line change
@@ -4,7 +4,7 @@ This project is designed to be a realistic example of what Kedro looks like when

## Setup

1. Run `pip install kedro==0.18.4`
2. Run `kedro install --build-reqs`
1. Run `pip install kedro~=0.18.0`
2. Run `pip install -r src/demo_project/requirements.in`
3. Run `kedro run`
4. Run `kedro viz`
12 changes: 5 additions & 7 deletions demo-project/conf/base/catalog_01_raw.yml
@@ -1,6 +1,6 @@
companies:
type: pandas.CSVDataSet
filepath: ${base_location}/01_raw/companies.csv
filepath: ${_base_location}/01_raw/companies.csv
metadata:
kedro-viz:
layer: raw
@@ -9,21 +9,19 @@ companies:

reviews:
type: pandas.CSVDataSet
filepath: ${base_location}/01_raw/reviews.csv
filepath: ${_base_location}/01_raw/reviews.csv
metadata:
kedro-viz:
layer: raw
preview_args:
preview_args:
nrows: 10


shuttles:
type: pandas.ExcelDataSet
filepath: ${base_location}/01_raw/shuttles.xlsx
filepath: ${_base_location}/01_raw/shuttles.xlsx
metadata:
kedro-viz:
layer: raw
preview_args:
nrows: 15


nrows: 15
8 changes: 4 additions & 4 deletions demo-project/conf/base/catalog_02_int.yml
@@ -1,27 +1,27 @@
ingestion.int_typed_companies:
type: pandas.ParquetDataSet
filepath: ${base_location}/02_intermediate/typed_companies.pq
filepath: ${_base_location}/02_intermediate/typed_companies.pq
metadata:
kedro-viz:
layer: intermediate

ingestion.int_typed_shuttles@pandas1:
type: pandas.ParquetDataSet
filepath: ${base_location}/02_intermediate/typed_shuttles.pq
filepath: ${_base_location}/02_intermediate/typed_shuttles.pq
metadata:
kedro-viz:
layer: intermediate

ingestion.int_typed_shuttles@pandas2:
type: pandas.ParquetDataSet
filepath: ${base_location}/02_intermediate/typed_shuttles.pq
filepath: ${_base_location}/02_intermediate/typed_shuttles.pq
metadata:
kedro-viz:
layer: intermediate

ingestion.int_typed_reviews:
type: pandas.ParquetDataSet
filepath: ${base_location}/02_intermediate/typed_reviews.pq
filepath: ${_base_location}/02_intermediate/typed_reviews.pq
metadata:
kedro-viz:
layer: intermediate
5 changes: 2 additions & 3 deletions demo-project/conf/base/catalog_03_prm.yml
@@ -1,14 +1,13 @@
prm_shuttle_company_reviews:
type: pandas.ParquetDataSet
filepath: ${base_location}/03_primary/prm_shuttle_company_reviews.pq
filepath: ${_base_location}/03_primary/prm_shuttle_company_reviews.pq
metadata:
kedro-viz:
layer: primary

prm_spine_table:
type: pandas.ParquetDataSet
filepath: ${base_location}/03_primary/prm_spine_table.pq
filepath: ${_base_location}/03_primary/prm_spine_table.pq
metadata:
kedro-viz:
layer: primary

40 changes: 8 additions & 32 deletions demo-project/conf/base/catalog_04_feature.yml
@@ -1,36 +1,12 @@
# Jinja is super powerful, but does come at the cost of readability
# Set your IDE to Jinja YAML to ensure this is highlighted correctly
# Use dataset factories to reduce duplication
"feature_engineering.feat_{metric_type}_metrics":
type: pandas.ParquetDataSet
filepath: ${_base_location}/04_feature/feat_{metric_type}_metrics.pq
layer: feature

{% set namespace = 'feature_engineering' %}
{% set metric_types = ['weighting', 'scaling'] %}
{% for metric_type in metric_types %}
{{ namespace }}.feat_{{ metric_type }}_metrics:
type: pandas.ParquetDataSet
filepath: ${base_location}/04_feature/feat_{{ metric_type }}_metrics.pq
metadata:
kedro-viz:
layer: feature

{% endfor %}

# This will render to generate the records below...
#
# feature_engineering.feat_weighting_metrics:
# type: pandas.ParquetDataSet
# filepath: ${base_location}/04_feature/feat_weighting_metrics.pq
# layer: feature
#
# feature_engineering.feat_scaling_metrics:
# type: pandas.ParquetDataSet
# filepath: ${base_location}/04_feature/feat_scaling_metrics.pq
# layer: feature


feature_importance_output:
feature_importance_output:
type: pandas.CSVDataSet
filepath: ${base_location}/04_feature/feature_importance_output.csv
filepath: ${_base_location}/04_feature/feature_importance_output.csv
metadata:
kedro-viz:
layer: feature


layer: feature
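
Note on the dataset factory entry above: the single pattern `"feature_engineering.feat_{metric_type}_metrics"` stands in for the entries the Jinja loop used to generate. As a rough illustration of the idea only (this is not Kedro's implementation; `resolve_factory` is a hypothetical helper, and the `data/` prefix assumes `${_base_location}` has already been interpolated), a requested dataset name can be matched against the pattern and the captured value substituted into the entry:

```python
import re
from typing import Optional


def resolve_factory(pattern: str, config: dict, dataset_name: str) -> Optional[dict]:
    """Fill a factory entry's placeholders from a dataset name, or return None."""
    # re.split with a capturing group alternates literal text and placeholder names:
    # "a.feat_{metric_type}_metrics" -> ["a.feat_", "metric_type", "_metrics"]
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for index, part in enumerate(parts):
        if index % 2:  # odd positions are placeholder names
            regex += f"(?P<{part}>.+?)"
        else:  # even positions are literal text
            regex += re.escape(part)

    match = re.fullmatch(regex, dataset_name)
    if match is None:
        return None

    # Substitute captured values into every string field of the entry.
    captured = match.groupdict()
    return {
        key: value.format(**captured) if isinstance(value, str) else value
        for key, value in config.items()
    }


# Hypothetical usage mirroring the catalog entry above.
PATTERN = "feature_engineering.feat_{metric_type}_metrics"
ENTRY = {
    "type": "pandas.ParquetDataSet",
    "filepath": "data/04_feature/feat_{metric_type}_metrics.pq",
}
resolved = resolve_factory(PATTERN, ENTRY, "feature_engineering.feat_scaling_metrics")
```

With this sketch, both `feat_weighting_metrics` and `feat_scaling_metrics` resolve from the one entry, which is what the removed Jinja loop produced explicitly.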
3 changes: 1 addition & 2 deletions demo-project/conf/base/catalog_05_model_input.yml
@@ -1,7 +1,6 @@
model_input_table:
type: pandas.ParquetDataSet
filepath: ${base_location}/05_model_input/model_input_table.pq
filepath: ${_base_location}/05_model_input/model_input_table.pq
metadata:
kedro-viz:
layer: model_input

4 changes: 2 additions & 2 deletions demo-project/conf/base/catalog_06_models.yml
@@ -1,9 +1,9 @@
train_evaluation.linear_regression.regressor:
type: pickle.PickleDataSet
filepath: ${base_location}/06_models/linear_regression.pkl
filepath: ${_base_location}/06_models/linear_regression.pkl
versioned: True

train_evaluation.random_forest.regressor:
type: pickle.PickleDataSet
filepath: ${base_location}/06_models/random_forest.pkl
filepath: ${_base_location}/06_models/random_forest.pkl
versioned: True
11 changes: 5 additions & 6 deletions demo-project/conf/base/catalog_08_reporting.yml
@@ -1,6 +1,6 @@
reporting.cancellation_policy_breakdown:
type: plotly.PlotlyDataSet # Constructed via plotly_args below
filepath: ${base_location}/08_reporting/cancellation_breakdown.json
filepath: ${_base_location}/08_reporting/cancellation_breakdown.json
metadata:
kedro-viz:
layer: reporting
@@ -16,26 +16,25 @@ reporting.cancellation_policy_breakdown:

reporting.price_histogram:
type: plotly.JSONDataSet # Constructed via Python API
filepath: ${base_location}/08_reporting/price_histogram.json
filepath: ${_base_location}/08_reporting/price_histogram.json
metadata:
kedro-viz:
layer: reporting
versioned: true

reporting.feature_importance:
type: plotly.JSONDataSet # Constructed via Python API
filepath: ${base_location}/08_reporting/feature_importance_plot.json
filepath: ${_base_location}/08_reporting/feature_importance_plot.json
metadata:
kedro-viz:
layer: reporting
versioned: true

reporting.cancellation_policy_grid:
type: demo_project.extras.datasets.image_dataset.ImageDataSet
filepath: ${base_location}/08_reporting/cancellation_policy_grid.png
filepath: ${_base_location}/08_reporting/cancellation_policy_grid.png

reporting.confusion_matrix:
type: matplotlib.MatplotlibWriter
filepath: ${base_location}/08_reporting/confusion_matrix.png
filepath: ${_base_location}/08_reporting/confusion_matrix.png
versioned: true

8 changes: 4 additions & 4 deletions demo-project/conf/base/catalog_09_tracking.yml
@@ -1,19 +1,19 @@
train_evaluation.linear_regression.r2_score:
type: tracking.MetricsDataSet
filepath: ${base_location}/09_tracking/linear_score.json
filepath: ${_base_location}/09_tracking/linear_score.json
versioned: True

train_evaluation.random_forest.r2_score:
type: tracking.MetricsDataSet
filepath: ${base_location}/09_tracking/rf_score.json
filepath: ${_base_location}/09_tracking/rf_score.json
versioned: True

train_evaluation.linear_regression.experiment_params:
type: tracking.JSONDataSet
filepath: ${base_location}/09_tracking/linear_params.json
filepath: ${_base_location}/09_tracking/linear_params.json
versioned: True

train_evaluation.random_forest.experiment_params:
type: tracking.JSONDataSet
filepath: ${base_location}/09_tracking/rf_params.json
filepath: ${_base_location}/09_tracking/rf_params.json
versioned: True
1 change: 1 addition & 0 deletions demo-project/conf/base/catalog_globals.yml
@@ -0,0 +1 @@
_base_location: data/
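
Note on the `_base_location` key: the catalog files above now refer to it as `${_base_location}`, and the leading underscore marks it as a template value rather than a dataset entry. The sketch below imitates that behaviour with the standard library only, as an illustration; it is not how OmegaConf resolves interpolations, and `interpolate` is a hypothetical helper.

```python
import re


def interpolate(catalog: dict) -> dict:
    """Resolve ${_key} references and drop the underscore-prefixed template keys."""
    # Top-level keys starting with "_" are treated as template values (assumption
    # of this sketch, mirroring the `_base_location` convention in the diff).
    templates = {key: value for key, value in catalog.items() if key.startswith("_")}

    def fill(value):
        if isinstance(value, dict):
            return {key: fill(inner) for key, inner in value.items()}
        if isinstance(value, str):
            # Replace each ${name} with the matching template value.
            return re.sub(
                r"\$\{(\w+)\}", lambda m: str(templates[m.group(1)]), value
            )
        return value

    return {
        key: fill(value) for key, value in catalog.items() if not key.startswith("_")
    }


# Hypothetical merged view of catalog_globals.yml and catalog_01_raw.yml.
merged = {
    "_base_location": "data/",
    "companies": {
        "type": "pandas.CSVDataSet",
        "filepath": "${_base_location}/01_raw/companies.csv",
    },
}
resolved_catalog = interpolate(merged)
```

Because `_base_location` ends with a slash and each `filepath` adds another after the placeholder, plain substitution as sketched here yields `data//01_raw/companies.csv`.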
1 change: 0 additions & 1 deletion demo-project/conf/base/globals.yml

This file was deleted.

5 changes: 0 additions & 5 deletions demo-project/conf/base/parameters/feature_engineering.yml
@@ -1,8 +1,3 @@
# This is a boilerplate parameters config generated for pipeline 'feature_engineering'
# using Kedro 0.18.1.
#
# Documentation for this file format can be found in "Parameters"
# Link: https://kedro.readthedocs.io/en/0.18.1/kedro_project_setup/configuration.html#parameters
feature_engineering:
feature:
static:
2 changes: 1 addition & 1 deletion demo-project/conf/base/parameters/modelling.yml
@@ -21,7 +21,7 @@ train_evaluation:
min_samples_split: 2
min_samples_leaf: 1
min_weight_fraction_leaf: 0
max_features: 'auto'
max_features: 1.0
min_impurity_decrease: 0
bootstrap: True
oob_score: False
1 change: 1 addition & 0 deletions demo-project/conf/prod/catalog_globals.yml
@@ -0,0 +1 @@
_base_location: s3://my_bucket/production/
1 change: 0 additions & 1 deletion demo-project/conf/prod/globals.yml

This file was deleted.

2 changes: 1 addition & 1 deletion demo-project/pyproject.toml
@@ -1,7 +1,7 @@
[tool.kedro]
package_name = "demo_project"
project_name = "modular-spaceflights"
project_version = "0.18.4"
kedro_init_version = "0.18.14"

[tool.isort]
multi_line_output = 3
@@ -57,7 +57,7 @@ def make_price_histogram(model_input_data: pd.DataFrame) -> go.Figure:
Returns:
BaseFigure: Plotly object which is serialised as JSON for rendering
"""
price_data_df = model_input_data[["price", "engine_type"]]
price_data_df = model_input_data.loc[:, ["price", "engine_type"]]
p = np.random.dirichlet([1, 1, 1])
price_data_df["engine_type"] = np.random.choice(
["Quantum", "Plasma", "Nuclear"], len(price_data_df), p=p
1 change: 1 addition & 0 deletions demo-project/src/demo_project/requirements.in
@@ -16,3 +16,4 @@ wheel>=0.35, <0.37
pillow~=9.0
matplotlib==3.5.0
pre-commit~=1.17
seaborn~=0.11.2
7 changes: 3 additions & 4 deletions demo-project/src/demo_project/settings.py
@@ -11,7 +11,7 @@
SESSION_STORE_CLASS = SQLiteStore
SESSION_STORE_ARGS = {"path": str(Path(__file__).parents[2] / "data")}

#Setup for collaborative experiment tracking.
# Setup for collaborative experiment tracking.
# SESSION_STORE_ARGS = {"path": str(Path(__file__).parents[2] / "data"),
# "remote_path": "s3://{path-to-session_store}" }

@@ -21,7 +21,6 @@
# Define the configuration folder. Defaults to `conf`
# CONF_ROOT = "conf"

from kedro.config import TemplatedConfigLoader # NOQA
from kedro.config import OmegaConfigLoader # NOQA

CONFIG_LOADER_CLASS = TemplatedConfigLoader
CONFIG_LOADER_ARGS = {"globals_pattern": "*globals.yml", "globals_dict": {}}
CONFIG_LOADER_CLASS = OmegaConfigLoader
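
Note on the removed `CONFIG_LOADER_ARGS`: with no `globals_pattern` configured, the template file has to be found through the loader's catalog patterns, which would explain why `globals.yml` became `catalog_globals.yml` in this PR. The sketch below assumes the default catalog patterns are `catalog*`, `catalog*/**` and `**/catalog*` (verify this against the documentation for your Kedro version) and uses `fnmatch` as a stand-in for the loader's real file discovery; `is_catalog_file` is a hypothetical helper.

```python
from fnmatch import fnmatch

# Assumed default catalog patterns; not taken from the diff itself.
CATALOG_PATTERNS = ["catalog*", "catalog*/**", "**/catalog*"]


def is_catalog_file(relative_path: str) -> bool:
    """Return True if a path relative to an environment folder looks like catalog config."""
    return any(fnmatch(relative_path, pattern) for pattern in CATALOG_PATTERNS)


# The renamed file matches the first pattern; the old name matches none of them.
new_name_matches = is_catalog_file("catalog_globals.yml")
old_name_matches = is_catalog_file("globals.yml")
```

Under these assumptions the renamed file is loaded together with the other `catalog_*.yml` files, so its `_base_location` value is available to them without extra loader arguments.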