Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update demo project to use OmegaConfigLoader #1590

Conversation

aanghelidi
Copy link
Contributor

@aanghelidi aanghelidi commented Oct 19, 2023

Description

Resolves kedro-org/kedro#3172

Development notes

To migrate the project to use OmegaConfigLoader, the migration guide was followed. During this migration, I encountered some small issues that I have fixed:

  • A deprecated parameter of the machine learning model in the pipeline. This was changed to the new default.
  • Several warnings about DataSet to be renamed Dataset fixed by the renaming.
  • Fix a Pandas warning, SettingWithCopyWarning. This was fixed using loc to slice a subset of the DataFrame in the pipeline.

Note: I have renamed demo-project/src/demo_project/requirements.in to demo-project/src/dev_requirements.txt but this is optional and could be reverted

QA notes

To test my changes, the following steps were performed:

  • Making sure I could install the dependencies, following Makefile and instructions in README
  • Run pipeline with kedro run
  • Run existing tests with pytest

Checklist

  • Read the contributing guidelines
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added new entries to the RELEASE.md file
  • Added tests to cover my changes

Screenshot

kedro-pipeline

@aanghelidi aanghelidi changed the title Docs/update demo project to omega config loader Update demo project to omega config loader Oct 19, 2023
@aanghelidi aanghelidi changed the title Update demo project to omega config loader Update demo project to use OmegaConfigLoader Oct 19, 2023
@tynandebold tynandebold added Python Pull requests that update Python code Community labels Oct 19, 2023
Comment on lines 1 to 5
kedro-datasets[pandas.CSVDataSet,pandas.ExcelDataSet, pandas.ParquetDataSet, plotly.PlotlyDataSet, matplotlib.MatplotlibWriter]~=1.0
scikit-learn~=1.0
kedro>=0.18.14, <0.19.0
pillow~=9.0
scikit-learn~=1.0
seaborn~=0.11.2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question to @ravi-kumar-pilla or @rashidakanchwala, what are these two dev_requirements.txt and docker_requirements.txt for? I am not sure if it's appropriate to have -r docker_requirements.txt.

Is 0.18.14 the minimum version or the 1st version with dataset factory?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on my understanding, requirements.in (dev_requirements.txt) is for local dev dependencies while docker_requirements.txt are used when demo-project is deployed (Step Build demo container image in .circleci/continue_config.yml), one of the steps for deploy_demo to aws ecr.

I think there is no issue in referencing docker_requirements in dev_requirements. I could see seaborn to be an additional requirement in dev which was not present earlier.

0.18.14 is the latest version of kedro right now which has most of the OmegaConfigLoader features. I guess that was the rationale behind the update. Happy to hear from @rashidakanchwala

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial PR when demo_project was created for reference

Comment on lines 2 to 10
"feature_engineering.feat_{metric_type}_metrics":
type: pandas.ParquetDataset
filepath: ${_base_location}/04_feature/feat_{metric_type}_metrics.pq
layer: feature

feature_importance_output:
type: pandas.CSVDataset
filepath: ${_base_location}/04_feature/feature_importance_output.csv
metadata:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great use of the datasets factory 👍🏼

@aanghelidi aanghelidi force-pushed the docs/update-demo-project-to-omega-config-loader branch from c25cd3a to 539d501 Compare October 19, 2023 20:20
@aanghelidi aanghelidi marked this pull request as ready for review October 19, 2023 20:25
@aanghelidi aanghelidi force-pushed the docs/update-demo-project-to-omega-config-loader branch from 539d501 to a266be2 Compare October 21, 2023 08:25
@noklam
Copy link
Contributor

noklam commented Oct 24, 2023

We had a call last week and IRRC the conclusion is we want to keep the requirements separately instead of having it relies on the docker requirements.

Copy link
Member

@merelcht merelcht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is introducing a lot more changes than the original issue kedro-org/kedro#3172 described. I really appreciate your efforts here @aanghelidi, but this many changes in one PR makes it harder to review and as flagged by the team members not every change is actually desired, e.g. the changes to how requirements are handled.

Would you mind updating this PR and only keep:

  • The changes to use OmegaConfigLoader
  • The use of dataset factories
  • A deprecated parameter of the machine learning model in the pipeline. This was changed to the new default.
  • Fix a Pandas warning, SettingWithCopyWarning. This was fixed using loc to slice a subset of the DataFrame in the pipeline.

Thank you!

demo-project/pyproject.toml Outdated Show resolved Hide resolved
Copy link
Contributor

@ravi-kumar-pilla ravi-kumar-pilla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR @aanghelidi. Looks good to me and works well in local testing. But, I would agree with @merelcht on keeping the desired changes here. Thank you

@tynandebold tynandebold removed the request for review from yetudada October 25, 2023 11:12
@aanghelidi aanghelidi force-pushed the docs/update-demo-project-to-omega-config-loader branch from a266be2 to 5cf549f Compare October 27, 2023 04:26
@aanghelidi
Copy link
Contributor Author

Thank you for the thorough review @merelcht @ravi-kumar-pilla @noklam and @ankatiyar I appreciated the feedback. Only the desired changes and improvements have been kept for this PR answering what the original issue kedro-org/kedro#3172 described.

The only exception is the addition of seaborn (identical to docker requirements) in the development requirements as mentioned by @ravi-kumar-pilla, because the pipeline won't run without it.

@merelcht merelcht self-requested a review October 27, 2023 09:27
Copy link
Member

@merelcht merelcht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @aanghelidi ! These changes look good to me now ⭐ 👍

Question for the viz team: should this change go into the release notes for our own record?

@tynandebold
Copy link
Member

I think we could add a line to the release notes for this. Good thought!

Copy link
Contributor

@ravi-kumar-pilla ravi-kumar-pilla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @aanghelidi for the PR. Apart from the typo in README.md file, everything else looks good to me and works well.

1. Run `pip install kedro==0.18.4`
2. Run `kedro install --build-reqs`
1. Run `pip install kedro~=0.18.0`
2. Run `pip install -r src/demo-project/requirements.in`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Typo in demo_project] Can you please change the command to pip install -r src/demo_project/requirements.in

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch ! The command has been changed to pip install -r src/demo_project/requirements.in

Signed-off-by: Alain Anghelidi <alainanghelidi@gmail.com>
Signed-off-by: Alain Anghelidi <alainanghelidi@gmail.com>
Signed-off-by: Alain Anghelidi <alainanghelidi@gmail.com>
Signed-off-by: Alain Anghelidi <alainanghelidi@gmail.com>
Signed-off-by: Alain Anghelidi <alainanghelidi@gmail.com>
Signed-off-by: Alain Anghelidi <alainanghelidi@gmail.com>
Since the version 1.1 of scikit learn the default
value of the parameter max_features of RandomForestRegressor
have been changed from 'auto' to 1.0. Support for the old
'auto' value used have been removed.

This commit fix this issue.

Signed-off-by: Alain Anghelidi <alainanghelidi@gmail.com>
Signed-off-by: Alain Anghelidi <alainanghelidi@gmail.com>
@aanghelidi aanghelidi force-pushed the docs/update-demo-project-to-omega-config-loader branch from 5cf549f to b790049 Compare October 28, 2023 12:16
@tynandebold
Copy link
Member

We can add a release note at a later date. I'm going to merge this now.

@tynandebold tynandebold merged commit d25db88 into kedro-org:main Oct 30, 2023
5 checks passed
@rashidakanchwala rashidakanchwala mentioned this pull request Nov 16, 2023
5 tasks
rashidakanchwala added a commit that referenced this pull request Nov 17, 2023
This is minor release with big backend refactoring work and some bug fixes.

Bug fixes and other changes
Refactor flowchart dataclasses to pydantic base models. (Refactor Flowchart models from dataclass to pydantic base models #1565)
Fix dataset factory patterns in Experiment Tracking. (Fix dataset factory patterns in Experiment Tracking #1588)
Update demo-project to use OmegaConfigLoader. (Update demo project to use OmegaConfigLoader #1590)
Improve feedback for copy to clipboard feature. (Add tooltip to shareable urls copy button #1614)
Ensure Kedro-Viz works when hosted on a URL subpath. (Fix: Kedro-Viz doesn't work when hosted via a URL subpath #1621)
Bump fastapi upper bounds. (Bump FAST API upper bounds #1634)
Fix shareable URL modal to appear across the app. (Make shareable URL modal open globally across the app.  #1639)
Add Kedro-Viz CLI command deprecation warning. (Add kedro viz deprecation warning for CLI #1641)
@NeroOkwa
Copy link
Contributor

Thank you @aanghelidi for this PR. Here's the release announcement on the Kedro slack acknowledging your contribution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Community Python Pull requests that update Python code
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Update demo_project used for Kedro-Viz demo to use OmegaConfigLoader
7 participants