Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert examples to conform to best practices #679

Merged
merged 47 commits into from Jun 28, 2022

Conversation

AlexejPenner
Copy link
Collaborator

@AlexejPenner AlexejPenner commented Jun 1, 2022

Describe changes

I have updated some of our examples to conform to our best practices - ✔️ is done - 🍔 is not done and should be done in the next few days - airflow currently does not work with our best practices structure.

✔️ add_your_own
✔️ airflow_orchestration
✔️ cloud_secrets_manager
✔️ evidently_drift_detection
✔️ facets_visualize_statistics
✔️ feast_feature_store
✔️ github_actions_orchestration
✔️ : huggingface
✔️ kubeflow_pipelines_orchestration
✔️ lightgbm
✔️ mlflow_deployment
✔️ mlflow_tracking
✔️ neural_prophet
✔️ pytorch
✔️ quickstart
✔️ scipy
🌵 seldon_deployment - done - Need to be tested with AWS cluster where seldon is installed
✔️ slack_alert
✔️ step_operator_remote_training
✔️ vertex_ai_orchestration
✔️ wandb_tracking
✔️ whylogs_data_profiling
✔️ xgboost

https://docs.zenml.io/links-and-resources/resources/best-practices

Besides the obvious many changes to the example structure there have been a few changes in the core repo. The pipeline-run code has moved out of the cli into the pipelines module to be callable from code without invoking subprocess calls or cli-runners.

On top of that the source utils have been adjusted to make sure that zenml pipeline run ... -c config.yaml works for our recommended repo structure. Special attention should be payed to the changes in src/zenml/utils/source_utils.py during code review (:eyes: @schustmi 😉 )

Pre-requisites

Please ensure you have done the following:

  • I have read the CONTRIBUTING.md document.
  • If my change requires a change to docs, I have updated the documentation accordingly.
  • If I have added an integration, I have updated the integrations table.
  • I have added tests to cover my changes.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Other (add details above)

@AlexejPenner AlexejPenner added enhancement New feature or request internal To filter out internal PRs and issues labels Jun 1, 2022
@AlexejPenner AlexejPenner requested review from htahir1 and fa9r June 1, 2022 15:41
@strickvl strickvl changed the title Feature/eng 939 make examples conform to bp Convert examples to conform to best practices Jun 2, 2022
@htahir1
Copy link
Contributor

htahir1 commented Jun 3, 2022

The fact that ✅ is the emoji for not done is the biggest troll today! :D

@AlexejPenner
Copy link
Collaborator Author

The fact that white_check_mark is the emoji for not done is the biggest troll today! :D

Changed it to 🍔 to motivate volunteers

@fa9r
Copy link
Contributor

fa9r commented Jun 7, 2022

I can do the alerter one later and will do the PR review once all examples are done.

Also, could you please add a link to the best practices section somewhere in your PR description?

@strickvl
Copy link
Contributor

strickvl commented Jun 7, 2022

@fa9r https://docs.zenml.io/links-and-resources/resources/best-practices is the best practices link, but you probably would do better just to look at the model of one of the complete ones. I was going to write up instructions on this process on Friday but didn't get around to it. Maybe overkill given how many we have left and that this is a one-off process. But lmk if you have any questions. I didn't find it super intuitive to do the first one.

@fa9r
Copy link
Contributor

fa9r commented Jun 8, 2022

slack example should also be done now

@github-actions
Copy link
Contributor

@htahir1, @fa9r
1.25 (one and twenty-five hundredths) business days have passed since the review started. Pretty please review the PR.

@fa9r
Copy link
Contributor

fa9r commented Jun 17, 2022

We're getting close, really looking forward to finally have this merged! 😄

@AlexejPenner
Copy link
Collaborator Author

Just some tests missing now :)

@safoinme
Copy link
Contributor

Only Seldon's example is left for the test I think. tried to make sure it's right but still would be better to run it if you guys can do that with the AWS Stack @AlexejPenner @fa9r

@safoinme
Copy link
Contributor

Somehow not all examples are really following the same exact format! there are examples where each step or pipeline is inside a separated folder and some are gathered together, should it be unified @AlexejPenner @strickvl ?

@strickvl
Copy link
Contributor

@safoinme like which examples?

@safoinme
Copy link
Contributor

  • examples where all steps are gathered together: kubeflow_pipeline_orchestration, PyTorch, quickstart, slack_alerter
  • examples where each step is in a separated folder: mlflow_tracking, neural_prophet, Scipy, seldon_deployment, step_operator, Vertex_ai, wandb, why_logs, xgboost,
    @strickvl

@stefannica
Copy link
Collaborator

I know I'm late to this party, but I just had to go through the process of refactoring one of my examples to conform to these best practices and I have to object to the python module hierarchy recommendation. Do we really need to have a separate python sub-module for every single pipeline and step ? I find it unnecessary and difficult to work with.

A better/simpler hierarchy would be e.g.:

example 
  /pipelines   
    __init__.py
    training.py
    inference.py
  /steps
    __init__.py
    importer.py
    splitter.py
    deployer.py
...


Returns:
imported module: Module
"""
# Add directory of python file to PYTHONPATH so we can import it
file_path = os.path.abspath(file_path)
module_name = os.path.splitext(os.path.basename(file_path))[0]
module_path = os.path.relpath(file_path, zen_root)
# module_name = os.path.splitext(os.path.basename(file_path))[0]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# module_name = os.path.splitext(os.path.basename(file_path))[0]

if full_module_path not in sys.modules:
with prepend_python_path(os.path.dirname(file_path)):
if module_name in sys.modules:
del sys.modules[module_name]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you maybe add a short (or long) comment here that explains in detail why the module needs to be deleted, as that is a very strange operation if someone comes access this code.

@fa9r
Copy link
Contributor

fa9r commented Jun 27, 2022

I know I'm late to this party, but I just had to go through the process of refactoring one of my examples to conform to these best practices and I have to object to the python module hierarchy recommendation. Do we really need to have a separate python sub-module for every single pipeline and step ? I find it unnecessary and difficult to work with.

A better/simpler hierarchy would be e.g.:

example 
  /pipelines   
    __init__.py
    training.py
    inference.py
  /steps
    __init__.py
    importer.py
    splitter.py
    deployer.py
...

I agree 100%. I did all my refactorings like this, which are the "inconsistencies" between the different examples we talked about 😄 For now, I think we agreed that both structures are acceptable since this design choice doesn't affect testing.

@AlexejPenner AlexejPenner merged commit ab52076 into develop Jun 28, 2022
@AlexejPenner AlexejPenner deleted the feature/ENG-939-make-examples-conform-to-bp branch June 28, 2022 15:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request internal To filter out internal PRs and issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants