Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a type transformer for whylogs dataset profiles #1023

Closed
wants to merge 6 commits into from

Conversation

naddeoa
Copy link

@naddeoa naddeoa commented May 26, 2022

TL;DR

This adds a type plugin that enables whylog's dataset profiles to participate in the flytekit task framework.

Type

  • Bug Fix
  • Feature
  • Plugin

Are all requirements met?

  • Code completed
  • Smoke tested
  • Unit tests added
  • Code documentation added
  • Any pending items have an associated Issue

Complete description

This is the first step to enabling whylogs dataset profile visualization in flytedeck and enabling constraints. Viz will be hooked up through flytedeck and it will allow users to visualize the data in that workflow. Constraints are a way to specify properties that should hold in the data and conditionally fail the workflow otherwise.

Tracking Issue

Follow-up issue

We're in the process of releasing our 1.0.0 right now and some features (constraints in particular) are missing from our release candidate. I need that to exist before I can add the right example. I'll also have to update the dependency in setup.py.

I've got several TODOs in the code where things need to change but this
is a good first step. Next is to take a stab at the rendering plugin.
We published our rc to pypi so I can actually reference it directly
without the wheel.
@welcome
Copy link

welcome bot commented May 26, 2022

Thank you for opening this pull request! 🙌

These tips will help get your PR across the finish line:

  • Most of the repos have a PR template; if not, fill it out to the best of your knowledge.
  • Sign off your commits (Reference: DCO Guide).

@codecov
Copy link

codecov bot commented May 27, 2022

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 86.49%. Comparing base (e057f2a) to head (81ac491).
Report is 1083 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1023      +/-   ##
==========================================
+ Coverage   86.39%   86.49%   +0.09%     
==========================================
  Files         255      265      +10     
  Lines       24417    24759     +342     
  Branches     2775     2445     -330     
==========================================
+ Hits        21095    21415     +320     
- Misses       2850     2870      +20     
- Partials      472      474       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@wild-endeavor wild-endeavor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor comments we already talked about. let me know when the whylogs dependency is published and we're ready for final review.

@@ -4,7 +4,7 @@

import flytekit
from flytekit import kwtypes, task, workflow
from flytekit.types.schema import FlyteSchema
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we revert this? It's used below.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is required in the latest version. LMK if otherwise.

plugins/flytekit-whylogs/setup.py Outdated Show resolved Hide resolved

# TODO where is this visible from?
# TODO how can I test this out locally? Do I need to build a wheel of flytekit and use it from a real project?
def to_html(self, ctx: FlyteContext, python_val: DatasetProfileView, expected_python_type: Type[DatasetProfileView]) -> str:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pingsutw - could you help please? sorry i still don't know what the right way to extend flyte deck is. is there documentation somewhere?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We just need to convert python_val to HTML string or pure string.

def to_html(self, ctx: FlyteContext, python_val: T, expected_python_type: Type[T]) -> str:
"""
Converts any python val (dataframe, int, float) to a html string, and it will be wrapped in the HTML div
"""
return str(python_val)

python_val.to_pandas().to_html() is correct.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. That's what I'll leave it as for now then. Our next update will be to use our actual html viewer to visualize the profiles. How can I see the render? Is it going to be a full page or a tiny window that pops up in a hover tooltip?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you run the workflow locally, flytekit will output a deck.html file in the local disk. just open it, and you are able to see the render.

Note: you have to set env var FLYTE_SDK_LOGGING_LEVEL to 20 to see the log.

you can follow this guide. https://flyte--735.org.readthedocs.build/projects/cookbook/en/735/auto/core/flyte_basics/deck.html

@eapolinario
Copy link
Collaborator

Superseded by #1100.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants