-
Notifications
You must be signed in to change notification settings - Fork 248
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[pr into #785] Turn structured dataset into dataclass #802
Conversation
Signed-off-by: Kevin Su <pingsutw@apache.org>
Codecov Report
@@ Coverage Diff @@
## structured-dataset-proposal #802 +/- ##
===============================================================
+ Coverage 85.60% 85.71% +0.10%
===============================================================
Files 353 356 +3
Lines 30465 30585 +120
Branches 3674 3679 +5
===============================================================
+ Hits 26080 26216 +136
+ Misses 3716 3700 -16
Partials 669 669
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
couple questions
@@ -337,7 +337,7 @@ def to_literal( | |||
) -> Literal: | |||
# If the type signature has the StructuredDataset class, it will, or at least should, also be a | |||
# StructuredDataset instance. | |||
if issubclass(python_type, StructuredDataset): | |||
if inspect.isclass(python_type) and issubclass(python_type, StructuredDataset): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
works for python 3.7-3.10 right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I've tested it with python 3.7~3.10.
@@ -390,6 +390,8 @@ def to_literal( | |||
) | |||
|
|||
# Otherwise assume it's a dataframe instance. Wrap it with some defaults | |||
if get_origin(python_type) is Annotated: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we have done this at the top of this function? or is it okay here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we extract the python_type in get_transformer instead of to_literal, so python_type
could be Annotated
here.
Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Kevin Su <pingsutw@apache.org>
@@ -542,7 +551,7 @@ def _get_dataset_type(self, t: typing.Union[Type[StructuredDataset], typing.Any] | |||
raise ValueError(f"Unrecognized Annotated type for StructuredDataset {t}") | |||
|
|||
# 2. Fill in columns by checking for StructuredDataset metadata. For example, StructuredDataset[my_cols, parquet] | |||
elif issubclass(t, StructuredDataset): | |||
elif inspect.isclass(t) and issubclass(t, StructuredDataset): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remind me again what this inspect.isclass
is supposed to catch? can you add a comment? i keep forgetting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's for Annotated[pd.Dataframe, my_col]
. I just moved expected_python_type = get_args(expected_python_type)[0]
to the beginning of the to_python
and to_literal
. Therefore, we don't need inspect.isclass(t)
any more, so I removed it.
Signed-off-by: Kevin Su pingsutw@apache.org
TL;DR
Please replace this text with a description of what this PR accomplishes.
Type
Are all requirements met?
Complete description
How did you fix the bug, make the feature etc. Link to any design docs etc
Tracking Issue
https://github.com/lyft/flyte/issues/
Follow-up issue
NA
OR
https://github.com/lyft/flyte/issues/