Add static typing to labeler models #672
Generally good after initial read... just some comments
dataprofiler/tests/labelers/test_integration_struct_data_labeler.py
Outdated
dataprofiler/tests/labelers/test_integration_unstructured_data_labeler.py
Outdated
Head branch was pushed to by a user without write access
reset_weights=False,
verbose=True,
):
train_data: Union[pd.DataFrame, pd.Series, np.ndarray],
We should consider making an alias for Union[pd.DataFrame, pd.Series, np.ndarray] and utilizing this everywhere instead.
ah yeah I like this idea! https://docs.python.org/3.9/library/typing.html#type-aliases
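A minimal sketch of the aliasing idea, for illustration only. To keep it dependency-free, built-in containers stand in for pd.DataFrame, pd.Series, and np.ndarray; the names `TableLike` and `count_items` are hypothetical and do not come from the PR.

```python
from typing import Dict, List, Tuple, Union

# One alias replaces the repeated Union[...] in every signature.
# Stand-in member types: the PR's alias uses pandas/NumPy types instead.
TableLike = Union[List[str], Tuple[str, ...], Dict[str, str]]


def count_items(data: TableLike) -> int:
    """Hypothetical helper that accepts any member of the alias."""
    return len(data)


if __name__ == "__main__":
    print(count_items(["a", "b"]))
    print(count_items({"key": "value"}))
```

A type checker treats `TableLike` exactly like the spelled-out Union, so changing the accepted types later only requires editing the alias.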
Comments (recommend aliasing where possible) and a request around setting something to None by default without Optional[] typing.
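The Optional[] request can be sketched as below. PEP 484 recommends making the optional type explicit when a parameter defaults to None, and recent mypy releases reject the implicit form by default. The parameter names mirror the snippet above, but the annotations and the function body are assumptions made for illustration, not the PR's code.

```python
from typing import Dict, List, Optional


def fit(
    train_data: List[str],
    val_data: Optional[List[str]] = None,  # explicit Optional for a None default
    reset_weights: bool = False,
    verbose: bool = True,
) -> Dict[str, int]:
    """Hypothetical stand-in for fit(): reports how many samples it received."""
    sizes = {
        "train": len(train_data),
        "val": 0 if val_data is None else len(val_data),
    }
    if verbose:
        print(f"reset_weights={reset_weights}, sizes={sizes}")
    return sizes
```

Writing `val_data: List[str] = None` instead would leave the annotation and the default in disagreement, which is what the review comment asks to avoid.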
Head branch was pushed to by a user without write access
@@ -18,14 +22,18 @@
logger = dp_logging.get_child_logger(__name__)
labeler_utils.hide_tf_logger_warnings()

Data = Union[pd.DataFrame, pd.Series, np.ndarray]
@tonywu315 love it, can we use this everywhere? I see other places in other files where Union[pd.DataFrame, pd.Series, np.ndarray] is used but not with an alias. I'd recommend doing this so we don't have to repeat it in other files too.
added to utils.py
dope
Head branch was pushed to by a user without write access
@@ -653,7 +670,7 @@ def fit(
 history[metric_label] = model_results[i]

 if val_data:
-    f1, f1_report = self._validate_training(val_data)
+    f1, f1_report = self._validate_training(val_data)  # type: ignore
Let's see how we might be able to remove the # type: ignore and maybe remove if val_data on line 672.
removed redundant if val_data
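For context, one common way to drop a `# type: ignore` on a call like this is to narrow an Optional argument with an explicit `is not None` check. The thread does not say why the ignore was needed, so the sketch below assumes that cause; `validate_training` and this `fit` are hypothetical stand-ins using plain lists.

```python
from typing import List, Optional, Tuple


def validate_training(val_data: List[int]) -> Tuple[float, str]:
    """Hypothetical stand-in: returns a score and a short text report."""
    if not val_data:
        return 0.0, "no samples"
    score = sum(val_data) / len(val_data)
    return score, f"{len(val_data)} samples"


def fit(val_data: Optional[List[int]] = None) -> Optional[Tuple[float, str]]:
    """Run validation only when validation data was actually passed."""
    if val_data is None:
        return None
    # Past this point the checker knows val_data is List[int],
    # so the call needs no "# type: ignore".
    return validate_training(val_data)
```

An explicit None check also avoids relying on truthiness, which matters for the real types involved: `if val_data:` on a pandas DataFrame raises a ValueError because its truth value is ambiguous.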
Head branch was pushed to by a user without write access
dataprofiler/labelers/utils.py
Outdated
import numpy as np
import pandas as pd

DataArray = Union[pd.DataFrame, pd.Series, np.ndarray]
If we are going to create our own, I might suggest we start following a similar paradigm to numpy and utilize our own _typing folder or _typing.py file at the root for universal typing that can be applied throughout the repo.
added _typing.py
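A rough sketch of what a shared `_typing.py` module could contain. The final file is not shown in this thread, so everything here is an assumption. As a variation on the snippet above, the imports sit behind `TYPE_CHECKING` and the alias uses string forward references, which lets the sketch run without pandas or NumPy installed; `count_rows` is a hypothetical consumer.

```python
# _typing.py (hypothetical contents)
from typing import TYPE_CHECKING, Union

if TYPE_CHECKING:
    # Seen only by static type checkers, never executed at runtime.
    import numpy as np
    import pandas as pd

# Forward references as strings; the type checker resolves them
# against the imports in the TYPE_CHECKING block above.
DataArray = Union["pd.DataFrame", "pd.Series", "np.ndarray"]


def count_rows(data: DataArray) -> int:
    """Hypothetical consumer; other modules would import DataArray from here."""
    return len(data)
```

Keeping aliases in one module gives every file a single place to import them from, which is the point of the suggestion; importing pandas and NumPy directly, as in the PR's snippet, works just as well when both are runtime dependencies.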
Co-authored-by: Taylor Turner <taylorfturner@gmail.com>
#609