Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add static typing to labeler models #672

Merged
merged 17 commits into from
Oct 6, 2022

Conversation

tonywu315
Copy link
Contributor

Copy link
Contributor

@taylorfturner taylorfturner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generally good after inital read... just some comments

dataprofiler/labelers/base_model.py Outdated Show resolved Hide resolved
dataprofiler/labelers/base_model.py Outdated Show resolved Hide resolved
dataprofiler/labelers/base_model.py Outdated Show resolved Hide resolved
dataprofiler/labelers/regex_model.py Outdated Show resolved Hide resolved
@taylorfturner taylorfturner enabled auto-merge (squash) September 30, 2022 21:16
auto-merge was automatically disabled October 3, 2022 13:44

Head branch was pushed to by a user without write access

reset_weights=False,
verbose=True,
):
train_data: Union[pd.DataFrame, pd.Series, np.ndarray],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should consider making an alias for Union[pd.DataFrame, pd.Series, np.ndarray] and utilizing this everywhere instead.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@taylorfturner taylorfturner enabled auto-merge (squash) October 3, 2022 14:39
Copy link
Contributor

@taylorfturner taylorfturner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comments (recommend aliasing where possible) and request around setting something to None by default without Optional[] typing

dataprofiler/labelers/char_load_tf_model.py Show resolved Hide resolved
reset_weights=False,
verbose=True,
):
train_data: Union[pd.DataFrame, pd.Series, np.ndarray],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dataprofiler/labelers/character_level_cnn_model.py Outdated Show resolved Hide resolved
auto-merge was automatically disabled October 4, 2022 18:11

Head branch was pushed to by a user without write access

@@ -18,14 +22,18 @@
logger = dp_logging.get_child_logger(__name__)
labeler_utils.hide_tf_logger_warnings()

Data = Union[pd.DataFrame, pd.Series, np.ndarray]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tonywu315 love it --can we make this used everywhere? I see other places in other files where Union[pd.DataFrame, pd.Series, np.ndarray] is used but not with an alias. I'd recommend doing this so we don't have to repeat in other files too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added to utils.py

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dope

@taylorfturner taylorfturner enabled auto-merge (squash) October 5, 2022 13:49
taylorfturner
taylorfturner previously approved these changes Oct 5, 2022
auto-merge was automatically disabled October 5, 2022 14:43

Head branch was pushed to by a user without write access

@taylorfturner taylorfturner enabled auto-merge (squash) October 5, 2022 14:49
@@ -653,7 +670,7 @@ def fit(
history[metric_label] = model_results[i]

if val_data:
f1, f1_report = self._validate_training(val_data)
f1, f1_report = self._validate_training(val_data) # type: ignore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's see how we might be able to remove the #type: ignore and maybe remove if val_data on line 672

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed redundant if val_data

auto-merge was automatically disabled October 5, 2022 15:16

Head branch was pushed to by a user without write access

import numpy as np
import pandas as pd

DataArray = Union[pd.DataFrame, pd.Series, np.ndarray]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are going to create our own. I might suggest we start following a similar pardigm to numpy and utilize our own _typing folder or _typing.py file at the root for universal typing that can be applied throughout the repo.

https://github.com/numpy/numpy/tree/main/numpy/_typing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added _typing.py

tonywu315 and others added 3 commits October 5, 2022 14:44
Co-authored-by: Taylor Turner <taylorfturner@gmail.com>
@taylorfturner taylorfturner enabled auto-merge (squash) October 6, 2022 14:31
@taylorfturner taylorfturner merged commit 356b0f9 into capitalone:main Oct 6, 2022
@tonywu315 tonywu315 deleted the static_typing/labelers/model branch October 6, 2022 16:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
static_typing mypy static typing issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants