-
Notifications
You must be signed in to change notification settings - Fork 118
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
segment support #116
segment support #116
Conversation
src/whylogs/app/logger.py
Outdated
writers = List[Writer], | ||
verbose: bool = False, | ||
with_rotation_time: Optional[str] = None, | ||
cache: int =1, | ||
segments: Optional[Union[List[List[Dict]], List[str]]] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this signature is very nested - I'd rather keep type hierarchy lighweight. For the key, value
case, can we just stick to defining an object instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree. it is essentially SegmentTag
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
src/whylogs/app/logger.py
Outdated
session_id=self.session_id | ||
) | ||
|
||
self._profiles.append({"full_profile": full_profile, "segmented_profiles" : {}}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason we're not defining an explicit type but rather using a dict
to hold data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, more of an implement first, refactor later. Once we settle on the API we can create the data structures.
src/whylogs/app/logger.py
Outdated
|
||
|
||
def full_profile_check(self,): | ||
return ((self.segments is None ) or ((self.segments is not None) and self.profile_full_dataset)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does this method do? can you add a bit of explanation?
if (self.segments is None ) or ((self.segments is not None) and self.profile_full_dataset): | ||
self._profiles[-1]["full_profile"].track_datum(feature_name, value) | ||
|
||
if self.segments: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand the following code block in terms of how the data is segmented? Given it's a piece of core logic, it's worth extracting to a method with documentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this is a bit confusing, since it is a tracking a segmented datum. I think they way i did is inconsistent with the groupby method, since that method keeps the segmented feature in the data frame. I will fix this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good. Minor comments
@@ -14,7 +14,7 @@ jobs: | |||
fail-fast: false | |||
max-parallel: 6 | |||
matrix: | |||
python-version: [ 3.6,3.7,3.8] | |||
python-version: [3.6,3.7, 3.8] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: consistent spacing?
@@ -124,15 +136,17 @@ def __init__( | |||
pipeline: str, | |||
writers: List[WriterConfig], | |||
verbose: bool = False, | |||
with_rotation_time: str =None, | |||
cache :int = None, | |||
with_rotation_time: str = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: shouldn't this be Optional[str]
?
you can check `scrips/segements_logging.py for the use cases