Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

String features such as lenght, word count, character pos tracking #236

Merged
merged 37 commits into from
Jun 25, 2021

Conversation

lalmei
Copy link
Contributor

@lalmei lalmei commented Jun 10, 2021

Description

adding String features such as lenght, word count, character pos tracking

General Checklist

  • Tests added for this feature/bug
    if it was a bug, test must cover it.
  • Conform by the style guides, by using formatter
  • Documentation updated
  • (optional) Please add a label to your PR

@lalmei lalmei changed the title String features such as lenght, word count, character pos tracking WIP: String features such as lenght, word count, character pos tracking Jun 10, 2021
@lalmei lalmei added the feature label Jun 10, 2021
@coveralls
Copy link

coveralls commented Jun 10, 2021

Pull Request Test Coverage Report for Build 969712123

  • 161 of 268 (60.07%) changed or added relevant lines in 8 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+79.9%) to 79.927%

Changes Missing Coverage Covered Lines Changed/Added Lines %
src/whylogs/core/statistics/stringtracker.py 124 127 97.64%
src/whylogs/viz/base.py 4 8 50.0%
src/whylogs/core/datasetprofile.py 9 19 47.37%
src/whylogs/viz/matplotlib/visualizer.py 11 101 10.89%
Totals Coverage Status
Change from base Build 965657312: 79.9%
Covered Lines: 3352
Relevant Lines: 4025

💛 - Coveralls

@lalmei lalmei changed the title WIP: String features such as lenght, word count, character pos tracking String features such as lenght, word count, character pos tracking Jun 10, 2021
Copy link
Contributor

@andyndang andyndang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good. Feedback is mostly around the char tracker and the behavior (eager creation of trackers) and the supported set of characters

proto/src/messages.proto Outdated Show resolved Hide resolved
proto/src/messages.proto Outdated Show resolved Hide resolved
proto/src/messages.proto Outdated Show resolved Hide resolved
proto/src/summaries.proto Show resolved Hide resolved
proto/src/summaries.proto Outdated Show resolved Hide resolved
src/whylogs/core/statistics/stringtracker.py Outdated Show resolved Hide resolved
src/whylogs/core/statistics/stringtracker.py Outdated Show resolved Hide resolved
src/whylogs/core/statistics/stringtracker.py Outdated Show resolved Hide resolved
src/whylogs/core/statistics/stringtracker.py Outdated Show resolved Hide resolved
src/whylogs/core/statistics/stringtracker.py Outdated Show resolved Hide resolved
lalmei and others added 12 commits June 14, 2021 09:33
Co-authored-by: Andy Dang <26821974+andyndang@users.noreply.github.com>
Co-authored-by: Andy Dang <26821974+andyndang@users.noreply.github.com>
Co-authored-by: Andy Dang <26821974+andyndang@users.noreply.github.com>
Co-authored-by: Andy Dang <26821974+andyndang@users.noreply.github.com>
Co-authored-by: Andy Dang <26821974+andyndang@users.noreply.github.com>
@lalmei lalmei requested a review from andyndang June 18, 2021 19:34
src/whylogs/core/datasetprofile.py Outdated Show resolved Hide resolved
src/whylogs/core/datasetprofile.py Outdated Show resolved Hide resolved
src/whylogs/core/datasetprofile.py Outdated Show resolved Hide resolved
src/whylogs/core/datasetprofile.py Outdated Show resolved Hide resolved
src/whylogs/core/statistics/stringtracker.py Show resolved Hide resolved
src/whylogs/core/statistics/stringtracker.py Outdated Show resolved Hide resolved
@lalmei lalmei requested a review from andyndang June 18, 2021 23:54
Co-authored-by: Andy Dang <26821974+andyndang@users.noreply.github.com>
Co-authored-by: Andy Dang <26821974+andyndang@users.noreply.github.com>
Copy link
Contributor

@andyndang andyndang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding minor changes

Copy link
Contributor

@andyndang andyndang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GitHub is confusing

Copy link
Contributor

@andyndang andyndang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lalmei lalmei merged commit 32f14d8 into mainline Jun 25, 2021
@lalmei lalmei deleted the freq_mapping_WHY-3113 branch June 25, 2021 00:54
valer-whylabs pushed a commit to valer-whylabs/whylogs that referenced this pull request Aug 3, 2021
…hylabs#236)

* add character position tracker

* add string attributes to  flat summary

* track unicode characters as NITL

* add string notebook
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants