Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Varying first cutoff time for each target group #258

Merged
merged 15 commits into from Nov 2, 2021

Conversation

jeff-hernandez
Copy link
Collaborator

Closes #215 by supporting columns as values for minimum_data in a label search. This functionality allows users to vary the first cutoff time for each target group. For example, with a table of customers, you can start the labeling process at the varying times that the accounts were created.

@codecov
Copy link

codecov bot commented Sep 27, 2021

Codecov Report

Merging #258 (bb4b844) into main (f396057) will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##              main      #258   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           28        28           
  Lines         1270      1304   +34     
=========================================
+ Hits          1270      1304   +34     
Impacted Files Coverage Δ
composeml/label_maker.py 100.00% <100.00%> (ø)
composeml/tests/test_data_slice/test_extension.py 100.00% <100.00%> (ø)
composeml/tests/test_label_maker.py 100.00% <100.00%> (ø)

@jeff-hernandez jeff-hernandez marked this pull request as ready for review September 27, 2021 21:42
@rwedge
Copy link
Contributor

rwedge commented Oct 12, 2021

Can you update the relevant docstring and maybe include an example in the documentation?

@jeff-hernandez
Copy link
Collaborator Author

@rwedge I updated the docstring and also added this guide in the docs

Copy link
Contributor

@rwedge rwedge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we test a series that has a mix of date and number strings / a series that has only number strings?

composeml/label_maker.py Outdated Show resolved Hide resolved
@jeff-hernandez
Copy link
Collaborator Author

jeff-hernandez commented Nov 2, 2021

@rwedge I updated the test to include timedelta string and integer values. Although using a mix of date and number can technically work, that seems like an edge use case and I don't think we should test that usage.

Copy link
Contributor

@rwedge rwedge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jeff-hernandez jeff-hernandez merged commit 723c581 into main Nov 2, 2021
@jeff-hernandez jeff-hernandez deleted the min-data-per-group branch November 2, 2021 20:33
@jeff-hernandez jeff-hernandez mentioned this pull request Nov 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Using a column value for 'minimum_data' in label_maker.search()
2 participants