Improved handling of integer time index #128
Codecov Report
@@ Coverage Diff @@
## master #128 +/- ##
==========================================
+ Coverage 88.47% 88.83% +0.35%
==========================================
Files 73 73
Lines 7558 7730 +172
==========================================
+ Hits 6687 6867 +180
+ Misses 871 863 -8
if isinstance(group[target_time].iloc[0], _numeric_types):
    grouped = [[np.inf, group]]
else:
    grouped = [[datetime.now(), group]]
Should we move this `datetime.now()` outside of `calc_results`? Basically, we want every chunk that's part of a single cfm call to use the same time.
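A minimal sketch of what this comment asks for, not the project's actual code: take the timestamp once per call and hand the same value to every chunk. The names `calculate_in_chunks` and `calc_chunk` are hypothetical stand-ins.

```python
from datetime import datetime


def calc_chunk(chunk, default_time):
    # Hypothetical stand-in for the per-chunk calculation: pair each
    # instance id with the shared default cutoff time.
    return [(instance_id, default_time) for instance_id in chunk]


def calculate_in_chunks(instance_ids, chunk_size, default_time=None):
    # Take the timestamp once, before chunking, so every chunk in this
    # call sees exactly the same value.
    if default_time is None:
        default_time = datetime.now()
    results = []
    for start in range(0, len(instance_ids), chunk_size):
        chunk = instance_ids[start:start + chunk_size]
        results.extend(calc_chunk(chunk, default_time))
    return results


rows = calculate_in_chunks([1, 2, 3, 4, 5], chunk_size=2)
```

If `datetime.now()` were called inside `calc_chunk` instead, each chunk could get a slightly different cutoff time.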
@@ -591,25 +591,19 @@ def _filter_and_sort(self, df, time_last=None,
    """
    if self.time_index:
        if time_last is not None and not df.empty:
Is it possible for `time_last` to be `None` by this point in the code? If not, let's remove this check.
`Entity.get_all_instances` and a few others call this without specifying a `time_last`.
@@ -110,6 +113,11 @@ def calculate_feature_matrix(features, cutoff_time=None, instance_ids=None,
    if not isinstance(cutoff_time, pd.DataFrame):
        if cutoff_time is None:
            cutoff_time = datetime.now()
            # if integer time index, use max value as cutoff time instead
            if target_entity.time_index:
The target entity might not have a time index, but some other entity in the entityset might. Perhaps it makes sense to add a method or attribute to an entityset that returns whether it uses a datetime or numeric time index.
We should assume all entities in an entityset are using the same units for the time index, so this check could be done when the entityset is initialized or when `set_time_index` is called.
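A rough sketch of the kind of entityset-level check suggested here, using only the standard library. The function names and the dict-of-first-values input are assumptions made for illustration, not an existing API of the library.

```python
from datetime import datetime
from numbers import Real


def time_kind(value):
    # Classify one time index value as "datetime" or "numeric".
    if isinstance(value, datetime):
        return "datetime"
    if isinstance(value, Real):
        return "numeric"
    raise TypeError("time index must be numeric or datetime")


def entityset_time_type(first_time_values):
    # `first_time_values` maps a (hypothetical) entity name to the first
    # value of its time index; entities without one are left out.
    kinds = {time_kind(value) for value in first_time_values.values()}
    if len(kinds) > 1:
        raise ValueError("all time indexes must share one type")
    # None means no entity in the set has a time index.
    return kinds.pop() if kinds else None


numeric_set = {"sessions": 3, "logs": 7.5}
mixed_set = {"sessions": 3, "logs": datetime(2018, 4, 13)}
```

With a check like this, the default cutoff time could depend on the entityset as a whole instead of on whether the target entity happens to have a time index.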
…ericTimeIndex variable type
@@ -291,6 +291,11 @@ class TimeIndex(Variable):
    _dtype_repr = "time_index"


class NumericTimeIndex(TimeIndex, Numeric):
Can you add this to the API reference in the documentation?
Looks great. Merging.
**v0.1.20** Apr 13, 2018
* Improved chunking when calculating feature matrices (#121)
* Primitives as strings in DFS parameters (#129)
* Integer time index bugfixes (#128)
* Add make_temporal_cutoffs utility function (#126)
* Show all entities, switch shape display to row/col (#124)
* fixed num characters nan fix (#118)
* modify ignore_variables docstring (#117)
Addresses #99 and #125.
* `cutoff_time` values that are not numeric or datetime will raise an error
* `cutoff_time` will now pass through correctly when using an integer time index
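The behaviour described above could look roughly like the following sketch. `check_cutoff_time` is a hypothetical helper, not the function changed in this PR, and rejecting `bool` values is an extra assumption made here because `bool` is a subclass of `int` in Python.

```python
from datetime import datetime
from numbers import Real


def check_cutoff_time(cutoff_time):
    # Assumption: booleans are rejected even though bool subclasses int.
    if isinstance(cutoff_time, bool):
        raise TypeError("cutoff_time must be numeric or datetime")
    # Numeric and datetime values pass through unchanged.
    if isinstance(cutoff_time, (Real, datetime)):
        return cutoff_time
    # Anything else (strings, lists, ...) raises an error.
    raise TypeError("cutoff_time must be numeric or datetime")


integer_cutoff = check_cutoff_time(10)
datetime_cutoff = check_cutoff_time(datetime(2018, 4, 13))
```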