-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extract measures #117
Extract measures #117
Conversation
* Move work of resolving type of cube-response and parsing JSON to a helper method (._cube_dict). * Change all instances in unit tests of `CrunchCube({})` to `CrunchCube(None)`. This is a mild case, but shows the main reason to avoid real work in the constructor, because all unit tests have to satisfy the arguments required by the real-work.
Get rid of warning-spew that occurred on test runs from calling deprecated CrunchCube.index() in tests.
There are no callers using the optional `adjusted` argument. Having such an argument conflates two separate operations. If adjustment is required, it can easily be performed by the caller on the end-result of as_array().
Remove redundant (copies) integration tests from test_headers_and_subtotals.py
* Remove unused `margin` parameter from CrunchCube.as_array(). If in future such a return value is required it should be provided using a distinct method rather than overloading `.as_array()`. * Remove now unnecessary `margin` parameter from CrunchCube._as_array(). Change call for raw_cube_array to use measure object returned by new `._measure()` method. * Remove now-dead `.raw_cube_array()` method.
* Replace calls to CrunchCube._data() with calls to ._counts() or ._measure() depending on context, using the .raw_cube_array property of the returned measure object to access the appropriate array formerly provided by ._data(). * Remove CrunchCube._data() and its helpers `._flat_values()` and `._shape` as they are now dead code.
* Add `make unit-coverage` task to Makefile that shows our unit test coverage is pretty spotty, indicating an over-reliance on integration tests (which do not test fine-grained behaviors).
392005a
to
6e10548
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall 💯 🎉
Just a couple of comments regarding the style and unit test coverage. See if you can address them, and then we can merge... I can even merge like this, but want your input on comments I made.
unweighted counts are returned. | ||
""" | ||
return ( | ||
self._measures.means if self._measures.means is not None else |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we require is not None
part here? It seems that the return value is either None
or an instance, in which case the if self._measures.means
would be enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe the explicit form would be more readable here:
measures = self._measures
if measures.means is not None:
return measures.means
if weighted:
return measures.weighted_counts
return measures.unweighted_counts
I believe it would also not hide potential misses by coverage (not sure about this though). I tried it locally, and it increases the unit test coverage from 65.94% to 66.10%
else json.loads(cube_response) | ||
) | ||
# ---cube is 'value' item in a shoji response--- | ||
return cube_dict.get('value', cube_dict) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The case when there is a "value"
key doesn't seem to be tested by unit tests.
@@ -1466,6 +1466,13 @@ def means(self): | |||
return None | |||
return _MeanMeasure(self._cube_dict, self._all_dimensions) | |||
|
|||
@lazyproperty | |||
def missing_count(self): | |||
"""numeric representing count of missing rows in cube response.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that "row" is the thing that's being counted as missing here...
return ( | ||
self._measures.weighted_counts if weighted else | ||
self._measures.unweighted_counts | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Writing it like this:
if weighted:
return self._measures.weighted_counts
return self._measures.unweighted_counts
Increases the unit test coverage from 65.94 to 66.01 percent.
Returns | ||
res (ndarray): Tabular representation of crunch cube | ||
""" | ||
return self._apply_missings_and_insertions( | ||
self._raw_cube_array(weighted, margin), | ||
self._measure(weighted).raw_cube_array, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
@@ -1,22 +0,0 @@ | |||
import os |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that this particular commit corresponds to the message.
This refactoring extracts a
_Measures
object and a hierarchy of measure objects that subclass_BaseMeasure
. Specifically, these are_MeanMeasure
,_UnweightedCountMeasure
, and_WeightedCountMeasure
.These objects extract the measure-related logic out of CrunchCube as pure-functions (lazyproperties). The new objects have 100% unit test coverage and simplify CrunchCube.