f8a9a03
to
f323e5b
Compare
Returns
-------
:obj:`scalar`
returns :obj:`scalar`
It can return scalar or Series
I'm not sure we have to add `series` here, because `series` is returned only when `level` is specified, and we currently don't support the `level` parameter. So `scalar` is always returned in our case. Do you still think I need to add `series` here?
if not isinstance(self, SeriesType):
    raise TypingError('{} The object must be a pandas.series. Given: {}'.format(_func_name, self))

if not isinstance(self.dtype, types.Number):
valuable_length = len(self._data) - numpy.sum(numpy.isnan(self._data))
return numpy.nanvar(self._data) * valuable_length / (valuable_length - ddof)

return self._data.var() * len(self._data) / (len(self._data) - ddof)
I find it hard to believe NumPy has no such functionality.
The NumPy implementation defaults to `ddof=0`, but Pandas defaults to `ddof=1`, so I had to make this extra change to align the parameter between the two implementations.
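For reference, here is a minimal pure-Python sketch of the rescaling used in the hunk above: take the population variance (`ddof=0`) and multiply by `n / (n - ddof)`, where `n` counts only the non-NaN values when `skipna` is set. The helper name `var_with_ddof` and its NaN handling are assumptions made for illustration, not the PR's code. In plain Python, `numpy.var` and `numpy.nanvar` do accept a `ddof` argument; whether that argument is usable inside the compiled implementation is not something this sketch assumes.

```python
import math
import statistics


def var_with_ddof(values, ddof=1, skipna=True):
    """Variance obtained by rescaling the population variance (ddof=0).

    Hypothetical helper for illustration only.
    """
    data = [float(v) for v in values]
    if skipna:
        # Drop NaN values so they count neither in the sum nor in n.
        data = [v for v in data if not math.isnan(v)]
    elif any(math.isnan(v) for v in data):
        # Without skipping, a single NaN makes the result NaN.
        return math.nan

    n = len(data)
    if n == 0 or n - ddof <= 0:
        # Not enough observations for the requested degrees of freedom.
        return math.nan

    # pvariance divides by n (ddof=0); rescale to divide by (n - ddof).
    return statistics.pvariance(data) * n / (n - ddof)


sample = [1.3, -2.7, math.nan, 0.1, 10.9]
print(var_with_ddof(sample, ddof=0))
print(var_with_ddof(sample, ddof=1))
```

With `ddof=1` the result should agree with `statistics.variance` on the non-NaN values, which is the same convention as the Pandas default discussed here.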
hpat/hiframes/hiframes_typed.py
Outdated
@@ -851,7 +851,7 @@ def parse_impl(data):

     def _run_call_series(self, assign, lhs, rhs, series_var, func_name):
         # single arg functions
-        if func_name in ('sum', 'count', 'mean', 'var', 'min', 'max', 'prod'):
+        if func_name in ('sum', 'count', 'mean', 'min', 'max', 'prod'):
I think this will break parallelism, but I'm not sure.
You are right, it breaks multiprocessing parallelism, but I still don't know how to keep both approaches ("old" and "new") at the same time. Please let me know if you have any thoughts on this.
There is no single, simple recipe.
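One general technique that may be relevant to the parallelism concern, offered only as a sketch and not as what HPAT does: variance can be assembled from per-chunk partial results `(count, mean, M2)`, which are merged pairwise (the standard parallel variance update). Each worker would compute its partial result locally and only the three numbers need to be combined. All function names below are hypothetical.

```python
import math


def partial_stats(chunk):
    """Return (count, mean, M2) over the non-NaN values of one chunk."""
    count, mean, m2 = 0, 0.0, 0.0
    for x in chunk:
        if math.isnan(x):
            continue  # mimic skipna=True
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)  # running sum of squared deviations
    return count, mean, m2


def merge_stats(a, b):
    """Combine two partial results into one."""
    n_a, mean_a, m2_a = a
    n_b, mean_b, m2_b = b
    n = n_a + n_b
    if n == 0:
        return 0, 0.0, 0.0
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta * delta * n_a * n_b / n
    return n, mean, m2


def chunked_var(chunks, ddof=1):
    """Variance over all chunks, dividing by (n - ddof)."""
    total = (0, 0.0, 0.0)
    for chunk in chunks:
        total = merge_stats(total, partial_stats(chunk))
    n, _, m2 = total
    return m2 / (n - ddof) if n - ddof > 0 else math.nan


# Same values as the test series in this PR, split across two "workers".
chunks = [[1.3, -2.7], [math.nan, 0.1, 10.9]]
print(chunked_var(chunks, ddof=1))
```

The merge step is associative, so the order in which partial results are combined should not matter beyond floating-point rounding.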
hpat/tests/test_series.py
Outdated
return series.var(skipna=skipna, ddof=ddof)

cfunc = hpat.jit(pyfunc)
series = pd.Series([1.3, -2.7, np.nan, 0.1, 10.9])
9f57b16
to
80cc25d
Compare
80cc25d
to
f783e97
Compare
@shssf, @kozlov-alexey could you do another round of review on this PR?
0/None/'index' - row-wise operation
1/'columns' - column-wise operation
*unsupported*
skipna: :obj:`bool`
I suspect skipna=True
By default, `skipna=None` in the Pandas interface, so we did the same.
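To make the default concrete, a minimal sketch of how a `None` default can be normalized so that it behaves like `True`, which is my understanding of how Pandas reductions treat `skipna=None`. `resolve_skipna` is a hypothetical helper, not part of Pandas or HPAT.

```python
def resolve_skipna(skipna=None):
    """Map the None default to True; pass explicit booleans through.

    Hypothetical helper for illustration only.
    """
    if skipna is None:
        return True
    if not isinstance(skipna, bool):
        raise TypeError('skipna must be a bool or None, given: {!r}'.format(skipna))
    return skipna


print(resolve_skipna())       # default argument
print(resolve_skipna(False))  # explicit value is kept
```

Keeping `None` in the signature matches the Pandas interface, while the normalization step gives the same behavior as an explicit `True`.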
@@ -256,6 +256,86 @@ def hpat_pandas_series_values_impl(self):
    return hpat_pandas_series_values_impl


@overload_method(SeriesType, 'var')
def hpat_pandas_series_var(self, axis=None, skipna=None, level=None, ddof=1, numeric_only=None):
skipna=True?
By default, `skipna=None` in the Pandas interface.