Implement series.var() in new style #220

densmirn · 2019-10-14T08:19:09Z

No description provided.

shssf · 2019-10-15T19:26:03Z

hpat/datatypes/hpat_pandas_series_functions.py

+    Returns
+    -------
+    :obj:`scalar`
+         returns :obj:`scalar`


It can return a scalar or a Series.

I'm not sure we need to add Series here. A Series is returned only when `level` is specified, and we don't currently support the `level` parameter, so in our case a scalar is always returned. Do you still think I need to add Series here?

shssf · 2019-10-15T19:30:56Z

hpat/datatypes/hpat_pandas_series_functions.py

+    if not isinstance(self, SeriesType):
+        raise TypingError('{} The object must be a pandas.series. Given: {}'.format(_func_name, self))
+
+    if not isinstance(self.dtype, types.Number):


#216 (comment)

shssf · 2019-10-15T19:31:46Z

hpat/datatypes/hpat_pandas_series_functions.py

+            valuable_length = len(self._data) - numpy.sum(numpy.isnan(self._data))
+            return numpy.nanvar(self._data) * valuable_length / (valuable_length - ddof)
+
+        return self._data.var() * len(self._data) / (len(self._data) - ddof)


I find it hard to believe that NumPy has no such functionality.

The NumPy implementation uses ddof=0 by default, while pandas uses ddof=1 by default, so I had to add this extra rescaling to align the parameter between the two implementations.
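To make the rescaling in the diff concrete: multiplying a population variance (ddof=0) by n / (n - ddof) gives the variance for any other ddof. The sketch below is illustrative only. It swaps in the standard-library `statistics` module for NumPy so it runs without dependencies, `rescaled_var` is a hypothetical name, and returning NaN when n <= ddof is an assumption chosen to match plain pandas rather than something this thread confirms for the PR.

```python
import math
import statistics


def rescaled_var(values, ddof=1, skipna=True):
    """Hypothetical helper: population variance rescaled to the requested ddof."""
    if skipna:
        # Drop NaNs first, analogous to using nanvar in the diff.
        values = [v for v in values if not math.isnan(v)]
    elif any(math.isnan(v) for v in values):
        # Without skipna, any NaN makes the result NaN.
        return math.nan

    n = len(values)
    if n <= ddof:
        # Assumption: return NaN instead of dividing by zero (or a negative count).
        return math.nan

    # pvariance is the ddof=0 variance; rescale it by n / (n - ddof).
    return statistics.pvariance(values) * n / (n - ddof)


# Same sample values as the series used in the PR's test.
data = [1.3, -2.7, math.nan, 0.1, 10.9]
sample_variance = rescaled_var(data)               # ddof=1, NaN skipped
population_variance = rescaled_var(data, ddof=0)   # ddof=0, NaN skipped
```

With the default ddof=1 the result agrees with `statistics.variance` on the non-NaN values, which is the same convention pandas uses by default.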

shssf · 2019-10-15T19:32:55Z

hpat/hiframes/hiframes_typed.py

@@ -851,7 +851,7 @@ def parse_impl(data):

    def _run_call_series(self, assign, lhs, rhs, series_var, func_name):
        # single arg functions
-        if func_name in ('sum', 'count', 'mean', 'var', 'min', 'max', 'prod'):
+        if func_name in ('sum', 'count', 'mean', 'min', 'max', 'prod'):


I think this will break parallelism, though I'm not sure.

You are right, it breaks multiprocessing parallelism, but I still don't know how to keep both approaches ("old" and "new") working simultaneously. Please let me know if you have any thoughts on this.

There is no single, simple recipe.

shssf · 2019-10-15T19:34:23Z

hpat/tests/test_series.py

+            return series.var(skipna=skipna, ddof=ddof)
+
+        cfunc = hpat.jit(pyfunc)
+        series = pd.Series([1.3, -2.7, np.nan, 0.1, 10.9])


#217 (comment)

shssf · 2019-10-18T05:55:46Z

hpat/hiframes/hiframes_typed.py

@@ -851,7 +851,7 @@ def parse_impl(data):

    def _run_call_series(self, assign, lhs, rhs, series_var, func_name):
        # single arg functions
-        if func_name in ('sum', 'count', 'mean', 'var', 'min', 'max', 'prod'):
+        if func_name in ('sum', 'count', 'mean', 'min', 'max', 'prod'):


There is no single, simple recipe.

densmirn · 2019-10-25T09:34:57Z

@shssf, @kozlov-alexey could you take the next round to review the PR?

shssf · 2019-10-26T03:00:19Z

hpat/datatypes/hpat_pandas_series_functions.py

+        0/None/'index' - row-wise operation
+        1/'columns'    - column-wise operation
+        *unsupported*
+    skipna: :obj:`bool`


I suspect this should be skipna=True.

The pandas interface uses skipna=None by default, so we did the same.

shssf · 2019-10-26T03:00:44Z

hpat/datatypes/hpat_pandas_series_functions.py

@@ -256,6 +256,86 @@ def hpat_pandas_series_values_impl(self):
    return hpat_pandas_series_values_impl


+@overload_method(SeriesType, 'var')
+def hpat_pandas_series_var(self, axis=None, skipna=None, level=None, ddof=1, numeric_only=None):


skipna=True?

The pandas interface uses skipna=None by default.
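For context on why both defaults end up behaving the same: pandas treats `skipna=None` as `True` for this reduction, so keeping `None` in the signature matches the pandas interface without changing the result. A minimal sketch of that normalization follows; `resolve_skipna` is a hypothetical helper, not a function from HPAT or pandas.

```python
def resolve_skipna(skipna=None):
    """Hypothetical helper: map the pandas-style default None to True."""
    if skipna is None:
        return True
    return bool(skipna)


# None (the signature default) and True select the same NaN-skipping path.
default_choice = resolve_skipna()
explicit_choice = resolve_skipna(True)
```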

densmirn requested review from shssf and akharche and removed request for akharche October 14, 2019 08:55

densmirn force-pushed the feature/series_var branch 5 times, most recently from f8a9a03 to f323e5b Compare October 15, 2019 06:14

shssf suggested changes Oct 15, 2019

densmirn force-pushed the feature/series_var branch 4 times, most recently from 9f57b16 to 80cc25d Compare October 16, 2019 15:11

shssf suggested changes Oct 18, 2019

densmirn requested a review from shssf October 24, 2019 06:02

densmirn added 3 commits October 24, 2019 09:15

Implement series.var() in new style

942d0fd

Minor changes for series.var()

6c27327

Avoid devision by zero for series.var()

f783e97

densmirn force-pushed the feature/series_var branch from 80cc25d to f783e97 Compare October 24, 2019 06:16

Revert multiprocessing parallelism for series.var()

a1d287a

densmirn requested review from Hardcode84, kozlov-alexey and PokhodenkoSA October 25, 2019 06:56

densmirn added the Ready for Review label Oct 25, 2019

kozlov-alexey approved these changes Oct 25, 2019

shssf added Waiting on author and removed Ready for Review labels Oct 25, 2019

Merge branch 'master' into feature/series_var

83d9300

shssf reviewed Oct 26, 2019

shssf approved these changes Oct 26, 2019

shssf merged commit d443814 into IntelPython:master Oct 26, 2019

densmirn deleted the feature/series_var branch June 9, 2020 12:12
