@@ -1096,6 +1096,57 @@ def hpat_pandas_series_ge_impl(self, other):
    raise TypingError('{} The object must be a pandas.series and argument must be a number. Given: {} and other: {}'.format(_func_name, self, other))


@overload_method(SeriesType, 'idxmin')
def hpat_pandas_series_idxmin(self, axis=None, skipna=True):
@1e-to I see that the interface in the documentation is a bit different. It looks like `self, axis=0, skipna=True, *args`.
The last argument (`*args`) is not used, but it might be good to include it in the function parameters.
Fixed

Returns
-------
:obj:`pandas.Series` or :obj:`int` or :obj:`float`
It looks like the description is wrong. The method returns the label of the selected index. I'm not sure, but it looks like it returns a value of the index (of the corresponding type).
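To make the label-versus-position distinction concrete, here is a minimal pure-Python sketch. It is not the HPAT implementation: `idxmin_label` is a hypothetical helper, and returning NaN when nothing can be selected is an assumption of this sketch rather than documented behavior.

```python
import math


def idxmin_label(values, index=None, skipna=True):
    """Return the index *label* of the smallest value (hypothetical helper)."""
    if index is None:
        # Default index: labels coincide with positions 0..n-1
        index = list(range(len(values)))
    best_label, best_value = None, None
    for label, value in zip(index, values):
        if isinstance(value, float) and math.isnan(value):
            if skipna:
                continue
            # Assumption of this sketch: a NaN that may not be skipped wins
            return math.nan
        if best_value is None or value < best_value:
            best_label, best_value = label, value
    # Assumption of this sketch: NaN when no value could be selected
    return math.nan if best_label is None else best_label


# The smallest value sits at position 0, but its label is 4
print(idxmin_label([1, 2, 3], [4, 45, 14]))
```

With a custom index the returned label and the position differ, so the return type follows the index type, not the data type.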
Fixed
_func_name = 'Method idxmin().'

if not isinstance(self, SeriesType):
    raise TypingError('{} The object must be a pandas.series. Given self: {}'.format(_func_name, self))
`Given self` -> `Given`
Fixed
if not isinstance(self, SeriesType):
    raise TypingError('{} The object must be a pandas.series. Given self: {}'.format(_func_name, self))

if not isinstance(self.dtype, types.Number):
Are we really focusing only on numeric data in series? Are string arrays not supported?
Also, I would vote to avoid using `self.dtype` because I find it ambiguous. I think it is better to use `self._data.dtype` instead.
I would like to make a small correction: `self._data.dtype` -> `self.data.dtype`. `self` has no attribute `_data` at the typing level.
Fixed
hpat/hiframes/pd_series_ext.py
Outdated
@@ -241,6 +241,7 @@ def __init__(self, dmm, fe_type):
make_attribute_wrapper(SeriesType, 'name', '_name')


from hpat.datatypes.hpat_pandas_series_functions import *
this line should not be moved here
Fixed
@@ -866,7 +866,7 @@ def _run_call_series(self, assign, lhs, rhs, series_var, func_name):
     return self._replace_func(func, [data], pre_nodes=nodes)

 if func_name in ('std', 'nunique', 'describe', 'isna',
-                 'isnull', 'median', 'idxmin', 'idxmax', 'unique'):
+                 'isnull', 'median', 'idxmax', 'unique'):
I think this will not work (I didn't check). If the parallel tests are not working, please leave this line as in the original and add the `resolve_idxmin` value to the array, as here: https://github.com/IntelPython/hpat/pull/191/files#diff-f13c21aac60fb8f7ae3c9527239d1e17R990
Initially, there were no parallel tests for this function. Should I write them?
`resolve_idxmin` is needed to avoid overlapping with array methods, but `idxmin` does not exist for arrays.
No, let's wait for the test system results.
@@ -530,6 +530,6 @@ def gt_f(a, b):
 'head_index': lambda A, I, k, name: hpat.hiframes.api.init_series(A[:k], I[:k], name),
 'median': lambda A: hpat.hiframes.api.median(A),
 # TODO: handle NAs in argmin/argmax
-'idxmin': lambda A: A.argmin(),
+# 'idxmin': lambda A: A.argmin(),
I think you need to leave it as in the original.
    return S.idxmin()
hpat_func = hpat.jit(test_impl)

S = pd.Series([1, 2, 3], [4, 45, 14])
please see this comment #217 (comment)
Add 1 test and fix another
-add args -update docs -change check self._data.dtype -add test for all
hpat/tests/test_series.py
Outdated
hpat_func = hpat.jit(test_impl)

S = pd.Series([1, 2, 3], [4, 45, 14])
print(hpat_func(S))
Do we need this in the test?
Fixed
hpat/tests/test_series.py
Outdated

@unittest.skip("Need index fix")
def test_series_idxmin(self):
    def test_series_idxmin_impl(S):
I think it's better to stick to the common name for the jitted func - test_impl.
hpat/tests/test_series.py
Outdated

test_input_data = data_simple + data_extra

for input_data in data_simple:
Are the first two loops for testing default index in Series? Then they should make one separate test. You will have another compilation of test_func on the last two loops anyway, so there'll be no benefit to put this all in one test.
hpat/tests/test_series.py
Outdated
    [6, 6.1, 2.2, 1, 3, 3, 2.2, 1, 2],
]

data_extra = [[np.nan, np.nan, np.nan, np.nan],
I think it's better to have one 'series_data' list (instead of data_simple + data_extra) with its elements covering all possible situations (i.e. containing numbers, NaNs, or both, in different combinations). There's no obvious benefit to using loops in unit tests; it is better to identify and cover the different cases yourself.
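A rough sketch of what "one explicit case per situation" could look like, using only the standard library. `position_of_min` is a hypothetical stand-in for the function under test, and returning `None` for all-NaN input is an assumption of this sketch, not HPAT or pandas behavior.

```python
import math
import unittest


def position_of_min(values):
    """Hypothetical stand-in for the function under test: position of the
    smallest non-NaN value, or None when every value is NaN."""
    best = None
    for pos, value in enumerate(values):
        if math.isnan(value):
            continue  # skip NaNs
        if best is None or value < values[best]:
            best = pos
    return best


class TestPositionOfMin(unittest.TestCase):
    # One named test per situation instead of nested loops over data lists

    def test_numbers_only(self):
        self.assertEqual(position_of_min([6, 6.1, 2.2, 1, 3, 0, 2.2, 1, 2]), 5)

    def test_numbers_inf_and_trailing_nan(self):
        data = [6, 6, 2, 1, 3, math.inf, math.nan, math.nan, math.nan]
        self.assertEqual(position_of_min(data), 3)

    def test_nan_in_the_middle(self):
        data = [3., 5.3, math.nan, math.nan, math.inf, math.inf, 4.4, 3.7, 8.9]
        self.assertEqual(position_of_min(data), 0)

    def test_all_nan(self):
        self.assertIsNone(position_of_min([math.nan] * 4))
```

Each failure then names the situation that broke, which a loop over an anonymous list of inputs does not.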
hpat/tests/test_series.py
Outdated

for input_data in data_simple:
    for index_data in data_simple:
        S = pd.Series(input_data, index_data)
Why do we need to test index_data at all? Does our implementation depend on the indexes?
Test: python -m hpat.runtests hpat.tests.test_series.TestSeries.test_series_idxmin1
Test: python -m hpat.runtests hpat.tests.test_series.TestSeries.test_series_idxmin_str
Test: python -m hpat.runtests hpat.tests.test_series.TestSeries.test_series_idxmin_int
Test: python -m hpat.runtests hpat.tests.test_series.TestSeries.test_series_idxmin_no
Please update the list of tests in the docstring to match the actual list.

if not isinstance(self.data.dtype, types.Number):
    raise TypingError(
        '{} Currently function supports only numeric values. Given data type: {}'.format(_func_name, self.dtype))
`self.dtype` -> `self.data.dtype`

if not isinstance(skipna, (types.Omitted, types.Boolean, bool)):
    raise TypingError(
        '{} The parameter must be a boolean type. Given type skipna: {}'.format(_func_name, type(skipna)))
`type(skipna)` -> `skipna`

    return numpy.argmin(self._data)

return hpat_pandas_series_idxmin_impl
You could return once, outside of the `if-else` statement.
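The shape being suggested can be sketched in plain Python as follows. The branch condition and both bodies are assumptions made for illustration (the actual branches of the overload are not shown in this hunk), and `choose_idxmin_impl` is a hypothetical name.

```python
import math


def choose_idxmin_impl(skipna=True):
    """Define an implementation in each branch, then return it once at the end."""
    if skipna:
        def impl(data):
            # Ignore NaNs when looking for the smallest value
            candidates = [(v, i) for i, v in enumerate(data) if not math.isnan(v)]
            # Assumption of this sketch: -1 signals "no valid value"
            return min(candidates)[1] if candidates else -1
    else:
        def impl(data):
            # Plain position of the smallest value; NaN handling is left out here
            return min(range(len(data)), key=data.__getitem__)

    # Single exit point shared by both branches
    return impl


print(choose_idxmin_impl(skipna=True)([3.0, math.nan, 1.0]))
```

Both branches bind the same name, so a single `return impl` after the `if-else` replaces a `return` in every branch.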
hpat/tests/test_series.py
Outdated
[1.1, 0.3, 2.1, 1, 3, 0.3, 2.1, 1.1, 2.2],
[6, 6.1, 2.2, 1, 3, 0, 2.2, 1, 2],
[6, 6, 2, 1, 3, np.inf, np.nan, np.nan, np.nan],
[3., 5.3, np.nan, np.nan, np.inf, np.inf, 4.4, 3.7, 8.9]
]
Please fix indentation.
hpat/tests/test_series.py
Outdated

hpat_func = hpat.jit(test_impl)

test_input_data = []
Seems the list is used nowhere.
hpat/tests/test_series.py
Outdated

hpat_func = hpat.jit(test_impl)

test_input_data = []
Seems the list is used nowhere.
Index fixed. Please review tests again - it looks like they have NaN != NaN issues
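For context on the NaN issue: under IEEE 754 a NaN never compares equal to anything, including itself, so a plain equality check between two NaN results fails even when both sides agree. A minimal sketch of a NaN-aware comparison follows; `same_result` is a hypothetical helper, not something from the HPAT test suite.

```python
import math


def same_result(a, b):
    """True if a == b, treating two NaNs as equal (hypothetical test helper)."""
    a_nan = isinstance(a, float) and math.isnan(a)
    b_nan = isinstance(b, float) and math.isnan(b)
    if a_nan or b_nan:
        # Equal only when both sides are NaN
        return a_nan and b_nan
    return a == b


print(math.nan == math.nan)            # plain equality rejects NaN vs NaN
print(same_result(math.nan, math.nan))  # the helper accepts it
```

In a test, `self.assertTrue(same_result(expected, actual))` would then accept the case where both implementations return NaN.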