-
Notifications
You must be signed in to change notification settings - Fork 61
Conversation
""" | ||
Pandas Series method :meth:`pandas.Series.head` implementation. | ||
.. only:: developer | ||
Test: python -m hpat.runtests hpat.tests.test_series.TestSeries.test_series_head1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are more tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
if not isinstance(self.index, types.NoneType): | ||
def hpat_pandas_series_head_impl(self, n=5): | ||
|
||
return pandas.Series(self._data[:n], self._index[:n]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if index is None? I guess it does not work with the current index implementation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add raise exception to index None?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Current implementation allows to call head() even if index is set to None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add raise exception to index None?
No, just create pandas.Series with no index
hpat/hiframes/hiframes_typed.py
Outdated
# name = self._get_series_name(series_var, nodes) | ||
# | ||
# return self._replace_func( | ||
# func, (data, index, n_arg, name), pre_nodes=nodes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dataframe.head() is based on it. Does it work with new style?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have not tests for Dataframe.head() (only one skiped)
Should I write it and test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test_df_head1 is skipped due to 'dtype' fail. I guess it has been fixed yet, let's check. Nevertheless, test was skipped on Windows only
raise TypingError( | ||
'{} The parameter must be an integer type. Given type n: {}'.format(_func_name, n)) | ||
|
||
if not isinstance(self.index, types.NoneType): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case if index is none hpat_pandas_series_head
returns None
. Maybe better temporarily (before fixing indexing) to raise exception in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in case on index=None the function returns hpat_pandas_series_head_index_impl
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now it's so, but before it wasn't so: ad8a03b.
hpat/hiframes/series_kernels.py
Outdated
'head': lambda A, I, k, name: hpat.hiframes.api.init_series(A[:k], None, name), | ||
# 'head': lambda A, I, k, name: hpat.hiframes.api.init_series(A[:k], None, name), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the change exactly needed? I think that affects nothing because you commented out head
in old style.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes you right, I recommented it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shssf Should I delete existing tests on parallelism and write them in another style, or comment old?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@1e-to No, I don't think you need it
|
||
if not isinstance(self, SeriesType): | ||
raise TypingError( | ||
'{} The object must be a pandas.series. Given self: {}'.format(_func_name, self)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need Given self:
here. Replace it with Given:
Need to redesign tests a bit |
Some tests skips, until problem with index fix
# Test: python -m hpat.runtests hpat.tests.test_series.TestSeries.test_series_head1 | ||
# Test: python -m hpat.runtests hpat.tests.test_series.TestSeries.test_series_head_default1 | ||
# Test: python -m hpat.runtests hpat.tests.test_series.TestSeries.test_series_head_index1 | ||
# Test: python -m hpat.runtests hpat.tests.test_series.TestSeries.test_series_head_index2 | ||
# Test: python -m hpat.runtests hpat.tests.test_series.TestSeries.test_series_head_index3 | ||
# Test: python -m hpat.runtests hpat.tests.test_series.TestSeries.test_series_head_index4 | ||
# Test: python -m hpat.runtests hpat.tests.test_series.TestSeries.test_series_head_parallel1 | ||
# Test: python -m hpat.runtests hpat.tests.test_series.TestSeries.test_series_head_index_parallel1 | ||
# Test: python -m hpat.runtests hpat.tests.test_series.TestSeries.test_series_head_index_parallel2 | ||
# Test: python -m hpat.runtests hpat.tests.test_series.TestSeries.test_series_head_noidx | ||
# Test: python -m hpat.runtests hpat.tests.test_series.TestSeries.test_series_head_idx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Look like too many tests listed in docstring. Maybe better to represent the list as the pattern with parameter -k
: python -m hpat.runtests -k hpat.tests.test_series.TestSeries.test_series_head*
# Returns | ||
# ------- | ||
# :obj:`pandas.Series` | ||
# returns The first n rows of the caller object. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The
-> the
# n: :obj:`int` | ||
# input argument, default 5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's move default 5
after type of the parameter:
n: :obj:`int`, default 5
# _func_name = 'Method head().' | ||
# | ||
# if not isinstance(self, SeriesType): | ||
# raise TypingError('{} The object must be a pandas.series. Given: {}'.format(_func_name, self)) | ||
# | ||
# if not isinstance(n, (types.Integer, types.Omitted)) and n != 5: | ||
# raise TypingError('{} The parameter must be an integer type. Given type n: {}'.format(_func_name, n)) | ||
# | ||
# if isinstance(self.index, types.NoneType): | ||
# def hpat_pandas_series_head_impl(self, n=5): | ||
# return pandas.Series(self._data[:n]) | ||
# | ||
# return hpat_pandas_series_head_impl | ||
# else: | ||
# def hpat_pandas_series_head_index_impl(self, n=5): | ||
# return pandas.Series(self._data[:n], self._index[:n]) | ||
# | ||
# return hpat_pandas_series_head_index_impl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the implementation commented out?
hpat/tests/test_series.py
Outdated
|
||
hpat_func = hpat.jit(test_impl) | ||
|
||
data_test = [[6, 6, 2, 1, 3, 3, 2, 1, 2], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I propose to use globally defined data as input data in tests: f2435b0#diff-deca39d332649cea819383154a5d2cb3R39-R62
|
||
hpat_func = hpat.jit(test_impl) | ||
|
||
data_test = [[6, 6, 2, 1, 3, 3, 2, 1, 2], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same, could you use globally defined input data? If it's required the global data could be changed.
hpat/tests/test_series.py
Outdated
|
||
hpat_func_param1 = hpat.jit(test_impl_param) | ||
|
||
for param1 in [0, 3, 10]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's rename param1
to n
, because n
is real name of parameter of the method.
hpat/tests/test_series.py
Outdated
hpat_func_param1 = hpat.jit(test_impl_param) | ||
|
||
for param1 in [0, 3, 10]: | ||
result_param1_ref = test_impl_param(S, param1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's enough to name such variable as ref_result
or jit_result
.
for param1 in [1, 3, 7]: | ||
result_param1_ref = test_impl_param(S, param1) | ||
result_param1 = hpat_func_param1(S, param1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same:
param1
-> n
result_param1_ref
-> ref_result
result_param1
-> jit_result
result_ref = test_impl(S) | ||
result = hpat_func(S) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's rename the variables:
result_ref
-> ref_result
result
-> jit_result
|
||
@unittest.skip("Broke another three tests") | ||
def test_series_head_idx(self): | ||
def test_impl(S): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we want to add implementation where series is constructed? I mean to test functionality without unboxing series.
return hpat_pandas_series_head_impl | ||
else: | ||
def hpat_pandas_series_head_index_impl(self, n=5): | ||
return pandas.Series(self._data[:n], self._index[:n]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you construct output Series passing parameter name
as self._name
? It should be supported.
hpat/tests/test_series.py
Outdated
S = pd.Series(input_data) | ||
for n in [1, 3, 2]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add negative and zero n
in the testing.
hpat/tests/test_series.py
Outdated
result_jit = hpat_func(S, n) | ||
pd.testing.assert_series_equal(result_jit, result_ref) | ||
|
||
@unittest.skip("Not pass") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add more clear description why the test should be skipped?
hpat/tests/test_series.py
Outdated
hpat_func = hpat.jit(test_impl) | ||
for input_data in test_global_input_data_integer64: | ||
S = pd.Series(input_data) | ||
for n in [2, 3]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add negative and zero n
here and in other tests.
hpat/tests/test_series.py
Outdated
result_jit = hpat_func(S, n) | ||
pd.testing.assert_series_equal(result_jit, result_ref) | ||
|
||
@unittest.skip("Not pass") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needed more clear description.
Pandas Series method :meth:`pandas.Series.head` implementation. | ||
|
||
.. only:: developer | ||
Test: python -m -k hpat.runtests hpat.tests.test_series.TestSeries.test_series_head* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
python -m hpat.runtests -k hpat.tests.test_series.TestSeries.test_series_head*
Some tests skips, until problem with index fix