Skip to content
This repository has been archived by the owner on Feb 2, 2024. It is now read-only.

Implement Series.head() #223

Merged
merged 12 commits into from
Oct 25, 2019
Merged

Implement Series.head() #223

merged 12 commits into from
Oct 25, 2019

Conversation

1e-to
Copy link
Contributor

@1e-to 1e-to commented Oct 14, 2019

Some tests skips, until problem with index fix

"""
Pandas Series method :meth:`pandas.Series.head` implementation.
.. only:: developer
Test: python -m hpat.runtests hpat.tests.test_series.TestSeries.test_series_head1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are more tests

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

if not isinstance(self.index, types.NoneType):
def hpat_pandas_series_head_impl(self, n=5):

return pandas.Series(self._data[:n], self._index[:n])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if index is None? I guess it does not work with the current index implementation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add raise exception to index None?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Current implementation allows to call head() even if index is set to None

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add raise exception to index None?

No, just create pandas.Series with no index

# name = self._get_series_name(series_var, nodes)
#
# return self._replace_func(
# func, (data, index, n_arg, name), pre_nodes=nodes)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dataframe.head() is based on it. Does it work with new style?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have not tests for Dataframe.head() (only one skiped)
Should I write it and test?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test_df_head1 is skipped due to 'dtype' fail. I guess it has been fixed yet, let's check. Nevertheless, test was skipped on Windows only

raise TypingError(
'{} The parameter must be an integer type. Given type n: {}'.format(_func_name, n))

if not isinstance(self.index, types.NoneType):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case if index is none hpat_pandas_series_head returns None. Maybe better temporarily (before fixing indexing) to raise exception in this case?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in case on index=None the function returns hpat_pandas_series_head_index_impl

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now it's so, but before it wasn't so: ad8a03b.

'head': lambda A, I, k, name: hpat.hiframes.api.init_series(A[:k], None, name),
# 'head': lambda A, I, k, name: hpat.hiframes.api.init_series(A[:k], None, name),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the change exactly needed? I think that affects nothing because you commented out head in old style.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes you right, I recommented it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@1e-to In this case tests on parallelism check will be broken. Please refer to any latest PRs, like PR #186

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shssf Should I delete existing tests on parallelism and write them in another style, or comment old?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@1e-to No, I don't think you need it


if not isinstance(self, SeriesType):
raise TypingError(
'{} The object must be a pandas.series. Given self: {}'.format(_func_name, self))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need Given self: here. Replace it with Given:

@shssf
Copy link
Contributor

shssf commented Oct 14, 2019

Need to redesign tests a bit
enable all tests
fix boxing/unboxing

@1e-to 1e-to requested a review from akharche October 17, 2019 09:48
Comment on lines 543 to 553
# Test: python -m hpat.runtests hpat.tests.test_series.TestSeries.test_series_head1
# Test: python -m hpat.runtests hpat.tests.test_series.TestSeries.test_series_head_default1
# Test: python -m hpat.runtests hpat.tests.test_series.TestSeries.test_series_head_index1
# Test: python -m hpat.runtests hpat.tests.test_series.TestSeries.test_series_head_index2
# Test: python -m hpat.runtests hpat.tests.test_series.TestSeries.test_series_head_index3
# Test: python -m hpat.runtests hpat.tests.test_series.TestSeries.test_series_head_index4
# Test: python -m hpat.runtests hpat.tests.test_series.TestSeries.test_series_head_parallel1
# Test: python -m hpat.runtests hpat.tests.test_series.TestSeries.test_series_head_index_parallel1
# Test: python -m hpat.runtests hpat.tests.test_series.TestSeries.test_series_head_index_parallel2
# Test: python -m hpat.runtests hpat.tests.test_series.TestSeries.test_series_head_noidx
# Test: python -m hpat.runtests hpat.tests.test_series.TestSeries.test_series_head_idx
Copy link
Contributor

@densmirn densmirn Oct 25, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look like too many tests listed in docstring. Maybe better to represent the list as the pattern with parameter -k: python -m hpat.runtests -k hpat.tests.test_series.TestSeries.test_series_head*

# Returns
# -------
# :obj:`pandas.Series`
# returns The first n rows of the caller object.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The -> the

Comment on lines 557 to 558
# n: :obj:`int`
# input argument, default 5
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's move default 5 after type of the parameter:

n: :obj:`int`, default 5

Comment on lines 565 to 582
# _func_name = 'Method head().'
#
# if not isinstance(self, SeriesType):
# raise TypingError('{} The object must be a pandas.series. Given: {}'.format(_func_name, self))
#
# if not isinstance(n, (types.Integer, types.Omitted)) and n != 5:
# raise TypingError('{} The parameter must be an integer type. Given type n: {}'.format(_func_name, n))
#
# if isinstance(self.index, types.NoneType):
# def hpat_pandas_series_head_impl(self, n=5):
# return pandas.Series(self._data[:n])
#
# return hpat_pandas_series_head_impl
# else:
# def hpat_pandas_series_head_index_impl(self, n=5):
# return pandas.Series(self._data[:n], self._index[:n])
#
# return hpat_pandas_series_head_index_impl
Copy link
Contributor

@densmirn densmirn Oct 25, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the implementation commented out?


hpat_func = hpat.jit(test_impl)

data_test = [[6, 6, 2, 1, 3, 3, 2, 1, 2],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I propose to use globally defined data as input data in tests: f2435b0#diff-deca39d332649cea819383154a5d2cb3R39-R62


hpat_func = hpat.jit(test_impl)

data_test = [[6, 6, 2, 1, 3, 3, 2, 1, 2],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same, could you use globally defined input data? If it's required the global data could be changed.


hpat_func_param1 = hpat.jit(test_impl_param)

for param1 in [0, 3, 10]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's rename param1 to n, because n is real name of parameter of the method.

hpat_func_param1 = hpat.jit(test_impl_param)

for param1 in [0, 3, 10]:
result_param1_ref = test_impl_param(S, param1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's enough to name such variable as ref_result or jit_result.

Comment on lines +2050 to +2052
for param1 in [1, 3, 7]:
result_param1_ref = test_impl_param(S, param1)
result_param1 = hpat_func_param1(S, param1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same:
param1 -> n
result_param1_ref -> ref_result
result_param1 -> jit_result

Comment on lines +2044 to +2045
result_ref = test_impl(S)
result = hpat_func(S)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's rename the variables:
result_ref -> ref_result
result -> jit_result


@unittest.skip("Broke another three tests")
def test_series_head_idx(self):
def test_impl(S):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we want to add implementation where series is constructed? I mean to test functionality without unboxing series.

return hpat_pandas_series_head_impl
else:
def hpat_pandas_series_head_index_impl(self, n=5):
return pandas.Series(self._data[:n], self._index[:n])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you construct output Series passing parameter name as self._name? It should be supported.

S = pd.Series(input_data)
for n in [1, 3, 2]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add negative and zero n in the testing.

result_jit = hpat_func(S, n)
pd.testing.assert_series_equal(result_jit, result_ref)

@unittest.skip("Not pass")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add more clear description why the test should be skipped?

hpat_func = hpat.jit(test_impl)
for input_data in test_global_input_data_integer64:
S = pd.Series(input_data)
for n in [2, 3]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add negative and zero n here and in other tests.

result_jit = hpat_func(S, n)
pd.testing.assert_series_equal(result_jit, result_ref)

@unittest.skip("Not pass")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needed more clear description.

Pandas Series method :meth:`pandas.Series.head` implementation.

.. only:: developer
Test: python -m -k hpat.runtests hpat.tests.test_series.TestSeries.test_series_head*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

python -m hpat.runtests -k hpat.tests.test_series.TestSeries.test_series_head*

@shssf shssf merged commit 619bacf into IntelPython:master Oct 25, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants