Skip to content
This repository has been archived by the owner on Feb 2, 2024. It is now read-only.

Refactor Series.fillna() in a new style and add more tests #251

Merged

Conversation

kozlov-alexey
Copy link
Contributor

No description provided.

if not isinstance(self, SeriesType):
raise TypingError('{} The object must be a pandas.series. Given: {}'.format(_func_name, self))

if not (isinstance(axis, (types.Integer, types.UnicodeType, types.Omitted)) or axis is None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's additionally check axis type on StringLiteral.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


if not (isinstance(inplace, types.Literal) and isinstance(inplace, types.Boolean)
or isinstance(inplace, types.Omitted)
or inplace == False):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More correct to compare booleans via operator is, inplace is False.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Comment on lines +2272 to +2614
if not ((method is None or isinstance(method, types.Omitted))
and (limit is None or isinstance(limit, types.Omitted))
and (downcast is None or isinstance(downcast, types.Omitted))
):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about splitting this condition on 3 independent ones? At the least it would look simpler.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@densmirn I don't really think it's better, cause we'll duplicate raise statement, but I agree the check can be simplified if we, e.g. use a function like this:

def is_arg_omitted(arg, default_value):

    has_default_value = (arg is default_value) if (default_value is None) else (arg == default_value)
    return has_default_value or isinstance(arg, types.Omitted)

Can you maybe incorporate it into your TypeChecker in #241?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can update TypeChecker with the change, but with other commit (maybe with this one) and after TypeChecker pushed to master.

# do operation inplace, fill the NA/NaNs in the same array and return None
if isinstance(self.dtype, types.UnicodeType):
# TODO: StringArrayType cannot resize inplace, and assigning a copy back to self._data is not possible now
return None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe better to raise exception here something like unsupported...?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

def test_series_fillna_axis2(self):
'''Verifies Series.fillna() implementation handles 0 as axis argument'''
def test_impl(S):
print("Original series:\n", S)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove debug print.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 1036 to 1037
axis_values = [0, 'index']
for value in axis_values:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not merge the lines:
for axis in [0, 'index']:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@kozlov-alexey
Copy link
Contributor Author

@Hardcode84 @AlexanderKalistratov I'm going to fix remarks can you please review too?

@@ -1033,8 +1032,7 @@ def test_impl(S, axis):
hpat_func = hpat.jit(test_impl)

S = pd.Series([1.0, 2.0, np.nan, 1.0, np.inf])
axis_values = [0, 'index']
for value in axis_values:
for value in [0, 'index']:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would name the loop variable as axis.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's changed too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shssf @Hardcode84 Can you please approve and merge if you have no more comments?

@@ -2512,47 +2513,121 @@ def hpat_pandas_series_median_impl(self, axis=None, skipna=True, level=None, num
return hpat_pandas_series_median_impl


@overload_method(SeriesType, 'dropna')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I don't understand. You renamed series.dropna to series.fillna? What happend with series.dropna?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was an automatic merge issue, sorry, fixed it.


else:
def hpat_pandas_series_fillna_impl(self, value=None, method=None, axis=None, inplace=False, limit=None, downcast=None):
na_data_arr = hpat.hiframes.api.get_nan_mask(self._data)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would vote for moving such function into some common place. These should be definitely not in api module.
But we could make it later of course.

@shssf shssf added FullyApproved Mark PR if all required approval exists and removed Ready for Review labels Oct 31, 2019
@shssf shssf merged commit d737575 into IntelPython:master Oct 31, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FullyApproved Mark PR if all required approval exists
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants