Dataframe.head() #363

1e-to · 2019-12-03T12:42:00Z

No description provided.

pep8speaks · 2019-12-03T12:42:07Z

Hello @1e-to! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

In the file sdc/hiframes/pd_dataframe_ext.py:

Line 1632:1: E305 expected 2 blank lines after class or function definition, found 1
Line 1632:1: E402 module level import not at top of file

In the file sdc/tests/test_dataframe.py:

Line 1187:28: E127 continuation line over-indented for visual indent

Comment last updated at 2019-12-16 14:02:14 UTC

shssf · 2019-12-03T16:21:28Z

sdc/datatypes/hpat_pandas_dataframe_functions.py

-        return pandas.Series(data=result_data, index=result_index)
-
-    return sdc_pandas_dataframe_count_impl
+if not sdc.config.use_default_dataframe:


I don't think this variable is needed here.
You can just delete the implementation of the count method (or keep it commented out if you need it).

shssf · 2019-12-03T16:22:31Z

sdc/datatypes/hpat_pandas_dataframe_functions.py

+        return sdc_pandas_dataframe_count_impl
+
+else:
+    def sdc_pandas_dataframe_reduce_columns(df, name, param):


Why do you need this extra function? Could you please put this implementation into @overload_method(DataFrameType, 'head')?

We do this to avoid duplicating the code that reduces dataframe columns.
The general function for reducing the columns of a dataframe is defined here:

sdc/sdc/datatypes/hpat_pandas_dataframe_functions.py

Line 103 in eeed327

def sdc_pandas_dataframe_reduce_columns(df, name, param):

shssf · 2019-12-03T16:23:25Z

sdc/datatypes/hpat_pandas_dataframe_functions.py

+            ty_checker.raise_exc(n, 'integer', 'n')
+
+    @overload_method(DataFrameType, 'head')
+    def median_overload(df, n=5):


Suggested change

-    def median_overload(df, n=5):
+    def sdc_pandas_dataframe_head(df, n=5):

akharche · 2019-12-05T13:57:08Z

sdc/datatypes/hpat_pandas_dataframe_functions.py

+
+        return _reduce_impl
+
+    def check_type(name, df, axis=None, skipna=None, level=None, numeric_only=None, ddof=1, min_count=0):


Is it a common function for checking all parameters? How is it related to df.head()?

akharche · 2019-12-11T13:23:38Z

sdc/datatypes/hpat_pandas_dataframe_functions.py

        return _reduce_impl

-    def check_type(name, df, axis=None, skipna=None, level=None, numeric_only=None, ddof=1, min_count=0):
+    def check_type(name, df, axis=None, skipna=None, level=None, numeric_only=None, ddof=1, min_count=0, n=5):


Is this function specific to the head method? Maybe it is not necessary to separate it out for a single case?

It's a common function for all the cases that are handled in it.

akharche · 2019-12-11T13:26:51Z

sdc/tests/test_dataframe.py

    def test_df_fillna1(self):
        def test_impl(df):
-            return df.fillna(5.0)
+            return df.fillna(0.)


What is the reason to change it?

It belongs to another PR; I forgot to delete this.

akharche · 2019-12-11T13:49:21Z

sdc/tests/test_dataframe.py

+                           "STRING": ['a', 'dd', 'c', '12', 'ddf']})
+        pd.testing.assert_frame_equal(sdc_func(df), test_impl(df))
+
+    def test_dataframe_head1(self):


What if the index is set explicitly? Does this solution work for that case? I don't see any tests for it.

akharche · 2019-12-13T14:21:40Z

sdc/datatypes/hpat_pandas_dataframe_functions.py

        name = 'head'

-        check_type(name, df, n=n)
+        check_type(name, df)


This is unnecessary overhead here.

densmirn · 2019-12-16T11:01:41Z

sdc/datatypes/hpat_pandas_dataframe_functions.py

+        if len(params) > 0:
+            space.append(', ')
+        func_definition = 'def _reduce_impl(df{}{}):'.format("".join(space), ", ".join(
+            str(key) + '=' + str(value) for key, value in params))


Why not use formatted strings or f-strings instead of concatenation?
str(key) + '=' + str(value) -> f'{key}={value}' or '{}={}'.format(key, value)
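
For illustration only, the three spellings mentioned above produce the same text. The `params` value mirrors the `[('n', 5)]` list used elsewhere in this PR; the variable names are chosen just for this sketch.

```python
# Parameter list in the shape used by the PR: (name, default) pairs.
params = [('n', 5)]

# Concatenation, as in the original diff.
concatenated = ", ".join(str(key) + '=' + str(value) for key, value in params)

# str.format and f-string spellings proposed in the review.
formatted = ", ".join('{}={}'.format(key, value) for key, value in params)
interpolated = ", ".join(f'{key}={value}' for key, value in params)

print(concatenated, formatted, interpolated)
```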

densmirn · 2019-12-16T11:09:25Z

sdc/datatypes/hpat_pandas_dataframe_functions.py

+        space = []
+        if len(params) > 0:
+            space.append(', ')
+        func_definition = 'def _reduce_impl(df{}{}):'.format("".join(space), ", ".join(


I would propose doing it like this:

all_params = ['df'] + [f'{key}={value}' for key, value in params]
func_definition = ', '.join(all_params)

densmirn · 2019-12-16T11:11:31Z

sdc/datatypes/hpat_pandas_dataframe_functions.py

+        if not (isinstance(n, (types.Omitted, types.Integer)) or n == 5):
+            ty_checker.raise_exc(n, 'int64', 'n')
+
+        params = [('n', 5)]


You could pass [('n', 5)] to the sdc_pandas_dataframe_reduce_columns_series directly without creating variable params.

densmirn · 2019-12-16T13:43:13Z

sdc/datatypes/hpat_pandas_dataframe_functions.py

+        n_cols = len(saved_columns)
+        data_args = tuple('data{}'.format(i) for i in range(n_cols))
+        all_params = ['df'] + [f'{key}={value}' for key, value in params]
+        func_definition = "def _reduce_impl(" + ', '.join(all_params) + "):"


Let's use formatted strings or f-strings instead of concatenation.
func_definition = 'def _reduce_impl({}):'.format(', '.join(all_params))
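
A minimal sketch of how the signature line could be assembled this way, including the case with no extra parameters that the earlier `space` list was handling. `build_signature` is a hypothetical helper name used only here, not a function from the PR.

```python
def build_signature(params):
    """Return the `def` line for the generated `_reduce_impl` function.

    `params` is a list of (name, default) pairs, e.g. [('n', 5)].
    `build_signature` is a hypothetical name used only for this sketch.
    """
    all_params = ['df'] + [f'{key}={value}' for key, value in params]
    return 'def _reduce_impl({}):'.format(', '.join(all_params))


with_n = build_signature([('n', 5)])
without_params = build_signature([])
print(with_n)
print(without_params)
```

Because `'df'` is always the first element, the separator logic needs no special case for an empty `params`. A string default would need `{value!r}` instead of `{value}` to keep its quotes in the generated source.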

densmirn · 2019-12-16T13:45:26Z

sdc/datatypes/hpat_pandas_dataframe_functions.py

+            line = '  {} = sdc.hiframes.api.init_series(sdc.hiframes.pd_dataframe_ext.get_dataframe_data(df, {}))'
+            func_lines.append(line.format(d + '_S', i))
+            func_lines.append('  {} = {}.{}({})'.format(d + '_O', d + '_S', name, ", ".join(
+                str(key) for key, value in params)))


'{} = {}.{}({})'.format(d + '_O', d + '_S' -> '{}_O = {}_S.{}({})'.format(d, d
str(key) -> key
value -> _

densmirn · 2019-12-16T13:47:36Z

sdc/datatypes/hpat_pandas_dataframe_functions.py

+                str(key) for key, value in params)))
+        func_lines.append("  return sdc.hiframes.pd_dataframe_ext.init_dataframe({}, None, {})\n".format(
+            ", ".join(d + '_O._data' for d in data_args),
+            ", ".join("'" + c + "'" for c in saved_columns)))


Let's use formatted strings or f-strings. E.g. f"'{c}'"

densmirn · 2019-12-16T14:19:40Z

sdc/datatypes/hpat_pandas_dataframe_functions.py

+
+        return _reduce_impl
+
+    def check_type(name, df, axis=None, skipna=None, level=None, numeric_only=None, ddof=1, min_count=0):


Is check_type used anywhere?

densmirn · 2019-12-16T14:20:22Z

sdc/datatypes/hpat_pandas_dataframe_functions.py

+            func_lines.append('  {}_O = {}_S.{}({})'.format(d, d, name, ", ".join(
+                key for key, _ in params)))
+        func_lines.append("  return sdc.hiframes.pd_dataframe_ext.init_dataframe({}, None, {})\n".format(
+            ", ".join(d + '_O._data' for d in data_args),


You forgot to use f-string here.

densmirn · 2019-12-16T14:21:46Z

sdc/datatypes/hpat_pandas_dataframe_functions.py

+            func_lines.append(line.format(d + '_S', i))
+            func_lines.append('  {}_O = {}_S.{}({})'.format(d, d, name, ", ".join(
+                key for key, _ in params)))
+        func_lines.append("  return sdc.hiframes.pd_dataframe_ext.init_dataframe({}, None, {})\n".format(


Do you really need \n at the end?
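
As a quick check of this point, assuming the generated lines are joined with '\n' and passed to `exec` (the diff shown here does not include that step), Python 3 compiles the source without a trailing newline. The function body below is a toy example, not the PR's generated code.

```python
func_lines = [
    'def _reduce_impl(df, n=5):',
    '  return df[:n]',  # last line has no trailing newline
]
source = '\n'.join(func_lines)

namespace = {}
exec(source, namespace)  # compiles without a final newline
result = namespace['_reduce_impl'](list(range(10)), n=3)
print(result)
```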

densmirn · 2019-12-16T14:22:22Z

sdc/datatypes/hpat_pandas_dataframe_functions.py

+        func_lines = [func_definition]
+        for i, d in enumerate(data_args):
+            line = '  {} = sdc.hiframes.api.init_series(sdc.hiframes.pd_dataframe_ext.get_dataframe_data(df, {}))'
+            func_lines.append(line.format(d + '_S', i))


Please use f-string or formatted string instead of concatenation.

elena.totmenina added 2 commits November 29, 2019 12:49

Init dataframe

bf8c023

Implement Dataframe.head()

c966f8e

1e-to requested a review from densmirn December 3, 2019 12:42

1e-to and others added 2 commits December 3, 2019 19:42

Merge branch 'master' into dataframe

eef527d

Fix codestyle

d036cc7

shssf reviewed Dec 3, 2019

akharche suggested changes Dec 5, 2019

1e-to added the [WIP] Work in progress label Dec 5, 2019

shssf mentioned this pull request Dec 5, 2019

add reduce #366

Merged

elena.totmenina and others added 3 commits December 11, 2019 14:23

impl dataframe.head

12d8f2c

codestyle

a1872ad

Merge branch 'master' into dataframe

2c1bd6b

1e-to requested a review from AlexanderKalistratov December 11, 2019 11:29

1e-to added Ready for Review and removed [WIP] Work in progress labels Dec 11, 2019

akharche reviewed Dec 11, 2019

elena.totmenina and others added 4 commits December 12, 2019 15:12

Fix type check

655bba3

Merge

f6bb9c6

Merge branch 'master' into dataframe

0356771

Merge remote-tracking branch 'origin/dataframe' into dataframe

b26ac5f

akharche suggested changes Dec 13, 2019

densmirn suggested changes Dec 16, 2019

little fixes

fe0490e

1e-to force-pushed the dataframe branch from 255d6e2 to fe0490e Compare December 16, 2019 13:29

Merge branch 'master' into dataframe

c4be0ab

densmirn reviewed Dec 16, 2019

format fix

785b80e

densmirn reviewed Dec 16, 2019

1e-to closed this Mar 2, 2020

1e-to deleted the dataframe branch March 2, 2020 12:15


5 participants
