Argmin/argmax + nanargmin/nanargmax #580

1e-to · 2020-02-07T09:03:14Z

pep8speaks · 2020-02-07T09:03:21Z

Hello @1e-to! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-02-19 12:48:19 UTC

AlexanderKalistratov · 2020-02-07T09:09:17Z

sdc/functions/numpy_like.py

+    if isinstance(dtype, types.Number):
+        def sdc_nanargmin_impl(self):
+            res = max_ref
+            position = max_


What is max_?

AlexanderKalistratov · 2020-02-07T09:15:44Z

sdc/functions/numpy_like.py

+            position = max_
+            length = len(self)
+            for i in prange(length):
+                if min(res, self[i]) == self[i]:


That looks weird.

    if not isnan(self[i]):
        res = min(res, self[i])
        if res == self[i]:
            position = min(position, i)
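
To illustrate why the `min(res, self[i]) == self[i]` check is hard to reason about once NaN is involved, here is a small sketch in plain Python floats. It only shows the comparison semantics of the interpreter; compiled Numba code may behave differently, and `is_new_min` is a hypothetical helper name, not something from the PR.

```python
import math

nan = float("nan")

# Every ordered comparison with NaN is False, so the two-argument min()
# keeps its first argument whenever either argument is NaN.
print(min(1.0, nan))  # 1.0
print(min(nan, 1.0))  # nan


def is_new_min(res, value):
    """The check from the diff: does `value` win against the running `res`?"""
    return min(res, value) == value


# A NaN candidate never wins, and a NaN running result is never replaced,
# because NaN compares unequal to everything (including itself).
print(is_new_min(1.0, nan))  # False
print(is_new_min(nan, 1.0))  # False
print(is_new_min(2.0, 1.0))  # True
```

An explicit `isnan` test states the same intent directly instead of relying on these comparison rules.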

AlexanderKalistratov · 2020-02-07T09:22:48Z

Python version is 10000 times faster on one thread? Am I getting it right?

AlexanderKalistratov · 2020-02-07T09:24:05Z

@PokhodenkoSA I can't find previous measurements of argmax/argmin anywhere. Do we have it?

1e-to · 2020-02-07T09:41:27Z

Python version is 10000 times faster on one thread? Am I getting it right?

Yes, it is. Perhaps Python finds the first NaN and returns it through the min(x, y) function.

I can't find previous measurements of argmax/argmin anywhere. Do we have it?

Currently we use the argmin that is labeled "Numba" in the results table.

AlexanderKalistratov · 2020-02-07T09:42:52Z

sdc/functions/numpy_like.py

+            position = max_
+            length = len(self)
+            for i in prange(length):
+                if not isnan(self[i]):


Let's try it this way:

    result_is_nan = False
    if not isnan(self[i]):
        if not result_is_nan:
            res = min(res, self[i])
            if res == self[i]:
                position = min(position, i)
    else:
        position = min(position, i)
        result_is_nan = True

No, we need different positions for the NaN result and for the non-NaN result:

    result_is_nan = False
    nanposition = max_
    if not isnan(self[i]):
        if not result_is_nan:
            res = min(res, self[i])
            if res == self[i]:
                position = min(position, i)
    else:
        nanposition = min(nanposition, i)
        result_is_nan = True
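
As a rough, sequential sketch of the idea (separate positions for the NaN and non-NaN cases), the following plain-Python function returns the index of the first NaN if one is present and the index of the minimum otherwise, which is how `numpy.argmin` treats NaN. The function name `argmin_with_nan` is hypothetical, and the update rule follows the later revision of the diff in this thread, where a strictly smaller value resets the position instead of taking `min(position, i)`.

```python
import math


def argmin_with_nan(values):
    """Index of the minimum; if any NaN is present, index of the first NaN."""
    if len(values) == 0:
        raise ValueError("attempt to get argmin of an empty sequence")

    res = math.inf
    position = len(values)      # sentinel: larger than any valid index
    nanposition = len(values)   # sentinel for the first NaN seen
    result_is_nan = False

    for i, value in enumerate(values):
        if not math.isnan(value):
            if not result_is_nan:
                if value < res:
                    # Strictly smaller value: reset both result and position.
                    res = value
                    position = i
                elif value == res:
                    # Tie: keep the smallest index.
                    position = min(position, i)
        else:
            nanposition = min(nanposition, i)
            result_is_nan = True

    return nanposition if result_is_nan else position


print(argmin_with_nan([3.0, 1.0, 2.0]))                 # 1
print(argmin_with_nan([2.0, float("nan"), 1.0]))        # 1
```

In a `prange` version each iteration cannot rely on a shared running `res`, so this sketch only documents the intended result, not the parallel implementation.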

AlexanderKalistratov · 2020-02-07T09:44:27Z

Python version is 10000 times faster on one thread? Am I getting it right?

Yes, it is. Perhaps Python finds the first NaN and returns it through the min(x, y) function.

Can we also test without NaNs?

AlexanderKalistratov · 2020-02-17T21:52:23Z

sdc/functions/numpy_like.py

+                res = min_ref
+                pos = max_int64
+                for j in range(chunk.start, chunk.stop):
+                    if max(res, self[j]) == self[j]:


Since we are using chunks now, can't we simplify it just to:

    if self[j] > res:
        res = self[j]
        pos = j
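
A minimal sketch of why the strict comparison is enough with chunks: inside one chunk the indices are visited in increasing order, so `>` already keeps the first occurrence, and ties only need to be resolved when the per-chunk results are combined. This is plain sequential Python, not the PR's `prange` code; `chunk_bounds` and `chunked_argmax` are hypothetical names, and the input is assumed to contain no NaN.

```python
def chunk_bounds(length, n_chunks):
    """Split range(length) into contiguous (start, stop) pairs."""
    size = -(-length // n_chunks)  # ceiling division
    return [(start, min(start + size, length))
            for start in range(0, length, size)]


def chunked_argmax(values, n_chunks=4):
    """Argmax computed per chunk, then reduced over the chunk results."""
    if len(values) == 0:
        raise ValueError("attempt to get argmax of an empty sequence")

    partial = []
    for start, stop in chunk_bounds(len(values), n_chunks):
        res = values[start]
        pos = start
        for j in range(start + 1, stop):
            # Indices grow inside a chunk, so a strict '>' keeps the
            # first occurrence of the maximum without any min(pos, j).
            if values[j] > res:
                res = values[j]
                pos = j
        partial.append((res, pos))

    # Combine step: chunks may finish in any order in a parallel run,
    # so equal values are resolved by the smaller position here.
    best_res, best_pos = partial[0]
    for res, pos in partial[1:]:
        if res > best_res or (res == best_res and pos < best_pos):
            best_res, best_pos = res, pos
    return best_pos


print(chunked_argmax([1, 5, 3, 5, 2], n_chunks=2))  # 1
```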

AlexanderKalistratov · 2020-02-17T21:53:45Z

Could you please consider whether the argmin/argmax implementation can be generalized in the same way it was done in #541?

@PokhodenkoSA has details

AlexanderKalistratov · 2020-02-19T07:52:37Z

any performance numbers?

1e-to · 2020-02-19T07:54:35Z

any performance numbers?

skipna=True and skipna=False give equal results:

PokhodenkoSA · 2020-02-19T08:00:28Z

sdc/tests/tests_perf/test_perf_numpy.py

        CE(type_='Numba', code='data.astype(np.int64)', jitted=True),
        CE(type_='SDC', code='sdc.functions.numpy_like.astype(data, np.int64)', jitted=True),
    ], usecase_params='data'),
+    TC(name='nanargmin', size=[10 ** 7], call_expr=[


Why didn't you add Numba cases for np.nanargmin and np.nanargmax?

Because they don't exist. Numba doesn't have these methods.

densmirn · 2020-02-19T08:04:06Z

sdc/datatypes/hpat_pandas_series_functions.py

+        if isinstance(self.data, StringArrayType):
+            def hpat_pandas_series_idxmax_impl(self, axis=None, skipna=None):
+                if skipna is None:
+                    _skipna = True
+                else:
+                    raise ValueError("Method idxmax(). Unsupported parameter 'skipna'=False with str data")
+
+                return numpy.argmax(self._data)
+
+            return hpat_pandas_series_idxmax_impl
+
+        def hpat_pandas_series_idxmax_impl(self, axis=None, skipna=None):
+            # return numpy.argmax(self._data)
+            if skipna is None:
+                _skipna = True
+            else:
+                _skipna = skipna
+
+            if _skipna:
+                return numpy_like.nanargmax(self._data)
+
+            return numpy_like.argmax(self._data)


I would create a variable none_index = isinstance(self.index, types.NoneType) or self.index is None, then make the hpat_pandas_series_idxmax_impl implementations common to both cases, with and without an index. In the implementations I would add something like this:

    if none_index == True:  # noqa
        return result
    else:
        return self._index[int(result)]
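
The suggestion amounts to translating a position into an index label when an index exists. A minimal plain-Python sketch of that mapping follows; `idxmax_label` is a hypothetical name, and the branch here is taken at run time, whereas in the PR `none_index` would be decided when the overload is compiled.

```python
def idxmax_label(data, index=None):
    """Position of the first maximum, or its label when an index is given."""
    # max() with a key returns the first position that holds the maximum.
    result = max(range(len(data)), key=data.__getitem__)

    if index is None:
        return result
    return index[int(result)]


print(idxmax_label([1.0, 4.0, 2.0]))                         # 1
print(idxmax_label([1.0, 4.0, 2.0], index=["a", "b", "c"]))  # b
```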

The same is applicable for idxmin.

densmirn · 2020-02-19T08:39:03Z

sdc/functions/numpy_like.py

+                        if reduce_op(res, self[j]) == self[j]:
+                            if not isnan(self[j]):
+                                if res == self[j]:
+                                    pos = min(pos, j)
+                                else:
+                                    pos = j
+                                    res = self[j]


Let's reduce the indentation:

    if reduce_op(res, self[j]) != self[j]:
        continue
    if isnan(self[j]):
        continue
    if res == self[j]:
        pos = min(pos, j)
    else:
        pos = j
        res = self[j]
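
For reference, the guard-clause shape can be sketched as a single sequential function parameterized by `reduce_op`, so that `min` gives a nanargmin and `max` gives a nanargmax. This is an illustration in plain Python, not the PR's chunked implementation; the names `nanarg`, `nanargmin` and `nanargmax` are local to the sketch, and raising `ValueError` on all-NaN input is an assumption modeled on `numpy.nanargmin`.

```python
import math


def nanarg(values, reduce_op):
    """Position of the NaN-ignoring extreme selected by reduce_op (min or max)."""
    res = None
    pos = None
    for j, value in enumerate(values):
        if math.isnan(value):
            continue
        if pos is not None and reduce_op(res, value) != value:
            continue  # value does not beat the running result
        if pos is not None and res == value:
            pos = min(pos, j)  # tie: keep the smallest position
        else:
            pos = j
            res = value
    if pos is None:
        raise ValueError("all-NaN or empty input")
    return pos


def nanargmin(values):
    return nanarg(values, min)


def nanargmax(values):
    return nanarg(values, max)


nan = float("nan")
print(nanargmin([nan, 3.0, 1.0, 1.0]))  # 2
print(nanargmax([nan, 3.0, 1.0, 3.0]))  # 1
```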

densmirn · 2020-02-19T08:39:30Z

sdc/functions/numpy_like.py

+                    if reduce_op(general_res, arr_res[i]) == arr_res[i]:
+                        if general_res == arr_res[i]:


Let's reduce the indentation.

densmirn · 2020-02-19T08:43:05Z

sdc/tests/test_sdc_numpy.py

+        def sdc_impl(a):
+            return numpy_like.argmin(a)
+
+        sdc_func = self.jit(sdc_impl)


Let's combine that into:

    @self.jit
    def sdc_impl(a):
        return numpy_like.argmin(a)

This change is applicable to the other tests as well.

AlexanderKalistratov · 2020-02-19T14:49:05Z

@1e-to conflict

Argmin/argmax + nanargmin/nanargmax

b069bb3

1e-to requested review from AlexanderKalistratov and PokhodenkoSA February 7, 2020 09:03

AlexanderKalistratov reviewed Feb 7, 2020


elena.totmenina added 4 commits February 7, 2020 15:53

Remove max and min int

0a773a7

astype tests fix

75e9186

fix overload names

cd4c2ac

Merge branch 'master' of https://github.com/IntelPython/sdc into argmin

cd9ebbf

1e-to force-pushed the argmin branch from 5dfaad9 to cd9ebbf Compare February 17, 2020 08:19

rewrite with prange

3d335df

AlexanderKalistratov reviewed Feb 17, 2020


elena.totmenina added 6 commits February 18, 2020 12:42

final fix

48313c4

reduce op

e957c9c

Merge branch 'master' of https://github.com/IntelPython/sdc into argmin

d8d1024

new prange utils

d786389

Merge branch 'master' of https://github.com/IntelPython/sdc into argmin

aeb2e2e

Call numpy-like arg

20d7073

PokhodenkoSA reviewed Feb 19, 2020


densmirn reviewed Feb 19, 2020


elena.totmenina added 2 commits February 19, 2020 15:47

optimize algo

fc27331

Merge branch 'master' of https://github.com/IntelPython/sdc into argmin

f538494

densmirn approved these changes Feb 19, 2020


Merge branch 'master' into argmin

4fdb61c

AlexanderKalistratov merged commit d529471 into IntelPython:master Feb 19, 2020
