Series.dropna scalable draft #604

1e-to · 2020-02-14T13:45:08Z

pep8speaks · 2020-02-14T13:45:13Z

Hello @1e-to! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-02-19 07:25:08 UTC

AlexanderKalistratov · 2020-02-14T14:58:07Z

sdc/functions/numpy_like.py

    return nanprod_impl
+
+
+def get_pool_size():


Let's move this file to something like prange_utils.py

AlexanderKalistratov · 2020-02-14T16:02:55Z

sdc/functions/numpy_like.py

+    if pool_size == 0:
+        pool_size = get_pool_size()
+
+    chunk_size = size//pool_size + 1


Suggested change

-    chunk_size = size//pool_size + 1
+    chunk_size = (size - 1)//pool_size + 1

This is the correct formula.
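To see why the two formulas differ, here is a minimal sketch in plain Python. The helper names `chunk_size_original`, `chunk_size_fixed`, and `split_range` are hypothetical and are not taken from the PR; they only illustrate the arithmetic. The suggested form is ceiling division, while the original form is one too large whenever `size` is an exact multiple of `pool_size`.

```python
def chunk_size_original(size, pool_size):
    # Formula from the diff: one too large when size % pool_size == 0.
    return size // pool_size + 1


def chunk_size_fixed(size, pool_size):
    # Suggested formula: ceiling division, i.e. ceil(size / pool_size).
    return (size - 1) // pool_size + 1


def split_range(size, pool_size, chunk_size):
    # Hypothetical helper: cut [0, size) into pool_size half-open (start, stop) pairs.
    chunks = []
    for i in range(pool_size):
        start = min(i * chunk_size, size)
        stop = min((i + 1) * chunk_size, size)
        chunks.append((start, stop))
    return chunks


if __name__ == "__main__":
    print(split_range(8, 4, chunk_size_original(8, 4)))
    print(split_range(8, 4, chunk_size_fixed(8, 4)))
```

With `size=8` and `pool_size=4`, the original formula gives a chunk size of 3 and leaves the last chunk empty as `(8, 8)`, while the suggested formula gives four chunks of 2 elements each.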

AlexanderKalistratov · 2020-02-17T08:14:49Z

sdc/utilities/prange_utils.py

+        if pool_size == 0:
+            pool_size = get_pool_size()
+
+        chunk_size = size//pool_size + 1


(size - 1)//pool_size + 1

AlexanderKalistratov

👍

AlexanderKalistratov · 2020-02-17T21:34:53Z

sdc/functions/numpy_like.py

+        result_index = numpy.empty(shape=length, dtype=dtype_idx)
+        for i in prange(len(chunks)):
+            chunk = chunks[i]
+            if i == 0:


    new_start = sum(arr_len[0:i])
    new_stop = new_start + arr_len[i]
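The suggestion above derives each chunk's output window from the per-chunk counts of kept elements. A minimal sketch, assuming `arr_len[i]` holds the number of non-NaN values in chunk `i` (the name comes from the diff; the function `output_window` is hypothetical):

```python
def output_window(arr_len, i):
    # Chunk i writes right after everything kept by chunks 0..i-1.
    new_start = sum(arr_len[0:i])
    new_stop = new_start + arr_len[i]
    return new_start, new_stop


if __name__ == "__main__":
    arr_len = [2, 0, 3]  # illustrative counts, not measured data
    for i in range(len(arr_len)):
        print(i, output_window(arr_len, i))
```

Because the sum of an empty slice is 0, the first chunk gets `new_start == 0` without a separate `i == 0` branch, which appears to be the point of the suggestion.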

AlexanderKalistratov · 2020-02-17T21:40:39Z

sdc/functions/numpy_like.py

+            for j in range(chunk.start, chunk.stop):
+                if new_start < new_stop:
+                    if not isnan(arr[j]):
+                        result_data[new_start] = arr[j]


It is better to introduce a new variable, something like current_pos. Like this:

    current_pos = new_start
    for j in range(chunk.start, chunk.stop):
        if current_pos < new_stop:
            if not isnan(arr[j]):
                result_data[current_pos] = arr[j]
                result_index[current_pos] = idx[j]
                current_pos += 1

It is confusing that you are always writing to new_start.
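A self-contained version of the suggested loop, as a sketch in plain Python. It assumes `arr` and `idx` are equal-length sequences, uses `chunk_start`/`chunk_stop` in place of `chunk.start`/`chunk.stop`, and uses `math.isnan` where the PR uses its own `isnan`. The function name `copy_chunk` is hypothetical.

```python
from math import isnan


def copy_chunk(arr, idx, chunk_start, chunk_stop, new_start, new_stop,
               result_data, result_index):
    # new_start stays fixed; current_pos is the moving write cursor.
    current_pos = new_start
    for j in range(chunk_start, chunk_stop):
        if current_pos < new_stop:
            if not isnan(arr[j]):
                result_data[current_pos] = arr[j]
                result_index[current_pos] = idx[j]
                current_pos += 1
    return current_pos


if __name__ == "__main__":
    arr = [1.0, float("nan"), 3.0, 4.0]
    idx = [10, 11, 12, 13]
    data = [0.0] * 3
    index = [0] * 3
    copy_chunk(arr, idx, 0, 4, 0, 3, data, index)
    print(data, index)
```

Keeping `new_start` unchanged means the window boundaries stay readable after the loop, and the returned cursor shows how many slots were actually filled.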

AlexanderKalistratov · 2020-02-17T21:40:56Z

sdc/functions/numpy_like.py

+                new_stop = new_start + arr_len[i]
+
+            for j in range(chunk.start, chunk.stop):
+                if new_start < new_stop:


Not sure if you actually need this condition
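One way to check whether the bound test is needed: if `arr_len[i]` is exactly the number of non-NaN values in chunk `i`, the write cursor cannot pass `new_stop`. The sketch below, in plain Python with hypothetical helper names, drops the condition and asserts that the cursor lands exactly on `new_stop`. It assumes the counting pass and the copying pass use the same NaN test.

```python
from math import isnan


def count_valid(arr, start, stop):
    # Number of non-NaN values in arr[start:stop].
    return sum(1 for j in range(start, stop) if not isnan(arr[j]))


def copy_without_bound_check(arr, start, stop, new_start, out):
    # Same copy loop, but with no "pos < new_stop" guard.
    pos = new_start
    for j in range(start, stop):
        if not isnan(arr[j]):
            out[pos] = arr[j]
            pos += 1
    return pos


if __name__ == "__main__":
    nan = float("nan")
    arr = [nan, 1.0, 2.0, nan, 5.0]
    chunks = [(0, 2), (2, 4), (4, 5)]  # illustrative split
    arr_len = [count_valid(arr, a, b) for a, b in chunks]
    out = [0.0] * sum(arr_len)
    for i, (a, b) in enumerate(chunks):
        new_start = sum(arr_len[0:i])
        end = copy_without_bound_check(arr, a, b, new_start, out)
        assert end == new_start + arr_len[i]
    print(out)
```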

AlexanderKalistratov · 2020-02-17T21:41:35Z

There are merge conflicts.

AlexanderKalistratov · 2020-02-19T08:37:57Z

Could you please remeasure performance?

1e-to · 2020-02-19T12:14:25Z

> Could you please remeasure performance?

It's still the same as in the first picture.

elena.totmenina added 3 commits February 14, 2020 12:10

wip

51cbb74

Series.dropna draft

398497b

Merge branch 'master' of https://github.com/IntelPython/sdc into drop

8c8ff1a

1e-to requested a review from AlexanderKalistratov February 14, 2020 13:45

AlexanderKalistratov reviewed Feb 14, 2020


add check parallel

6b50702

AlexanderKalistratov reviewed Feb 14, 2020


elena.totmenina added 2 commits February 17, 2020 10:58

Remove prange utils to other file

2b3e793

pep

3806bc5

AlexanderKalistratov reviewed Feb 17, 2020


chunk size fix

cf92c1f

AlexanderKalistratov reviewed Feb 17, 2020


elena.totmenina added 4 commits February 18, 2020 17:55

small fixes

25e0f9f

new prange utils

2806c48

Merge branch 'master' of https://github.com/IntelPython/sdc into drop

8460786

fix prange utils call

ed7e4a5

PokhodenkoSA approved these changes Feb 19, 2020


AlexanderKalistratov merged commit 6b53d4c into IntelPython:master Feb 19, 2020
