-
Notifications
You must be signed in to change notification settings - Fork 62
Fix prange utils #615
Fix prange utils #615
Conversation
sdc/utilities/prange_utils.py
Outdated
| return chunks | ||
|
|
||
| return get_chunks_impl | ||
| return get_chunks_impl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| return get_chunks_impl | |
| return get_chunks_impl | |
densmirn
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's hang @sdc_register_jitable on get_chunks and call the function inside of the overload implementation to avoid code duplication.
@sdc_overload(get_chunks)
def get_chunks_overload(size, pool_size=0):
def get_chunks_impl(size, pool_size=0):
return get_chunks(size, pool_size=pool_size)
return get_chunks_impl |
sdc/utilities/prange_utils.py
Outdated
| if i == pool_size - 1: | ||
| rest = size - size // pool_size | ||
| stop = min((i + 1) * chunk_size + rest, size) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe to limit the loop for i in range(pool_size - 1) and handle the latest step outside of the loop.
sdc/utilities/prange_utils.py
Outdated
| # ***************************************************************************** | ||
| # Copyright (c) 2020, Intel Corporation All rights reserved. | ||
| # | ||
| # Redistribution and use in source and binary forms, with or without | ||
| # modification, are permitted provided that the following conditions are met: | ||
| # | ||
| # Redistributions of source code must retain the above copyright notice, | ||
| # this list of conditions and the following disclaimer. | ||
| # | ||
| # Redistributions in binary form must reproduce the above copyright notice, | ||
| # this list of conditions and the following disclaimer in the documentation | ||
| # and/or other materials provided with the distribution. | ||
| # | ||
| # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" | ||
| # AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, | ||
| # THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR | ||
| # PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR | ||
| # CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, | ||
| # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, | ||
| # PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; | ||
| # OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, | ||
| # WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR | ||
| # OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, | ||
| # EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. | ||
| # ***************************************************************************** | ||
|
|
||
|
|
||
| import numba | ||
| import sdc | ||
|
|
||
| from typing import NamedTuple | ||
| from sdc.utilities.utils import sdc_overload | ||
| from sdc.utilities.utils import sdc_register_jitable | ||
|
|
||
|
|
||
| class Chunk(NamedTuple): | ||
| start: int | ||
| stop: int | ||
|
|
||
|
|
||
| def get_pool_size(): | ||
| if sdc.config.config_use_parallel_overloads: | ||
| return numba.config.NUMBA_NUM_THREADS | ||
| else: | ||
| return 1 | ||
|
|
||
|
|
||
| @sdc_overload(get_pool_size) | ||
| def get_pool_size_overload(): | ||
| pool_size = get_pool_size() | ||
|
|
||
| def get_pool_size_impl(): | ||
| return pool_size | ||
|
|
||
| return get_pool_size_impl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicated code.
sdc/utilities/prange_utils.py
Outdated
| from sdc.utilities.utils import sdc_overload | ||
| from sdc.utilities.utils import sdc_register_jitable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These lines could be combined to a single one.
sdc/utilities/prange_utils.py
Outdated
| pool_size = get_pool_size() | ||
|
|
||
| chunk_size = (size - 1) // pool_size + 1 | ||
| chunk_size = size // pool_size |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's not valid.
What are you doing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After closer look, it is valid, but a bad idea.
Let's assume, you have 56 threads and task size 55. In your implementation it wouldn't be paralleled at all. All 55 would be computed by the last chunk
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was my idea :) After some thinking I have the same conclusion. We should do the last thread to take lesser work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have implemented my vision in #618.
sdc/utilities/prange_utils.py
Outdated
| return numba.config.NUMBA_NUM_THREADS | ||
|
|
||
| return 1 | ||
| else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why?
|
In it's current (not this one) implementation number of chunks is equal to the pool size. In case task size is less than pool size first chunks would contain valid information, while the last ones would have start outside of task size. This is not a problem if we a doing something like this: for i in prange(len(chunks):
chunk = chunks[i]
for j in range(chunk.start, chunk.stop):
...In this case for i in prange(len(chunks):
chunk = chunks[i]
first_item = items[chunk.start]
for j in range(chunk.start, chunk.stop):
...This could be a problem, since What is your thoughts? |
I agree. pool_size = min(pool_size, size) |
sdc/utilities/prange_utils.py
Outdated
|
|
||
| chunk_size = (size - 1) // pool_size + 1 | ||
| pool_size = min(pool_size, size) | ||
| chunk_size = size // pool_size |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please return the initial formula (size - 1)//pool_size + 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not working properly.
With an array of 5 elements and 4 threads, chunks are divided into 0-2, 2-4, 4-6, 6-8
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it divided into chunks [0,2), [2,4), [4,5), [5,5). And that's fine.
I think you should add code to discard last chunk in such case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this module or in the function code that use this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this module. Something like this:
pool_size = min(pool_size, size)
chunk_size = (size - 1)//pool_size + 1
chunks = []
for i in range(pool_size):
start = i*chunk_size
stop = min((i+1)*chunk_size, size)
if start >= size:
break
chunks.append(Chunk(start, stop))
sdc/utilities/prange_utils.py
Outdated
| start = min(i * chunk_size, size) | ||
| stop = min((i + 1) * chunk_size, size) | ||
| start = i*chunk_size | ||
| stop = min((i+1)*chunk_size, size) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could move calculating of stop after if-block.
sdc/utilities/prange_utils.py
Outdated
|
|
||
| chunk_size = (size - 1) // pool_size + 1 | ||
| pool_size = min(pool_size, size) | ||
| chunk_size = (size - 1)//pool_size + 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't you round all the operations by white spaces?
No description provided.