Skip to content
This repository was archived by the owner on Feb 2, 2024. It is now read-only.

Conversation

@1e-to
Copy link
Contributor

@1e-to 1e-to commented Feb 17, 2020

No description provided.

@pep8speaks
Copy link

pep8speaks commented Feb 17, 2020

Hello @1e-to! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-02-18 15:11:39 UTC

return chunks

return get_chunks_impl
return get_chunks_impl
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return get_chunks_impl
return get_chunks_impl

Copy link
Contributor

@densmirn densmirn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's hang @sdc_register_jitable on get_chunks and call the function inside of the overload implementation to avoid code duplication.

@densmirn
Copy link
Contributor

Let's hang @sdc_register_jitable on get_chunks and call the function inside of the overload implementation to avoid code duplication.

@sdc_overload(get_chunks)
def get_chunks_overload(size, pool_size=0):
    def get_chunks_impl(size, pool_size=0):
        return get_chunks(size, pool_size=pool_size)

    return get_chunks_impl

Comment on lines 67 to 69
if i == pool_size - 1:
rest = size - size // pool_size
stop = min((i + 1) * chunk_size + rest, size)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe to limit the loop for i in range(pool_size - 1) and handle the latest step outside of the loop.

elena.totmenina added 3 commits February 17, 2020 19:04
Comment on lines 57 to 111
# *****************************************************************************
# Copyright (c) 2020, Intel Corporation All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
#
# Redistributions of source code must retain the above copyright notice,
# this list of conditions and the following disclaimer.
#
# Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
# THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
# OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
# WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
# OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
# EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
# *****************************************************************************


import numba
import sdc

from typing import NamedTuple
from sdc.utilities.utils import sdc_overload
from sdc.utilities.utils import sdc_register_jitable


class Chunk(NamedTuple):
start: int
stop: int


def get_pool_size():
if sdc.config.config_use_parallel_overloads:
return numba.config.NUMBA_NUM_THREADS
else:
return 1


@sdc_overload(get_pool_size)
def get_pool_size_overload():
pool_size = get_pool_size()

def get_pool_size_impl():
return pool_size

return get_pool_size_impl
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicated code.

Comment on lines 32 to 33
from sdc.utilities.utils import sdc_overload
from sdc.utilities.utils import sdc_register_jitable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lines could be combined to a single one.

pool_size = get_pool_size()

chunk_size = (size - 1) // pool_size + 1
chunk_size = size // pool_size
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's not valid.

What are you doing?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After closer look, it is valid, but a bad idea.

Let's assume, you have 56 threads and task size 55. In your implementation it wouldn't be paralleled at all. All 55 would be computed by the last chunk

Copy link
Contributor

@PokhodenkoSA PokhodenkoSA Feb 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was my idea :) After some thinking I have the same conclusion. We should do the last thread to take lesser work.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have implemented my vision in #618.

return numba.config.NUMBA_NUM_THREADS

return 1
else:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?

@AlexanderKalistratov
Copy link
Collaborator

In it's current (not this one) implementation number of chunks is equal to the pool size.

In case task size is less than pool size first chunks would contain valid information, while the last ones would have start outside of task size.
e.g. if task size is 5 and number of threads 16 first five chunks would have valid start and stop ([0,1], [1, 2], [2,3], [3,4], [4,5]). But the rest will have start and stop outside of task size ([5,5])

This is not a problem if we a doing something like this:

for i in prange(len(chunks):
    chunk = chunks[i]
    for j in range(chunk.start, chunk.stop):
        ...

In this case range(chunk.start, chunk.stop) is zero for extra chunks.
But if we need to do something like this:

for i in prange(len(chunks):
    chunk = chunks[i]
    first_item = items[chunk.start]
    for j in range(chunk.start, chunk.stop):
        ...

This could be a problem, since chunk.start could be out of range.
So, probably it is better for the case task size is less than pool size return number of chunks equal to task size. I.e. in case of task size 5 and pool size 16 return only first 5 chunks.

What is your thoughts?

@PokhodenkoSA
Copy link
Contributor

What is your thoughts?

I agree.

pool_size = min(pool_size, size)


chunk_size = (size - 1) // pool_size + 1
pool_size = min(pool_size, size)
chunk_size = size // pool_size
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please return the initial formula (size - 1)//pool_size + 1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not working properly.
With an array of 5 elements and 4 threads, chunks are divided into 0-2, 2-4, 4-6, 6-8

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it divided into chunks [0,2), [2,4), [4,5), [5,5). And that's fine.
I think you should add code to discard last chunk in such case

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this module or in the function code that use this?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this module. Something like this:

    pool_size = min(pool_size, size)
    chunk_size = (size - 1)//pool_size + 1

    chunks = []
    for i in range(pool_size):
        start = i*chunk_size
        stop = min((i+1)*chunk_size, size)
        if start >= size:
            break
        
        chunks.append(Chunk(start, stop))

elena.totmenina added 2 commits February 18, 2020 12:44
start = min(i * chunk_size, size)
stop = min((i + 1) * chunk_size, size)
start = i*chunk_size
stop = min((i+1)*chunk_size, size)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could move calculating of stop after if-block.


chunk_size = (size - 1) // pool_size + 1
pool_size = min(pool_size, size)
chunk_size = (size - 1)//pool_size + 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't you round all the operations by white spaces?

@1e-to 1e-to closed this Feb 19, 2020
@1e-to 1e-to deleted the fix branch February 19, 2020 15:48
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants