AssertionError: Run with more than 1 PE to use charm.pool #184

Open
karankakwani opened this issue Nov 20, 2020 · 12 comments

Comments

@karankakwani

When I run the example given here on my system, I get the following error(s):-
Example: https://github.com/UIUC-PPL/charm4py/blob/master/examples/parallel-map/square.py

[screenshot of the error output]

@ZwFink
Contributor

ZwFink commented Nov 20, 2020

Can you paste the command you used to produce this output?

@karankakwani
Author

I just ran this snippet in PyCharm.

@ZwFink
Contributor

ZwFink commented Nov 20, 2020

Aah, I see. When starting Charm4Py scripts, you actually want to run the charmrun.start program that becomes available when you install Charm4Py. Here is an example of the command I used to run the above square.py script using 4 cores on my machine:
python3 -m charmrun.start +p4 ./square.py

The +p4 argument is used to specify that 4 PEs should be used for execution. You can think of PEs as processes that run on a core of your machine, so 4 PEs will use 4 cores.

When charm.pool is used, as in this example, 1 PE is reserved for scheduling tasks and does not act as a worker, according to the documentation. Therefore, when using charm.pool on a machine with n cores, you may want to consider passing n+1 as the value to the +p argument shown above.
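
To make the PE arithmetic concrete, here is a minimal sketch that builds the launch command for a desired number of pool workers. The helper name `charmrun_command` and its parameters are hypothetical and not part of Charm4Py; only the `python3 -m charmrun.start +pN script` form comes from the comment above.

```python
import sys


def charmrun_command(script, workers, uses_pool=True, python=sys.executable):
    """Build the argv list for launching a Charm4Py script (hypothetical helper).

    When charm.pool is used, one PE is not available as a worker, so one
    extra PE is requested on top of the desired worker count.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    pes = workers + 1 if uses_pool else workers
    return [python, "-m", "charmrun.start", f"+p{pes}", script]


if __name__ == "__main__":
    # Four pool workers: request 5 PEs in total.
    print(" ".join(charmrun_command("./square.py", workers=4, python="python3")))
```

The returned list can be passed to `subprocess.run` or joined into a shell command line.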

As for running this in PyCharm, you might be able to set up the run configuration such that it will allow you to run Charm4Py programs directly from within the IDE: https://www.jetbrains.com/help/pycharm/creating-and-editing-run-debug-configurations.html

@karankakwani
Author

python3 -m charmrun.start +p4 ./square.py
Is this the only way to run it in order to exploit the benefits of charm4py?
How to do this in a case where there is a main program that launches a child process and this child process needs to run in parallel?

@karankakwani
Author

@ZwFink So I tried the way you mentioned above in my own program and I am getting this error:

[two screenshots of the error output]

@ZwFink
Contributor

ZwFink commented Nov 23, 2020

"How to do this in a case where there is a main program that launches a child process and this child process needs to run in parallel?"

Can you explain what you mean by this a little further?

Regarding the timeout error, this issue happens because charmrun uses ssh to start processes. For this to work, you need to enable passwordless ssh for your machine, so that running ssh localhost doesn't require a password. What operating system are you using?
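
One way to check this before launching is sketched below. It assumes an OpenSSH client is on the PATH; the function names are hypothetical. `BatchMode=yes` makes ssh fail instead of prompting, so a password requirement shows up as a non-zero exit code rather than a hang.

```python
import subprocess


def ssh_check_command(host="localhost", timeout_s=5):
    """Return an ssh argv that fails instead of prompting for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                # never prompt interactively
        "-o", f"ConnectTimeout={timeout_s}",  # give up quickly if unreachable
        host,
        "true",                               # trivial remote command
    ]


def passwordless_ssh_works(host="localhost", timeout_s=5):
    """True if the check command succeeds without any interactive prompt."""
    try:
        result = subprocess.run(
            ssh_check_command(host, timeout_s),
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s + 5,
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        # ssh is not installed, or the connection attempt hung
        return False
    return result.returncode == 0
```

Calling `passwordless_ssh_works()` and getting `False` suggests that key-based login to localhost still needs to be set up.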

@ZwFink
Contributor

ZwFink commented Nov 23, 2020

I should mention that this is a gap in the Charm4Py documentation, and I will make sure it gets added.

@karankakwani
Author

karankakwani commented Nov 23, 2020

@ZwFink Thanks for your reply and clarification.

To elaborate on my earlier question, what I meant is:

I have a main script say main.py
This main.py spawns a child process say childprocess.py and now this child process needs to exploit charm4py to achieve faster execution.

E.g. main.py

child_process = Process(target=(...), args=(...))
child_process.start()
<some other code>
child_process.join()

How to use charm4py in this case?

@ZwFink
Contributor

ZwFink commented Nov 23, 2020

Is this an existing codebase where it's not possible/feasible to make main.py a Charm4Py program and perform the processing of both child/parent processes therein? Doing this would be nice as the Charm++ scheduler can schedule all of the work the program does, which may improve performance. The fork/join model you propose above is trivial to express in Charm4Py.

Otherwise, in the above example can child_process not simply be a call to invoke a Charm4Py program such as python3 -m charmrun.start +p4 ./square.py?
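
A minimal sketch of that second option, using only the standard library: the parent starts the command without waiting, does its other work, and then waits for the exit code. The function names are hypothetical, and the command is passed in as a list so that any launch line can be used.

```python
import subprocess
import sys


def launch_child(command):
    """Start `command` without waiting, similar to child_process.start()."""
    return subprocess.Popen(command)


def run_with_child(command):
    """Launch the child, overlap other work, then wait for it to finish."""
    child = launch_child(command)

    # <some other code> runs here while the child is working

    # Similar to child_process.join(); returns the child's exit code.
    return child.wait()
```

For the case discussed here, the call would look like `run_with_child([sys.executable, "-m", "charmrun.start", "+p4", "./square.py"])`, which requires Charm4Py to be installed and square.py to be in the working directory.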

@karankakwani
Author

"The fork/join model you propose above is trivial to express in Charm4Py."

Can you please share an example?

@ZwFink
Contributor

ZwFink commented Nov 24, 2020

Sure! I transformed your example above into a Charm4Py program:

from charm4py import charm, Chare, Future, Array, Reducer, coro
import time

class ChildProcess(Chare):
    def __init__(self, arg, doneFuture):
        self.arg = arg
        self.doneFuture = doneFuture
      

    @coro
    def start(self):
        # our "work" here is just to perform a reduction and send it to the
        # future, but anything can be done
        self.reduce(self.doneFuture, self.thisIndex[0], Reducer.sum)

        # if the main process doesn't need the result, we may simply do:
        # self.reduce(self.doneFuture)

def main(args):
    childProcessFuture = Future()

    childProcessArg = 1
    # child_process = Process(target=(...), args=(...))
    # Here we specify that 20 chares will be created to perform the work.
    childProc = Array(ChildProcess, 20, args=[childProcessArg, childProcessFuture])

    # alternatively, the constructor of ChildProcess can call start itself, which saves some message passing
    # this represents the child process creation that does the work
    # child_process.start()
    childProc.start()

    # some other code, in this case just sleep
    # <some other code>
    time.sleep(3)

    # similar to 'child_process.join()' in the example you provided
    # child_process.join()
    childResult = childProcessFuture.get()

    print(f'Child result: {childResult}')
    charm.exit()

charm.start(main)

In the above example, the code executed by the childProc.start() call will happen asynchronously and in parallel with the code executed in <some other code>, which is what you were achieving above. The work will be distributed among the processors that you use for execution. The call to childProcessFuture.get() is blocking and will return as soon as the array childProc has finished processing.
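
For readers without Charm4Py at hand, the same start / overlap / blocking-get shape can be sketched with the standard library. This is an analogy only: it uses threads from `concurrent.futures`, not chares, and does not distribute work across processes. The names `child_work` and `fork_join` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def child_work(index):
    # Stand-in for one array element's contribution (its own index).
    return index


def fork_join(num_tasks=20, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # "start": submit returns immediately, work proceeds in the background
        futures = [pool.submit(child_work, i) for i in range(num_tasks)]

        # <some other code> overlaps with the child work
        time.sleep(0.1)

        # "join": result() blocks until each task is done; sum() plays the
        # role of the Reducer.sum reduction
        return sum(f.result() for f in futures)


if __name__ == "__main__":
    print(f"Child result: {fork_join()}")  # 0 + 1 + ... + 19 = 190
```

With 20 tasks each contributing its index, the sum is 190, which is the same value the reduction over `self.thisIndex[0]` yields for a 20-element array.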

After saving the above in example.py, the following command executes it using 4 processors:
python3 -m charmrun.start +p4 ./example.py

I can also provide an example using charm.pool if you think that would be good to see.

@ZwFink
Contributor

ZwFink commented Dec 3, 2020

@karankakwani Just following up on this. Have you been able to use something similar to the above to solve your problem?
