Parallel Programming with Python and Charm++

(NOTE: With release v0.11 we changed the name of the project to charm4py.)

charm4py (Charm++ for Python, formerly CharmPy) is a general-purpose parallel and distributed programming framework with a simple and powerful API, based on migratable Python objects and remote method invocation. It is built on top of Charm++, an adaptive C++ runtime system that provides speed, scalability and dynamic load balancing.

charm4py allows development of parallel applications in Python that scale from laptops to supercomputers.

Please see the Documentation for more information.

Short Example

The following computes Pi in parallel, using any number of machines and processors:

from charm4py import charm, Chare, Group, Reducer
from math import pi
import time

class Worker(Chare):

    def work(self, n_steps, pi_future):
        h = 1.0 / n_steps
        s = 0.0
        for i in range(self.thisIndex, n_steps, charm.numPes()):
            x = h * (i + 0.5)
            s += 4.0 / (1.0 + x**2)
        # perform a reduction among members of the group, sending the result to the future
        self.contribute(s * h, Reducer.sum, pi_future)

def main(args):
    n_steps = 1000
    if len(args) > 1:
        n_steps = int(args[1])
    mypi = charm.createFuture()
    workers = Group(Worker)  # create one instance of Worker on every processor
    t0 = time.time()
    workers.work(n_steps, mypi)  # invoke 'work' method on every worker
    print('Approximated value of pi is:', mypi.get(),  # 'get' blocks until result arrives
          'Error is', abs(mypi.get() - pi), 'Elapsed time=', time.time() - t0)
    charm.exit()

charm.start(main)  # start the runtime; 'main' is the program entry point

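The arithmetic inside `work` is the midpoint rule applied to the integral of 4 / (1 + x^2) over [0, 1], which equals pi. As a rough illustration of how the strided loop divides the steps among workers, here is a serial sketch in plain Python that does not use charm4py at all. The names `partial_sum`, `approximate_pi` and `num_pes` are hypothetical, and the built-in `sum` merely stands in for the `Reducer.sum` reduction.

```python
from math import pi

def partial_sum(index, n_steps, num_pes):
    """Share of the midpoint-rule sum handled by the worker with this index."""
    h = 1.0 / n_steps
    s = 0.0
    # Same strided loop as Worker.work: index, index + num_pes, ...
    for i in range(index, n_steps, num_pes):
        x = h * (i + 0.5)
        s += 4.0 / (1.0 + x**2)
    return s * h

def approximate_pi(n_steps, num_pes):
    # A plain sum over all worker indices stands in for the reduction
    return sum(partial_sum(idx, n_steps, num_pes) for idx in range(num_pes))

if __name__ == '__main__':
    for num_pes in (1, 4):
        approx = approximate_pi(1000, num_pes)
        print('workers:', num_pes, 'approximation:', approx,
              'error:', abs(approx - pi))
```

Every step index is covered by exactly one worker, so the combined result should not depend on the number of workers beyond floating-point rounding.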

This is a simple example and demonstrates only a few features of charm4py. Some things to note from this example:

  • Chares are distributed Python objects.
  • A Group is a type of distributed collection where one instance of the specified chare type is created on each processor.
  • Remote method invocation in charm4py is asynchronous.
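To make the asynchronous pattern from the example concrete (invoke, carry on, then block on a future only when the value is needed), here is a loose analogy using Python's standard `concurrent.futures` module. This is not the charm4py API and involves no remote objects. The function `demo` and the `release` event are hypothetical names used only for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def demo():
    release = threading.Event()

    def work(x):
        release.wait()      # hold the task until the caller allows it to finish
        return x * x

    with ThreadPoolExecutor(max_workers=1) as pool:
        fut = pool.submit(work, 7)   # returns a future without waiting for 'work'
        done_before = fut.done()     # False: the task cannot have completed yet
        release.set()                # let the task finish
        result = fut.result()        # blocks until the result is available
    return done_before, result

if __name__ == '__main__':
    print(demo())
```

In the charm4py example, `workers.work(n_steps, mypi)` plays the role of the non-blocking call and `mypi.get()` plays the role of `fut.result()`.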

In this example there is only one chare per processor, but any processor can host multiple chares (of the same or different types), which can bring flexibility and performance benefits. Please refer to the documentation for more information.


We would like feedback from the community. If you have feature suggestions, support questions or general comments, please visit our forum.

Main author at <>
