Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Semaphore terribly unfair #1487

Closed
jamadden opened this issue Dec 6, 2019 · 0 comments · Fixed by #1488
Closed

Semaphore terribly unfair #1487

jamadden opened this issue Dec 6, 2019 · 0 comments · Fixed by #1488

Comments

@jamadden
Copy link
Member

@jamadden jamadden commented Dec 6, 2019

  • gevent version: 1.3+
  • Python version: CPython (only)

In all versions of CPython, sets are iterated according to increasing hash order. When a Semaphore is released, it notifies only one waiter. That waiter happens to be the first waiter that's in a set of waiters. This means that if a new waiter is added to a semaphore while an old waiter is already there, but the new waiter has a lower hashcode, it will take priority over the old waiter. (The waiters here are getcurrent().switch.) If that happens repeatedly, the old waiter will never be called.

To put that in concrete terms, have one greenlet acquire a semaphore. Have a second greenlet attempt to acquire the semaphore and block. Have the first greenlet spawn a new greenlet that wants to acquire the semaphore (and repeat the process). If the hash codes work out just so, that newly spawned greenlet will get the semaphore, while the original waiting greenlet continues to wait indefinitely.

In this example, we contrive to manipulate the hash codes of getcurrent().switch. One greenlet wants to acquire the semaphore and exit the program; another greenlet wants to acquire the semaphore and keep going. Depending on what values you choose for the hashcodes, the program will either exit right away, or run forever. If LastG.hashcode is larger, the program doesn't terminate.

from __future__ import print_function

import sys

import gevent
from gevent import getcurrent
from gevent.lock import Semaphore

class Switch:

    def __init__(self, greenlet, hashcode):
        self.switch = greenlet.switch
        self.hashcode = hashcode

    def __hash__(self):
        return self.hashcode

    def __eq__(self, other):
        return self is other

    def __call__(self, *args, **kwargs):
        return self.switch(*args, **kwargs)

class FirstG(gevent.Greenlet):

    hashcode = 10

    def __init__(self, *args, **kwargs):
        gevent.Greenlet.__init__(self, *args, **kwargs)
        self.switch = Switch(self, self.hashcode)


class LastG(FirstG):
    hashcode = 12


sem = Semaphore()

def acquire_then_exit():
    print("In", getcurrent(), "About to acquire for exit")
    sem.acquire()
    print("In", getcurrent(), "Acquired and exiting")
    sys.exit()


def acquire_then_spawn():
    print("In", getcurrent(), "About to acquire to spawn")
    sem.acquire()
    print("In", getcurrent(), "acquired; about to spawn")
    g = FirstG.spawn(release_then_spawn)
    print("In", getcurrent(), "spawned", g, "; about to join")
    g.join()
    print("In", getcurrent(), "joined and done")

def release_then_spawn():
    print("in", getcurrent(), "about to release")
    sem.release()
    print("in", getcurrent(), "spawning")
    g = FirstG.spawn(acquire_then_spawn)
    print("in", getcurrent(), "joining")
    g.join()

keep_going1 = FirstG.spawn(acquire_then_spawn)
keep_going2 = FirstG.spawn(acquire_then_spawn)
exiting = LastG.spawn(acquire_then_exit)

gevent.joinall([keep_going1, keep_going2, exiting])

Note: I wasn't able to get the non-termination to happen without manipulating the hashcodes. I think this is because usually later allocated method objects will have higher addresses than earlier ones, and the address is the hashcode. But this all depends on the internals of Python memory allocation. Setting PYTHONMALLOC=malloc, for example, allows the number of iterations before termination to vary drastically.

This applies to things built on top of Semaphore, namely gevent.lock.RLock and gevent.thread.LockType (used in the monkey-patched threading.Lock and threading.RLock and threading.Condition).

Now, the Python documentation says those lock types aren't necessarily fair. On Windows, they use semaphores, which aren't guaranteed to be fair AFAICS. On Unix systems, they're implemented with pthread_mutex, which isn't required to be fair. I think on Linux it isn't, and on macOS, it used to be, and optionally can be again:

The pthread_mutexattr_setpolicy_np() function sets the mutex policy value of the attribute. Valid mutex policies are:

  • PTHREAD_MUTEX_POLICY_FIRSTFIT_NP
    The first-fit mutex policy allows acquisition of the mutex to occur in any order. This policy is similar in operation to os_unfair_lock, new contending acquirers may obtain ownership of the mutex ahead of existing waiters.

  • PTHREAD_MUTEX_POLICY_FAIRSHARE_NP
    The fairshare mutex policy guarantees that ownership of a contended mutex will be granted to waiters on a strictly ordered first-in, first-out basis. That is, a mutex holder that unlocks the mutex and then attempts to relock will wait behind existing threads already waiting on the mutex before being granted ownership again.

Prior to macOS 10.14 (iOS and tvOS 12.0, watchOS 5.0) the only available pthread mutex policy mode was PTHREAD_MUTEX_POLICY_FAIRSHARE_NP. macOS 10.14 (iOS and tvOS 12.0, watchOS 5.0) introduces PTHREAD_MUTEX_POLICY_FIRSTFIT_NP and also makes this the default mode for mutexes initialized without a policy attribute set.

Fairness is a desired attribute if it's cheap. Fairness doesn't necessarily have to mean FIFO, but if that's cheap then that would also be nice.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
1 participant
You can’t perform that action at this time.