Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refinements to Python 3 New GIL #52778

Closed
dabeaz mannequin opened this issue Apr 26, 2010 · 3 comments
Closed

Refinements to Python 3 New GIL #52778

dabeaz mannequin opened this issue Apr 26, 2010 · 3 comments
Assignees
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage

Comments

@dabeaz
Copy link
Mannequin

dabeaz mannequin commented Apr 26, 2010

BPO 8532
Nosy @gpshead, @pitrou, @alex
Superseder
  • bpo-7946: Convoy effect with I/O bound threads and New GIL
  • Files
  • gil.patch: GIL Patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/pitrou'
    closed_at = <Date 2010-04-26.09:13:38.359>
    created_at = <Date 2010-04-26.03:41:40.066>
    labels = ['interpreter-core', 'performance']
    title = 'Refinements to Python 3 New GIL'
    updated_at = <Date 2010-04-26.09:13:38.358>
    user = 'https://bugs.python.org/dabeaz'

    bugs.python.org fields:

    activity = <Date 2010-04-26.09:13:38.358>
    actor = 'pitrou'
    assignee = 'pitrou'
    closed = True
    closed_date = <Date 2010-04-26.09:13:38.359>
    closer = 'pitrou'
    components = ['Interpreter Core']
    creation = <Date 2010-04-26.03:41:40.066>
    creator = 'dabeaz'
    dependencies = []
    files = ['17083']
    hgrepos = []
    issue_num = 8532
    keywords = ['patch']
    message_count = 3.0
    messages = ['104192', '104193', '104199']
    nosy_count = 5.0
    nosy_names = ['gregory.p.smith', 'pitrou', 'nirai', 'alex', 'dabeaz']
    pr_nums = []
    priority = 'normal'
    resolution = 'duplicate'
    stage = 'patch review'
    status = 'closed'
    superseder = '7946'
    type = 'performance'
    url = 'https://bugs.python.org/issue8532'
    versions = ['Python 3.2']

    @dabeaz
    Copy link
    Mannequin Author

    dabeaz mannequin commented Apr 26, 2010

    The attached patch makes two simple refinements to the new GIL implemented in Python 3.2. Each is briefly described below.

    1. Changed mechanism for thread time expiration

    In the current implementation, threads perform a timed-wait on a condition variable. If time expires and no thread switches have occurred, the currently running thread is forced to drop the GIL.

    In the patch, timeouts are now performed by a special "GIL monitor" thread. This thread runs independently of Python and simply handles time expiration. Basically, it records the number of thread switches, sleeps for a specified interval (5ms), and then looks at the number of thread switches again. If no switches occurred, it forces the currently running thread to drop the GIL.

    With this monitor thread, it is no longer necessary to perform any timed condition variable waits. This approach has a few subtle benefits. First, threads no longer sit in a wait/timeout cycle when trying to get the GIL (so, there is less overhead). Second, you get FIFO scheduling of threads. When time expires, the thread that has been waiting the longest on the condition variable runs next. Generally, you want this.

    1. A very simple two-level priority mechanism

    A new attribute 'cpu_bound' is added to the PyThreadState structure. If a thread is ever forced to drop the GIL, this attribute is simply set True (1). If a thread gives up the GIL voluntarily, it is set back to False (0). This attribute is used to set up simple scheduling (described next).

    There are now two separate condition variables (gil_cpu_cond) and (gil_io_cond) that separate waiting threads according to their cpu_bound attribute setting. CPU-bound threads wait on gil_cpu_cond whereas I/O-bound threads wait on gil_io_cond.

    Using the two condition variables, the following scheduling rules are enforced:

    • If there are any waiting I/O bound threads, they are always signaled first, before any CPU-bound threads.
    • If an I/O bound thread wants the GIL, but a CPU-bound thread is running, the CPU-bound thread is immediately forced to drop the GIL.
    • If a CPU-bound thread wants the GIL, but another CPU-bound thread is running, the running thread is immediately forced to drop the GIL if its time period has already expired.

    Results
    -------
    This patch gives excellent results for both the ccbench test and all of my previous I/O bound tests. Here is the output:

    == CPython 3.2a0.0 (py3k:80470:80497M) ==
    == i386 Darwin on 'i386' ==

    --- Throughput ---

    Pi calculation (Python)

    threads=1: 871 iterations/s.
    threads=2: 844 ( 96 %)
    threads=3: 838 ( 96 %)
    threads=4: 826 ( 94 %)

    regular expression (C)

    threads=1: 367 iterations/s.
    threads=2: 345 ( 94 %)
    threads=3: 339 ( 92 %)
    threads=4: 327 ( 89 %)

    bz2 compression (C)

    threads=1: 384 iterations/s.
    threads=2: 728 ( 189 %)
    threads=3: 695 ( 180 %)
    threads=4: 707 ( 184 %)

    --- Latency ---

    Background CPU task: Pi calculation (Python)

    CPU threads=0: 0 ms. (std dev: 0 ms.)
    CPU threads=1: 0 ms. (std dev: 0 ms.)
    CPU threads=2: 1 ms. (std dev: 2 ms.)
    CPU threads=3: 0 ms. (std dev: 1 ms.)
    CPU threads=4: 0 ms. (std dev: 1 ms.)

    Background CPU task: regular expression (C)

    CPU threads=0: 0 ms. (std dev: 0 ms.)
    CPU threads=1: 2 ms. (std dev: 1 ms.)
    CPU threads=2: 1 ms. (std dev: 1 ms.)
    CPU threads=3: 1 ms. (std dev: 1 ms.)
    CPU threads=4: 2 ms. (std dev: 1 ms.)

    Background CPU task: bz2 compression (C)

    CPU threads=0: 0 ms. (std dev: 0 ms.)
    CPU threads=1: 0 ms. (std dev: 2 ms.)
    CPU threads=2: 2 ms. (std dev: 3 ms.)
    CPU threads=3: 0 ms. (std dev: 1 ms.)
    CPU threads=4: 0 ms. (std dev: 1 ms.)

    --- I/O bandwidth ---

    Background CPU task: Pi calculation (Python)

    CPU threads=0: 5850.9 packets/s.
    CPU threads=1: 5246.8 ( 89 %)
    CPU threads=2: 4228.9 ( 72 %)
    CPU threads=3: 4222.8 ( 72 %)
    CPU threads=4: 2959.5 ( 50 %)

    Particular attention should be given to tests involving I/O performance. In particular, here are the results of the I/O bandwidth test using the unmodified GIL:

    --- I/O bandwidth ---

    Background CPU task: Pi calculation (Python)

    CPU threads=0: 6007.1 packets/s.
    CPU threads=1: 189.0 ( 3 %)
    CPU threads=2: 19.7 ( 0 %)
    CPU threads=3: 19.7 ( 0 %)
    CPU threads=4: 5.1 ( 0 %)

    Other Benefits
    --------------
    This patch does not involve any complicated libraries, platform specific functionality, low-level lock twiddling, or mathematically complex priority scheduling algorithms. Emphasize: The code is simple.

    Negative Aspects
    ----------------
    This modification might introduce a starvation effect where CPU-bound threads never get to run if there is an extremely heavy load of I/O-bound threads competing for the GIL.

    Is starvation a real problem or a theoretical problem? Hard to say. Would need study.

    @dabeaz dabeaz mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage labels Apr 26, 2010
    @dabeaz
    Copy link
    Mannequin Author

    dabeaz mannequin commented Apr 26, 2010

    Can't decide whether this should be attached to bpo-7946 or not.
    I will also post it there. (Feel free to close this issue if you want to keep 7946 alive).

    @gpshead
    Copy link
    Member

    gpshead commented Apr 26, 2010

    since the other discussion is ongoing on bpo-7946 i posted my comment there.

    @pitrou pitrou closed this as completed Apr 26, 2010
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants