Optimized `pgRect_FromObject` #2465

Matiiss · 2023-09-22T12:24:57Z

Modifying RectExport_RectFromObject to also check for the other type of rect and return an SDL_(F)Rect object earlier.

The speed difference was tested using Surface.fblits (fast iteration over sequence, thus many calls to this internal API) with small surfaces to minimize blit overhead:

from timeit import timeit

import pygame

N = 1000
C = 10_000

target = pygame.Surface((1, 1))
src = pygame.Surface((1, 1))
rect = src.get_rect()
frect = src.get_frect()
rect_seq = [(src, rect) for _ in range(C)]
frect_seq = [(src, frect) for _ in range(C)]


timeit("""
target.fblits(frect_seq)
""", number=N, globals=globals())

print(timeit("""
target.fblits(rect_seq)
""", number=N, globals=globals()))

print(timeit("""
target.fblits(frect_seq)
""", number=N, globals=globals()))

For some reason it appears that Rect usage times also improve (i.e., the speed improves), but nothing's really changed in that aspect, so mainly I was focusing on FRects and saw a 20%+ speed increase.
EDIT: ran my version comparison test again and this time Rects were down 14%, which shows rather high volatility, but realistically the speed should be about the same as it was before (so averages cancel out or something). FRects did show a steady improvement of 28% this time.
EDIT2/UPDATE2: ran the test again, Rects down 3%, FRects up 40%

Btw, this is how I compare it across versions (both wheels built locally, -main version is just a wheel I built upon creating the branch, so that's basically current main):

import subprocess


CURRENT_VERSION = "venv/Scripts/python -m pip install my_tests/pygame_ce-2.4.0.dev1-cp310-cp310-win_amd64.whl --force"
OTHER_VERSION = "venv/Scripts/python -m pip install my_tests/pygame_ce_main-2.4.0.dev1-cp310-cp310-win_amd64.whl --force"

TEST_COUNT = 5

current = True
current_values = []
other_values = []

def percentage_stuff(x0, x1):
    return (x1 - x0) / x0


def speed_stuff(time_change):
    return 1 / (1 + time_change) - 1


for _ in range(2 * TEST_COUNT):
    if current:
        subprocess.run(CURRENT_VERSION)
    elif not current:
        subprocess.run(OTHER_VERSION)

    completed = subprocess.run("venv/Scripts/python my_tests/test.py", capture_output=True, text=True)
    lst = [float(line) for line in completed.stdout.splitlines()[1:]]
    print(completed.stdout.splitlines()[0])
    if current:
        if not current_values:
            current_values = lst
        else:
            for index, value in enumerate(lst):
                current_values[index] += value
        print(f"{current_values = }")
        current = False
    elif not current:
        if not other_values:
            other_values = lst
        else:
            for index, value in enumerate(lst):
                other_values[index] += value
        print(f"{other_values = }")
        current = True

print("current version =", current_values)
print("latest release =", other_values)
for t0, t1 in zip(other_values, current_values):
    print(speed_stuff(percentage_stuff(t0, t1)) * 100, end=", ")
print()

Modifying `RectExport_RectFromObject` to also check for the other type of rect and return an SDL_(F)Rect object earlier.

itzpr3d4t0r · 2023-09-22T18:23:09Z

I think you should test this with something like colliderect to avoid any potential extra overheads and not fblits. Also using something like min(repeat(...)) or mean(repeat(...)) instead of just timeit is also advisable as it gives a better estimate since outliers in the timings won't count towards a single value but will be discarded.

Also could we do it all in a single check that covers both cases? like a generic check?

Matiiss · 2023-09-23T11:33:16Z

Results from the fblits testing. Blue and red lines are roughly the same thus the speed of getting a rect from a rect object remains the same, however, the orange line is this PR's time for getting a rect from a frect object and compared to current main which is the purple line, there is a considerable improvement. Lastly the green and brown lines are somewhat similar, it's likely that the green one (current PR) would be a little bit higher since there is now an extra check in between, but they mostly seem to be very similar.

Results from collidelist testing. Only tested opposite rectangles since that's what this PR mostly aims to optimize. As can be observed from the results, the blue and orange lines (this PR) are slightly lower (i.e., faster) than green and red lines (current main).

Testing scripts:
test.py (works by uncommenting stuff for testing different things, best if whole sections are commented out or uncommented (i.e., everything between collidelist test and fblits test and then everything after fblits test comments.))

### collidelist test

import timeit
import statistics

import pygame

N = 1
R = 100
# R = 10_000
C = 10_000
TESTS = 1000

rect = pygame.Rect(0, 0, 0, 0)
frect = pygame.Rect(0, 0, 0, 0)
rect_seq = [rect for _ in range(C)]
frect_seq = [frect for _ in range(C)]
list_seq = [[*rect] for _ in range(C)]


def thingy(stmt, num_tests, name=None, **kwargs):
    print(f"{name or stmt.strip()}:", [statistics.mean(timeit.repeat(stmt, **kwargs)) for _ in range(num_tests)])


# repeat("""
# rect.collidelist(frect_seq)
# """, number=N, repeat=R, globals=globals())

# thingy("""
# rect.collidelist(rect_seq)
# """, TESTS, number=N, repeat=R, globals=globals())

thingy("""
rect.collidelist(frect_seq)
""", TESTS, number=N, repeat=R, globals=globals())

thingy("""
frect.collidelist(rect_seq)
""", TESTS, number=N, repeat=R, globals=globals())

# thingy("""
# frect.collidelist(frect_seq)
# """, TESTS, number=N, repeat=R, globals=globals())

# thingy("""
# rect.collidelist(list_seq)
# """, TESTS, number=N, repeat=R, globals=globals())


# ### fblits test

# import timeit
# import statistics

# import pygame

# N = 1
# R = 10
# C = 5000
# TESTS = 5000

# target = pygame.Surface((0, 0))
# src = pygame.Surface((0, 0))
# rect = src.get_rect()
# frect = src.get_frect()
# rect_seq = [(src, rect) for _ in range(C)]
# frect_seq = [(src, frect) for _ in range(C)]
# list_seq = [(src, [*rect]) for _ in range(C)]


# def thingy(stmt, num_tests, name=None, **kwargs):
#     print(f"{name or stmt.strip()}:", [statistics.mean(timeit.repeat(stmt, **kwargs)) for _ in range(num_tests)])

# thingy("""
# target.fblits(rect_seq)
# """, TESTS, number=N, globals=globals())

# thingy("""
# target.fblits(frect_seq)
# """, TESTS, number=N, globals=globals())

# thingy("""
# target.fblits(list_seq)
# """, TESTS, number=N, globals=globals())

Script for version comparison, requires two venvs for quicker change between versions during testing, venv-main has locally compiled pygame-ce from the main branch and venv has pygame-ce compiled locally using this PR:

import ast
import collections
import subprocess

from matplotlib import pyplot as plt


CURRENT_VENV = "venv\Scripts\python"
OTHER_VENV = "venv-main\Scripts\python"

TEST_COUNT = 2

current = True
current_values = collections.defaultdict(list)
other_values = collections.defaultdict(list)

def percentage_stuff(x0, x1):
    return (x1 - x0) / x0


def speed_stuff(time_change):
    return 1 / (1 + time_change) - 1


for _ in range(2 * TEST_COUNT):
    completed = subprocess.run(f"{CURRENT_VENV if current else OTHER_VENV} my_tests/test.py", capture_output=True, text=True)
    lst = [line.split(": ") for line in completed.stdout.splitlines()[1:]]
    print(completed.stdout.splitlines()[0])
    if current:
        for name, value in lst:
            current_values["current " + name].extend(ast.literal_eval(value))
        current = False
    elif not current:
        for name, value in lst:
            other_values["main " + name].extend(ast.literal_eval(value))
        current = True

for name, value in (current_values | other_values).items():
    plt.plot(value, label=name)

plt.legend()
plt.show()

novialriptide · 2023-09-23T13:08:18Z

Slightly confused here; why isn't OtherRectCheck named FRectCheck?

Matiiss · 2023-09-23T15:38:02Z

Slightly confused here; why isn't OtherRectCheck named FRectCheck?

Because for FRect it would use RectCheck, both rects are created from the same "template", so in both cases the other rect is the other rect.

ankith26

LGTM, thanks for the PR 🎉

Starbuck5 · 2023-09-30T06:11:28Z

We've had a lot of discussion about rounding recently, I'm concerned this explicit cast strategy closes the door on changing rounding behaviors in the future, or might lead to different rounding behavior based on exact object type.

src_c/rect_impl.h

ankith26 · 2023-10-01T10:14:52Z

We've had a lot of discussion about rounding recently, I'm concerned this explicit cast strategy closes the door on changing rounding behaviors in the future, or might lead to different rounding behavior based on exact object type.

I don't think this change is changing any behaviour currently, even the generic/old path seems to truncate floats

MyreMylar

This seems fine to me. I agree it doesn't appear to be changing current behaviour, or adding much more of it. We should probably address any float->int rounding strategy changes in a separate issue/PR.

Optimized pgRect_FromObject

4b707ae

Modifying `RectExport_RectFromObject` to also check for the other type of rect and return an SDL_(F)Rect object earlier.

Matiiss requested a review from a team as a code owner September 22, 2023 12:24

Matiiss added Code quality/robustness Code quality and resilience to changes Performance Related to the speed or resource usage of the project and removed Code quality/robustness Code quality and resilience to changes labels Sep 22, 2023

This comment was marked as outdated.

Sign in to view

ankith26 approved these changes Sep 26, 2023

View reviewed changes

ankith26 added this to the 2.4.0 milestone Sep 26, 2023

itzpr3d4t0r reviewed Oct 1, 2023

View reviewed changes

src_c/rect_impl.h Show resolved Hide resolved

MyreMylar approved these changes Oct 8, 2023

View reviewed changes

Merge branch 'main' into matiiss-optimize-pgRect_FromObject

d64316e

ankith26 added the rect pygame.rect label Oct 9, 2023

ankith26 merged commit 1427fc2 into pygame-community:main Oct 9, 2023
30 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimized `pgRect_FromObject` #2465

Optimized `pgRect_FromObject` #2465

Matiiss commented Sep 22, 2023 •

edited

This comment was marked as outdated.

itzpr3d4t0r commented Sep 22, 2023 •

edited

This comment was marked as outdated.

Matiiss commented Sep 23, 2023 •

edited

novialriptide commented Sep 23, 2023 •

edited

Matiiss commented Sep 23, 2023

ankith26 left a comment

Starbuck5 commented Sep 30, 2023

ankith26 commented Oct 1, 2023

MyreMylar left a comment

Optimized pgRect_FromObject #2465

Optimized pgRect_FromObject #2465

Conversation

Matiiss commented Sep 22, 2023 • edited

This comment was marked as outdated.

itzpr3d4t0r commented Sep 22, 2023 • edited

This comment was marked as outdated.

Matiiss commented Sep 23, 2023 • edited

novialriptide commented Sep 23, 2023 • edited

Matiiss commented Sep 23, 2023

ankith26 left a comment

Choose a reason for hiding this comment

Starbuck5 commented Sep 30, 2023

ankith26 commented Oct 1, 2023

MyreMylar left a comment

Choose a reason for hiding this comment

Optimized `pgRect_FromObject` #2465

Optimized `pgRect_FromObject` #2465

Matiiss commented Sep 22, 2023 •

edited

itzpr3d4t0r commented Sep 22, 2023 •

edited

Matiiss commented Sep 23, 2023 •

edited

novialriptide commented Sep 23, 2023 •

edited