Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimized pgRect_FromObject #2465

Merged

Conversation

Matiiss
Copy link
Member

@Matiiss Matiiss commented Sep 22, 2023

Modifying RectExport_RectFromObject to also check for the other type of rect and return an SDL_(F)Rect object earlier.

The speed difference was tested using Surface.fblits (fast iteration over sequence, thus many calls to this internal API) with small surfaces to minimize blit overhead:

from timeit import timeit

import pygame

N = 1000
C = 10_000

target = pygame.Surface((1, 1))
src = pygame.Surface((1, 1))
rect = src.get_rect()
frect = src.get_frect()
rect_seq = [(src, rect) for _ in range(C)]
frect_seq = [(src, frect) for _ in range(C)]


timeit("""
target.fblits(frect_seq)
""", number=N, globals=globals())

print(timeit("""
target.fblits(rect_seq)
""", number=N, globals=globals()))

print(timeit("""
target.fblits(frect_seq)
""", number=N, globals=globals()))

For some reason it appears that Rect usage times also improve (i.e., the speed improves), but nothing's really changed in that aspect, so mainly I was focusing on FRects and saw a 20%+ speed increase.
EDIT: ran my version comparison test again and this time Rects were down 14%, which shows rather high volatility, but realistically the speed should be about the same as it was before (so averages cancel out or something). FRects did show a steady improvement of 28% this time.
EDIT2/UPDATE2: ran the test again, Rects down 3%, FRects up 40%

Btw, this is how I compare it across versions (both wheels built locally, -main version is just a wheel I built upon creating the branch, so that's basically current main):

import subprocess


CURRENT_VERSION = "venv/Scripts/python -m pip install my_tests/pygame_ce-2.4.0.dev1-cp310-cp310-win_amd64.whl --force"
OTHER_VERSION = "venv/Scripts/python -m pip install my_tests/pygame_ce_main-2.4.0.dev1-cp310-cp310-win_amd64.whl --force"

TEST_COUNT = 5

current = True
current_values = []
other_values = []

def percentage_stuff(x0, x1):
    return (x1 - x0) / x0


def speed_stuff(time_change):
    return 1 / (1 + time_change) - 1


for _ in range(2 * TEST_COUNT):
    if current:
        subprocess.run(CURRENT_VERSION)
    elif not current:
        subprocess.run(OTHER_VERSION)

    completed = subprocess.run("venv/Scripts/python my_tests/test.py", capture_output=True, text=True)
    lst = [float(line) for line in completed.stdout.splitlines()[1:]]
    print(completed.stdout.splitlines()[0])
    if current:
        if not current_values:
            current_values = lst
        else:
            for index, value in enumerate(lst):
                current_values[index] += value
        print(f"{current_values = }")
        current = False
    elif not current:
        if not other_values:
            other_values = lst
        else:
            for index, value in enumerate(lst):
                other_values[index] += value
        print(f"{other_values = }")
        current = True

print("current version =", current_values)
print("latest release =", other_values)
for t0, t1 in zip(other_values, current_values):
    print(speed_stuff(percentage_stuff(t0, t1)) * 100, end=", ")
print()

Modifying `RectExport_RectFromObject` to also check for the other type of rect and return an SDL_(F)Rect object earlier.
@Matiiss Matiiss requested a review from a team as a code owner September 22, 2023 12:24
@Matiiss Matiiss added Code quality/robustness Code quality and resilience to changes Performance Related to the speed or resource usage of the project and removed Code quality/robustness Code quality and resilience to changes labels Sep 22, 2023
@Matiiss

This comment was marked as outdated.

@itzpr3d4t0r
Copy link
Member

itzpr3d4t0r commented Sep 22, 2023

I think you should test this with something like colliderect to avoid any potential extra overheads and not fblits. Also using something like min(repeat(...)) or mean(repeat(...)) instead of just timeit is also advisable as it gives a better estimate since outliers in the timings won't count towards a single value but will be discarded.

Also could we do it all in a single check that covers both cases? like a generic check?

@Matiiss

This comment was marked as outdated.

@Matiiss
Copy link
Member Author

Matiiss commented Sep 23, 2023

Results from the fblits testing. Blue and red lines are roughly the same thus the speed of getting a rect from a rect object remains the same, however, the orange line is this PR's time for getting a rect from a frect object and compared to current main which is the purple line, there is a considerable improvement. Lastly the green and brown lines are somewhat similar, it's likely that the green one (current PR) would be a little bit higher since there is now an extra check in between, but they mostly seem to be very similar.
image

Results from collidelist testing. Only tested opposite rectangles since that's what this PR mostly aims to optimize. As can be observed from the results, the blue and orange lines (this PR) are slightly lower (i.e., faster) than green and red lines (current main).
image

Testing scripts:
test.py (works by uncommenting stuff for testing different things, best if whole sections are commented out or uncommented (i.e., everything between collidelist test and fblits test and then everything after fblits test comments.))

### collidelist test

import timeit
import statistics

import pygame

N = 1
R = 100
# R = 10_000
C = 10_000
TESTS = 1000

rect = pygame.Rect(0, 0, 0, 0)
frect = pygame.Rect(0, 0, 0, 0)
rect_seq = [rect for _ in range(C)]
frect_seq = [frect for _ in range(C)]
list_seq = [[*rect] for _ in range(C)]


def thingy(stmt, num_tests, name=None, **kwargs):
    print(f"{name or stmt.strip()}:", [statistics.mean(timeit.repeat(stmt, **kwargs)) for _ in range(num_tests)])


# repeat("""
# rect.collidelist(frect_seq)
# """, number=N, repeat=R, globals=globals())

# thingy("""
# rect.collidelist(rect_seq)
# """, TESTS, number=N, repeat=R, globals=globals())

thingy("""
rect.collidelist(frect_seq)
""", TESTS, number=N, repeat=R, globals=globals())

thingy("""
frect.collidelist(rect_seq)
""", TESTS, number=N, repeat=R, globals=globals())

# thingy("""
# frect.collidelist(frect_seq)
# """, TESTS, number=N, repeat=R, globals=globals())

# thingy("""
# rect.collidelist(list_seq)
# """, TESTS, number=N, repeat=R, globals=globals())


# ### fblits test

# import timeit
# import statistics

# import pygame

# N = 1
# R = 10
# C = 5000
# TESTS = 5000

# target = pygame.Surface((0, 0))
# src = pygame.Surface((0, 0))
# rect = src.get_rect()
# frect = src.get_frect()
# rect_seq = [(src, rect) for _ in range(C)]
# frect_seq = [(src, frect) for _ in range(C)]
# list_seq = [(src, [*rect]) for _ in range(C)]


# def thingy(stmt, num_tests, name=None, **kwargs):
#     print(f"{name or stmt.strip()}:", [statistics.mean(timeit.repeat(stmt, **kwargs)) for _ in range(num_tests)])

# thingy("""
# target.fblits(rect_seq)
# """, TESTS, number=N, globals=globals())

# thingy("""
# target.fblits(frect_seq)
# """, TESTS, number=N, globals=globals())

# thingy("""
# target.fblits(list_seq)
# """, TESTS, number=N, globals=globals())

Script for version comparison, requires two venvs for quicker change between versions during testing, venv-main has locally compiled pygame-ce from the main branch and venv has pygame-ce compiled locally using this PR:

import ast
import collections
import subprocess

from matplotlib import pyplot as plt


CURRENT_VENV = "venv\Scripts\python"
OTHER_VENV = "venv-main\Scripts\python"

TEST_COUNT = 2

current = True
current_values = collections.defaultdict(list)
other_values = collections.defaultdict(list)

def percentage_stuff(x0, x1):
    return (x1 - x0) / x0


def speed_stuff(time_change):
    return 1 / (1 + time_change) - 1


for _ in range(2 * TEST_COUNT):
    completed = subprocess.run(f"{CURRENT_VENV if current else OTHER_VENV} my_tests/test.py", capture_output=True, text=True)
    lst = [line.split(": ") for line in completed.stdout.splitlines()[1:]]
    print(completed.stdout.splitlines()[0])
    if current:
        for name, value in lst:
            current_values["current " + name].extend(ast.literal_eval(value))
        current = False
    elif not current:
        for name, value in lst:
            other_values["main " + name].extend(ast.literal_eval(value))
        current = True

for name, value in (current_values | other_values).items():
    plt.plot(value, label=name)

plt.legend()
plt.show()

@novialriptide
Copy link
Member

novialriptide commented Sep 23, 2023

Slightly confused here; why isn't OtherRectCheck named FRectCheck?

@Matiiss
Copy link
Member Author

Matiiss commented Sep 23, 2023

Slightly confused here; why isn't OtherRectCheck named FRectCheck?

Because for FRect it would use RectCheck, both rects are created from the same "template", so in both cases the other rect is the other rect.

Copy link
Member

@ankith26 ankith26 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the PR 🎉

@ankith26 ankith26 added this to the 2.4.0 milestone Sep 26, 2023
@Starbuck5
Copy link
Member

We've had a lot of discussion about rounding recently, I'm concerned this explicit cast strategy closes the door on changing rounding behaviors in the future, or might lead to different rounding behavior based on exact object type.

@ankith26
Copy link
Member

ankith26 commented Oct 1, 2023

We've had a lot of discussion about rounding recently, I'm concerned this explicit cast strategy closes the door on changing rounding behaviors in the future, or might lead to different rounding behavior based on exact object type.

I don't think this change is changing any behaviour currently, even the generic/old path seems to truncate floats

Copy link
Member

@MyreMylar MyreMylar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems fine to me. I agree it doesn't appear to be changing current behaviour, or adding much more of it. We should probably address any float->int rounding strategy changes in a separate issue/PR.

@ankith26 ankith26 added the rect pygame.rect label Oct 9, 2023
@ankith26 ankith26 merged commit 1427fc2 into pygame-community:main Oct 9, 2023
30 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Performance Related to the speed or resource usage of the project rect pygame.rect
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants