Skip to content

Reduce pytensor function call overhead #1351

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ricardoV94
Copy link
Member

@ricardoV94 ricardoV94 commented Apr 8, 2025

Removing all the deprecated stuff that nobody uses

I think after this there's not much more fat to trim. We could try to avoid the input/output list thing with linkers that don't need it but we still need to sort them/ recognize kwargs which is pretty much what's left.

The C part of the call takes 100ns, so the "overhead" in this case is 200-250ns.
For reference, calling numpy.zeros(100) takes like ~250ns, and a python identity function is ~20ns.

Results of the new benchmark

Before
---------------------------------------------------------------------------------------------------------- benchmark: 4 tests ----------------------------------------------------------------------------------------------------------
Name (time in ns)                                          Min                    Max                  Mean              StdDev                Median                IQR            Outliers  OPS (Kops/s)            Rounds  Iterations
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_overhead_benchmar[NO-GC-trust_input=True]        618.2859 (1.0)      33,186.5717 (3.01)       698.3395 (1.0)      204.4863 (1.0)        675.5714 (1.0)      20.1427 (1.0)    4439;14591    1,431.9682 (1.0)      193799           7
test_overhead_benchmar[GC-trust_input=True]           690.9995 (1.12)     11,041.0001 (1.0)        804.4022 (1.15)     437.7705 (2.14)       770.9987 (1.14)     30.9992 (1.54)     264;1235    1,243.1592 (0.87)      35559           1
test_overhead_benchmar[NO-GC-trust_input=False]     1,622.9988 (2.62)     40,334.9986 (3.65)     1,784.9450 (2.56)     338.6002 (1.66)     1,772.9999 (2.62)     60.0012 (2.98)      167;349      560.2413 (0.39)      23213           1
test_overhead_benchmar[GC-trust_input=False]        1,632.9996 (2.64)     22,442.0000 (2.03)     1,784.2403 (2.55)     256.1013 (1.25)     1,763.0009 (2.61)     59.0007 (2.93)     980;1555      560.4626 (0.39)      61920           1
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
After
------------------------------------------------------------------------------------------------------------ benchmark: 4 tests -----------------------------------------------------------------------------------------------------------
Name (time in ns)                                          Min                     Max                  Mean                StdDev                Median                IQR            Outliers  OPS (Kops/s)            Rounds  Iterations
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_overhead_benchmar[NO-GC-trust_input=True]        317.7142 (1.0)       23,333.0000 (2.65)       358.4749 (1.0)        141.9661 (1.20)       348.5000 (1.0)      11.5000 (1.0)     2884;8322    2,789.5957 (1.0)      196079          14
test_overhead_benchmar[GC-trust_input=True]           440.9994 (1.39)       8,815.9995 (1.0)        492.2538 (1.37)       118.0708 (1.0)        490.9998 (1.41)     30.0006 (2.61)       35;252    2,031.4722 (0.73)      26596           1
test_overhead_benchmar[NO-GC-trust_input=False]     1,393.0003 (4.38)      13,054.9997 (1.48)     1,569.0200 (4.38)       274.1532 (2.32)     1,541.9992 (4.42)     59.0007 (5.13)      712;985      637.3405 (0.23)      26839           1
test_overhead_benchmar[GC-trust_input=False]        1,473.0012 (4.64)     135,623.9991 (15.38)    1,680.1193 (4.69)     1,107.7806 (9.38)     1,612.9998 (4.63)     50.9990 (4.43)     182;1043      595.1958 (0.21)      30192           1
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

📚 Documentation preview 📚: https://pytensor--1351.org.readthedocs.build/en/1351/

@ricardoV94 ricardoV94 force-pushed the reduce_function_overhead branch from 88ef60d to 0dc3147 Compare April 9, 2025 18:07
@ricardoV94 ricardoV94 force-pushed the reduce_function_overhead branch from 0dc3147 to 782bc2d Compare April 9, 2025 18:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant