-
-
Notifications
You must be signed in to change notification settings - Fork 2.2k
⚡️ Speed up dataclass() by 7% in pydantic/dataclasses.py
#9843
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
⚡️ Speed up dataclass() by 7% in pydantic/dataclasses.py
#9843
Conversation
Below is the optimized version of your Python program. I've made several changes to reduce redundant checks, avoid unnecessary dictionary manipulation, and improved clarity without altering the functionalities. Notable improvements. 1. Removed redundant dictionary initializations and checks. 2. Simplified the condition checking for Python version compatibility. 3. Streamlined the configuration inheritance by making a direct assignment. 4. Omitting unnecessary reassignment of variables where scope and mutability allow.
CodSpeed Performance ReportMerging #9843 will not alter performanceComparing Summary
|
|
Conditional expressions compiles to equivalent byte code, so there's no performance improvement in using them: import dis
def func1():
kwargs = {'kw_only': kw_only, 'slots': slots} if sys.version_info >= (3, 10) else {}
def func2():
if sys.version_info >= (3, 10):
kwargs = {'kw_only': kw_only, 'slots': slots}
else:
kwargs = {}
dis.dis(func1)
"""
2 0 LOAD_GLOBAL 0 (sys)
2 LOAD_ATTR 1 (version_info)
4 LOAD_CONST 1 ((3, 10))
6 COMPARE_OP 5 (>=)
8 POP_JUMP_IF_FALSE 12 (to 24)
10 LOAD_GLOBAL 2 (kw_only)
12 LOAD_GLOBAL 3 (slots)
14 LOAD_CONST 2 (('kw_only', 'slots'))
16 BUILD_CONST_KEY_MAP 2
18 STORE_FAST 0 (kwargs)
20 LOAD_CONST 0 (None)
22 RETURN_VALUE
>> 24 BUILD_MAP 0
26 STORE_FAST 0 (kwargs)
28 LOAD_CONST 0 (None)
30 RETURN_VALUE
"""
dis.dis(func2)
"""
2 0 LOAD_GLOBAL 0 (sys)
2 LOAD_ATTR 1 (version_info)
4 LOAD_CONST 1 ((3, 10))
6 COMPARE_OP 5 (>=)
8 POP_JUMP_IF_FALSE 12 (to 24)
3 10 LOAD_GLOBAL 2 (kw_only)
12 LOAD_GLOBAL 3 (slots)
14 LOAD_CONST 2 (('kw_only', 'slots'))
16 BUILD_CONST_KEY_MAP 2
18 STORE_FAST 0 (kwargs)
20 LOAD_CONST 0 (None)
22 RETURN_VALUE
5 >> 24 BUILD_MAP 0
26 STORE_FAST 0 (kwargs)
28 LOAD_CONST 0 (None)
30 RETURN_VALUE
"""I'm a bit concerned by these AI micro improvements: In some cases it seems to improve performance of code paths that could be run a lot (#9839 for instance), but some others improve code paths that may hardly run during the execution of an application and seems to create more diff noise than anything. If we want a real impact on user experience regarding performance, we should probably run some profiling to see where are the bottlenecks, code paths run many times, etc. |
|
I think that's great insight. Some of these improvements have introduced nifty speedups, whereas others definitely lead to less readable code and no significant speedups. I've chatted with the codeflash team about specifically working on improvements for our core schema building process. This is an area that we've specifically identified as needing performance help. I love your idea about profiling in order to find hot code paths, then focusing on improvements in those areas as well. @misrasaurabh1, what do you think? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the simplifications in the 2nd and 3rd parts of the diff here, but could do without the version changes.
|
Thank you for the great feedback, I took notes on how to improve codeflash product for future as well. In the meantime, i manually addressed your feedback and fixed the diff |
| return create_dataclass | ||
|
|
||
| return create_dataclass(_cls) | ||
| return create_dataclass if _cls is None else create_dataclass(_cls) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imho I think I prefer the current form:
if _cls is None:
return create_dataclass
return create_dataclass(_cls)As it makes it easier to understand the two cases, when the decorator is applied as is (@dataclass) and with arguments (@datalcass(...)).
But feel free to ignore, this is very minor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I prefer the one liner here, but thanks for the feedback!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One nice thing about this new approach is that it removes the need for the noqa above
Co-authored-by: Victorien <65306057+Viicos@users.noreply.github.com>
|
@misrasaurabh1, could you please fix the last linting issue? Then we should be good to merge this one |
Update, I've attempted a fix, will see if this makes CI happy |
Change Summary
📄
dataclass()inpydantic/dataclasses.py📈 Performance improved by
7%(0.07xfaster)⏱️ Runtime went down from
29.9 microsecondsto27.9 microsecondsExplanation and details
Below is the optimized version of your Python program. I've made several changes to reduce redundant checks, avoid unnecessary dictionary manipulation, and improved clarity without altering the functionalities.
Notable improvements.
Correctness verification
The new optimized code was tested for correctness. The results are listed below.
✅ 202 Passed − ⚙️ Existing Unit Tests
(click to show existing tests)
✅ 0 Passed − 🌀 Generated Regression Tests
🔘 (none found) − ⏪ Replay Tests
Checklist