misrasaurabh1
Contributor

Change Summary

📄 is_pydantic_dataclass() in pydantic/dataclasses.py

📈 Performance improved by 41% (0.41x faster)

⏱️ Runtime went down from 4.80 microseconds to 3.40 microseconds

Explanation and details

Here's the optimized version of the provided program.

Explanation.

  1. Short-circuit Evaluation: I reordered the conditions in the return statement to take advantage of short-circuit evaluation. The check '__pydantic_validator__' in class_.__dict__ is cheaper and fails fast when class_ does not define this attribute, which avoids unnecessary calls to the relatively more expensive dataclasses.is_dataclass function.

  2. Error Handling: Added a try-except block to handle the case where class_ has no __dict__ attribute (some objects do not have an attribute dictionary). This keeps the function from raising an AttributeError on such inputs.

This should make the function slightly faster, especially for objects that exit early on the __dict__ check.
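The description above does not include the code itself, so here is a minimal sketch of what the two points could look like together. The function name is_pydantic_dataclass_sketch and the return value of False in the except branch are assumptions made for illustration, not the merged pydantic implementation.

```python
import dataclasses
from typing import Any


def is_pydantic_dataclass_sketch(class_: Any) -> bool:
    """Hypothetical sketch of the reordered check with try/except.

    Assumption: objects without a __dict__ are treated as "not a
    pydantic dataclass" and the function returns False for them.
    """
    try:
        # Cheap membership test first; is_dataclass() only runs if it passes.
        return '__pydantic_validator__' in class_.__dict__ and dataclasses.is_dataclass(class_)
    except AttributeError:
        # class_ has no __dict__ (for example, an int or a bare object()).
        return False
```

Because the membership test looks only at class_.__dict__, an attribute inherited from a parent class would not count in this sketch; only an attribute set directly on the class does.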

Correctness verification

The new optimized code was tested for correctness. The results are listed below.

✅ 211 Passed − ⚙️ Existing Unit Tests

- test_dataclasses.py

Checklist

  • The pull request title is a good summary of the changes - it will be used in the changelog
  • Unit tests for the changes exist
  • Tests pass on CI
  • Documentation reflects the changes where applicable
  • My PR is ready to review, please add a comment including the phrase "please review" to assign reviewers

codeflash-ai bot and others added 4 commits June 4, 2024 21:41
@github-actions github-actions bot added the relnotes-fix Used for bugfixes. label Jun 12, 2024

codspeed-hq bot commented Jun 12, 2024

CodSpeed Performance Report

Merging #9652 will not alter performance

Comparing misrasaurabh1:codeflash/optimize-is_pydantic_dataclass-2024-06-04T21.41.42 (0bf4c9f) with main (6b0e7de)

Summary

✅ 13 untouched benchmarks

@sydney-runkle sydney-runkle added relnotes-performance Used for performance improvements. and removed relnotes-fix Used for bugfixes. labels Jun 18, 2024
@sydney-runkle
Contributor

@misrasaurabh1,

How about:

if hasattr(class_, '__dict__'):
    return '__pydantic_validator__' in class_.__dict__ and dataclasses.is_dataclass(class_)
return False

What are the performance changes like with that approach?

@misrasaurabh1
Contributor Author

Great suggestion. When I tested your recommendation with codeflash on my laptop, I got a 23% speedup. When I tried the try/except approach (the one that codeflash originally recommended), I got a 36% speedup. So the try/except approach seems to be faster.
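For anyone who wants to reproduce the comparison locally, here is a rough sketch using timeit from the standard library rather than codeflash. The function names, the sample inputs, and the iteration counts are all assumptions chosen for illustration; absolute numbers will differ by machine and Python version, so no expected output is shown.

```python
import dataclasses
import timeit


def with_hasattr(class_):
    # Variant suggested in review: guard the __dict__ access with hasattr().
    if hasattr(class_, '__dict__'):
        return '__pydantic_validator__' in class_.__dict__ and dataclasses.is_dataclass(class_)
    return False


def with_try_except(class_):
    # Variant from the PR description: attempt the access, catch AttributeError.
    try:
        return '__pydantic_validator__' in class_.__dict__ and dataclasses.is_dataclass(class_)
    except AttributeError:
        return False


@dataclasses.dataclass
class PlainDataclass:
    x: int = 0


@dataclasses.dataclass
class MarkedDataclass:
    x: int = 0


# Stand-in for a pydantic dataclass: set the marker attribute by hand (assumption).
MarkedDataclass.__pydantic_validator__ = object()

# One input per path: early exit, full check, and no __dict__ at all.
SAMPLES = [PlainDataclass, MarkedDataclass, 42]


def time_variant(func, number=20_000):
    # Best of five runs, each calling func on every sample `number` times.
    return min(timeit.repeat(lambda: [func(s) for s in SAMPLES], number=number, repeat=5))


if __name__ == '__main__':
    for variant in (with_hasattr, with_try_except):
        print(f'{variant.__name__}: {time_variant(variant):.4f} s')
```

The mix of inputs matters here: try/except is cheap when no exception is raised, but handling an AttributeError costs more than a failed hasattr() check, so the relative ranking can shift with how often inputs lack a __dict__.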

@sydney-runkle
Contributor

> Great suggestion. When I tested your recommendation with codeflash on my laptop, I got a 23% speedup. When I tried the try/except approach (the one that codeflash originally recommended), I got a 36% speedup. So the try/except approach seems to be faster.

Awesome, thanks for looking into this. We're happy with the existing change, then!

I'll note that this isn't a super critical function to speed up, but we'll take the performance benefit where we can get it :). We're excited to collaborate further on more critical functions!

Contributor

@sydney-runkle sydney-runkle left a comment

Looks great! Thanks folks!

@sydney-runkle sydney-runkle enabled auto-merge (squash) June 20, 2024 23:19
@sydney-runkle sydney-runkle merged commit 5a77c5c into pydantic:main Jun 20, 2024
Labels
relnotes-performance Used for performance improvements.