misrasaurabh1
Contributor

Change Summary

📄 GenerateSchema._unpack_refs_defs() in pydantic/_internal/_generate_schema.py

📈 Performance improved by 26% (about 1.26x the original speed)

⏱️ Runtime went down from 98.9 microseconds to 78.4 microseconds
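
As a quick check on how the two figures relate, the sketch below recomputes the percentages from the reported runtimes. It assumes the 26% figure is the time saved relative to the new runtime; the variable names are illustrative only.

```python
# Reported runtimes, in microseconds (taken from the summary above).
old_runtime_us = 98.9
new_runtime_us = 78.4

saved_us = old_runtime_us - new_runtime_us

# Assumption: the headline figure is the time saved relative to the NEW runtime.
gain_vs_new = saved_us / new_runtime_us        # about 0.26
# The same change expressed as a reduction of the ORIGINAL runtime.
reduction_vs_old = saved_us / old_runtime_us   # about 0.21
# Ratio of the old runtime to the new runtime.
speed_ratio = old_runtime_us / new_runtime_us  # about 1.26

print(f"gain vs new: {gain_vs_new:.1%}")
print(f"runtime reduction: {reduction_vs_old:.1%}")
print(f"speed ratio: {speed_ratio:.2f}x")
```

Read this way, a 26% improvement corresponds to roughly a 21% shorter runtime.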

Explanation and details

Here is a revised version of the method, optimized for runtime performance. I focused on the dictionary comprehension and the helper-function call, which were straightforward opportunities for optimization. The original functionality and method signature are preserved.

Changes made

  1. Removed Nested Function: Removed the nested get_ref() function to minimize overhead and called s['ref'] directly inside the loop.
  2. In-place Update: Updated self.defs.definitions directly in a for loop instead of building an intermediate dictionary first.
  3. Direct Assignments: Assigned schema['schema'] to schema directly instead of multiple operations.
  4. Compact Function Logic: Simplified the function logic for better readability and slightly better performance.

These small optimizations should improve performance while keeping the original structure and logic intact.
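
The changes listed above can be sketched roughly as follows. This is a standalone illustration reconstructed from the description, not the actual diff: the free functions `unpack_before` and `unpack_after`, the `Schema` alias, and the plain `defs` dictionary are hypothetical stand-ins for the method, pydantic's `CoreSchema`, and `self.defs.definitions`.

```python
from typing import Any, Dict

# Hypothetical alias standing in for pydantic's CoreSchema.
Schema = Dict[str, Any]


def unpack_before(defs: Dict[str, Schema], schema: Schema) -> Schema:
    """Assumed starting shape: nested helper plus a dict comprehension."""

    def get_ref(s: Schema) -> str:
        return s['ref']

    if schema['type'] == 'definitions':
        # Builds a temporary dict, then merges it into defs.
        defs.update({get_ref(s): s for s in schema['definitions']})
        schema = schema['schema']
    return schema


def unpack_after(defs: Dict[str, Schema], schema: Schema) -> Schema:
    """Assumed optimized shape: no helper call, no temporary dict."""
    if schema['type'] == 'definitions':
        for s in schema['definitions']:
            # Write each definition straight into the target mapping.
            defs[s['ref']] = s
        return schema['schema']
    return schema


example = {
    'type': 'definitions',
    'definitions': [{'ref': 'def1', 'type': 'string'}],
    'schema': {'type': 'object'},
}
defs_before: Dict[str, Schema] = {}
defs_after: Dict[str, Schema] = {}
inner_before = unpack_before(defs_before, example)
inner_after = unpack_after(defs_after, example)
```

Both variants return the inner schema and register the same definitions. The second avoids one function call per definition and the allocation of a throwaway dictionary, which is where a saving of this size would plausibly come from.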

Correctness verification

The new optimized code was tested for correctness. The results are listed below.

🔘 (none found) − ⚙️ Existing Unit Tests

✅ 12 Passed − 🌀 Generated Regression Tests

# imports
from typing import Any, Dict

import pytest  # used for our unit tests

# function to test

class ConfigWrapper:
    pass

class ConfigWrapperStack:
    def __init__(self, config_wrapper: ConfigWrapper):
        pass

class TypesNamespaceStack:
    def __init__(self, types_namespace: Dict[str, Any] | None):
        pass

class _FieldNameStack:
    pass

class _ModelTypeStack:
    pass

class _Definitions:
    def __init__(self):
        self.definitions = {}

class CoreSchema(Dict[str, Any]):
    pass
from pydantic._internal._generate_schema import GenerateSchema

# unit tests

def test_single_definition_schema():
    schema = GenerateSchema(ConfigWrapper(), {}, {})
    input_schema = {
        'type': 'definitions',
        'definitions': [{'ref': 'def1', 'type': 'string'}],
        'schema': {'type': 'object'}
    }
    expected_output = {'type': 'object'}
    result = schema._unpack_refs_defs(input_schema)
    assert result == expected_output
    assert schema.defs.definitions == {'def1': {'ref': 'def1', 'type': 'string'}}

def test_multiple_definitions_schema():
    schema = GenerateSchema(ConfigWrapper(), {}, {})
    input_schema = {
        'type': 'definitions',
        'definitions': [
            {'ref': 'def1', 'type': 'string'},
            {'ref': 'def2', 'type': 'integer'}
        ],
        'schema': {'type': 'object'}
    }
    expected_output = {'type': 'object'}
    result = schema._unpack_refs_defs(input_schema)
    assert result == expected_output
    assert schema.defs.definitions == {
        'def1': {'ref': 'def1', 'type': 'string'},
        'def2': {'ref': 'def2', 'type': 'integer'}
    }

def test_empty_definitions():
    schema = GenerateSchema(ConfigWrapper(), {}, {})
    input_schema = {
        'type': 'definitions',
        'definitions': [],
        'schema': {'type': 'object'}
    }
    expected_output = {'type': 'object'}
    result = schema._unpack_refs_defs(input_schema)
    assert result == expected_output
    assert schema.defs.definitions == {}

def test_missing_ref_key():
    schema = GenerateSchema(ConfigWrapper(), {}, {})
    input_schema = {
        'type': 'definitions',
        'definitions': [{'type': 'string'}],
        'schema': {'type': 'object'}
    }
    with pytest.raises(KeyError):
        schema._unpack_refs_defs(input_schema)

def test_non_definitions_schema():
    schema = GenerateSchema(ConfigWrapper(), {}, {})
    input_schema = {
        'type': 'object',
        'properties': {'name': {'type': 'string'}}
    }
    expected_output = {
        'type': 'object',
        'properties': {'name': {'type': 'string'}}
    }
    result = schema._unpack_refs_defs(input_schema)
    assert result == expected_output
    assert schema.defs.definitions == {}

def test_invalid_schema_structure():
    schema = GenerateSchema(ConfigWrapper(), {}, {})
    input_schema = {
        'definitions': [{'ref': 'def1', 'type': 'string'}]
    }
    with pytest.raises(KeyError):
        schema._unpack_refs_defs(input_schema)

def test_nested_definitions():
    schema = GenerateSchema(ConfigWrapper(), {}, {})
    input_schema = {
        'type': 'definitions',
        'definitions': [{'ref': 'def1', 'type': 'string'}],
        'schema': {
            'type': 'definitions',
            'definitions': [{'ref': 'def2', 'type': 'integer'}],
            'schema': {'type': 'object'}
        }
    }
    expected_output = {
        'type': 'definitions',
        'definitions': [{'ref': 'def2', 'type': 'integer'}],
        'schema': {'type': 'object'}
    }
    result = schema._unpack_refs_defs(input_schema)
    assert result == expected_output
    assert schema.defs.definitions == {'def1': {'ref': 'def1', 'type': 'string'}}

def test_large_number_of_definitions():
    schema = GenerateSchema(ConfigWrapper(), {}, {})
    input_schema = {
        'type': 'definitions',
        'definitions': [{'ref': f'def{i}', 'type': 'string'} for i in range(1000)],
        'schema': {'type': 'object'}
    }
    expected_output = {'type': 'object'}
    result = schema._unpack_refs_defs(input_schema)
    assert result == expected_output
    assert len(schema.defs.definitions) == 1000

def test_non_dictionary_schema():
    schema = GenerateSchema(ConfigWrapper(), {}, {})
    input_schemas = [
        ['type', 'definitions', 'schema'],
        'type: definitions',
        12345
    ]
    for input_schema in input_schemas:
        with pytest.raises(TypeError):
            schema._unpack_refs_defs(input_schema)

def test_schema_with_extra_keys():
    schema = GenerateSchema(ConfigWrapper(), {}, {})
    input_schema = {
        'type': 'definitions',
        'definitions': [{'ref': 'def1', 'type': 'string'}],
        'schema': {'type': 'object'},
        'extra_key': 'extra_value'
    }
    expected_output = {'type': 'object'}
    result = schema._unpack_refs_defs(input_schema)
    assert result == expected_output
    assert schema.defs.definitions == {'def1': {'ref': 'def1', 'type': 'string'}}

def test_schema_with_none_values():
    schema = GenerateSchema(ConfigWrapper(), {}, {})
    input_schema = {
        'type': 'definitions',
        'definitions': [{'ref': 'def1', 'type': 'string'}],
        'schema': None
    }
    with pytest.raises(TypeError):
        schema._unpack_refs_defs(input_schema)

def test_definitions_with_non_string_ref_values():
    schema = GenerateSchema(ConfigWrapper(), {}, {})
    input_schemas = [
        {
            'type': 'definitions',
            'definitions': [{'ref': 123, 'type': 'string'}],
            'schema': {'type': 'object'}
        },
        {
            'type': 'definitions',
            'definitions': [{'ref': ['list', 'of', 'refs'], 'type': 'string'}],
            'schema': {'type': 'object'}
        },
        {
            'type': 'definitions',
            'definitions': [{'ref': None, 'type': 'string'}],
            'schema': {'type': 'object'}
        }
    ]
    for input_schema in input_schemas:
        with pytest.raises(TypeError):
            schema._unpack_refs_defs(input_schema)

def test_circular_references_in_definitions():
    schema = GenerateSchema(ConfigWrapper(), {}, {})
    input_schema = {
        'type': 'definitions',
        'definitions': [
            {'ref': 'def1', 'type': 'object', 'properties': {'nested': {'ref': 'def2'}}},
            {'ref': 'def2', 'type': 'object', 'properties': {'nested': {'ref': 'def1'}}}
        ],
        'schema': {'type': 'object'}
    }
    expected_output = {'type': 'object'}
    result = schema._unpack_refs_defs(input_schema)
    assert result == expected_output
    assert schema.defs.definitions == {
        'def1': {'ref': 'def1', 'type': 'object', 'properties': {'nested': {'ref': 'def2'}}},
        'def2': {'ref': 'def2', 'type': 'object', 'properties': {'nested': {'ref': 'def1'}}}
    }

def test_invalid_types_in_definitions_list():
    schema = GenerateSchema(ConfigWrapper(), {}, {})
    input_schemas = [
        {
            'type': 'definitions',
            'definitions': ['not_a_dict'],
            'schema': {'type': 'object'}
        },
        {
            'type': 'definitions',
            'definitions': [12345],
            'schema': {'type': 'object'}
        }
    ]
    for input_schema in input_schemas:
        with pytest.raises(TypeError):
            schema._unpack_refs_defs(input_schema)

def test_mixed_valid_and_invalid_definitions():
    schema = GenerateSchema(ConfigWrapper(), {}, {})
    input_schema = {
        'type': 'definitions',
        'definitions': [
            {'ref': 'def1', 'type': 'string'},
            'not_a_dict',
            {'ref': 'def2', 'type': 'integer'}
        ],
        'schema': {'type': 'object'}
    }
    with pytest.raises(TypeError):
        schema._unpack_refs_defs(input_schema)

def test_schema_with_missing_type_key():
    schema = GenerateSchema(ConfigWrapper(), {}, {})
    input_schema = {
        'definitions': [{'ref': 'def1', 'type': 'string'}],
        'schema': {'type': 'object'}
    }
    with pytest.raises(KeyError):
        schema._unpack_refs_defs(input_schema)

def test_schema_with_non_string_type_value():
    schema = GenerateSchema(ConfigWrapper(), {}, {})
    input_schemas = [
        {
            'type': 123,
            'definitions': [{'ref': 'def1', 'type': 'string'}],
            'schema': {'type': 'object'}
        },
        {
            'type': ['definitions'],
            'definitions': [{'ref': 'def1', 'type': 'string'}],
            'schema': {'type': 'object'}
        }
    ]
    for input_schema in input_schemas:
        with pytest.raises(TypeError):
            schema._unpack_refs_defs(input_schema)

def test_schema_with_non_dictionary_schema_value():
    schema = GenerateSchema(ConfigWrapper(), {}, {})
    input_schemas = [
        {
            'type': 'definitions',
            'definitions': [{'ref': 'def1', 'type': 'string'}],
            'schema': 'not_a_dict'
        },
        {
            'type': 'definitions',
            'definitions': [{'ref': 'def1', 'type': 'string'}],
            'schema': 12345
        }
    ]
    for input_schema in input_schemas:
        with pytest.raises(TypeError):
            schema._unpack_refs_defs(input_schema)

✅ 1 Passed − ⏪ Replay Tests

This optimization was discovered automatically by using codeflash.ai

Checklist

  • The pull request title is a good summary of the changes - it will be used in the changelog
  • Unit tests for the changes exist
  • Tests pass on CI
  • Documentation reflects the changes where applicable
  • My PR is ready to review, please add a comment including the phrase "please review" to assign reviewers

codeflash-ai bot and others added 2 commits July 18, 2024 09:08
github-actions bot added the relnotes-fix label (used for bugfixes) on Jul 23, 2024

codspeed-hq bot commented Jul 23, 2024

CodSpeed Performance Report

Merging #9949 will not alter performance

Comparing codeflash-ai:codeflash/optimize-GenerateSchema._unpack_refs_defs-2024-07-18T09.08.29 (e9fe38d) with main (9bcb120)

Summary

✅ 14 untouched benchmarks

sydney-runkle added the relnotes-performance label (used for performance improvements) and removed the relnotes-fix label (used for bugfixes) on Jul 24, 2024
Co-authored-by: Sydney Runkle <54324534+sydney-runkle@users.noreply.github.com>
sydney-runkle merged commit 366542c into pydantic:main on Jul 25, 2024
Labels
relnotes-performance Used for performance improvements.
3 participants