# Dependency
> Using Pydantic's `validate_call` arguments to run dependent functions

**The Setup**

In a previous role, my team had to produce multiple PDF reports on various schedules. We had a number of Python functions that produced data for these reports, and some of that data was reused in other sections. For example, `function_a` produces `table_a`, and `function_b` uses `table_a` to produce `table_b`. `function_c` also uses `table_a` but produces a different result.

As we developed these dependent functions, we wanted to be sure that the entire dependency tree worked. But in production, we didn't want to re-run the same function over and over, especially when up-to-the-minute data was involved that might change slightly over the course of report generation.

**The Goal**  
We wanted a pattern that would allow us to write a dependent function that, when called, could either produce the dependency or retrieve it from state. 

From our earlier example: 
- `function_b` is the dependent (because it depends on the result of `function_a`).
- `function_a` is the dependency (because `function_b` relies on it).
- `function_c` is also a dependent of `function_a`.

When called, `function_b` automatically generates the result of `function_a`. Later, when `function_c` is called, it retrieves the result of `function_a` from state. This improves performance and guarantees that `function_c` sees the same inputs as `function_b`.
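
To make the goal concrete, here is a minimal sketch of the produce-or-retrieve idea in plain Python, without Pydantic. The `resolve` helper, the `calls` counter, and the list-based tables are hypothetical stand-ins chosen for illustration; they are not part of the code later in this post.

```python
from typing import Any, Callable, Dict

# Hypothetical counter, used only to show how often the dependency runs.
calls = {"function_a": 0}


def function_a(state: Dict[str, Any]) -> list:
    """The dependency: produces table_a."""
    calls["function_a"] += 1
    return [1, 2, 3]


def resolve(state: Dict[str, Any], key: str, func: Callable) -> Any:
    """Return state[key], computing it with func only when it is missing."""
    if key not in state:
        state[key] = func(state)
    return state[key]


def function_b(state: Dict[str, Any]) -> list:
    """A dependent: doubles every value in table_a."""
    table_a = resolve(state, "table_a", function_a)
    return [x * 2 for x in table_a]


def function_c(state: Dict[str, Any]) -> int:
    """Another dependent: sums table_a."""
    table_a = resolve(state, "table_a", function_a)
    return sum(table_a)


state: Dict[str, Any] = {}
table_b = function_b(state)  # produces table_a, then uses it
total = function_c(state)    # finds table_a already in state
```

Because both dependents share one `state` dict, `function_a` runs once and both see the same `table_a`.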

**The Result**  
A `Dependency` object with a mapping of argument names to dependency functions. Dependencies are declared in a function signature using type annotations, and functions that use this structure are wrapped in a decorator that implements the pattern.


In [None]:
#| exporti 

import logging
from typing import Any, Callable, Dict
from pydantic import ValidationError, validate_call, GetCoreSchemaHandler
from pydantic_core import core_schema
from pydantic.json_schema import JsonSchemaValue, GetJsonSchemaHandler

In [None]:
#| exporti

# create logger
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)

# create console handler and set level to debug
ch = logging.StreamHandler()
ch.setLevel(logging.DEBUG)

# create formatter
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')

# add formatter to ch
ch.setFormatter(formatter)

# add ch to logger
logger.addHandler(ch)

In [None]:
#|export 

logger = logging.getLogger(__name__)

class Dependency:
    depends_on: Dict[str, Callable]

    @classmethod
    def validate(cls, value: Dict[str, Any]) -> Dict[str, Any]:
        """
        Validate the input dictionary based on the functions in `depends_on`.
        If a function is not present in the dictionary, call the function and store the result.
        """
        for key, func in cls.depends_on.items():
            if key not in value:
                logger.info(f"validating: {func.__name__} as {key}")
                output = func(value)
                value[key] = output

            else:
                logger.info(f"retrieving {key} from state.")
        return value

    @classmethod
    def __get_pydantic_core_schema__(
        cls, source_type: Any, handler: GetCoreSchemaHandler
    ) -> core_schema.CoreSchema:
        """
        Core schema definition for Pydantic, integrating the Dependency class
        with the list of functions passed via depends_on.
        """
        return core_schema.chain_schema([
            core_schema.dict_schema(),
            core_schema.no_info_plain_validator_function(cls.validate)
        ])

    @classmethod
    def __get_pydantic_json_schema__(
        cls, _core_schema: core_schema.CoreSchema, handler: GetJsonSchemaHandler
    ) -> JsonSchemaValue:
        """
        Defines how the `Dependency` object should be serialized in JSON schemas.
        """
        json_schema = handler(core_schema.dict_schema())
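        # Log the registered dependencies instead of printing during schema generation
        logger.debug(f"building JSON schema for dependencies: {list(cls.depends_on)}")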
        json_schema.update({'description': 'A custom Dependency type with named keys'})
        return json_schema


# Custom `depends_on` to parameterize the Dependency class with named keys
def depends_on(**functions: Callable) -> type:
    """
    Dynamically creates a new subclass of Dependency with the depends_on dict set to the given functions.
    This allows for named dependencies.
    """
    return type(
        f'Dependency({", ".join(f"{k}={v.__name__}" for k, v in functions.items())})',
        (Dependency,),
        {'depends_on': functions}
    )
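
The `depends_on` factory above relies on the three-argument form of the built-in `type(name, bases, namespace)`, which builds a class at runtime. The sketch below isolates that mechanism with no Pydantic involved; `Base`, `with_functions`, and `load_table` are hypothetical names used only for this illustration.

```python
from typing import Callable, Dict


class Base:
    # Class-level mapping that each generated subclass overrides.
    depends_on: Dict[str, Callable] = {}


def with_functions(**functions: Callable) -> type:
    """Build a subclass of Base whose depends_on is the given mapping."""
    name = "Base(" + ", ".join(f"{k}={v.__name__}" for k, v in functions.items()) + ")"
    # type(name, bases, namespace) creates a new class object at runtime.
    return type(name, (Base,), {"depends_on": functions})


def load_table() -> list:
    return [1, 2, 3]


Sub = with_functions(table=load_table)
Other = with_functions(rows=load_table)
```

Each call returns a distinct class, so two annotations with different mappings do not share state, and the base class mapping is left untouched.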

In [None]:
import pandas as pd

In [None]:
# Sample function to return a dummy DataFrame
def create_dummy_frame(*args):
    return pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

# Using validate_call for input validation
@validate_call
def double_dummy(
    data: depends_on(dummy_data=create_dummy_frame)
):
    assert data['dummy_data'].equals(create_dummy_frame())
    return data['dummy_data'] * 2

In [None]:
# Example of calling the function where create_dummy_frame is called by the validator
try:
    result = double_dummy({})
    print(result)
except ValidationError as e:
    print(e)

2024-09-19 09:45:48,581 - __main__ - INFO - validating: create_dummy_frame as dummy_data


   a   b
0  2   8
1  4  10
2  6  12


In [None]:
# or you can pass the data yourself
double_dummy(
    data={'dummy_data':create_dummy_frame()}
)

2024-09-19 09:45:48,588 - __main__ - INFO - retrieving dummy_data from state.


   a   b
0  2   8
1  4  10
2  6  12


In [None]:

@validate_call
def call_both(data:depends_on(
    dummy_data=create_dummy_frame,
    double=double_dummy
)):
    return data

In [None]:
call_both(data={})

2024-09-19 09:45:48,598 - __main__ - INFO - validating: create_dummy_frame as dummy_data
2024-09-19 09:45:48,598 - __main__ - INFO - validating: double_dummy as double
2024-09-19 09:45:48,601 - __main__ - INFO - retrieving dummy_data from state.


{'dummy_data':    a  b
 0  1  4
 1  2  5
 2  3  6,
 'double':    a   b
 0  2   8
 1  4  10
 2  6  12}
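
The log above suggests that dependencies resolve in the order they are listed: `dummy_data` is produced first, so `double_dummy` finds it in state. The following plain-Python sketch mimics the loop in `Dependency.validate` to show why that order matters. `resolve_all`, `make_base`, `make_double`, and the `log` list are hypothetical, and the nested call works on its own copy of the dict, which is an assumption made to keep the sketch conservative.

```python
from typing import Any, Callable, Dict, List

log: List[str] = []  # hypothetical stand-in for the logger output


def resolve_all(state: Dict[str, Any], depends_on: Dict[str, Callable]) -> Dict[str, Any]:
    """Fill in missing keys in listing order, mirroring the validate loop."""
    for key, func in depends_on.items():
        if key not in state:
            log.append(f"validating {key}")
            state[key] = func(state)
        else:
            log.append(f"retrieving {key}")
    return state


def make_base(state: Dict[str, Any]) -> list:
    return [1, 2, 3]


def make_double(state: Dict[str, Any]) -> list:
    # Assumption: the nested resolution works on a copy of the incoming dict.
    resolved = resolve_all(dict(state), {"base": make_base})
    return [x * 2 for x in resolved["base"]]


# Dependency listed first: the dependent retrieves it from state.
result = resolve_all({}, {"base": make_base, "double": make_double})
ordered_log = list(log)

# Dependent listed first: the dependency is computed twice.
log.clear()
resolve_all({}, {"double": make_double, "base": make_base})
reversed_log = list(log)
```

Under these assumptions, listing a dependency before the functions that use it avoids computing it more than once.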

In [None]:
from pydantic import BaseModel

class Model(BaseModel):
    data: depends_on(dummy_data=create_dummy_frame, double=double_dummy)

In [None]:
dep = depends_on(dummy_data=create_dummy_frame, double=double_dummy)
dep

__main__.Dependency(dummy_data=create_dummy_frame, double=double_dummy)