Unable to handle Pydantic BaseModel subclasses with custom init functions #8542

masonmenges · 2023-02-15T18:26:04Z

First check

I added a descriptive title to this issue.
I used the GitHub search to find a similar issue and didn't find it.
I searched the Prefect documentation for this issue.
I checked that this issue is related to Prefect and not one of its dependencies.

Bug summary

Attempting to instantiate Pydantic Subclasses with custom init functions within a task or referencing the model within a task results in the following error model_instance = typ( TypeError: MyModel.__init__() missing 1 required keyword-only argument: 'bar' running the same code in the example below with both tasks as functions completes successfully.

Reproduction

from pydantic import BaseModel
from prefect import flow, task

class MyModel(BaseModel):
    id: int
    info: str = "Foo"

    def __init__(self, id: int = 1, *, bar: str, **data) -> None:
        """My custom init!"""
        super().__init__(id=id, bar=bar, **data)

@task
def return_model(x) -> MyModel:
    return MyModel(id=x, bar="foo")


@task()
def use_model(model: MyModel):
    print(model.id)


@flow
def pydantic_test(x: int = 10):
    model = return_model(x)
    use_model(model)


if __name__ == "__main__":
    print(pydantic_test(2))

Error

Traceback (most recent call last):
  File "/Users/masonmenges/Repos/support-directory/prefect-support/cloud2/masonm/test/test13.py", line 37, in <module>
    print(pydantic_test(2))
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/flows.py", line 456, in __call__
    return enter_flow_run_engine_from_flow_call(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/engine.py", line 170, in enter_flow_run_engine_from_flow_call
    return anyio.run(begin_run)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/anyio/_core/_eventloop.py", line 70, in run
    return asynclib.run(func, *args, **backend_options)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 292, in run
    return native_run(wrapper(), debug=debug)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete
    return future.result()
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 287, in wrapper
    return await func(*args)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/client/utilities.py", line 47, in with_injected_client
    return await fn(*args, **kwargs)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/engine.py", line 250, in create_then_begin_flow_run
    return await state.result(fetch=True)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/states.py", line 89, in _get_state_result
    raise await get_state_exception(state)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/engine.py", line 650, in orchestrate_flow_run
    result = await run_sync(flow_call)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/utilities/asyncutils.py", line 154, in run_sync_in_interruptible_worker_thread
    async with anyio.create_task_group() as tg:
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 662, in __aexit__
    raise exceptions[0]
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/utilities/asyncutils.py", line 135, in capture_worker_thread_and_result
    result = __fn(*args, **kwargs)
  File "/Users/masonmenges/Repos/support-directory/prefect-support/cloud2/masonm/test/test13.py", line 33, in pydantic_test
    use_model(model)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/tasks.py", line 456, in __call__
    return enter_task_run_engine(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/engine.py", line 940, in enter_task_run_engine
    return run_async_from_worker_thread(begin_run)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/utilities/asyncutils.py", line 177, in run_async_from_worker_thread
    return anyio.from_thread.run(call)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/anyio/from_thread.py", line 49, in run
    return asynclib.run_async_from_thread(func, *args)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 970, in run_async_from_thread
    return f.result()
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/concurrent/futures/_base.py", line 446, in result
    return self.__get_result()
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/engine.py", line 1089, in get_task_call_return_value
    return await future._result()
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/futures.py", line 237, in _result
    return await final_state.result(raise_on_failure=raise_on_failure, fetch=True)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/states.py", line 89, in _get_state_result
    raise await get_state_exception(state)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/task_runners.py", line 207, in submit
    result = await call()
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/engine.py", line 1328, in begin_task_run
    state = await orchestrate_task_run(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/engine.py", line 1418, in orchestrate_task_run
    resolved_parameters = await resolve_inputs(parameters)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/engine.py", line 1758, in resolve_inputs
    return await run_sync_in_worker_thread(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/utilities/asyncutils.py", line 91, in run_sync_in_worker_thread
    return await anyio.to_thread.run_sync(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/utilities/collections.py", line 318, in visit_collection
    items = [(visit_nested(k), visit_nested(v)) for k, v in expr.items()]
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/utilities/collections.py", line 318, in <listcomp>
    items = [(visit_nested(k), visit_nested(v)) for k, v in expr.items()]
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/utilities/collections.py", line 264, in visit_nested
    return visit_collection(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/utilities/collections.py", line 344, in visit_collection
    model_instance = typ(
TypeError: MyModel.__init__() missing 1 required keyword-only argument: 'bar'

Versions

Version:             2.7.12
API version:         0.8.4
Python version:      3.10.4
Git commit:          524c25cd
Built:               Mon, Feb 6, 2023 4:31 PM
OS/Arch:             darwin/arm64
Profile:             graniterock
Server type:         cloud

Additional context

No response

The text was updated successfully, but these errors were encountered:

zanieb · 2023-02-21T21:53:23Z

I do not think we can handle this case because bar is not stored on the model (and if its side-effects are stored on the model then bar should be optional and when we copy the model we will just re-use the final values). quote should be used in cases like this.

tharwan · 2023-02-22T09:05:18Z

@madkinsz as stated in the related PR: from my point of view you are placing assumptions on BaseModel and data classes for that matter, that do not hold up. I would argue this is the wrong approach to force the user to adopt your assumptions.

If you need special handling of BaseModel you should probably use your own subclass and 'quote' all others automatically if they do not conform to your assumptions.

I did not really have the time to understand why you need to instantiate a new model in the first place, due to a lack of time. But pydantic technically allows to bypass the init:

class MyModel(BaseModel):
    id: int
    info: str = "Foo"


    def __init__(self, id: int = 1, *, bar:str, **data) -> None:
        """My custom init!"""
        print(bar)
        super().__init__(id=id, **data)

m1 = MyModel(id=10, bar="baz")  # prints "baz" because init is used
m2 = MyModel.construct(_fields_set=m1.__fields_set__, **m1.dict())  # skips init
assert m2.id == m1.id
assert m2.info == m1.info

As a side note, in our case the init method does not allow all fields to be set. But the point is very much the same.

zanieb · 2023-03-09T18:43:32Z

@tharwan we resolve futures in models so you can pass models with deferred computation into downstream tasks and get concurrent execution. This is a large part of our value proposition.

I do not think your described use of Pydantic is something I expect many of our users to encounter. Switching from typ(...) to typ.construct in visit_collection is an interesting proposition. I'm not sure we want to bypass model validation since we may have resolved a future into a real datatype that requires transformations.

We cannot change our behavior here easily as other users may be depending on it.

tharwan · 2023-03-09T18:58:30Z

@madkinsz From my point the best way then would be to have a custom pydantic Base that models need to be derived from that conforms to your assumptions about pydantic, as they are not true for pydantic models in general.

zanieb · 2023-03-09T19:00:55Z

@tharwan that would be a breaking change

zanieb · 2023-03-09T19:07:25Z

Here is a possible solution for you #8763

tharwan · 2023-03-09T19:16:21Z

@madkinsz in our case the model is defined external to the prefect project, so this would not help much I think. This is also potentially the biggest drawback of the current state, that there might be objects that are pydantic models that the user does not even knows are pydantic models, because they are just part of some 3rd party package.

tharwan · 2023-03-09T19:33:07Z

Oh I see this a prefect setting not something attached to the model definition. That might help indeed, thanks.

masonmenges added bug Something isn't working status:triage labels Feb 15, 2023

zanieb added the from:sales label Feb 16, 2023

zanieb mentioned this issue Feb 17, 2023

Add wrapper for exceptions encountered while resolving parameter inputs #8584

Merged

zanieb closed this as completed Feb 21, 2023

github-actions bot removed the status:triage label Feb 21, 2023

zanieb mentioned this issue Mar 9, 2023

Add setting to exclude traversal of Pydantic models in parameters #8763

Closed

zanieb mentioned this issue May 25, 2023

Dataclass __post_init__ gets invoked again during task call #9747

Open

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Unable to handle Pydantic BaseModel subclasses with custom init functions #8542

Unable to handle Pydantic BaseModel subclasses with custom init functions #8542

masonmenges commented Feb 15, 2023

zanieb commented Feb 21, 2023

tharwan commented Feb 22, 2023

zanieb commented Mar 9, 2023

tharwan commented Mar 9, 2023

zanieb commented Mar 9, 2023

zanieb commented Mar 9, 2023

tharwan commented Mar 9, 2023

tharwan commented Mar 9, 2023

Unable to handle Pydantic BaseModel subclasses with custom init functions #8542

Unable to handle Pydantic BaseModel subclasses with custom init functions #8542

Comments

masonmenges commented Feb 15, 2023

First check

Bug summary

Reproduction

Error

Versions

Additional context

zanieb commented Feb 21, 2023

tharwan commented Feb 22, 2023

zanieb commented Mar 9, 2023

tharwan commented Mar 9, 2023

zanieb commented Mar 9, 2023

zanieb commented Mar 9, 2023

tharwan commented Mar 9, 2023

tharwan commented Mar 9, 2023