Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to handle Pydantic BaseModel subclasses with custom init functions #8542

Closed
4 tasks done
masonmenges opened this issue Feb 15, 2023 · 8 comments
Closed
4 tasks done
Labels
bug Something isn't working

Comments

@masonmenges
Copy link
Contributor

First check

  • I added a descriptive title to this issue.
  • I used the GitHub search to find a similar issue and didn't find it.
  • I searched the Prefect documentation for this issue.
  • I checked that this issue is related to Prefect and not one of its dependencies.

Bug summary

Attempting to instantiate Pydantic Subclasses with custom init functions within a task or referencing the model within a task results in the following error model_instance = typ( TypeError: MyModel.__init__() missing 1 required keyword-only argument: 'bar' running the same code in the example below with both tasks as functions completes successfully.

Reproduction

from pydantic import BaseModel
from prefect import flow, task

class MyModel(BaseModel):
    id: int
    info: str = "Foo"

    def __init__(self, id: int = 1, *, bar: str, **data) -> None:
        """My custom init!"""
        super().__init__(id=id, bar=bar, **data)

@task
def return_model(x) -> MyModel:
    return MyModel(id=x, bar="foo")


@task()
def use_model(model: MyModel):
    print(model.id)


@flow
def pydantic_test(x: int = 10):
    model = return_model(x)
    use_model(model)


if __name__ == "__main__":
    print(pydantic_test(2))

Error

Traceback (most recent call last):
  File "/Users/masonmenges/Repos/support-directory/prefect-support/cloud2/masonm/test/test13.py", line 37, in <module>
    print(pydantic_test(2))
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/flows.py", line 456, in __call__
    return enter_flow_run_engine_from_flow_call(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/engine.py", line 170, in enter_flow_run_engine_from_flow_call
    return anyio.run(begin_run)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/anyio/_core/_eventloop.py", line 70, in run
    return asynclib.run(func, *args, **backend_options)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 292, in run
    return native_run(wrapper(), debug=debug)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete
    return future.result()
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 287, in wrapper
    return await func(*args)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/client/utilities.py", line 47, in with_injected_client
    return await fn(*args, **kwargs)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/engine.py", line 250, in create_then_begin_flow_run
    return await state.result(fetch=True)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/states.py", line 89, in _get_state_result
    raise await get_state_exception(state)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/engine.py", line 650, in orchestrate_flow_run
    result = await run_sync(flow_call)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/utilities/asyncutils.py", line 154, in run_sync_in_interruptible_worker_thread
    async with anyio.create_task_group() as tg:
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 662, in __aexit__
    raise exceptions[0]
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/utilities/asyncutils.py", line 135, in capture_worker_thread_and_result
    result = __fn(*args, **kwargs)
  File "/Users/masonmenges/Repos/support-directory/prefect-support/cloud2/masonm/test/test13.py", line 33, in pydantic_test
    use_model(model)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/tasks.py", line 456, in __call__
    return enter_task_run_engine(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/engine.py", line 940, in enter_task_run_engine
    return run_async_from_worker_thread(begin_run)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/utilities/asyncutils.py", line 177, in run_async_from_worker_thread
    return anyio.from_thread.run(call)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/anyio/from_thread.py", line 49, in run
    return asynclib.run_async_from_thread(func, *args)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 970, in run_async_from_thread
    return f.result()
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/concurrent/futures/_base.py", line 446, in result
    return self.__get_result()
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/engine.py", line 1089, in get_task_call_return_value
    return await future._result()
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/futures.py", line 237, in _result
    return await final_state.result(raise_on_failure=raise_on_failure, fetch=True)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/states.py", line 89, in _get_state_result
    raise await get_state_exception(state)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/task_runners.py", line 207, in submit
    result = await call()
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/engine.py", line 1328, in begin_task_run
    state = await orchestrate_task_run(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/engine.py", line 1418, in orchestrate_task_run
    resolved_parameters = await resolve_inputs(parameters)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/engine.py", line 1758, in resolve_inputs
    return await run_sync_in_worker_thread(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/utilities/asyncutils.py", line 91, in run_sync_in_worker_thread
    return await anyio.to_thread.run_sync(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/utilities/collections.py", line 318, in visit_collection
    items = [(visit_nested(k), visit_nested(v)) for k, v in expr.items()]
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/utilities/collections.py", line 318, in <listcomp>
    items = [(visit_nested(k), visit_nested(v)) for k, v in expr.items()]
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/utilities/collections.py", line 264, in visit_nested
    return visit_collection(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/cloud2/lib/python3.10/site-packages/prefect/utilities/collections.py", line 344, in visit_collection
    model_instance = typ(
TypeError: MyModel.__init__() missing 1 required keyword-only argument: 'bar'

Versions

Version:             2.7.12
API version:         0.8.4
Python version:      3.10.4
Git commit:          524c25cd
Built:               Mon, Feb 6, 2023 4:31 PM
OS/Arch:             darwin/arm64
Profile:             graniterock
Server type:         cloud

Additional context

No response

@zanieb
Copy link
Contributor

zanieb commented Feb 21, 2023

I do not think we can handle this case because bar is not stored on the model (and if its side-effects are stored on the model then bar should be optional and when we copy the model we will just re-use the final values). quote should be used in cases like this.

@tharwan
Copy link

tharwan commented Feb 22, 2023

@madkinsz as stated in the related PR: from my point of view you are placing assumptions on BaseModel and data classes for that matter, that do not hold up. I would argue this is the wrong approach to force the user to adopt your assumptions.

If you need special handling of BaseModel you should probably use your own subclass and 'quote' all others automatically if they do not conform to your assumptions.

I did not really have the time to understand why you need to instantiate a new model in the first place, due to a lack of time. But pydantic technically allows to bypass the init:

class MyModel(BaseModel):
    id: int
    info: str = "Foo"


    def __init__(self, id: int = 1, *, bar:str, **data) -> None:
        """My custom init!"""
        print(bar)
        super().__init__(id=id, **data)

m1 = MyModel(id=10, bar="baz")  # prints "baz" because init is used
m2 = MyModel.construct(_fields_set=m1.__fields_set__, **m1.dict())  # skips init
assert m2.id == m1.id
assert m2.info == m1.info

As a side note, in our case the init method does not allow all fields to be set. But the point is very much the same.

@zanieb
Copy link
Contributor

zanieb commented Mar 9, 2023

@tharwan we resolve futures in models so you can pass models with deferred computation into downstream tasks and get concurrent execution. This is a large part of our value proposition.

I do not think your described use of Pydantic is something I expect many of our users to encounter. Switching from typ(...) to typ.construct in visit_collection is an interesting proposition. I'm not sure we want to bypass model validation since we may have resolved a future into a real datatype that requires transformations.

We cannot change our behavior here easily as other users may be depending on it.

@tharwan
Copy link

tharwan commented Mar 9, 2023

@madkinsz From my point the best way then would be to have a custom pydantic Base that models need to be derived from that conforms to your assumptions about pydantic, as they are not true for pydantic models in general.

@zanieb
Copy link
Contributor

zanieb commented Mar 9, 2023

@tharwan that would be a breaking change

@zanieb
Copy link
Contributor

zanieb commented Mar 9, 2023

Here is a possible solution for you #8763

@tharwan
Copy link

tharwan commented Mar 9, 2023

@madkinsz in our case the model is defined external to the prefect project, so this would not help much I think. This is also potentially the biggest drawback of the current state, that there might be objects that are pydantic models that the user does not even knows are pydantic models, because they are just part of some 3rd party package.

@tharwan
Copy link

tharwan commented Mar 9, 2023

Oh I see this a prefect setting not something attached to the model definition. That might help indeed, thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants