UnhashableParamError for Pydantic models when using st.cache_data() #6290

Closed
4 of 5 tasks
mmz-001 opened this issue Mar 10, 2023 · 8 comments
Labels
  • feature:cache (Related to st.cache_data and st.cache_resource)
  • feature:cache-hash-func
  • status:confirmed (Bug has been confirmed by the Streamlit team)
  • type:bug (Something isn't working)

Comments

@mmz-001

mmz-001 commented Mar 10, 2023

Checklist

  • I have searched the existing issues for similar issues.
  • I added a very descriptive title to this issue.
  • I have provided sufficient information below to help reproduce this issue.

Summary

The new st.cache_data() decorator can't hash arguments that are Pydantic models, whereas st.cache() seems to work fine.

Reproducible Code Example

import streamlit as st
from pydantic import BaseModel

class Person(BaseModel):
    name: str

@st.cache_data()
def identity(person: Person):
    return person


person = identity(Person(name="Lee"))

Steps To Reproduce

  1. Run the code example

Expected Behavior

Streamlit should be able to cache the parameters.

Current Behavior

Error:

streamlit.runtime.caching.cache_errors.UnhashableParamError: Cannot hash argument 'person' (of type `__main__.Person`) in 'identity'

Is this a regression?

  • Yes, this used to work in a previous version.

Debug info

  • Streamlit version: 1.20.0
  • Python version: 3.10.2
  • Operating System: Windows 10
  • Virtual environment: Poetry

Additional Information

No response

Are you willing to submit a PR?

  • Yes, I am willing to submit a PR!
@mmz-001 mmz-001 added status:needs-triage Has not been triaged by the Streamlit team type:bug Something isn't working labels Mar 10, 2023
@sfc-gh-jcarroll
Collaborator

sfc-gh-jcarroll commented Mar 11, 2023

Hey @mmz-001 there are a few workarounds described here

#6295

I think there's also a way to solve it by making the class pickle-able - ETA: See below

@sfc-gh-jcarroll
Collaborator

sfc-gh-jcarroll commented Mar 11, 2023

So two things:

  1. The code example you showed is using the pydantic class as both an input AND output of the cached function. To be clear, caching where you only return the Person object works fine.
# This totally works
@st.cache_data()
def identity(name: str):
    return Person(name=name)
  2. In a case where you need to use the Pydantic model (or an instance of some other custom class) as an input, there is another approach where you implement __reduce__() on the class, which allows Streamlit to hash it as an input parameter. As described in the pickle documentation, this is a potentially dangerous thing to do, which can impact the serialization/deserialization of class instances in other contexts. It should be used carefully!

Disclaimer: I'm not a pickle expert. I think this approach works, but it's not guaranteed.

import streamlit as st
import pickle
import functools
from pydantic import BaseModel

class Person(BaseModel):
    name: str

    def __reduce__(self):
        return (functools.partial(Person, name=self.name), tuple())
    
    def _repr_html_(self):
        return f"I am a Person named {self.name}"

@st.cache_data()
def identity(person: Person):
    return person

person = identity(Person(name="Lee"))
st.write(person)

person = pickle.loads(pickle.dumps(person))
st.write(person)
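
To make the role of __reduce__() more concrete, here is a minimal standard-library sketch. It assumes a hypothetical class `Point` and a hypothetical helper `digest`, neither of which appears in the thread, and it is not Streamlit's hashing code. It only illustrates that __reduce__() returns a (callable, args) pair that can both rebuild the object and serve as a stable description of its state.

```python
import hashlib


class Point:
    """Hypothetical stand-in for a custom class used as a cached-function input."""

    def __init__(self, x: int, y: int):
        self.x = x
        self.y = y

    def __reduce__(self):
        # Describe how to rebuild this object: call Point(x, y).
        return (Point, (self.x, self.y))


def digest(obj) -> str:
    """Illustrative only: derive a key from the (callable, args) pair."""
    reduced = obj.__reduce__()
    return hashlib.sha256(repr(reduced).encode()).hexdigest()


# Two instances with the same state reduce to the same description.
a = Point(1, 2)
b = Point(1, 2)
same_key = digest(a) == digest(b)

# The reduce tuple is also enough to reconstruct an equivalent object.
func, args = a.__reduce__()
clone = func(*args)
```

Because pickle uses the same method, a custom __reduce__() changes how instances are serialized everywhere, which is the risk mentioned above.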

cc @tconkling who can maybe help keep me honest

@sfc-gh-jcarroll
Collaborator

Ah - Digging through some other issues, I also found a simpler solution using @dataclass

from dataclasses import dataclass
from pydantic import BaseModel

# need init=False to prevent it from overriding Pydantic's init method
@dataclass(init=False)
class Person(BaseModel):
    name: str
    
    def _repr_html_(self):
        return f"I am a Person named {self.name}"

@mmz-001
Author

mmz-001 commented Mar 11, 2023

@sfc-gh-jcarroll thanks for looking into this issue. Using dataclasses seems the best way to resolve this problem in the example I gave. However, changing the Person model isn't ideal, especially if the model comes from another library or module. Of course, we can subclass the model but this opens up a host of other issues that may require major refactoring.

It would be best if Streamlit natively supports using pydantic models as input arguments to cachable functions since it is a common use case. Alternatively, bringing back the hash_func to the new cache primitives should also work.
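
The hash-function idea can be sketched without touching the model class. The example below is a standard-library illustration under stated assumptions: `cache_with_hash_funcs` is a hypothetical decorator written for this sketch, the `Person` dataclass stands in for a model from another library, and none of it reflects how Streamlit implements caching. It shows a per-type mapping that turns an otherwise unhashable argument into a cache key.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Callable, Dict, Type


@dataclass
class Person:
    """Hypothetical stand-in for a model class that cannot be modified."""
    name: str


def cache_with_hash_funcs(hash_funcs: Dict[Type, Callable[[Any], Any]]):
    """Hypothetical decorator: cache results, keying arguments via hash_funcs."""

    def decorator(func):
        store = {}

        def key_for(arg):
            # Use the registered function for this type, else the value itself.
            custom = hash_funcs.get(type(arg))
            return custom(arg) if custom else arg

        def wrapper(*args):
            key = tuple(key_for(a) for a in args)
            if key not in store:
                store[key] = func(*args)
            return store[key]

        return wrapper

    return decorator


calls = []  # records each real (uncached) invocation


@cache_with_hash_funcs({Person: lambda p: json.dumps(asdict(p), sort_keys=True)})
def greet(person: Person) -> str:
    calls.append(person.name)
    return f"Hello, {person.name}"


greet(Person(name="Lee"))
greet(Person(name="Lee"))  # same key, served from the cache
```

The model class stays untouched; only the caller decides how instances of that type map to a key.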

@LukasMasuch LukasMasuch added feature:cache Related to st.cache_data and st.cache_resource status:confirmed Bug has been confirmed by the Streamlit team feature:cache-hash-func and removed status:needs-triage Has not been triaged by the Streamlit team labels Mar 17, 2023
@lambertwx

Count me as another vote endorsing @mmz-001 's opinion above. I just ran into the same problem.

@sfc-gh-jcarroll
Collaborator

I believe this will be fixed with #6502 in the next release. Let us know if it doesn't seem to solve the issue.

@sfc-gh-jcarroll
Collaborator

Fixed with 1.24.0

@TateStaples

> Fixed with 1.24.0

This is still an issue for me in 1.24.1.

UnhashableParamError: Cannot hash argument 'reporter' (of type reporting.Reporter) in 'load_data'.

Here are the basics of the reporting.Reporter class:

import pandas as pd

class Reporter:
    def __init__(self, case_board: pd.DataFrame):
        self.case_board = case_board
        self.df = case_board

    def __hash__(self):
        """Hash the dataframe."""
        return int(pd.util.hash_pandas_object(self.df).sum())
