Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hash_funcs for st.cache_data and st.cache_resource #6502

Merged
merged 18 commits into from Jun 8, 2023
Merged

Conversation

kajarenc
Copy link
Collaborator

@kajarenc kajarenc commented Apr 17, 2023

📚 Context

hash_funcs parameter existed previously in st.cache decorator, and we had multiple requests to return it to new cache_primitives: #6295 , #5939 , #6290

In this PR we return back hash_funcs functionality mostly using behaviour from st.cache.

  • What kind of change does this PR introduce?

    • Bugfix
    • Feature
    • Refactoring
    • Other, please describe:

🧠 Description of Changes

  • Add bullet points summarizing your changes here

    • This is a breaking API change
    • This is a visible (user-facing) change

Revised:

Insert screenshot of your updated UI/code here

Current:

Insert screenshot of existing UI/code here

🧪 Testing Done

  • Screenshots included
  • Added/Updated unit tests
  • Added/Updated e2e tests

🌐 References

Does this depend on other work, documents, or tickets?


Contribution License Agreement

By submitting this pull request you agree that all contributions to this project are made under the Apache 2.0 license.

@kajarenc kajarenc changed the title hash_funcs prototype [WIP] hash_funcs prototype Apr 28, 2023
@sfc-gh-jcarroll
Copy link
Collaborator

This looks great from product perspective. I noted a couple of questions / possible concerns here but they are fairly minor:

https://www.notion.so/snowflake-corp/Tech-Spec-for-hash_funcs-st-cache_data-st-cache_resource-939e6e77613743989f0dd5aea92134b7?pvs=4#13eb911cd6a1454cbc910372dc158d28

Comment on lines 91 to 92
If you think this is actually a Streamlit bug, please [file a bug report here.]
(https://github.com/streamlit/streamlit/issues/new/choose)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't render as a link, it just shows the raw markdown. Not sure if it's a limitation in st.exception or because of the line break.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, thank you @sfc-gh-jcarroll !
Removing line break works :)

@sfc-gh-jcarroll
Copy link
Collaborator

Only last question from me is whether there's any low lift thing to make this easier to use for cached methods in a class.

class MyCustomClass:
    def __init__(self, initial_score: int):
        self.my_score = initial_score

    @st.cache_data(hash_funcs={"__main__.MyCustomClass": id})
    def multiply_score(self, x: int) -> int:
        return self.my_score * x

It would be nice at least if "MyCustomClass" string worked without needing __main__ (and same if the class is defined in an imported file rather than the running one, not sure if that causes further weirdness)

This was a common pain point in many of the forum threads (not sure how it was solved before with st.cache), so if it's difficult to discover / use I think we will continue to get feedback about it.

Otherwise the behavior all looks good from product side based on my testing of the whl file ✅

@sfc-gh-kjavadyan sfc-gh-kjavadyan changed the title [WIP] hash_funcs prototype hash_funcs for st.cache_data and st.cache_resource Jun 6, 2023
my_obj = MyObj()
user_hash_error_func(my_obj)

expected_message = """unsupported operand type(s) for +=: 'MyObj' and 'int'
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wrote the same tests cases for st.cache_data and st.cache_resource and did not use parameterized common cache tests because expected_message are very verbose and different, so it would look ugly if parametrized.

@kajarenc
Copy link
Collaborator Author

kajarenc commented Jun 7, 2023

@sfc-gh-jcarroll related to method decoration and "__main__.MyCustomClass" syntax, this happen because class name doesn't exist during method definition, so we have to use string instead of just MyCustomClass

For example in the case of a function that uses an argument of type MyCustomClass and defined after it, we could refer to the class name, and not to string.

For module names starting with "__main__", it is normal practice in Python for top-level executing module, and if the class is defined in an imported file rather than the running one, this should not cause weirdness, for example this work correctly:

app.py

import streamlit as st
import hello_world


x = hello_world.MyHelloWorld()

st.write(x.my_method())
st.write(x.my_method())

and hello_world.py

import streamlit as st


class MyHelloWorld:
    @st.cache_data(hash_funcs={"hello_world.MyHelloWorld": id})
    def my_method(self):
        print("IN METHOD")
        return 45

@kajarenc kajarenc marked this pull request as ready for review June 7, 2023 14:34
@kajarenc kajarenc requested a review from tconkling June 7, 2023 14:34
@sfc-gh-jcarroll
Copy link
Collaborator

Sounds good. That works for this initial update and we can decide later whether we need to improve it. Thank you!

Copy link
Contributor

@tconkling tconkling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great!

def pretty_print(self):
def to_str(v):
try:
return "Object of type %s: %s" % (type_util.get_fqn_type(v), str(v))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: can we use an f-string here, instead of %?

@@ -84,6 +180,15 @@ def pop(self):
def __contains__(self, val: Any):
return id(val) in self._stack

def pretty_print(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs return type annotation

@parameterized.expand(
[("cache_data", cache_data), ("cache_resource", cache_resource)]
)
def test_recursive_hash_func(self, _, cache_decorator):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a short docstring explaining what this test is doing?

@@ -391,6 +413,72 @@ def test_generator_not_hashable(self):
with self.assertRaises(UnhashableTypeError):
get_hash((x for x in range(1)))

def test_hash_funcs_acceptable_keys(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could also use a docstring, because the purpose of the test isn't very clear from its name. (Maybe the comment in the middle of the function is sufficient?)

@kajarenc kajarenc merged commit cd38181 into develop Jun 8, 2023
45 checks passed
@kajarenc kajarenc deleted the hash-funcs branch June 8, 2023 14:22
tconkling added a commit to tconkling/streamlit that referenced this pull request Jun 14, 2023
* develop:
  Bump dependencies of component-lib (streamlit#6830)
  Make `streamlit config show` honor newlines (streamlit#6758)
  Update comments on magic.py (streamlit#6833)
  Add tests for component-lib (streamlit#6580)
  Fix button height (streamlit#6738)
  Add `to_pandas` method to convert inputs to pandas DataFrames (streamlit#6668)
  hash_funcs for st.cache_data and st.cache_resource (streamlit#6502)
eric-skydio pushed a commit to eric-skydio/streamlit that referenced this pull request Dec 20, 2023
hash_funcs parameter existed previously in st.cache decorator, in this PR we bring back hash_funcs functionality to new cache primitives st.cache_data and st.cache_resource mostly using behavior from st.cache.
zyxue pushed a commit to zyxue/streamlit that referenced this pull request Mar 22, 2024
hash_funcs parameter existed previously in st.cache decorator, in this PR we bring back hash_funcs functionality to new cache primitives st.cache_data and st.cache_resource mostly using behavior from st.cache.
zyxue pushed a commit to zyxue/streamlit that referenced this pull request Apr 16, 2024
hash_funcs parameter existed previously in st.cache decorator, in this PR we bring back hash_funcs functionality to new cache primitives st.cache_data and st.cache_resource mostly using behavior from st.cache.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants