Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feat] Make dynamic data loading functions accept arguments #479

Merged
merged 49 commits into from
May 20, 2024

Conversation

antonymilne
Copy link
Contributor

@antonymilne antonymilne commented May 14, 2024

Description

This is step 1 out of 2 in completing https://github.com/McK-Internal/vizro-internal/issues/753/ to allow for parametrisation of dynamic data from the frontend.

Here I just implement the changes needed to the data manager. I anticipated this would be as simple as adding *args, **kwargs in a couple of places, but it turned out to be much more complicated than that. In short:

  • for memoization to work fully (so with def f(a=1): ..., we need f(a=1) and f(1) and f() to all have same cache key) flask-caching relies on inspecting the signature of the function being memoized
  • previously we memoized the _DynamicData.load bound method, but it is basically impossible for this to be done in a way that mirrors the signature of the underlying load_data function (details in How to use adapter factory to change signature depending on instance GrahamDumpleton/wrapt#263)
  • hence I've moved the memoization inside the load function itself
  • this means that the memoized function could be either a function or a bound method (e.g. in the important case of kedro datasets)
  • handling bound methods with flask-caching is tricky because it's difficult to set independent timeouts without lots of hacking
  • eventually I found the cleanest way to achieve this was to handle everything as a partial function named with the data source name. This ensures that both bound methods and functions are handled correctly, with caches keyed by data source name
  • small consequence of this: using partial functions now works so I've removed the error raised in DataManager.__setitem__

Also I've written many more tests to check this all works. These are a bit slow because they run for many different functions and include time.sleep. If they become annoying to run in CI we can tag them as slow and split them out or skip them or something.

To do in this PR

  • @petar-qb to do manual test with redis 🙏 No need to test anything that's new functionality involving passing arguments, just to check I didn't break anything that worked before. I suggest you pick a few representative cases from [Feat] Add dynamic data and caching to data manager #398 (comment) and see what the cache keys look like since they'll be a bit different from before. Then please say if anything looks amiss

To do in following PR #482

To do separately

  • Make new ticket to tidy log_call

Screenshot

Notice

  • I acknowledge and agree that, by checking this box and clicking "Submit Pull Request":

    • I submit this contribution under the Apache 2.0 license and represent that I am entitled to do so on behalf of myself, my employer, or relevant third parties, as applicable.
    • I certify that (a) this contribution is my original creation and / or (b) to the extent it is not my original creation, I am authorized to submit this contribution on behalf of the original creator(s) or their licensees.
    • I certify that the use of this contribution as authorized by the Apache 2.0 license does not violate the intellectual property rights of anyone else.
    • I have not referenced individuals, products or companies in any commits, directly or indirectly.
    • I have not added data or restricted code in any commits, directly or indirectly.

antonymilne and others added 29 commits May 1, 2024 11:09
@antonymilne antonymilne changed the title Add *args, **kwargs to load and many more tests [Feat] Make dynamic data loading functions accepts arguments May 14, 2024
@antonymilne antonymilne requested a review from petar-qb May 14, 2024 20:31
@antonymilne antonymilne changed the title [Feat] Make dynamic data loading functions accepts arguments [Feat] Make dynamic data loading functions accept arguments May 14, 2024
Base automatically changed from tidy/data-source-mapping to main May 14, 2024 21:06
Copy link
Contributor

@petar-qb petar-qb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After thorough code (and tests) inspection and manual testing with a few the most representative combinations from #398 (comment), I didn't spot anything that amiss. As expected 😄

Also, many thanks for all your code/tests and PR comments. They made my life easier.

@antonymilne antonymilne force-pushed the feat/dynamic-data-load-with-args branch from b9c4c95 to c8892d1 Compare May 20, 2024 11:18
@antonymilne antonymilne enabled auto-merge (squash) May 20, 2024 11:55
@antonymilne antonymilne merged commit 9fef3c0 into main May 20, 2024
34 checks passed
@antonymilne antonymilne deleted the feat/dynamic-data-load-with-args branch May 20, 2024 12:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants