[Feat] Enable parametrisation of dynamic data functions (#482)
Signed-off-by: Antony Milne <49395058+antonymilne@users.noreply.github.com>
Co-authored-by: Petar Pejovic <108530920+petar-qb@users.noreply.github.com>
antonymilne and petar-qb committed May 28, 2024
1 parent cdd4cec commit 31d706e
Showing 22 changed files with 670 additions and 160 deletions.
4 changes: 4 additions & 0 deletions .vale/styles/Microsoft/ignore.txt
Original file line number Diff line number Diff line change
Expand Up @@ -87,3 +87,7 @@ Plotly's
Gunicorn
dataframe
streamlit
memoization
setosa
versicolor
virginica
Original file line number Diff line number Diff line change
Expand Up @@ -3,8 +3,8 @@ A new scriv changelog fragment.
Uncomment the section that is right (remove the HTML comment wrapper).
-->

<!--
### Highlights ✨
- A bullet item for the Highlights ✨ category with a link to the relevant PR at the end of your entry, e.g. Enable feature XXX ([#1](https://github.com/mckinsey/vizro/pull/1))
Original file line number Diff line number Diff line change
@@ -0,0 +1,46 @@
<!--
A new scriv changelog fragment.
Uncomment the section that is right (remove the HTML comment wrapper).
-->

### Highlights ✨

- Enable dynamic data parametrization, so that different data can be loaded while the dashboard is running ([#482](https://github.com/mckinsey/vizro/pull/482))

<!--
### Removed
- A bullet item for the Removed category with a link to the relevant PR at the end of your entry, e.g. Enable feature XXX ([#1](https://github.com/mckinsey/vizro/pull/1))
-->
<!--
### Added
- A bullet item for the Added category with a link to the relevant PR at the end of your entry, e.g. Enable feature XXX ([#1](https://github.com/mckinsey/vizro/pull/1))
-->
<!--
### Changed
- A bullet item for the Changed category with a link to the relevant PR at the end of your entry, e.g. Enable feature XXX ([#1](https://github.com/mckinsey/vizro/pull/1))
-->
<!--
### Deprecated
- A bullet item for the Deprecated category with a link to the relevant PR at the end of your entry, e.g. Enable feature XXX ([#1](https://github.com/mckinsey/vizro/pull/1))
-->
<!--
### Fixed
- A bullet item for the Fixed category with a link to the relevant PR at the end of your entry, e.g. Enable feature XXX ([#1](https://github.com/mckinsey/vizro/pull/1))
-->
<!--
### Security
- A bullet item for the Security category with a link to the relevant PR at the end of your entry, e.g. Enable feature XXX ([#1](https://github.com/mckinsey/vizro/pull/1))
-->
10 changes: 7 additions & 3 deletions vizro-core/docs/pages/user-guides/actions.md
Expand Up @@ -5,7 +5,7 @@ Many components of a dashboard (for example, [`Graph`][vizro.models.Graph] or [`

By combining the [`Action`][vizro.models.Action] model with an action function, you can create complex dashboard interactions triggered by various events.

There are already lots of action functions you can reuse.
There are already a few action functions you can reuse.

???+ info "Overview of currently available pre-defined action functions"

Expand All @@ -24,8 +24,12 @@ The below sections are guides on how to use pre-defined action functions.

### Export data

To enable downloading data, you can add the [`export_data`][vizro.actions.export_data] action function to the [`Button`][vizro.models.Button] component. Hence, as
a result, when a dashboard user now clicks the button, all data on the page will be downloaded.
To enable downloading data, you can add the [`export_data`][vizro.actions.export_data] action function to the [`Button`][vizro.models.Button] component.
As a result, when a dashboard user clicks the button, all data on the page is downloaded.

When data from a [custom chart](custom-charts.md) is exported it is the contents of the `data_frame` input argument that is exported.
Therefore, the exported data will reflect any native filters and parameters, but no transformations to the `data_frame` done inside the chart function.


!!! example "`export_data`"

3 changes: 2 additions & 1 deletion vizro-core/docs/pages/user-guides/custom-charts.md
Expand Up @@ -31,11 +31,12 @@ def minimal_example(data_frame:pd.DataFrame=None):

Building on the above, there are several routes one can take. The following examples are guides on the most common custom requests, but also serve as an illustration of more general principles.

To alter the data in the `data_frame` argument, consider using a [Filter](filters.md) or [parametrized data loading](data.md/#parametrize-data-loading) and [dynamic data](data.md/#dynamic-data). The `data_frame` argument input to a custom chart contains the data **after** filters and parameters have been applied.

!!! tip

Custom charts can be targeted by [Filters](filters.md) or [Parameters](parameters.md) without any additional configuration. We will showcase both possibilities in the following examples. In particular the `Parameters` in combination with custom charts can be highly versatile in achieving custom functionality.


## Enhanced `plotly.express` chart with reference line

The example below shows a case where we enhance an existing `plotly.express` chart. We add a new argument (`hline`) that is used to draw a grey reference line at the height determined by the value of `hline`. The important thing to note is that we then
86 changes: 81 additions & 5 deletions vizro-core/docs/pages/user-guides/data.md
Expand Up @@ -63,7 +63,7 @@ The below example uses the Iris data saved to a file `iris.csv` in the same dire
page = vm.Page(
title="Static data example",
components=[
vm.Graph(figure=px.box("iris", x="species", y="petal_width", color="species")),
vm.Graph(figure=px.box(iris, x="species", y="petal_width", color="species")),
]
)

Expand Down Expand Up @@ -137,7 +137,6 @@ Unlike static data, dynamic data cannot be supplied directly into the `data_fram

The example below shows how data is fetched dynamically every time the page is refreshed. When you run the code and refresh the page the function `load_iris_data` is re-run, which returns different data each time. The example uses the Iris data saved to a file `iris.csv` in the same directory as `app.py`. This data can be generated using `px.data.iris()` or [downloaded](../../assets/user_guides/data/iris.csv).


!!! example "Dynamic data"
=== "app.py"
```py
Expand Down Expand Up @@ -167,7 +166,7 @@ The example below shows how data is fetched dynamically every time the page is r
```

1. `iris` is a pandas DataFrame created by reading from the CSV file `iris.csv`.
2. To demonstrate that dynamic data can change when the page is refreshed, select 30 points at random. This simulates what would happen if your file `iris.csv` were constantly changing.
2. To demonstrate that dynamic data can change when the page is refreshed, select 50 points at random. This simulates what would happen if your file `iris.csv` were constantly changing.
3. To use `load_iris_data` as dynamic data it must be added to the data manager. You should **not** actually call the function as `load_iris_data()`; doing so would result in static data that cannot be reloaded.
4. Dynamic data is referenced by the name of the data source `"iris"`.
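
The difference between registering the function and registering its return value can be sketched without Vizro. The following is a minimal illustration using only the standard library; `registry`, `fetch` and `ROWS` are hypothetical stand-ins and not part of the Vizro API, so the real data manager may differ in detail.

```python
import random

ROWS = list(range(150))  # hypothetical stand-in for the rows of iris.csv


def load_sample():
    """Return a fresh random sample of 50 rows on every call."""
    return random.sample(ROWS, 50)


registry = {}  # hypothetical stand-in for a data manager
registry["dynamic"] = load_sample  # the function itself: can be re-run later
registry["static"] = load_sample()  # the result of one call: fixed from now on


def fetch(name):
    """Re-run callables on every access; return stored results unchanged."""
    source = registry[name]
    return source() if callable(source) else source
```

In this sketch `fetch("dynamic")` draws a new sample on every call, while `fetch("static")` always returns the one sample that was drawn when the entry was registered.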

Expand Down Expand Up @@ -217,8 +216,6 @@ In a development environment the easiest way to enable caching is to use a [simp
Vizro().build(dashboard).run()
```



By default, when caching is turned on, dynamic data is cached in the data manager for 5 minutes. A refresh of the dashboard within this time interval will fetch the pandas DataFrame from the cache and _not_ re-run the data loading function. Once the cache timeout period has elapsed, the next refresh of the dashboard will re-execute the dynamic data loading function. The resulting pandas DataFrame will again be put into the cache and not expire until another 5 minutes has elapsed.
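
The timeout behaviour described above can be sketched as a small expiring cache. This is an illustration only, written with the standard library: `TimedCache` and `load_data` are hypothetical names, the clock is injected so that the example does not have to wait five minutes, and the sketch does not model a timeout of 0.

```python
import time


class TimedCache:
    """Hypothetical single-entry cache that expires after `timeout` seconds."""

    def __init__(self, loader, timeout=300, clock=time.monotonic):
        self.loader = loader
        self.timeout = timeout
        self.clock = clock
        self._value = None
        self._loaded_at = None

    def get(self):
        now = self.clock()
        expired = self._loaded_at is None or now - self._loaded_at >= self.timeout
        if expired:
            self._value = self.loader()  # re-run the data loading function
            self._loaded_at = now
        return self._value


fake_now = [0.0]  # fake clock so the example runs instantly
load_count = [0]


def load_data():
    load_count[0] += 1
    return f"data v{load_count[0]}"


cache = TimedCache(load_data, timeout=300, clock=lambda: fake_now[0])
first = cache.get()  # cache is empty, so load_data runs
fake_now[0] = 299.0
second = cache.get()  # within the timeout, so the cached value is returned
fake_now[0] = 300.0
third = cache.get()  # timeout has elapsed, so load_data runs again
```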

If you would like to alter some options, such as the default cache timeout, then you can specify a different cache configuration:
Expand Down Expand Up @@ -268,3 +265,82 @@ data_manager["slow_expire_data"].timeout = 60 * 60
data_manager["no_expire_data"] = load_iris_data
data_manager["no_expire_data"].timeout = 0
```

### Parametrize data loading

You can supply arguments to your dynamic data loading function that can be modified from the dashboard.
For example, if you are handling big data then you can use an argument to specify the number of entries or size of chunk of data.

To add a parameter to control a dynamic data source, do the following:

1. Add the appropriate argument to your dynamic data function and specify a default value for the argument.
2. Give an `id` to all components that have the data source you wish to alter through a parameter.
3. [Add a parameter](parameters.md) with `targets` of the form `<target_component_id>.data_frame.<dynamic_data_argument>` and a suitable [selector](selectors.md).

For example, let us extend the [dynamic data example](#dynamic-data) above to show how the `load_iris_data` can take an argument `number_of_points` controlled from the dashboard with a [`Slider`][vizro.models.Slider].

!!! example "Parametrized dynamic data"
    === "app.py"
        ```py hl_lines="8 10 20-23"
        from vizro import Vizro
        import pandas as pd
        import vizro.plotly.express as px
        import vizro.models as vm

        from vizro.managers import data_manager

        def load_iris_data(number_of_points=10): # (1)!
            iris = pd.read_csv("iris.csv") # (2)!
            return iris.sample(number_of_points) # (3)!

        data_manager["iris"] = load_iris_data # (4)!

        page = vm.Page(
            title="Update the chart on page refresh",
            components=[
                vm.Graph(id="graph", figure=px.box("iris", x="species", y="petal_width", color="species")) # (5)!
            ],
            controls=[
                vm.Parameter(
                    targets=["graph.data_frame.number_of_points"], # (6)!
                    selector=vm.Slider(min=10, max=100, step=10, value=10),
                )
            ],
        )

        dashboard = vm.Dashboard(pages=[page])

        Vizro().build(dashboard).run()
        ```

        1. `load_iris_data` takes a single argument, `number_of_points`, with a default value of 10.
        2. `iris` is a pandas DataFrame created by reading from the CSV file `iris.csv`.
        3. Sample points at random, where `number_of_points` gives the number of points selected.
        4. To use `load_iris_data` as dynamic data it must be added to the data manager. You should **not** actually call the function as `load_iris_data()` or `load_iris_data(number_of_points=...)`; doing so would result in static data that cannot be reloaded.
        5. Give the `vm.Graph` component `id="graph"` so that the `vm.Parameter` can target it. Dynamic data is referenced by the name of the data source `"iris"`.
        6. Create a `vm.Parameter` to target the `number_of_points` argument for the `data_frame` used in `graph`.

    === "Result"
        [![ParametrizedDynamicData]][ParametrizedDynamicData]

    [ParametrizedDynamicData]: ../../assets/user_guides/data/parametrized_dynamic_data.gif

Parametrized data loading is compatible with [caching](#configure-cache). The cache uses [memoization](https://flask-caching.readthedocs.io/en/latest/#memoization), so that the dynamic data function's arguments are included in the cache key. This means that `load_iris_data(number_of_points=10)` is cached independently of `load_iris_data(number_of_points=20)`.
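
To illustrate what it means for the arguments to be part of the cache key, here is a minimal sketch that uses `functools.lru_cache` from the standard library as a stand-in for Flask-Caching's memoization. It is not the mechanism Vizro uses and it has no timeout; `load_rows` and `call_log` are hypothetical names.

```python
from functools import lru_cache

call_log = []  # records every time the function body actually runs


@lru_cache(maxsize=None)
def load_rows(number_of_points=10):
    """Hypothetical loader: each distinct argument value gets its own cache entry."""
    call_log.append(number_of_points)
    return tuple(range(number_of_points))


load_rows(number_of_points=10)  # cache miss: the function body runs
load_rows(number_of_points=20)  # different argument, so a separate cache entry
load_rows(number_of_points=10)  # cache hit: the function body does not run again
```

After these three calls `call_log` contains only two entries, one per distinct value of `number_of_points`.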

!!! warning

You should always [treat the content of user input as untrusted](https://community.plotly.com/t/writing-secure-dash-apps-community-thread/54619). For example, you should not expose a filepath to load without passing it through a function like [`werkzeug.utils.secure_filename`](https://werkzeug.palletsprojects.com/en/3.0.x/utils/#werkzeug.utils.secure_filename), or you might enable arbitrary access to files on your server.
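
One possible way to avoid exposing file paths at all is to let the user pick from a fixed allow-list of dataset names. Note that this is a different technique from `secure_filename`, shown here only as a sketch with the standard library; `DATA_DIR`, `ALLOWED_FILES` and `resolve_dataset_path` are hypothetical names and the file names are placeholders.

```python
from pathlib import Path

DATA_DIR = Path("data")  # hypothetical directory holding the data files
ALLOWED_FILES = {"iris": "iris.csv", "gapminder": "gapminder.csv"}  # hypothetical


def resolve_dataset_path(dataset_name):
    """Map a user-supplied name to a known file; reject everything else."""
    try:
        filename = ALLOWED_FILES[dataset_name]
    except (KeyError, TypeError):
        raise ValueError(f"Unknown dataset: {dataset_name!r}") from None
    return DATA_DIR / filename
```

With this approach the value that comes from the dashboard is never used to build a path directly, so an input such as `"../../etc/passwd"` is rejected instead of being resolved.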

It is not possible to pass [nested parameters](parameters.md#nested-parameters) to dynamic data. You can only target top-level arguments of the data loading function and not address nested keys in a dictionary.
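
The restriction to top-level arguments can be pictured with a small validation helper. This is a hypothetical sketch, not how Vizro validates targets internally: `parse_data_frame_target` is an invented name, and the `options` argument exists only to show a dictionary-valued argument whose keys cannot be addressed.

```python
import inspect


def load_iris_data(number_of_points=10, options=None):
    """Hypothetical loader; `options` stands for a dictionary-valued argument."""
    return number_of_points, options


def parse_data_frame_target(target, loader):
    """Accept only '<component_id>.data_frame.<argument>' with a top-level argument."""
    parts = target.split(".")
    if len(parts) != 3 or parts[1] != "data_frame":
        raise ValueError(f"Expected '<component_id>.data_frame.<argument>', got {target!r}")
    component_id, _, argument = parts
    if argument not in inspect.signature(loader).parameters:
        raise ValueError(f"{argument!r} is not an argument of {loader.__name__}")
    return component_id, argument
```

Here `"graph.data_frame.number_of_points"` is accepted, while a nested target such as `"graph.data_frame.options.key"` is rejected because it has more than three parts.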

### Filter update limitation

If your dashboard includes a [filter](filters.md) then the values shown on a filter's [selector](selectors.md) _do not_ update while the dashboard is running. This is a known limitation that will be lifted in future releases, but if it is already problematic for you then [raise an issue on our GitHub repo](https://github.com/mckinsey/vizro/issues/).

This limitation is why all arguments of your dynamic data loading function must have a default value. Regardless of the value of the `vm.Parameter` selected in the dashboard, these default parameter values are used when the `vm.Filter` is built. This determines the type of selector used in a filter and the options shown, which cannot currently be changed while the dashboard is running.
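
The effect of building filter options from the default arguments can be shown with plain Python. This is only a sketch of the behaviour described above: `SPECIES`, `load_species` and the variable names are hypothetical, and Vizro's own filter construction is more involved.

```python
# Hypothetical data: 50 labels for each of the three iris species, in order.
SPECIES = ["setosa"] * 50 + ["versicolor"] * 50 + ["virginica"] * 50


def load_species(number_of_points=60):
    """Hypothetical loader returning the first `number_of_points` species labels."""
    return SPECIES[:number_of_points]


# At build time the loader is called with its defaults to work out the options.
options_at_build = sorted(set(load_species()))

# Later, a parameter selects a larger sample, but the options are not recomputed.
data_at_runtime = load_species(number_of_points=150)
missing_from_filter = sorted(set(data_at_runtime) - set(options_at_build))
```

In this sketch the options computed at build time are `setosa` and `versicolor` only, so `virginica` appears in the data at runtime without being offered by the filter.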

Although a selector is automatically chosen for you in a filter when your dashboard is built, remember that [you can change this choice](filters.md#changing-selectors). For example, you can ensure that a dropdown always contains the options "setosa", "versicolor" and "virginica" by explicitly specifying your filter as follows:

```py
vm.Filter(column="species", selector=vm.Dropdown(options=["setosa", "versicolor", "virginica"]))
```
2 changes: 1 addition & 1 deletion vizro-core/docs/pages/user-guides/filters.md
Expand Up @@ -82,7 +82,7 @@ Currently available selectors are [`Checklist`][vizro.models.Checklist], [`Dropd
vm.Graph(figure=px.scatter(iris, x="sepal_length", y="petal_width")),
],
controls=[
vm.Filter(column="species",selector=vm.RadioItems()),
vm.Filter(column="species", selector=vm.RadioItems()),
],
)

6 changes: 5 additions & 1 deletion vizro-core/docs/pages/user-guides/parameters.md
@@ -1,6 +1,6 @@
# How to use parameters

This guide shows you how to add parameters to your dashboard. One main way to interact with the charts/components on your page is by changing the parameters of the underlying function that creates the chart/component.
This guide shows you how to add parameters to your dashboard. One main way to interact with the charts/components on your page is by changing the parameters of the underlying function (`figure` argument) that creates the chart/component. Parameters can also be used to [modify the data loaded into the dashboard itself](data.md/#parametrize-data-loading).

The [`Page`][vizro.models.Page] model accepts the `controls` argument, where you can enter a [`Parameter`][vizro.models.Parameter] model. For example, if the charting function has a `title` argument, you could configure a parameter that enables the user to select the chart title with a dropdown.

Expand Down Expand Up @@ -177,3 +177,7 @@ If you want to change nested parameters, you can specify the `targets` argument
In the above example, the object passed to the function argument `color_discrete_map` is a dictionary which maps the different flower species to fixed colors (for example, `{"virginica":"blue"}`). In this case, only the value `blue` should be changed instead of the entire dictionary. This can be achieved by specifying a target as `scatter.color_discrete_map.virginica`.

Note that in the above example, one parameter affects multiple targets.

## Dynamic data parameters

If you use [dynamic data](data.md/#dynamic-data) that can be updated while the dashboard is running then you can pass parameters to the dynamic data function to alter the data loaded into your dashboard. For detailed instructions, refer to the section on [parametrized data loading](data.md/#parametrize-data-loading).
113 changes: 83 additions & 30 deletions vizro-core/examples/_dev/app.py
@@ -1,48 +1,101 @@
"""Dev app to try things out."""
"""Example to show dashboard configuration."""

import numpy as np
import vizro.models as vm
import vizro.plotly.express as px
from flask_caching import Cache
from vizro import Vizro
from vizro.tables import dash_data_table
from vizro.actions import export_data
from vizro.managers import data_manager

df = px.data.gapminder()

dropdown_column = "Label"
dropdown_options = ["-- A --", "-- B --", "-- C --"]
# Note need to specify default value if have Filter since that calls data load function
# Then have problem that filter options don't get updated when data source changes
def load_iris_data(points=1, additional_points=1):
    """Load iris data."""
    iris = px.data.iris()
    return iris.sample(points + additional_points)

# Add a 'Label' column to the data where options are randomly selected between 'A', 'B', 'C'
df[dropdown_column] = np.random.choice(dropdown_options, size=len(df))

# Drop the 'iso_alpha' and 'iso_num' columns
df.drop(["iso_alpha", "iso_num"], axis=1, inplace=True)
data_manager["iris"] = load_iris_data

# If you want to cache the data on the page_2 differently from page_1, you can define another data_manager entry with
# the same function and assign it to the page_2 graphs. e.g. `data_manager["iris_2"] = load_iris_data`

page = vm.Page(
title="Table Page",
# SimpleCache
data_manager.cache = Cache(config={"CACHE_TYPE": "SimpleCache"})

# RedisCache
# data_manager.cache = Cache(
# config={"CACHE_TYPE": "RedisCache", "CACHE_REDIS_HOST": "localhost", "CACHE_REDIS_PORT": 6379}
# )

# Timeout
data_manager["iris"].timeout = 30


# TEST CASE:
# There are 2 Parameters per page that control the number of points and additional points.
# Set all of them to same number and see that the output for all of them will be the same if cache is configured,
# otherwise the output will be different.

page_1 = vm.Page(
title="My first page",
components=[
vm.Graph(
id="graph_1", figure=px.scatter(data_frame="iris", x="sepal_length", y="petal_width", color="species")
),
vm.Graph(
id="graph_2", figure=px.scatter(data_frame="iris", x="sepal_length", y="petal_width", color="species")
),
vm.Button(text="Export", actions=[vm.Action(function=export_data())]),
],
controls=[
vm.Parameter(
targets=["graph_1.x", "graph_2.x"], selector=vm.RadioItems(options=["sepal_length", "sepal_width"])
),
vm.Parameter(
targets=["graph_1.data_frame.points", "graph_1.data_frame.additional_points"],
selector=vm.Slider(title="Graph 1 points / Graph 1 additional_points", min=1, max=10, step=1),
),
vm.Parameter(
targets=["graph_2.data_frame.points", "graph_2.data_frame.additional_points"],
selector=vm.Slider(title="Graph 2 points / Graph 2 additional_points", min=1, max=10, step=1),
),
vm.Filter(column="species", selector=vm.Dropdown(options=["setosa", "versicolor", "virginica"])),
],
)

page_2 = vm.Page(
title="My second page",
components=[
vm.Table(
title="Table",
figure=dash_data_table(
data_frame=df,
columns=[
{"name": i, "id": i, "presentation": "dropdown"} if i == dropdown_column else {"name": i, "id": i}
for i in df.columns
],
editable=True,
dropdown={
dropdown_column: {
"options": [{"label": i, "value": i} for i in dropdown_options],
"clearable": False,
},
},
),
vm.Graph(
id="graph_second_1",
figure=px.scatter(data_frame="iris", x="sepal_length", y="petal_width", color="species"),
),
vm.Graph(
id="graph_second_2",
figure=px.scatter(data_frame="iris", x="sepal_length", y="petal_width", color="species"),
),
vm.Button(text="Export", actions=[vm.Action(function=export_data())]),
],
controls=[
vm.Parameter(
targets=["graph_second_1.x", "graph_second_2.x"],
selector=vm.RadioItems(options=["sepal_length", "sepal_width"]),
),
vm.Parameter(
targets=["graph_second_1.data_frame.points", "graph_second_2.data_frame.points"],
selector=vm.Slider(title="Graph 1 points / Graph 2 points", min=1, max=10, step=1),
),
vm.Parameter(
targets=["graph_second_1.data_frame.additional_points", "graph_second_2.data_frame.additional_points"],
selector=vm.Slider(title="Graph 1 additional_points / Graph 2 additional_points", min=1, max=10, step=1),
),
vm.Filter(column="species", selector=vm.Dropdown(options=["setosa", "versicolor", "virginica"])),
],
controls=[vm.Filter(column="continent")],
)

dashboard = vm.Dashboard(pages=[page])
dashboard = vm.Dashboard(pages=[page_1, page_2])

if __name__ == "__main__":
Vizro().build(dashboard).run()
