# Reactivity
Every marimo notebook is a directed acyclic graph (DAG) that models how data
flows across blocks of Python code, i.e., cells.
marimo _reacts_ to code changes, automatically executing cells with the latest
data. Execution order is determined by the DAG, not by the order of cells on
the page.
Reactive execution is based on a single rule:
```{admonition} Runtime Rule
:class: tip
When a cell is run, marimo automatically runs all other cells that
**reference** any of the global variables it **defines**.
```
```{admonition} Lazy evaluation
:class: note
The runtime can be configured to be lazy, only running cells when you ask for
them to be run and marking affected cells as stale, instead of automatically
running them. Learn more in the
[runtime configuration guide](/guides/configuration/runtime_configuration.md).
```
## References and definitions
A marimo notebook is a DAG where nodes are cells and edges are data
dependencies. marimo creates this graph by statically analyzing each cell
(i.e., without running it) to determine its
- references, the global variables it reads but doesn't define;
- definitions, the global variables it defines.
```{admonition} Global variables
:class: tip
A variable can refer to any Python object. In particular, functions,
classes, and imported names are all variables.
```
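As a rough illustration of this kind of static analysis, the sketch below uses Python's standard `ast` module to collect a cell's references and definitions without running it. The helper name `analyze_cell` is hypothetical, and the scope handling is deliberately crude (any name assigned anywhere in the cell counts as a definition, and function parameters are simply excluded from references). It is a simplified sketch, not marimo's actual implementation.

```python
import ast
import builtins


def analyze_cell(source: str) -> tuple[set[str], set[str]]:
    """Return (references, definitions) for a cell's source code.

    Hypothetical helper: a simplified approximation that ignores
    Python's real scoping rules for function and class bodies.
    """
    tree = ast.parse(source)  # parse only; the cell is never executed
    defs: set[str] = set()
    loads: set[str] = set()
    params: set[str] = set()

    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defs.add(node.id)  # e.g. `y = ...`
            elif isinstance(node.ctx, ast.Load):
                loads.add(node.id)  # e.g. reading `x`
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            defs.add(node.name)  # functions and classes are variables too
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # `import a.b` binds `a`; `import a as b` binds `b`
                defs.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, ast.arg):
            params.add(node.arg)  # crude stand-in for scope handling

    # References: names that are read but not defined here, not parameters,
    # and not builtins such as `print` or `len`.
    refs = {
        name
        for name in loads
        if name not in defs and name not in params and not hasattr(builtins, name)
    }
    return refs, defs


refs, defs = analyze_cell("import math\ny = math.sqrt(x) + 1")
print(sorted(refs), sorted(defs))  # ['x'] ['math', 'y']
```

Here the cell reads `x` without defining it, so `x` is a reference, while `math` and `y` are definitions.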
There is an edge from one cell to another if the latter cell references any
global variables defined by the former cell. The rule for reactive execution
can be restated in terms of the graph: when a cell is run, its descendants are
run automatically.
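To make the graph formulation concrete, here is a minimal sketch that builds edges from each cell's references and definitions and then collects the descendants of a cell. The cell names, the `cells` dictionary, and the helpers `build_edges` and `descendants` are all hypothetical and chosen for illustration; this is not how marimo represents its graph internally.

```python
from collections import deque

# Hypothetical notebook: cell name -> (references, definitions)
cells = {
    "load": (set(), {"data"}),
    "clean": ({"data"}, {"cleaned"}),
    "plot": ({"cleaned"}, {"figure"}),
    "config": (set(), {"title"}),
}


def build_edges(cells):
    """Add an edge parent -> child when child references a name parent defines."""
    edges = {name: set() for name in cells}
    for parent, (_, parent_defs) in cells.items():
        for child, (child_refs, _) in cells.items():
            if child != parent and parent_defs & child_refs:
                edges[parent].add(child)
    return edges


def descendants(edges, start):
    """Return every cell reachable from `start` (excluding `start` itself)."""
    seen = set()
    queue = deque(edges[start])
    while queue:
        cell = queue.popleft()
        if cell not in seen:
            seen.add(cell)
            queue.extend(edges[cell])
    return seen


edges = build_edges(cells)
print(sorted(descendants(edges, "load")))    # ['clean', 'plot']
print(sorted(descendants(edges, "config")))  # []
```

In this toy notebook, running `load` would trigger `clean` and `plot`, while running `config` would trigger nothing because no other cell references `title`.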
## Global variable names must be unique
To make sure your notebook is a DAG, marimo requires that every global
variable be defined by only one cell.
```{admonition} Local variables
:class: note
Variables prefixed with an underscore (_e.g._, `_x`) are local to a cell. You
can use this in a pinch to fix multiple-definition errors, but try instead to
refactor your code.
```
This rule encourages you to keep the number of global variables in your
program small, which is generally considered good practice.
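The uniqueness requirement can be pictured as a simple check over each cell's definitions. The sketch below is only an illustration under assumed inputs: the function `find_multiple_definitions` and the cell names are hypothetical, and marimo's own error reporting may work differently.

```python
from collections import defaultdict


def find_multiple_definitions(cell_defs):
    """Map each global name defined by more than one cell to those cells.

    `cell_defs` is a hypothetical mapping: cell name -> set of defined names.
    """
    owners = defaultdict(list)
    for cell, names in cell_defs.items():
        for name in names:
            # Underscore-prefixed names are local to a cell, so they may repeat.
            if not name.startswith("_"):
                owners[name].append(cell)
    return {name: sorted(found) for name, found in owners.items() if len(found) > 1}


conflicts = find_multiple_definitions(
    {
        "cell_a": {"x", "_tmp"},
        "cell_b": {"x", "_tmp"},
        "cell_c": {"y"},
    }
)
print(conflicts)  # {'x': ['cell_a', 'cell_b']}
```

Only `x` is reported: `_tmp` is local to each cell, and `y` has a single defining cell.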
## Local variables
Global variables prefixed with an underscore (_e.g._, `_x`) are "local" to a
cell: they can't be read by other cells. Multiple cells can reuse the same
local variable names.
If you encapsulate your code using functions and classes when needed,
you won't need to use many local variables, if any.
## No hidden state
Traditional notebooks like Jupyter have _hidden state_: running a cell may
change the values of global variables, but these changes are not propagated to
the cells that use them. Worse, deleting a cell removes global
variables from visible code but _not_ from program memory, a common
source of bugs. The problem of hidden state has been discussed by
many others
[[1]](https://austinhenley.com/pubs/Chattopadhyay2020CHI_NotebookPainpoints.pdf)
[[2]](https://docs.google.com/presentation/d/1n2RlMdmv1p25Xy5thJUhkKGvjtV-dkAIsUXP-AL4ffI/edit#slide=id.g362da58057_0_1).
**marimo eliminates the problem of hidden state**: running
a cell automatically refreshes downstream outputs, and _deleting a cell
deletes its global variables from program memory_.
<div align="center">
<figure>
<img src="/_static/docs-delete-cell.gif"/>
<figcaption>No hidden state: deleting a cell deletes its variables.</figcaption>
</figure>
</div>
<a name="reactivity-mutations"></a>
## Avoid mutating variables
marimo's reactive execution is based only on the global variables a cell reads
and the global variables it defines. In particular, _marimo does not track
mutations to objects_, _i.e._, mutations don't trigger reactive re-runs of
other cells. It also does not track the definition or mutation of object
attributes. For this reason, **avoid defining a variable in one cell and
mutating it in another**.
If you need to mutate a variable (such as adding a new column to a dataframe),
you should perform the mutation in the same cell as the one that defines it,
or try creating a new variable instead.
### Examples
**Create a new variable instead of mutating an existing one.**
_Don't_ do this:
```python
l = [1]
```
```python
l.append(2)
```
_Instead_, do this:
```python
l = [1]
```
```python
extended_list = l + [2]
```
**Mutate variables in the cells that define them.**
_Don't_ do this:
```python
df = pd.DataFrame({"my_column": [1, 2]})
```
```python
df["another_column"] = [3, 4]
```
_Instead_, do this:
```python
df = pd.DataFrame({"my_column": [1, 2]})
df["another_column"] = [3, 4]
```
```{admonition} Why not track mutations?
:class: note
Tracking mutations reliably is a fundamentally impossible task in Python; marimo
could never detect all mutations, and even if it could, reacting to mutations could
result in surprising re-runs of notebook cells. The simplicity of marimo's
static analysis approach, based only on variable definitions and references,
makes marimo easy to understand and encourages well-organized notebook code.
```
## Runtime configuration
Through the notebook settings menu, you can configure how and when marimo runs
cells. In particular, you can disable autorun on startup, disable autorun
on cell execution, and enable a powerful module autoreloader. Read our
[runtime configuration guide](/guides/configuration/runtime_configuration.md) to learn more.
## Disabling cells
Sometimes, you may want to edit one part of a notebook without triggering
automatic execution of its dependent cells. For example, the dependent cells
may take a long time to execute, and you only want to iterate on the first part
of a multi-cell computation.
For cases like this, marimo lets you **disable** cells: when a cell is
disabled, it and its dependents are blocked from running.
<div align="center">
<figure>
<img src="/_static/docs-disable-cell.gif"/>
<figcaption>Disabling a cell blocks it from running.</figcaption>
</figure>
</div>
When you re-enable a cell, if any of the cell's ancestors ran while it was
disabled, marimo will automatically run it.
<div align="center">
<figure>
<img src="/_static/docs-enable-cell.gif"/>
<figcaption>Enable a cell through the context menu. Stale cells run
automatically.</figcaption>
</figure>
</div>
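One way to picture the effect of disabling a cell is as a traversal of the dependency graph: the disabled cell and everything downstream of it are blocked. The sketch below assumes a hypothetical `edges` mapping (cell name to the cells that depend on it directly) and a hypothetical helper `blocked_cells`; it models only the blocking behavior described above, not marimo's runtime.

```python
def blocked_cells(edges, disabled):
    """Return the disabled cells together with all of their descendants."""
    blocked = set()
    stack = list(disabled)
    while stack:
        cell = stack.pop()
        if cell not in blocked:
            blocked.add(cell)
            stack.extend(edges.get(cell, ()))
    return blocked


# Hypothetical pipeline: load -> clean -> train -> report
edges = {
    "load": {"clean"},
    "clean": {"train"},
    "train": {"report"},
    "report": set(),
}

blocked = blocked_cells(edges, disabled={"train"})
print(sorted(blocked))  # ['report', 'train']
```

With `train` disabled, edits to `load` or `clean` would still run those two cells, while `train` and `report` stay blocked until `train` is re-enabled.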