This repository has been archived by the owner on Jun 23, 2020. It is now read-only.

[FEATURE REQUEST]: store state between executions of blocks #8

Closed
mitinarseny opened this issue Nov 6, 2019 · 2 comments


mitinarseny commented Nov 6, 2019

How about storing the state of Python objects between executions of different blocks?

The idea is the following: imagine we have two code blocks, and in the second one we want access to a variable that was defined in the first. At the moment, each code block executes in an isolated Python environment (isolated in terms of the global objects available). So you are unable to do the following:

# First
```{.pyplot capture="a,b"}
import matplotlib.pyplot as plt

a = list(range(5))
b = list(range(10))

plt.figure()
plt.plot(a, list(map(lambda x: x**2, a)))
plt.title('This is an example figure')
```
# Second
```{.pyplot needs="a,b"}
import matplotlib.pyplot as plt

plt.figure()
plt.plot(a, list(map(lambda x: x**3, a)))
plt.plot(b, list(map(lambda x: x**2, b)))
plt.title('This is an example figure')
```

Sure, we can simply copy-paste a = list(range(5)) and b = list(range(10)), or put them in a separate file and include it every time we want them available in our code. But this could add performance overhead if the computation is something heavier than list(range(10)). For example, we might need to preprocess our data, show the density histogram, then normalize the preprocessed data and show a new density plot.

I can think of the following approach:

First, we need to launch only one instance of the Python process while the filter is working and feed it our scripts. It would store our variables and function definitions, and they would persist in memory for as long as the filter runs. Here is how it could be achieved with bash:

# generate script files
for ((i = 0 ; i < 10 ; i++)); do echo "print($i)" > "file${i}.py"; done

# feed them to python
for ((i = 0 ; i < 10 ; i++)); do cat "file${i}.py"; done | python3
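
The same idea can be sketched inside a single Python process, without a shell pipeline. This is only an illustration of shared state between blocks, not how pandoc-pyplot works: `run_blocks` is a hypothetical helper, and each block is assumed to be available as a plain source string.

```python
import contextlib
import io


def run_blocks(sources):
    """Run each source string in order, sharing one namespace between them."""
    # One dict serves as the globals of every block, so names defined by an
    # earlier block are visible to later ones.
    namespace = {"__name__": "__main__"}
    outputs = []
    for source in sources:
        buffer = io.StringIO()
        # Capture what each block prints so the outputs can be told apart.
        with contextlib.redirect_stdout(buffer):
            exec(compile(source, "<block>", "exec"), namespace)
        outputs.append(buffer.getvalue())
    return namespace, outputs


blocks = [
    "a = list(range(5))",          # first block defines a
    "print([x ** 2 for x in a])",  # second block reuses a
]
namespace, outputs = run_blocks(blocks)
print(outputs[1], end="")
```

The second block sees `a` only because both blocks run against the same dictionary. Giving each block a fresh dictionary would reproduce the isolated behaviour described above.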

Second, I don't know how yet, but it would be cool if we could provide an external Python process to this filter, so that state is maintained between runs of pandoc with this filter. Consider this example:

# Example
Here is only one plot
```{.pyplot capture="a"}
import matplotlib.pyplot as plt

a = list(range(5))  # a much harder task in a real-life example
plt.figure()
plt.plot(a, list(map(lambda x: x**3, a)))
plt.title('This is an example figure')
```
Here are two plots:
```{.pyplot needs="a"}
import matplotlib.pyplot as plt

b = list(range(10))

plt.figure()
plt.plot(a, list(map(lambda x: x**3, a)))
plt.plot(b, list(map(lambda x: x**2, b)))
plt.title('This is an example figure')
```

Then I convert it with pandoc:

$ pandoc --filter pandoc-pyplot -t html5 example.md

And imagine if I slightly change the code:

  Here are two plots:
  ```{.pyplot needs="a"}
  import matplotlib.pyplot as plt
  
  b = list(range(10))

  plt.figure()
- plt.plot(a, list(map(lambda x: x**3, a)))
+ plt.plot(a, list(map(lambda x: x**4, a)))
  plt.plot(b, list(map(lambda x: x**2, b)))
  plt.title('This is an example figure')
  ```

If I convert it again, the first code block has to be executed as well, even though it did not change:

$ pandoc --filter pandoc-pyplot -t html5 example.md
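
One way to avoid re-running an unchanged block between pandoc invocations, without a long-lived process, might be to cache the captured variables on disk. The sketch below is an assumption on my side, not an existing pandoc-pyplot feature: `run_cached` is a hypothetical helper, the cache key is a hash of the block source, and the `capture` list mirrors the proposed `capture="a"` attribute.

```python
import hashlib
import os
import pickle
import tempfile


def run_cached(source, capture, cache_dir):
    """Execute source once; identical source later reloads `capture` from disk."""
    key = hashlib.sha256(source.encode("utf-8")).hexdigest()
    path = os.path.join(cache_dir, key + ".pickle")
    if os.path.exists(path):
        with open(path, "rb") as handle:
            return pickle.load(handle), True  # cache hit: nothing is executed
    namespace = {}
    exec(compile(source, "<block>", "exec"), namespace)
    # Keep only the names the block asked to capture.
    captured = {name: namespace[name] for name in capture}
    with open(path, "wb") as handle:
        pickle.dump(captured, handle)
    return captured, False


cache_dir = tempfile.mkdtemp()
first_block = "a = list(range(5))  # stand-in for an expensive computation"
values, hit = run_cached(first_block, ["a"], cache_dir)
values_again, hit_again = run_cached(first_block, ["a"], cache_dir)
print(hit, hit_again, values_again["a"])
```

This only works for picklable values, and a block that needs variables from an earlier block would also have to include that block's hash in its own key. On a cache hit the block does not run, so its figure would have to be cached as well.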

Maybe we can borrow some ideas from here:

dir=$(mktemp -d /tmp/temp.XXX)
keep_pipe_open="$dir/keep_pipe_open"
pipe="$dir/pipe"

mkfifo "$pipe"
touch "$keep_pipe_open"

# Read from pipe:
python3 < "$pipe" &

# Keep the pipe open:
while [ -f "$keep_pipe_open" ]; do sleep 1; done > "$pipe" &

# Write to pipe:
for ((i = 0 ; i < 10 ; i++)); do cat "file${i}.py"; done > "$pipe"

# Close the pipe:
rm "$keep_pipe_open"
wait

rm -rf "$dir"
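
A rough Python counterpart of the pipe trick, in case it helps the discussion: a parent process keeps a child interpreter alive and sends it one block at a time. The `WORKER` program, the `send` helper, and the one-JSON-string-per-line protocol are all made up for this sketch.

```python
import json
import subprocess
import sys

# Child program: reads one JSON-encoded source string per line and executes
# it in a single namespace, so definitions persist while the process lives.
WORKER = r"""
import json, sys
namespace = {}
for line in sys.stdin:
    exec(compile(json.loads(line), "<block>", "exec"), namespace)
    sys.stdout.flush()
"""

worker = subprocess.Popen(
    [sys.executable, "-c", WORKER],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
)


def send(source):
    """Send one block of source code to the long-lived worker."""
    worker.stdin.write(json.dumps(source) + "\n")
    worker.stdin.flush()


send("a = list(range(5))")  # first block defines a
send("print(sum(a))")       # second block still sees a
result = worker.stdout.readline().strip()

worker.stdin.close()        # end of input lets the worker exit
worker.wait()
print(result)
```

Encoding each block as a single JSON line keeps multi-line sources intact on the way through the pipe. Unlike the bash version, the worker executes each block as soon as it arrives instead of waiting for the pipe to close.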
@LaurentRDC
Owner

Resetting the state between figures makes reasoning about the code much easier.

That's the idea behind providing the script source in the figure caption: users can read the source and immediately know how to make this figure again. Adding hidden state introduces complications in this regard.

Not only that, but then we would need to keep track of the order in which blocks are processed by the filter.

@mitinarseny
Author

I'm just trying to reinvent Jupyter, I think :)
