
Evaluate and Cache new Code Chunks in Documentation Mode #19

Open
brandonwillard opened this issue Feb 4, 2015 · 13 comments


@brandonwillard

If I add a new chunk after the previous chunks are cached, I get the following exception:

Pweave -f texminted -c -d missing_chunk_test.texw
Traceback (most recent call last):
  File "/usr/local/bin/Pweave", line 9, in <module>
    load_entry_point('Pweave==0.23', 'console_scripts', 'Pweave')()
  File "/usr/local/lib/python2.7/dist-packages/Pweave-0.23-py2.7.egg/pweave/scripts.py", line 53, in weave
    pweave.weave(infile, **opts_dict)
  File "/usr/local/lib/python2.7/dist-packages/Pweave-0.23-py2.7.egg/pweave/__init__.py", line 69, in weave
    doc.weave(shell)
  File "/usr/local/lib/python2.7/dist-packages/Pweave-0.23-py2.7.egg/pweave/pweb.py", line 141, in weave
    self.run(shell)
  File "/usr/local/lib/python2.7/dist-packages/Pweave-0.23-py2.7.egg/pweave/pweb.py", line 109, in run
    runner.run()
  File "/usr/local/lib/python2.7/dist-packages/Pweave-0.23-py2.7.egg/pweave/processors.py", line 53, in run
    success = self._getoldresults()
  File "/usr/local/lib/python2.7/dist-packages/Pweave-0.23-py2.7.egg/pweave/processors.py", line 260, in _getoldresults
    executed.append(self._oldresults[i].copy())
IndexError: list index out of range
Makefile:14: recipe for target 'missing_chunk_test.tex' failed
make: *** [missing_chunk_test.tex] Error 1

I was assuming that the caching mechanism would notice the missing chunk, evaluate and cache it, then proceed. Is that the intended functionality?
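
For illustration, here is the kind of fallback I was expecting, as a minimal sketch; aside from the old-results list named in the traceback, these names are hypothetical and not Pweave's actual internals:

    def merge_with_cache(parsed_chunks, old_results, run_chunk):
        # Reuse a cached result when one exists and its source matches;
        # otherwise evaluate and cache the new chunk rather than indexing
        # past the end of the old results.
        executed = []
        for i, chunk in enumerate(parsed_chunks):
            if i < len(old_results) and old_results[i]["content"] == chunk["content"]:
                executed.append(old_results[i].copy())
            else:
                executed.append(run_chunk(chunk))
        return executed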

@sgi3

sgi3 commented Mar 9, 2015

I have the same problem; please fix this. As it works now, I have to cache all chunks again with Pweave -f texminted -c %.texw whenever I add a new chunk.

@mpastell
Owner

I don't have time to work on this at the moment. I agree that the implementation is not ideal; you're welcome to submit a pull request if you have a suggestion on how to fix it.

Note that Pweave only caches input and output text and not Python objects, so if new chunks need the data from old ones, there is no easy fix to this problem.

@brandonwillard
Author

Gotcha. I've been making some small changes toward those ends, so hopefully I'll have a pull request for you.


@brandonwillard
Author

Seems like one could simply bypass caching in documentation mode and use the caching magic in an IPython processor. A subclass of PwebIPythonProcessor that loads the extension and adds the magic before the self.IPy.run_* statements might do the trick.
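
Something like the following, perhaps. Only PwebIPythonProcessor and self.IPy come from Pweave; the method name I override, the cache file name, and the choice of the ipycache extension are all assumptions for illustration:

    from pweave.processors import PwebIPythonProcessor

    class CachingIPythonProcessor(PwebIPythonProcessor):
        def __init__(self, *args, **kwargs):
            super(CachingIPythonProcessor, self).__init__(*args, **kwargs)
            # Load an IPython caching extension once per session; ipycache's
            # %%cache magic is one candidate.
            self.IPy.run_line_magic("load_ext", "ipycache")

        def loadstring(self, code_str, **kwargs):
            # Stand-in for whichever method wraps the self.IPy.run_* calls:
            # prepend the cell magic so IPython persists the chunk's results
            # itself, bypassing Pweave's documentation-mode cache.  Note that
            # %%cache also expects the names of the variables to persist; a
            # real implementation would have to infer those from the chunk.
            code_str = "%%cache chunk_cache.pkl\n" + code_str
            return super(CachingIPythonProcessor, self).loadstring(code_str, **kwargs)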

@scfrank

scfrank commented Aug 24, 2017

Has there been any activity on this?

I'd really appreciate chunk-level caching functionality, which seems like it would be closely related.
My use case: I have an increasingly long document with more and more pweave-generated figures, where I'd like to only have to recompile the one I'm currently working on.

Thanks for creating pweave! It's encouraged me to plot more graphs, which is always good :-)

@brandonwillard
Author

I've been taking a shot at improved caching (see here), but progress has been slow due to multiple competing interests. Namely, a desire to

  • fold inline chunks into the general chunk framework,
  • provide multi-line chunk options,
  • provide generalized caching
    • e.g. naive output-only caching that considers changes in buffer content/source and chunk settings,
  • make everything work almost entirely within the Jupyter ecosystem
    • every chunk evaluation engine is necessarily a Jupyter kernel
    • use of nbformat as the underlying parsed document format,
  • and provide precision Python-only caching
    • bytecode-aware caching, via the mechanics behind the with hack given here (a rough sketch follows this list).
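
On that last point, the relevant mechanics are roughly the following; this is a hypothetical sketch rather than the branch's actual code. Called from a context manager's __enter__ with inspect.currentframe().f_back, it finds the enclosing with statement and hashes the AST of its body to form a cache key:

    import ast
    import hashlib
    import inspect
    import textwrap

    def with_body_cache_key(frame):
        # Hash the AST of the `with` body at the caller's current line; the
        # key changes with the code itself but not with formatting or comments.
        source = textwrap.dedent(inspect.getsource(frame.f_code))
        offset = frame.f_code.co_firstlineno
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, ast.With) and node.lineno == frame.f_lineno - offset + 1:
                body = "\n".join(ast.dump(stmt) for stmt in node.body)
                return hashlib.sha256(body.encode("utf-8")).hexdigest()
        return None

Skipping the body on a cache hit is the genuinely hacky part (it takes sys.settrace tricks); the key extraction above is the portable piece.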

@mpastell
Owner

@brandonwillard Those are multiple big changes that you are talking about. Please don't submit them as one pull request, but split it into separate ones.

Note:

  • Every chunk evaluation engine is already a Jupyter kernel
  • I don't see the benefit of using nbformat as the parsed document format; you can already use it for output.

I suggest you first do:

  • fold inline chunks into the general chunk framework
  • provide generalized caching, e.g. naive output-only caching that considers changes in buffer content/source and chunk settings.

I have decided not to allow multi-line chunk options, as it breaks editor support and I haven't seen a compelling need for it. If you come up with a proper implementation with tests I can accept it, but submit it as a separate pull request.

@brandonwillard
Author

Oh, sorry, I hadn't done that work with a PR in mind; it was just a test branch that started with caching and turned into all sorts of stuff. If there's an interest in those latter two goals, I can separate them and make PRs. As for the nbformat idea, I can start an issue discussing my reasons.

@fgregg

fgregg commented Apr 13, 2018

@brandonwillard, how were you thinking of implementing save_chunk_state? (master...brandonwillard:caching-changes#diff-2747ccbd23b5ea3c1c42eb01071e5a6eR166)

@brandonwillard
Author

Ah, yeah, I left off with the idea of pickling the session in _[save|load]_chunk_state. That idea isn't all that efficient/feasible without, perhaps, an incremental approach.
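
Concretely, the session-pickling version would have looked something like this. It's only a sketch: the function names mirror the branch, but everything else, including the use of dill, is illustrative:

    import dill  # third-party; pickles far more of a session than pickle does

    def _save_chunk_state(chunk_number, cache_dir="cache"):
        # Snapshot the whole session after a chunk runs so a later weave can
        # restore state up to the last unchanged chunk.  Each snapshot is a
        # full copy, which is exactly the efficiency problem mentioned above.
        dill.dump_session("{0}/state_{1:03d}.pkl".format(cache_dir, chunk_number))

    def _load_chunk_state(chunk_number, cache_dir="cache"):
        dill.load_session("{0}/state_{1:03d}.pkl".format(cache_dir, chunk_number))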

At around the same time, I was experimenting with a more granular, variable-level caching that uses code/ASTs extracted from with bodies and had intended to port this idea instead of using (incremental) session caching.

Regardless, I've gone full org-mode nowadays, so I don't know when I'll get time to jump back into this!

@fgregg

fgregg commented Apr 13, 2018

Thanks @brandonwillard.

@fgregg

fgregg commented Apr 13, 2018

@brandonwillard, both of the approaches you considered seem particular to Python. Currently, it looks like Pweave is trying not to be tied to Python by using Jupyter to allow different kernels. Do you know if Jupyter kernel managers have a language-independent means to serialize the state of a kernel?

Stack Overflow seems to suggest no

@brandonwillard
Author

Yeah, I think that any non-naive caching (e.g. more than just caching output and validating against source text differences) is necessarily language-specific.

However, it seems like more than a few popular languages have straightforward runtime bytecode tools, AST generation, and, at the very least, introspection capabilities. As with Python, it's possible to implement less naive caching with those.
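
In Python, for instance, a formatting-insensitive cache key falls out of the ast module almost for free; chunk_cache_key here is a made-up name for illustration:

    import ast
    import hashlib

    def chunk_cache_key(source):
        # ast.dump omits line/column information by default, so edits to
        # whitespace or comments produce the same key, while semantic edits
        # change it and invalidate the chunk's cached output.
        return hashlib.sha256(ast.dump(ast.parse(source)).encode("utf-8")).hexdigest()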

Regarding Jupyter, it would be fantastic to see an abstraction of bytecode and/or AST objects exposed by the client protocol. The project has a somewhat related idea in its introspection messages. Otherwise, one can always implement smart caching at the kernel level and use custom messages.
